Encyclopedia of Physical Science and Technology - Condensed Matter



Encyclopedia of Physical Science and Technology EN002D-71 May 17, 2001 20:25

Bonding and Structure in Solids
J. C. Phillips
Lucent Technologies

I. Introduction: Molecules and Solids
II. Molecular Crystals
III. Ionic Crystals and Electronegativity
IV. Covalent Crystals and Directed Valence Bonds
V. Mixed Covalent and Ionic Bonding
VI. Metallic Bonding
VII. Quantum Structural Diagrams
VIII. Complete Quantum Structure Analysis
IX. Chemical Bonding in Solids in the Third Millennium

GLOSSARY

Atom Smallest unit of an element.

Bond Electronic configuration that binds atoms together.

Covalent bond Chemical bond formed by electron sharing.

Crystals Solids in which the atoms are arranged in a periodic fashion.

Electronegativity Measure of the ability of an atom to attract electrons.

Glass Solid in which the atoms are not arranged in a periodic fashion and which melts into a supercooled liquid when heated rapidly.

Ionic bond Chemical bond caused by charge transfer.

Metallic Material with high electrical conductivity at low frequency.

Molecule Bonded atoms in a gas.

Valence Number of electrons used by an atom to form chemical bonds.

THE RELATIVE POSITIONS of atoms in molecules and solids are described and explained in terms of the arrangements of their nearest neighbors. Together with the chemical valences of the atoms as given by the periodic table, these arrangements of the bonding determine the structure and physical properties of solids. Both structure and properties can be used to separate solids into various classes where further quantitative trends can be systematically described by structural diagrams.

I. INTRODUCTION: MOLECULES AND SOLIDS

The combinations of atoms found in the vapor phase are called molecules. Molecules containing a small number of atoms have been studied accurately and extensively. Most of our knowledge of chemical bonding between atoms comes from these studies. When atoms are condensed to form solids, the atomic density is much greater, as reflected by the number of atoms that are nearest neighbors of any given atom. This number is called the coordination number. An example is the molecule NaCl, in which each atom has one nearest neighbor. In solid NaCl each atom has six nearest neighbors.
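The coordination numbers quoted for NaCl can be checked by counting nearest neighbors directly. The following is a minimal Python sketch and not part of the original article: the helper names `rock_salt_sites` and `coordination_number` are hypothetical, and the rock-salt lattice is assumed to be described in units of half the cubic cell edge so that every site has integer coordinates.

```python
from itertools import product


def rock_salt_sites(extent=2):
    """Return (position, species) pairs for a block of rock-salt lattice.

    Positions are integer triples in units of half the cubic cell edge
    (an assumed convention for this sketch). Sites with an even coordinate
    sum are labeled Na, the others Cl.
    """
    sites = []
    for i, j, k in product(range(-extent, extent + 1), repeat=3):
        species = "Na" if (i + j + k) % 2 == 0 else "Cl"
        sites.append(((i, j, k), species))
    return sites


def coordination_number(sites, center=(0, 0, 0), tol=1e-9):
    """Count the sites at the shortest nonzero distance from `center`."""
    squared = [
        sum((a - b) ** 2 for a, b in zip(pos, center))
        for pos, _ in sites
        if pos != center
    ]
    shortest = min(squared)
    return sum(1 for value in squared if abs(value - shortest) < tol)


# Solid NaCl: neighbors of the Na site at the origin.
solid = rock_salt_sites()
print("solid NaCl:", coordination_number(solid))

# Gas-phase NaCl molecule: one Na and one Cl.
molecule = [((0, 0, 0), "Na"), ((1, 0, 0), "Cl")]
print("NaCl molecule:", coordination_number(molecule))
```

The same counting routine applies to any list of sites, so other structures can be compared by swapping the site generator.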

Solids in their pure forms are nearly always crystalline. A crystal is a periodic arrangement of atoms along lines, which in turn is repeated periodically along planes. Finally, the planes are repeated periodically to form the crystal lattice.

Most of our knowledge of crystal structures comes from the diffraction of waves of photons, electrons, or neutrons by lattice planes. Usually all the atomic positions in the crystal can be determined this way. By comparing chemical trends in bond lengths in crystals with those in molecules, one can often infer the nature of the electronic charge distribution responsible for chemical bonding in the crystal. From this it may be possible to predict the nature of chemical bonding at crystalline defects or even in noncrystalline solids, which are amorphous or glassy.

The structures of millions of solids are known by diffraction. To understand these structures one begins by studying the simplest cases and classifying them into groups. The main groups are characterized as molecular, metallic, ionic, and covalent. In most solids the actual bonding is a mixture to some degree of these different kinds of chemical interaction. While most solids are complex, the inorganic solids, which are best understood because they have had the widest technological applications, are usually either simple examples from a main group or are closely related to them. In contrast, organic and biologically important molecules may be quite complex. The chemical and structural simplicity of technologically important inorganic solids stems from the requirement of availability of techniques for production in bulk.

TABLE I Electronegativity Table of the Elements According to Pauling

Li      Be                                                                                      B       C       N       O       F
1.0     1.5                                                                                     2.0     2.5     3.0     3.5     4.0

Na      Mg                                                                                      Al      Si      P       S       Cl
0.9     1.2                                                                                     1.5     1.8     2.1     2.5     3.0

K       Ca      Sc      Ti      V       Cr      Mn      Fe      Co      Ni      Cu      Zn      Ga      Ge      As      Se      Br
0.8     1.0     1.3     1.5     1.6     1.6     1.5     1.8     1.8     1.8     1.9     1.6     1.6     1.8     2.0     2.4     2.8

Rb      Sr      Y       Zr      Nb      Mo      Tc      Ru      Rh      Pd      Ag      Cd      In      Sn      Sb      Te      I
0.8     1.0     1.2     1.4     1.6     1.8     1.9     2.2     2.2     2.2     1.9     1.7     1.7     1.8     1.9     2.1     2.5

Cs      Ba      La–Lu   Hf      Ta      W       Re      Os      Ir      Pt      Au      Hg      Tl      Pb      Bi      Po      At
0.7     0.9     1.1–1.2 1.3     1.5     1.7     1.9     2.2     2.2     2.2     2.4     1.9     1.8     1.8     1.9     2.0     2.2

Fr      Ra      Ac      Th      Pa      U       Np–No
0.7     0.9     1.1     1.3     1.5     1.7     1.3

Certain general techniques are widely used for describing bonding and structure in solids. Tables of atomic radii are available for ionic, covalent, and metallic bonding. Deviations of order 1–3% of bond lengths from the values predicted by these radii often reveal critical structural features of importance to material fabrication and properties. The cohesion of solids can be connected to the cohesion of the elements. A binary solid $A_mB_n$ is said to have a heat of formation $\Delta H_f$, which is the sum of m times the cohesive energy of A and n times that of B, minus the cohesive energy of $A_mB_n$. This heat of formation can be estimated with often remarkable accuracy from Pauling’s table of elemental electronegativities X(A). This is probably the most widely used table in science apart from the periodic table of the elements, and it is shown here as Table I.

II. MOLECULAR CRYSTALS

We now turn to the differently bonded main groups of solids. The molecular crystals are the simplest case, because the intermolecular forces are typically much weaker than the intramolecular ones. As a result the structure of the molecules, as reflected, for example, by bond lengths and vibration frequencies, is almost the same in the solid as in the gas phase. Some examples of materials that form molecular solids are the inert gases, diatomic halogens, closed-shell molecules such as methane, and many planar aromatic molecules such as benzene. Typically, in molecular crystals the heat of fusion per molecule per bond is at least 10 times smaller than the bond dissociation energy.

The binding forces that hold molecular crystals together may arise from electric dipoles if the molecules carry permanent dipole moments (e.g., HCl). When the molecules have no permanent moment, binding arises from mutually induced dipole moments (van der Waals interactions).

The structures of molecular crystals are determined primarily by packing considerations and thus vary from material to material according to molecular shape. Molecular solids are generally poor conductors of electricity, and even the photoconductivity is generally small unless metallic impurities are added to “sensitize” the material.

III. IONIC CRYSTALS AND ELECTRONEGATIVITY

Before discussing the structure of ionic crystals in detail, we shall familiarize ourselves with the concept of electronegativity, defined by Pauling as “the ability of atoms in the bonded state to attract electrons to themselves.” Atoms in solids are in a variety of bonded states, and it is due to Pauling’s insight that we have come to realize that the atomic electronegativity that he defined in terms of heat of formation (Section I) is indeed nearly constant for each element. His idea is that in solids charge flows from cations with smaller electronegativity to anions with greater electronegativity and that the heat of formation resulting from this charge flow is proportional to $(X_c - X_a)^2$, where $X_c$ and $X_a$ are the cation and anion electronegativities, respectively.
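To make the quadratic dependence concrete, the short Python sketch below evaluates the squared electronegativity difference for a few binary compounds using values copied from Table I. It is an illustration only: the function name `electronegativity_term` and the `scale` argument are hypothetical, and because the article does not state the constant of proportionality, `scale` defaults to 1 and the printed numbers are relative, not heats of formation in physical units.

```python
# Pauling electronegativities copied from Table I (subset used here).
PAULING_X = {
    "Li": 1.0, "Na": 0.9, "Cs": 0.7, "Ga": 1.6,
    "As": 2.0, "F": 4.0, "Cl": 3.0, "I": 2.5,
}


def electronegativity_term(cation, anion, scale=1.0):
    """Return scale * (X_c - X_a)**2.

    `scale` stands in for the unspecified constant of proportionality;
    it is an assumption of this sketch, not a value from the article.
    """
    difference = PAULING_X[cation] - PAULING_X[anion]
    return scale * difference * difference


# Compare a weakly ionic pair with strongly ionic alkali halides.
for cation, anion in [("Ga", "As"), ("Li", "I"), ("Na", "Cl"), ("Cs", "F")]:
    term = electronegativity_term(cation, anion)
    print(f"{cation}{anion}: (Xc - Xa)^2 = {term:.2f}")
```

Under this measure GaAs sits near the covalent end and CsF near the ionic end, which matches the qualitative ordering used in the sections that follow.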

Ionic crystals are composed of cations and anions with very large electronegativity differences, such as alkali metals and halides, columns I and VII of the periodic table, respectively. In this case the charge transfer of valence electrons is almost complete, so that the core configurations become isoelectronic to those of inert-gas atoms (e.g., Na+ to Ne, Cl− to Ar). While some energy is required to ionize the cations and transfer electrons to the anions, this energy is more than recovered thanks to the larger electronegativity of the anions and the mutual attraction of cations by their anion neighbors. In the case of the alkali halides, the cohesive energies can be estimated within a few percentage points by assuming complete charge transfer and evaluating the electrostatic energies (including ion polarization energies). A core–core repulsive energy, required by the exclusion principle, completes the calculation, which was first sketched around 1910.

As one might expect, the overall features of the crystal structures of ionic crystals are given quite well by packing spherical cations and anions in the appropriate proportions indicated by their chemical formulas. However, the ions are not quite the incompressible spheres suggested by their isoelectronic analogy to inert-gas atoms. If they were, one could use simple geometrical arguments (originating around 1930) to predict a coordination number of 8 (CsCl structure), 6 (NaCl structure), or 4 (ZnS structure). These correspond to packing cations and anions of nearly equal size (CsCl structure), and then successively larger anion/cation size ratios lead to increasing anion–anion contacts, thus reducing coordination numbers. These “radius ratio” rules do not actually describe the crystal structures, as shown in Fig. 1. What this means is that the ions should not be regarded as hard spheres, but rather as centers of quantum mechanically determined electronic charge distributions. Additional evidence for the breakdown of classical electrostatic models is contained in the elastic constants of the alkali halides. If these models were correct, the elastic constants would satisfy certain relations (the Cauchy relations) valid for central force interactions. These relations are not satisfied for most of the alkali halides, another indication of quantum mechanical interactions. Some simplified modern treatments of these and related problems are discussed in the following sections.
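The classical radius-ratio argument can be written down in a few lines. The Python sketch below is a hedged illustration of the hard-sphere rule, not a method taken from the article: the limiting ratios are the standard geometric values at which neighboring anions first touch, the function name `classical_coordination` is hypothetical, and the ionic radii (in ångströms) are assumed, Pauling-type values inserted only for the example.

```python
import math

# Hard-sphere limits below which anions would touch each other.
LIMIT_EIGHTFOLD = math.sqrt(3) - 1    # about 0.732 (CsCl-type, cubic)
LIMIT_SIXFOLD = math.sqrt(2) - 1      # about 0.414 (NaCl-type, octahedral)
LIMIT_FOURFOLD = math.sqrt(1.5) - 1   # about 0.225 (ZnS-type, tetrahedral)

# Assumed illustrative ionic radii in angstroms (not given in the article).
RADII = {"Li+": 0.60, "Na+": 0.95, "Cs+": 1.69, "Cl-": 1.81, "I-": 2.16}


def classical_coordination(r_cation, r_anion):
    """Coordination number predicted by the classical radius-ratio rule.

    Returns None when the ratio falls below the tetrahedral limit.
    """
    ratio = min(r_cation, r_anion) / max(r_cation, r_anion)
    if ratio >= LIMIT_EIGHTFOLD:
        return 8
    if ratio >= LIMIT_SIXFOLD:
        return 6
    if ratio >= LIMIT_FOURFOLD:
        return 4
    return None


for cation, anion in [("Cs+", "Cl-"), ("Na+", "Cl-"), ("Li+", "I-")]:
    cn = classical_coordination(RADII[cation], RADII[anion])
    print(f"{cation}{anion}: classical prediction = {cn}")
```

With these assumed radii the rule assigns coordination number 4 to LiI, which is the misprediction discussed in connection with Fig. 1: the observed structure is sixfold coordinated.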

FIGURE 1 The structures of the alkali halides M+X− as functions of classical ionic radii r+ and r−, respectively. (Coordination numbers in parentheses are those predicted by the classical ionic model.) In the upper left corner, for example, Li+I− is predicted by the classical model to have coordination number 4, but the symbol indicates it is actually sixfold coordinated. The plotted symbols distinguish sixfold coordinated, eightfold coordinated, and six- or eightfold coordinated compounds. [From Phillips, J. C. (1974). In “Solid State Chemistry” (N. B. Hannay, ed.), Vol. 1, Plenum, New York.]


FIGURE 2 The tetrahedrally coordinated diamond structure, which describes many technologically important semiconductors such as silicon. [From Phillips, J. C. (1970). Phys. Today 23 (Feb.), 23.]

IV. COVALENT CRYSTALS AND DIRECTED VALENCE BONDS

Whereas ionic crystals can be (at least roughly) described in classical terms, structure and bonding in covalent crystals can be understood only in terms of quantum mechanical electron orbital wave functions. Prototypical covalent crystals have the diamond structure. Many technologically important semiconductors such as silicon and germanium have this structure or a closely related one, the zinc blende or wurtzite structure. In these structures each atom is tetrahedrally coordinated (Fig. 2).

FIGURE 3 Sketch of electronic interactions between directed valence orbitals that produce an energy gap between bonding states (electrons shared between nearest neighbors) and antibonding states (no electron sharing). [From Phillips, J. C. (1970). Phys. Today 23 (Feb.), 23.]

The structure shown in Fig. 2 can be explained simply in terms of directed valence electron orbitals. The valence configuration of the atom is $ns^2\,np^2$, with n = 3 for silicon. In the crystal this becomes $ns\,np^3$ so that (counting electron spin twofold degeneracy) both the ns and np levels are half-filled. These four states can be combined to form four directed valence orbitals with tetrahedral geometry. The wave functions on nearest neighbors can be combined in phase to form bonding states or out of phase to form antibonding states. Then wave-function overlap produces an energy gap between these states (Fig. 3). This energy gap is the basis of the technologically important electronic and optical properties of semiconductors.
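The origin of the bonding–antibonding gap can be illustrated with the simplest possible model: two directed orbitals on neighboring atoms coupled by an overlap (hopping) energy. The Python sketch below is a textbook two-level model offered under stated assumptions, not the calculation behind Fig. 3; the function name `two_level_energies`, the orbital energies, and the coupling value are all hypothetical.

```python
import math


def two_level_energies(e_a, e_b, coupling):
    """Eigenvalues of the 2x2 Hamiltonian [[e_a, -coupling], [-coupling, e_b]].

    e_a and e_b are the energies of the two directed orbitals and
    `coupling` is the overlap energy between them. Returns the pair
    (bonding, antibonding), with bonding the lower energy.
    """
    mean = 0.5 * (e_a + e_b)
    half_split = math.hypot(0.5 * (e_a - e_b), coupling)
    return mean - half_split, mean + half_split


# Identical atoms (as in silicon): the in-phase combination is lowered and
# the out-of-phase combination is raised by the coupling energy.
bonding, antibonding = two_level_energies(0.0, 0.0, 1.5)
print("homopolar gap:", antibonding - bonding)

# Different atoms on the two sites: the orbital energy difference adds
# to the gap in quadrature with the coupling.
bonding, antibonding = two_level_energies(3.0, -3.0, 4.0)
print("heteropolar gap:", antibonding - bonding)
```

In the homopolar case the gap is twice the coupling energy, so it vanishes when the orbitals do not overlap. The heteropolar case anticipates the mixed covalent and ionic bonding of Section V.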

The covalent energy gained by wave-function overlap or interference is much more sensitive to structural perfection than is the energy associated with classical ionic interactions. A very important consequence of this sensitivity is that it is possible to produce semiconductor crystals such as silicon in far purer and far more structurally perfect states than has been possible with any other solid. It is possible to add impurities with designed concentrations and locations to tailor the chemical and mechanical design of the solid with far greater precision and ease than for any other solid. Thus, the quantum mechanical nature of structure and bonding in covalent silicon is the key to its technological significance.

V. MIXED COVALENT AND IONIC BONDING

Most semiconductors and insulators have neither purely covalent nor purely ionic bonding, but their bonding is described as a mixture of covalent and ionic effects. The way in which the mixture occurs is of great importance both scientifically and technologically. We shall discuss several important examples.

The simplest case occurs for the tetrahedrally coordinated covalent structure shown in Fig. 2. This structure contains two kinds of atomic sites: site A with only B neighbors, and vice versa. In silicon and germanium both sites are occupied by the same atom, which has four valence electrons. However, one can occupy the two sites with different atoms, such that the total number of valence electrons is eight per atom pair (formally represented by $A^N B^{8-N}$). Many compounds of this type with N = 3 and N = 2 are known. Examples with N = 3 are GaAs and InP, together with the ternary and quaternary alloys (Ga, In)(As, P). These materials are transparent in the near infrared, and their optical properties can be adjusted by “band gap engineering.” They are important as high intensity, low power monochromatic light sources or as light amplifiers (lasers).

The next interesting case is the triatomic material SiO2 (silica). The electronegativity difference between silicon and oxygen is large, so the bonding here contains a large ionic component. At the same time each oxygen atom contains six valence electrons while silicon has four, so the total number of valence electrons per molecular unit is 16 = 8 + 8. This favors covalent bonding. In the solid each silicon atom has tetrahedral oxygen neighbors, while each oxygen atom has two silicon neighbors, which is again the coordination characteristic of covalent bonding. Silica is chemically stable and can be made very pure, for much the same reasons that silicon can. This high purity is essential to technological applications in the context of optical fibers for communications.

Another feature of silica is that it can easily be cooled into a solid state that is not crystalline but more like a frozen liquid. The state is called a glass. The ductility of glasses at high temperatures is essential to the manufacture of optical fibers. However, glasses are also ductile on a molecular scale and so do not form molecular “cracks,” which would be arrays of broken bonds that might be electrically active and destructive to the electronic capabilities of semiconductor devices such as transistors. It is one of nature’s most felicitous accidents that silicon electronic devices can be packaged by simply oxidizing the surfaces of solid silicon to form a protective coating of silica, SiO2. The silica coating is not only chemically stable (because of its covalent-ionic bonding), but is also mechanically stable because of the ductility of vitreous silica at the molecular level. Thus, the interface between the crystalline silicon electronic device and the silica coating is itself almost perfect. It does not store fixed charge, even when the thickness of the silica is only a few molecular layers. This greatly enhances the performance of microelectronic devices (integrated circuits on silicon “chips”).

The last case is primarily ionic materials with a covalent component. Many oxides in which the oxygen atoms are three- to sixfold coordinated fall in this class, and this includes many ceramic materials. These materials can have high melting points and good chemical stability, but they are brittle and for this reason their range of technological applications is limited.

VI. METALLIC BONDING

Broadly speaking, three kinds of elements are found in metals. They are the simple s–p metallic elements from the left-hand side of the periodic table, such as lithium, aluminum, and lead; the rare-earth and transition elements with f and d valence electrons, such as titanium, iron, and nickel; and metalloid elements, such as carbon, silicon, and phosphorus, which may also in certain combinations form covalent solids. In metals the coordination number (or number of nearest neighbors) is much larger (usually twice as large or more) than the number of valence electrons. This means that the directed valence bonds found in molecules or in covalent crystals are much weaker (although not completely absent) in metals.

The high electrical and thermal conductivity of metals is a result of the absence of a gap in the energy spectrum between filled and empty electronic states. This high electrical conductivity in turn reduces the contribution to cohesion associated with charge transfer because the internal electric fields are limited by electronic redistribution or charge flow on an atomic scale. Thus, ionic interactions are reduced in metals compared with ionic crystals.

The reduction of covalent or molecular bonding as well as ionic bonding in metals presents a paradox. If neither of these bonding mechanisms is fully effective, to what forces do metals owe their cohesion? Modern quantum theory shows a complex correlation of the motion of metallic valence electrons, which reduces the Coulomb repulsive energy between these like charges while leaving almost unchanged the attractive Coulomb interaction between negatively charged electrons and positively charged atom cores. It is this correlation energy that is primarily responsible for metallic cohesion.

From studies of the structure and cohesion of metals it appears that d valence electrons (as in the transition metals) contribute almost as effectively to metallic cohesion as s and p electrons. The f electrons in rare-earth metals, on the other hand, play a minor role in metallic cohesion but occasionally have magnetic properties. Transition metals are notable for their strong magnetic properties (iron, cobalt, and nickel), as well as their high melting points and refractory properties, which result from the large number of combined s, p, and d valence electrons. One of the most refractory compounds known is tungsten carbide (WC), an interesting combination of a transition element whose d levels are half-full with a metalloid element whose s and p valence levels are half-full. Also, here tungsten is very large and carbon is very small, which makes possible an ionic contribution to the cohesive and refractory properties.

VII. QUANTUM STRUCTURAL DIAGRAMS

The description of structure and bonding in solids given in the preceding sections is largely qualitative, but it is a fair (although abbreviated) account of most of what was generally known as a result of quantum mechanical analysis in the period from 1930 to 1960. Starting in 1960 a more quantitative description was developed that enables us to inspect systematic trends in structure and bonding with the aid of quantum structural diagrams.

With a structural diagram one assigns to each element certain characteristics and then treats these characteristics as configuration coordinates, which are used to construct structural maps. The natural classical configuration coordinates are atomic size and electronegativity, as defined by Pauling (see Table I). To these we may add the number of valence electrons per atom. One then takes a class of binary compounds, say $A^N B^M$, with the same value of P = N + M and uses size differences (or ratios) as well as electronegativity differences as Cartesian coordinates. If the characteristics or configuration coordinates have genuine value for describing structure and bonding, compounds composed of different elements A and B but with similar values of their Cartesian coordinates should have the same crystal structure. Put somewhat differently, the structural map should separate into simple regions, with each region containing compounds with the same crystal structure.

Early attempts to construct structural maps of this kind using classical coordinates were only partially successful; as many as 10 or 20% or more of the compounds were misplaced. From this failure most workers concluded that the problem of structure and bonding in solids, and especially in metals where the number of known compounds exceeds $10^4$, was simply too complex to solve in any simple way. Finding a solution was left to the indefinite future, when computers became large enough and quantum mechanical methods accurate enough to predict structures on a case-by-case basis.

Recent research has shown that the idea of structural diagrams is itself valid but that previous failures arose from the use of largely classical coordinates. In addition to the number of valence electrons per atom (a quantum concept), one must also use other quantum variables to replace the classical variables of atomic size and electronegativity. This has been done in several ways, which are substantially equivalent. The simplest case is $A^N B^M$ compounds where A and B have only s and p valence electrons and N + M = P = 8, which means that the s and p valence levels are half-full. In this case one can separate ionic and covalent crystal structures by separating the average energy gap between occupied and empty electronic states into ionic and covalent components, represented by C and $E_h$, respectively. Both NaCl (ionic) and diamond, silicon, and germanium (covalent) crystals (Fig. 2) belong in this group, with $C/E_h = 0$ in the latter and $C/E_h$ large in the former. The quantum structural diagram for $A^N B^{8-N}$ nontransition-metal compounds shown in Fig. 4 not only is a huge improvement on the classical structure diagram shown in Fig. 1 but also is an exact separation of covalent and ionic crystal structures.
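A separation of this kind can be sketched numerically. The Python example below assumes, beyond what the article states, the relations commonly used in the dielectric description of these compounds: the average gap satisfies $E_g^2 = E_h^2 + C^2$, the fractional ionicity is $f_i = C^2/E_g^2$, and a single critical ionicity near 0.785 divides fourfold from sixfold coordination. The function names, the threshold constant, and the input energies are all illustrative assumptions, not data from Fig. 4.

```python
import math

# Assumed critical ionicity separating fourfold from sixfold structures.
# This number is not given in the article; treat it as an assumption.
CRITICAL_IONICITY = 0.785


def average_gap(e_h, c):
    """Average gap from covalent (e_h) and ionic (c) parts added in quadrature."""
    return math.hypot(e_h, c)


def ionicity(e_h, c):
    """Fraction of the squared gap contributed by the ionic component."""
    return c * c / (e_h * e_h + c * c)


def map_region(e_h, c):
    """Coordination number of the map region containing the point (e_h, c)."""
    return 6 if ionicity(e_h, c) > CRITICAL_IONICITY else 4


# Hypothetical points in the (E_h, C) plane, energies in eV.
for label, e_h, c in [("purely covalent", 4.8, 0.0),
                      ("mixed", 4.0, 3.0),
                      ("strongly ionic", 3.0, 8.0)]:
    print(f"{label}: gap = {average_gap(e_h, c):.2f} eV, "
          f"f_i = {ionicity(e_h, c):.3f}, region = {map_region(e_h, c)}-fold")
```

Because the ionicity depends only on the ratio $C/E_h$, a constant critical value corresponds to a straight line through the origin of the map, so the two structural regions are separated by a single boundary line.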

FIGURE 4 The separation of the energy gap shown in Fig. 3 into covalent and ionic components ($E_h$ and C, respectively) generates a structural map that separates fourfold- and sixfold-coordinated $A^N B^{8-N}$ crystals perfectly (no transition or rare-earth elements). The plotted symbols distinguish the following structures, with coordination numbers in parentheses: diamond and zinc blende (4); wurtzite (4); rock salt (6); rock salt/wurtzite (6, 4). [From Cohen, M. L., Heine, V., and Phillips, J. C. (1982), Sci. Am. 246 (6), 82.]


FIGURE 5 A general separation of $A^N B^{8-N}$ crystal structures utilizes quantum coordinates defined for all elements including rare-earth and transition metals. Compounds containing the latter are indicated by open symbols. [From Villars, P. (1983). J. Less-Common Met. 92, 215.]

To extend this analysis to transition and rare-earth metals as well as compounds in which the valence shell is not exactly half-full is a monumental task that includes ∼1000 AB compounds, ∼1000 AB2 compounds, and more than 1000 AB3 and A3B5 compounds, as well as more than 7000 ternary compounds. The correct quantum coordinates for these 10,000 compounds have been identified from a field of 182 candidate coordinates, some classical and some quantum coordinates. All the best coordinates are found to be quantum coordinates, and these turn out to be the atomic ionization potential and a suitably defined quantum core size. The result for $A^N B^{8-N}$ compounds (where A or B or both may be transition or rare-earth elements) is shown in Fig. 5. It is representative of the best global analysis of structure and bonding in solids available in 1992. This structural map is 97% successful.

In addition to binary compounds one can use diagrams to analyze ternary compounds. Ternary ionic compounds usually contain two kinds of cations, and their structures are determined by cation radius ratios. Ternary metallic compounds are more complex, and their structures are determined by valence electron numbers, size differences, and electronegativity differences, much as for the binary compounds in Fig. 5. Many structure–property relationships can be recognized with these diagrams which conveniently display general trends in both binaries and ternaries.

VIII. COMPLETE QUANTUM STRUCTURE ANALYSIS

On a case-by-case basis a full discussion of structure and bonding in a given solid can be achieved using the most advanced computational techniques combined with the most sophisticated computers. Work with sufficient precision and flexibility to describe the structure of solid surfaces, point defects, and solid transitions under high pressures became available in selected cases in the 1980s. An excellent example is shown in Fig. 6, which gives the total energy of crystalline silicon in different crystal structures as a function of volume. From these curves transition pressures and volumes can be obtained from the tie-line (common tangent) construction due (∼100 years ago) to Gibbs. It is interesting that all the results shown in Figs. 4, 5, and 6 are based on a particular approach to the quantum structure of solids that is known as the pseudopotential method.
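The common tangent construction can be illustrated with two model energy curves. The Python sketch below assumes each phase has a parabolic energy curve $E(V) = E_0 + \tfrac{1}{2}k(V - V_0)^2$ and locates the pressure at which the two phases have equal enthalpy $E + PV$, which is equivalent to drawing the common tangent with slope $-P$. The parameter values are hypothetical and are not the silicon results of Fig. 6; the function names are likewise illustrative.

```python
def volume_at(phase, pressure):
    """Volume at which the parabolic curve has slope -pressure."""
    e0, v0, k = phase
    return v0 - pressure / k


def enthalpy(phase, pressure):
    """Minimum over V of E(V) + P*V for E(V) = e0 + 0.5*k*(V - v0)**2."""
    e0, v0, k = phase
    return e0 + pressure * v0 - pressure ** 2 / (2.0 * k)


def transition_pressure(phase_a, phase_b, p_low, p_high, tol=1e-12):
    """Bisect for the pressure at which the two enthalpies are equal."""
    def difference(p):
        return enthalpy(phase_a, p) - enthalpy(phase_b, p)

    if difference(p_low) * difference(p_high) > 0:
        raise ValueError("no enthalpy crossing inside the bracket")
    while p_high - p_low > tol:
        mid = 0.5 * (p_low + p_high)
        if difference(p_low) * difference(mid) <= 0:
            p_high = mid
        else:
            p_low = mid
    return 0.5 * (p_low + p_high)


# Hypothetical phases: (E0 in eV/atom, V0 in cubic angstrom/atom, k).
open_phase = (0.0, 20.0, 0.1)    # lowest energy at zero pressure
dense_phase = (0.3, 15.0, 0.1)   # higher energy, smaller volume

p_t = transition_pressure(open_phase, dense_phase, 0.0, 1.0)
print("transition pressure:", round(p_t, 6), "eV per cubic angstrom")
print("volumes at transition:",
      round(volume_at(open_phase, p_t), 3),
      round(volume_at(dense_phase, p_t), 3))
```

The two volumes printed at the end are the tangent points on the two curves, that is, the transition volumes referred to in the text.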

FIGURE 6 A plot of the total energy of silicon crystals in different crystal structures as a function of atomic volume. At atmospheric pressure the diamond structure has the lowest energy, but at pressures of hundreds of thousands of atmospheres silicon is more stable in other structures. Such high pressures can be produced in the laboratory, and they are also found at great depths below the earth’s surface. [From Chang, K. J., and Cohen, M. L. (1984). Phys. Rev. B 30, 5376.]


IX. CHEMICAL BONDING IN SOLIDS IN THE THIRD MILLENNIUM

The evolution of microelectronic devices towards smaller and smaller dimensions will soon reach the level of 2.5 nm (25 Å) or less, which is basically the molecular level. At this level the concepts of chemical bonding discussed here cease to be only theoretical abstractions and become valuable tools for guiding microelectronic device design and manufacture. A remarkable aspect of much recent research is that it demonstrates that both macroscopic and quantum ideas of materials can be implemented at the molecular level when the processes involved are well controlled.

SEE ALSO THE FOLLOWING ARTICLES

CRYSTALLOGRAPHY • EXCITONS, SEMICONDUCTOR • FERROMAGNETISM • GLASS • QUANTUM MECHANICS • SOLID-STATE CHEMISTRY • SOLID-STATE ELECTROCHEMISTRY • SUPERCONDUCTIVITY • VALENCE-BOND THEORY • X-RAY ANALYSIS

BIBLIOGRAPHY

Adams, D. M. (1974). “Inorganic Solids,” Wiley, New York.
Chang, K. J., and Cohen, M. L. (1984). Phys. Rev. B 30, 5376.
Cohen, M. L., Heine, V., and Phillips, J. C. (1982). Sci. Am. 246(6), 82.
Pauling, L. (1960). “Nature of the Chemical Bond,” Cornell Univ. Press, Ithaca.
Phillips, J. C. (1970). The chemical bond and solid state physics, Phys. Today 23 (February), 23.
Phillips, J. C. (1974). In “Solid State Chemistry” (N. B. Hannay, ed.), Vol. 1: The Chemical Structure of Solids, Plenum, New York.
Tosi, M. P. (1964). Solid-State Phys. 16, 1.
Villars, P. (1983). J. Less-Common Met. 92, 215.
Villars, P. (1985). J. Less-Common Met. 109, 93.
Villars, P., and Phillips, J. C. (1988). Phys. Rev. B 37, 2345.
Wigner, E. P., and Seitz, F. (1955). Solid-State Phys. 1, 1.


Encyclopedia of Physical Science and Technology EN002J-99 May 17, 2001 20:50

Chemical Physics
Richard Bersohn
Bruce J. Berne
Columbia University

I. Properties of Individual and Pairs of Molecules
II. Collective Properties

GLOSSARY

Born–Oppenheimer approximation A quantum mechanical explanation for the approximate separation of molecular energy into electronic, vibrational, and rotational energies.

Electric multipole moment If the charge density of a system is ρ(r, θ, φ) where r, θ, φ are spherical polar coordinates, then the lth multipole moments are the set of averages $\int \rho(r,\theta,\phi)\, r^{l}\, Y_{l}^{m}(\theta,\phi)\, dV$. The moments with l ≥ 1 of a spherically symmetric charge distribution are zero.

Green–Kubo relations Expressions for transport coeffi-cients such as viscosity, thermal conductivity, and rateconstants in terms of time correlation functions.

Molecular dynamics method A method for simulatingthe properties of many-body systems based on solvingclassical equations of motion.

Monte Carlo method A method for simulating the equi-librium properties of many-body systems based on ran-dom walks.

Normal coordinates The coordinates of a vibrating sys-tem that oscillate with a single frequency.

Partition function A sum over quantum states used to de-termine thermodynamic properties from the quantummechanical energy levels.

Path integral methods A formulation of quantum me-

chanics and quantum statistical mechanics developedby Feynman.

Radial distribution function The average density offluid atoms as a function of distance from a given fluidatom.

Raman scattering An inelastic scattering of a photon bya molecule; the difference in energy between the inci-dent and scattered photon is a difference of molecularenergy levels.

Spectroscopy The measurement of energy levels.Statistical mechanics A general theory of many parti-

cle systems that relates bulk properties to microscopicproperties.

Time correlation functions A function that describes thecorrelation between properties of a system at differenttimes.

CHEMICAL PHYSICS is the physics of the individual and collective properties of molecules. However, the distinction between chemistry and chemical physics is largely a matter of emphasis. The approach of the chemical physicist is theoretical. He searches for underlying theoretical principles, and the molecules that he uses are often a means to an end, whereas the synthetic chemist usually considers the molecules that he synthesizes and their reactions as ends in themselves. This article on chemical physics is divided into two sections, one on phenomena which depend primarily on the properties of individual and pairs of molecules and the other on phenomena which are primarily collective.

I. PROPERTIES OF INDIVIDUAL AND PAIRS OF MOLECULES

Studies in chemical physics can be loosely classified as spectroscopic, structural, and dynamic. Spectroscopy is concerned with the determination of molecular energy levels. Structural studies are aimed at finding the distribution of particles within a molecule and of molecules within a liquid or solid. The location of the nuclei defines the structure of the molecule, that is, the distances between nuclei and the angles between internuclear vectors. The distribution of electrons is intimately connected with the forces that hold the atoms together. Dynamics involves the relation of the rate of molecular transformations and changes of state caused by collision to the intra- and intermolecular forces.

A. Molecular Spectroscopy

Spectroscopy is the measurement of energy level differences. This is most usually accomplished by measuring the frequencies of light absorbed or emitted by a molecule, but it is sometimes done by measuring the energy of an incident photon or particle together with the energy of a scattered photon or particle. For example, the frequency of scattered light may differ from that of the incident light. The absolute value of the frequency difference is a difference of energy levels of the molecule divided by Planck's constant. This phenomenon, called Raman scattering, has many analogs. Electron loss spectroscopy is extensively used to measure vibrational frequencies of surfaces. The difference in energy between incident and scattered electrons is, in general, a quantized energy left in the solid. When very slow ("cold") neutrons are scattered by a warm liquid or solid, the scattered neutrons move faster than those in the incident beam.
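As a rough numerical illustration of the Raman relation described above, the following Python sketch converts incident and scattered wavelengths into the corresponding level difference expressed in cm$^{-1}$. The function names and the sample wavelength and shift are illustrative assumptions, not values taken from this article.

```python
# Minimal sketch: Raman shift from incident and scattered wavelengths.
# Function names and sample numbers are illustrative assumptions.

def wavenumber_cm1(wavelength_nm: float) -> float:
    """Convert a vacuum wavelength in nm to a wavenumber in cm^-1."""
    return 1.0e7 / wavelength_nm


def raman_shift_cm1(incident_nm: float, scattered_nm: float) -> float:
    """Absolute wavenumber difference between incident and scattered light."""
    return abs(wavenumber_cm1(incident_nm) - wavenumber_cm1(scattered_nm))


def stokes_wavelength_nm(incident_nm: float, shift_cm1: float) -> float:
    """Wavelength of scattered light that has lost `shift_cm1` to the molecule."""
    return 1.0e7 / (wavenumber_cm1(incident_nm) - shift_cm1)


if __name__ == "__main__":
    lam = stokes_wavelength_nm(500.0, 1000.0)  # hypothetical 1000 cm^-1 level spacing
    print(f"scattered wavelength: {lam:.2f} nm")
    print(f"recovered shift: {raman_shift_cm1(500.0, lam):.1f} cm^-1")
```

The absolute value mirrors the statement in the text: the scattered light may be shifted to either lower or higher frequency, and only the magnitude of the shift identifies the level spacing.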

In some spectroscopies the scattered particle whose energy is measured is not the same as the incident particle. For example, in photoelectron spectroscopy an incident photon with known energy whose wavelength is in the XUV (<100 nm) or X-ray region produces a photoelectron. Careful measurements of the kinetic energy of the electron yield energy level spacings in the positive ion. In photodissociation spectroscopy, when a molecule is dissociated into fragments by light, measurement of the kinetic energy of a fragment yields the internal energy of the fragments.

Before discussing spectroscopy, it might be appropriate to warn and console the reader about the multitude of units of energy used to describe spectroscopic phenomena. The SI unit of energy is the joule (1 J = $10^{7}$ ergs), but this unit is rarely used by spectroscopists. Just as the nuclear spectroscopist uses one million electron volts (1 MeV, equivalent to $9.65 \times 10^{10}$ J/mol) as a unit of energy, the X-ray spectroscopist uses electron volts (1 eV, equivalent to $9.65 \times 10^{4}$ J/mol). The ultraviolet, visible, and infrared spectroscopists use the cm$^{-1}$ unit, which is not an energy unit but a reciprocal of a wavelength (1 cm$^{-1}$ $\times$ $hc$ corresponds to 11.96 J/mol). The microwave and radiofrequency spectroscopists use the megahertz (1 MHz $\times$ $h$ corresponds to $3.99 \times 10^{-4}$ J/mol), and nuclear magnetic resonance (NMR) spectroscopists often use a dimensionless quantity, the relative frequency shift in parts per million. Some spectroscopists use units of energy, others units of reciprocal wavelength, and still others frequency units. Indeed the reference to spectral features by their wavelengths is still a very common practice, although it is to be deplored. There are natural reasons for these different choices of energy units but they are bewildering to the beginner. In a way, these units are like the light year, a nonstandard unit convenient to the astronomer but to nobody else.
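The conversion factors quoted above can be checked numerically. The Python sketch below uses the exact SI defining values of the fundamental constants and expresses each spectroscopic unit as an energy per mole, which is the basis on which the quoted figures are reproduced. The dictionary name and layout are illustrative choices.

```python
# Sketch: express common spectroscopic units as molar energies (J/mol).
# Constants are the exact SI (2019) defining values; names are illustrative.

H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light, m/s
E = 1.602176634e-19   # elementary charge in C, so 1 eV = E joules
N_A = 6.02214076e23   # Avogadro constant, 1/mol

J_PER_MOL = {
    "MeV": 1.0e6 * E * N_A,        # one million electron volts
    "eV": E * N_A,                 # electron volt
    "cm^-1": H * C * 100.0 * N_A,  # hc times 1 cm^-1 (= 100 m^-1)
    "MHz": H * 1.0e6 * N_A,        # h times 1 MHz
}

if __name__ == "__main__":
    for unit, value in J_PER_MOL.items():
        print(f"1 {unit:>5} corresponds to {value:.3e} J/mol")
```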

Molecular spectroscopy is generally divided into four branches, each corresponding to a different type of motion and, in general, to a different frequency range. Electronic spectroscopy is the study of the differences in electronic energy levels which occur in atoms and molecules; the corresponding frequencies are $10^{14}$–$10^{17}$ Hz. Vibrational spectroscopy is the study of molecular vibrations, whose frequencies are typically of the order of $10^{12}$–$10^{14}$ Hz. Rotational spectroscopy is the study of the rotational energy levels of molecules; corresponding rotational frequencies are typically in the range of microwaves, $10^{10}$–$10^{12}$ Hz. Finally, NMR spectroscopy is the study of the magnetic fields acting on a nucleus.

The given divisions are not quite as sharp as represented. Frequency ranges overlap, and electronic spectroscopy is often a source of information on vibrations and rotations as well as electronic motion. Nevertheless, this classification of spectra is of fundamental importance. The theoretical justification for this classification is the Born–Oppenheimer approximation. This is an approximation that exploits the fact that the electron is $10^{4}$–$10^{5}$ times lighter than a typical nucleus. Classically speaking, the electron revolves around the molecule so rapidly that during an electron period the nuclei do not have time to react to the different positions of the electron and barely move. Thus, electronic energies can be calculated with the assumption that the nuclei are stationary. They are not stationary, of course, but the electronic energy levels can (in principle) be calculated for any given arrangement of the nuclei. The arrangement or configuration that produces the least electronic energy is the equilibrium structure of the molecule.


Let us pause to consider the coordinates used to describe the shape and location of an isolated set of $N$ atoms. The configuration of a molecule with $N$ atoms is determined by $3N$ coordinates. Of these, six are external coordinates in that they describe the location of the center of mass and the spatial orientation of the molecule. The electronic energy, however, is independent of these external coordinates, that is, of the position and orientation of the molecule. Thus, there are $3N - 6$ internal coordinates ($3N - 5$ for a linear molecule), which are, in general, vibrations.
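The bookkeeping above can be written out as a short Python sketch. The function name and the returned dictionary are illustrative assumptions; the only physics used is the count stated in the text, together with the fact that a linear molecule has two rather than three rotational coordinates.

```python
# Sketch: partition the 3N nuclear coordinates of an N-atom molecule.
# Function name and return layout are illustrative assumptions.

def coordinate_counts(n_atoms: int, linear: bool = False) -> dict:
    """Count translational, rotational, and vibrational coordinates."""
    if n_atoms < 2:
        raise ValueError("a molecule needs at least two atoms")
    if n_atoms == 2 and not linear:
        raise ValueError("a diatomic molecule is necessarily linear")
    total = 3 * n_atoms
    translations = 3                 # location of the center of mass
    rotations = 2 if linear else 3   # no rotation about the axis of a linear molecule
    return {
        "total": total,
        "translations": translations,
        "rotations": rotations,
        "vibrations": total - translations - rotations,
    }


if __name__ == "__main__":
    print("H2O:", coordinate_counts(3))               # bent triatomic
    print("CO2:", coordinate_counts(3, linear=True))  # linear triatomic
```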

The electronic energy, considered as a function of the nuclear coordinates, behaves as a potential energy function for the nuclear motion. In a stable molecule, the atoms execute small vibrations about the position of the minimum of electronic energy. This statement implies that, in general, for each electronic state the molecule will have a different equilibrium structure. In mathematical terms, the potential in the $n$th electronic state can be expanded in powers of the displacement about the equilibrium position,

$$V_n(Q) = V_n(Q_0) + \sum_i \left(\frac{\partial V_n}{\partial Q_i}\right)_0 (Q_i - Q_{i0}) + \frac{1}{2} \sum_i \sum_j \left(\frac{\partial^2 V_n}{\partial Q_i\, \partial Q_j}\right)_0 (Q_i - Q_{i0})(Q_j - Q_{j0}) + \cdots$$

where $Q$ is the set of vibrational coordinates and $Q_0$ is their equilibrium value. If $Q_0$ is truly the equilibrium position, then all the first derivatives in the preceding equation will vanish. If the $Q_i$ are chosen to be normal coordinates, then all the cross derivatives vanish as well and the following equation is obtained:

$$V_n(Q) = V_n(Q_0) + \frac{1}{2} \sum_{i=1}^{3N-6} \left(\frac{\partial^2 V_n}{\partial Q_i^2}\right)_0 (Q_i - Q_{i0})^2 + \cdots$$

A normal coordinate is a coordinate that classically will vary sinusoidally in time with a single frequency. In contrast, an arbitrary nonnormal coordinate will vary in time with a number (up to $3N - 6$) of frequencies. The quantities $(\partial^2 V_n/\partial Q_i^2)_0$ are the force constants, the curvatures of the potential with respect to the normal coordinates $Q_i$. From this definition, it is clear that the force constant may be different in different electronic states and indeed even the normal coordinates may differ. The simplest example is a diatomic molecule where the equilibrium internuclear distance is different in two different electronic states.
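To make the idea of a force constant as a curvature concrete, the Python sketch below differentiates a model diatomic potential numerically at its minimum. The Morse form, the parameter values, and the function names are assumptions chosen for illustration; the article does not prescribe a particular potential. For a Morse potential $D(1 - e^{-a(r - r_e)})^2$ the analytic curvature at the minimum is $2Da^2$, which the finite-difference estimate should reproduce.

```python
# Sketch: force constant as the curvature of a model potential at its minimum.
# The Morse form and the parameter values are illustrative assumptions.
import math


def morse(r: float, d_e: float, a: float, r_e: float) -> float:
    """Morse potential energy, zero at the minimum r = r_e."""
    return d_e * (1.0 - math.exp(-a * (r - r_e))) ** 2


def force_constant(potential, r_e: float, step: float = 1.0e-4) -> float:
    """Second derivative of `potential` at r_e by central finite differences."""
    return (potential(r_e + step) - 2.0 * potential(r_e) + potential(r_e - step)) / step**2


def harmonic_frequency(k: float, reduced_mass: float) -> float:
    """Classical vibration frequency sqrt(k/mu) / (2 pi), in consistent units."""
    return math.sqrt(k / reduced_mass) / (2.0 * math.pi)


if __name__ == "__main__":
    d_e, a, r_e = 4.0, 2.0, 1.0  # arbitrary model units
    k = force_constant(lambda r: morse(r, d_e, a, r_e), r_e)
    print(f"numerical k = {k:.4f}, analytic 2*D*a^2 = {2 * d_e * a**2:.4f}")
```

Two electronic states with different well depths or widths would give different values of `k`, which is the sense in which the force constant depends on the electronic state.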

1. Electronic Spectroscopy

The electronic spectroscopy of atoms is clearly separated by energy domains into transitions of valence electrons and transitions of inner shell electrons. When two or more atoms combine to form a molecule, to a good approximation, core electrons remain core electrons but valence electrons occupy three types of states: bonding, nonbonding, and antibonding. For example, in the water molecule there are two core electrons in the oxygen inner shell, two pairs of bonding electrons, and two pairs of nonbonding electrons. Lower energy electronic transitions are often explained by promotion of electrons from nonbonding to antibonding states. At somewhat higher energies, electrons in bonding states may be promoted to antibonding states. At still higher energies, one has Rydberg series, which are sets of atomiclike transitions in which valence electrons are excited to progressively higher energy states where the electron circumnavigates the ion core at, typically, a considerable distance. Excitations of inner shell electrons require photons with energies in the X-ray region.

Electronic spectra are identified not only by their energies but also by two closely related quantities: their intensity and their polarization. Electromagnetic fields cause transitions in molecules by exciting resonances in the electronic or nuclear motion. The interaction energy is almost invariably expanded in multipoles, that is, in powers of the size of the molecule divided by the wavelength of light. This ratio is so small that for practical purposes only the electric and magnetic dipoles interacting with the external electric and magnetic fields, respectively, are considered. There is a fairly sharp distinction between the domains of spectroscopy in which the electric and the magnetic dipole moments are important in causing transitions. The electric dipole moment matrix elements are typically of the order of $e a_0$, the electronic charge multiplied by the Bohr radius. The magnetic dipole moment matrix elements are typically of the order of $e\hbar/mc$, the electronic charge multiplied by the Compton wavelength. However, $\hbar/mc$ is a factor of $e^2/\hbar c \simeq 1/137$ smaller than $a_0 = \hbar^2/me^2$. Transition probabilities are classically proportional to the square of the time derivative of the appropriate dipole moment, and quantum mechanically to the square of the matrix element between the initial and final states of that dipole moment operator. Hence, magnetic dipole transitions are typically four to five orders of magnitude weaker than electric dipole transitions. Magnetic dipole transitions are therefore rarely observed in molecular spectroscopy except in paramagnetic systems where only the magnetic dipole interaction can accomplish magnetic moment reorientation.
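The order-of-magnitude estimate in this paragraph follows from squaring the ratio of the two matrix elements, which is the fine-structure constant. The short Python sketch below carries out that arithmetic; the function name is an illustrative assumption and the result is an estimate of scale only, not a prediction for any particular transition.

```python
# Sketch: scale of magnetic versus electric dipole transition probabilities,
# estimated as the square of the fine-structure constant.

ALPHA = 1.0 / 137.036  # fine-structure constant e^2/(hbar c), dimensionless


def magnetic_to_electric_ratio(alpha: float = ALPHA) -> float:
    """Ratio of squared matrix elements, (e*hbar/mc)^2 / (e*a0)^2 = alpha^2."""
    return alpha ** 2


if __name__ == "__main__":
    print(f"probability ratio ~ {magnetic_to_electric_ratio():.1e}")
```

The ratio lies between $10^{-5}$ and $10^{-4}$, consistent with the four to five orders of magnitude quoted in the text.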

The polarization of an electric dipole transition is the direction along which the external time-dependent electric field must lie in order to cause the transition. Transitions in an atom placed in a potential, independent of angle, would be unpolarized in the sense that the transition probability would be independent of the direction of the time-dependent optical field. In all molecules except those with tetrahedral or octahedral symmetry, the absorption is anisotropic; in some directions there will be no absorption. The vector $\mu_{if}$, the dipole moment matrix element between the initial and final states, is often called the transition dipole. In a linear molecule, the transition dipole must lie either parallel or perpendicular to the molecular axis. In a planar molecule, typical of the majority of dyes, the transition dipole must lie either in the plane in some particular direction or perpendicular to the plane. The polarization of an absorption is most often determined by measuring the absorption of polarized light by an oriented set of molecules as in a single crystal.

2. Vibrational Spectroscopy

Infrared and Raman spectroscopy are used to determine the $3N - 6$ vibrational frequencies of a molecule with $N$ atoms ($3N - 5$ if the molecule is linear). There are both experimental and theoretical problems. On the experimental side, the intensity of a vibrational transition depends on the square of the rate of change of the dipole moment during the vibration. However, certain modes have no accompanying change in dipole moment and therefore electric dipole transitions are forbidden; examples of these modes are the symmetric stretch of CO$_2$ and the vibration of N$_2$. Raman spectroscopy often can supply the missing information. On the theoretical side, the $3N - 6$ vibrational frequencies once determined must be interpreted in terms of up to $(3N - 6)(3N - 5)/2$ force constants, that is, the independent second derivatives of the potential function. One instructive point has emerged: vibrational frequencies of most molecules display very little difference between the condensed phase and the gas phase. Obvious exceptions to this rule are vibrations in such strongly interacting molecules as the water molecule.

One of the frontiers of chemical physics is the vibrationally highly excited "hot" molecule. Vibrations are no longer neatly divided into a set of harmonic normal mode oscillations. Vibrations are strongly anharmonic. Vibrations and rotations are no longer clearly separated and the nature of the states is difficult to describe. Yet it is precisely these energy-rich molecules that are the precursors to chemical reactions, whether they are unimolecular or bimolecular.

3. Radiofrequency Spectroscopy

Radiofrequency spectroscopy comprises many branches including nuclear magnetic resonance, nuclear quadrupole resonance, electron spin resonance, atomic beam resonance, optical pumping of atoms, and microwave spectroscopy of rotating molecules. The results of the last technique are described in the section on molecular structure. Nuclear magnetic resonance has contributed vastly more to chemistry than the other techniques.

The development of nuclear magnetic resonance is an extraordinary chapter in the history of science in which the original goal, while eventually achieved, was dwarfed by an unforeseen discovery. Nuclei with odd numbers of neutrons or protons have a net quantized angular momentum $I\hbar$, where $I = \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$. The nuclear angular momentum implies a rotation of positive charge which in turn produces a magnetic moment. The existence of nuclear magnetic moments was manifested by the hyperfine structure of atoms, an interaction energy between the nuclear magnetic moment and a magnetic field generated by the electrons. Nuclear magnetic moments had also been detected by atomic beam experiments in which atomic beams were deflected in different directions by inhomogeneous magnetic fields; radio-frequency transitions changed the relative intensities. However, the nuclear magnetic moment could be determined only if the effective magnetic field of the electron could be calculated; in the late 1940s, when NMR was developed, computers adequate to calculate these fields did not exist. Nuclear magnetic resonance measurements on solid or liquid samples seemed like the perfect solution. The moments $\mu$ could be derived from the simple equation $h\nu = \mu B/I$, where $h$ is Planck's constant, $B$ is the magnetic induction, and $\nu$ is the frequency used to excite the transition between adjacent levels among the $2I + 1$ magnetic sublevels of the nucleus.

In practice, it was found that the apparent magnetic moment varied with the chemical environment of the nuclei. The effect was due to a local magnetic susceptibility, which, in general, increased with the number of atomic electrons. For the particular cases of the nuclei $^{1}_{1}$H (the proton) and $^{13}_{6}$C, resonance fields (or frequencies) differ at various nonequivalent positions and this fact serves as a decisive structural tool for organic molecules. As important as the structure determination is, the really unique feature is the ability to measure exchange rates between different environments. For example, cyclohexane has six axial and six equatorial hydrogen atoms that interchange as the molecule converts from one chair form through the boat form to a second equivalent chair form. At room temperature, only a single resonance line is seen because the two sites interchange at a rate much faster than the frequency difference between them. When the liquid is cooled, the rate decreases and two lines are seen. The contributions of NMR to solid-state physics, all branches of chemistry, biochemistry, and even medicine through the imaging of internal organs have been overwhelming. The original targets, the magnetic moments of the stable nuclei, have all been measured and tabulated in reference books but a theory has not been developed to explain them.
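The competition between exchange rate and frequency separation can be sketched numerically. The Python fragment below uses the standard textbook coalescence condition for two equally populated sites, $k_c = \pi\,\Delta\nu/\sqrt{2}$, which is not stated in this article and is introduced here as an assumption; the frequency separation and the two exchange rates are hypothetical numbers chosen only to show the two regimes.

```python
# Sketch: two-site exchange regimes for equally populated sites.
# The coalescence formula is the standard textbook result, assumed here;
# the frequency separation and rates below are hypothetical.
import math


def coalescence_rate(delta_nu_hz: float) -> float:
    """Exchange rate (s^-1) at which two equal lines separated by delta_nu merge."""
    return math.pi * delta_nu_hz / math.sqrt(2.0)


def exchange_regime(rate_s: float, delta_nu_hz: float) -> str:
    """Classify the spectrum as one averaged line or two resolved lines."""
    if rate_s > coalescence_rate(delta_nu_hz):
        return "single averaged line"
    return "two resolved lines"


if __name__ == "__main__":
    delta_nu = 30.0  # hypothetical axial/equatorial separation in Hz
    for rate in (1.0e5, 1.0):  # hypothetical fast (warm) and slow (cold) exchange
        print(f"k = {rate:g} s^-1 -> {exchange_regime(rate, delta_nu)}")
```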

B. Molecular Structures

Molecular structure is most often determined by scattering methods. A beam of photons or material particles (chosen so that their wavelengths are comparable to close internuclear distances) is directed at the sample. At particular scattering angles, constructive interference will occur between the waves scattered by pairs of atoms or, in a crystal, planes of atoms. By measuring the relative scattered intensities at the various angles at which constructive interference occurs, a structure may be inferred. The structure of a condensed phase is probed by X-rays or neutrons; the structure of a gas molecule or a surface is commonly determined by electron scattering.

In several types of spectroscopy, molecular rotational energies are determined from which the three moments of inertia can be deduced. By measuring the spectra of the same molecule with different nuclear isotopes, additional sets of moments of inertia can be obtained.

Molecular structure is the distribution of nuclei and electrons within a molecule. The location of the nuclei, often just called the structure, is the molecular architecture. Closely connected to this architecture is the distribution of electrons circumnavigating the heavy nuclei. The techniques for determining the nuclear positions are based on interferences in scattering (diffraction) or spectroscopic measurements of moments of inertia.

1. Diffraction Methods

When an incident wave is scattered by a molecule, the scattered wave consists of a sum of contributions, large and small, from each of the electrons and nuclei in the molecule. The scattered intensity, which is the square of the scattered amplitude, will consist of a sum of squares of the individual amplitudes plus a sum of products of scattered amplitudes originating from different particles. The former sum is independent of the distances between particles, whereas the latter is structure dependent. The constructive interference between the different amplitudes gives rise to significant diffraction peaks when the interparticle distances are comparable to the wavelength of the incident wave. In practice, this means that the wavelength must be of the order of an angstrom (1 Å = $10^{-10}$ m). The most common "particles" used for scattering have been the X-ray ($\lambda \sim 1$–2 Å) and the velocity-selected slow neutron ($\lambda \sim 1$ Å). In a few cases, especially for diffraction from surfaces, hydrogen atoms and helium atoms have been used.
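To give a feeling for what a wavelength of about one angstrom implies for each probe, the Python sketch below evaluates the photon energy $hc/\lambda$ and the neutron speed and kinetic energy from the de Broglie relation $\lambda = h/mv$. The function names are illustrative assumptions; the constants are standard values.

```python
# Sketch: photon and neutron properties at a wavelength of about one angstrom.
# Function names are illustrative; constants are standard SI/CODATA values.

H = 6.62607015e-34          # Planck constant, J s
C = 2.99792458e8            # speed of light, m/s
EV = 1.602176634e-19        # joules per electron volt
M_NEUTRON = 1.67492750e-27  # neutron mass, kg (rounded CODATA value)


def photon_energy_ev(wavelength_m: float) -> float:
    """Photon energy h c / lambda, in eV."""
    return H * C / wavelength_m / EV


def neutron_speed(wavelength_m: float) -> float:
    """Neutron speed from the de Broglie relation lambda = h / (m v), in m/s."""
    return H / (M_NEUTRON * wavelength_m)


def neutron_energy_ev(wavelength_m: float) -> float:
    """Nonrelativistic neutron kinetic energy, in eV."""
    return 0.5 * M_NEUTRON * neutron_speed(wavelength_m) ** 2 / EV


if __name__ == "__main__":
    lam = 1.0e-10  # one angstrom
    print(f"X-ray photon:  {photon_energy_ev(lam) / 1e3:.2f} keV")
    print(f"neutron speed: {neutron_speed(lam):.0f} m/s")
    print(f"neutron energy: {neutron_energy_ev(lam) * 1e3:.1f} meV")
```

The contrast in energy scale (keV for the photon, tens of meV for the neutron) is the reason slow neutrons can exchange measurable energy with thermal motions in a sample.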

X-ray diffraction from single molecular crystals has been our most important source of knowledge of molecular structure. The assumption, in most cases solidly confirmed by experiment, is that the forces that hold molecules together in the solid are much weaker than the interatomic forces that determine the molecular structure. Therefore, the molecular structure will differ negligibly among the solid, liquid, and gas phases. There are, of course, exceptions to this rule such as PCl$_5$, which is PCl$_4^+$ PCl$_6^-$ in the solid state, but the rule is generally valid.

Nuclei scatter X-rays to a negligible extent because they are so heavy that they do not react to the time-varying electric field of the photon. The electrons will contribute a total scattering amplitude proportional to:

$$f = \int \rho(\mathbf{r})\, e^{i \mathbf{s} \cdot \mathbf{r}}\, dV$$

where $\mathbf{r}$ is a vector whose origin is at the nucleus, $\rho(\mathbf{r})$ is the electron density at the point $\mathbf{r}$, $\mathbf{s} = 2\pi(\mathbf{k} - \mathbf{k}_0)$, and $dV$ is a volume element. The wave vectors $\mathbf{k}$ and $\mathbf{k}_0$ are those of the scattered and incident X-rays, respectively. The atomic scattering factor $f$ varies with the atom, being larger for atoms with more electrons. The amplitude for scattering from an entire unit cell of a crystal is represented (approximately) as a sum over atomic scattering amplitudes:

$$F = \sum_j f_j\, e^{i \mathbf{s} \cdot \mathbf{R}_j}$$

where $f_j$ is the scattering amplitude of the $j$th atom, whose mean position is at the point $\mathbf{R}_j$ within the unit cell. There will be constructive interference (that is, Bragg peaks) whenever $\mathbf{s} \cdot (\mathbf{R}_j - \mathbf{R}_k)$ is an integral multiple of $2\pi$. There will be a large number of planes of atoms in the crystal for which this Bragg condition will be satisfied and therefore the crystal structure is in principle overdetermined.
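The unit-cell sum can be evaluated directly for a simple case. The Python sketch below assumes that $\mathbf{s}$ is taken at a reciprocal lattice point with integer indices $(h, k, l)$ and that atomic positions are given in fractional coordinates, so that $\mathbf{s} \cdot \mathbf{R}_j = 2\pi(hx_j + ky_j + lz_j)$; it also treats each $f_j$ as a constant. These conventions, the function name, and the two-atom body-centered cell are illustrative assumptions rather than material from this article.

```python
# Sketch: unit-cell scattering amplitude F = sum_j f_j exp(i s . R_j).
# Assumes s lies at a reciprocal lattice point (h, k, l) and positions are
# fractional, so s . R_j = 2 pi (h x + k y + l z); f_j are taken as constants.
import cmath
import math


def structure_factor(hkl, atoms):
    """Sum the atomic amplitudes with their interference phase factors."""
    h, k, l = hkl
    return sum(
        f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
        for f, (x, y, z) in atoms
    )


if __name__ == "__main__":
    # Hypothetical body-centered cell with two identical atoms of amplitude 1.
    bcc = [(1.0, (0.0, 0.0, 0.0)), (1.0, (0.5, 0.5, 0.5))]
    for hkl in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0)]:
        print(hkl, f"|F| = {abs(structure_factor(hkl, bcc)):.3f}")
```

In this example the two amplitudes cancel whenever $h + k + l$ is odd and add whenever it is even, a simple instance of the structure-dependent interference described above.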

As the integral for $f$ shows, the atomic scattering amplitude is roughly proportional to the number of electrons in the atom. Thus, it is particularly difficult to observe scattering interferences from hydrogen atoms. Neutron-scattering amplitudes depend on nuclear cross sections which are of comparable magnitudes for all nuclei. An identical expression for $F$ is obtained except that the $f_j$ are now nuclear scattering amplitudes. Because these amplitudes are small, neutron diffraction is carried out with condensed phases and not with gases.
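The proportionality to the number of electrons can be made explicit in the forward-scattering limit. Setting $\mathbf{s} = 0$ in the atomic scattering factor, and assuming that $\rho(\mathbf{r})$ is normalized to the number of electrons $Z$ of the atom (a normalization convention assumed here), gives

```latex
f(\mathbf{s} = 0)
  = \int \rho(\mathbf{r})\, e^{i\,\mathbf{0} \cdot \mathbf{r}}\, dV
  = \int \rho(\mathbf{r})\, dV
  = Z .
```

For hydrogen $Z = 1$, whereas for iodine $Z = 53$, so in a molecule containing both the hydrogen contribution to $F$ is small compared with that of the heavy atom.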

Electron diffraction has also been used for determination of molecular structure. Electrons interact far more strongly with matter than X-rays and can penetrate only a few hundred angstroms into a crystal. They have been used, therefore, for structural measurements of solid surfaces and gaseous molecules. The fact that the gaseous molecules exhibit all orientations to an incoming electron beam greatly reduces the information content of the experiment. In other words, only systems with a small number of structural parameters can be effectively studied. The scattering amplitudes for individual atoms include a nuclear as well as an electronic term, but the essence of the structurally dependent interference remains the same.


2. Structure Inferred from Moments of Inertia

At very high resolution, electronic spectra exhibit rotational structure from which the moments of inertia of both the ground and excited electronic states can be extracted. Similarly, vibrational spectra taken at very high resolution will yield the moments of inertia in the ground and the excited vibrational states. Rotational Raman spectra have yielded moments of inertia of small symmetric molecules. The most extensive and accurate source by far of moment of inertia data is microwave spectroscopy. This technique is applicable to any polar gaseous molecule.

Frequencies in microwave spectroscopy, which are directly proportional to inverse moments of inertia, are usually measured to five or six significant figures. It would thus appear that bond distances are determined to better than 0.001 Å. In reality, there are usually more structural parameters required than measured moments of inertia. For example, in CH$_3$I, a pyramid-shaped molecule, the structure is determined by two bond distances and a bond angle. Because of the threefold symmetry axis, only two moments of inertia are measurable. The usual solution is an isotopic substitution (for example, deuterium for hydrogen). It is often found, however, that a unique structure cannot be found consistent with all the moments of inertia. The reason is that the different vibrational wave functions in the isotopically substituted molecules yield slightly different average structures.
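For a diatomic molecule the step from a measured rotational frequency to a bond distance is a one-line inversion. The Python sketch below assumes the rigid-rotor relations $B = h/(8\pi^2 I)$ (with $B$ in Hz) and $I = \mu r^2$, which are standard but not written out in this article. The rotational constant used for $^{12}$C$^{16}$O, about 57.6 GHz, and the atomic masses are approximate values included only for illustration.

```python
# Sketch: bond length of a diatomic molecule from its rotational constant.
# Assumes a rigid rotor, B = h / (8 pi^2 I) with I = mu r^2; the CO numbers
# below are approximate and used only for illustration.
import math

H = 6.62607015e-34    # Planck constant, J s
AMU = 1.66053907e-27  # atomic mass unit, kg


def moment_of_inertia(rot_const_hz: float) -> float:
    """Moment of inertia (kg m^2) from a rotational constant in Hz."""
    return H / (8.0 * math.pi**2 * rot_const_hz)


def reduced_mass_kg(m1_amu: float, m2_amu: float) -> float:
    """Reduced mass of two atoms whose masses are given in atomic mass units."""
    return m1_amu * m2_amu / (m1_amu + m2_amu) * AMU


def bond_length_m(rot_const_hz: float, m1_amu: float, m2_amu: float) -> float:
    """Internuclear distance r = sqrt(I / mu), in meters."""
    return math.sqrt(moment_of_inertia(rot_const_hz) / reduced_mass_kg(m1_amu, m2_amu))


if __name__ == "__main__":
    r = bond_length_m(57.636e9, 12.000, 15.9949)  # approximate values for 12C16O
    print(f"r(CO) ~ {r * 1e10:.3f} angstrom")
```

With only one structural parameter the inversion is unique. The difficulty described in the text arises for polyatomic molecules, where the number of bond lengths and angles exceeds the number of measurable moments of inertia.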

C. Dynamics of Molecular Processes

A system at equilibrium is characterized by a balance of opposing forces and reactions so that its macroscopic properties are independent of time. Time-dependent processes are caused by deviations from thermal equilibrium. For example, if there is a spatial gradient in temperature, electric potential, momentum, or concentration, equilibrium is restored by the transport of energy, charge, momentum, or identity, respectively. The ratios between the fluxes of these properties and the corresponding gradients are called the transport coefficients, namely, the coefficients of thermal conductivity, electrical conductivity, viscosity, and diffusion, respectively.

Even in a spatially uniform system there can be substantial departures from equilibrium. The system may not be in an equilibrium distribution over its internal states and often approaches equilibrium ("relaxes") exponentially with a rate characteristic of each degree of freedom, which is the reciprocal of the relaxation time. For example, there are electronic relaxation times, vibrational relaxation times, rotational relaxation times, and even nuclear or electronic spin lattice relaxation times. There are great variations in the rate of approach to equilibrium of these various degrees of freedom, ranging from $10^{12}$ sec$^{-1}$ to $10^{-3}$ sec$^{-1}$.

When the chemical composition of a system changes with time, it is less clear how to express this change mathematically. When dealing with a reaction, for example, of the form:

$$\mathrm{A} + \mathrm{B} \rightleftharpoons \mathrm{X}_1 \rightleftharpoons \mathrm{X}_2 \cdots \rightleftharpoons \mathrm{X}_n \rightarrow \mathrm{C} + \mathrm{D}$$

where A and B are reactants, C and D are products, and the X$_i$ are intermediates, known or unknown, one can often fit the observations to an empirical formula with one or more terms of the following form:

$$\frac{d(\mathrm{C})}{dt} = k\,(\mathrm{A})^{a}(\mathrm{B})^{b}(\mathrm{C})^{c}(\mathrm{D})^{d}$$

where $a$, $b$, $c$, and $d$ are constants that are not necessarily integers and the parentheses indicate concentrations of the enclosed species. The constant $k$ is called a rate constant. The complexity of this equation is due to the complexity of the reaction mechanism. If the reaction were really elementary (that is, if there were no intermediates), then the equation,

$$\frac{d(\mathrm{C})}{dt} = k\,(\mathrm{A})(\mathrm{B})$$

would express the fact that the reaction takes place as a consequence of collisions of A and B molecules.
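The elementary rate law can be integrated numerically as a check on its consequences. The Python sketch below assumes an irreversible elementary step A + B → C + D in which each event consumes one A and one B, so that (A) = (A)$_0$ − (C) and (B) = (B)$_0$ − (C). The function names, the fourth-order Runge–Kutta scheme, and the numerical values are illustrative choices, not taken from this article. For equal initial concentrations the closed-form result (C) = (A)$_0$ − (A)$_0$/(1 + k(A)$_0$t) is available for comparison.

```python
# Sketch: integrate d(C)/dt = k (A)(B) for an irreversible elementary step.
# Assumes (A) = a0 - (C) and (B) = b0 - (C); names and numbers are illustrative.

def integrate_bimolecular(a0: float, b0: float, k: float, t_end: float,
                          steps: int = 10000) -> float:
    """Return the product concentration (C) at t_end using fourth-order Runge-Kutta."""
    dt = t_end / steps
    c = 0.0

    def rate(c_val: float) -> float:
        return k * (a0 - c_val) * (b0 - c_val)

    for _ in range(steps):
        k1 = rate(c)
        k2 = rate(c + 0.5 * dt * k1)
        k3 = rate(c + 0.5 * dt * k2)
        k4 = rate(c + dt * k3)
        c += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return c


def analytic_equal(a0: float, k: float, t: float) -> float:
    """Closed form for equal initial concentrations of A and B."""
    return a0 - a0 / (1.0 + k * a0 * t)


if __name__ == "__main__":
    numeric = integrate_bimolecular(1.0, 1.0, 0.5, 4.0)
    print(f"numerical (C) = {numeric:.6f}, analytic (C) = {analytic_equal(1.0, 0.5, 4.0):.6f}")
```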

Except for radiative and nonradiative decay of electronically excited molecules, the approach to equilibrium of a set of molecules is always accomplished by intermolecular interactions of some sort. In a low-pressure gas, the central event is a bimolecular collision. A collision changes the states of the pair of molecules involved in it. The most gentle change of state is the elastic scattering that involves only a rotation of the relative momentum of the pair. The probability of such an event is expressed by a differential cross section, $d\sigma/d\Omega$, which gives the number of pairs whose relative momentum is scattered into $d\Omega$ (at a certain orientation, $\Omega$) per unit time, per unit incident flux. The integral of this differential cross section over all solid angles is called the total scattering cross section.
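The relation between the differential and total cross sections can be illustrated with a case whose answer is known. The Python sketch below integrates an azimuthally symmetric $d\sigma/d\Omega$ over all solid angles and applies it to classical hard spheres, for which $d\sigma/d\Omega = d^2/4$ is isotropic and the total cross section is $\pi d^2$, where $d$ is the distance between centers at contact. The hard-sphere model, the function names, and the midpoint integration rule are assumptions made for illustration.

```python
# Sketch: total cross section as the solid-angle integral of dsigma/dOmega.
# The classical hard-sphere model is an illustrative assumption.
import math


def total_cross_section(dsigma_domega, n_theta: int = 2000) -> float:
    """Integrate an azimuthally symmetric dsigma/dOmega(theta) by the midpoint rule."""
    d_theta = math.pi / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        # Solid-angle element for azimuthal symmetry: 2 pi sin(theta) d(theta).
        total += dsigma_domega(theta) * 2.0 * math.pi * math.sin(theta) * d_theta
    return total


def hard_sphere(contact_distance: float):
    """Classical hard-sphere differential cross section, isotropic and equal to d^2/4."""
    return lambda theta: contact_distance**2 / 4.0


if __name__ == "__main__":
    d = 2.0  # arbitrary length units
    sigma = total_cross_section(hard_sphere(d))
    print(f"numerical sigma = {sigma:.5f}, pi d^2 = {math.pi * d**2:.5f}")
```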

In perhaps the most famous experiment of atomic physics, Rutherford demonstrated that the deflection of $\alpha$ particles could be connected with the Coulomb law of force between nuclei and $\alpha$ particles. If the potential energy is known in advance, classical or quantum equations of motion can be used to determine the differential scattering cross section. In principle, the process can be reversed; that is, the theoretical potential energy acting between the two scatterers can be deduced from the experimental differential cross section. Indeed in principle if the scatterers are small enough, preferably atomic in nature, the potential energy can be calculated using the Schrödinger equation of quantum mechanics. A large amount of research in atomic scattering may be briefly summarized as follows. Atoms interact with a very steeply rising short-range potential and a relatively weak long-range attraction. The long-range attraction gives rise to very large scattering at small angles superimposed on a much less intense isotropic scattering due to the short-range repulsion.
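A widely used model with exactly these two features, a steep short-range repulsion and a weak long-range attraction, is the Lennard-Jones 12-6 potential. It is not named in this article and is introduced here only as an illustrative assumption, as are the reduced units and function names in the Python sketch below.

```python
# Sketch: a model interatomic potential with steep repulsion and weak attraction.
# The Lennard-Jones 12-6 form is an illustrative assumption, in reduced units.

def lennard_jones(r: float, epsilon: float = 1.0, sigma: float = 1.0) -> float:
    """V(r) = 4 epsilon [(sigma/r)^12 - (sigma/r)^6]."""
    x6 = (sigma / r) ** 6
    return 4.0 * epsilon * (x6 * x6 - x6)


def lj_minimum(sigma: float = 1.0) -> float:
    """Separation at which the potential reaches its minimum value of -epsilon."""
    return 2.0 ** (1.0 / 6.0) * sigma


if __name__ == "__main__":
    for r in (0.9, 1.0, lj_minimum(), 2.0, 3.0):
        print(f"r = {r:.3f}  V = {lennard_jones(r):+.4f}")
```

Inside $r = \sigma$ the energy climbs very rapidly, while beyond the minimum it decays slowly toward zero, which is the qualitative shape summarized in the text.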

Elastic collisions can relax translational energies, pro-ducing a Boltzmann distribution of translational energies.However, relaxations of internal degrees of freedom canonly be accomplished by inelastic collisions in which en-ergy is exchanged between internal degrees of freedomand translation or between internal degrees of freedom ofboth molecules. The efficiency of these inelastic collisionsexpressed by the magnitudes of the inelastic cross sectionsdepends on the nature and strength of the interaction be-tween the molecules. Specifically, the intermolecular po-tential must depend on the molecular orientations in orderto cause changes of rotational state and on an internal nor-mal coordinate in order to cause changes in vibrationalstates. As a general rule, rotational equilibrium is usuallyachieved with only five to ten collisions, but vibrationalequilibrium requires hundreds to thousands of collisions.The reason is simply that intermolecular potential energiesare much more dependent on molecular orientation than onthe relatively small vibrational amplitudes. A more subtlepoint is the following. The probability of an inelastic colli-sion transferring an amount of energy hω between a moreor less random translation and an internal degree of free-dom is proportional to the spectral density of the randominteraction at the frequency ω. The typical frequency asso-ciated with translational motion during a collision wouldbe the reciprocal of a collision duration. This frequencyis just right for rotations, too small for vibrations, and toolarge for nuclear magnetic energy level separations. Be-cause V → T (vibration to translation) relaxation is slow,another form of vibrational relaxation occurs earlier whichis called V → V′. These V → V′ transitions between vi-brational states of nearly the same energy, for example,

2HBr(v = 1) = HBr(v = 0) + HBr(v = 2)

have much larger cross sections than a V → T process such as:

HBr(v = 1) + M = HBr(v = 0) + M

where M is another molecule.

There is a hierarchy of relaxation problems in dilute gases. One begins with macroscopic transport processes whose transport coefficients can be expressed in terms of averages of scattering cross sections. These in turn can, in principle, be calculated from an intermolecular potential energy. Spectroscopically, the evolution of an initially nonequilibrium distribution over internal states can be followed toward a Boltzmann distribution. The relaxation rate can be expressed in terms of inelastic scattering cross sections that can, again in principle, be calculated from the intermolecular potential energy function.

The chemical reaction itself can be treated as a collision problem in mechanics. The measured temperature-dependent rate constant for a chemical reaction, k(T), is an average over the Boltzmann distributions for both translational and internal energies. There is not enough information in the function k(T) to permit a potential energy function (a surface) to be extracted by inversion. Chemical dynamics, an active area of research, is the study of reactions in which some form of state selection of the reactants or the products has been carried out. The state selection can take many forms. If the reactant molecules are formed into molecular beams, the direction and sometimes also the magnitude of the relative velocity will be known, and the differential reaction cross section can be determined. A reactant can be excited to a specific vibrational state and the effect on the cross section can be measured. Similarly, the vibrational and rotational state distribution of product molecules can sometimes be determined by laser-induced fluorescence, multiphoton ionization, or infrared emission. When sufficient information is obtained, it should ultimately be possible to extract a potential surface on which the reactants move and are transformed into products. The assumption here is that the set of reactant molecules and the set of product molecules correspond to different potential minima on the same surface. If this is not the case, the reaction is said to be nonadiabatic. The theoretical problem is then more complicated: two potential surfaces must be extracted, as well as a perturbing potential that couples them.

D. Lasers in Chemical Physics

The laser is a light source possessing intensity, directionality, coherence, monochromaticity, and (often) tunability. In the past few decades, lasers have altered the face of most branches of experimental chemical physics. The resulting enhanced knowledge of the physical world has greatly extended the applicability of theory.

Lasers are of two types, cw (steady state) and pulsed. The cw laser can be made extremely narrow in frequency, down to kHz and even less. This narrow line width permits resolution of the rotational structure of all but the heaviest gas-phase molecules and the hyperfine structure of molecules with nonzero spin. The laser line width is typically narrower than the Doppler-broadened spectral line of a gas-phase molecule or atom. This gives rise to a new spectroscopy called Doppler spectroscopy, which measures the distribution of the velocity component in the direction of the probing laser. In solution, the narrow line width permits measuring the width of the Rayleigh scattered light, thus permitting rapid measurement of the molecular diffusion coefficient and molecular mass.


Pulsed lasers have durations in the nanosecond, picosecond, and even femtosecond domains. Strong lamps, both cw and pulsed, have been used in photochemistry for a long time, and their average power can be comparable with that of a laser. However, the short pulse duration of a nanosecond (ns) laser enables the generation by photodissociation of a much higher instantaneous concentration of reactive intermediates such as atoms or radicals. These or their subsequent reaction products are in sufficient concentration that their nascent quantum state distribution can be probed. This measurement must be done under single-collision conditions in a time much shorter than the time required for thermalization of the rovibrational distribution. No such measurements are possible with a conventional light source.

The simultaneous or nearly simultaneous absorption of two or more quanta is a phenomenon unique to the laser. When a strong infrared laser is focused on a molecule, sometimes not just two but a large, somewhat indeterminate number of quanta are absorbed, resulting in a superhot molecule. Focusing a visible or ultraviolet laser on a molecule often results in ionization, nonresonant or resonant. The latter process is called resonance-enhanced multiphoton ionization (REMPI). With its aid, a molecule in a given rovibrational ground state, which is extremely difficult to detect, is converted to an easily detectable ion. A technique of ion imaging has been developed in which a molecular fragment or reaction product is ionized and then projected down a tube by a strong electric field. An array detector generates a pattern of rings. The radius of a ring yields the transverse velocity of the detected molecule, and the variation of intensity around the ring is a measure of the anisotropy of molecules with this particular speed.

Another nonlinear optical phenomenon unique to the laser is second harmonic generation, in which strong light at frequency ω is partially converted to light at frequency 2ω. This is observed primarily in media lacking a center of symmetry and is especially suited to probing the interface between two media. A variation of the technique, sum frequency generation, permits measurement of the infrared absorption of the interface layer.

Picosecond lasers make possible the measurement of extremely fast processes, such as the lifetime of an electron in a conduction band in a semiconductor. The use of femtosecond (fs) pulses allows the observation in real time of even faster processes, such as the separation of fragments following an initial excitation. The separation time is typically <100 fs for a direct allowed transition, but it can be longer if the fragments are both heavy or if the dissociation is indirect (i.e., from a lower electronic state than the one originally excited). Consistent with the uncertainty principle, a fs pulse is broad in frequency. For example, a 50-fs pulse is about 1000 cm⁻¹ wide. Recently, it has been shown that by varying the relative phases of the different modes of which the fs pulse is composed, a multiphoton dissociation of a metal carbonyl can be tailored to alter the relative yields of different photodissociation channels.

We finish with two examples of accepted verities that the laser has demolished. The statement “A light beam cannot resolve objects whose separation is less than its wavelength” is true in the far-field region where r ≫ λ but is not true in the opposite near-field region where r ≪ λ. It is now possible to excite molecules by laser light and then observe emission from individual molecules. Instead of observing an average excited state lifetime, the lifetimes of individual molecules are measurable. The action of one enzyme molecule can be measured instead of the average behavior of an ensemble of enzyme molecules.

Another truism is that “proteins cannot fly.” In other words, large biomolecules cannot be studied by mass spectroscopy because they have no vapor pressure and, if heated, will be destroyed. Embedding the protein or polynucleotide between thin crystal layers and then pulse evaporating the crystallite with a ns laser produces gas-phase macromolecules. These can be ionized and their masses accurately determined. An important new window on biochemistry has been opened.

II. COLLECTIVE PROPERTIES

Liquids and solids are systems that contain on the order of Avogadro’s number ($6 \times 10^{23}$) of strongly interacting molecules. How do the macroscopic properties of such systems depend on the properties of the constituent isolated molecules? The properties of collective systems can be roughly divided into structural properties and dynamic properties. The theoretical study of such collective properties falls within the province of statistical mechanics.

The macroscopic properties of condensed systems are of central importance to almost all fields of science. These properties are described by phenomenological theories such as thermodynamics, hydrodynamics, electrodynamics, and chemical kinetics. Nevertheless, it should be clear that the laws of thermodynamics, the thermodynamic equation of state for given materials, the laws of hydrodynamics and electrodynamics, and the properties germane to these, as well as all of chemical kinetics, should be derivable from the underlying classical or quantum mechanical theory of atoms and molecules (in fact, the whole of chemistry should be predictable from mechanics).

A. Statistical Mechanics

Statistical mechanics is a molecular theory of macroscopic systems. It provides the bridge between the microscopic world of nuclei and electrons and the macroscopic world of the phenomenological theories. Starting with the Hamiltonian of the system, the laws of equilibrium thermodynamics, hydrodynamics, electrodynamics, and chemical kinetics can be derived. The Hamiltonian H is the classical energy as a function of the momenta and positions of all the particles in the system. It consists of the sum of the kinetic and the potential energies.

Statistical mechanics can be subdivided into equilibrium and nonequilibrium statistical mechanics. The former deals with systems in thermodynamic equilibrium, whereas the latter deals with the time evolution of macroscopic systems.

1. Equilibrium Statistical Mechanics

The thermodynamic equation of state of a system can be derived from an expression that relates the Helmholtz free energy $A_N(T, V)$ to the microscopic properties of the system,

$$A_N(T, V) = -kT \ln Q_N(T, V)$$

where T is the thermodynamic temperature (in kelvins), V is the volume, and

$$Q_N(T, V) = \sum_n \exp(-E_n/kT)$$

is the canonical partition function. Thus, to derive the thermodynamic equation of state (that is, the relationship between the free energy, the number of particles, the temperature T, and the volume V), $Q_N(T, V)$ must be calculated. In these equations k is Boltzmann’s constant, the sum goes over all quantum mechanical states of the system, and $\{E_n\}$ is the set of energy eigenvalues corresponding to these states. These energy levels are found by solving the Schrödinger equation of quantum mechanics, namely,

$$H\Psi_n(q_1, \ldots, q_N) = E_n \Psi_n(q_1, \ldots, q_N)$$

where $\Psi_n(q_1, \ldots, q_N)$ is an energy eigenfunction of the system and $\{q_1, \ldots, q_N\}$ are the coordinates specifying the positions of the particles. Thus, to predict the thermodynamic properties, the allowed energies for the system at the given volume have to be determined. In very dilute gases it is possible to calculate the thermodynamic properties very accurately. This is already an impressive achievement, but liquids are more difficult to treat. In quantum mechanics, every particle has wavelike properties. It follows that the atoms and molecules of which liquids are composed occupy a region in space characterized by a diameter proportional to the particle’s thermal de Broglie wavelength, a quantity inversely proportional to $(mT)^{1/2}$. Only at very low temperatures does the matter wave of one molecule interfere with the matter wave of another molecule. This interference can give rise to astounding properties, as it does in liquid He, which displays superfluidity (a purely quantum mechanical phenomenon). Because most liquids exist at high temperatures, it is possible to ignore these quantum effects and to treat them using classical statistical mechanics, that is, statistical mechanics based on classical mechanics. Although the treatment of strongly interacting systems is still very difficult in classical statistical mechanics, progress has been made on a variety of fronts. Statistical mechanical perturbation theory allows the properties of fluids to be calculated using the exact results for a simple reference system such as the hard sphere fluid. This is a fluid consisting of spheres that are not allowed to interpenetrate. The structure and thermodynamic properties of fluids can be determined by integral equations. The behavior of systems undergoing phase transformations can be determined by renormalization group techniques. The properties of real imperfect gases can be determined by diagrammatic techniques.
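The relation between the Helmholtz free energy and the canonical partition function given above can be illustrated numerically. The following is a minimal sketch, not a general-purpose implementation: the two-level system, its energy spacing, and the function names are hypothetical choices made only for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def canonical_partition_function(energies, temperature):
    """Q = sum over states n of exp(-E_n / kT); energies in joules."""
    return sum(math.exp(-e / (K_B * temperature)) for e in energies)


def helmholtz_free_energy(energies, temperature):
    """A = -kT ln Q for the supplied set of energy levels."""
    q = canonical_partition_function(energies, temperature)
    return -K_B * temperature * math.log(q)


# Hypothetical two-level system (an assumption, not taken from the text):
# ground state at zero energy, excited state one kT above it at T = 300 K.
T = 300.0
levels = [0.0, K_B * T]
Q = canonical_partition_function(levels, T)  # analytically 1 + exp(-1)
A = helmholtz_free_energy(levels, T)         # negative, since Q > 1
```

For a real liquid the sum runs over an astronomically large number of many-body states, which is why the approximate methods described above are needed; the sketch only shows how the two formulas fit together.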

What is meant by the term “the structure of liquids”? Imagine an instantaneous photograph of a liquid. The atoms are packed together in a noncrystalline arrangement; that is, there is no long-range order. Nevertheless, the positions of nearby atoms are correlated. Because the atoms cannot overlap, the minimum distance from one atomic center to another is the atomic diameter. Thus, around an atom there will be an exclusion sphere outside of which near neighbors are found, but these neighbors once again define a region of space which excludes next nearest neighbors, and so on. The radial distribution function gives the average density of atoms as a function of distance from a given atom in the system. Due to the previously stated packing effects, this function will exhibit peaks and troughs as a function of distance. At very large distances it will be equal to the bulk density because the presence of an atom at one place will not be felt by another atom very far from it. These functions can be measured by diffraction methods such as neutron scattering and X-ray scattering. They can also be determined from statistical mechanics.
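The radial distribution function described above can be estimated from a set of atomic positions by histogramming pair distances. The sketch below is illustrative only: it assumes a cubic box with periodic boundaries, divides the local density by the bulk density (so the result tends to 1 rather than to the bulk density at large distance), and uses uncorrelated random points, for which no peaks or troughs are expected. All names are hypothetical.

```python
import math
import random


def radial_distribution(positions, box, n_bins, r_max):
    """Histogram estimate of g(r) in a periodic cubic box (r_max <= box / 2)."""
    n = len(positions)
    dr = r_max / n_bins
    counts = [0] * n_bins
    for a in range(n - 1):
        for b in range(a + 1, n):
            d2 = 0.0
            for k in range(3):
                d = positions[a][k] - positions[b][k]
                d -= box * round(d / box)  # minimum-image convention
                d2 += d * d
            r = math.sqrt(d2)
            if r < r_max:
                # Each pair contributes to the neighborhood of both atoms.
                counts[min(int(r / dr), n_bins - 1)] += 2
    density = n / box ** 3
    g = []
    for i, c in enumerate(counts):
        shell = 4.0 / 3.0 * math.pi * (((i + 1) * dr) ** 3 - (i * dr) ** 3)
        g.append(c / (n * density * shell))  # local density / bulk density
    return g


# Uncorrelated random points stand in for an ideal gas (an assumption
# made for the example): g(r) should fluctuate around 1 in every bin.
rng = random.Random(0)
box = 10.0
points = [[rng.uniform(0.0, box) for _ in range(3)] for _ in range(400)]
g = radial_distribution(points, box, n_bins=5, r_max=5.0)
```

Applied to configurations of a dense liquid, the same histogram would show the exclusion region at short distance followed by the oscillations produced by packing.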

2. Nonequilibrium Statistical Mechanics

In hydrodynamics, the time evolution of fluid flow depends not only on thermodynamic properties, but also on such transport properties as shear and bulk viscosities, thermal conductivity, mutual diffusion coefficients, etc. These characterize the transport of momentum, heat, and mass. In electrodynamics, the response of systems to the imposition of electric and magnetic fields depends on dielectric response functions of the system, and in chemical kinetics the approach to equilibrium is described by empirical rate laws and chemical reaction rate constants. Expressions for the transport coefficients in terms of the microscopic dynamics of the quantum or classical systems can be derived by modern statistical mechanics. Moreover, it is now possible to derive the equations of macroscopic rate processes such as hydrodynamics and chemical kinetics starting with the Hamiltonian of the system and using the methods of nonequilibrium statistical mechanics.

Transport coefficients can be expressed in terms of time correlation functions or covariance functions of spontaneous fluctuations in equilibrium systems. For example, the self-diffusion coefficient can be expressed as

$$D = \frac{1}{3} \int_0^{\infty} dt\, \langle \mathbf{v}(0) \cdot \mathbf{v}(t) \rangle$$

where $\langle \mathbf{v}(0) \cdot \mathbf{v}(t) \rangle$ is the autocorrelation function of the velocity $\mathbf{v}$ of a labeled particle. Such expressions for transport coefficients are called Green–Kubo relations. All transport coefficients and chemical rate constants can be expressed as time integrals of time correlation functions; that is, as Green–Kubo relations. In addition, spectroscopic band shapes, NMR and ESR line shapes, and differential cross sections for the scattering of light and thermal neutrons can be expressed as space–time Fourier transforms of appropriate time correlation functions. The theory of such processes is called linear response theory. To treat nonlinear processes it is necessary to use mode–mode coupling theory.
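The Green–Kubo expression for the self-diffusion coefficient can be checked numerically on a model correlation function. The sketch below assumes an exponentially decaying velocity autocorrelation function, (3kT/m) exp(−γt), as for a Langevin particle; this model, the reduced units, and the function name are assumptions for illustration and do not come from the text. For this model the integral gives D = kT/(mγ) analytically.

```python
import math


def green_kubo_diffusion(vacf, dt):
    """D = (1/3) * integral of <v(0).v(t)> dt, using the trapezoidal rule."""
    integral = dt * (sum(vacf) - 0.5 * (vacf[0] + vacf[-1]))
    return integral / 3.0


# Assumed model (reduced units): <v(0).v(t)> = 3 (kT/m) exp(-gamma t).
kT_over_m = 1.0
gamma = 2.0
dt = 1.0e-3
n_steps = 20000  # integrate to t = 20, where exp(-gamma t) is negligible

vacf = [3.0 * kT_over_m * math.exp(-gamma * i * dt) for i in range(n_steps + 1)]
D = green_kubo_diffusion(vacf, dt)  # analytic value: kT/(m gamma) = 0.5
```

In practice the correlation function would be accumulated from a simulated or measured trajectory rather than written down in closed form, and the upper limit of the integral would be truncated where the correlation has decayed into the noise.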

B. Numerical Statistical Mechanics

One of the great developments in statistical mechanics during the past several decades is the development of methods for simulating strongly interacting systems such as liquids, liquid crystals, solids, and glasses on computers. The formalism of statistical mechanics provides exact analytical expressions. Because it is often impossible to evaluate these expressions analytically, numerical methods must be used. In fact, prior to the development of computers very little progress was made in understanding liquids and amorphous solids. There are two major techniques used in computer simulation, molecular dynamics (MD) and Monte Carlo (MC), and these have contributed enormously to the theory of condensed systems.

1. Molecular Dynamics

A starting point for the simulation of a classical liquid is the Hamiltonian of the system. If there are N particles in the system, 6N equations of motion must be solved: one for each position coordinate and one for each momentum coordinate. Starting with a given set of positions and momenta, these equations are solved by finite difference techniques on a computer. At the end of each time interval the new positions and momenta of all the particles are recorded. After very many time steps this gives a trajectory of the whole many-body system. Thermodynamic properties are found by time averaging dynamical properties over the trajectory. Likewise, time correlation functions and thereby transport coefficients, rate constants, differential scattering cross sections, spectral line shapes, etc. can be determined from the trajectory. This method is called molecular dynamics. Recently, it has been possible to simulate nonequilibrium systems by clever applications of constraints and boundary conditions. Because simulations are done on finite systems, periodic boundary conditions are usually adopted to bypass unusual surface effects. Nevertheless, it is also possible to simulate surface properties. In order to use molecular dynamics, there must be knowledge of the forces that exist between the molecules. Only in the simplest systems is it possible to determine the forces between two molecules (either by experiment or by ab initio quantum chemical calculations). Thus, molecular modeling is required before molecular dynamics can be applied.
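The finite difference solution of the equations of motion can be sketched for the smallest possible case. The example below uses the velocity Verlet scheme, one common finite difference integrator chosen here as an assumption (the text does not name a specific scheme), applied to a single particle on a harmonic spring in reduced units. The function and variable names are hypothetical.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one degree of freedom; returns the trajectory as (x, v) pairs."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass  # half-step velocity update
        x = x + dt * v_half               # full-step position update
        f = force(x)                      # force at the new position
        v = v_half + 0.5 * dt * f / mass  # complete the velocity update
        trajectory.append((x, v))
    return trajectory


# Assumed toy system: harmonic oscillator with unit mass and spring constant.
k_spring = 1.0
mass = 1.0
trajectory = velocity_verlet(1.0, 0.0, lambda x: -k_spring * x, mass, 0.01, 10000)

# Total energy at each recorded step; it should stay close to its initial
# value of 0.5 if the time step is small enough.
energies = [0.5 * mass * v * v + 0.5 * k_spring * x * x for x, v in trajectory]

# A time average over the trajectory, here the mean kinetic energy.
mean_kinetic = sum(0.5 * mass * v * v for _, v in trajectory) / len(trajectory)
```

A simulation of a liquid follows the same loop, with the force evaluated from the intermolecular potential for all N particles and with periodic boundary conditions applied to the positions.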

2. Monte Carlo Simulations

If there is interest only in equilibrium properties, an alternative to molecular dynamics is the Monte Carlo technique. This ingenious method is based on the theory of finite Markov chains. In classical statistical mechanics, the probability distribution for finding the particles at a given position in configuration space is proportional to the Boltzmann factor, exp(−U/kT), where U is the potential energy of the system in that configuration. In Monte Carlo, an initial configuration is chosen. A particle is next moved to a new position. This move is either accepted or rejected based on a certain criterion. This process is repeated for each particle. The criterion for acceptance or rejection is based on a random number generator in such a way that the configurations thus generated are distributed according to the Boltzmann factor. These sampled configurations give a trajectory in configuration space that looks like a random walk. Averages of position-dependent properties over this random walk trajectory then give thermodynamic properties.
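The accept/reject procedure described above can be sketched with the Metropolis criterion, a standard choice that is assumed here because the text does not name the criterion. The example samples a single coordinate in a harmonic potential in reduced units; the potential, step size, and names are illustrative assumptions. For this potential the average potential energy should approach kT/2.

```python
import math
import random


def metropolis(potential, kT, x0, step, n_steps, rng):
    """Generate configurations distributed according to exp(-U/kT)."""
    x = x0
    u = potential(x)
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)  # trial move
        u_new = potential(x_new)
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if u_new <= u or rng.random() < math.exp(-(u_new - u) / kT):
            x, u = x_new, u_new
        samples.append(x)  # a rejected move counts the old configuration again
    return samples


# Assumed toy system: U(x) = x^2 / 2 with kT = 1 (reduced units).
rng = random.Random(12345)
samples = metropolis(lambda x: 0.5 * x * x, 1.0, 0.0, 1.5, 200000, rng)
mean_potential = sum(0.5 * x * x for x in samples) / len(samples)
```

Note that a rejected move still contributes the current configuration to the average; omitting it would bias the sampled distribution.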

Simulation techniques have been applied to a wide variety of many-body systems. For example, hydrogen bonding in water and aqueous solutions, including protein solutions, has been studied in this way. Such methods are often used as experimental tests of theoretical assumptions. In many cases, totally new phenomena were first observed using computer simulations and only later observed in real systems. The methods of statistical mechanics and numerical statistical mechanics are applicable to many fields outside of chemistry. For example, the question of quark confinement in high-energy physics can be formulated as a problem in statistical mechanics that can be treated by Monte Carlo techniques using the Feynman path integral representation. Another example involves the fragmentation of nuclei, which can be regarded as liquid droplets of nucleons.

With the advent of supercomputers these methods will continue to play a very large role in the study of collective phenomena. One day very complex reactions will be simulated on computers.

SEE ALSO THE FOLLOWING ARTICLES

ATOMIC PHYSICS • ATOMIC SPECTROMETRY • CHEMICAL KINETICS, EXPERIMENTATION • KINETICS (CHEMISTRY) • LASERS • NUCLEAR MAGNETIC RESONANCE • PHYSICAL CHEMISTRY • POTENTIAL ENERGY SURFACES • STATISTICAL MECHANICS • SUPERCOMPUTERS • TRANSITION PROBABILITIES AND ATOMIC LIFETIMES



Encyclopedia of Physical Science and Technology EN003C-118 June 13, 2001 21:8

Cohesion Parameters
Allan F. M. Barton
Murdoch University

I. Main Classes of Cohesion Parameters
II. Limitations of Cohesion Parameters
III. Cohesion Parameters and Other Solvent Scales
IV. Applications of Cohesion Parameters
V. Evaluation of Cohesion Parameters
VI. Selected Values
VII. Current Status

GLOSSARY

Chameleonic behavior Capacity of the molecules of a compound to assume a character similar to their environment by intermolecular or intramolecular association.

Cohesion parameter Quantity with dimensions of square root of pressure used to characterize the cohesive and adhesive properties of materials.

Cohesive energy Thermodynamic quantity describing the sum of molecular effects in a material that cause it to remain in a condensed state.

Cohesive pressure (cohesive energy density) Ratio of the cohesive energy to the volume for a material.

Dispersion forces Intermolecular forces present in all materials, whether polar or not, arising from the fluctuating molecular dipoles that result from the positive nucleus and negative electrons.

Hansen parameters Cohesion parameters resulting from the subdivision of the Hildebrand parameter into dispersion (nonpolar), polar, and hydrogen bonding contributions.

Hildebrand parameter, solubility parameter Square root of the cohesive pressure of a liquid.

Homomorph The homomorph of a polar molecule is a nonpolar molecule having the same size and shape.

Interaction cohesion parameters Cohesion parameters resulting from the subdivision of the Hildebrand parameter into dispersion (nonpolar), orientation, induction, acid, and base contributions.

Regular solution A solution (solvent plus solute) that has a completely random molecular distribution despite possible specific interactions between solute and solvent molecules.

THE TERM COHESION PARAMETER came into use in the early 1980s as a general term for a class of quantities with dimensions of $(\text{pressure})^{1/2}$. In the simplest application of the concept, the cohesion parameter values of each of two pure materials are multiplied together to yield a numerical value for the cohesive pressure expected when the materials are mixed or the adhesive pressure expected when they are in contact.


Cohesion parameters provide a simple method of correlating and predicting the cohesive and adhesive properties of substances from a knowledge of the properties of the individual components only. There are, of course, numerous more sophisticated theories and techniques for this purpose, but none is as easy to use in practical applications.

Cohesion parameters provide estimates of the enthalpy changes on mixing and do not explicitly treat any deviations from ideal entropy changes. They are based on J. H. Hildebrand’s definition of a special class of “regular” solutions, those involving an ideal entropy change on mixing despite the existence of enthalpy changes, but are now applied far more widely.

The simplest cohesion parameter is the Hildebrand parameter, commonly called the solubility parameter, values of which are now being included by manufacturers of polymers and solvents in the specification sheets of their products.

I. MAIN CLASSES OF COHESION PARAMETERS

Pure materials exist as condensed phases (e.g., liquids, crystals, glasses, rubbers) over certain ranges of temperature and pressure because in some circumstances these states are more stable than the gaseous state: There are energetic advantages in the molecules being packed closely together. In these condensed phases, strong attractive forces exist between the molecules. The cohesive energy $-U$ is positive (equal and opposite in sign to the internal energy), and the cohesive energy per unit volume V of the material is defined as the cohesive pressure or cohesive energy density, given the symbol c.

Similarly, many mixtures also exist as homogeneous, condensed phases for the same reason: There is a significant cohesive pressure maintaining that state in existence. The basis of the cohesion parameter approach to mixtures is that a material with a high cohesive pressure requires more energy for dispersal than is gained by mixing it with a material of low cohesive pressure, so immiscibility (separation of phases) results, but two materials with similar cohesive pressure values gain cohesive energy on dispersal, so mixing occurs.

The change in cohesive pressure associated with the process of mixing two components i and j with their respective cohesive pressures ${}^{i}c$ and ${}^{j}c$ is given by the interchange cohesive pressure, ${}^{ij}A$,

$${}^{ij}A = {}^{i}c + {}^{j}c - 2\,{}^{ij}c, \tag{1}$$

where ${}^{ij}c$ is the cohesive pressure characteristic of the intermolecular forces acting between molecules of type i and type j. This equation can be understood in a simple way by considering what happens when unit volumes of components i and j are mixed: Two i–j interactions are formed for each pair of i–i and j–j interactions broken.

A. Hildebrand Parameter

At pressures below atmospheric pressure (i.e., for temperatures below the boiling point), the molar cohesive energy $-U$ of a liquid can be taken as equal to the molar enthalpy of vaporization $\Delta H$ less the pressure–volume work (which for a vapor with ideal gas behavior is RT per mole, where R is the molar gas constant and T the absolute temperature). This is the basis of the original definition by Joel H. Hildebrand and Robert L. Scott of the solubility parameter, hereafter called the Hildebrand parameter δ:

$$\delta = c^{1/2} = (-U/V)^{1/2} = [(\Delta H - RT)/V]^{1/2}. \tag{2}$$

This parameter was intended for use only with nonpolar, nonassociating liquid systems that form regular solutions, but its use has been extended to all types of material. For all liquids, the value of the Hildebrand parameter may be determined directly from thermodynamic data by means of this equation, but it is only for regular solutions that predictions using the Hildebrand parameter are reliable.
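As a worked illustration of Eq. (2), the Hildebrand parameter can be computed from an enthalpy of vaporization and a molar volume. This is a minimal sketch: the input values below describe a hypothetical liquid and are not data for any real solvent, and the function name is an assumption. With SI inputs the result is in Pa^(1/2); dividing by 1000 converts to the commonly quoted MPa^(1/2).

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def hildebrand_parameter(delta_h_vap, molar_volume, temperature):
    """delta = [(dH - RT) / V]^(1/2) in Pa^(1/2).

    delta_h_vap in J/mol, molar_volume in m^3/mol, temperature in K.
    """
    cohesive_pressure = (delta_h_vap - R * temperature) / molar_volume  # Pa
    return math.sqrt(cohesive_pressure)


# Hypothetical liquid (illustrative values only): dH = 30 kJ/mol,
# V = 100 cm^3/mol, T = 298.15 K.
delta_pa = hildebrand_parameter(30.0e3, 100.0e-6, 298.15)
delta_mpa = delta_pa / 1000.0  # 1 MPa^(1/2) = 1000 Pa^(1/2); about 16.6 here
```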

A regular solution has an ideal entropy of formation, that is, a random molecular distribution, despite the existence of interactions that lead to a nonideal enthalpy of formation (heat of mixing). This means that regular mixtures are restricted to those systems in which only dispersion forces are acting. (Dispersion forces, or London forces, arise from the fluctuating dipoles that result from the positive nucleus and negative electron “cloud” in every atom. They occur in all systems and are distinct from forces associated with molecular polarity and “chemical” interactions between molecules.) For systems like this, without the orientation and ordering effects of polar molecules, the cohesive pressure between unlike molecules is given to a good approximation by the geometric mean of the cohesive pressures of the individual components,

$${}^{ij}c = ({}^{i}c\,{}^{j}c)^{1/2}. \tag{3}$$

From a combination of Eqs. (1), (2), and (3), it follows that the interchange cohesive pressure is given by Eq. (4):

$${}^{ij}A = ({}^{i}c^{1/2} - {}^{j}c^{1/2})^2 = ({}^{i}\delta - {}^{j}\delta)^2. \tag{4}$$
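The step from Eqs. (1) to (3) to Eq. (4) can be written out explicitly. Substituting the geometric-mean rule of Eq. (3) into Eq. (1) and recognizing a perfect square gives

$${}^{ij}A = {}^{i}c + {}^{j}c - 2\,({}^{i}c\,{}^{j}c)^{1/2} = \left({}^{i}c^{1/2}\right)^2 - 2\,{}^{i}c^{1/2}\,{}^{j}c^{1/2} + \left({}^{j}c^{1/2}\right)^2 = \left({}^{i}c^{1/2} - {}^{j}c^{1/2}\right)^2,$$

and the definition $\delta = c^{1/2}$ of Eq. (2) then turns the right-hand side into $({}^{i}\delta - {}^{j}\delta)^2$. One consequence is that the interchange cohesive pressure can never be negative under the geometric-mean assumption, and that it vanishes only when the two Hildebrand parameters are equal.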

By means of this fundamental equation, it is possible to describe the thermodynamics of the mixing process in terms of Hildebrand parameters. For example, the mole fraction activity coefficient $f_x$ at infinite dilution of component j with molar volume ${}^{j}V$ in component i is given by

$$RT \ln {}^{j}f_x^{\infty} = {}^{j}V\,{}^{ij}A = {}^{j}V({}^{i}\delta - {}^{j}\delta)^2. \tag{5}$$

Equations such as this hold exactly for regular solutions and are good approximations for many other useful systems.
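Eq. (5) lends itself to a short numerical illustration. The sketch below evaluates the infinite-dilution activity coefficient for a hypothetical solute and solvent; the Hildebrand parameters and molar volume are invented values chosen only to show the arithmetic, and the function name is an assumption.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)


def ln_activity_coefficient_inf_dilution(delta_i, delta_j, molar_volume_j, temperature):
    """ln f = V_j (delta_i - delta_j)^2 / (RT), following Eq. (5).

    delta values in Pa^(1/2), molar_volume_j in m^3/mol, temperature in K.
    """
    interchange_pressure = (delta_i - delta_j) ** 2  # Pa, as in Eq. (4)
    return molar_volume_j * interchange_pressure / (R * temperature)


# Hypothetical pair (illustrative values only): solvent i at 18 MPa^(1/2),
# solute j at 16 MPa^(1/2) with molar volume 100 cm^3/mol, T = 298.15 K.
ln_f = ln_activity_coefficient_inf_dilution(18.0e3, 16.0e3, 100.0e-6, 298.15)
f_inf = math.exp(ln_f)  # greater than 1 whenever the parameters differ

# Identical Hildebrand parameters give ideal behavior (ln f = 0).
ln_f_ideal = ln_activity_coefficient_inf_dilution(18.0e3, 18.0e3, 100.0e-6, 298.15)
```

Because the difference of parameters is squared, the model can only predict positive deviations from ideality, which is one reason it is reliable mainly for regular solutions.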

On the basis of the assumptions made in the derivation of cohesion parameter expressions, the effective Hildebrand parameter δ of a binary solvent mixture is the volume-fraction-weighted average of the Hildebrand parameter values of its components, so

$$\delta = {}^{i}\phi\,{}^{i}\delta + {}^{j}\phi\,{}^{j}\delta, \tag{6}$$

where φ is the volume fraction, defined by

$${}^{i}\phi = \frac{{}^{i}V\,{}^{i}x}{{}^{i}V\,{}^{i}x + {}^{j}V\,{}^{j}x}, \tag{7}$$

where x is the mole fraction and V is the molar volume.
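The mixing rule of Eqs. (6) and (7) can be sketched in a few lines. The example assumes the volume fraction is the mole-fraction-weighted molar volume of one component divided by the sum over both components; the numerical inputs describe a hypothetical solvent pair, and the function names are illustrative.

```python
def volume_fraction(x_i, molar_volume_i, molar_volume_j):
    """Volume fraction of component i in a binary mixture, as in Eq. (7)."""
    x_j = 1.0 - x_i
    return x_i * molar_volume_i / (x_i * molar_volume_i + x_j * molar_volume_j)


def mixture_hildebrand(delta_i, delta_j, phi_i):
    """Effective Hildebrand parameter of a binary mixture, as in Eq. (6)."""
    return phi_i * delta_i + (1.0 - phi_i) * delta_j


# Hypothetical equimolar mixture (illustrative values only):
# component i: delta = 18 MPa^(1/2), V = 100 cm^3/mol
# component j: delta = 15 MPa^(1/2), V = 50 cm^3/mol
phi_i = volume_fraction(0.5, 100.0, 50.0)          # 2/3: i occupies more volume
delta_mix = mixture_hildebrand(18.0, 15.0, phi_i)  # 17 MPa^(1/2)
```

The molar volumes enter only as a ratio, so any consistent unit can be used; the resulting parameter carries the unit of the component values.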

B. Component Cohesion Parameters

In addition to the dispersion forces described above for regular solutions, most chemical systems also exhibit polar interactions and specific interactions (e.g., hydrogen bonding). To be generally useful, theories and models aiming to systematize and predict the behavior of matter must deal with molecular interactions on the basis of their natures or origins as well as their strengths. The cohesive properties characteristic of the condensed states of matter are produced by various intermolecular forces, and the cohesive pressures ${}^{i}c$, ${}^{j}c$, and ${}^{ij}c$ represent the resultant effect of all these forces acting between molecules of types i and j.

For this reason, five interaction cohesion parameters have been introduced to describe the properties of materials in greater detail than is possible with the Hildebrand parameter. These are dispersion, orientation, induction, Lewis acid, and Lewis base cohesion parameters.

Dispersion forces, occurring in all molecules, whether polar or not, give rise to a dispersion cohesive pressure ${}^{i}c_d$ and a corresponding dispersion cohesion parameter ${}^{i}\delta_d$ in a pure material i:

$${}^{i}c_d = {}^{i}\delta_d^2. \tag{8}$$

The nonpolar, dispersive interactions between unlike molecules of type i and type j provide a contribution to the cohesive pressure that is based on the geometric mean of the individual values:

$${}^{ij}c_d = ({}^{i}c_d\,{}^{j}c_d)^{1/2} = {}^{i}\delta_d\,{}^{j}\delta_d. \tag{9}$$

A simple interpretation of this geometric-mean behavior is that the interaction is of a “symmetrical” nature: Each member of a pair of molecules interacts by virtue of the same molecular property (the polarizability). It follows that

$${}^{ij}A_d = {}^{i}\delta_d^2 + {}^{j}\delta_d^2 - 2\,{}^{i}\delta_d\,{}^{j}\delta_d = ({}^{i}\delta_d - {}^{j}\delta_d)^2. \tag{10}$$

Orientation effects result from dipole–dipole (orKeesom) interactions and occur between molecules thathave permanent dipole moments. The orientation cohesivepressure of a pure material i is denoted ico, and the cor-responding orientation cohesion parameter iδo is definedby

ico = iδ2o . (11)

Like dispersion forces, these are symmetrical interactions, depending on the same property of each molecule, which in this case is the dipole moment. It follows that the geometric-mean rule is obeyed closely for orientation interactions between unlike molecules. For "ideal" polar molecules, which may be represented by spherical force fields with small ideal dipoles at their centers, this contribution to the cohesive pressure in mixtures of $i$ and $j$ molecules is

$$ {}^{ij}c_o = \left({}^{i}c_o\,{}^{j}c_o\right)^{1/2} = {}^{i}\delta_o\,{}^{j}\delta_o \tag{12} $$

and the interchange cohesive pressure due to orientation is

$$ {}^{ij}A_o = \left({}^{i}\delta_o - {}^{j}\delta_o\right)^{2}. \tag{13} $$

Dipole induction effects arise from dipole-induced forces (Debye interactions) occurring between molecules with permanent dipole moments and any other neighboring molecules, whether polar or not, and resulting in an induced nonuniform charge distribution. In contrast to dispersion and orientation interactions, dipole induction interactions are "unsymmetrical," involving the dipole moment of one molecule and the polarizability of the other. Thus, the cohesive pressure term for induction in a pure material $i$ involves the product ${}^{i}\delta_i\,{}^{i}\delta_d$, where ${}^{i}\delta_i$ is the induction cohesion parameter, and in a mixture of $i$ and $j$,

$$ {}^{ij}c_i = {}^{i}\delta_i\,{}^{j}\delta_d + {}^{j}\delta_i\,{}^{i}\delta_d. \tag{14} $$

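It can be shown, therefore, that the interchange cohesive pressure due to induction is

$$ {}^{ij}A_i = 2\,{}^{i}\delta_i\,{}^{i}\delta_d + 2\,{}^{j}\delta_i\,{}^{j}\delta_d - 2\,{}^{j}\delta_i\,{}^{i}\delta_d - 2\,{}^{i}\delta_i\,{}^{j}\delta_d \tag{15} $$

$$ {}^{ij}A_i = 2\left({}^{i}\delta_d - {}^{j}\delta_d\right)\left({}^{i}\delta_i - {}^{j}\delta_i\right). \tag{16} $$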
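The step from Eq. (15) to Eq. (16) can be checked numerically. The sketch below is illustrative only: the function names and the parameter values are hypothetical, and it assumes the interchange pressure is built as (pure $i$ term) + (pure $j$ term) − 2 × (cross term), which is the pattern Eq. (15) shows for Eq. (14).

```python
def induction_interchange_expanded(d_i, ind_i, d_j, ind_j):
    """Eq. (15): pure-i term + pure-j term - 2 x cross term of Eq. (14)."""
    pure_i = 2.0 * ind_i * d_i           # induction term in pure i
    pure_j = 2.0 * ind_j * d_j           # induction term in pure j
    cross = ind_i * d_j + ind_j * d_i    # Eq. (14), unlike pair i-j
    return pure_i + pure_j - 2.0 * cross


def induction_interchange_factored(d_i, ind_i, d_j, ind_j):
    """Eq. (16): the same quantity in factored form."""
    return 2.0 * (d_i - d_j) * (ind_i - ind_j)


# Hypothetical cohesion parameters in MPa^(1/2), chosen only for illustration.
expanded = induction_interchange_expanded(16.0, 2.0, 18.0, 1.0)
factored = induction_interchange_factored(16.0, 2.0, 18.0, 1.0)
print(expanded, factored)
```

With these made-up inputs both forms agree and the result is negative, which illustrates that an unsymmetrical contribution, unlike the squared differences of Eqs. (10) and (13), is not restricted to positive values.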

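Lewis acid–base or electron donor–acceptor interactions can be denoted

$$ \underset{\substack{\text{Lewis acid}\\ \text{(electron pair acceptor)}}}{\mathrm{A}} \;+\; \underset{\substack{\text{Lewis base}\\ \text{(electron pair donor)}}}{{:}\mathrm{D}} \;\rightleftharpoons\; \overset{\delta-}{\mathrm{A}} \cdots \overset{\delta+}{\mathrm{D}} \tag{17} $$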

Page 23: Encyclopedia of Physical Science and Technology - Condensed Matter

P1: LDK/LSK/MAG P2: FYK/FFU QC: FYD Final

Encyclopedia of Physical Science and Technology EN003C-118 June 13, 2001 21:8

236 Cohesion Parameters

The Lewis acid–base complex is formed by an overlap between a filled electron orbital of the donor and a vacant orbital in the acceptor and differs from a "normal" chemical bond in that only one molecule supplies the pair of electrons. These interactions are unsymmetrical, involving donor and acceptor with different roles, so it is necessary to use two separate parameters to characterize these interactions, a Lewis acid cohesion parameter $\delta_a$ and a Lewis base cohesion parameter $\delta_b$. The acid–base interchange cohesive pressure is

$$ {}^{ij}A_{ab} = 2\left({}^{i}\delta_a - {}^{j}\delta_a\right)\left({}^{i}\delta_b - {}^{j}\delta_b\right). \tag{18} $$

Hydrogen bonding interactions are a special type of Lewis acid–base interaction, with the electron acceptor being a Brønsted acid (proton acid):

$$ \underset{\substack{\text{Electron pair acceptor;}\\ \text{proton donor}}}{\mathrm{{-}X{-}H}} \;+\; \underset{\substack{\text{Electron pair donor;}\\ \text{proton acceptor}}}{{:}\mathrm{Y{-}}} \;\rightleftharpoons\; \mathrm{{-}X{-}H} \cdots \mathrm{Y{-}} \tag{19} $$

One of the assumptions central to the cohesion parameter approach to the properties of materials is that the various contributions to the cohesive pressure of a substance (either pure or mixed) are additive, so the interchange cohesive pressure for a mixing process is

$$ {}^{ij}A = {}^{ij}A_d + {}^{ij}A_o + {}^{ij}A_i + {}^{ij}A_{ab}. \tag{20} $$

For a pure substance, the total cohesion parameter is

$$ {}^{i}\delta^{2} = {}^{i}\delta_d^{2} + {}^{i}\delta_o^{2} + 2\,{}^{i}\delta_i\,{}^{i}\delta_d + 2\,{}^{i}\delta_a\,{}^{i}\delta_b. \tag{21} $$

It is clear that this total cohesion parameter is identical to the Hildebrand parameter, which can be determined from Eq. (2).
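As a minimal sketch of how Eqs. (10), (13), (16), (18), (20), and (21) fit together, the following groups the five interaction cohesion parameters of a material into one record. The class name, field names, and the numerical values are hypothetical and serve only to illustrate the bookkeeping.

```python
from dataclasses import dataclass


@dataclass
class InteractionParams:
    """Five interaction cohesion parameters of one material, in MPa^(1/2)."""
    d: float  # dispersion
    o: float  # orientation
    i: float  # induction
    a: float  # Lewis acid
    b: float  # Lewis base


def total_parameter_squared(p):
    """Eq. (21): square of the total (Hildebrand) parameter of a pure substance."""
    return p.d**2 + p.o**2 + 2.0 * p.i * p.d + 2.0 * p.a * p.b


def interchange_pressure(p, q):
    """Eq. (20): sum of the four interchange contributions, in MPa."""
    a_d = (p.d - q.d) ** 2                    # Eq. (10)
    a_o = (p.o - q.o) ** 2                    # Eq. (13)
    a_i = 2.0 * (p.d - q.d) * (p.i - q.i)     # Eq. (16)
    a_ab = 2.0 * (p.a - q.a) * (p.b - q.b)    # Eq. (18)
    return a_d + a_o + a_i + a_ab


# Hypothetical pure acid and pure base with equal dispersion parameters.
acid = InteractionParams(d=16.0, o=0.0, i=0.0, a=6.0, b=0.0)
base = InteractionParams(d=16.0, o=0.0, i=0.0, a=0.0, b=6.0)
print(total_parameter_squared(acid), interchange_pressure(acid, base))
```

In this made-up case the two liquids have the same total parameter, yet the interchange cohesive pressure is negative because of the acid–base term alone.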

C. Hansen Parameters

A much simpler three-component cohesion parameter method was developed by C. M. Hansen on an empirical basis, the parameters being determined either experimentally or by semiempirical equations. This assumes that the cohesive pressure is made up of a linear combination of contributions from nonpolar or dispersion interactions ($\delta_d^{2}$), polar interactions ($\delta_p^{2}$), and hydrogen bonding or similar specific association interactions ($\delta_h^{2}$). The Hildebrand parameter (which can be determined from thermodynamic properties) for any material $i$ is related to the Hansen parameters by

$$ {}^{i}\delta^{2} = {}^{i}\delta_d^{2} + {}^{i}\delta_p^{2} + {}^{i}\delta_h^{2}. \tag{22} $$

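The interchange cohesive pressure associated with the mixing of $i$ and $j$ is

$$ {}^{ij}A = \left({}^{i}\delta_d - {}^{j}\delta_d\right)^{2} + \left({}^{i}\delta_p - {}^{j}\delta_p\right)^{2} + \left({}^{i}\delta_h - {}^{j}\delta_h\right)^{2}. \tag{23} $$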
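Equations (22) and (23) translate directly into code. The following is a sketch under stated assumptions: the function names are hypothetical, each material is represented as a plain tuple (δd, δp, δh) in MPa^(1/2), and the numbers are invented for illustration rather than taken from any table.

```python
import math


def hildebrand_from_hansen(hansen):
    """Eq. (22): total (Hildebrand) parameter from (delta_d, delta_p, delta_h)."""
    d, p, h = hansen
    return math.sqrt(d**2 + p**2 + h**2)


def hansen_interchange_pressure(hansen_i, hansen_j):
    """Eq. (23): interchange cohesive pressure in MPa for mixing i and j."""
    return sum((x - y) ** 2 for x, y in zip(hansen_i, hansen_j))


# Hypothetical Hansen parameter sets in MPa^(1/2).
liquid_i = (12.0, 4.0, 3.0)
liquid_j = (10.0, 1.0, 3.0)
print(hildebrand_from_hansen(liquid_i))
print(hansen_interchange_pressure(liquid_i, liquid_j))
```

Because every term in Eq. (23) is a square, this function can never return a negative value, which is the code-level counterpart of the limitation discussed next.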

It is important to note that the Hansen parameter method ignores the unsymmetrical nature of the induction and acid–base components of the cohesive pressure. In particular, with hydrogen bonding there is no means of separating the proton donor and proton acceptor capabilities of any material. (Of course, this is true also of the Hildebrand parameter, but Hansen parameters appear to take hydrogen bonding into account, in a manner that is often misleading; see Section II.C.)

Despite these theoretical shortcomings, Hansen parameters have been fairly widely used in the polymer and coatings industries. Frequently, the three-component parameters have been plotted on three mutually perpendicular coordinates. A solubility "volume" in Hansen space is then drawn up for each solute and compared with the point locations in this space of each solvent. Equation (23) has been modified by doubling the scale on the dispersion axis with the aim of providing "spheres" of solubility for each solute. The distance of the solvent coordinates (${}^{i}\delta_d$, ${}^{i}\delta_p$, ${}^{i}\delta_h$) from the center point (${}^{j}\delta_d$, ${}^{j}\delta_p$, ${}^{j}\delta_h$) of the solute sphere of solubilities then is

$$ {}^{ij}R = \left[4\left({}^{i}\delta_d - {}^{j}\delta_d\right)^{2} + \left({}^{i}\delta_p - {}^{j}\delta_p\right)^{2} + \left({}^{i}\delta_h - {}^{j}\delta_h\right)^{2}\right]^{1/2} \tag{24} $$

or

$$ {}^{ij}A = \left({}^{i}\delta_d - {}^{j}\delta_d\right)^{2} + 0.25\left[\left({}^{i}\delta_p - {}^{j}\delta_p\right)^{2} + \left({}^{i}\delta_h - {}^{j}\delta_h\right)^{2}\right]. \tag{25} $$

This distance ${}^{ij}R$ can be compared with the radius ${}^{j}R$ of the solute solubility sphere, and if ${}^{ij}R < {}^{j}R$ the likelihood of the solvent $i$ dissolving the solute $j$ is high. The incorporation of the numerical factor 4 in Eq. (24) does not appear to be necessary to provide a spherical interaction volume, and an equation based on Eq. (23) is just as satisfactory:

$$ {}^{ij}r = {}^{ij}A^{1/2} = \left[\left({}^{i}\delta_d - {}^{j}\delta_d\right)^{2} + \left({}^{i}\delta_p - {}^{j}\delta_p\right)^{2} + \left({}^{i}\delta_h - {}^{j}\delta_h\right)^{2}\right]^{1/2}. \tag{26} $$
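The solubility-sphere test described above can be sketched as follows. The function names, the `dispersion_weight` argument, and all numerical values are hypothetical; a weight of 4 reproduces Eq. (24) and a weight of 1 reproduces Eq. (26).

```python
import math


def hansen_distance(solvent, solute_center, dispersion_weight=4.0):
    """Distance in Hansen space between solvent i and the solute sphere center j.

    dispersion_weight=4.0 gives Eq. (24); dispersion_weight=1.0 gives Eq. (26).
    Both arguments are (delta_d, delta_p, delta_h) tuples in MPa^(1/2).
    """
    dd = solvent[0] - solute_center[0]
    dp = solvent[1] - solute_center[1]
    dh = solvent[2] - solute_center[2]
    return math.sqrt(dispersion_weight * dd**2 + dp**2 + dh**2)


def likely_to_dissolve(solvent, solute_center, solute_radius, dispersion_weight=4.0):
    """True when the solvent point lies inside the solute solubility sphere."""
    return hansen_distance(solvent, solute_center, dispersion_weight) < solute_radius


# Hypothetical solvent point, solute sphere center, and sphere radius.
solvent = (12.0, 4.0, 3.0)
solute_center = (10.0, 1.0, 3.0)
print(hansen_distance(solvent, solute_center))
print(likely_to_dissolve(solvent, solute_center, solute_radius=6.0))
```

The radius must of course be determined with the same weighting as the distance; a sphere radius fitted with the factor 4 should not be compared with a distance from Eq. (26).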

In some applications, only two of the three Hansen parameters are used, so that the location of solvents can be displayed on two-dimensional maps (e.g., Fig. 1) and compared with solute solubility regions.

D. Other Cohesion Parameters

There are several variations on Hildebrand parameters, Hansen parameters, and interaction cohesion parameters, introduced in attempts to achieve a compromise between simplicity of operation and validity of prediction.


FIGURE 1 Hansen parameter δp–δh locations for various classes of organic compound. [Adapted from Klein, E., Eichelberger, J., Eyer, C., and Smith, J. (1975). Water Res. 9, 807.]

Hansen parameters must be represented in three dimensions, but it is possible to use fractional cohesive pressures plotted on a triangular chart,

$$ c_d = \delta_d^{2}/\delta^{2}; \qquad c_p = \delta_p^{2}/\delta^{2}; \qquad c_h = \delta_h^{2}/\delta^{2} \tag{27} $$

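where $\delta^{2}$ is given by Eq. (22). It is also possible to use triangular representation with fractional cohesion parameters,

$$ f_d = \frac{\delta_d}{\delta_d + \delta_p + \delta_h}; \qquad f_p = \frac{\delta_p}{\delta_d + \delta_p + \delta_h}; \qquad f_h = \frac{\delta_h}{\delta_d + \delta_p + \delta_h} \tag{28} $$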
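The two triangular-chart coordinate sets of Eqs. (27) and (28) can be computed as in the following sketch. The function names and the input values are hypothetical; the point is only that each set sums to unity, which is what allows plotting on a triangular chart.

```python
def fractional_cohesive_pressures(hansen):
    """Eq. (27): (c_d, c_p, c_h), with delta^2 taken from Eq. (22)."""
    d, p, h = hansen
    total_sq = d**2 + p**2 + h**2
    return (d**2 / total_sq, p**2 / total_sq, h**2 / total_sq)


def fractional_cohesion_parameters(hansen):
    """Eq. (28): (f_d, f_p, f_h), normalized by the sum of the three parameters."""
    d, p, h = hansen
    total = d + p + h
    return (d / total, p / total, h / total)


# Hypothetical Hansen parameters in MPa^(1/2).
material = (12.0, 4.0, 3.0)
print(fractional_cohesive_pressures(material))
print(fractional_cohesion_parameters(material))
```

Note that both normalizations discard the magnitude of δ itself, so two materials with very different total parameters can fall on the same point of the chart.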

Triangular representations make the excessively simplifying assumption that the total or Hildebrand parameter δ is constant for all materials and that it is the relative magnitude of the three contributions (dispersion forces, polar interactions, hydrogen bonding) that determines the extent of miscibility.

The Hildebrand parameter has been subdivided in other ways. One approach was to divide it into two main components, defining a nonpolar cohesion parameter $\delta_d$ and a polar parameter $\delta_o$. These are related by

$$ \delta^{2} = \delta_d^{2} + \delta_o^{2}. \tag{29} $$

This approach neglects both induction interactions and, more important, specific interactions. The induction "correction" can be taken into account to some extent by means of the factor ${}^{ij}b$, as shown in the following expression for the interchange cohesive pressure:

$$ {}^{ij}A = \left({}^{i}\delta_d - {}^{j}\delta_d\right)^{2} + {}^{ij}b\left({}^{i}\delta_o - {}^{j}\delta_o\right)^{2}. \tag{30} $$

This is equivalent to Eq. (25) with the δh term omitted.

II. LIMITATIONS OF COHESION PARAMETERS

Cohesion parameters provide one of the simplest methods of correlating and predicting the cohesive and adhesive properties of interacting materials from a knowledge of the properties of the individual components. It is therefore to be expected that there will be severe limitations on their use. What is surprising is that this simple correlation method works as well as it does in practice. It is really a semiquantitative version of the statement "like dissolves like." In fact, of course, whether any particular correlation or prediction method is seen to "work" depends on the precision that is expected in the application being considered.

A. Burrell Hydrogen Bonding Classes

In the use of Hildebrand parameters, the existence of hydrogen bonding is the most obvious cause of discrepancies from "regular" behavior. H. Burrell was one of the first to attempt to deal with the hydrogen bonding factor in the application of Hildebrand parameters to practical systems by the simple expedient of dividing solvents into three classes according to their hydrogen bonding capacities, on the assumption that complete miscibility can occur only if the degree of hydrogen bonding is comparable in the components:

1. Liquids with weak or poor hydrogen bonding capacity, including hydrocarbons, chlorinated hydrocarbons, and nitrohydrocarbons

2. Liquids with moderate hydrogen bonding capacity, including ketones, esters, ethers, and glycol monoethers

3. Liquids with strong hydrogen bonding capacity, such as alcohols, amines, acids, amides, and aldehydes.

This classification is still widely used in practical applications, and a few typical examples are presented in Tables I and II (Section VI).

B. Geometric-Mean Corrections

One of the specific assumptions in the development of cohesion parameter expressions is the geometric-mean approximation [Eq. (3)]. There are several equations that permit some correction to be made for deviations from this behavior, for example,

$$ {}^{ij}c = \left({}^{i}c\,{}^{j}c\right)^{1/2}\left(1 - {}^{ij}l\right), \tag{31} $$

where ${}^{ij}l$ is a dimensionless constant of the order of 0.01 to 0.1, characteristic of a given pair of materials. When this is incorporated, the empirical expression for interchange cohesive pressure corresponding to Eq. (4) becomes

$$ {}^{ij}A = \left({}^{i}\delta - {}^{j}\delta\right)^{2} + 2\,{}^{ij}l\,{}^{i}\delta\,{}^{j}\delta. \tag{32} $$

The value of the correction term can be estimated from either liquid- or gas-phase data.
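To see how strongly even a small ${}^{ij}l$ can affect the result, the sketch below evaluates Eq. (32) and cross-checks it against Eq. (31). The function names and the numerical inputs are hypothetical, and the cross-check assumes that the interchange pressure is formed as ${}^{i}c + {}^{j}c - 2\,{}^{ij}c$ with ${}^{i}c = {}^{i}\delta^{2}$.

```python
def corrected_cross_pressure(delta_i, delta_j, l_ij):
    """Eq. (31): geometric mean of the pure cohesive pressures, scaled by (1 - l_ij)."""
    return delta_i * delta_j * (1.0 - l_ij)


def corrected_interchange_pressure(delta_i, delta_j, l_ij):
    """Eq. (32): interchange cohesive pressure with the geometric-mean correction."""
    return (delta_i - delta_j) ** 2 + 2.0 * l_ij * delta_i * delta_j


# Hypothetical Hildebrand parameters (MPa^(1/2)) and correction constant.
delta_i, delta_j, l_ij = 18.0, 20.0, 0.05
uncorrected = corrected_interchange_pressure(delta_i, delta_j, 0.0)
corrected = corrected_interchange_pressure(delta_i, delta_j, l_ij)
# Same quantity assembled from the pure and cross cohesive pressures.
from_pressures = delta_i**2 + delta_j**2 - 2.0 * corrected_cross_pressure(delta_i, delta_j, l_ij)
print(uncorrected, corrected, from_pressures)
```

For these invented values the correction term is several times larger than the uncorrected squared difference, because it scales with the product of the two parameters rather than with their difference.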

C. Chameleonic Behavior

The most important situation requiring caution in the use of cohesion parameters, particularly Hildebrand parameters and Hansen parameters, is that in which electron donor–acceptor interactions within components are very different from those between components. Common examples are systems involving hydrogen bonding: alcohols (particularly methanol), carboxylic acids, water, primary and secondary amines, and glycol ethers. Most commonly, it is hydrogen bonding within the pure component that causes a user the most problems, as one tends to overlook this, whereas one is more on the alert for new interactions when components are mixed.

Although this has long been understood in a general way, it was put most clearly by K. L. Hoy, who proposed the term chameleonic for those compounds that have the capacity to assume a character similar to their environment.

By dimerization or intramolecular association, what would otherwise be a polar material can behave in a nonpolar manner, thus minimizing the energy. Examples are carboxylic acids (structure I), glycol ethers (II and III), and diols (IV). It is clear that the cohesion parameters of associated and dissociated forms are very different, so the particular value exhibited will depend on the situation in which the compound occurs. Structures such as tetramers (V) and chains (VI) have been proposed for methanol and methanol–ethanol mixtures.

SCHEME 1

In the case of water, it is necessary to distinguish between systems of water with low-permittivity organic liquids (i.e., associated water, with limited interaction between water and organic) and systems of water with hydrogen bonding organic liquids, where there is more intermolecular association at the expense of intramolecular association.

The difficulty in all such examples is that Hildebrand parameters and Hansen parameters are trying to indicate two or more things with one number or one set of numbers:

1. The extent of cohesion within the pure compound

2. The potential for cohesion in a mixture.

When one says that the Hansen hydrogen bonding parameter $\delta_h$ of ethanol is 20 MPa$^{1/2}$, it means that within pure ethanol $\delta_h^{2} = 400$ MPa is the extent of cohesion due to hydrogen bonding, but it is also inferred that $\delta_h$ reflects the hydrogen bonding capability in a mixture.

It is clear that this will work best when the two components of the mixture are similar, and as the differences increase, situations arise where the effective $\delta_h$ may be much less than or much greater than that in a pure liquid.

As far as hydrogen bonding is concerned, some pure compounds are both proton donors and proton acceptors. Here, the extent of cohesion due to association in a mixture generally can have any value up to that in a pure compound. Examples are alcohols, carboxylic acids, water, and primary and secondary amines.

In contrast, many compounds have a dominant capacity to accept protons: ketones, aldehydes, esters, ethers, tertiary amines, aromatic hydrocarbons, and alkenes. Here, the potential for cohesion in mixtures due to hydrogen bonding is greater than that in pure compounds. There are also some that are proton donors only, such as trichloromethane.

It is clear that neither Hildebrand parameters nor Hansen parameters are adequate to handle this problem in a quantitative way. There are two possible approaches:

1. To treat each associated species in both the pure liquids and the mixtures as new compounds, with formation constants that can be evaluated. This type of approach is traditional and well established, but rather cumbersome.

2. To use a full set of interaction cohesion parameters with separate Lewis acid and base contributions as well as dispersion, induction, and orientation terms. This method has the greater practical potential.

On reference to Eq. (20), it can be seen that when ${}^{ij}A_{ab}$ is large and negative, exothermic mixing (with evolution of heat) may be explained, in contrast to the restriction to athermic or endothermic processes, when only dispersion and polar forces exist. Thus, the answer to the criticism that Hildebrand parameters cannot cope with molecular association or exothermic interactions is that they should be expressed in the form of interaction cohesion parameters. The price to be paid is greater complexity in both evaluation and use.

III. COHESION PARAMETERS AND OTHER SOLVENT SCALES

Parameters describing and correlating the solvent capacities of liquids have been based on a great variety of chemical and physical properties. Some are measures of liquid "basicity." Others are direct determinations of the solubility of a representative solute in a range of liquids, for example, the solubility of hydrogen chloride in the solvents at 10 °C.

Once a solubility scale has been established, it is necessary to determine the position on it of any required solvent. If the solubility scale has a theoretical basis, it may be possible to calculate values from information on other properties, but if it is an empirical scale, direct testing is usually required.

Because Hildebrand parameter values can be determined readily and are widely available, they have been used in conjunction with various other theoretical or empirical parameters to provide more effective predictions of solvent properties. Several examples of the parameters that have been used in this way to form "hybrid" cohesion parameter scales are now described.

A. Hydrogen Bonding Parameters

The Burrell hydrogen bonding classification (Section II.A) has been developed further by assigning quantitative values to the hydrogen bonding capabilities of liquids and plotting graphs of Hildebrand parameters against hydrogen bonding capacities.

An alternative quantitative measure of the capacity of organic liquids to donate and accept hydrogen bonds is sound velocity in solvent–paper systems. Paper fibers are held together largely by hydrogen bonds, and on wetting paper with a liquid, most disruption of the fiber bonding occurs in the presence of those solvents that preferentially form fiber–liquid hydrogen bonds. As a result, the velocity of sound through the paper, which depends on the degree of bonding, decreases as the liquid hydrogen bonding capability increases. Water, as the most common solvent for disrupting fiber–fiber bonds in paper, was chosen as a reference standard, and the hydrogen bonding parameter was defined as

$$ 100 \times \frac{\text{sound velocity in water-soaked paper}}{\text{sound velocity in liquid-soaked paper}} $$
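The definition above amounts to a single ratio, sketched here with a hypothetical function name and invented velocities. Any consistent velocity unit can be used, since only the ratio enters.

```python
def paper_hydrogen_bonding_parameter(velocity_water_soaked, velocity_liquid_soaked):
    """100 x (sound velocity in water-soaked paper) / (velocity in liquid-soaked paper)."""
    if velocity_liquid_soaked <= 0:
        raise ValueError("sound velocity must be positive")
    return 100.0 * velocity_water_soaked / velocity_liquid_soaked


# Hypothetical velocities in arbitrary but identical units.
print(paper_hydrogen_bonding_parameter(1.0, 1.0))   # water itself as the liquid
print(paper_hydrogen_bonding_parameter(1.0, 2.0))   # a liquid that disrupts fewer bonds
```

On this scale water scores 100 by construction, and a liquid that leaves the paper more strongly bonded (higher sound velocity) scores lower.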

B. Electrostatic Parameters

The cohesive properties of materials are closely related to the various electrostatic properties, such as dipole moment, relative permittivity, polarizability, ionization potential, and refractive index. Both dipole moment and relative permittivity are of particular importance: In the absence of specific association, the dipole moment tends to determine the orientation of solvent molecules around molecular solutes, while dissolution of ions is promoted by high relative permittivity of the solvent. The electrostatic factor, which is defined as the product of dipole moment and relative permittivity, takes both effects into account and provides a basis for the classification of solvents. The fractional polarity is also noteworthy: The natures of molecular interactions have been discussed in terms of the fraction of the total interactions due to dipole–dipole or orientation effects. This fractional polarity can be calculated from the dipole moment, polarizability, and ionization potential once certain assumptions have been made.

C. Spectroscopic Parameters

Numerous parameters have been developed on the basis of spectroscopic measurements. One direct hydrogen bonding parameter is based on the effect that a liquid has on a small amount of alcohol introduced into that liquid: The greater the extent of hydrogen bonding between the liquid and the alcohol hydroxyl groups, the weaker the O–H bond and the lower the frequency of the infrared radiation absorbed by that bond. If deuterated methanol or ethanol is used, the O–D stretch band is in a spectroscopic region with little interference, permitting detection at low alcohol concentrations. The extent of the shift to lower frequencies of the O–D stretching infrared absorption of deuterated alcohol in the liquid under study thus provides a measure of its hydrogen bond acceptor capability. The spectrum of a solution of deuterated methanol or ethanol is compared with that of a solution in benzene or other reference liquid, and the hydrogen bonding parameter is defined as 10% of the O–D absorption shift expressed in wave numbers. The choice of the reference liquid is important. Benzene is not an inert solvent, the aromatic electron system having some hydrogen acceptor properties, and there is even the possibility of some hydrogen bonding between methanol and tetrachloromethane, another reference solvent. It appears that alkanes such as cyclohexane, heptane, and isooctane may be preferable as standards.

Spectroscopic hydrogen bonding parameters form a special case of a more general type of parameter described by such names as "electron donating power" and "electron accepting power."

D. Empirical Solvent Scales

There are many empirical tests in common use for quantifying solvent behavior. One group of tests describes the "solvent power" or "strength" of liquid hydrocarbons. The aniline point or aniline cloud point, which is based on the fact that aniline is a poor solvent for aliphatic hydrocarbons and an excellent one for aromatics, is defined as the minimum equilibrium solution temperature for equal volumes of aniline and solvent. An approximately quantitative correlation exists between Hildebrand parameters and the aniline point for hydrocarbons, but it is of limited value. The kauri–butanol number (KB) is a measure of the tolerance of a standard solution of kauri resin in 1-butanol to added hydrocarbon diluent. There is an approximately linear relationship between the Hildebrand parameter and the KB number for hydrocarbons with KB > 35:

$$ \delta/\mathrm{MPa}^{1/2} = 0.040\,\mathrm{KB} + 14.2. \tag{33} $$
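Equation (33) is simple enough to apply by hand, but a small helper makes the stated range of validity explicit. This is a sketch only: the function name is hypothetical, and rejecting KB ≤ 35 is a design choice made here to reflect the condition quoted above, not part of the original correlation.

```python
def hildebrand_from_kauri_butanol(kb):
    """Eq. (33): approximate Hildebrand parameter in MPa^(1/2) for a hydrocarbon.

    The linear relationship is stated only for hydrocarbons with KB > 35,
    so smaller values are rejected rather than extrapolated.
    """
    if kb <= 35:
        raise ValueError("Eq. (33) is stated only for KB > 35")
    return 0.040 * kb + 14.2


# Hypothetical kauri-butanol number, used purely as an example input.
print(hildebrand_from_kauri_butanol(100))
```

Since the relationship is described as only approximately linear, the result is best read as an estimate for screening rather than a measured value.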

The solvent power of "chemical" or "oxygenated" liquids (such as alcohols, ketones, esters, and glycol ethers) is much greater than that of hydrocarbons, and different scales are necessary for their description. The dilution ratio is widely used since it is a direct measure of the tolerance of a solvent–resin mixture to added diluent, and qualitative correlations exist between dilution ratios and Hildebrand parameters. The heptane number of hydrocarbon solvents is a measure of the relative solvent power of high-solvency hydrocarbons in the presence of resins not soluble in heptane. The wax number and other miscibility numbers may also be correlated with the solvent Hildebrand parameter.

Another type of empirical solvent classification scheme has been developed in connection with chromatography, where it is useful to distinguish solvent strength or "polarity" from solvent "selectivity." Gas–liquid chromatographic methods are particularly convenient for quantitative characterization of the solvent properties of the stationary phase, whether this is a liquid or a polymer (see Section V.A). One approach is to define a polarity index, a measure of the capacity of a liquid to dissolve or interact with various polar test solutes. There is, in general, a good correlation between the polarity index values and Hildebrand parameter values, but liquids such as diethyl ether and triethylamine that are strong proton acceptors but have no proton donor capacity have Hildebrand parameter values similar to those of alkanes, although they show up as moderately polar on the polarity index scale. This is because Hildebrand parameters are based on pure liquid properties, while the polarity index is based on the interactions between different liquids.

E. Hybrid Scales

Often cohesion parameters are included in hybrid solvent scales which incorporate several kinds of parameters. For example, a "universal solubility" treatment includes the Hildebrand parameter as a measure of the energy necessary to create a cavity in the bulk liquid to accommodate a solute molecule. This contribution is combined with parameters allowing for solute–solvent dipole interactions, hydrogen-bond donor acidity, hydrogen-bond acceptor basicity, and an empirical coordinate covalency parameter.

IV. APPLICATIONS OF COHESION PARAMETERS

A. Liquids

For a pair of regular liquids, the infinite dilution activity coefficient expression in terms of Hildebrand parameters is given by Eq. (5). For dilute solutions with specific interactions and size effects, resulting in nonzero enthalpies of mixing and nonideal entropies of mixing, the expressions become more complex but are still useful.


For systems composed of two liquids that have substantial but incomplete mutual miscibility, cohesion parameters may be used in combination with other empirical expressions. Although satisfactory data correlation is possible, prediction of the exact extent of mutual solubility is difficult.

Cohesion parameter methods have been used for the correlation of activity coefficients of a wide range of systems. One of the most flexible approaches to the correlation and prediction of activity coefficients has been to include terms for some or all of the following:

• Regular enthalpy of mixing from Hildebrand parameters

• Polar orientation and induction effects

• Entropy effects from hydrogen bonding equilibria

• Gibbs free energy term for the breaking of hydrogen bonds in association interactions

Each component is thus characterized by several parameters. The overall average error in prediction for 845 literature data points for about 300 systems was 25% in the infinite dilution activity coefficient, falling to better than 9% for all saturated hydrocarbons in all solvents, adequate for screening and for some design purposes.

One of the early applications of regular-solution theory was the discussion of activity coefficient ratios and equilibrium constants for complex formation in terms of Hildebrand parameters. There has been considerable debate as to the correct basis of the equilibrium constant (concentration, mole fraction, volume fraction, or molality), different methods proving preferable for various systems. Extension to complexes of ionic species (as in solvent extraction systems) has also occurred.

A common application of solvents is in the separation of the components of mixtures, by countercurrent liquid–liquid extraction or by extractive distillation or by azeotropic distillation. These solvent-aided separation processes can be planned with the aid of cohesion parameters. The selectivity ${}^{k}S_{ij}$ of a solvent $k$ toward a dilute mixture of $i$ and $j$, derived from relationships such as Eq. (5), is given by

$$ RT \ln {}^{k}S_{ij} = {}^{i}V\,{}^{ik}A - {}^{j}V\,{}^{jk}A, \tag{34} $$

with contributions from dispersion, orientation, induction, entropy, and acid–base terms. In general, the choice of a noninteracting solvent with a cohesion parameter that differs significantly from the cohesion parameters of the components to be separated enhances the solvent selectivity, although it reduces the solvent capacity. It is therefore necessary to reach a compromise such that the liquid has an adequate capacity but retains good selectivity and immiscibility between the phases. However, if a liquid can be obtained that specifically interacts with the component to be extracted, then both the selectivity and the solvent capacity are increased. For example, in the case of an alkane $i$ (for which both ${}^{i}\delta_a$ and ${}^{i}\delta_b$ are negligible), an aromatic hydrocarbon $j$ (${}^{j}\delta_a$ negligible), and an electron-accepting liquid $k$ (${}^{k}\delta_b$ negligible), the term for the specific interaction is given by $2\,{}^{j}V\,{}^{j}\delta_b\,{}^{k}\delta_a$.

Another common situation in which cohesion parameters have been employed involves solvent extraction of an electrically neutral ion pair or complex used to transport ionic species distributed in relatively dilute solution between two very dissimilar phases such as hydrocarbon and water. As in all other applications of cohesion parameters, it should not be expected that distribution ratios can be predicted in detail by means of cohesion parameters, and even if the infinite dilution values are approximately correct, these have limited applicability. Rather, the emphasis is on formulating broad trends and establishing correlations.
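In the spirit of formulating broad trends, Eq. (34) can be evaluated as in the following sketch. The function names, the gas constant rounded to 8.314 J mol⁻¹ K⁻¹, the default temperature, and all inputs are assumptions made for illustration; the interchange pressures are supplied by the caller, here from the simple Hildebrand form $({}^{i}\delta - {}^{k}\delta)^{2}$. Molar volumes in cm³ mol⁻¹ multiplied by pressures in MPa give J mol⁻¹ directly.

```python
import math

R = 8.314  # gas constant in J mol^-1 K^-1 (rounded)


def hildebrand_interchange(delta_a, delta_b):
    """Interchange cohesive pressure (MPa) from total parameters only."""
    return (delta_a - delta_b) ** 2


def solvent_selectivity(v_i, a_ik, v_j, a_jk, temperature=298.15):
    """Eq. (34): selectivity of solvent k toward a dilute mixture of i and j.

    v_i, v_j: molar volumes in cm^3 mol^-1; a_ik, a_jk: interchange pressures in MPa.
    """
    ln_s = (v_i * a_ik - v_j * a_jk) / (R * temperature)
    return math.exp(ln_s)


# Hypothetical components: j matches the solvent k exactly, i differs by 5 MPa^(1/2).
a_ik = hildebrand_interchange(15.0, 20.0)
a_jk = hildebrand_interchange(20.0, 20.0)
print(solvent_selectivity(100.0, a_ik, 100.0, a_jk))
```

A value above unity in this invented case reflects the larger interchange cohesive pressure between $i$ and the solvent than between $j$ and the solvent.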

B. Gases

In many situations, such as the dissolution of oxygen in water, there are negligible specific interactions when gases dissolve in liquids, and it is reasonable to expect cohesion parameters, even simple Hildebrand parameters, to provide methods of correlation and prediction.

J. H. Hildebrand and others have long pointed out the close relationship that exists between the logarithm of the mole fraction gas solubility, $\log {}^{j}x_s$, and the solvent Hildebrand parameter ${}^{i}\delta$. For example, the values of $\log {}^{j}x_s$ for nitrogen, argon, methane, ethylene, and ethane were found to be linear functions of ${}^{i}\delta$ for the normal primary alcohols as solvents, but it was subsequently shown that some curved lines became straight when ${}^{i}\delta^{2}$ was used in place of ${}^{i}\delta$:

$$ -\ln {}^{j}x_s = -\ln {}^{j}x_{\text{ideal}} + \left({}^{j}V/RT\right)\left({}^{i}\delta - {}^{j}\delta\right)^{2}, \tag{35} $$

where ${}^{j}V$ is the average partial molar volume of the gas in a range of liquids. There are several similar equations, all tacitly or explicitly assuming that the gaseous solute is condensed to a hypothetical "liquid" state (with hypothetical liquid-state Hildebrand parameter and molar volume) before mixing with the solvent.
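Rearranged for the solubility itself, Eq. (35) reads ${}^{j}x_s = {}^{j}x_{\text{ideal}} \exp[-({}^{j}V/RT)({}^{i}\delta - {}^{j}\delta)^{2}]$. The sketch below evaluates this form; the function name, the rounded gas constant, the default temperature, and every input value are hypothetical and chosen only to show the shape of the dependence.

```python
import math

R = 8.314  # gas constant in J mol^-1 K^-1 (rounded)


def gas_mole_fraction_solubility(x_ideal, v_gas, delta_solvent, delta_gas,
                                 temperature=298.15):
    """Eq. (35) solved for the mole fraction solubility of gas j in solvent i.

    v_gas: partial molar volume of the dissolved gas in cm^3 mol^-1.
    delta_solvent, delta_gas: Hildebrand parameters in MPa^(1/2); delta_gas
    refers to the hypothetical liquid state of the gas.
    """
    exponent = v_gas * (delta_solvent - delta_gas) ** 2 / (R * temperature)
    return x_ideal * math.exp(-exponent)


# Hypothetical gas with x_ideal = 1e-3, V = 50 cm^3/mol, delta = 12 MPa^(1/2).
for delta_solvent in (12.0, 16.0, 20.0):
    print(delta_solvent, gas_mole_fraction_solubility(1e-3, 50.0, delta_solvent, 12.0))
```

The computed solubility equals the ideal value when the solvent parameter matches that of the hypothetical liquid gas and falls off on either side.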

The solubilities of gases such as tetrafluoromethane, sulfur hexafluoride, and carbon dioxide deviate from straight lines because of various properties.

Another simple application of cohesion parameters to gases is the plotting of gas solubility directly against solvent Hildebrand parameter values. A relatively sharp maximum is usually obtained at a Hildebrand parameter value corresponding to that of the hypothetical liquid form of the dissolved gas.


As well as this situation of a gas possessing a hypothetical liquid-like molar volume when it dissolves in a liquid at low pressures, there is also the case of gases that liquefy or achieve liquid-like molar volumes because of low temperatures and/or high pressures. The solvent properties of compressed gases, especially carbon dioxide, for biochemicals and polymers have been receiving more attention recently because of their application in high-pressure gas chromatography and supercritical fluid chromatography.

Cohesion parameter concepts are being utilized for studies of cryogenic liquids, refrigerants, and aerosol propellants, as well as fundamental dense gas and vapor–liquid properties. For example, it is reported that even at 25 K above their critical temperatures, compressed gaseous helium (δ ≈ 8 MPa$^{1/2}$) and xenon (δ ≈ 16 MPa$^{1/2}$) at 200 MPa pressure separate into two phases because of their divergent cohesion properties.

C. Solids

The regular-solution theory and the Hildebrand parameterare based on the enthalpy changes occurring when liquidsare mixed. In order to extend these concepts to solutionsof crystalline solids in liquids, it is necessary to estimatethe thermodynamic activity of the solid referred to a hypo-thetical liquid subcooled below its melting point. If this isdone, expressions such as Eq. (5) can be used to evaluatecohesion parameters of nonvolatile solutes or to estimatetheir solubilities.

The Hildebrand parameter, defined in Eq. (2), has a liq-uid state basis so this hypothetical liquid reference stateis necessary when the method is extended to solids. Useof the vaporization enthalpy or sublimation enthalpy of asolid at 25◦C in Eq. (2) uncorrected for the crystal–liquidtransition enthalpy change does not yield a Hildebrandparameter. It is another type of cohesion parameter thatshould not be confused with any of those defined inSection I.

Despite this complication, the solubility ( jxs on the molefraction scale) of a solid j in a liquid i has been shownto vary regularly with the Hildebrand parameter of thesolvent, plots of log jxs against iδ being nearly linear oronly slightly curved.

It is possible to derive more informative thermodynamicexpressions that involve various assumptions, and severalof these predict that, for solutions that approach regularbehavior, plots of log jxs against (iδ − jδ)2 should be ap-proximately linear, and this relatively simple method isprobably the most widely used method of correlating solidsolubilities in terms of Hildebrand parameters.

If ${}^{i}\delta$, ${}^{j}\delta$, and ${}^{j}V$ (for the subcooled liquid j corresponding to the solid of interest) as well as the melting point and entropy of melting are known for approximately regular systems, it is possible to calculate the data necessary for phase diagrams, either temperature against ${}^{i}x_s$ plots for binary systems or triangular diagrams with temperature contours for ternary systems.
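The calculation outlined above can be sketched numerically. The following is a minimal illustration, not the article's Eq. (5), which is not reproduced in this section: it assumes the common regular-solution form in which an ideal-solubility term (constant entropy of melting) is reduced by a term proportional to the squared difference in Hildebrand parameters. The function names, the dilute-solution simplification (solvent volume fraction taken as 1), and the solute properties are all hypothetical; the solvent values are those of Table I.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1

def ln_ideal_solubility(T, T_m, dS_m):
    """ln(x) for an ideal solution, assuming a temperature-independent
    entropy of melting dS_m (J mol^-1 K^-1)."""
    return -dS_m * (T_m - T) / (R * T)

def ln_regular_solubility(T, T_m, dS_m, V_j, delta_i, delta_j, phi_i=1.0):
    """Ideal term minus an assumed regular-solution term.
    V_j is in cm^3 mol^-1 and delta in MPa^(1/2), so V_j * delta^2 is in J mol^-1.
    phi_i (solvent volume fraction) defaults to 1, i.e. a dilute solution."""
    excess = V_j * phi_i**2 * (delta_i - delta_j) ** 2 / (R * T)
    return ln_ideal_solubility(T, T_m, dS_m) - excess

# Hypothetical solute: illustrative numbers only, not data from the article.
SOLUTE = dict(T_m=353.0, dS_m=54.0, V_j=123.0, delta_j=20.0)
# Solvent Hildebrand parameters (MPa^(1/2)) taken from Table I.
SOLVENTS = {"Hexane": 14.9, "Toluene": 18.2, "Benzene": 18.8, "Acetone": 19.6}

def solubility_table(T=298.15):
    """Mole-fraction solubility of the hypothetical solute in each solvent."""
    return {name: math.exp(ln_regular_solubility(T, delta_i=d, **SOLUTE))
            for name, d in SOLVENTS.items()}

if __name__ == "__main__":
    for name, x in solubility_table().items():
        print(f"{name:10s} x_s = {x:.3f}")
```

Under these assumptions the predicted solubility rises as the solvent parameter approaches that of the solute, and repeating the calculation over a range of temperatures gives the points needed for a binary temperature–composition plot.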

The more detailed and informative interaction cohesion parameters have not been widely used in the correlation of solid–liquid solubilities. Hansen parameters have been extended to ionic solids, but the determination of their values has not been pursued to any great extent. Any extension of interaction cohesion parameters to ionic systems in the future would be very valuable in dealing with solid–liquid systems.

D. Polymers

The parameter most commonly used in the discussion of polymer solutions is the polymer–liquid interaction parameter χ. This parameter has been associated with various theoretical treatments, but it can be considered a general, dimensionless parameter reflecting intermolecular forces between a particular polymer and a particular liquid. It was introduced as an enthalpy of dilution with no entropy component but has been considered subsequently to be a Gibbs free energy parameter. As originally formulated, the polymer–liquid interaction parameter was expected to be inversely dependent on absolute temperature and independent of polymer concentration, but as now empirically defined it depends in an unspecified way on temperature, solution composition, and polymer chain length.

It can be shown that the enthalpy part of the interaction parameter, $\chi_H$, is related to the interchange cohesive pressure by

$$\chi_H = {}^{ij}A\,{}^{i}V/RT. \tag{36}$$

If Hildebrand parameters are used, from Eq. (4),

$$\chi_H = ({}^{i}V/RT)\,({}^{i}\delta - {}^{j}\delta)^2 \tag{37}$$

and in terms of Hansen parameters

$$\chi_H = ({}^{i}V/RT)\left[({}^{i}\delta_d - {}^{j}\delta_d)^2 + ({}^{i}\delta_p - {}^{j}\delta_p)^2 + ({}^{i}\delta_h - {}^{j}\delta_h)^2\right]. \tag{38}$$

The enthalpy component of the polymer–liquid interaction parameter is therefore closely related to the cohesion parameters of the components of the polymer solution.
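As a worked illustration of Eqs. (37) and (38), the short sketch below evaluates $\chi_H$ from tabulated values. The function names are hypothetical, and the temperature of 298.15 K is an assumption. With the molar volume in cm$^3$ mol$^{-1}$ and the cohesion parameters in MPa$^{1/2}$, the product is in J mol$^{-1}$, so dividing by RT gives a dimensionless result.

```python
R = 8.314  # gas constant, J mol^-1 K^-1

def chi_h_hildebrand(V_i, delta_i, delta_j, T=298.15):
    """Eq. (37): enthalpic interaction parameter from Hildebrand parameters.
    V_i in cm^3 mol^-1, delta in MPa^(1/2)."""
    return V_i * (delta_i - delta_j) ** 2 / (R * T)

def chi_h_hansen(V_i, hansen_i, hansen_j, T=298.15):
    """Eq. (38): the same quantity from (delta_d, delta_p, delta_h) triples."""
    total = sum((a - b) ** 2 for a, b in zip(hansen_i, hansen_j))
    return V_i * total / (R * T)

# Toluene (Table I: V = 107, delta = 18.2) with polystyrene (Table III: 18.5).
chi_toluene_ps = chi_h_hildebrand(107.0, 18.2, 18.5)

# Acetone (Table I Hansen values) with poly(methyl methacrylate) (Table II).
chi_acetone_pmma = chi_h_hansen(74.0, (15.5, 10.4, 7.0), (18.6, 10.5, 7.5))

if __name__ == "__main__":
    print(f"toluene/polystyrene  chi_H = {chi_toluene_ps:.4f}")
    print(f"acetone/PMMA         chi_H = {chi_acetone_pmma:.4f}")
```

These values cover only the enthalpy part of χ; as the text notes, the full parameter also carries entropy and composition dependence that cohesion parameters do not supply.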

Cohesion parameter predictions of thermodynamic property values must be corrected to allow for the substantial size differences between polymer and solvent molecules. However, if only semiquantitative “compatibility” information is required, cohesion parameters of amorphous polymers may be used in the same way as those of liquids. When crystalline polymers are considered, it is in principle necessary to calculate the activity of the crystalline solid relative to the real or hypothetical amorphous material at the same temperature. In practice, the distinction between amorphous and crystalline polymers is not clear-cut, and nonthermodynamic empirical methods are frequently used in the estimation of polymer cohesion parameters.

FIGURE 2 Hansen parameter three-dimensional model showing the extent of overlap between poly(phenylene oxide phosphonate ester) (spherical) and cellulose acetate. [Adapted from Cabasso, I., Jaguer-Grodzinski, J., and Vofsi, D. (1977). In “Polymer Science and Technology” (D. Klempner and K. C. Frisch, eds.), Vol. 10, p. 1, Plenum, New York.]

It is not correct to assume that the “best” solvent for a polymer is necessarily the corresponding monomer or low molecular weight liquid polymer (“oligomer”) made up of the same repeating units. Such liquids usually have cohesion parameter values lower than those of the polymers because their molar volumes are higher. Instead, the best solvent is another liquid with a cohesive energy somewhat higher than that of the monomer but with a cohesive pressure the same as that of the polymer.

In one of the original applications of cohesion parameters to polymer solutions, the liquid Hildebrand parameter was combined with the Burrell hydrogen bonding classification (Section II.A). Hansen’s three-component cohesion parameter system was also first developed for polymer–liquid systems.

In addition to the assumption of spherical Hansen parameter “volumes” of polymer–liquid interaction, as described in Section I.C and illustrated in Fig. 2, it is possible to present this information in the form of irregular volumes. Three-dimensional models, stereographs, sets of projections, and contour maps have all been used. In many cases the solubility behavior of a polymer can be adequately represented in two dimensions by δ and $\delta_h$ or by δ and $\delta_p$. Another useful type of plot involves the subdivision of the Hildebrand parameter into a “volume-dependent” part $\delta_v = (\delta_d^2 + \delta_p^2)^{1/2}$ and a “residual” or hydrogen bonding part $\delta_h$. Triangular diagrams also provide a method of conveying information on three properties in two-dimensional plots by making the excessively simple assumption that the Hildebrand cohesion parameter is uniform for all materials and that it is the relative magnitude of the three contributions that determines the extent of miscibility. This is illustrated in Fig. 3.
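The two-dimensional reductions described above can be illustrated with a brief sketch. The volume-dependent part is computed as the root sum of squares of the dispersion and polar parameters. The fractional parameters used for triangular diagrams are assumed here to be each Hansen parameter divided by the sum of the three, a definition often associated with Teas; this section does not state the formula, so treat it as an assumption. Function names are hypothetical.

```python
import math

def delta_v(delta_d, delta_p):
    """Volume-dependent part: (delta_d^2 + delta_p^2)^(1/2)."""
    return math.hypot(delta_d, delta_p)

def fractional_parameters(delta_d, delta_p, delta_h):
    """Assumed fractional cohesion parameters (f_d, f_p, f_h) that sum to 1,
    suitable as coordinates on a triangular diagram."""
    total = delta_d + delta_p + delta_h
    return delta_d / total, delta_p / total, delta_h / total

if __name__ == "__main__":
    # Acetone Hansen parameters from Table I (MPa^(1/2)).
    d, p, h = 15.5, 10.4, 7.0
    print(f"delta_v = {delta_v(d, p):.2f}, delta_h = {h:.1f}")
    f_d, f_p, f_h = fractional_parameters(d, p, h)
    print(f"f_d = {f_d:.3f}, f_p = {f_p:.3f}, f_h = {f_h:.3f}")
```

Because the fractions discard the overall magnitude, two liquids with very different Hildebrand parameters can fall on the same point of the triangle, which is the oversimplification the text warns about.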

Also widely used for polymers are hybrid maps of cohesion parameters with other quantities such as hydrogen bonding parameter, dipole moment, and fractional polarity.

Cohesion parameters are widely used to predict polymer solubility or swelling in liquids, but some liquids also affect the mechanical properties of polymers by environmental stress cracking or crazing. Whatever the cause of these effects, there is a surprisingly close correlation with the liquid Hildebrand parameter (Fig. 4).

FIGURE 3 Limiting boundary of solubility for cellulose nitrate in terms of fractional cohesion parameters for dispersion ($f_d$), polar ($f_p$), and hydrogen bonding ($f_h$) effects. [Adapted from Gardon, J. L., and Teas, J. P. (1976). In “Treatise on Coatings” (R. R. Myers and J. S. Long, eds.), Vol. 2, Chap. 8, Dekker, New York.]

E. Rate and Transport Properties

Many varied theoretical and empirical scales are available for the description of the effect of solvents on the rates of chemical processes in solution, cohesion parameters forming only one. Because the transition state theory can be interpreted in terms of pseudothermodynamic properties, including volume and enthalpy, it is also possible to attribute cohesion parameter values to the activated state. Cohesion parameters are closely related to the internal pressures in the system, and for nonpolar reactions (as well as for polar reactions in nonpolar solvents) the internal pressure of solvents influences reaction rates in the same direction as does external pressure.

The activated complex in a chemical reaction is considered to have properties that approach those of the products of the reaction, and there is a general rule that, if the reaction is one in which the products are of higher cohesion than the reactants, it is accelerated by solvents of high cohesive pressure. Conversely, if the solvent is similar to the reactants in cohesion properties, the rate tends to be lower. For polar or ionic processes the effect of the solvent is more complex, and the cohesion parameter concept is less useful.

Viscous flow can be regarded as a rate process in which molecules move into holes or voids in the liquid. According to this model, the activation energy for viscous flow is related to the energy required to form such a hole and hence to the cohesion parameter of the liquid. One successful application was a relationship between Hildebrand parameters and the initial slopes of plots of dilute solution viscosities against mole fraction composition. In a very different kind of system, the dropping points of greases made from lithium stearate soap and oils have been shown to be a function of the Hildebrand parameters of the oils.

FIGURE 4 Critical strain $\varepsilon_c$ for crazing (open circles) or cracking (filled circles) of poly(2,6-dimethyl-1,4-phenylene oxide) as a function of the Hildebrand parameter δ of the solvent. The $\varepsilon_c$ values in air are in the band near $\varepsilon_c$ = 1.5%. [Adapted from Bernier, G. A., and Kambour, R. P. (1968). Macromolecules 1, 393.]

The limiting viscosities of dilute polymer solutions have also received attention from this point of view. The viscosity of a dilute solution is a maximum in the best solvent, that is, in one in which the cohesion parameters of solvent and polymer are comparable. In a “good” solvent the polymer molecules are “unfolded” or “uncoiled,” obtaining to the maximum extent the more favorable polymer–liquid interactions (and therefore resulting in the greatest viscosity), while in a “poor” solvent the polymer molecules remain folded because of the more favorable intramolecular interactions. Viscosity–cohesion parameter correlations have proved successful for such dilute polymer solutions.

Although the viscosity of a dilute polymer solution is a minimum in poor solvents, as the polymer concentration is increased there is a changeover in behavior: the viscosity exhibits a maximum in poor solvents at higher concentration. There is an aggregation or clustering of polymer molecules as a preliminary to phase separation when solvents are used that have cohesion parameters near the limits of the miscibility range for the polymer. This rather complex behavior can be described in terms of entropy and free volume effects, which are not present in nonpolymer systems, and it should not be expected that it can be fully correlated by means of cohesion parameters that provide direct information only on the enthalpy aspect of the interactions.


Other transport properties have also been correlated by means of cohesion parameters, including gas, liquid, and solid diffusion, permeation in polymers, reverse osmosis in membranes, and the time-dependent mechanical properties of polymers.

F. Surfaces

As a result of the close relationship between solubility or phase separation and surface activity, cohesion parameters are useful for the characterization of heterogeneous as well as homogeneous systems. Interfacial free energy (“surface tension”) is the excess free energy due to the existence of an interface, arising from imbalanced molecular forces. These forces in the bulk of the material are the origin of cohesive properties, so the close link between adhesion and cohesion is easy to understand.

Various empirical or semiempirical equations have been used to link cohesion parameters with surface tension. There is considerable theoretical and experimental justification for subdividing surface free energy into additive components analogous to the cohesion parameter components on the basis of the types of molecular interaction. In fact, Hansen considered the characterization of surfaces in terms of the Hansen parameters of the liquids that spread spontaneously on them. Ideas such as these have been extended to practical problems of wetting, dewetting, adhesion, lubrication, adsorption, colloids, emulsions, and foams.

G. Biological Systems

Solubility is of major importance in biochemical processes, and correlations with cohesion properties have been explored for such purposes as transport of molecules through biological tissues, rationalization of physicochemical influences on biological responses, formulation of drugs in liquid dosage form, design of insecticides, and properties of biocompatible materials.

One particularly interesting example is the action of fluorinated ethers as inhalation convulsants, on the one hand, or as anesthetics, on the other. Those such as hexafluorodiethyl ether with Hildebrand parameters of less than 15 MPa$^{1/2}$ are powerful convulsants, whereas those with δ > 15 MPa$^{1/2}$ such as methoxyflurane are anesthetics. Although all the ethers dissolve equally well in bulk lipids and have similar octanol–water partition values, they dissolve differentially into specific local microenvironments or subregions, which can be considered to have Hildebrand parameter values different from that of the bulk. This indicates that the cohesion parameter concept is valuable both in models concerned with general partition, concentration, or activity and in models that assume more specific mechanisms.

V. EVALUATION OF COHESION PARAMETERS

A. Thermodynamic Calculations and Inverse Gas Chromatography

From the definition of the Hildebrand parameter [Eq. (2)], it is apparent that it is necessary to determine both the enthalpy of vaporization and the molar volume for its evaluation. There is rarely much difficulty in finding a reliable value for the molar volume of a liquid, and solids can be treated as subcooled liquids with molar volumes extrapolated from the liquid state values. Frequently, the main problem is obtaining the enthalpy of vaporization at the temperature of interest, usually 25°C. Direct experimental information is frequently unavailable, and extrapolation methods or even empirical calculations are often necessary, based on such properties as boiling point, corresponding states, activity coefficients, and association constants. The “RT” correction in Eq. (2) assumes that the vapor is ideal, and although gas law corrections may be applied, even at the normal boiling point the correction is usually negligible.

Polymers and solids pose particular problems because the enthalpy of vaporization is unavailable. Interactions of polymers or noncrystalline solids (particularly those used as plasticizers) with liquids can be studied conventionally by using the polymer or solid as the stationary phase in gas chromatography columns. The activity coefficients at infinite dilution of volatile liquids in the polymer may be determined by this “inverse gas chromatography,” using mobile phases to investigate the properties of stationary phases, rather than the reverse. From these activity coefficients, cohesion parameters may be estimated for polymers and some organic solids and liquids.

Hildebrand parameters, Hansen parameters, and the more detailed interaction cohesion parameters can be evaluated from inverse gas chromatography results, but to date this has not been widely practiced.

The thermodynamic quantity internal pressure, given by

$$\pi = (\partial U/\partial V)_T = T(\partial p/\partial T)_V - p \tag{39}$$

and directly accessible from experiment for such nonvolatile materials as polymers [to which Eq. (2) cannot be applied] as well as mixed systems, can also provide cohesion parameter values.
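The internal pressure can be evaluated numerically from any pressure–temperature relation at constant volume by using the thermodynamic equation of state, π = T(∂p/∂T)_V − p. The sketch below is only a consistency check under stated assumptions: it uses a van der Waals fluid, for which the exact result is a/V_m², and the constants a and b are hypothetical round numbers, not data for a real substance.

```python
R = 8.314  # gas constant, J mol^-1 K^-1

def vdw_pressure(T, V_m, a=1.0, b=5.0e-5):
    """van der Waals pressure in Pa; a in Pa m^6 mol^-2, b and V_m in m^3 mol^-1.
    The default a and b are hypothetical illustrative values."""
    return R * T / (V_m - b) - a / V_m**2

def internal_pressure(p_func, T, V_m, dT=1.0e-3):
    """pi = T * (dp/dT)_V - p, using a central finite difference in T."""
    dp_dT = (p_func(T + dT, V_m) - p_func(T - dT, V_m)) / (2.0 * dT)
    return T * dp_dT - p_func(T, V_m)

if __name__ == "__main__":
    T, V_m = 300.0, 1.0e-4
    pi_numeric = internal_pressure(vdw_pressure, T, V_m)
    pi_exact = 1.0 / V_m**2  # a / V_m^2 for the van der Waals model
    print(f"numerical pi = {pi_numeric:.6e} Pa, analytical pi = {pi_exact:.6e} Pa")
```

For a real polymer or mixture, `p_func` would be replaced by measured or fitted pressure–temperature data at fixed volume, which is the experimental route the text refers to.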

B. Empirical Methods

A list of liquids can be compiled with a gradation of Hildebrand parameter values to form a “solvent spectrum.” In its most common form it includes subdivision into categories of hydrogen bonding capacity, as indicated in Table I. The Hildebrand parameter of a solute can then be taken as the midpoint of the range of solvent Hildebrand parameters that provides complete miscibility or the particular value that provides maximum solubility, or maximum swelling in the case of a cross-linked polymer. An ASTM test method for polymer solubility ranges uses mixtures of solvents to provide a spectrum of closely spaced Hildebrand parameters. Other physical properties that can be used as well as solubility and swelling include viscosity and related properties such as grease dropping points. A range of semiempirical equations is also available for correlation and prediction of cohesion parameters.
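The midpoint rule described above is simple enough to sketch in a few lines. The solvent Hildebrand parameters below are those of the poor hydrogen bonding liquids in Table I, but the miscibility outcomes are invented purely for illustration, and the function name is hypothetical.

```python
def hildebrand_from_spectrum(observations):
    """Estimate a solute Hildebrand parameter as the midpoint of the range of
    solvent parameters giving complete miscibility.

    observations: iterable of (solvent_delta, miscible) pairs.
    Returns (midpoint, (low, high))."""
    miscible = [delta for delta, ok in observations if ok]
    if not miscible:
        raise ValueError("no miscible solvents in the spectrum")
    low, high = min(miscible), max(miscible)
    return (low + high) / 2.0, (low, high)

# Solvent spectrum: delta values from Table I (poor hydrogen bonding class);
# the True/False miscibility flags are hypothetical.
SPECTRUM = [
    (14.5, False),  # pentane
    (14.9, False),  # hexane
    (16.8, True),   # cyclohexane
    (18.2, True),   # toluene
    (18.8, True),   # benzene
    (19.8, True),   # chlorobenzene
    (22.7, False),  # nitrobenzene
    (24.7, False),  # acetonitrile
]

if __name__ == "__main__":
    midpoint, (low, high) = hildebrand_from_spectrum(SPECTRUM)
    print(f"miscible range {low}-{high} MPa^1/2, estimated delta = {midpoint:.1f}")
```

A finer spectrum, such as the solvent mixtures used in the ASTM method mentioned in the text, narrows the uncertainty in the two end points and therefore in the midpoint.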

C. Homomorph Methods for Hansen Parameters

In any multicomponent cohesion parameter system such as Hansen parameters, there arises the problem of evaluating the components of the Hildebrand parameter separately. One obvious approach is to compare the properties of compounds that differ only in the presence or absence of a certain group. Here, the homomorph concept is useful: the homomorph of a polar molecule is a nonpolar molecule having very nearly the same size and shape.

For liquids, the Hansen dispersion parameter obtained by homomorph methods can be subtracted from the total cohesion pressure using Eq. (22), with the remainder being split into Hansen hydrogen bonding and polar parameters so as to optimize the description of the solubility and swelling behavior of a range of liquids and polymers. Both empirical methods and group methods (see next section) can be used. Once the three Hansen parameters for each liquid are evaluated, the Hansen parameters for each polymer can be obtained.

This method may distort the relative magnitudes of the intermolecular forces, but as pointed out in Section I.C, the theoretical bases of Hansen parameters are not good in any case.

Interaction cohesion parameters could, in principle, be evaluated in a similar way, but there has been little activity in this area.

D. Group Contribution Methods

Many properties of materials change in a regular way with increasing chain length in a homologous series, and some properties are conveniently linear. The miscibility behavior of materials depends to a large extent on the cohesive and volume properties, specifically the molar cohesive energy −U and the molar volume V, and these quantities, together with their ratios and products, can be estimated in terms of standard contributions from groups of atoms.

The molar cohesive energy can be represented by the summation of atomic or group contributions:

$$-U = -\sum_z {}^{z}U. \tag{40}$$

Hildebrand parameters can be calculated from

$$\delta = \left(\frac{-U}{V}\right)^{1/2} = \left(-\sum_z {}^{z}U \Big/ \sum_z {}^{z}V\right)^{1/2}. \tag{41}$$

Also useful are the group molar attraction constants ${}^{z}F$ defined by

$$\sum_z {}^{z}F = \left(-\sum_z {}^{z}U \sum_z {}^{z}V\right)^{1/2} = \delta V \tag{42}$$

so

$$\delta = \sum_z {}^{z}F \Big/ \sum_z {}^{z}V. \tag{43}$$

Values of ${}^{z}F$ and ${}^{z}V$ have been tabulated for the most common organic molecular groups.
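A group contribution estimate following Eq. (43) can be sketched as below. The group values in the dictionary are assumptions supplied for illustration (molar attraction constants in MPa$^{1/2}$ cm$^3$ mol$^{-1}$ and group molar volumes in cm$^3$ mol$^{-1}$ of the kind found in published tabulations); they are not taken from this article and should be replaced by values from a consistent source before any real use. The function name is hypothetical.

```python
# Assumed illustrative group values, NOT from this article:
#   F in MPa^(1/2) cm^3 mol^-1, V in cm^3 mol^-1.
GROUPS = {
    "CH3": {"F": 438.0, "V": 33.5},
    "CH2": {"F": 272.0, "V": 16.1},
}

def hildebrand_from_groups(counts, groups=GROUPS):
    """Eq. (43): delta = sum(F) / sum(V) over the groups in the molecule.

    counts: mapping of group name -> number of occurrences.
    Returns (delta, total_F, total_V)."""
    total_F = sum(n * groups[g]["F"] for g, n in counts.items())
    total_V = sum(n * groups[g]["V"] for g, n in counts.items())
    return total_F / total_V, total_F, total_V

if __name__ == "__main__":
    # Hexane: two CH3 groups and four CH2 groups.
    delta, F, V = hildebrand_from_groups({"CH3": 2, "CH2": 4})
    print(f"sum F = {F:.0f}, sum V = {V:.1f} cm^3/mol, delta = {delta:.1f} MPa^1/2")
```

With these assumed group values the hexane estimate is about 14.9 MPa$^{1/2}$ with a molar volume near 131 cm$^3$ mol$^{-1}$, which can be compared with the hexane entry in Table I.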

VI. SELECTED VALUES

It is neither appropriate nor practicable to provide here a comprehensive compilation of values, but exhaustive tables are included in the Handbook of Solubility Parameters and Other Cohesion Parameters listed in the bibliography. Rather, listed in Table I are typical values for some liquids whose Hildebrand parameter values, Hansen parameter values, and interaction cohesion parameter values are known with reasonable reliability. The Burrell hydrogen bonding classification (Section II.A) is also included. There is considerable variation in the Hansen parameters reported for water. A study of the solubilities of a range of organics in water suggests $\delta_d$ = 20, $\delta_p$ = 18, $\delta_h$ = 18, and $\delta_t$ = 32 MPa$^{1/2}$ rather than the results in Table I, which are more consistent with the behavior of water in organic liquids. This variability in Hansen parameter values is a fundamental problem associated with the use of the single parameter $\delta_h$, rather than the pair of acid and base parameters.
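The two sets of water values can be checked for internal consistency with a short calculation. The sketch assumes that the total parameter is the root sum of squares of the three Hansen components, which is the usual form of the relation the article cites as Eq. (22); that equation is not reproduced in this section, so the form is an assumption. Function names are hypothetical.

```python
import math

def hansen_total(delta_d, delta_p, delta_h):
    """Assumed relation: delta_t = (delta_d^2 + delta_p^2 + delta_h^2)^(1/2)."""
    return math.sqrt(delta_d**2 + delta_p**2 + delta_h**2)

def residual_after_dispersion(delta_t, delta_d):
    """Combined polar plus hydrogen bonding part left after removing the
    dispersion contribution (as in the homomorph approach)."""
    return math.sqrt(max(delta_t**2 - delta_d**2, 0.0))

if __name__ == "__main__":
    # Water, values suggested by solubilities of organics in water (text).
    print(f"water (organics in water): {hansen_total(20.0, 18.0, 18.0):.1f}")
    # Water, Hansen values listed in Table I.
    print(f"water (Table I):           {hansen_total(16.0, 16.0, 42.0):.1f}")
    # Acetone: residual from the Table I total and Hansen dispersion values.
    print(f"acetone residual:          {residual_after_dispersion(19.6, 15.5):.1f}")
```

Both water sets reproduce their quoted totals to within rounding, so the disagreement between them lies in how the cohesion is partitioned among the components rather than in arithmetic.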

For polymers, interaction cohesion parameters are yet to be determined in any detail, so the values given in Table II are restricted to ranges of polymer Hildebrand parameters (for use with solvents of specified Burrell hydrogen bonding class) and sets of Hansen parameters [together with the interaction radius ${}^{i}R$ of Eq. (25)]. Table III presents preferred Hildebrand parameter values for some well-studied polymers. For solid materials, in general only Hildebrand parameters are available (Table IV).

TABLE I Typical Hildebrand Parameter, Hansen Parameter, and Component Cohesion Parameter Values

All cohesion parameters are in MPa$^{1/2}$; molar volumes are in cm$^3$ mol$^{-1}$.

| Liquid | Hildebrand parameter $\delta_t$ | Component $\delta_d$ | Component $\delta_o$ | Component $\delta_i$ | Component $\delta_a$ | Component $\delta_b$ | Hansen $\delta_d$ | Hansen $\delta_p$ | Hansen $\delta_h$ | Burrell hydrogen bonding class | Molar volume |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Pentane | 14.5 | 14.5 | 0 | 0 | 0 | 0 | 14.5 | 0.0 | 0.0 | Poor | 115 |
| Hexane | 14.9 | 14.9 | 0 | 0 | 0 | 0 | 14.9 | 0.0 | 0.0 | Poor | 131 |
| Diethyl ether | 15.3 | 13.7 | 5 | 1 | 0 | 6 | 14.5 | 2.9 | 5.1 | Moderate | 105 |
| Cyclohexane | 16.8 | 16.8 | 0 | 0 | 0 | 0 | 16.8 | 0.0 | 0.2 | Poor | 108 |
| Ethyl acetate | 18.2 | 14.3 | 8 | 2 | 0 | 6 | 15.8 | 5.3 | 7.2 | Moderate | 98 |
| Toluene | 18.2 | 18.2 | 0 | 0 | 0 | 1 | 18.0 | 1.4 | 2.0 | Poor | 107 |
| Tetrahydrofuran | 18.6 | 15.5 | 7 | 2 | 0 | 8 | 16.8 | 5.7 | 8.0 | Moderate | 82 |
| Benzene | 18.8 | 18.8 | 0 | 0 | 0 | 1 | 18.4 | 0.0 | 2.0 | Poor | 89 |
| Acetone | 19.6 | 13.9 | 10 | 3 | 0 | 6 | 15.5 | 10.4 | 7.0 | Moderate | 74 |
| Chlorobenzene | 19.8 | 18.8 | 4 | 0.6 | 0 | 2 | 19.0 | 4.3 | 2.0 | Poor | 102 |
| Bromobenzene | 20.2 | 19.6 | 3 | 0.4 | 0 | 2 | 20.5 | 5.5 | 4.1 | Poor | 105 |
| 1,4-Dioxane | 20.7 | 16.0 | 11 | 2 | 0 | 2 | 19.0 | 1.8 | 7.4 | Moderate | 86 |
| Pyridine | 21.7 | 18.4 | 8 | 2 | 0 | 10 | 19.0 | 8.8 | 5.9 | Strong | 81 |
| Acetophenone | 21.7 | 19.6 | 6 | 1 | 0 | 7 | 19.6 | 8.6 | 3.7 | Moderate | 117 |
| Benzonitrile | 21.9 | 18.8 | 7 | 2 | 0 | 5 | 17.4 | 9.0 | 3.3 | Poor | 103 |
| Propionitrile | 22.1 | 14.1 | 14 | 4 | 0 | 4 | 15.3 | 14.3 | 5.5 | Poor | 71 |
| Quinoline | 22.1 | 21.1 | 4 | 0.6 | 0 | 9 | 19.4 | 7.0 | 7.6 | Strong | 118 |
| N,N-Dimethylacetamide | 22.1 | 16.8 | 10 | 3 | 0 | 9 | 16.8 | 11.5 | 10.2 | Moderate | 92 |
| Nitroethane | 22.5 | 14.9 | 12 | 5 | 0 | 2 | 16.0 | 15.5 | 4.5 | Poor | 71 |
| Nitrobenzene | 22.7 | 19.4 | 7 | 2 | 0 | 2 | 20.0 | 8.6 | 4.1 | Poor | 103 |
| N,N-Dimethylformamide | 24.1 | 16.2 | 13 | 5 | 0 | 9 | 17.4 | 13.7 | 11.3 | Moderate | 77 |
| Dimethylsulfoxide | 24.5 | 17.2 | 13 | 4 | 0 | 11 | 18.4 | 16.4 | 10.2 | Moderate | 71 |
| Acetonitrile | 24.7 | 13.3 | 17 | 6 | 0 | 8 | 15.3 | 18.0 | 6.1 | Poor | 53 |
| Nitromethane | 26.4 | 14.9 | 17 | 6 | 0 | 3 | 15.8 | 18.8 | 5.1 | Poor | 54 |
| Water (see text) | 48 | 13 | 31 | 21 | 34 | 22 | 16 | 16 | 42 | Strong | 18 |

VII. CURRENT STATUS

Theoretical development of this topic appears to have reached something of a plateau, but there is an increasing number of practical applications of cohesion parameters and of computational methods simplifying the process.

TABLE II Typical Hildebrand Parameter and Hansen Parameter Values for Polymers, δ (MPa$^{1/2}$)

The last three columns give Hildebrand parameter ranges in liquids of the stated Burrell hydrogen bonding class.

| Polymer (manufacturer) | Hansen $\delta_d$ | Hansen $\delta_p$ | Hansen $\delta_h$ | ${}^{j}R$ | Poor | Moderate | Strong |
|---|---|---|---|---|---|---|---|
| Pentalyn® 255 alcohol-soluble resin (Hercules) | 17.6 | 9.4 | 14.3 | 10.6 | 18–21 | 15–22 | 21–30 |
| Pentalyn® 830 alcohol-soluble rosin resin (Hercules) | 20.0 | 5.8 | 10.9 | 11.7 | 17–19 | 16–22 | 19–23 |
| Cellulose nitrate, 0.5 sec | 15.4 | 14.7 | 8.8 | 11.5 | 23–26 | 16–30 | 26–30 |
| Cellolyn® 102 pentaerythritol ester of rosin, modified (Hercules) | 21.7 | 0.9 | 8.5 | 15.8 | 16–21 | 17–22 | 21–24 |
| Versamid® 930 thermoplastic polyamide (General Mills) | 17.4 | −1.9 | 14.9 | 9.6 | — | — | 19–23 |
| Poly(methyl methacrylate) (Rohm and Haas) | 18.6 | 10.5 | 7.5 | 8.6 | 18–26 | 17–27 | — |
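The polymer Hansen parameters and interaction radii of Table II are typically used by testing whether a liquid falls inside the polymer's interaction sphere. The sketch below assumes the distance formula commonly used with Hansen parameters, in which the dispersion difference is weighted by a factor of 4; the article's Eq. (25) is not reproduced in this section, so that form is an assumption, and the function names are hypothetical.

```python
import math

def hansen_distance(liquid, polymer):
    """Assumed distance between two (delta_d, delta_p, delta_h) points,
    with the dispersion difference weighted by 4."""
    dd, dp, dh = (a - b for a, b in zip(liquid, polymer))
    return math.sqrt(4.0 * dd**2 + dp**2 + dh**2)

def inside_sphere(liquid, polymer, radius):
    """True if the liquid lies within the polymer's interaction radius."""
    return hansen_distance(liquid, polymer) <= radius

# Poly(methyl methacrylate): Hansen parameters and radius from Table II.
PMMA = (18.6, 10.5, 7.5)
PMMA_RADIUS = 8.6
# Liquids: Hansen parameters from Table I.
LIQUIDS = {"Acetone": (15.5, 10.4, 7.0), "Pentane": (14.5, 0.0, 0.0)}

if __name__ == "__main__":
    for name, params in LIQUIDS.items():
        dist = hansen_distance(params, PMMA)
        verdict = "inside" if inside_sphere(params, PMMA, PMMA_RADIUS) else "outside"
        print(f"{name:8s} distance = {dist:.1f} ({verdict} radius {PMMA_RADIUS})")
```

Under this assumption acetone lies inside the sphere and pentane well outside it, in line with the spherical "volume" picture of Fig. 2.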

An excellent example of the determination and application of Hildebrand parameters to a “new” solvent and its compatibility with polymers is provided by 1,8-cineole. This compound, present at high levels in the leaf oils of some eucalypts, is proposed as a replacement for the solvent 1,1,1-trichloroethane (which is now known to cause stratospheric ozone depletion). On the basis of calculations such as those described in Section V, a Hildebrand parameter of 18 MPa$^{1/2}$ for cineole was deduced, which is within the range of values suggested by the polymer solubilities. This is close to the value for trichloroethane (17 MPa$^{1/2}$), successfully predicting the efficacy of cineole as a replacement solvent.

TABLE III Preferred Hildebrand Parameter Values for Selected Polymers

| Polymer | δ (MPa$^{1/2}$) |
|---|---|
| Polyacrylonitrile | 26 |
| Polybutadiene | 17.0 |
| Poly(butyl acrylate) | 18.5 |
| Cellulose acetate | 24 |
| Cellulose nitrate | 21 |
| Polychloroprene | 18.5 |
| Poly(dimethylsiloxane) | 15.5 |
| Ethyl cellulose | 20 |
| Polyethylene | 17.0 |
| Poly(ethylene oxide) | 24 |
| Poly(ethyl methacrylate) | 18.5 |
| Polyisobutylene | 16.5 |
| Polyisoprene, natural rubber | 17.0 |
| Poly(methyl acrylate) | 20.5 |
| Poly(methyl methacrylate) | 19.0 |
| Polypropylene | 16.5 |
| Polystyrene | 18.5 |
| Poly(tetrafluoroethylene) | 13 |
| Poly(vinyl acetate) | 20 |
| Poly(vinyl chloride) | 19.5 |

Further, in developing a new solvent or solvent blend, it is also necessary to determine what polymers are likely to be affected adversely if exposed to the liquid or vapor. Table V shows that polymers having Hildebrand parameters within 1 MPa$^{1/2}$ of the cineole value are soluble and that those within 3 MPa$^{1/2}$ can swell significantly, which is useful as an initial guide. However, some polymers (such as polyethylene) show good resistance to cineole despite having similar Hildebrand parameter values, which demonstrates the limitations of such predictions.
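The screening rule just described, and its limitations, can be made concrete with a small sketch. The thresholds of 1 and 3 MPa$^{1/2}$ and the cineole value of 18 MPa$^{1/2}$ come from the text; the polymer data are transcribed from Table V, restricted to rows with a single numerical value. Turning the guide into hard cut-offs is a simplification introduced here, and the function names are hypothetical.

```python
CINEOLE_DELTA = 18.0  # MPa^(1/2), value quoted in the text

# (polymer, delta in MPa^(1/2), observed effect) transcribed from Table V;
# only rows with a single numerical delta are included.
TABLE_V = [
    ("Natural rubber", 17.0, "soluble"),
    ("Polystyrene", 18.5, "soluble"),
    ("Neoprene", 18.5, "strongly swollen"),
    ("Silicone rubber", 15.5, "strongly swollen"),
    ("Polyethylene", 17.0, "resistant"),
    ("Poly(methyl methacrylate)", 20.5, "resistant"),
    ("Polypropylene", 16.5, "resistant"),
    ("Poly(tetrafluoroethylene)", 13.0, "resistant"),
    ("Poly(vinyl chloride)", 19.5, "resistant"),
]

def predict_effect(delta_polymer, delta_solvent=CINEOLE_DELTA):
    """Apply the rough guide as hard cut-offs (a simplifying assumption)."""
    gap = abs(delta_polymer - delta_solvent)
    if gap <= 1.0:
        return "soluble"
    if gap <= 3.0:
        return "strongly swollen"
    return "resistant"

def mismatches(rows=TABLE_V):
    """Rows where the cut-off prediction disagrees with the observation."""
    return [(name, predict_effect(delta), seen)
            for name, delta, seen in rows
            if predict_effect(delta) != seen]

if __name__ == "__main__":
    for name, predicted, seen in mismatches():
        print(f"{name}: predicted {predicted}, observed {seen}")
```

Several semicrystalline or otherwise resistant polymers, polyethylene among them, are misclassified by the Hildebrand parameter alone, which is the limitation the text points out.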

A demonstration of the great variety and extent of applications of cohesion parameters is provided by the results of an internet search using Alta Vista, <http://www.altavista.com/cgi-bin/query?pg=aq>, conducted in June 1999 for the expression (“solubility parameter*” or “cohesion parameter*” or “Hildebrand parameter*” or “Hansen parameter*”), yielding 460 hits.

The information resulting from this search has now been organized and collected at the two sites, <http://www.mallee.com/parameters.html> and <http://wwwchem.murdoch.edu.au/staff/barton/parameters.html>, with convenient hyperlinks to most of the sites found in the search.

Motivation for providing free internet information of the kind seen in these sites is determined by commercial considerations through the opportunity for attracting potential clients:

• Computer modeling and simulation software for chemical systems often includes the ability to estimate cohesion parameters, a major growth area.

• Chemical manufacturers incorporate cohesion parameter values (of either the Hildebrand or Hansen variety) in material safety data sheets.

• A few educational institutions provide cohesion parameter information as a component of chemical or polymer science, for example, University of Missouri–Rolla, <http://www.umr.edu/~wlf/>, and discussion lists such as <http://www.hmco.com/college/chemistry/resourcesite/digests/chemedl/cedjun96/msg00149.htm>.

• Some publishers and conference organizers providing titles or abstracts of papers to be presented or published include cohesion parameter topics, but full texts are rarely available.

TABLE IV Typical Hildebrand Parameter Values for Predominantly Covalent Crystalline Solids (Assumed to be Subcooled Liquids)

| Solid | δ (MPa$^{1/2}$) |
|---|---|
| Alcohol: 1-hexadecanol | 20 |
| Aliphatic acids | 18–22 |
| Amines, anilines, amides | 20–30 |
| Aromatic hydrocarbons | 20–22 |
| Barbituric acid derivatives | 23–28 |
| Benzoic acid, substituted benzoic acids | 23–29 |
| Cholesterol | 19 |
| Cholesteryl esters | 15–19 |
| Cortisone and related compounds | 27–30 |
| Halogen compounds of Sn, As, Sb, Bi | 23–30 |
| Iodine | 29 |
| Lipids | 18–27 |
| Metal soaps | 18–19 |
| Methyl xanthines, including caffeine | 24–29 |
| Norethindrone and derivatives | 20–22 |
| Phenols, including antioxidants and nitrophenols | 19–22 |
| Phosphorus | 27 |
| Sulfonamides | 25–30 |
| Sulfur | 26 |
| Testosterone and derivatives | 19–20 |


TABLE V Relative Resistance of Polymers to 1,8-Cineole (δ = 18 MPa$^{1/2}$)

| Effect of cineole on polymer (4-month continuous exposure) | Polymer | δ (MPa$^{1/2}$) |
|---|---|---|
| Soluble | Natural rubber | 17.0 |
| | Polystyrene | 18.5 |
| | Styrene-butadiene elastomer | — |
| Strongly swollen (>100%) | Neoprene | 18.5 |
| | Polyurethane rubber | 20–21 |
| | Silicone rubber | 15.5 |
| Little swelling (<10%) | Acrylic-styrene-acrylonitrile | ~22 |
| | Polyester urethane | ~20 |
| | Polyester urethane | ~20 |
| | Polycarbonate/acrylic-styrene-acrylonitrile | — |
| | Fluoro elastomer | — |
| Resistant | Acrylonitrile-butadiene-styrene | — |
| | Acetal | 21–22 |
| | Nylon-6 | 24–32 |
| | Nylon-6,6 | 28–32 |
| | Polyethylene | 17 |
| | Poly(butylene terephthalate) | 21–22 |
| | Poly(ethylene terephthalate) | 21–22 |
| | Poly(methyl methacrylate) | 20.5 |
| | Polypropylene | 16.5 |
| | Polycarbonate | 19–22 |
| | Styrene-acrylonitrile | — |
| | Poly(tetrafluoroethylene) | 13 |
| | Poly(vinyl chloride) | 19.5 |

• Scientific and technical consultancies use cohesion parameters to promote their services.

• Institutions and individuals include reference to their publications or research activities in resumes and bibliographies, such as the University of Geneva, <http://www.unige.ch/sciences/pharm/fagal/barra~4.html#pubi>, and Princeton University, <http://www.princeton.edu/~chemical/faculty/pubs.html>.

As with most other areas of the physical sciences, the amount of internet-accessible material describing the theoretical background is limited. Probably the most comprehensive source is that written by John Burke (Oakland Museum of California), <http://sul-server-2.stanford.edu/byauth/burke/solpar/>.

For values of individual liquids, by far the greatest amount of quantitative information is provided in the pages of Charles Tennant & Co, <http://www.ctennant.co.uk/tenn04.htm>, with smaller numbers of compounds being listed by companies such as Shell, DuPont, Monsanto, Aeropres, Lal, and Eastman.

A smaller but increasing number of sites deal with compressed gases and supercritical fluids, for example, Polymerizations in Supercritical CO$_2$ from the University of Groningen, <http://polysg2.chem.rug.nl/>.

Solvent applications of fluids have always been an important use of cohesion parameters, such as A Review of Supercritical Carbon Dioxide Extraction of Natural Products from Engineering World, <http://www.exicom.org/cew/oct97/awasthi.htm>.

Chemical properties, as well as physical properties, are now being correlated by means of cohesion parameters, for example, Solution Effects on Cesium Complexation with Calixarene-Crown Ethers from Liquid to Supercritical Fluids (University of Idaho), <http://www.doe.gov/em52/65351.html>.

Liquid crystal applications are still poorly represented, and studies of solids are mostly limited to pharmaceuticals, such as Partial-Solubility Parameters of Naproxen and Sodium Diclofenac from the Journal of Pharmacy and Pharmacology, <http://dialspace.dial.pipex.com/town/avenue/ax60/j98020.htm>.

As in the printed literature, applications to polymers dominate internet cohesion parameter sites, for example, American Polywater Corporation on polycarbonate stress cracking, <http://www.polywater.com/cracking.html>; Millipore, <http://millispider.millipore.com/micro/mieliq/MAL104.htm>; and Evaluating Environmental Stress Cracking of Medical Plastics from Medical Device Link, <http://www.devicelink.com/mpb/archive/98/05/001.html>.

Polymer design is represented by IF/Prolog, <http://www.ifcomputer.com/Products/IFProlog/Applications/PolymerDesign/home–en.html>.

Surfaces receive less attention than homogeneous fluid and solid phases, although coatings and adhesives are important, with typical sites being University of Missouri–Rolla, <http://www.umr.edu/~wlf/Adhesion/young.html>, and Shell Chemicals, <http://www2.shellchemical.com/CMM/WEB/GlobChem.NSF/Searchv/SC:2023-94?OpenDocument>.

Chromatographic properties, such as Relationship Between Retention Behavior of Substituted Benzene Derivatives and Properties of the Mobile Phase in RPLC in the Journal of Chromatographic Science, <http://www.j-chrom-sci.com/353sun.htm>, provide additional possibilities for applications of cohesion parameters to complex systems.

Nanoparticles are beginning to be discussed in terms of cohesion parameters, for example, on the formation of gliadin nanoparticles: the influence of the solubility parameter of the protein solvent, <http://link.springerny.com/link/service/journals/00396/bibs/8276004/82760321.htm>, but dyes are surprisingly under-represented.

Among environmental applications, the area of materials substitution is the most potentially productive use of cohesion parameters, for example, Toluene Replacement in Solvent Borne Pressure Sensitive Adhesive Formulations (Shell Chemicals), <http://www2.shellchemical.com/CMM/WEB/GlobChem.NSF/Searchv/SC:2023-94?OpenDocument>; Solvents – the Alternatives from the Waste Reduction Resource Center, Raleigh, NC, <http://www.p2pays.org/ref/01/00023.htm/index.htm>; and Solvent Substitution Data Systems (U.S. EPA), <http://es.epa.gov/ssds/ssds.html>.

Databases and compilations include ThermoDex, <http://thermodex.lib.utexas.edu/search.jsp>, and Infochem, <http://www.infochem.demon.co.uk/data.htm>.

Examples of modeling and simulation sites are SciPolymer from ScienceServe, <http://www.scienceserve.com/wwwscivision/scipolymer/prediciti.htm>; Molecular Analysis Pro™, <http://www.sge.com/software/soft/13051.htm>; The Askadskii Approach from MillionZillion Software, <http://www.millionzillion.com/cheops/metho1/askadskii.htm>; Cerius2 from Molecular Simulations Inc., <http://www.man.poznan.pl/software/msi-doc/cerius38/TutMats/tut–synthia.doc.htm1#740585>; <http://www.msi.com/solutions/polymers/misc.html>; and ProCAMD from Capec, <http://www.capec.kt.dtu.dk/capec/docs/software/procamd/procamd.html>.

QSAR (quantitative structure activity relationships) and QSPR (quantitative structure property relationships), which study relationships between useful chemical and

ideal candidates for this kind of approach, for example, Molecular Analysis Pro™, <http://www.chemsw.com/13051.htm>.

While internet sources do not provide the comprehensive and integrated information available in the Handbook of Solubility Parameters and Other Cohesion Parameters (Barton, 1992) or even the convenient summary in the Encyclopedia of Physical Science and Technology, they should not be overlooked for information on recent developments.

SEE ALSO THE FOLLOWING ARTICLES

BONDING AND STRUCTURE IN SOLIDS • GAS CHROMATOGRAPHY • HYDROGEN BONDS • LIQUIDS, STRUCTURE AND DYNAMICS • MOLECULAR HYDRODYNAMICS • SURFACE CHEMISTRY

BIBLIOGRAPHY

Barton, A. F. M. (1983). In “Polymer Yearbook” (H. G. Elias and R. A. Pethrick, eds.), p. 149, Harwood, Chur, Switzerland.

Barton, A. F. M. (1992). “Handbook of Solubility Parameters and Other Cohesion Parameters,” 2nd ed., CRC Press, Boca Raton, FL.

Barton, A. F. M. (1990). “Handbook of Polymer-Liquid Interaction Parameters and Solubility Parameters,” CRC Press, Boca Raton, FL.

Barton, A. F. M., and Knight, A. R. (1996). J. Chem. Soc. Faraday Trans. 92, 753.

Burrell, H. (1975). In “Polymer Yearbook” (J. Brandrup and G. H. Immergut, eds.), 2nd ed., IV-337, Wiley (Interscience), New York.

Hansen, C. M. (1969). Ind. Eng. Chem. Prod. Res. Dev. 8, 2.

Hansen, C. M., and Beerbower, A. (1971). In “Kirk-Othmer Encyclopedia of Chemical Technology,” 2nd ed. (A. Standen, ed.), Suppl. Vol., p. 889, Wiley (Interscience), New York.

Hoy, K. L. (1970). J. Paint Technol. 42, 76.

Karger, B. L., Snyder, L. R., and Eon, C. (1978). Anal. Chem. 50, 2126.

Rowes, R. (1985). “Chemists propose universal solubility equation,” Chem. Eng. News (March 18), 20.


Encyclopedia of Physical Science and Technology EN004G-160 June 15, 2001 12:44

Crystallography

Jeffrey R. Deschamps
Judith L. Flippen-Anderson
Laboratory for the Structure of Matter, Naval Research Laboratory

I. Introduction
II. Evolution of Crystallography
III. Structure of a Crystal
IV. Steps in Crystal Structure Analysis
V. Comparison of X-Ray and Neutron Diffraction
VI. Results
Appendix I: Factors Affecting Intensities
Appendix II: Methods of Structure Solution
Appendix III: Methods of Refinement

GLOSSARY

Absorption edge Sharp discontinuity in the variation of the linear absorption coefficient with the wavelength of the incident radiation. The discontinuity occurs when the energy of the incident radiation, E = hν, matches the excitation energy of an electron in an atom of the sample.

Anomalous dispersion A phenomenon that influences the intensities of X-ray reflections and causes a difference in the intensity of equivalent reflections. The effect is particularly important in studies of single crystals in polar space groups and is used in some experiments to determine phase information.

Bragg reflection When X-rays strike a crystal they are diffracted only when the Bragg equation, nλ = 2d sin θ (where n is an integer and d is the spacing of a set of lattice planes), is satisfied. The diffracted beam is considered a reflection.

Bravais lattice One of the 14 possible arrays of points repeated periodically in three-dimensional space such that the arrangement of points about any one point is identical in every respect to the arrangement of points about any other point in the lattice.

Centrosymmetric A structure or space group containing an inversion center is centrosymmetric; if there is no inversion center, it is noncentrosymmetric.

Diffractometer An instrument used to measure the position (i.e., Bragg angle) and relative intensity of the diffraction pattern produced by a crystalline material.

Lattice Any repetitive pattern can be described by noting the motif (the unit of pattern that repeats by translation) and the translation interval. In the case of a three-dimensional pattern such as a crystal, the lattice describes translations in three dimensions. It is an imaginary, mathematical construct characterized by three translations, a, b, c, and three angles, α, β, γ.


Miller indices A set of integers with no common factors, inversely proportional to the intercepts with the crystal axes of a lattice plane.

Orientation matrix A matrix relating the crystal axes to the instrument axes such that one can predict the values of the instrument angles (2θ, ω, χ, and φ) for a given reflection of the crystal.

Patterson function A Fourier summation that uses the squares of the structure factor magnitudes as coefficients. The peaks in this map correspond to vectors between atoms. The peak height is related to the scattering powers of the atoms at the two ends of the vector. The region around the origin gives information about bonded distances.

Phase problem A central problem of crystallography. The intensities of the different reflections allow derivation of the amplitude of the structure factors but not their phases. The phases are required in order to calculate the electron density, which is a “map” showing the position of atoms in the unit cell.

Point group A group of symmetry operations that leave unmoved at least one point within the object to which they apply.

Polar space group Space group in which the origin is not fixed by symmetry and hence must be defined (e.g., the space group P2₁).

Reciprocal lattice A set of imaginary points constructed in such a way that the direction of a vector from one point to another coincides with the direction of a normal to the real space planes within the crystal. The separation of those points (absolute value of the vector) is equal to the reciprocal of the real inter-planar distance.

Space group Identical atom groups are usually symmetrically arranged within the crystal lattice. The symmetry relating the groups may be due to rotations, inversions, mirror planes, or some other relational operation. The space group constitutes a mathematical shorthand description of the symmetry operations required to produce the unit cell.

Special position A point left invariant by at least two symmetry operations of the space group.

Structure factor $F_{hkl}$, complex quantity corresponding to the amplitude and phase of the diffraction maximum associated with the reciprocal lattice point hkl:

$$F_{hkl} = \sum_{j=1}^{N} f_j \exp[2\pi i(hx_j + ky_j + lz_j)]$$

where N is the number of atoms in the unit cell and $x_j$, $y_j$, $z_j$ are the fractional coordinates of the jth atom.

Torsion angle If a group of four atoms (ABCD) is projected onto a plane normal to the bond between B and C, the angle between bonds connecting A and B, and C and D is the torsion angle.

Unit cell Parallelepiped bounded by three noncoplanar vectors a, b, c with angles α, β, γ that repeats by translation. If this unit is the smallest volume that meets these criteria it is referred to as the primitive unit cell.

X-ray Electromagnetic radiation with wavelengths in the range 0.01 to 1.0 nm (0.1 to 10 Å). The wavelength range most commonly used in diffraction experiments is between 0.71 and 1.54 Å. Shorter wavelengths require longer path-lengths (distance from crystal to detector) in order to resolve adjacent peaks in the diffraction pattern. Both the sample and the air along the beam path can significantly attenuate longer wavelengths.

CRYSTALLOGRAPHY is a broadly encompassing discipline that involves a variety of fields of study. The primary concern of modern crystallography is the three-dimensional arrangement of atoms in matter. Although the term most often refers to studies of crystalline solids (either single crystals or crystalline powders) using X-ray or neutron diffraction, it encompasses a much broader range of methodologies.

I. INTRODUCTION

In the late 1660s, crystallography began as the study of the macroscopic geometry of crystals. Crystals were grouped into systems on the basis of the symmetry of their external shapes. Based on these observations, seven crystal systems were identified: triclinic, monoclinic, trigonal, tetragonal, hexagonal, orthorhombic, and cubic. It was theorized that the observed crystal shapes could be built up by stacking minute balls. This idea was refined, and in the late 18th century it was thought that crystals were composed of elementary building blocks (now referred to as the unit cell). This was supported by the orderly cleavage angles of calcite, which suggested a regular stacking of these elementary blocks. Further studies in the 1800s led to derivation of the 32 different point groups, the Bravais lattices, and the 230 space groups. All of these advances were made before any direct observations of the arrangement of atoms within a crystal were possible.

In 1912, Von Laue reasoned that the arrangement of atoms in crystals could help him measure the wavelength of X-rays. Based on experiments with copper sulfate, he demonstrated the ability of crystals to act as three-dimensional diffraction gratings. Crystallography had entered a new era: the analysis of the arrangement of atoms in a crystal by careful analysis of the diffraction pattern of that crystal. Crystal structures were now viewed as being built up from repeating units of an atomic pattern rather than the regular stacking of solid shapes.


As new types of scattering were discovered, they became incorporated into the discipline, and crystallography came to include structural studies of all classes of substances. Absorption, diffraction, or other scattering methods are used to study crystals, powders, amorphous materials, surfaces, liquids, and gases. The physics and methodologies associated with these techniques are also part of the science of crystallography.

Crystallographic studies play a vital role in materials science, chemistry, pharmacology, mineralogy, polymer science, and molecular biology. Accurate knowledge of molecular structures is a prerequisite for rational drug design and structure-based functional studies. Crystallography is the only method for determining the “absolute” configuration of a molecule. Absolute configuration is a critical property in biological systems, as a change in configuration may alter the response of the biological system.

A requirement for the high accuracy of crystallographic structures is that a good crystal must be found, and this is often the rate-limiting step. Additionally, only limited information about the dynamic behavior of the molecule is available from a single diffraction experiment.

In the past three decades, new developments in detectors, increases in computer power, and powerful graphics capabilities have contributed to a dramatic increase in the number of materials characterized by crystallography. Synchrotron sources offer the possibility of time-resolved studies of physical, chemical, and biochemical processes in the millisecond to nanosecond range; the ability to study the nearest neighbors of cations present at parts per million concentrations; and the possibility of recording small-angle scattering data and powder data in seconds. Charge-density studies have been made on numerous light atom structures and are beginning to provide new insights into bonding of transition metals. Rietveld refinement is revolutionizing the study of powders and is being extended to fibers. Direct methods of structure solution are being applied successfully to structures of over 1000 atoms. The Human Genome Project has created many opportunities for crystallographic studies of biological macromolecules and resulted in intense activity in the areas of structural genomics and proteomics. Polarized neutrons are being used to determine the spin structure of magnetic materials and to probe the surface structure of such materials. The review that follows attempts to describe the fundamentals of crystallography and capture the excitement and diversity within the discipline.

II. EVOLUTION OF CRYSTALLOGRAPHY

It was not very long ago that X-ray diffraction data were collected on photographic film, intensities of spots on the film (corresponding to data points) were “measured” by eye, and Fourier transforms were performed with Beevers–Lipson or Patterson–Tunell strips and summed by hand. Three-dimensional electron-density maps were plotted by hand, one section at a time, traced onto glass sheets, and stacked in frames for interpretation. As long as all calculations were done by hand, small, flat molecules were the ones most amenable to study. Early computers were difficult to program and had limited storage capacity, but they made it possible to solve and refine crystal structures of molecules of moderate size in 6 months to a year.

The rapid advances in crystallography owe much to the development of computer-controlled diffractometers for data collection, high-speed computers for data analyses, and, most recently, powerful graphics devices for displaying structures with the ability to perform real-time manipulations. Tasks that required months of effort can now be accomplished in minutes with the aid of a computer.

A. The Early Years

The first experiments in X-ray diffraction were recorded on film. By 1913, W. H. Bragg had constructed the first “X-ray spectrometer” to allow a more careful study of X-rays. This instrument also proved useful in studies on crystals. Using measurements made with this X-ray spectrometer, Bragg’s son determined the structures of fluorspar, cuprite, zincblende, iron pyrites, sodium nitrate, and the calcite group of minerals.

The first crystal structures reported were of substances crystallizing in cubic space groups. The structure of diamond was determined in 1913 by the Braggs, from symmetry considerations using the observed intensities to discriminate among possible structures. Their model established the carbon–carbon single bond distance of 1.52 Å and confirmed that bonds to carbon are directed tetrahedrally. The younger Bragg combined symmetry arguments with the notion that the scattering power of atoms is related to their atomic weight to explain the structures of the alkali halides. These concepts were extended to ZnS, CaF2, and FeS2. The alkali halide structures show anions surrounded by cations and cations surrounded by anions, demonstrating conclusively that discrete “molecules” of the type NaCl do not exist in crystals of ionic materials. Pauling published the first intermetallic structure, Mg2Sn, in 1923. In that same year, Dickinson and Raymond showed that hexamethylene tetramine consisted of discrete molecules, each having the same structure, with C–N distance of 1.44 Å in a body-centered cubic lattice of edge 7.02 Å.

1. Heavy-Atom Methods

The elder Bragg realized that the periodic pattern in the electron-density distribution could be represented by a Fourier summation. The coefficients of this summation became known as the structure factors. This allowed the solution of structures where positions of atoms were not restricted to special positions. The determination of the structure of diopside, CaMg(SiO3)2, in 1928 was the first example of the use of this method.

In 1934, Patterson showed that a Fourier series with $|F|^2$ as coefficients could be summed without knowledge of the phases and would reveal interatomic vectors. Since the weight of a peak in the Patterson function is proportional to the product $Z_i Z_j$ of the atomic numbers of the atoms at the ends of the vector, vectors involving heavy atoms stand out among light atom–light atom vectors. In 1936, Harker showed that symmetry properties of the crystal caused vector density to accumulate on certain planes and lines (later known as Harker sections) in the Patterson function. These two papers were the foundation of the heavy-atom method for crystal structure solution. The method assumes that phases calculated from the heavy-atom positions will be sufficiently accurate that a Fourier synthesis (using $|F|$ as opposed to $|F|^2$) will reveal the positions of more atoms, thus allowing solution of the structure. Phase information from the new atoms could then be added to the Fourier synthesis to locate more new atoms and so on, until the full structure was revealed.
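The Patterson idea can be illustrated with a minimal one-dimensional sketch. Everything in it is an assumption made for illustration: the toy cell, the point atoms placed on a 100-point grid, and the helper name `patterson_1d`. NumPy's FFT uses the opposite sign convention in the exponent from the structure factor definition used in this article, which does not matter here because only $|F|^2$ enters the synthesis.

```python
import numpy as np

def patterson_1d(positions, atomic_numbers, n_grid=100):
    """Return (rho, patterson) sampled on n_grid points of a 1D unit cell."""
    rho = np.zeros(n_grid)
    for x, z in zip(positions, atomic_numbers):
        rho[int(round(x * n_grid)) % n_grid] += z   # point atom of weight Z
    F = np.fft.fft(rho)                              # complex structure factors
    patterson = np.fft.ifft(np.abs(F) ** 2).real     # |F|^2 synthesis, phases unused
    return rho, patterson

# Hypothetical toy cell: one heavy atom (Z = 30) and two light atoms (Z = 6, 8).
positions = [0.0, 0.20, 0.55]
atomic_numbers = [30, 6, 8]
rho, patterson = patterson_1d(positions, atomic_numbers)

# Grid index of each Patterson peak mapped to its weight.
peaks = {i: int(round(float(p))) for i, p in enumerate(patterson) if p > 0.5}
for index, weight in sorted(peaks.items()):
    print(f"u = {index / len(patterson):.2f}  weight = {weight}")
```

In this sketch the peaks at u = ±0.20 and ±0.55 (heavy atom to light atom, weights 30·6 and 30·8) are several times larger than the light atom–light atom peak at u = ±0.35 (weight 6·8), which is the contrast the heavy-atom method relies on.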

The method of isomorphous replacement was used first to solve the structure of the alums by Lipson and Beevers in 1935. If a centrosymmetric light-atom structure and its heavy-atom derivative differ only in the presence of the heavy atom, then the differences in the intensity of equivalent reflections can be used to determine the signs of the structure factors. Robertson applied this method to the phthalocyanines in 1936. These molecules form an isomorphous series and crystallize in P2₁/a. By comparing intensities of nickel phthalocyanine and the unsubstituted molecule, Robertson was able to assign phases to all but a few of the measured reflections from the h0l projection. Harker generalized the method to the noncentrosymmetric case in 1956. Isomorphous replacement can be combined with anomalous dispersion to obtain phase information for large molecules.
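For the centrosymmetric case the sign determination can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of the historical calculations: the one-dimensional toy structure, the scattering powers, and the function name `assign_sign` are all hypothetical. Because every structure factor is real, $F_{PH} = F_P + F_H$, and the sign of $F_P$ is the one for which the two measured magnitudes are consistent with the calculated heavy-atom contribution.

```python
import numpy as np

def assign_sign(f_p_mag, f_ph_mag, f_h):
    """Sign of F_P for which s_PH*|F_PH| - s_P*|F_P| best matches F_H."""
    candidates = [(abs(s_ph * f_ph_mag - s_p * f_p_mag - f_h), s_p)
                  for s_p in (1, -1) for s_ph in (1, -1)]
    return min(candidates, key=lambda c: c[0])[1]

# Hypothetical 1D centrosymmetric cell: light atoms at +/-x, heavy atom at the origin.
h = np.arange(1, 9)
light_atoms = [(0.13, 6.0), (0.31, 8.0)]           # (position, scattering power)
F_P = sum(2 * f * np.cos(2 * np.pi * h * x) for x, f in light_atoms)
F_H = np.full(h.shape, 28.0)                        # heavy atom at the origin
F_PH = F_P + F_H                                    # isomorphous derivative

# Only magnitudes of F_P and F_PH are "measured"; F_H is calculated from the
# heavy-atom position (found, for example, from a Patterson map).
signs = [assign_sign(abs(p), abs(ph), fh) for p, ph, fh in zip(F_P, F_PH, F_H)]
print(signs)
```

The assignment becomes ambiguous when any of the three quantities is close to zero, which is consistent with a few reflections remaining unphased in practice.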

2. Absolute Configuration

Friedel’s law states that the scattering from the front and back sides of a plane, hkl, are the same. This means that the measured intensities of the 111 and $\bar{1}\bar{1}\bar{1}$ reflections (and all other “Friedel” pairs) should be equivalent. However, in a 1930 study on ZnS by Coster, Knol, and Prins, it was noted that the 111 reflection was not equivalent to its Friedel mate. In the case of the ZnS crystal, the {111} faces are prominent but, even by visual examination, do not appear identical. One face is shiny and the other is dull. The crystal structure can be regarded as alternating layers of Zn and S atoms perpendicular to the [111] direction. Looking at one layer of Zn atoms, we find it lies closer to one of the two adjacent layers of S atoms. If we assume that the short Zn–S spacing is a “bonding” interaction and the long Zn . . . S gap is a van der Waals contact, we do not expect cleavage between bonded layers. This implies that one of the {111} faces corresponds to a layer of S atoms and the other to a layer of Zn atoms.

It was fortuitous that the intensities in the Coster, Knol, and Prins study were measured using AuLα radiation (λ = 1.276 Å). The K-absorption edge of Zn is 1.283 Å. Thus, the measured intensities of Friedel pairs were not the same. This was the first example of anomalous scattering. The difference due to anomalous scattering is greatest when data are collected near an absorption edge of a heavy atom in the structure. It was nearly 20 years after the ZnS experiment that Bijvoet realized this principle could be used to determine the absolute configuration of the sodium rubidium salt of (+)-tartaric acid.
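The breakdown of Friedel’s law can be reproduced with the structure factor formula from the glossary. The sketch below assumes the zincblende arrangement (Zn on the face-centered positions, S displaced by ¼, ¼, ¼). The scattering factor values are illustrative placeholders rather than tabulated data, and the imaginary part added to the Zn factor stands in for the anomalous contribution near the absorption edge.

```python
import numpy as np

# Face-centered translations; Zn on these sites, S shifted by (1/4, 1/4, 1/4).
FCC = np.array([[0.0, 0.0, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])
ZN_SITES = FCC
S_SITES = FCC + 0.25

def structure_factor(hkl, f_zn, f_s):
    """F_hkl = sum_j f_j exp[2 pi i (h x_j + k y_j + l z_j)]."""
    hkl = np.asarray(hkl, dtype=float)
    zn_sum = np.exp(2j * np.pi * (ZN_SITES @ hkl)).sum()
    s_sum = np.exp(2j * np.pi * (S_SITES @ hkl)).sum()
    return f_zn * zn_sum + f_s * s_sum

def intensity(hkl, f_zn, f_s):
    return abs(structure_factor(hkl, f_zn, f_s)) ** 2

# Illustrative (hypothetical) scattering factors, not tabulated values.
F_S = 13.0
F_ZN_NORMAL = 25.0            # real: far from any absorption edge
F_ZN_ANOMALOUS = 25.0 + 3.0j  # complex: imaginary part mimics anomalous scattering

normal_pair = (intensity((1, 1, 1), F_ZN_NORMAL, F_S),
               intensity((-1, -1, -1), F_ZN_NORMAL, F_S))
anomalous_pair = (intensity((1, 1, 1), F_ZN_ANOMALOUS, F_S),
                  intensity((-1, -1, -1), F_ZN_ANOMALOUS, F_S))
print(normal_pair, anomalous_pair)
```

With real scattering factors the two intensities agree; once the Zn factor acquires an imaginary component they differ, and which member of the pair is stronger depends on the polarity of the structure.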

B. Modern Crystallography

The early years can be characterized as a period in which the size of a problem amenable to analysis was computationally and instrumentally limited. Data collection and analysis were manual processes. This began to change in the 1950s.

1. Data Collection

Progress in crystallographic data collection can be charted by examining early issues of Acta Crystallographica, published by the International Union of Crystallography (IUCr). Although a manual “diffractometer” was available in 1913, the primary method for collecting intensity data for crystallographic studies relied on X-ray cameras and film. In 1948, about two-thirds of the crystal structures reported used a camera and film to collect the intensity data, and less than one-quarter used a combination of film and a diffractometer. Despite the existence of automated single-crystal diffractometers, the situation was little changed by 1962.

Use of the Cambridge Structural Database (CSD) allows a more systematic study of the evolution of data collection methods (Fig. 1) with the caveat that this database references only organic and organometallic structures. Although automated diffractometers were first available in about 1955, it was not until the mid-1960s that this new technology made much of an impact. At that time, only a few hundred structures were being added to the CSD each year. By the mid-1970s, over 1000 structures were added to the CSD each year and data for over 80% were collected on a diffractometer. In 1997, over 15,000 structures were added to the CSD and virtually all were collected on a diffractometer. The explosive growth of this database cannot be attributed solely to improvements in data collection, but certainly the routine use of automated instrumentation had a significant impact.

FIGURE 1 Changes in data-collection methods with time. Using structures entered in the Cambridge Structural Database, changes in data collection strategies were determined by examining the fraction of structures reported in one year for various data collection methods (photograph with visual estimation of intensities, photograph with densitometer, diffractometer, or unknown). The large fraction listed as unknown between 1960 and 1965 is likely due to changes occurring at that time in how data were collected and how data-collection methods were reported.

2. Structure Solution

Data collection was not the only beneficiary of the post-World War II progress in crystallography. No general method existed for solving unknown structures without heavy atoms until the advent of direct methods, a means of determining the values of phases from relationships among the structure factor magnitudes associated with those phases. The earliest structure solved by such methods was decaborane, by use of some inequalities derived by Harker and Kasper. However, this technique was limited to centrosymmetric structures. The major effort in the 1950s concerned the development of the mathematical aspects of crystal structure analysis. The first general procedures for solving both centrosymmetric and noncentrosymmetric structures were developed in the early part of the 1960s. Use of this method grew with the power of computers and computer programs. It is now the most widely used method of solving crystal structures of moderate size. Efforts are currently being made to apply direct methods to very large structures such as proteins. For a more detailed discussion of structure solution methods, see Section IV.H.

3. Charge-Density Distribution

The first papers to explore the difference between the results of X-ray and neutron diffraction experiments appeared in the early 1970s. Systematic differences between the positional and thermal parameters determined by the two techniques were reported. These differences were attributed to the difference in how neutrons and X-rays interact with atoms. Neutrons are diffracted primarily from the nucleus; hence, neutron diffraction produces information about nuclear position. X-rays are diffracted by electrons and therefore yield information about the distribution of electrons in the molecule. It was logical to extend these ideas and attempt to map the redistribution of electron density that occurs on bonding.

A number of different kinds of mapping have been done. Subtracting core-electron densities from experimental electron densities (i.e., $\rho(\mathbf{r})$) should reveal details of the redistribution of valence electrons on bonding. The term “valence density” is used to describe the difference function:

$$\rho_{\text{valence}}(\mathbf{r}) = \rho(\mathbf{r}) - \sum_{\text{atoms}} \rho_{i,\text{core}}(\mathbf{r})$$

To define either of these functions the positions and thermal parameters must be known. One way to approach this is to use neutrons to determine the positional and thermal parameters and to use these parameters in conjunction with X-ray, spherical-atom scattering factors to calculate structure factors for the “promolecule” using:

$$F_{\text{calc},N} = \sum_{\text{atoms}} f_i \exp(2\pi i\,\mathbf{H} \cdot \mathbf{r}_i)\, T_i$$

$F_{\text{calc},N}$ is the X-ray structure factor calculated from neutron positions. The deformation density or X–N map then corresponds to:

$$\rho^{X-N}_{\text{deformation}}(\mathbf{r}) = \frac{1}{V} \sum_{\mathbf{H}} (F_{\text{obs},X} - F_{\text{calc},N}) \exp(-2\pi i\,\mathbf{H} \cdot \mathbf{r})$$

where V is the volume of the unit cell. Since both $F_{\text{obs}}$ and $F_{\text{calc}}$ contain the effects of thermal motion, this deformation density map is thermally smeared. Its resolution is limited by the maximum (sin θ)/λ value of the data.
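A one-dimensional sketch of such a difference synthesis is given below. It is only an illustration under several assumptions: Gaussian “atoms” whose widths stand in for scattering factor and thermal smearing, a small Gaussian of bonding density midway between them, a cell length of 1 in place of the cell volume, and the hypothetical helper names `gaussian_F`, `fourier_synthesis`, and `deformation_map`.

```python
import numpy as np

def gaussian_F(h, weight, x0, sigma):
    """Structure factor of a Gaussian density of given weight and width at x0."""
    return weight * np.exp(-2 * np.pi**2 * sigma**2 * h**2) * np.exp(2j * np.pi * h * x0)

def fourier_synthesis(F, h, x, cell_length=1.0):
    """rho(x) = (1/V) sum_H F_H exp(-2 pi i H x), with V replaced by the cell length."""
    return (F[None, :] * np.exp(-2j * np.pi * np.outer(x, h))).sum(axis=1).real / cell_length

def deformation_map(h_max, n_points=200):
    h = np.arange(-h_max, h_max + 1)
    # "Promolecule": two spherical atoms (hypothetical weights and widths).
    F_calc = gaussian_F(h, 6.0, 0.35, 0.04) + gaussian_F(h, 6.0, 0.65, 0.04)
    # "Observed" data: promolecule plus bonding density at the bond midpoint.
    F_obs = F_calc + gaussian_F(h, 0.2, 0.50, 0.05)
    x = np.arange(n_points) / n_points
    return x, fourier_synthesis(F_obs - F_calc, h, x)

x, full_map = deformation_map(h_max=20)   # data extending to high resolution
_, truncated_map = deformation_map(h_max=5)   # data cut off at low resolution
print(x[np.argmax(full_map)], full_map.max(), truncated_map.max())
```

The difference map peaks at the bond midpoint, and truncating the data lowers and broadens the peak, which is the sense in which the resolution of the map is limited by the maximum (sin θ)/λ of the data.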

Given that there are few neutron-diffraction facilities in the world and that it is difficult to correct adequately for systematic effects in the two experiments (i.e., absorption, extinction, thermal diffuse scattering, multiple reflections), it would be desirable to study bonding effects using exclusively X-ray data. There are several approaches to this problem.

The X–X formalism is similar to the X–N formalism described above, except that the calculated values for positional and thermal parameters are derived from refinement of high-angle X-ray data [(sin θ)/λ > 0.70 Å⁻¹]. For deformation density maps, neutral spherical atoms are subtracted from the observed density; in valence density maps, Hartree–Fock core-electron densities are used to evaluate $F_{\text{calc}}$. Comparison of X–X and X–N maps shows that they do indeed yield the same qualitative information. Bonding density shows in the middle of bonds, and lone-pair density is in the correct location. However, X–X maps systematically underestimate lone-pair peak heights and place hydrogen atoms too close to the atoms to which they are bonded. The experiments must be conducted at low temperature (i.e., −75 °C or preferably less) for this method to succeed.

Other approaches include refinement of separate parameters for core and valence electrons or the direct refinement of a deformation model. The major advantage of the direct refinement methods is that they make no assumptions about the (sin θ)/λ dependence of bonding features.

Errors in the deformation density maps arise from errors in both the model and the data. Demands on diffraction methodology and interpretation are many times more severe than those relevant to an average stereochemical investigation. The following considerations are extremely important:

1. The X-ray data set must be complete (all symmetry-related reflections measured) to a (sin θ)/λ limit of about 1.3 Å⁻¹. This implies the use of short-wavelength radiation such as MoKα (λ = 0.7107 Å) or AgKα (λ = 0.5612 Å). The maximum value of (sin θ)/λ is 0.65 Å⁻¹ for CuKα (λ = 1.5418 Å).

2. The data must be corrected for absorption, extinction, and thermal diffuse scattering. Multiple reflections must be either avoided or eliminated. If these conditions are met, internal agreements should be of the order of 2%; i.e.,

$$\frac{\sum_{\mathbf{H}} \left| F^2(\mathbf{H}) - \langle F^2(\mathbf{H}) \rangle \right|}{\sum_{\mathbf{H}} F^2(\mathbf{H})} = 0.02$$
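The wavelength requirement in the first consideration follows from sin θ ≤ 1, so that (sin θ)/λ can never exceed 1/λ. The short sketch below works this out for the three wavelengths quoted above; the function names are illustrative only.

```python
import math

# Wavelengths in angstroms, as quoted in the text.
WAVELENGTHS = {"CuKa": 1.5418, "MoKa": 0.7107, "AgKa": 0.5612}

def max_sin_theta_over_lambda(wavelength):
    """Upper bound on (sin theta)/lambda, reached at theta = 90 degrees."""
    return 1.0 / wavelength

def bragg_angle_for_limit(wavelength, limit):
    """Bragg angle theta (degrees) needed to reach a given (sin theta)/lambda,
    or None if the limit is out of reach at this wavelength."""
    sin_theta = limit * wavelength
    if sin_theta > 1.0:
        return None
    return math.degrees(math.asin(sin_theta))

for name, wavelength in WAVELENGTHS.items():
    bound = max_sin_theta_over_lambda(wavelength)
    theta = bragg_angle_for_limit(wavelength, 1.3)
    print(name, round(bound, 2), theta)
```

For CuKα the bound is about 0.65 Å⁻¹, so a limit of 1.3 Å⁻¹ is unreachable, whereas MoKα reaches it at a Bragg angle near 67.5° and AgKα at a somewhat lower angle.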

4. Rietveld Analysis

There has been a renaissance in powder diffraction in recent years because Rietveld refinement allows determinations of positional and thermal parameters from powder data, even when the diffraction peaks are not well separated in the recorded pattern. Rietveld analysis is not a method of structure solution and can only be applied when cell dimensions and space group are known and when a reasonable model exists for the structure.

In a polycrystalline sample, information may be lost as a result of the random orientation of the crystallites. A more serious loss of information can result from the overlap of independent diffraction peaks in the powder pattern. Using the total integrated intensities of the separate groups of overlapping peaks in the least-squares refinement of a structure leads to the loss of all the information contained in the often-detailed profile of these composite peaks. Rietveld developed a refinement method that uses the profile intensities of the composite peaks instead of the integrated quantities. This is a pattern-fitting method of structure refinement and allows extraction of the maximum amount of information contained in the powder pattern.

A powder pattern is recorded in a step-scan mode with a step width of 0.02 to 0.03° 2θ. No attempt is made to allocate observed intensity to individual reflections or to resolve overlapping reflections. Instead, the intensity of the powder diffraction pattern is calculated as a stepwise function of the angle, 2θ. Refinement allows calculation of the shifts in the parameters that will improve the fit of the calculated powder pattern to the observed one. The quantity minimized is

$$\sum_i w_i [y_i(\text{obs}) - y_i(\text{calc})]^2$$

where $y_i(\text{obs})$ is the observed intensity of the ith step of the profile, and $y_i(\text{calc})$ includes the usual structural parameters (i.e., positional parameters $x_i$, $y_i$, $z_i$; thermal parameter $B_{ij}$; and site-occupancy parameters $p_j$). However, the model must also include instrumental and sample parameters: the zero point $2\theta_0$, overall scale, overall temperature factor, profile breadth ($H^2 = U \tan^2\theta + V \tan\theta + W$), profile asymmetry, background, preferred orientation, lattice parameters, and wavelength. The agreement factors most often quoted are

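$$R_{\text{weighted pattern}} = \left[ \frac{\sum w_i [y_i(\text{obs}) - (1/c)\,y_i(\text{calc})]^2}{\sum w_i [y_i(\text{obs})]^2} \right]^{1/2}$$

and

$$R_{\text{Bragg}} = \frac{\sum |I_k(\text{“obs”}) - I_k(\text{calc})|}{\sum I_k(\text{“obs”})}$$

In $R_{\text{Bragg}}$, “obs” has quotation marks because $I_k(\text{“obs”})$ is calculated by partitioning the intensity.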
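The minimized quantity and the two agreement factors translate directly into a few lines of array arithmetic. The sketch below is an illustration only: the numbers are made up, the weights $w_i = 1/y_i(\text{obs})$ are a common counting-statistics choice assumed here rather than taken from the text, and the function names are hypothetical.

```python
import numpy as np

def weighted_residual(y_obs, y_calc, w):
    """Quantity minimized in the refinement: sum_i w_i [y_i(obs) - y_i(calc)]^2."""
    return float(np.sum(w * (y_obs - y_calc) ** 2))

def r_weighted_pattern(y_obs, y_calc, w, c=1.0):
    """Weighted-pattern R factor; c is the scale factor applied to y_calc."""
    numerator = np.sum(w * (y_obs - y_calc / c) ** 2)
    denominator = np.sum(w * y_obs ** 2)
    return float(np.sqrt(numerator / denominator))

def r_bragg(i_obs, i_calc):
    """Bragg R factor on (partitioned) integrated intensities."""
    return float(np.sum(np.abs(i_obs - i_calc)) / np.sum(i_obs))

# Made-up step intensities and integrated intensities, for illustration only.
y_obs = np.array([100.0, 400.0])
y_calc = np.array([110.0, 380.0])
w = 1.0 / y_obs                      # assumed weighting scheme
i_obs = np.array([50.0, 150.0])
i_calc = np.array([45.0, 160.0])

print(weighted_residual(y_obs, y_calc, w))
print(r_weighted_pattern(y_obs, y_calc, w))
print(r_bragg(i_obs, i_calc))
```

A real refinement would evaluate these sums over thousands of profile steps and update the model parameters by least squares between evaluations; only the bookkeeping of the agreement factors is shown here.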

To date, most of the papers published using this method have been neutron-diffraction studies from reactor sources. Advantages of neutron data include minimal preferred orientation, no polarization, neutron absorption cross-sections smaller than X-ray values by a factor of 10⁴, scattering independent of θ, and, for fixed-wavelength experiments, a simple peak shape.

In X-ray experiments using radiation from conventional sources, peak shape is complicated by both α1–α2 splitting (at high angles) and by the fact that the peak shape is neither Lorentzian nor Gaussian, but is better described by a convolution of these two functions called the Voigt function. The pseudo-Voigt function used in many programs is an approximation of the Voigt function that can be evaluated much more quickly.
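A common form of the pseudo-Voigt function is a weighted sum of a Lorentzian and a Gaussian that share the same full width at half maximum H, which avoids evaluating the convolution. The sketch below uses that form with a mixing parameter `eta`, and pairs it with the profile-breadth expression $H^2 = U \tan^2\theta + V \tan\theta + W$. The parameter values are hypothetical, and individual refinement programs may parameterize the mixing and the width differently.

```python
import numpy as np

def caglioti_fwhm(two_theta_deg, U, V, W):
    """Profile breadth H from H^2 = U tan^2(theta) + V tan(theta) + W."""
    t = np.tan(np.radians(two_theta_deg) / 2.0)
    return np.sqrt(U * t**2 + V * t + W)

def pseudo_voigt(x, fwhm, eta):
    """Unit-area pseudo-Voigt: eta * Lorentzian + (1 - eta) * Gaussian."""
    gauss = (2.0 / fwhm) * np.sqrt(np.log(2.0) / np.pi) \
        * np.exp(-4.0 * np.log(2.0) * (x / fwhm) ** 2)
    lorentz = (2.0 / (np.pi * fwhm)) / (1.0 + 4.0 * (x / fwhm) ** 2)
    return eta * lorentz + (1.0 - eta) * gauss

# Hypothetical width parameters and mixing fraction.
H = caglioti_fwhm(90.0, U=0.02, V=-0.01, W=0.01)
eta = 0.5

# Numerical checks: area close to 1 and half maximum at x = H/2.
dx = H / 100.0
x = np.arange(-500.0 * H, 500.0 * H + dx, dx)
area = float(np.sum(pseudo_voigt(x, H, eta)) * dx)
half_ratio = float(pseudo_voigt(H / 2.0, H, eta) / pseudo_voigt(0.0, H, eta))
print(H, area, half_ratio)
```

Because both components have the same width, the sum has exactly that width for any `eta`, so the mixing parameter changes only the weight in the tails.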

Spallation neutron sources, time-of-flight experiments with reactor sources, and synchrotron sources all have special problems in defining peak shapes, and this limits the precision of the resulting parameters. However, precision comparable to single-crystal X-ray diffraction can now be obtained from neutron diffraction at a fixed wavelength with Rietveld refinement. For X-rays from conventional sources, the precision of positional parameters is comparable to the single-crystal case, but thermal parameters are less reliable by a factor of two or three. Considerable effort is being expended to improve profile functions for the various X-ray and neutron sources.

5. Small-Angle Scattering

Scattering at small angles is derived from large structural units, that is, units whose dimension D is much larger than the wavelength of the radiation used in the experiment. The acronyms SAXS and SANS refer to small-angle X-ray scattering and small-angle neutron scattering, respectively.

Different kinds of material call for different sorts of small-angle experiment, and each yields a characteristic type of pattern. Low-angle data from ordered or semi-ordered systems give Bragg peaks at specific values of the scattering vector. Examples include aligned structures with long-range periodicity, such as two-dimensional biological structures, or samples such as opal that present long-range order.

Scattering from polymers in dilute solution or from biological materials yields patterns that look rather like the Debye–Scherrer rings observed in wide-angle data from powders. When scattering arises from the spherical particles in a monodisperse system, the pattern is a Bessel function consisting of a succession of peaks of diminishing magnitude that are broad relative to Bragg peaks. Analysis of the pattern yields the radius of the particles.
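The way a particle radius follows from the pattern can be sketched with the standard form factor of a homogeneous sphere, which the text does not write out and which is assumed here: the normalized intensity is $[3(\sin x - x\cos x)/x^3]^2$ with $x = qR$, where $q$ is the magnitude of the scattering vector. Its first minimum lies at $x \approx 4.4934$, the first positive root of $\tan x = x$. Function names and the example radius are hypothetical.

```python
import math

FIRST_ZERO = 4.49340946  # first positive root of tan(x) = x

def sphere_intensity(q, radius):
    """Normalized scattering intensity of a homogeneous sphere, with I(0) = 1."""
    x = q * radius
    if abs(x) < 1e-4:
        amplitude = 1.0 - x * x / 10.0  # series expansion avoids cancellation
    else:
        amplitude = 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3
    return amplitude * amplitude

def radius_from_first_minimum(q_min):
    """Sphere radius implied by the position q_min of the first intensity minimum."""
    return FIRST_ZERO / q_min

# Successive maxima of diminishing magnitude for a 50-unit radius (units of 1/q).
for q in (0.0, 0.05, 0.09, 0.115, 0.155, 0.18):
    print(q, sphere_intensity(q, 50.0))
```

In practice polydispersity and instrumental smearing fill in the minima, so this relation gives only a first estimate.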

In some circumstances, multiple Bragg reflections give rise to scattering in the small-angle region. One of the advantages of the “tunable” sources of X-rays and neutrons is that multiple Bragg scattering can be avoided by choosing wavelengths long enough (greater than twice the largest lattice spacing) that Bragg’s law cannot be satisfied.

Small-angle scattering can be applied to a wide variety of materials. In polymer science, it has been used to investigate chain conformation in amorphous polymers, the state of mixing in polymer blends, the compatibility ranges of polymer blends, and the measurement of domain structure and molecular conformation within those domains. In biological materials, examples include measurement of the radius of gyration of proteins in solution; aggregation of chlorophyll into micelles; diffraction patterns of semi-ordered materials such as muscle, collagen, etc.; and studies of the shapes and constitution of viruses. Separation processes, such as those used in refining of metals or extraction of tar sand, frequently involve micelle formation. Small-angle scattering can give information regarding the shape, size, and degree of polymerization of the aggregates. Studies of materials such as cements, zeolites, and catalysts involve the measurement of size and distribution of pores and measurements of specific surface.

6. Extended X-Ray Absorption Fine Structure

Extended X-ray absorption fine structure (EXAFS) is a technique for studying the local environment of a specific atomic species in a complex matrix. Because the interaction of X-rays with the material under study is absorption rather than diffraction, the technique can be applied to gases, liquids, and amorphous solids as well as to crystals.

In the experiment, the X-ray absorption coefficient is measured from slightly below to about 1000 eV above the absorption edge for the atomic species whose environment is to be studied. Through analysis of the fine structure above the edge it is possible to determine the coordination of the atom. The actual position of the edge gives information about the oxidation state of the absorbing atom, while structure at energies at or just below the edge provides information about bound states associated with the absorbing atom and hence about the symmetry of the environment. The technique can be applied to any atom with Z > 15 to yield a determination of radial distance to a precision of ±0.01 to ±0.02 Å. For elements with high values of Z, it may be preferable to use the L-absorption edge rather than the K edge because higher fluxes are available; however, theory for the L edge is not yet well developed.

Extended X-ray absorption fine structure has applications in many branches of science. In molecular biology, it has been used to study Ca$^{2+}$ transport in membranes, binding of oxygen in hemoglobin, and other coordination problems. It is an invaluable tool for the study of amorphous substances such as glass, since the manufacture of glasses with particular mechanical and thermal properties depends on structure. Catalysis has become extremely important in energy development, resource utilization, pollution abatement, refining of metals, etc. The chemical state and atomic environment of an atomic species in a catalyst in situ while reduced with hydrogen, chemisorbed with oxygen, heated, quenched, etc., can be determined with EXAFS. This allows the design of heterogeneous catalysts that are tailored from precise knowledge of electronic and structural parameters.

Synchrotron sources have revolutionized EXAFS studies. The intensity of the source and its high collimation make it possible to collect the relevant data in about 20 minutes for a sample such as Cu metal with a resolution of 1 eV at 8.8 keV. To collect similar data with a rotating anode source would take about 2 weeks, and the precision would be reduced by a factor of about 100.

7. Implications of New X-Ray and Neutron Sources

Synchrotron sources present a unique combination of properties that are very attractive for X-ray scattering, absorption, and diffraction experiments. The radiation produced has extreme brightness over a broad spectral range. Synchrotron sources are five to six orders of magnitude brighter than the bremsstrahlung (that part of the X-ray spectrum caused by the slowing of electrons on impact with the target, also referred to as white radiation) available from a conventional rotating anode source. Monochromators currently available produce resolution of 0.1 eV at 8 keV. The radiation is naturally collimated with a divergence of the order of $2 \times 10^{-4}$ radians, is plane polarized, and has a precise time structure (subnanosecond pulses repeated every 0.5 to 1 µsec). These properties allow experiments that simply cannot be done with conventional sources: EXAFS on dilute samples (parts per million range), measurement of the magnitude and angular dependence of the real and imaginary components of anomalous dispersion, and determination of the structure of a protein using only one derivative and three wavelengths. (In this context, the use of anomalous dispersion is formally equivalent to multiple isomorphous replacement with the added feature that the isomorphism is exact.) Determinations of cation-site distribution in minerals and diffraction from monolayers on surfaces have many applications in such areas as catalysis or materials science. Perhaps the most exciting application is the ability to do time-resolved studies of physical, chemical, and biological processes using small-angle scattering, powder diffraction, and other scattering techniques.

One early example was the study by Larson of temperature and temperature gradients in silicon during pulsed-laser annealing. In this example, the duration of the laser pulse was 15 nsec and that of the synchrotron X-ray pulse was 0.15 nsec. The laser bursts were synchronized so that the probing X-rays arrived at 20, 55, and 155 nsec after the laser pulse. The experiments showed that the lattice temperature of silicon reaches the melting point during the 15-nsec pulse and remains at the melting point during the high reflectivity phase, after which time the temperature rapidly subsides. Temperature gradients at the liquid–solid interface were measured for the first time and were found to be in the range of $10^{7}$ K/cm. Larson received the Warren Award for Diffraction Physics for this pioneering work in nanosecond, time-resolved X-ray diffraction.

The new pulsed spallation sources (such as the European Spallation Neutron Source at Forschungszentrum Jülich GmbH, or the Spallation Neutron Source at Oak Ridge National Laboratory) provide spectra substantially richer in short-wavelength neutrons than those available from the reactor sources. Pulse duration and repetition are source parameters, but the time structure can be exploited by a variety of techniques. Essentially the same types of experiments are done at both the reactor and spallation sources. The higher neutron fluxes available with the new sources allow experiments to be done on smaller samples and/or in shorter times than was previously possible.

8. Contribution of Diffraction to Molecular Biology

Molecular biologists seek to unravel the mysteries of the cell by mapping gene location, function, and control. By understanding all of these we gain insight into how and when genes are turned on and how these might be used to perform useful tasks. One goal of such studies is to alter (i.e., reengineer) a natural biologic process to perform some other function. Examples of this include attempts to modify metal-binding proteins such that a protein that was originally selective for calcium is selective for zinc or copper. This reengineered metal-binding protein could then be used to construct a sensor for zinc or copper. Other examples include modifying enzymes for use in industrial processes rather than biologic processes. In order to effect these changes, the relationship between structure and function must be understood.

The structure of a macromolecule can be “known” at a number of levels. Primary structure is the linear sequence of building blocks (i.e., amino acids) from which the protein is built. For example, the β chain of human hemoglobin contains 146 amino acids, of which the first five have the sequence valine, histidine, leucine, threonine, and proline. The term “secondary structure” refers to local interactions that determine the conformation of the polypeptide chain and the interchain hydrogen bonding scheme. “Extended chain,” “α-helix,” and “β-sheet” are terms used to describe secondary structure. Tertiary structure is the three-dimensional arrangement of atoms within the macromolecule, while quaternary structure describes the arrangement and interaction of aggregates of the macromolecules themselves. Diffraction techniques are widely used to study the secondary, tertiary, and sometimes quaternary structures of macromolecules.

Single crystals of macromolecules may be studied by X-ray diffraction. A wide variety of techniques are now available for macromolecular structure solution. In the past, heavy-atom multiple isomorphous replacement (MIR) was the most common method of structure solution; however, anomalous dispersion is becoming more common. The relatively small protein crambin was an early example of using anomalous dispersion to effect structure solution of a macromolecule. In a few cases, neutron-diffraction studies have been carried out on single crystals of proteins. In myoglobin, for example, at 1.8 Å resolution, the negative density of the protons (which form about half of the scattering material in the cell) makes the polypeptide chain stand out clearly in the Fourier maps. Soaking the same crystal in D2O has allowed identification of exchangeable protons.

Neutron- and X-ray-scattering experiments using non-Bragg scattering can be used to study the size, shape, and aggregation of micelles. Contrast variation can be used to study the internal structure of viruses.

The average scattering densities from the protein coat and the RNA interior of viruses are different. Each can be “matched” by a different H2O/D2O ratio. Thus, for virus particles in solution, the matched phase becomes “invisible” to the neutron beam and allows the radial distribution of scattering of the other component to be recorded. For spherical viruses, this allows measurement of the thickness of the protein coat and the degree of interpenetration of the two phases.

Enzymes and other proteins bind some substrates so well that detailed, atomic-level analyses of the structure of the native protein can be compared with those of protein–substrate complexes, protein–inhibitor complexes, and proteins with catalytic groups bound to allosteric sites. Detailed comparisons between such structures have greatly enhanced our understanding of the mechanisms of biological catalysis.
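The contrast-matching step described above for viruses reduces to a linear interpolation if the scattering density of an H2O/D2O mixture is assumed to vary linearly with the D2O volume fraction. The sketch below makes that assumption and ignores H/D exchange within the macromolecule; the default solvent values are approximate, and the two component densities in the example are illustrative placeholders, not measured data.

```python
def match_fraction_d2o(sld_target, sld_h2o=-0.56, sld_d2o=6.40):
    """D2O volume fraction at which the solvent matches sld_target.

    Scattering length densities are in units of 1e-6 per square angstrom.
    The default solvent values are approximate (assumption).
    """
    return (sld_target - sld_h2o) / (sld_d2o - sld_h2o)

# Hypothetical component densities, chosen only to show two distinct match points.
components = {"protein-like coat": 2.4, "RNA-like core": 4.0}
for name, sld in components.items():
    print(name, "matched near", round(100 * match_fraction_d2o(sld)), "% D2O")
```

Measuring at each match point in turn isolates the scattering of the other component.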

Molecular graphics programs are now available that can display the full three-dimensional structure of a macromolecule and zoom in on any portion of it. It is possible to examine the active site and to attempt to fit known substrates and/or inhibitors into that site. The availability of coordinates for macromolecules in the Protein Data Bank (see Section VI.A.8) has allowed many fruitful applications of this type.

III. STRUCTURE OF A CRYSTAL

A. Choice of Unit Cell

A crystal is a multifaceted solid, similar in appearance to an unpolished gemstone. Internally it consists of a basic pattern of molecules, known as the repeating unit, that repeats itself by translation, in three dimensions, to the edges of the crystal. By choosing one corner of the repeating unit to be the origin, one can use three translational vectors, having both length and direction, to construct a parallelepiped that contains the entire basic pattern. This parallelepiped is defined as the unit cell, and one need only determine the contents of the unit cell, the basic pattern, to know the structure of the entire crystal. The general symbols for the unit-cell vectors are $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ and for their magnitudes $a$, $b$, $c$. The coordinate axes, or directions of the sides of the unit cell, are referred to in general as the x, y, and z axes. The interaxial angles are denoted by α, β, and γ. The unit cell can be defined in a variety of ways, but for a given pattern all such cells have the same volume (Fig. 2).

FIGURE 2 Choice of unit cell. In the absence of symmetry, the unit cell may be chosen in a variety of ways. Each cell contains one unit of pattern. All such cells have the same volume.
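The statement that alternative cells enclose the same volume can be checked numerically: the volume of the parallelepiped is the magnitude of the scalar triple product of its three edge vectors, and replacing one edge by its sum with another leaves that product unchanged. A minimal sketch follows, with hypothetical Cartesian edge vectors chosen for illustration.

```python
def cross(u, v):
    """Vector (cross) product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar (dot) product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def cell_volume(a, b, c):
    """Volume of the parallelepiped spanned by a, b, c: |a . (b x c)|."""
    return abs(dot(a, cross(b, c)))

# One choice of cell (hypothetical edge vectors, arbitrary length units).
a = (4.0, 0.0, 0.0)
b = (1.0, 5.0, 0.0)
c = (0.5, 1.0, 6.0)

# A second choice for the same lattice: replace a by a + b.
a_alt = tuple(ai + bi for ai, bi in zip(a, b))

print(cell_volume(a, b, c), cell_volume(a_alt, b, c))
```

Any other cell built from integer combinations of the edges with determinant ±1 gives the same result.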

If the repeating unit itself has no internal symmetry, then the choice of unit cells is infinite. However, if the basic repeating unit contains additional symmetry, this influences the choice of the unit cell. If there are planes or axes of symmetry, the cell edges are generally chosen to be parallel or perpendicular to these directions. This places restrictions on how the unit cell may be defined and gives rise to seven crystal systems: triclinic, monoclinic, orthorhombic, tetragonal, hexagonal, trigonal, and cubic (see Table I).

B. Diffraction Pattern

In 1912, Friedrich, Knipping, and Laue first demonstrated that crystals can act as three-dimensional diffraction gratings for X-rays. This work not only established the wave nature of X-rays but also established the relationship between a crystal and its diffraction pattern. The size and shape of the repeating unit in the crystal determine the positions of the diffraction spots they recorded on film. It was some time later before it was realized that the intensity of the spots is related to the distribution of atoms within the unit cell.

TABLE I Seven Crystal Systems and Their Unit Cell Constraints

Crystal system    Conditions imposed on cell geometry
Triclinic         a ≠ b ≠ c; α ≠ β ≠ γ
Monoclinic        a ≠ b ≠ c; α = γ = 90°
Orthorhombic      a ≠ b ≠ c; α = β = γ = 90°
Tetragonal        a = b; α = β = γ = 90°
Trigonal          a = b; α = β = 90°, γ = 120° (hexagonal axes)
                  a = b = c; α = β = γ (trigonal axes)
Hexagonal         a = b; α = β = 90°, γ = 120°
Cubic             a = b = c; α = β = γ = 90°

Bragg’s contribution was to recognize the similarity between diffraction in a crystal and reflection in a mirror plane (Fig. 3). Consider a set of parallel planes with spacing d and an incoming beam of monochromatic X-rays at a glancing angle, θ. The condition for constructive interference is that the path difference between waves “reflected” from successive planes must be an integral number of wavelengths (i.e., AB + BC = nλ). However, AB = BC = d sin θ; thus, nλ = 2d sin θ, which is Bragg’s law. Note that the angle between the incident beam direction and the reflected beam is 2θ.
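Bragg’s law can be turned around to give the glancing angle for a known spacing and wavelength. The sketch below is a direct transcription of nλ = 2d sin θ; the function name and the example spacings are illustrative, and the wavelength used is an approximate value for Cu Kα radiation.

```python
import math

def bragg_theta_deg(d, wavelength, order=1):
    """Glancing angle theta in degrees satisfying n*lambda = 2*d*sin(theta).

    d and wavelength must be in the same length unit.
    """
    sine = order * wavelength / (2.0 * d)
    if sine > 1.0:
        raise ValueError("no reflection: n*lambda exceeds 2d")
    return math.degrees(math.asin(sine))

wavelength = 1.5418  # angstroms, approximate Cu K-alpha value (assumption)
for d in (4.0, 2.0, 1.0):  # hypothetical interplanar spacings in angstroms
    theta = bragg_theta_deg(d, wavelength)
    print("d =", d, " theta =", theta, " 2theta =", 2.0 * theta)
```

The loop shows the diffraction angle increasing as the spacing decreases, and the `ValueError` branch marks spacings too small to diffract at the chosen wavelength.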

The smaller the interplanar spacing d, the higher the angle at which the diffraction maximum or “reflection” is observed. This implies that large unit cells will give diffraction patterns with small spacing between the spots, while small unit cells will give patterns with wide spacing. Thus, lattices can describe both the crystal and its diffraction pattern. Since the dimensions of the lattice of the diffraction pattern are inversely proportional to those of the crystal lattice, it is defined as the reciprocal lattice.

In crystal space we can define a set of parallel planes of spacing d, and note that the first of these planes (for which the distance from the origin is d) has intercepts with the edges of the unit cell of a/h, b/k, c/l. The Miller’s index of that set of planes is then hkl, where h, k, and l are small integers with no common factor. In Fig. 4 we have a plane with intercepts a/3, b/4, c/2. The Miller’s index is 342. The set of parallel planes of index 342 will give rise to a spot in the diffraction pattern with index 342. The reflection 684 can be regarded either as the second-order reflection from the planes 342 or as the first-order reflection from a set of parallel planes with spacing d/2. The number of diffraction planes possible for a given structure is directly related to the lengths of a, b, and c.
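The relation between the reflections 342 and 684 can be verified for the simplest case, a cell with all angles equal to 90°, where the spacing follows from $1/d^2 = (h/a)^2 + (k/b)^2 + (l/c)^2$. That formula holds only for orthogonal axes, and the cell edges below are hypothetical values chosen for illustration.

```python
import math

def d_spacing_orthogonal(h, k, l, a, b, c):
    """Interplanar spacing for indices hkl in a cell with 90-degree angles."""
    return 1.0 / math.sqrt((h / a) ** 2 + (k / b) ** 2 + (l / c) ** 2)

a, b, c = 9.0, 12.0, 6.0  # hypothetical orthorhombic cell edges, angstroms

d_342 = d_spacing_orthogonal(3, 4, 2, a, b, c)
d_684 = d_spacing_orthogonal(6, 8, 4, a, b, c)
print("d(342) =", d_342)
print("d(684) =", d_684, " ratio =", d_342 / d_684)
```

Doubling all three indices halves the spacing, which is why 684 can be read as the second order of 342.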

C. Basic Formulas of Crystallography

The crystal structure is a pattern that repeats in three dimensions, and Fourier series can represent repetitive patterns.

Let the position of the $j$th atom in the unit cell have fractional coordinates $x_j$, $y_j$, $z_j$ (in this notation, $x_j$ means $x/a$). Then, the vector from the origin to the $j$th atom would be $\mathbf{r}_j = \mathbf{a}x_j + \mathbf{b}y_j + \mathbf{c}z_j$. The vector representing the diffracted beam direction is $\mathbf{H} = h\mathbf{a}^* + k\mathbf{b}^* + l\mathbf{c}^*$, where hkl are indices of the reflecting plane and $\mathbf{a}^*$, $\mathbf{b}^*$, $\mathbf{c}^*$ are the base vectors of the reciprocal lattice. The direction of the diffracted beam is given in terms of the indices hkl. The set of planes hkl cuts $\mathbf{a}$ into h divisions, $\mathbf{b}$ into k divisions, and $\mathbf{c}$ into l divisions. The phase difference for unit translation along $\mathbf{a}$ is $2\pi h$. Thus, if h = 3, a ray scattered by an electron at x = 1 would be $2\pi \times 3$, or three wavelengths, out of phase with one scattered by an electron at the origin.


FIGURE 3 Bragg’s law. The distance AB = BC = d sin θ. Constructive interference occurs if the path difference is a whole number of wavelengths. Thus, nλ = 2d sin θ.

The amplitude of the wave scattered by the plane hkl is

$$F_{hkl} = V \int_0^1 \int_0^1 \int_0^1 \rho(xyz)\, \exp[2\pi i(hx + ky + lz)]\, dx\, dy\, dz$$

where $\rho(xyz)$ is the electron density at the point x, y, z in the unit cell. The quantity $(hx + ky + lz)$ is the scalar product $(\mathbf{H} \cdot \mathbf{r})$.

By the properties of Fourier series, $F_{hkl}$ is a Fourier coefficient of $\rho(xyz)$ so that:

$$\rho(xyz) = \frac{1}{V} \sum_h \sum_k \sum_l F_{hkl}\, \exp[-2\pi i(\mathbf{H} \cdot \mathbf{r})]$$

where the summations in h, k, and l each run from −∞ to +∞. Note the change in sign of the exponent between the two expressions. $F_{hkl}$ is the Fourier transform of the electron density in the cell. The electron density is the inverse Fourier transform of the structure factors.
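The transform pair can be illustrated in one dimension, where the algebra is the same but the sums are short. The sketch below assumes two point-like “atoms” of constant scattering power on a cell of unit length and truncates the series at a finite index, so it is a toy model rather than a real density calculation; all names and values are hypothetical.

```python
import cmath
import math

def structure_factors_1d(atoms, h_max):
    """F_h = sum_j f_j exp(+2*pi*i*h*x_j) for h = -h_max .. +h_max."""
    return {h: sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)
            for h in range(-h_max, h_max + 1)}

def density_1d(x, factors, length=1.0):
    """rho(x) = (1/length) * sum_h F_h exp(-2*pi*i*h*x), truncated series."""
    total = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
    return total.real / length

# Two point scatterers: (scattering power, fractional coordinate).
atoms = [(6.0, 0.2), (8.0, 0.7)]
factors = structure_factors_1d(atoms, h_max=10)

for x in (0.2, 0.45, 0.7):
    print(x, density_1d(x, factors))
```

The synthesis peaks at the two atomic positions, with the heavier scatterer giving the higher peak. The ripples between them, which can dip below zero, are an effect of truncating the series.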

If the electron density is a superposition of N atomic densities, then the structure factor expression can be rewritten as:

$$F_{hkl} = \sum_{j=1}^{N} f_j \exp[2\pi i(hx_j + ky_j + lz_j)]$$

The summation is over all atoms in the cell and the scattering factor of the $j$th atom is

$$f_j = V \int_{-\infty}^{+\infty} \rho_j(uvw)\, \exp[2\pi i(hu + kv + lw)]\, du\, dv\, dw$$

where $\rho_j(uvw)$ is the electron density of the $j$th atom referred to $x_j$, $y_j$, $z_j$ as origin. Thus, if we know the positions of the atoms, then we can calculate the phases. Conversely, if we know the phases, then we can calculate the electron density and hence the positions of the atoms. The central problem in crystallography is that the phases are not observed in the diffraction experiment. Solving a structure consists of finding positions for the atoms or of finding phases for the structure amplitudes. Details of the major methods for structure solution are found in Appendix II.

FIGURE 4 Miller’s index. The plane has intercepts a/3, b/4, c/2. The Miller’s index is 342.
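The atomic form of the structure factor is easy to evaluate once positions are known, and both the amplitude and the phase come out of the same sum. The sketch below treats each scattering factor as a constant (in reality it falls off with scattering angle) and uses a hypothetical cell with identical atoms at the origin and at the body centre.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """F_hkl = sum_j f_j exp[2*pi*i*(h*x_j + k*y_j + l*z_j)].

    atoms is a list of (f_j, (x_j, y_j, z_j)) with fractional coordinates.
    """
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

# Hypothetical body-centred arrangement of two identical atoms, with f = 1.
atoms = [(1.0, (0.0, 0.0, 0.0)), (1.0, (0.5, 0.5, 0.5))]

for hkl in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0)]:
    F = structure_factor(hkl, atoms)
    print(hkl, "amplitude =", round(abs(F), 6), "phase (rad) =", round(cmath.phase(F), 6))
```

For this arrangement the reflections with h + k + l odd vanish, so the positions of the atoms show up directly in the pattern of intensities. The phase printed for a vanishing reflection has no physical meaning.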

IV. STEPS IN CRYSTAL STRUCTURE ANALYSIS

The steps outlined here would be typical of those used for a moderately complex structure. Notes are also included on alternative techniques that may be more applicable to macromolecular crystallography.

A. Growing Crystals

All strategies for the growth of crystals for diffraction experiments are aimed at bringing a concentrated solution of a homogeneous population of molecules very slowly toward a state of minimum solubility. The goal is to achieve a limited degree of supersaturation, from which the system can relax by formation of a crystalline precipitate. Many techniques developed for achieving these ends have been described.

Crystallization techniques used in routine synthetic methods tend not to produce crystals of the quality required for structural work. Suitable crystals must be grown slowly at near-equilibrium conditions. This implies low supersaturation ratios and small gradients. Generally, supersaturation is achieved by changing the composition of a solution containing the sample to be analyzed (and possibly other additives) or by altering the temperature. In either case, the concentration of the sample is driven beyond its saturation limit, whereby the sample is forced out of solution and crystal formation may result.

When crystals are grown from solution, changing the solvent may have a pronounced effect on their habit and size. Properties that may influence crystal growth, such as density, viscosity, dielectric constant, and solubility, may be varied over a wide range by mixing two or more solvents.

A small surface-to-volume ratio is useful for slow evaporation of solvent. For small samples, an NMR (nuclear magnetic resonance) tube may be a suitable crystallization vessel.

Very insoluble compounds may be crystallized using a method known as reactant diffusion. In this method, reactants A and B are allowed to mix by diffusion; the very insoluble product C will crystallize in the zone of mixing. Sparingly soluble compounds can sometimes be crystallized from boiling solvent in a Soxhlet extractor. Crude product is placed in the thimble and the reservoir is seeded.

Sublimation is effective for some classes of compounds. Vacuum sublimation reduces the temperature required and so increases the range of compounds for which it is suitable. If the temperature gradient is too large, only microcrystals will be formed. Large crystals grow at the expense of small ones only when the process is carried out slowly.

Vapor diffusion is a method that works well with milligram quantities. The solute is dissolved in a solvent in which it is relatively soluble. A small container of this solution is placed inside a closed beaker with a second solvent in which the solute is only sparingly soluble. The two solvents must be miscible in one another and the second solvent should be the more volatile. Suitable solvent pairs include ethanol/ether, benzene/ligroin, and water/ethanol. Diffusion of one liquid into another is also effective. The solute is dissolved in the solvent in which it is more soluble. Crystals form at the interface between the two solvents.

A form of vapor diffusion is commonly used for the growth of protein crystals. For proteins (and other macromolecules) the modifications used to achieve supersaturation include increasing the concentration of an additive (e.g., a precipitant), decreasing total solution volume, changing the solution pH, and/or changing the temperature. Although a wide variety of experimental setups have been used in protein crystallization, the most common technique is “hanging-drop microvapor diffusion” (HDMVD). In HDMVD, a droplet of 4 to 20 µl containing protein and precipitant is suspended from a glass coverslip which is sealed above a reservoir of a solution at a higher precipitant concentration. Since the droplet is at a lower precipitant concentration than the reservoir, the net migration of water vapor occurs from the droplet to the reservoir, resulting in a decrease in drop volume. The decrease in drop volume results in increased precipitant and protein concentrations, which should drive the protein out of solution. Crystals form if conditions are favorable. More often, protein precipitates as an amorphous solid along the bottom of the droplet.
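The concentration change in a hanging drop can be estimated with simple bookkeeping. The sketch below rests on idealized assumptions that are not stated in the text: only water is transferred, the reservoir is so much larger than the drop that its concentration stays fixed, and transfer stops when the precipitant concentration in the drop equals that of the reservoir. The function name and the numbers are hypothetical.

```python
def equilibrated_drop(volume_ul, precipitant_drop, precipitant_reservoir, protein_conc):
    """Return (final drop volume, final protein concentration) at equilibrium.

    Precipitant concentrations may be in any common unit; the protein
    concentration is scaled by the same factor because only water leaves.
    """
    factor = precipitant_reservoir / precipitant_drop
    return volume_ul / factor, protein_conc * factor

# A 10-microliter drop mixed 1:1 from protein stock and reservoir solution,
# so its precipitant concentration starts at half the reservoir value.
final_volume, final_protein = equilibrated_drop(
    volume_ul=10.0, precipitant_drop=0.5,
    precipitant_reservoir=1.0, protein_conc=5.0)
print(final_volume, final_protein)
```

Under these assumptions a drop mixed 1:1 with reservoir solution ends at half its volume and twice its starting protein concentration.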

Once initial conditions are found where crystals have formed, additional experiments often must be performed to perfect crystal growth in order to produce X-ray-diffraction-quality crystals. The crystals must be single, with no satellite growths or twinning, and on the order of 0.1 mm on each edge. Fine-tuning conditions, adding detergents or counterions to the precipitant, and seeding droplets with microcrystals are all techniques used to prepare large, single crystals from initial successful experiments.

B. Microscopic Examination

The crystal chosen for analysis must have a uniform internal structure and be of an appropriate size and shape. The first criterion implies that the substance is pure and that the crystal contains no voids or inclusions and is not bent, cracked, distorted, or composed of crystallites. It must be a single crystal, and ideally it should not be twinned (a twinned crystal has two or more different orientations of the lattice growing together).

The size of crystal required is determined by the conditions of the experiment. For X-ray diffraction, approximately 0.1 mm is preferred; for neutrons, an order of magnitude larger is appropriate. A crystal with roughly equal dimensions and well-defined edges is ideal; however, many crystals grow as plates or needles. Although a crystal with a highly asymmetric shape is far from ideal, useful structural information can often be obtained from such a crystal. In some cases, a more uniformly shaped fragment may be cut from a larger crystal with a sharp razor blade.

Examination with a binocular microscope allows a rapid screening of crystals. A few that appear suitable should then be examined more closely through crossed polarizers. As the crystals are rotated, they should either appear uniformly dark in all orientations or they should be bright and extinguish (appear uniformly dark) every 90° of rotation. An unsuitable crystal may show dark and light regions simultaneously, or regions that do not extinguish, or different regions that display different colors.

C. Mounting a Crystal

The usual way of mounting a crystal for X-ray diffraction is to glue it to the end of a glass fiber that is mounted in a brass or aluminum pin. For photographic work, it is necessary to align a real lattice vector (Weissenberg technique) or a reciprocal lattice vector (precession technique) so that it is perpendicular to the X-ray beam. On a diffractometer, it is necessary for the crystal to be well centered in the X-ray beam, but alignment is neither necessary nor advisable since the aligned position is the one that often leads to overlapping reflections.

Materials that are air or moisture sensitive may be stabilized by covering the crystal with a light coating of oil or the mounting glue (provided that the crystal is not soluble in the solvent of the glue). Very sensitive substances may have to be mounted in capillary tubes under an inert atmosphere, with all mounting operations carried out in a dry box.

FIGURE 5 Polaroid rotation photograph.

D. Preliminary Photographs

Most diffractometers are equipped with a Polaroid camera for a quick check of crystal quality. Alternatively, a rotation photograph can be acquired with an area detector. The spots on the photograph should have similar shapes, without tails or streaks (see Fig. 5). A more sensitive check of crystal quality is provided by omega scans of a number of reflections measured with different orientations of the crystal. If the diffraction pattern falls off rapidly with angle, low-temperature data collection is advisable; however, some crystals crack when cooled. If suitable crystals are scarce, it may be advisable to collect a dataset at room temperature and then cool the sample to improve the resolution of the experiment.

E. Establishing the Orientation Matrix

The orientation matrix and the cell dimensions are established by determining the setting angles of about eight to ten reflections. These may be found using information available from the preliminary photograph or by allowing the computer-controlled diffractometer to search for them. The indices of these reflections are then determined (usually with a computer program). The cell dimensions are then refined by least squares. High-angle reflections are most sensitive, but fairly strong reflections are required. For this reason, it is common to use relatively low-angle data to establish a preliminary unit cell and matrix, and to recalculate the matrix when the intensities of the high-angle data have been established.

F. Data Collection

It is always advisable to collect symmetry-equivalent data (reflection intensities). The degree of agreement between equivalent reflections allows assessment of crystal quality, absolute configuration, stability of the counting chain, suitability of absorption correction, and other systematic effects. For charge density studies, a complete dataset should be collected to a resolution of at least sin θ/λ = 1.3 Å$^{-1}$ with an internal agreement of about 2%. For determination of absolute configuration, effects of decomposition, absorption, and extinction errors are minimized if $hkl$ and $\bar{h}\bar{k}\bar{l}$ are measured consecutively at +2θ and −2θ. This may be done for a few dozen of the most sensitive reflections or for the whole dataset. For large datasets, it is common practice to collect data in shells.
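Internal agreement between symmetry-equivalent reflections is usually condensed into a single number. The text does not define the statistic, so the sketch below assumes one widely used form, $R_{\mathrm{int}} = \sum |I - \langle I \rangle| / \sum I$, where the mean is taken over each group of equivalents; the grouping keys and the intensities are invented for illustration.

```python
def internal_agreement(groups):
    """R_int = sum |I - <I>| / sum I over all groups of equivalent reflections.

    groups maps a label for each unique reflection to the list of
    intensities measured for its symmetry equivalents.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator

# Invented measurements: each key is a unique reflection, each list its equivalents.
measurements = {
    (1, 0, 0): [100.0, 102.0, 98.0],
    (1, 1, 0): [250.0, 245.0, 255.0, 250.0],
    (2, 0, 0): [40.0, 41.0],
}
print("R_int =", internal_agreement(measurements))
```

A value near 0.02 corresponds to the 2% agreement quoted above for charge density work.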

G. Data Reduction

The process of deriving structure amplitudes $|F_{hkl}|$ from the observed intensities $I_{hkl}$ is known as data reduction. A number of geometrical factors influence the intensities observed in a diffraction experiment. The most important corrections are those for Lorentz, polarization, absorption, and extinction effects.

The first three corrections are normally applied to the observed intensities in the process of calculating the structure amplitudes, $|F_{hkl}|$. The Lorentz factor corrects for the relative speeds with which different reflections pass through the reflecting position. Since X-rays are polarized on reflection, and the degree of polarization depends on experimental conditions, a correction must be applied to account for how the polarization affects the observed intensities. Frequently, Lorentz and polarization corrections are combined. Absorption depends on the average path-length of both the incident and the reflected beam and hence may be very different for symmetry-related reflections.
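A combined Lorentz and polarization correction can be sketched for one simple, assumed geometry: an unpolarized incident beam with no monochromator, for which the polarization factor is $(1 + \cos^2 2\theta)/2$, and a single-crystal diffractometer measuring in the equatorial plane, for which the Lorentz factor is $1/\sin 2\theta$. Other geometries need different expressions, and absorption is ignored here. The function names are illustrative.

```python
import math

def lorentz_factor(two_theta_deg):
    """Lorentz factor 1/sin(2*theta), equatorial diffractometer geometry (assumed)."""
    return 1.0 / math.sin(math.radians(two_theta_deg))

def polarization_factor(two_theta_deg):
    """Polarization factor (1 + cos^2(2*theta))/2, unpolarized incident beam (assumed)."""
    cosine = math.cos(math.radians(two_theta_deg))
    return (1.0 + cosine * cosine) / 2.0

def structure_amplitude(intensity, two_theta_deg, scale=1.0):
    """Relative |F| obtained from I = scale * L * p * |F|^2."""
    lp = lorentz_factor(two_theta_deg) * polarization_factor(two_theta_deg)
    return math.sqrt(intensity / (scale * lp))

# The same measured intensity corresponds to different |F| at different angles.
for two_theta in (20.0, 45.0, 90.0):
    print(two_theta, structure_amplitude(50.0, two_theta))
```

The amplitudes are on a relative scale until the scale factor is determined during refinement.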

Extinction is an interference process. An extinction correction is generally made during least-squares refinement, making it a correction applied to the model rather than to the observations. A detailed discussion of the absorption, extinction, and other factors affecting diffraction intensity can be found in Appendix I.

H. Solving the Phase Problem

The central problem in crystallography arises because the experimental data yield only the modulus of the structure factor, $|F_{hkl}|$, and not the phase. The phase is required in order to evaluate the electron density in the unit cell, but it cannot be measured directly.

Several methods have been developed to determine the phases of the complex structure factors, $F_{hkl}$, with no prior knowledge of atomic positions. These methods include multiple isomorphous replacement, single isomorphous replacement with anomalous dispersion, multiple-wavelength and single-wavelength anomalous dispersion, heavy-atom, and direct methods. With the exception of direct methods, all of these methods take advantage of the scattering properties of “heavy” atoms (transition metals, actinides, and lanthanides). For very large molecules, such as proteins, neither direct nor heavy-atom methods are generally used. Molecular replacement (MR) also provides a powerful phasing method for structure analysis; it requires a model structure similar to the structure being analyzed.

If the positions of the atoms are known, both the magnitude and phase of the structure factor can be calculated. The heavy-atom method of structure solution depends on this. If at least one atom in the structure is heavy enough to be located in a Patterson function, that position can be used to calculate phase angles. A Fourier summation using observed structure amplitudes and these calculated phases would reveal the heavy atom and some others. The additional atoms are included in the structure factor calculation, providing a better estimate of the phase angles. This process can be facilitated by use of the tangent formula to extend and refine the new phases and is repeated until all of the atoms are located.
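The first step of the heavy-atom method, calculating a phase from a known atomic position and attaching it to an observed amplitude, can be sketched as below. This is a simplified illustration: the scattering factor is treated as a constant (its angle dependence and the temperature factor are ignored), and the atom position, scattering factor, and observed amplitude are hypothetical values.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """F(hkl) = sum_j f_j * exp(2*pi*i*(h*x_j + k*y_j + l*z_j)).

    `atoms` is a list of (f, (x, y, z)) with fractional coordinates.
    """
    h, k, l = hkl
    total = 0j
    for f, (x, y, z) in atoms:
        total += f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
    return total

# Hypothetical heavy atom located from a Patterson function.
heavy_atoms = [(74.0, (0.10, 0.20, 0.30))]

f_calc = structure_factor((1, 0, 0), heavy_atoms)
phase_deg = math.degrees(cmath.phase(f_calc))

# Fourier coefficient for the next map: observed amplitude, calculated phase.
f_obs_amplitude = 90.0  # hypothetical measured |F|
map_coefficient = cmath.rect(f_obs_amplitude, cmath.phase(f_calc))

print(f"|F_calc| = {abs(f_calc):.1f}, phase = {phase_deg:.1f} deg")
```

As further atoms are recognized in the map they would be appended to the atom list, and the phases recalculated, which is the iteration the paragraph describes.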

Multiple isomorphous replacement (MIR) is a powerful method for determination of phases. MIR depends upon the phasing power of heavy-metal atoms bound to a compound in such a fashion that the positions of other atoms in the crystal are minimally perturbed (i.e., isomorphous derivatives). The MIR method requires that the crystallographer prepare more than one derivative of the parent crystal. These often turn out to be unstable, not isomorphous, or to have the metal bound with too low an occupancy to be useful. Even when successful, this method requires the collection and reduction of numerous datasets from multiple crystals. Anomalous dispersion methods can be combined with isomorphous replacement to circumvent some of the disadvantages.

Advances in the phasing of macromolecular data have been made by the use of a phenomenon called anomalous dispersion or anomalous scattering. Single isomorphous replacement with anomalous scattering (SIRAS) takes advantage of both the phasing power and anomalous scattering properties of certain heavy atoms. An advantage of the SIRAS technique is that data, in some cases, can be collected using a conventional Cu-Kα radiation source. The SIRAS approach to phasing obviates the need for crystallization and data collection from multiple samples. The technique does require careful data collection but produces all of the information needed to determine phases from a single dataset collected from only one crystal.

A crystal containing an anomalous scattering atom may be used to collect data at multiple wavelengths and allow the phases to be determined using multiple-wavelength anomalous dispersion (MAD). A single crystal can be analyzed at multiple wavelengths, generating a variation in scattering factors that allows direct determination of crystal structures. With careful measurements, even weak signals from a single crystal can provide the necessary phasing information. Data must be collected using “tunable” X-ray radiation available only at synchrotron facilities.

A paper by J. Karle (see Bibliography) offers the possibility of nearly direct phasing for protein crystals. The potential for analysis of the data using this single-wavelength anomalous dispersion (SAD) technique is yet to be explored. Given the collection of sufficiently accurate anomalous dispersion data, one dataset at one wavelength may provide all the information required to determine the phases.

Direct methods of phase determination do not depend on any a priori structural information. Phases are determined from statistical relationships among the intensities. Powerful computer programs have facilitated the use of these methods for centrosymmetric and noncentrosymmetric structures of moderate size. Considerable effort has been made to extend these methods to protein structures, and structures containing over 1000 non-hydrogen atoms have been solved by direct methods.

In general, the conformationally rigid portions of the molecule are most easily located. Once the major features of the molecule have been recognized, the difference map provides a powerful means both for checking the accuracy of the partial model and for completing it. A Fourier summation using $\Delta F = |F_{\mathrm{obs}}| - |F_{\mathrm{calc}}|$ as coefficients and phases calculated from the known portion of the structure yields a map of the discrepancies between the crystal and the model. An atom present in the crystal but not included in the model will appear as a peak. Difference maps are also useful for locating light atoms in the presence of heavy ones, such as the location of hydrogen atoms in compounds of first-row elements or carbon atoms in a tungsten compound.

I. Refinement

Once approximate locations have been determined for all, or almost all, of the atoms in the structure, it must be refined or made more precise. The standard method for refinement of structures known at atomic resolution (not macromolecules) is full-matrix least squares. For structures that are not known at atomic resolution, or for very large structures, sparse-matrix least squares or simulated annealing is used for refinement. Since the structure factors are not linear functions of the parameters, the process is an iterative one. Hydrogen atoms are not generally included in the early stages of refinement but are added in the final cycles.

The function minimized is $\sum \omega \Delta^2$, where $\omega$ is the weight assigned to a particular observation and $\Delta$ is the difference between the observed and calculated values of $|F_{hkl}|$ (for a refinement based on $|F|$) or $F_{hkl}^2$ (for a refinement based on $|F|^2$). A convenient parameter, referred to as the R-factor, for monitoring the progress of the refinement is

$$R = \frac{\sum \bigl|\, |F_o| - |F_c| \,\bigr|}{\sum |F_o|}$$

Correct structures generally have R-values under 0.10, and those that are well behaved are frequently under 0.05.
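Both quantities are simple sums over the reflection list. The sketch below uses the conventional R-factor, with the sum of observed amplitudes in the denominator, and a refinement on $|F|$; the amplitudes and unit weights are made-up illustrative numbers.

```python
def weighted_residual(f_obs, f_calc, weights):
    """Least-squares target: sum of w * (|Fo| - |Fc|)**2 (refinement on |F|)."""
    return sum(w * (abs(fo) - abs(fc)) ** 2
               for fo, fc, w in zip(f_obs, f_calc, weights))

def r_factor(f_obs, f_calc):
    """Conventional R = sum| |Fo| - |Fc| | / sum|Fo|."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

# Illustrative amplitudes only.
f_obs = [10.0, 20.0, 30.0]
f_calc = [9.0, 21.0, 30.0]

print(f"R = {r_factor(f_obs, f_calc):.3f}")
print(f"sum w*delta^2 = {weighted_residual(f_obs, f_calc, [1.0, 1.0, 1.0]):.1f}")
```

The weighted sum is what the refinement drives downward; R is only a monitor and is not itself minimized.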

J. Determination of Absolute Configuration

If a particular torsional angle has a positive sign in the right-handed enantiomer, it will have a negative sign in the left-handed molecule. Thus, determination of the absolute configuration in a chiral molecule can be regarded as the determination of the correct signs for the torsional angles. A torsional angle ABCD is positive if, viewed along the B–C bond, a clockwise rotation will cause the bond AB to eclipse the bond CD.
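The sign convention can be checked numerically. The sketch below computes a signed torsion angle from four Cartesian positions using a standard atan2 formulation, then inverts the coordinates through the origin to mimic the opposite enantiomer; the coordinates are arbitrary illustrative points and the helper names are not from the article.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def torsion_deg(a, b, c, d):
    """Signed torsion angle A-B-C-D in degrees, range (-180, 180]."""
    b0 = _sub(a, b)          # from B back to A
    b1 = _sub(c, b)          # central bond B -> C
    b2 = _sub(d, c)          # from C on to D
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Components of B->A and C->D perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))

points = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
inverted = [tuple(-x for x in p) for p in points]

print(torsion_deg(*points))    # one hand
print(torsion_deg(*inverted))  # mirror-related hand: same magnitude, opposite sign
```

Inversion leaves every bond length and bond angle unchanged, which is why the torsion signs are a convenient handle on absolute configuration.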

Determination of absolute configuration by X-ray crystallography requires a structural study in the presence of dispersive scatterers. Thus, if at least one atom in the structure is an anomalous (i.e., dispersive) scatterer, Friedel’s law breaks down and reflections from two sides of the same plane are no longer equal in intensity. The differences in intensity are generally small, so careful measurement is required. Coster et al. demonstrated that a noncentrosymmetric crystal structure could be distinguished from its inverted image using these differences. Later, Bijvoet realized this principle was more general and used it to determine the absolute configuration of the sodium rubidium salt of (+)-tartaric acid. In 1983, Flack developed a method for distinguishing a noncentrosymmetric structure from its inverse. Using the method of Flack, any noncentrosymmetric crystal is treated as a twin by inversion and the contribution of the two components is evaluated during refinement as the Flack parameter. In the case where the arrangement of atoms in the model and crystal are in agreement, the Flack parameter is zero. If the model and crystal are inverted with respect to each other, the Flack parameter is one and the model needs to be inverted.
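The inversion-twin model behind the Flack parameter $x$ writes each observed intensity as $(1 - x)\,|F(\mathbf{h})|^2 + x\,|F(-\mathbf{h})|^2$. The sketch below estimates $x$ by a one-parameter linear least-squares fit with all other model parameters held fixed. That is a simplification assumed here for illustration; refinement programs normally refine $x$ together with the rest of the model and report a standard uncertainty. All numerical values are made up.

```python
def flack_estimate(i_obs, f2_model, f2_inverted):
    """Least-squares x in I_obs = (1 - x) * |F(h)|^2 + x * |F(-h)|^2.

    Rearranged as (I_obs - a) = x * (b - a), where a and b are the
    calculated intensities for the model and for its inverse.
    """
    numerator = sum((io - a) * (b - a)
                    for io, a, b in zip(i_obs, f2_model, f2_inverted))
    denominator = sum((b - a) ** 2 for a, b in zip(f2_model, f2_inverted))
    return numerator / denominator

# Illustrative calculated intensities for three Friedel-sensitive reflections.
f2_model = [100.0, 50.0, 80.0]
f2_inverted = [110.0, 45.0, 78.0]

# Synthetic "observations" from a crystal that is 30% inverted.
x_true = 0.3
i_obs = [(1 - x_true) * a + x_true * b
         for a, b in zip(f2_model, f2_inverted)]

print(f"Flack x = {flack_estimate(i_obs, f2_model, f2_inverted):.2f}")
```

A value near zero supports the model as refined, while a value near one indicates that the model should be inverted.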

K. Derived Parameters

The parameters produced directly by the least-squares refinement are the positions of the atoms and their thermal parameters. Bond lengths, bond angles, and torsional angles are derived from these positions. An examination of short inter- and intramolecular contacts may provide information about hydrogen bonding, van der Waals forces, packing forces, etc.

Room-temperature studies of organic compounds generally show appreciable thermal motion with $U_{ij}$ values of the order of 0.04 Å². This corresponds to a root-mean-square vibration amplitude of 0.2 Å and is a reminder that some caution is required in comparing bond lengths from diffraction experiments with those determined by spectroscopic and theoretical work. Bond lengths can sometimes be “corrected” for thermal motion, but it is generally preferable to reduce thermal motion by cooling the crystal. Many investigators routinely collect data at temperatures in the range of 100 to 200 K.
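The quoted amplitude follows from reading $U$ as a mean-square displacement along one direction. This is the usual interpretation, taken here in the isotropic approximation for illustration:

```latex
\sqrt{\langle u^2 \rangle} \;=\; \sqrt{U} \;=\; \sqrt{0.04\ \text{\AA}^2} \;=\; 0.2\ \text{\AA}
```

Because the amplitude scales as the square root of $U$, halving the rms amplitude by cooling requires reducing $U$ by a factor of four.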

A single crystal structure determination provides valuable information on chemical connectivity, relative conformation, and, under the proper experimental conditions, absolute configuration. However, an understanding of structure–function relationships requires correlating features from a number of different structures. The existence of computer-searchable databases of structural data greatly enhances the possibilities for such comparisons (see Section VI).

V. COMPARISON OF X-RAY AND NEUTRON DIFFRACTION

For a structure with $N$ atoms, each with atomic scattering amplitude $f_i$ and position $\mathbf{r}_i$ in the unit cell, the structure factor for the Bragg reflection of index $\mathbf{h}$ is

$$F(\mathbf{h}) = \sum_{i=1}^{N} f_i(\mathbf{h})\, T_i(\mathbf{h}) \exp(2\pi \mathrm{i}\, \mathbf{h} \cdot \mathbf{r}_i)$$

where $T_i$ is the temperature factor of the $i$th atom. Differences between neutron diffraction and X-ray diffraction lie in the scattering amplitudes $f_i(\mathbf{h})$ and in the temperature parameters $T_i(\mathbf{h})$. In X-ray diffraction, X-rays are scattered by the electrons of an atom, and the scattering factors $f_i(\mathbf{h})$ are strongly dependent on scattering angle. At a scattering angle of 0°, $f_i$ is proportional to $Z$, the atomic number of the scattering atom. As the scattering angle increases, the scattering factor $f_i(\mathbf{h})$ decreases. Thermal motion causes the scattering to fall off even more strongly. These factors limit the resolution available in an experiment and make it difficult to determine the positions of light atoms accurately in the presence of heavy ones or to distinguish among heavy atoms that have similar atomic numbers.

In neutron diffraction, the scattering is primarily from the nucleus. Since the diameter of the nucleus is small relative to the wavelength of thermal neutrons, the scattering factor is a constant, characteristic of the particular nucleus and independent of scattering angle. There is no simple relationship between the scattering amplitudes and the nuclear mass or charge. Nuclei with similar atomic number can have significantly different scattering amplitudes. Hence, neutron diffraction can distinguish among near neighbors in the periodic table and so is useful in the study of alloys.

Table II shows the relative scattering lengths of a number of atoms for X-rays and neutrons. If the average contribution of atom $i$ to the intensity of a structure factor is

$$P(i) = \frac{f_i^2}{\sum_{j=1}^{n} f_j^2}$$

then, for a compound such as benzene with equal numbers of hydrogen and carbon atoms, carbon will contribute 97% and hydrogen will contribute 3%. Thus, in the X-ray experiment, positions of hydrogen atoms will be more poorly determined than positions of carbon atoms. In the neutron experiment on the same compound, hydrogen would contribute 24% and carbon 76%. (Because of the negative sign on the scattering length of hydrogen, a Fourier summation will show “holes” rather than peaks at hydrogen positions.) For deuterobenzene, the contributions from C and D (in the neutron-scattering experiment) are virtually identical, with each contributing 50% of the scattering. Thus, neutron diffraction can locate hydrogen and deuterium atoms with the same precision as carbon, nitrogen, and oxygen, while X-rays cannot. For this reason, studies of hydrogen bonding in biologically significant compounds such as amino acids and sugars were among the early experiments using neutron diffraction.

TABLE II Scattering Amplitudes for Selected Elements (10⁻¹² cm)ᵃ

Element   Atomic number   X-ray amplitude   Neutron amplitude
¹H         1               0.28             −0.37
²D         1               0.28              0.67
¹²C        6               1.69              0.66
¹⁴N        7               1.97              0.94
¹⁶O        8               2.26              0.58
³²S       16               4.51              0.28
Clᵇ       17               4.79              0.96
Brᵇ       35               9.87              0.68
¹²⁷I      53              14.95              0.52
²³⁸U      92              25.92              0.84

ᵃ For X-rays, the scattering amplitude at 0° is given by $(e^2/mc^2) f_0$, or (0.282 × 10⁻¹² cm) × atomic number. Values for neutrons are taken from “International Tables for X-Ray Crystallography,” Vol. IV, 1974, pp. 270–271.

ᵇ These values are for the elements in their natural isotopic abundances.
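The percentages quoted for benzene and deuterobenzene can be reproduced from the Table II amplitudes. The sketch below sums $f^2$ over all atoms of each element in the formula unit; the function name and data layout are illustrative choices.

```python
def contributions(composition):
    """Fractional contribution of each element to the average intensity.

    `composition` maps element -> (scattering amplitude, number of atoms).
    Each element contributes n * f**2, normalized by the total.
    """
    total = sum(n * f * f for f, n in composition.values())
    return {element: n * f * f / total
            for element, (f, n) in composition.items()}

# Amplitudes in units of 1e-12 cm, taken from Table II.
xray_benzene = contributions({"C": (1.69, 6), "H": (0.28, 6)})
neutron_benzene = contributions({"C": (0.66, 6), "H": (-0.37, 6)})
neutron_deuterobenzene = contributions({"C": (0.66, 6), "D": (0.67, 6)})

for label, result in [("X-ray, C6H6", xray_benzene),
                      ("neutron, C6H6", neutron_benzene),
                      ("neutron, C6D6", neutron_deuterobenzene)]:
    print(label, {el: f"{100 * p:.0f}%" for el, p in result.items()})
```

Because the amplitude enters squared, the negative sign of the hydrogen neutron scattering length does not affect the contribution to the intensity; it only shows up in the sign of the Fourier density.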

As the precision of the experiments and the sophistication of the refinements improved, it became obvious that there were systematic differences in the positional and thermal parameters from the two experiments that were much larger than expected from the estimated standard deviations. The differences are very pronounced for hydrogen atoms. Even in the most precise, low-temperature studies, electron density maxima for hydrogen atoms were displaced by as much as 0.2 Å from the positions of protons determined from neutron diffraction. Since the C–H and O–H bond lengths in the neutron experiment agree with spectroscopic measurements, it was recognized that the apparent shortening of the bonds to hydrogen observed in X-ray experiments is a bonding effect. The position of maximum electron density does not coincide with the position of the nucleus because the formation of the covalent bond perturbs the electron-density distribution in the atom.

Similar, but smaller, effects are observed for first-row atoms, C, N, O, etc. Typically, the discrepancy is of the order of 0.01 Å, but it depends critically on the range of Bragg angles included in the X-ray refinement. If the refinement is based on very-high-order data [(sin θ)/λ > 1.00 Å⁻¹], the discrepancies will be much reduced; however, few organic compounds scatter to such high angles even at liquid-nitrogen temperature.

Systematic differences in temperature factors also reflect the very real differences in the scattering processes in the two experiments. In aromatic molecules, thermal vibration parameters are greater for X-rays than for neutrons in the plane of the ring and smaller perpendicular to the ring, implying that electron density is smeared in the plane of the ring by covalent bonding and contracted in the perpendicular direction.

Atoms with lone pairs of electrons show significant differences in both positional and thermal parameters in the two experiments because the lone-pair density is not centered on the nucleus. These very real differences suggest strongly that neutron diffraction is the preferred tool for determining atomic parameters, whereas the X-rays measure the electron density in the solid. The combination of the two techniques provides a means of studying bonding effects (this topic was covered in greater detail in Section II.B.3).

VI. RESULTS

The determination of a single crystal structure may answer a question about the connectivity of a molecule or some detail of its conformation. The direct results of structure analysis are the positional, thermal, and occupancy parameters of atoms in the asymmetric unit. Bond lengths and angles, torsion angles, and intermolecular associations (such as hydrogen bonding) are all derived from these basic structural parameters. While the final R-value that a structure refines to is often a good indicator of the quality of a structural determination, a plot showing the thermal ellipsoids (Fig. 6) can also give an indication of that quality. Errors in the structure determination and thermal motion in the molecule can distort these ellipsoids.

Understanding a complex process (such as the mechanism of a reaction, the biological activity of a class of drugs, the phenomenon of one-dimensional conduction, or any other structure–property relationship) requires the detailed analysis of a large number of structures and correlation with results from other disciplines. Databases provide the kind of information required for structure–function studies and are very valuable to the scientific community. The topics presented here are intended to be illustrative of the many ways that the databases have been used. There are many other uses not documented here.

FIGURE 6 Thermal ellipsoid plot from a crystallographic structure determination. Hydrogen atoms are plotted as small balls with an arbitrary radius; non-hydrogen atoms are plotted as ellipsoids, with the axes corresponding to the thermal parameters for the atom. Notice that the ellipsoids gradually increase in size and become more asymmetric moving out the chain from C8 to C16. The ellipsoids can become large and more asymmetric for noncovalently bound groups such as the coordinated nitro group (N2, O1, O2, and O3). In both of these cases, the axes of the ellipsoids are aligned with expected vibrational modes and do not indicate any major problems with the structure. If the axes were poorly aligned and/or the ellipsoids more asymmetric, this could be an indication of problems with the structural determination.

A. Crystallographic Databases

At present there are eight major databases for crystallographic results continuously maintained and updated in different laboratories in Europe and North America. In each case the data are available in machine-readable form, and considerable effort has been expended to develop efficient computational algorithms for searching the files and correlating the data. Perhaps the most important feature of the databases is that the data have been checked and errors are corrected when possible or flagged if uncorrected. Some of these databases have been incorporated into commercial packages. Some are available in both printed and computer-readable form. In the brief summary that follows, they are listed in alphabetical order.

1. Cambridge Structural Database

(http://www.ccdc.cam.ac.uk/prods/csd/csd.html; Cambridge Crystallographic Data Centre (CCDC), University Chemical Laboratory, Lensfield Road, Cambridge, CB2 1EW, U.K.) The Cambridge Structural Database (CSD) is the largest database of experimentally determined organic and metallo-organic crystal structures in the world; inorganic carbon compounds (such as carbonates and cyanides) are excluded. The CCDC also provides a suite of graphical search, retrieval, data manipulation, and visualization software for use with the database. The CSD contains bibliographic data, tables of connectivity, atomic positions, cell dimensions, and quality indicators for virtually all three-dimensional structures of organic compounds published since 1935. The CSD currently contains over 210,000 entries and is growing at a rate of approximately 15,000 entries per year. The CSD is available to scientists throughout the world.

2. Crystal Data

(http://www.nist.gov/srd/nist3.htm; Crystallographic Section, National Bureau of Standards, Washington, D.C. 20234.) This database contains lattice parameters for all crystals whose dimensions have been reported by X-ray, neutron, or electron diffraction on single crystals or fully indexed powders. Data include name, formula, cell dimensions, space groups, number of molecules in the cell, density (measured and calculated), bibliographical data, crystal habit, melting point, etc. Crystal Data accepts data from the other databases.

3. Electron Density Data Base

(Prof. H. Burzlaff, Lehrstuhl für Kristallographie, Institut für Angewandte Physik der Universität, Bismarckstrasse 10, D-91054 Erlangen, Germany.) This database contains accurate structure factors for crystal structures whose electron densities have been carefully determined. This is the type of data required for studying bonding effects, covalency in organometallics, and other details of electron distribution.

4. Inorganic Crystal Structure Database

(http://crystal.fiz-karlsruhe.de/portal/cryst/ab icsd.html.) The Inorganic Crystal Structure Database (ICSD) was initiated in 1978 at the Institute for Inorganic Chemistry at the University of Bonn. Today the database is produced by FIZ Karlsruhe (P.O. Box 2465, D-76012 Karlsruhe, Germany) in cooperation with NIST. “Inorganic” is defined to exclude metals, alloys, and compounds with C–H and C–C bonds (with the exception of graphites). The data stored include chemical name, chemical formula, density, lattice parameters, space groups, atomic coordinates, oxidation state, temperature factors, remarks regarding conditions of measurement, R-values, and bibliographical references. Online access to the file is available. It is also possible to lease the entire database. The database now contains over 53,000 entries and is updated twice a year.

5. Metals Data File

(Manager, CAN/SND, Canada Institute for Scientific and Technical Information, National Research Council (NRC) of Canada, Ottawa, Canada K1A 0S2.) The Metals file contains structural data for metal and alloy structures determined since 1913 based on either powder or single-crystal diffraction. Under an exclusive license from the NRC, Toth Information Systems has maintained and updated the database. If available, the following information is included: formula, cell dimensions, structure type, Pearson symbol, atomic coordinates, temperature parameters, occupancy factors, R-values, method of refinement, instrument used, radiation, and bibliographical information. New software for manipulating the file has been developed by Toth Information Systems.

6. Nucleic Acid Database

(http://ndb-mirror-2.rutgers.edu/NDB/ndb.html; Dr. H. M. Berman, Department of Chemistry, Rutgers University, 610 Taylor Road, Piscataway, NJ 08854-8087.) The goal of the Nucleic Acid Database (NDB) project is to assemble and distribute structural information about nucleic acids. Online access to the NDB is freely available. A variety of tools have been developed in conjunction with the NDB to provide a robust user interface.

7. Powder Diffraction File

(http://www.icdd.com/ productsold/pdf2.htm; International Centre for Diffraction Data (ICDD), 1601 Park Lane, Swarthmore, PA 19081.) The Powder Diffraction File (PDF) is a compilation of powder diffraction patterns produced and published by the Joint Committee on Powder Diffraction Standards of the ICDD. The PDF is the world’s largest and most complete collection of X-ray powder diffraction patterns. The 1999 release of the PDF contains about 115,000 patterns, of which 20,000 are organic and 95,000 inorganic. Included in the database are calculated powder patterns for almost 40,000 compounds. The PDF is used for identification of crystalline materials by matching d-spacings and diffraction intensity measurements. Each pattern includes a table of interplanar d-spacings, relative intensities, and Miller indices, as well as additional helpful information such as chemical formula, compound name, mineral name, structural formula, crystal system, physical data, experimental parameters, and references.
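Matching against a reference pattern starts by converting measured peak positions to d-spacings with Bragg's law, d = λ / (2 sin θ). The sketch below does that conversion and counts matches within a tolerance. It is only an illustration of the idea: it ignores relative intensities, the tolerance is arbitrary, and the reference list is computed for a hypothetical cubic phase rather than taken from the PDF.

```python
import math

CU_K_ALPHA = 1.5418  # Å, weighted-mean Cu Kα wavelength

def d_spacing(two_theta_deg, wavelength=CU_K_ALPHA):
    """Bragg's law: d = wavelength / (2 * sin(theta))."""
    return wavelength / (2.0 * math.sin(math.radians(two_theta_deg / 2.0)))

def cubic_d(a, hkl):
    """d-spacing of reflection hkl for a cubic cell of edge a."""
    h, k, l = hkl
    return a / math.sqrt(h * h + k * k + l * l)

def count_matches(observed_d, reference_d, tol=0.01):
    """Number of observed d-spacings lying within `tol` Å of a reference line."""
    return sum(1 for d in observed_d
               if any(abs(d - r) <= tol for r in reference_d))

# Hypothetical reference: cubic phase with a = 5.64 Å, three strong lines.
reference = [cubic_d(5.64, hkl) for hkl in [(1, 1, 1), (2, 0, 0), (2, 2, 0)]]

# Hypothetical measured peak positions (2θ in degrees).
observed = [d_spacing(tt) for tt in (27.4, 31.7, 60.0)]

print([round(d, 3) for d in observed])
print(count_matches(observed, reference))
```

A practical search-match procedure would also weight the comparison by relative intensity and allow for systematic 2θ offsets.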

8. Protein Data Bank

(http://www.rcsb.org/pdb/; Dr. H. M. Berman, Department of Chemistry, Rutgers University, 610 Taylor Road, Piscataway, NJ 08854-8087.) The Protein Data Bank (PDB) is the single international repository for the processing and distribution of three-dimensional macromolecular structure data (primarily determined experimentally by X-ray crystallography and NMR). In June of 1999, Brookhaven National Laboratory ceased its operation of the PDB when the Research Collaboratory for Structural Bioinformatics (RCSB) took over. The RCSB operates under a contract from the U.S. National Science Foundation with additional support from the Department of Energy and two units of the National Institutes of Health. Contents of the PDB are in the public domain, but the original work as well as the PDB should be properly cited whenever referred to. Affiliated centers in Australia, England, and Japan undertake distribution of data in their respective areas.

B. Structure Correlation

The working hypothesis behind structure-correlation studies is that changes observed in a structural fragment or subunit in a number of different environments occur along a potential-energy valley in the parameter space of that fragment. Each observed structure is a sample point. An example of this type of study examines the role of iodine in the binding of thyroid hormone. Since iodine-containing thyroid hormones are protein bound during most of their metabolic lifetime, attractive interactions with nucleophiles could play an important role in that binding. Short contacts between iodine bound to carbon and nucleophiles (such as O, N, and S) were studied in an effort to better understand these interactions. The shortest contacts are essentially linear with the lone pair of the nucleophile directed towards the C–I vector. Such contacts are similar to hydrogen bonds and have been estimated to contribute attractive energy of about 3 kcal/iodine atom. The importance of such interactions is further supported by the observation of a short (2.96 Å) I–O contact in the crystal structure of pre-albumin with bound thyroxine.

C. Reaction Coordinate

A chemical reaction can be represented by a plot of energy as a function of reaction coordinate. Figure 7 shows the starting material A going to product B through a transition state with an activation energy of $\Delta E$, which can be obtained from kinetic data. It is fairly obvious that we can determine the structures of the compounds A and B. Less obvious is that the databases provide a means of looking at the reaction path in some detail. Each individual structure provides a snapshot at one point along the reaction coordinate, but a whole family of structures can plot out a curve related to the potential-energy surface of the reaction.

FIGURE 7 Reaction coordinate. A plot of energy as a function of reaction coordinate for molecule A yielding product B.

An early example of this approach was provided by the study of the interactions between amino and carbonyl groups in nucleophilic addition reactions. By examining the data from six crystal structures, Bürgi, Dunitz, and Shefter were able to show that interaction with the nucleophile causes the carbon of the carbonyl group to be displaced from the plane of its three substituents toward the approaching nucleophile. The direction of approach of the lone pair on the nucleophile is at an angle of roughly 109° to the carbonyl bond (not perpendicular to the plane of the carbonyl). In addition, the displacement of the carbonyl carbon out of the plane of its substituents yields a smooth curve when plotted as a function of the observed C–N separation.

As further confirmation of the validity of this approach, the general conclusions have been reinforced by comparison with SCF-LCGO calculations on the system CH₂O + H⁻ → CH₃O⁻. The calculated reaction path for the nucleophilic attack of hydride ion on formaldehyde shows very close resemblance to the one predicted by the method of structure correlation.

A different type of correlation is provided in a set of papers published in 1984 by Kirby and colleagues on the length of the C–O bond. This investigation was prompted by the observation that C–O bond lengths for acetals showed an unusually broad range of values. To determine whether this was a general phenomenon or one peculiar to acetals, the geometry of nearly 2400 ethers and esters was investigated. The results show clearly that there is substantial, systematic variation in C–O bond lengths. The fragment of interest is defined as R1–O–R2. Data are divided into four categories, depending on whether R1 is methyl, primary, secondary, or tertiary, and each of those categories is then divided according to the effective electronegativity of the group R2. The four categories of R2 are alkyl, aryl, enol/ether, and acyl/carboxylic ester, giving 16 categories in all. The shortest C–O bonds (1.418 Å) are found in compounds where R1 is methyl and R2 is alkyl. Within each of the four categories of R1, the length of the R1–O bond increases with increasing electronegativity of the R2 group. The effective electronegativity of a group can be estimated from the pKa of its conjugate acid. In the subset of 2-(aryloxy)-tetrahydropyrans, there is a linear relationship between the length of the exocyclic C–O bond and the pKa of the leaving group. This implies a linear relationship between the bond length and the free energy of activation for the hydrolysis reaction or any other reaction in which the C–O bond is cleaved. The consequences of these generalizations to the chemistry and reactivities of acetals and glycosides are fully explored in these papers. In the third paper in the series, the authors plot the reaction coordinate for six aryl tetrahydropyranyl acetals. Both relative free energy and pKa are plotted as a function of the bond length. The Morse function is also plotted, as is a reaction coordinate–energy contour diagram. This work by Kirby et al. is one of the most comprehensive attempts at deriving structure–reactivity relationships so far available.

D. Drug Design

The coupling of the resources of databases with molecular graphics devices and QSAR (quantitative structure–activity relationships) techniques raises a tantalizing possibility that therapeutically useful new structures might be predictable. Detailed comparison of structures of known agonists and antagonists is the major strategy currently in use. Similarities in the three-dimensional structure of portions of the molecules allow identification of the features required for binding. Differences in other portions of the structure may account for the agonist/antagonist response after binding. Once the pharmacophore has been identified, binding studies can be combined with structural comparisons to map the receptor site.

A sum or superposition of active molecules can be usedto define the available volume within a receptor. Inac-tive molecules then define excluded volumes—volumesoccupied either by the receptor itself or by a cofactor.Any new drug must be designed to present the correctpharmacophore and to occupy only the available volume.

Examples of this include mapping of the methionine binding site in the enzyme S-adenosyl ATP transferase by Marshall and colleagues, and the postulating of a common site of action for gamma-butyrolactone analogs and picrotoxinin. On the basis of this model, it has been proposed that the convulsant activity of these compounds is related to their ability to block the passage of chloride ions through channels. The model would appear to be applicable to a number of convulsant and anticonvulsant drugs.

E. Crystallography and Molecular Mechanics

Molecular mechanics is an empirical method for calculation of properties of molecules such as molecular geometry, heat of formation, strain energy, dipole moment, and vibrational frequencies. Different programs use different parameter sets and reproduce these physical properties with different degrees of fidelity. Parameters are assumed to be transferable from one type of molecule to another.

The geometrical parameters used in molecular-mechanics programs are frequently derived from crystallographic data. However, the predictive value of the method is limited by the datasets used to derive the parameters. For instance, a parameter set derived from uncrowded hydrocarbons is not likely to predict structures of crowded hydrocarbons with satisfactory accuracy. Transition states, small rings, unusual states of hybridization, and other electronic effects may require special treatment. These caveats notwithstanding, the method has enjoyed considerable success.

It is not uncommon for molecular-mechanics calculations to be used to provide a starting point for the refinement of electron diffraction data. A recent example is provided by the study of bicyclo[3.2.0]heptane. Many other examples are available in the literature.

Molecular mechanics can also be useful in interpretation of crystal structures, particularly in differentiating electronic and steric effects or in estimating the effects of packing forces. In the structure of lepidoptrene, a C–C bond was observed to have a length of 1.64 Å in the X-ray experiment, while molecular mechanics predicted a length of 1.57 Å. Mislow and colleagues demonstrated that the additional lengthening of the bond beyond the amount expected from steric strain was caused by “through-bond” coupling of adjacent π systems.

Although molecular-mechanics calculations can be useful in assessing which aspects of a structure control conformation, it should be pointed out that conformational parameters from molecular mechanics tend to be less reliable than bond lengths or bond angles. Also, conformational parameters in crystal structures are those most influenced by packing forces. Molecular-mechanics calculations have been used to suggest alternative conformations

Page 59: Encyclopedia of Physical Science and Technology - Condensed Matter

P1: GJC/LOW P2: FYK Final Pages

Encyclopedia of Physical Science and Technology EN004G-160 June 8, 2001 19:47

Crystallography 141

(having energy similar to that observed in the crystal) that might exist in the gas phase or in solution. Additionally, molecular mechanics has been used to optimize the geometry of a pharmacophore in model studies of drug–receptor binding, to evaluate the interaction energies between dinucleoside monophosphates and cationic intercalators such as ethidium bromide, and to interpret conformational polymorphism. Although some caution is obviously necessary, molecular mechanics and crystallography can provide complementary information in a variety of cases.

APPENDIX I: FACTORS AFFECTING INTENSITIES

A number of geometrical factors influence the intensities observed in a diffraction experiment. The most important of these are the Lorentz, polarization, absorption, and extinction corrections. The first three corrections are normally applied to the observed intensities in the process of calculating the structure amplitudes |Fhkl|. Many researchers feel that omitting the absorption correction is the single largest source of systematic error in crystal structures in the current literature. Extinction is a correction made during least-squares refinement and so is a correction applied to the model rather than to the observations. However, the correct formulation for extinction depends on both the polarization and the path-length (calculated during absorption correction) and so is discussed with these other correction terms.

A. Absorption

The absorption of X-rays by crystals obeys the relation I = I0 exp(−µt), where µ is the linear absorption coefficient, in units of (length)⁻¹, and t is the thickness. The general effect is to reduce the intensity of reflections at low sin θ. If the crystal is centrosymmetric in cross-section, neglect of this correction will have little effect on positional parameters, although scale and thermal parameters may be very strongly affected. If the crystal cross-section is not centrosymmetric, all structural parameters will be systematically wrong. Neglect of absorption is probably the largest single source of systematic error in published structures.
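As a quick numerical illustration of the relation I = I0 exp(−µt), the sketch below compares the transmitted fraction for two path lengths through the same crystal. The absorption coefficient and thicknesses are hypothetical values chosen only for illustration, and the function name is not taken from any crystallographic package.

```python
import math

def transmission(mu_per_cm: float, thickness_cm: float) -> float:
    """Fraction I/I0 of the beam surviving a path of the given thickness."""
    return math.exp(-mu_per_cm * thickness_cm)

# Hypothetical plate-like crystal: mu = 10 cm^-1, paths of 0.1 mm and 0.3 mm.
thin = transmission(10.0, 0.01)
thick = transmission(10.0, 0.03)

# Reflections measured along the long direction are weakened more,
# which is why crystal shape matters for the correction.
print(f"I/I0 thin path : {thin:.4f}")
print(f"I/I0 thick path: {thick:.4f}")
```

The ratio of the two values shows how an anisotropic crystal shape translates into direction-dependent intensity errors when the correction is neglected.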

Absorption corrections can be done by collecting ϕ-scans and calculating an empirical correction or by indexing and measuring crystal faces and calculating a face-indexed absorption correction. For a face-indexed correction, the size and shape of the crystal must be precisely determined. Once the crystal shape is established, a number of techniques are available for calculating the integrated path-length for incident and diffracted beams for all reflections.

For crystals with no reentrant angles, the most common methods are the analytical method and the method of Gaussian quadrature. In both cases, the objective is to evaluate the integral,

$$A = \frac{1}{V}\int_{\text{crystal}} \exp(-\mu T)\,dV$$

This can be done analytically if the crystal is divided into a number of polyhedra in each of which the path-length is a linear function of the coordinates. De Meulenaer and Tompa first programmed this method in 1965. Calculation time is independent of the severity of absorption. The factor limiting accuracy in cases of severe absorption is the precision of measuring the crystal, particularly in its shortest direction.

Gaussian quadrature is a numerical integration method that evaluates an integral by summing an appropriate polynomial. It uses a nonisometric grid in which the interval is subdivided symmetrically about the midpoint, with large spacings near the middle and smaller ones toward the edges. This tends to put the maximum number of grid points near the surface of the crystal where the change of absorption with path-length is largest. The number of grid points determines the precision of the calculation. Thus, if a 4 × 4 × 8 grid gives a precision of 2% in calculated transmission for µ = 2.5 cm⁻¹, a grid of 8 × 8 × 16 is required to produce the same precision if µ = 5.0 cm⁻¹. By choosing sufficient points, the Gaussian method can reproduce the analytical result to any desired precision; however, for strongly absorbing crystals, the analytical method is the method of choice.
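The Gaussian approach can be sketched for a deliberately simplified case: a crystal with a rectangular cross-section, the incident beam travelling along +x, and the diffracted beam leaving in the same plane at an angle 2θ. The geometry, function names, and numerical values are assumptions made for this illustration; a production program would work in three dimensions with indexed faces. The Gauss–Legendre nodes come from NumPy's `leggauss`.

```python
import numpy as np

def exit_distance(x, y, a, b, direction):
    """Distance from (x, y) to the boundary of the box [0, a] x [0, b]
    along a unit direction vector."""
    dx, dy = direction
    eps = 1e-12  # smaller components are treated as zero
    hits = []
    if dx > eps:
        hits.append((a - x) / dx)
    elif dx < -eps:
        hits.append(-x / dx)
    if dy > eps:
        hits.append((b - y) / dy)
    elif dy < -eps:
        hits.append(-y / dy)
    return min(hits)

def transmission_factor(mu, a, b, two_theta_deg, n=16):
    """A = (1/area) * integral of exp(-mu*T) over the cross-section,
    evaluated on an n x n Gauss-Legendre grid."""
    nodes, weights = np.polynomial.legendre.leggauss(n)
    xs = 0.5 * a * (nodes + 1.0)  # map nodes from [-1, 1] to [0, a]
    ys = 0.5 * b * (nodes + 1.0)  # map nodes from [-1, 1] to [0, b]
    phi = np.radians(two_theta_deg)
    out_dir = (np.cos(phi), np.sin(phi))
    total = 0.0
    for x, wx in zip(xs, weights):
        for y, wy in zip(ys, weights):
            # incident path (entered at x = 0) plus diffracted path
            path = x + exit_distance(x, y, a, b, out_dir)
            total += wx * wy * np.exp(-mu * path)
    return total / 4.0  # the weights sum to 2 in each dimension

# Hypothetical crystal: 0.2 mm x 0.3 mm cross-section, mu = 5 cm^-1
print(transmission_factor(5.0, 0.02, 0.03, 30.0))
```

Two limiting cases can be checked by hand: at 2θ = 0 every path has length a, so A = exp(−µa), and at 2θ = 180° the path is 2x, so A = [1 − exp(−2µa)]/(2µa). At intermediate angles the integrand has kinks where the exit face changes, which is one reason more grid points are needed as µ grows.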

When a crystal has reentrant angles, neither of these methods can be used, and one possible resort is numerical integration with an isometric grid; however, the grid must be very fine in order to achieve reasonable precision. For crystals mounted in capillaries, crystals with irregular shapes, and other pathological cases, absorption may be dealt with by measurement of a transmission surface as proposed by Huber and Kopfmann in 1969. Alternatively, these problems can be dealt with by the program DIFABS, developed by Walker and Stuart in 1983, which models the absorption surface by a Fourier series in polar coordinates. The coefficients are obtained by minimizing the sum of squares of residuals between observed and absorption-modified values of the structure factors. The chief virtue of the method is that it can be used even when the crystal is no longer available.

B. Lorentz Factor

The Lorentz factor corrects for the relative speeds with which different reflections pass through the reflecting position. The intensity of a reflection produced by a moving crystal depends on the time taken for the corresponding reciprocal lattice point to pass through the sphere of reflection. For a crystal rotating with angular velocity ω, this time is proportional to 1/(ωS cos θ). Using the definition S = 1/d and Bragg’s law, λ = 2d sin θ, the correction takes the form:

d/(ω cos θ ) = λ/(2ω sin θ cos θ ) = λ/(ω sin 2θ )

The factor 1/(sin 2θ ) is the Lorentz factor.

C. Polarization

X-rays are polarized on reflection, and the degree of polarization depends on experimental conditions. Neutrons are not polarized on reflection from ordinary crystals. X-rays produced from an X-ray tube or a rotating-anode generator are unpolarized; that is, all directions of the electric vector normal to the direction of propagation are equally represented. Thus, the beam can be regarded as composed of two components, one polarized parallel to the reflection plane and one perpendicular. The relative intensities of the two components are

$$I_{\parallel} = I_{\perp} = \tfrac{1}{2} I_0$$

For an ideally imperfect crystal, the intensity of radiation scattered in a particular direction is proportional to sin² φ, where φ is the angle between the electric vector and the direction of observation. It follows that the parallel component will be attenuated by reflection but the perpendicular component will not. Thus, the relative intensities of the two components after reflection at an angle θ will be

$$I_{\parallel} = \tfrac{1}{2} I_0 \sin^2(90^\circ - 2\theta) = \tfrac{1}{2} I_0 \cos^2 2\theta \quad\text{and}\quad I_{\perp} = \tfrac{1}{2} I_0 \sin^2 90^\circ = \tfrac{1}{2} I_0$$

Thus, the beam is partially polarized at all angles and completely polarized at 2θ = 90°.

For data monochromated by means of a β filter, the polarization correction is

$$p = \tfrac{1}{2}\left(1 + \cos^2 2\theta\right)$$

Frequently, Lorentz and polarization corrections are combined to give:

$$Lp = \frac{1 + \cos^2 2\theta}{2 \sin 2\theta}$$

These corrections are applied to the observed intensity to derive structure amplitudes:

$$|F_{hkl}| = \sqrt{I_{hkl}/Lp}$$
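The combined Lorentz–polarization correction for β-filtered data can be written as a short function. This is a minimal sketch of the two formulas above; the function names and the sample intensity are illustrative assumptions, and scale, absorption, and extinction are deliberately left out.

```python
import math

def lp_factor(two_theta_deg: float) -> float:
    """Combined Lorentz-polarization factor (1 + cos^2 2theta) / (2 sin 2theta)."""
    t = math.radians(two_theta_deg)
    return (1.0 + math.cos(t) ** 2) / (2.0 * math.sin(t))

def structure_amplitude(intensity: float, two_theta_deg: float) -> float:
    """|F_hkl| = sqrt(I_hkl / Lp), on an arbitrary scale."""
    return math.sqrt(intensity / lp_factor(two_theta_deg))

# Hypothetical reflection: I = 50 (arbitrary units) measured at 2theta = 90 degrees,
# where cos 2theta = 0 and sin 2theta = 1, so Lp = 1/2.
print(lp_factor(90.0))
print(structure_amplitude(50.0, 90.0))
```

Because Lp diverges as 2θ approaches 0° or 180°, very low-angle and back-reflection data are scaled down most strongly by this correction.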

In many modern diffractometers, monochromatization of the primary beam is achieved by Bragg reflection from a suitable crystal. Three of the most commonly used monochromator crystals are quartz (d = 3.35 Å), LiF (d = 2.01 Å), and highly oriented graphite (d = 3.35 Å). The monochromator may be installed in the incident beam or the diffracted beam and may be mounted with its axis parallel or perpendicular to the equatorial plane of the diffractometer. If the monochromator is in the incident beam and mounted so that its axis is perpendicular to the equatorial plane, the component that was attenuated by reflection from the monochromator is attenuated again by the sample, so that the polarization correction for the twice-reflected beam becomes:

$$p = \frac{1 + \cos^2 2\theta_m \cos^2 2\theta}{1 + \cos^2 2\theta_m}$$

For a monochromator mounted with its axis parallel to the equatorial plane, reflection from the monochromator attenuates the beam normal to the equatorial plane, while reflection from the sample attenuates the parallel component, so that

$$p = \frac{\cos^2 2\theta_m + \cos^2 2\theta}{1 + \cos^2 2\theta_m}$$

These formulas assume that the monochromator crystal is an ideally mosaic crystal. For a perfect or non-mosaic crystal, the factor cos² 2θm should be replaced by |cos 2θm|.

In practice, the polarization ratio seldom corresponds to either of these ideal values and may not even lie between them. This has led some investigators to recast the equations in the form:

$$p = \frac{1 + K \cos^2 2\theta}{1 + K}$$

for the monochromator axis perpendicular to the equatorial plane, and

$$p = \frac{K + \cos^2 2\theta}{1 + K}$$

for the parallel orientation, where K is the actual measured value of the polarization ratio for the monochromator in question. The value of K is different for different wavelengths. For routine structural work with MoKα radiation, the error in assuming that the monochromator is an ideal mosaic will generally be small. However, maximum error occurs at θ = 45° and so is important in the case of very precise studies that rely on high-angle data. Vincent and Flack have developed a method for determining K, the polarization ratio, without special equipment.
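The two recast equations are easy to compare numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the ideal-mosaic value K = cos² 2θm is computed from Bragg's law, and the Mo Kα wavelength of 0.7107 Å is supplied here rather than taken from the text.

```python
import math

def polarization(two_theta_deg: float, K: float, axis: str = "perpendicular") -> float:
    """Polarization factor for a monochromated beam with polarization ratio K."""
    c2 = math.cos(math.radians(two_theta_deg)) ** 2
    if axis == "perpendicular":
        return (1.0 + K * c2) / (1.0 + K)
    if axis == "parallel":
        return (K + c2) / (1.0 + K)
    raise ValueError("axis must be 'perpendicular' or 'parallel'")

def ideal_mosaic_K(d_spacing: float, wavelength: float) -> float:
    """K = cos^2(2 theta_m) for an ideally mosaic monochromator crystal."""
    theta_m = math.asin(wavelength / (2.0 * d_spacing))
    return math.cos(2.0 * theta_m) ** 2

# Graphite monochromator (d = 3.35 A) with Mo K-alpha (assumed 0.7107 A)
K_graphite = ideal_mosaic_K(3.35, 0.7107)
for two_theta in (20.0, 60.0, 90.0):
    print(two_theta,
          polarization(two_theta, K_graphite, "perpendicular"),
          polarization(two_theta, K_graphite, "parallel"))
```

Setting K = 1 makes both orientations collapse to the β-filter expression ½(1 + cos² 2θ), and the two orientations differ most at 2θ = 90°, where they give 1/(1 + K) and K/(1 + K).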

D. Extinction

As early as 1922, Darwin realized that absorption is not the only effect that attenuates the X-ray beam as it passes through the crystal. He described two phenomena, which he designated as primary and secondary extinction, and showed how they could be treated mathematically.

Primary extinction is an interference process. If a set of planes is in a position to reflect, the reflected rays may also be reflected a second time. Since there is a phase change of π/2 on reflection, a beam that has been reflected n times will be exactly out of phase with one that has reflected (n − 2) times. This causes the reflected intensity to be proportional to |F| rather than |F|². A crystal for which this is true is called an ideally perfect crystal. Such crystals are rare. It is much more common to encounter crystals where I ∝ |F|ⁿ where 1 < n < 2 but nearer to 2.

Darwin modeled the phenomena of extinction by assuming that crystals were made up of mosaic blocks, slightly misaligned with respect to one another. In perfect crystals, the blocks are assumed to be large and the misalignments small. Secondary extinction occurs because the planes first encountered by the incident beam reflect so strongly that deeper planes receive less radiation and so reflect with less power than they otherwise would have done. This effect is pronounced for strong reflections with |F/V| of the order of 0.1 × 10²⁴ cm⁻³.

According to the mosaic model, the effect is expected in crystals where the mosaic blocks are small with respect to the size of the crystal. A mosaic crystal in which the blocks are sufficiently misaligned that secondary extinction is negligible is called an ideally imperfect crystal. Since such crystals are seldom encountered in experiments, it is convenient to correct for extinction in least-squares refinement. In most current programs, the correction is based on Zachariasen’s 1967 formalism for isotropic extinction, in which the mosaic blocks are assumed spherical:

$$F_c^{*} = k\,|F_c|\left(1 + 2 r^{*} Q_0 T\, p_2/p_1\right)^{-1/4}$$

where k is the scale factor, Fc is the calculated value of the structure factor, r* = β[1 + (β/g)²]^(−1/2) where β = 2t/3λ, t is the mean path length in a single domain, and g is related to the mosaic-spread distribution, which is frequently assumed to be Gaussian. For X-rays,

$$Q_0 = \left(\frac{e^2 F K}{m c^2 V}\right)^{2} \frac{\lambda^3}{\sin 2\theta}$$

where e²/mc² is the classical radius of an electron, and K is the polarization ratio. For neutrons,

$$Q_0 = \left(\frac{F}{V}\right)^{2} \frac{\lambda^3}{\sin 2\theta}$$

The term T is the mean path-length in the crystal and represents an integration of incident and diffracted beams over all diffraction paths in the crystal. This is normally evaluated during the calculation of absorption correction. If absorption is small in the crystal, T may be arbitrarily set to some value such as 0.03 cm. Finally, the term pn is a polarization term generally assumed to have the form 1 + cos²ⁿ 2θ appropriate to filtered radiation. If a monochromator is used, the appropriate form of the polarization factor should be incorporated into the extinction calculation. For very precise work, the actual polarization ratio of the monochromator should be determined experimentally. For extinction correction, the term refined is r*. It contains both a breadth parameter and a misalignment parameter. If β ≫ g, then the broadening of the diffraction peak is dominated by mosaic spread, and we have a type I extinction. Type II extinction, when β ≪ g, is less commonly encountered and corresponds to the situation where the misalignment is small and the breadth of the diffraction peaks is controlled by small domain size. Separation of the terms is possible if two determinations are made on the same crystal with different wavelengths. This is rarely done.
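The isotropic extinction correction can be sketched as a function of quantities that a refinement program would already hold. In this illustration r*, Q0, and T are simply passed in by the caller, so no particular expression for Q0 is assumed; the function names and the numerical values are hypothetical and are not taken from any refinement package.

```python
import math

def polarization_term(n: int, two_theta_deg: float) -> float:
    """p_n = 1 + cos^(2n)(2 theta), the form assumed for filtered radiation."""
    return 1.0 + math.cos(math.radians(two_theta_deg)) ** (2 * n)

def extinction_corrected_F(Fc: float, k: float, r_star: float,
                           Q0: float, T: float, two_theta_deg: float) -> float:
    """F*_c = k |Fc| (1 + 2 r* Q0 T p2/p1)^(-1/4)."""
    p1 = polarization_term(1, two_theta_deg)
    p2 = polarization_term(2, two_theta_deg)
    x = 2.0 * r_star * Q0 * T * p2 / p1
    return k * abs(Fc) * (1.0 + x) ** -0.25

# Hypothetical strong reflection: the product r* Q0 T is chosen so that
# the bracket equals 16 at 2theta = 90 degrees (where p2/p1 = 1).
print(extinction_corrected_F(Fc=-100.0, k=2.0, r_star=7.5, Q0=1.0, T=1.0,
                             two_theta_deg=90.0))
```

With r* = 0 the function returns k|Fc| unchanged, which is the ideally imperfect limit; because the bracket enters to the power −1/4, even a large extinction term reduces the calculated amplitude only moderately.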

Coppens and Hamilton first extended the treatment of extinction to an anisotropic model in 1970. The crystal is modeled as if it were composed of ellipsoidal particles whose misorientations follow a Gaussian probability distribution. Since there is no need for the distribution of mosaic blocks to obey any symmetry in the crystal, symmetry-equivalent data are not averaged in this treatment.

E. Anomalous Dispersion

When the wavelength of the incident X-ray beam is close to the absorption edge of a scattering atom, the atomic scattering factor for that atom becomes complex:

$$f = f_0 + \Delta f' + i\,\Delta f''$$

where f0 is the normal scattering factor for wavelengths far from the absorption edge, and Δf′ and iΔf″ are correction terms. The quantity Δf′ is usually negative, and iΔf″ is always π/2 radians ahead of the real part in phase. For structural work, the corrections are assumed to be independent of scattering angle. In addition, Δf″ = 0 for wavelengths longer than the absorption edge.

Four aspects of anomalous dispersion important in normal structural work in crystallography are

1. Determination of absolute configuration
2. Solution of the phase problem (structure solution)
3. Distinguishing among atoms of similar scattering power
4. Avoiding systematic errors in structures with polar space groups

Knowledge of the absolute configuration is extremely important in physiologically active materials, since biological systems discriminate strongly between enantiomorphous forms of their substrates. Anomalous scattering with phase change causes a breakdown in Friedel’s law, and $I_{hkl} \neq I_{\bar{h}\bar{k}\bar{l}}$. This effect was first exploited for determination of the absolute configuration of the sodium rubidium salt of (+)-tartaric acid using Zr radiation (λ = 6.07 Å), whose wavelength is slightly shorter than that of the absorption edge of Rb (λ = 6.86 Å). The experiment coupled with the known relationship between the stereochemistry of (+)-tartaric acid and (+)-glyceraldehyde showed that Fischer’s arbitrary choice had been correct. With modern data collection techniques, the determination of absolute configuration is relatively uncomplicated. In favorable cases, the method can be applied to a compound with no atom heavier than oxygen when the incident radiation is CuKα (λ = 1.54 Å). Rabinovich and Hope have determined the absolute sign of the torsional angles in the achiral compound 4,4′-dimethylchalcone, C17H16O. These authors believe that the determination of absolute configuration may be possible with hydrocarbons.
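The breakdown of Friedel's law can be reproduced with a one-dimensional toy structure. The sketch below is only an illustration: the scattering factors, the anomalous corrections (Δf′ = −1, Δf″ = 3), and the coordinates are hypothetical numbers, and the function name is not from any crystallographic library.

```python
import cmath

def intensity(h: int, atoms) -> float:
    """|F(h)|^2 for a 1-D structure given as (scattering factor, fractional x) pairs."""
    F = sum(f * cmath.exp(2j * cmath.pi * h * x) for f, x in atoms)
    return abs(F) ** 2

f_light = 6.0                      # normal scatterer
f_heavy = 30.0                     # heavy atom far from its absorption edge
f_heavy_anom = 30.0 - 1.0 + 3.0j   # same atom with hypothetical Delta f' and Delta f''

polar_normal = [(f_light, 0.0), (f_heavy, 0.2)]
polar_anomalous = [(f_light, 0.0), (f_heavy_anom, 0.2)]
centro_anomalous = [(f_light, 0.0), (f_heavy_anom, 0.2), (f_heavy_anom, -0.2)]

for label, atoms in (("normal, noncentrosymmetric", polar_normal),
                     ("anomalous, noncentrosymmetric", polar_anomalous),
                     ("anomalous, centrosymmetric", centro_anomalous)):
    print(label, intensity(1, atoms) - intensity(-1, atoms))
```

Only the noncentrosymmetric arrangement with a nonzero imaginary term gives I(h) ≠ I(−h); without Δf″, or with a center of symmetry, the Friedel pair remains equal, which is why the effect is useful for absolute configuration but only in noncentrosymmetric crystals.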

In structure solution, anomalous scattering without phase change is formally equivalent to isomorphous replacement. An anomalous-difference Patterson function is analogous to an isomorphous-difference Patterson and so contains peaks only for vectors between anomalously scattering atoms and vectors between anomalous scatterers and normal scatterers. Vectors between normal scatterers do not appear.

Anomalous dispersion with phase change (Δf″ ≠ 0) can be used to determine the phase angles from noncentrosymmetric crystals. In the case where the position of the anomalous scatterer is known, the procedure requires that differences in intensity between Friedel pairs be measurable for a significant number of reflections.

Anomalous dispersion provides an elegant means to distinguish among near neighbors in the periodic table that would otherwise have similar scattering power. Alloys such as β-brass (Cu–Zn) and Cu2MnAl were early examples of structures determined by this technique. The tunability of the synchrotron source allows choice of a wavelength close to the absorption edge of an element in the sample to maximize the anomalous component of scattering.

Neglecting to correct for anomalous dispersion will obviously introduce a small error in the magnitudes and in the phases of the calculated structure factors. It was long assumed that such errors would have little effect on atomic positions. However, Cruickshank and McDonald have pointed out that neglect of the correction will always cause errors in thermal parameters, and in the case of polar space groups very serious errors in coordinates can arise. The size of the error varies directly with Δf″ and inversely with the resolution of the data. For a moderately heavy atom such as Co (Z = 27), the error in coordinates can be as large as 0.06 Å in the experiment done with Cu radiation. Neglect of anomalous dispersion will cause errors of the order of 0.005 Å for structures with atoms no heavier than oxygen with Cu radiation or sulfur with Mo radiation. The error caused by including Δf″ and choosing the wrong enantiomer is twice as large.

F. Scaling of Data

Virtually all methods of solving structures require a reasonable estimate of the relative scale factor between the observed and calculated structure factors. Most computer programs that calculate scale and temperature factors are based on Wilson’s 1942 equation:

$$\langle I_{hkl} \rangle = \sum_i f_i^2$$

which says that the local average value of the intensity is equal to the sum of the squares of the scattering factors. It is assumed that the average is taken over a sufficiently narrow range of (sin θ)/λ so that the f values can be treated as constants:

$$I_{hkl} = k \sum_i f_i^2 \exp\left[-2B(\sin^2\theta/\lambda^2)\right]$$

$$I_{hkl} \Big/ \sum_i f_i^2 = k \exp\left[-2B(\sin^2\theta/\lambda^2)\right]$$

$$\log_e\left[I_{hkl} \Big/ \sum_i f_i^2\right] = \log_e k - 2B(\sin^2\theta/\lambda^2)$$

Thus, a plot of logₑ[Ihkl/Σᵢ fᵢ²] versus (sin² θ)/λ² will give a straight line of slope −2B and intercept logₑ k.

A number of conditions must hold so that the values of B and k are reasonable. The sampling interval should be small, so 40 to 50 intervals of (sin θ)/λ are required for three-dimensional datasets. Weak reflections must be included. Elimination of all reflections with I < 3σ(I) will cause the average intensity to be too high in the high ranges of (sin θ)/λ, and that will cause an underestimate of the temperature factor and a corresponding overestimate of the scale factor. Including weak reflections at half of the local minimum observed intensity is better than leaving them out, but a Bayesian fill is probably a better strategy. Large excursions on both sides of the best-fit line are quite common and reflect the facts that Wilson’s formula was derived for a random distribution of atoms and real structures contain many repetitions of certain bonded and nonbonded distances. The inflection points (the regions in the graph where the experimental points cross the straight line) are relatively constant for different types of structures. Hall and Subramanian recommend an “inflection point least squares” in which the least-squares line is fitted to 15 points: the five lowest angle points, five points nearest to (sin² θ)/λ² = 0.15, and five points nearest (sin² θ)/λ² = 0.26.
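A Wilson plot reduces to a straight-line fit of the logarithmic equation given above. The sketch below uses synthetic, noise-free shell averages so that the fit can be checked against known inputs; the shell values, the crude stand-in for Σf², and the function name are assumptions for illustration, and an ordinary least-squares fit over all shells is used rather than the inflection-point scheme of Hall and Subramanian.

```python
import numpy as np

def wilson_fit(s2, mean_intensity, sum_f2):
    """Fit log(I / sum f^2) = log k - 2B * (sin^2 theta / lambda^2).

    s2             : shell values of sin^2(theta) / lambda^2
    mean_intensity : local average intensity in each shell
    sum_f2         : sum of squared scattering factors in each shell
    Returns (B, k).
    """
    y = np.log(np.asarray(mean_intensity) / np.asarray(sum_f2))
    slope, intercept = np.polyfit(s2, y, 1)
    return -slope / 2.0, float(np.exp(intercept))

# Synthetic data: 45 shells, hypothetical B = 3.5 A^2 and k = 0.25
s2 = np.linspace(0.01, 0.30, 45)
sum_f2 = 2000.0 * np.exp(-6.0 * s2)   # crude stand-in for the fall-off of f^2
B_true, k_true = 3.5, 0.25
mean_I = k_true * sum_f2 * np.exp(-2.0 * B_true * s2)

B_fit, k_fit = wilson_fit(s2, mean_I, sum_f2)
print(B_fit, k_fit)
```

With real data the points scatter about the line for the reasons given in the text, so the recovered B and k are estimates whose quality depends on how the shells are chosen and on whether weak reflections are retained.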

G. Thermal Diffuse Scattering

Thermal diffuse scattering (TDS) arises mainly from low-frequency acoustic modes in the crystal. First- and second-order TDS cause the scattering density to peak under Bragg peaks, with the degree of peaking related to the velocity of sound in the crystal. The effect is not removed in normal data-reduction techniques and is different in different directions in the crystal. In normal structure determination, TDS is ignored. The result is a systematic decrease in apparent thermal parameters. Since TDS increases with increased (sin θ)/λ, it enhances the apparent intensity of high-order diffraction data.

In very precise work, such as the determination of charge-density distribution, it is extremely important that the effect be eliminated or accounted for. It should be noted that the amount of TDS included in a diffraction profile will generally be different for X-ray and neutron experiments, since it depends on such experimental conditions as primary beam divergence, wavelength spread, crystal dimensions, and counter aperture. Extensive calculations are required to correct for TDS, and most formulations demand that the elastic constants of the crystal be known. However, cooling the crystal can reduce the effect. If α is defined as I(TDS)/I(Bragg), cooling from room temperature to liquid-nitrogen temperature will reduce α by a factor of five. Cooling to liquid-helium temperature will reduce α by another factor of five. Facilities for X-ray diffraction experiments down to liquid-nitrogen temperature are fairly common. Helium cryostats are rare.

APPENDIX II: METHODS OF STRUCTURE SOLUTION

A. Trial-and-Error Methods

The earliest structures that were determined by X-ray diffraction were mineral structures with relatively high symmetry. Intensities were measured as strong, medium, and weak, and most of the atoms sat on special positions in the cell. From knowledge of the density of the material, its chemical formula, and the space group, one could postulate trial structures and see whether the pattern of intensities matched. The method cannot generally be applied to molecular structures in low-symmetry space groups.

FIGURE 8 Transform of a hexagon. The circle has a radius of 0.8 Å⁻¹. (a) Regular hexagon; (b) tilted hexagon; (c) geometric construction to determine tilt of hexagon in crystal space.

B. Transform Methods

A crystal can be regarded as the convolution of the lattice with the unit-cell contents. By the convolution theorem, the transform of a convolution is the product of the transforms. The diffraction pattern is the transform of the crystal structure and so must be the product of the delta function representing the reciprocal lattice and the transform of the unit-cell contents.

From the properties of a delta function we know that the product f(x) δ(x − x0) has values only at x0 since δ(x − x0) = 0 if x ≠ x0. Thus, the diffraction pattern of a crystal can be regarded as the transform of the unit-cell contents sampled at the points of the reciprocal lattice. This implies that a direct plot of the weighted reciprocal lattice can give some information about the structure. The method is used frequently to solve the structures of polynuclear aromatic hydrocarbons. In these compounds the dominant features of the diffraction pattern are the benzene transform and the fringe function showing the separation of the molecules. The molecules generally crystallize in centrosymmetric space groups with one short axis roughly perpendicular to the plane of the molecule.

Figure 8a shows the calculated transform of a regular hexagon, the transform of a benzene ring. It is characterized by positive and negative regions with strong positive peaks at a distance 0.8 Å⁻¹ from the origin. If the hexagon is tilted along an axis, the distance perpendicular to the axis of tilt will be foreshortened in crystal space. The corresponding distances in the transform will be elongated. Simple geometry will allow calculation of the angle of tilt.
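The position of the strong peaks in the hexagon transform can be checked with a few lines of code. The sketch below treats the ring as six identical point atoms of unit weight on a circle of radius 1.40 Å; that radius (approximately the benzene C–C distance) and the function name are assumptions made here, not values given in the text.

```python
import numpy as np

def hexagon_transform(sx: float, sy: float, radius: float = 1.40) -> float:
    """Transform of six unit point atoms on a regular hexagon at (sx, sy) in A^-1.

    The hexagon is centrosymmetric, so the transform is real and the
    sum of cosines is sufficient.
    """
    angles = np.radians(np.arange(0, 360, 60))
    xs = radius * np.cos(angles)
    ys = radius * np.sin(angles)
    return float(np.sum(np.cos(2.0 * np.pi * (sx * xs + sy * ys))))

# Look along a direction midway between two atoms (30 degrees from an atom).
direction = np.radians(30.0)
s_peak = 1.0 / (1.40 * np.cos(np.radians(30.0)))  # where all six waves are in phase

peak = hexagon_transform(s_peak * np.cos(direction), s_peak * np.sin(direction))
trough = hexagon_transform(0.5 * s_peak * np.cos(direction),
                           0.5 * s_peak * np.sin(direction))
print(s_peak, peak, trough)
```

Under these assumptions the in-phase condition falls at about 0.82 Å⁻¹, consistent with the quoted 0.8 Å⁻¹, and the point halfway out along the same direction lies in a negative region of the transform.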

FIGURE 9 (a) Transform of naphthalene with correctly oriented reciprocal lattice; inset shows naphthalene orientation used to calculate the transform. (b) Weighted reciprocal lattice of naphthalene for comparison with transform. Reflections such as 202 and 801, which lie on regions of the transform where density is changing, are much more sensitive to orientation than are those such as 203 that lie in the middle of strong areas.

Determination of the separation is relatively simple. Consider two centrosymmetric molecules related by a center of symmetry with a separation of molecular centers of 5 Å. Since the molecules are identical in shape and orientation, the combined transform will be that of a single molecule crossed by straight fringes. Regions in the combined transform are strong only if the corresponding region in the single transform is strong, but weak regions arise either from weak regions in the single transform or from the zeros of the fringe system. A line perpendicular to the fringes is the direction of the line joining the centers of molecules, and the separation of the molecules is reciprocal to the spacing of the fringes.
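The fringe argument can be written out explicitly. The following is a short worked derivation under the stated conditions (two molecules identical in shape and orientation); the symbols G_mol, G_pair, S, and d are introduced here for illustration, with the molecular centers placed at ±d/2 about the center of symmetry.

```latex
\begin{align*}
G_{\text{pair}}(\mathbf{S})
  &= G_{\text{mol}}(\mathbf{S})
     \left[ e^{\,i\pi \mathbf{S}\cdot\mathbf{d}} + e^{-i\pi \mathbf{S}\cdot\mathbf{d}} \right]
   = 2\, G_{\text{mol}}(\mathbf{S}) \cos\!\left(\pi\, \mathbf{S}\cdot\mathbf{d}\right), \\
\text{zeros of the fringe system:}\quad
  & \mathbf{S}\cdot\mathbf{d} = n + \tfrac{1}{2}, \qquad n = 0, \pm 1, \pm 2, \ldots \\
\text{fringe spacing:}\quad
  & \Delta S = \frac{1}{|\mathbf{d}|} = \frac{1}{5\ \text{\AA}} = 0.2\ \text{\AA}^{-1}.
\end{align*}
```

The cosine factor depends only on the component of S along d, so the fringes are straight and perpendicular to the line joining the molecular centers, as stated in the text.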

If the transform and the reciprocal lattice are drawn on the same scale, the correct relative orientation of one to the other can be established by matching strong areas of the diffraction pattern with strong areas in the transform. For fine adjustment of the orientation, attention must be paid to those reflections most sensitive to orientation effects, those lying on rising or falling regions of the transform. See, for example, reflections 202 and 801 in Fig. 9. By contrast, the 203 reflection lies well within a strong area of the transform and its value will not be affected by even fairly large changes in orientation.

C. Heavy-Atom Methods

In X-ray diffraction, the scattering power of an atom is proportional to the square of the atomic number, Z². If a molecule contains a heavy atom (high Z) and that atom can be located, then a set of phase angles can be calculated for the dataset that are approximations to the true phases. A Fourier synthesis calculated with observed structure amplitudes and phases appropriate to the heavy atom will give a map that contains the heavy atom, some light atoms, and some noise. Phases recalculated from all the atom positions known at that stage are better estimates of the true phases than those from the heavy atom alone. The iterative procedure is repeated until all atoms are located.

The location of the heavy atom can be determined from the Patterson function:

$$P(u,v,w) = \frac{2}{V}\sum_h \sum_k \sum_l |F_{hkl}|^2 \cos 2\pi(hu + kv + lw) = \int_{\text{cell}} \rho(x,y,z)\,\rho(x+u,\,y+v,\,z+w)\,dV$$

The Patterson function can be calculated directly from the intensities with no previous knowledge of the phases. It is the self-convolution of the electron density. This means that a peak at uvw represents a vector between two atoms whose separation is equal to the vector distance from the origin to the point uvw. The weight of that peak is proportional to the product of the atomic numbers of the atoms at each end of the vector. Vectors between heavy atoms tend to dominate these maps and hence allow the position of the heavy atom to be determined.

Figure 10 shows a hypothetical molecule with one heavy atom and its Patterson function. Note that the Patterson function is always centrosymmetric. If the heavy atom were iodine and the light atoms were carbons, I–I vectors would have weight 2809; I–C vectors, 318; and C–C vectors, 36.
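The vector set of a small "molecule" and the peak weights quoted above can be generated directly. The sketch below uses three atoms (one iodine, two carbons) at hypothetical two-dimensional fractional coordinates; the coordinates and the function name are assumptions for illustration, and peak shapes and lattice translations are ignored.

```python
from itertools import permutations

Z = {"I": 53, "C": 6}

# Hypothetical three-atom "molecule" in 2-D fractional coordinates
atoms = [("I", (0.10, 0.10)), ("C", (0.30, 0.20)), ("C", (0.20, 0.40))]

def patterson_peaks(atoms):
    """Off-origin Patterson peaks as (interatomic vector, weight Z1*Z2)."""
    peaks = []
    for (el1, r1), (el2, r2) in permutations(atoms, 2):
        vector = tuple(round(c2 - c1, 6) for c1, c2 in zip(r1, r2))
        peaks.append((vector, Z[el1] * Z[el2]))
    return peaks

peaks = patterson_peaks(atoms)
origin_weight = sum(Z[el] ** 2 for el, _ in atoms)  # the N self-vectors pile up at the origin

for vector, weight in sorted(peaks, key=lambda p: -p[1]):
    print(vector, weight)
print("origin peak weight:", origin_weight)
```

For N = 3 atoms there are N(N − 1) = 6 off-origin peaks, and every vector appears together with its negative at the same weight, which is the centrosymmetry noted in the text. The self-vector of the iodine atom contributes 53² = 2809 to the origin peak.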

If two molecules are related by a center of symmetry as in Fig. 11, then in addition to the intramolecular vectors near the origin of the vector maps, there are intermolecular vectors. The I–C and C–C vectors are double-weight vectors because both molecules give the same pattern of vectors. However, the I–I vectors, which represent vectors between iodine atoms across the center of symmetry, are single-weight peaks.


FIGURE 10 Patterson function. (a) Three-atom “molecule” and (b) its vector set. If there are N peaks in the Fourier, there are N² peaks in the Patterson function. Of these, N are superimposed at the origin and N(N − 1) are distributed through the cell. The atom marked O is heavy; X indicates a heavy-atom–light-atom vector; ✔ indicates a light-atom–light-atom vector.

If the reader were to make a copy of Fig. 11 on transparent paper, place the origin of the transparent map on an I–I vector, and mark the places where the two maps overlap with a mark corresponding to the lower intensity in the overlapping functions, the original structure would be recovered. This is the basis for the “difference function,” one of the methods for recovering the electron density from the Patterson function. If the heavy-atom peak had been a double-weight peak, two superpositions would be necessary to recover the original function.

FIGURE 11 Symmetry in the Patterson function. (a) Two molecules related by a center of symmetry, and (b) the corresponding vector set. A separate symbol marks each heavy-atom–heavy-atom vector. Note that peaks corresponding to a vector distance between an atom and its symmetry mate are single-weight peaks. All other peaks are double weight.

There are constraints on the relative size of the scattering contribution of the heavy atoms. If the heavy atom is too light, phases calculated from its position are poor estimates of the true phases and it may be very difficult to find correct atom positions in a very noisy Fourier map. If it is too heavy, the scattering of the heavy atom will dominate to such an extent that the precision of the light-atom parameters may be seriously affected.

The rule of thumb is that the ratio $\sum Z^2_{\mathrm{heavy}} / \sum Z^2_{\mathrm{light}}$ should be approximately 1; however, the method will tolerate large deviations in either direction. For instance, the structure of vitamin B12 (C63H88N14O14PCo · H2O) was solved using the phases from the cobalt atom as a starting point. The $Z^2$ ratio is about 0.17!
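The quoted ratio for vitamin B12 can be checked with a few lines of arithmetic. This sketch assumes that the water of crystallization is counted with the light atoms and that cobalt is the only heavy atom; the helper name is illustrative.

```python
# Atomic numbers of the elements in the formula quoted above.
Z = {"H": 1, "C": 6, "N": 7, "O": 8, "P": 15, "Co": 27}

# C63 H88 N14 O14 P Co . H2O, with the water folded into the light-atom counts
# (an assumption about how the ratio was evaluated).
light = {"C": 63, "H": 88 + 2, "N": 14, "O": 14 + 1, "P": 1}
heavy = {"Co": 1}

def sum_z2(composition):
    """Sum of Z^2 over all atoms in a composition dictionary."""
    return sum(count * Z[element] ** 2 for element, count in composition.items())

ratio = sum_z2(heavy) / sum_z2(light)
print(round(ratio, 2))  # 729 / 4229
```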

A Sim-weighted Fourier is a Fourier series phased by the known portion of the structure with coefficients weighted according to the probability that the phase is correct. This is a very useful technique for improving the signal-to-noise ratio in poorly phased Fourier maps.

D. Isomorphous Replacement

Two compounds are perfectly isomorphous if the only difference in their electron-density maps corresponds to the site of a replaceable atom. The method requires two isomorphous derivatives in the centrosymmetric case and three or more isomorphous derivatives in the noncentrosymmetric case. As direct methods have improved, the use of this method for general organic structures has declined; however, many all-protein structures have been solved by this method or by the combination of isomorphous replacement and anomalous scattering.

The centrosymmetric case is straightforward. Consider two derivatives, A and B, for which the light-atom portions are identical but the replaceable atoms are different. Then,

$$F_A = F_L + F_{AR}$$

$$F_B = F_L + F_{BR}$$

$$\Delta F_{AB} = F_A - F_B = F_{AR} - F_{BR}$$

The magnitudes $|F_A|$ and $|F_B|$ are available from the data collection; the magnitudes and signs of the replaceable components are available once the positions of the replaceable atoms are known. Thus, the four possible sign combinations corresponding to the left-hand side of the third equation are calculated, and the combination that gives best agreement with the right-hand side is accepted. Reflections whose phases are not well determined are omitted from the Fourier synthesis.
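The sign test just described can be sketched as a search over the four combinations. The function name and the numerical values are hypothetical; the misfit of the runner-up is returned so that poorly determined reflections can be recognized and omitted.

```python
from itertools import product

def best_signs(abs_fa, abs_fb, f_ar, f_br):
    """Try the four sign combinations for F_A and F_B and rank them by how well
    s_A|F_A| - s_B|F_B| reproduces the calculated F_AR - F_BR."""
    target = f_ar - f_br
    trials = []
    for s_a, s_b in product((+1, -1), repeat=2):
        misfit = abs((s_a * abs_fa - s_b * abs_fb) - target)
        trials.append((misfit, s_a, s_b))
    trials.sort()
    return trials[0], trials[1]  # best combination and the runner-up

# Hypothetical reflection: F_L = -30, F_AR = +50, F_BR = +12,
# so that F_A = +20 and F_B = -18.
(best_misfit, s_a, s_b), (next_misfit, _, _) = best_signs(20.0, 18.0, 50.0, 12.0)
print(s_a, s_b, best_misfit, next_misfit)
```

A small gap between the best and second-best misfit indicates a reflection whose sign is not well determined.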

In the noncentrosymmetric case, the use of two isomorphous derivatives leads to a twofold ambiguity in phase and it is necessary to have a third derivative with a heavy atom in a different position in order to resolve the ambiguity. This is illustrated in Fig. 12. Again, let


FIGURE 12 Isomorphous replacement. (a) The difference $\Delta F_{AB}$ between the structure factors for the two isomorphs A and B provides the center for $|F_B|$. (b) The ambiguity in phase can be resolved by a third derivative.

$$F_A = F_L + F_{AR}$$

$$F_B = F_L + F_{BR}$$

$$\Delta F_{AB} = F_{AR} - F_{BR}$$

The circle of radius $|F_A|$ represents all possible values for the phase of $F_A$. The magnitude and phase of $\Delta F_{AB}$ are indicated by the vector, and a circle of radius $|F_B|$ is drawn using the tip of $\Delta F_{AB}$ as center. The two points of intersection correspond to the phase combinations that satisfy the third equation. A third derivative introduces a new equation,

$$\Delta F_{AC} = F_{AR} - F_{CR}$$

Drawing a third circle with radius $|F_C|$ and center at $\Delta F_{AC}$ resolves the ambiguity.
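The circle construction can be sketched numerically with complex numbers standing in for the vectors of Fig. 12. This is a minimal illustration under stated assumptions: the structure factors are synthetic, error-free values invented for the example, and the function names are hypothetical.

```python
import cmath
import math

def circle_intersections(c0, r0, c1, r1):
    """Intersection points of two circles in the complex plane."""
    d = abs(c1 - c0)
    if d == 0 or d > r0 + r1 or d < abs(r0 - r1):
        return []
    a = (r0 ** 2 - r1 ** 2 + d ** 2) / (2 * d)  # distance from c0 to the chord
    h = math.sqrt(max(r0 ** 2 - a ** 2, 0.0))    # half-length of the chord
    unit = (c1 - c0) / d
    foot = c0 + a * unit
    return [foot + 1j * h * unit, foot - 1j * h * unit]

def solve_phase(abs_fa, abs_fb, abs_fc, d_ab, d_ac):
    """Pick the F_A candidate from circles A and B that also lies on circle C."""
    candidates = circle_intersections(0j, abs_fa, d_ab, abs_fb)
    return min(candidates, key=lambda f: abs(abs(f - d_ac) - abs_fc))

# Synthetic data (hypothetical): light-atom part plus three replaceable-atom parts.
f_l = cmath.rect(40.0, 1.0)
f_ar, f_br, f_cr = cmath.rect(15.0, 0.3), cmath.rect(25.0, 2.0), cmath.rect(20.0, -1.2)
f_a, f_b, f_c = f_l + f_ar, f_l + f_br, f_l + f_cr

best = solve_phase(abs(f_a), abs(f_b), abs(f_c), f_ar - f_br, f_ar - f_cr)
print(cmath.phase(best), cmath.phase(f_a))  # recovered and true phase of F_A
```

With measured data the three circles do not meet exactly in one point, so the candidate with the smallest misfit is taken rather than an exact intersection.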

E. Anomalous Scattering

When the energy of the radiation used in the experiment lies near an absorption edge for one (or more) of the atoms in the crystal, the scattering factor for that atom becomes complex:

$$f = f_0 + f' + i f''$$

In this circumstance, Friedel’s law breaks down and $|F_{hkl}|^2 \neq |F_{\bar{h}\bar{k}\bar{l}}|^2$. For centrosymmetric crystals, the data are measured twice: once with radiation for which the heavy atom scatters normally, and once with radiation for which it scatters anomalously. The two sets of data must be scaled carefully because the differences tend to be small. If the imaginary component is small, then anomalous scattering is exactly analogous to isomorphous replacement.

The results can be shown in a Harker diagram. In the centrosymmetric case with only one type of anomalous scatterer (Fig. 13a), the first circle (i.e., $F_N$) corresponds to the amplitude of the structure factor with no anomalous contribution. The second circle ($F_{NH}$) corresponds to the amplitude of the structure factor with an anomalous contribution. $F_{H'}$ and $F_{H''}$ represent the real and imaginary components of the contribution of the anomalous scatterers. Since there is only one anomalous scatterer, the vectors $F_{H'}$ and $F_{H''}$ are perpendicular. The vector sum of $F_{H'}$ and $F_{H''}$ becomes the center for the $F_{NH}$ structure amplitude. The circles intersect in two places but there is no ambiguity, since the phase must lie on the real axis.

In the noncentrosymmetric case (Fig. 13b), the same procedure again leaves us with a phase ambiguity. In structures of moderate size, it may be sufficient to choose the solution near the phase of the heavy atom. This will not always be the correct choice but frequently leads to the correct solution.

In protein structures, the heavy atom is generally far too light to use to discriminate between the two choices. The ambiguity can be resolved by using the other member of the Bijvoet pair (Fig. 13c) or by using data from an isomorphous derivative (not shown). In any case, three datasets are required.

An interesting variation on the method is possible using synchrotron radiation. If the protein contains one anomalous scatterer, the tunability of the source can be exploited to collect datasets at different wavelengths. Some protein structures contain a large number of atoms whose anomalous dispersion corrections are so weak they may be neglected. If they also contain only a few anomalously scattering atoms all of one type, a simple special case of the general system of equations results:

$$|F_{\lambda h}|^2 = |F^n_{1,h}|^2 + [1 + Q(Q + 2\cos\delta_{\lambda 2})]\,|F^n_{2,h}|^2 + 2(1 + Q\cos\delta_{\lambda 2})\,|F^n_{1,h}|\,|F^n_{2,h}|\cos(\phi^n_{1,h} - \phi^n_{2,h}) + 2Q\sin\delta_{\lambda 2}\,|F^n_{1,h}|\,|F^n_{2,h}|\sin(\phi^n_{1,h} - \phi^n_{2,h})$$


FIGURE 13 Anomalous dispersion. The first circle is $F_N$. The vectors $F_{H'}$ and $F_{H''}$ are drawn and their vector sum becomes the center for $F_{NH}$. This gives a twofold ambiguity, which can be resolved in the (a) centrosymmetric case by noting that the phase must lie on the real axis. In the (b) noncentrosymmetric case, the ambiguity can be resolved by using the negative anomalous component as shown in (c) or an isomorphous derivative (not shown). $F^-_{NH}$ and $F^+_{NH}$ refer to $F_{\bar{h}\bar{k}\bar{l}}$ and $F_{hkl}$, respectively.

where Q is the ratio $f^a_{\lambda 2}/f^n_{2,h}$, $|F^n_{1,h}|$ is the magnitude of the structure factor for the normally scattering atoms, $|F^n_{2,h}|$ is the normal part of the structure factor for the anomalously scattering atoms, and $\phi^n_{1,h}$ and $\phi^n_{2,h}$ are the associated phase angles. There is a second equation for the Friedel mate, and it is the same as the previous one except for a minus sign before the last term. Thus, there will be two independent equations for each wavelength at which data are collected, plus a third equation resulting from the trigonometric identity $\sin^2 x + \cos^2 x = 1$. This set of equations can be solved if data are collected at two or more wavelengths. Generally, one dataset is collected at energies below the absorption edge so that $\Delta f'' = 0$; a second set is collected above the absorption edge, with $\Delta f'' \neq 0$ and $\Delta f'$ having the same value as for the first experiment. The actual values for the real and imaginary parts of the scattering factor are determined experimentally before the choice of wavelengths is made. The obvious advantage is that the isomorphism is exact. In principle, the data can be collected on one crystal; in practice, radiation damage will require the use of several different crystals. A generalization of the theory can take into account any number of types of anomalous scatterers and any number of anomalous scatterers within each type.
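The special-case equation and its Friedel mate can be checked against a direct complex sum. The sketch below rests on an assumption that is not spelled out in the text: that the anomalous scatterers contribute $F^n_{2,h}(1 + Q e^{i\delta})$, with δ the phase of the anomalous part of the scattering factor. The function names and numerical values are hypothetical.

```python
import cmath
import math

def mad_intensity(f1, f2, dphi, q, delta, friedel_mate=False):
    """|F|^2 from the special-case equation; dphi = phi1 - phi2 in radians."""
    sign = -1.0 if friedel_mate else 1.0
    return (
        f1 ** 2
        + (1 + q * (q + 2 * math.cos(delta))) * f2 ** 2
        + 2 * (1 + q * math.cos(delta)) * f1 * f2 * math.cos(dphi)
        + sign * 2 * q * math.sin(delta) * f1 * f2 * math.sin(dphi)
    )

def direct_intensity(f1, phi1, f2, phi2, q, delta):
    """Assumed model: F = F1 + F2 * (1 + Q exp(i delta))."""
    total = cmath.rect(f1, phi1) + cmath.rect(f2, phi2) * (1 + q * cmath.exp(1j * delta))
    return abs(total) ** 2

# Hypothetical values chosen only for illustration.
f1, phi1, f2, phi2, q, delta = 100.0, 0.7, 20.0, 2.1, 0.15, 1.2
plus = mad_intensity(f1, f2, phi1 - phi2, q, delta)
minus = mad_intensity(f1, f2, phi1 - phi2, q, delta, friedel_mate=True)

print(plus, direct_intensity(f1, phi1, f2, phi2, q, delta))
print(plus - minus)  # Bijvoet difference: 4 Q sin(delta) |F1| |F2| sin(phi1 - phi2)
```

Under this model the Friedel mate is obtained by reversing the signs of both normal phases while leaving δ unchanged, which changes only the sign of the sine term.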

F. Direct Methods

Methods of structure solution that attempt to evaluate the phases of the structure factors without recourse to structural information are known as direct methods. It is not necessary to determine all of the phases. In general, about 10× the number of non-hydrogen atoms in the cell is sufficient.

Virtually all of the direct-method programs currently available make use of normalized structure factors, E(h), which “correct” the structure factor F(h) for fall-off with angle caused by the temperature factor and the scattering factors of the atoms,

$$E(h) = F(h) \Big/ \Big(\varepsilon_h \sum_i f_i^2\Big)^{1/2}$$

If the shapes of the scattering curves are similar, the values of E(h) can be calculated from the relation:

$$E(h) = \varepsilon_h^{-1/2}\, \sigma_2^{-1/2} \sum_{j=1}^{N} Z_j \exp(2\pi i\, h \cdot r_j)$$

where N is the number of atoms in the cell, and $\varepsilon_h$ is the average intensity multiple of the hth reflection. The average value of $|E(h)|^2$ is 1. The most frequently used relationship in direct methods for centrosymmetric structures is the $\Sigma_2$ or triplet relationship:

$$S(E_h) \approx S\Big(\sum_k E_k E_{h-k}\Big)$$

where $S(E_h)$ is the sign of the reflection hkl, and ≈ means “probably equal to.” The probability associated with this relationship is


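$$P_+(E_h) = \tfrac{1}{2} + \tfrac{1}{2}\tanh\Big(\sigma_3\, \sigma_2^{-3/2}\, |E_h| \sum_k E_k E_{h-k}\Big)$$

where $\sigma_n = \sum_{i=1}^{N} Z_i^n$ and Z is the atomic number. For example,

h = 6 3 3 and k = 7 9 0

h − k = $\bar{1}\,\bar{6}\,3$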

In this example from a centrosymmetric structure, if all three reflections are strong, the sign of F(633) is probably equal to the product of the signs of the other two; if those two have the same sign, F(633) is probably positive. The probability increases as the magnitude of the normalized structure factors increases. The steps in solving a centrosymmetric structure (where the phases can have values of 0 or π) are

1. Evaluate E terms. This includes calculation of the normalized structure factors and sorting of the reflections among eight subgroups defined by the parity of the h, k, and l indices.

2. Form $\Sigma_2$ relationships with the strongest reflections: those that have E values greater than some arbitrary value such as 1.2 or 1.5.

3. Determine phase. Historically, phases were determined by the symbolic addition method. In this method, origin-determining reflections are given signs and a few others, chosen from the strongest |E| values with the most $\Sigma_2$ interactions, are given symbols (a, b, c, etc.). Signs can be used because, in the case of a centrosymmetric structure, the phases can only be 0 or π. Thus, $S_h$ means the sign of the structure factor for the h reflection. The value for $S_h$ could arise in one of several ways. One way is its assignment as described above. Alternatively, $S_h$ could acquire a value through the triplet relationship as follows: If $S_k$ is known to be positive, and $S_{h-k}$ is known to be a, then:

$$S_h = S_k S_{h-k} = a$$

During the course of the analysis, relationships appear among the symbols, such as ac = e. Manipulation of these relationships usually allows the number of unknowns to be reduced at the end of phase determination.

4. Calculate the E map: a Fourier summation using E values as coefficients and phases determined in step 3.
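The sign probability used in steps 2 and 3 can be evaluated directly from the formula for $P_+$. The following is a minimal sketch: the function name, the 20-carbon cell contents, and the E values are hypothetical.

```python
import math

def sigma2_probability(e_h, pairs, z_list):
    """P+ for reflection h from the Sigma-2 relationship.

    e_h    : normalized structure factor magnitude of reflection h
    pairs  : list of signed (E_k, E_{h-k}) values contributing to h
    z_list : atomic numbers of all atoms in the cell
    """
    sigma2 = sum(z ** 2 for z in z_list)
    sigma3 = sum(z ** 3 for z in z_list)
    total = sum(e_k * e_hk for e_k, e_hk in pairs)
    return 0.5 + 0.5 * math.tanh(sigma3 * sigma2 ** -1.5 * abs(e_h) * total)

# Hypothetical equal-atom cell with 20 carbon atoms; for N equal atoms the
# factor sigma3 * sigma2**(-3/2) reduces to N**(-1/2).
cell = [6] * 20
p_same = sigma2_probability(2.5, [(2.2, 2.0)], cell)       # contributors of like sign
p_opposite = sigma2_probability(2.5, [(2.2, -2.0)], cell)  # contributors of unlike sign
print(p_same, p_opposite)
```

Because tanh is an odd function, reversing the sign of the product turns $P_+$ into $1 - P_+$, and weak reflections give probabilities near 1/2, which is why only large E values are used.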

In the noncentrosymmetric case, the solution is more difficult since the phase can take any value between 0 and 2π. Hence, a different set of relationships was developed for this case:

$$\phi_h = \langle \phi_k + \phi_{h-k} \rangle_{k_r}$$

$$\phi_h = \frac{\sum_{k_r} |E_k E_{h-k}|\,(\phi_k + \phi_{h-k})}{\sum_{k_r} |E_k E_{h-k}|}$$

The symbol $k_r$ implies that k ranges only over those vectors associated with large |E| values.

The process of choosing initial origin-determining phases is similar to the centrosymmetric case, but an additional enantiomorph-determining reflection must also be specified. The symbols are assigned in the same way as before and result in assignments such as $\phi_h = 2a - b$. These are converted to numerical values, and each set is then expanded and refined using the tangent formula. The first computer programs for structure solution used this method. The symbolic addition procedure generates only a few alternatives for the values of the phases which must be considered because the number of resulting unknown symbols is usually no more than three or four.

In multisolution (i.e., multitrial) methods, a small number of phases are assigned arbitrarily to fix the origin and, in the case of noncentrosymmetric space groups, the enantiomorph. Additional reflections are each assigned many different starting values in the hope that one (or more) of these sets of starting conditions will lead to a solution. Some programs use random-number generators to set starting values for some 20 to 200 phases, which are then extended and refined by the tangent formula:

$$\tan\phi_h = \frac{\sum_k w_k\,|E_k E_{h-k}| \sin(\phi_k + \phi_{h-k})}{\sum_k w_k\,|E_k E_{h-k}| \cos(\phi_k + \phi_{h-k})}$$

The weighting function in the tangent formula is useful in some approaches. A new phase $\phi_h$ is assigned a weight that is the minimum of $0.2\alpha_h$ and 1.0:

$$w_0 = \min(0.2\alpha_h,\ 1.0)$$

Although this allows rapid development of a phase set for each trial, it tends to lead to an incorrect centrosymmetric solution in the case of polar space groups. For Hall–Irwin weights, $w = w_0 f(\alpha/\alpha_{\mathrm{est}})$, where f = 1.0 for $\alpha < \alpha_{\mathrm{est}}$ and decreases for $\alpha > \alpha_{\mathrm{est}}$. This weighting tends to conserve the enantiomorph and has been incorporated into more recent versions of the multisolution programs. The major difference in treatments for centric and acentric datasets lies in the values assigned to the extra reflections in the starting set. In the centric case, the possible values are 0 and π; in the acentric case, general reflections have four possible values, ±π/4 and ±3π/4.
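A single application of the tangent formula can be sketched as follows. The function names and the input values are hypothetical; `atan2` is used so that the quadrant of the new phase follows from the signs of the numerator and denominator, and the weight helper implements only the simple $w_0$ rule quoted above, with $\alpha_h$ supplied by the caller.

```python
import math

def tangent_phase(terms):
    """New phase for reflection h from the tangent formula.

    terms: list of (w_k, |E_k E_{h-k}|, phi_k, phi_{h-k}), phases in radians.
    """
    terms = list(terms)
    num = sum(w * e * math.sin(pk + phk) for w, e, pk, phk in terms)
    den = sum(w * e * math.cos(pk + phk) for w, e, pk, phk in terms)
    return math.atan2(num, den) % (2 * math.pi)

def starting_weight(alpha_h):
    """w0 = min(0.2 * alpha_h, 1.0)."""
    return min(0.2 * alpha_h, 1.0)

# Hypothetical contributors: two equally weighted pairs whose phase sums
# are 0.2 and 0.4 rad, so the formula should return their mean, 0.3 rad.
terms = [(1.0, 4.0, 0.1, 0.1), (1.0, 4.0, 0.5, -0.1)]
phi_h = tangent_phase(terms)
print(phi_h, starting_weight(3.0), starting_weight(8.0))
```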

A recent modification to the multisolution approach consists of phase refinement alternated with the imposition of constraints by peak picking in real space. This approach, referred to as a dual-space method, differs from other multisolution methods in that the phases are initially assigned values by computing structure factors for a randomly positioned set of atoms. The occurrence of two Fourier transformations per cycle results in an algorithm that is computationally intensive. However, the new approach also extends the limits of these programs to much larger structures. It has proven to be capable of solving complete structures containing as many as 2000 independent non-hydrogen atoms (provided that accurate diffraction data have been measured to a resolution of 1.2 Å or better and some heavy atoms are present in the structure).

There has been progress in the experimental evaluation of triplet-phase invariants. The phenomenon of simultaneous diffraction has long been considered a nuisance in single-crystal work. If three reflections lie on the sphere of reflection simultaneously, there is a power transfer that tends to enhance such weak reflections at the expense of strong ones. In 1977, it was discovered that the shape of the simultaneous diffraction profile is sensitive to the phase of the triplet that gave rise to it. The sense of the asymmetry is opposite for positive and negative triplets. This effect has been observed with ordinary mosaic crystals of relatively heavy scatterers (ZnWO4) using CuKα radiation from a fine-focus tube. This technique could be enhanced by the use of synchrotron radiation, since it is desirable that the beam be monochromatic, intense, and highly collimated. It is envisioned that experimental phase determination could be used to establish a starting set of 50 or so triplets and that the tangent formula or some similar technique could then expand these relationships.

APPENDIX III: METHODS OF REFINEMENT

Once a trial structure has been proposed, improvements in the values of the parameters are sought so that the model corresponds as closely as possible to reality. Exact agreement between observed and calculated structure factors would yield a difference Fourier synthesis that was flat and an R-value of zero. The methods of refinement discussed here are in the context of small molecules where the ratio of reflections to parameters is commonly of the order of 10:1. Such a degree of overdetermination does not exist in protein structures. The modification of these techniques that would be required to handle proteins is not discussed in this review.

A. Difference Fourier

A Fourier synthesis with coefficients $\Delta F = |F_o| - |F_c|$ reflects differences between the crystal and the model. The difference map is routinely used during structure solution to check the integrity of the model as it is being developed. Large peaks in the map (Z/3 to Z/2) correspond to atoms not yet included in the model. Smaller peaks may indicate a slightly misplaced atom or wrong scattering type or incorrect thermal parameters. Holes indicate electron density in the model where none exists in the crystal (see Fig. 14).

It is common to use difference maps to find hydrogen atoms once the positions of the heavier atoms have been refined isotropically. In centrosymmetric structures, the phases are either positive or negative and are usually correct for all but the smallest structure factors by this stage. Hydrogen atoms will have electron densities in the range 0.6 to 0.9 e Å⁻³. Phase errors in the noncentrosymmetric case cause difference maps to be less well defined. At the end of refinement, a difference map should show no significant features.

FIGURE 14 Difference map. Comparison of observed, calculated, and difference electron density for (a) an atom displaced by 0.1 Å from its correct position, and (b) a temperature factor overestimated. (c) Difference density observed when an isotropic temperature factor is used in the model and the thermal vibration of the crystal is actually anisotropic. Dashed contours are negative.

Since it is much faster to calculate a Fourier synthesis than to carry out least squares, refinement by difference synthesis was popular before the advent of modern computers. For small- and medium-sized molecules (<400 atoms), least squares is the preferred method of refinement, and difference maps are used mainly for error detection.

B. Least-Squares Analysis

The set of parameters that minimizes the sum of the squares of the difference between the observed and calculated values of the structure factors is the most satisfactory set. The function minimized is $\sum_{hkl} \omega \Delta^2$, where $\Delta = |F_o| - |F_c|$ and ω is the weight. The minimization condition is

$$\sum_{hkl} \omega \Delta \frac{\partial |F_c|}{\partial p_j} = 0$$

However, the relationship between Δ and the parameters is nonlinear. For a set of parameters close to the true values, Δ may be expanded as a function of the parameters by a Taylor series truncated to first order:

$$\Delta(p + \varepsilon) = \Delta(p) - \sum_{i=1}^{n} \varepsilon_i \frac{\partial |F_c|}{\partial p_i}$$

where p stands for the whole set of n parameters, $\varepsilon_i$ is a small change in the ith parameter, and the minus sign reminds us that $\Delta = |F_o| - |F_c|$, and it is $F_c$ that must be changed.

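The normal equations have the form:

$$\sum \omega \left(\frac{\partial |F_c|}{\partial p_1}\right)^2 \varepsilon_1 + \sum \omega \frac{\partial |F_c|}{\partial p_1} \frac{\partial |F_c|}{\partial p_2}\, \varepsilon_2 + \cdots = \sum \omega \Delta \frac{\partial |F_c|}{\partial p_1}$$

$$\sum \omega \frac{\partial |F_c|}{\partial p_1} \frac{\partial |F_c|}{\partial p_2}\, \varepsilon_1 + \sum \omega \left(\frac{\partial |F_c|}{\partial p_2}\right)^2 \varepsilon_2 + \cdots = \sum \omega \Delta \frac{\partial |F_c|}{\partial p_2}$$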

The left-hand side yields an n × n matrix, symmetrical about its diagonal. Computer programs store only the upper triangular half of this matrix. For a 30-atom structure there are 90 positional parameters, 180 anisotropic temperature parameters, and a scale factor. Thus, n = 271. Evaluation of the left-hand side involves evaluating 271 × 272/2 = 36,856 distinct terms (271 on the diagonal and 271 × 270/2 = 36,585 off the diagonal), with the sum over all reflections formed for each term. The matrix is then inverted to solve for the shifts in the parameters. These shifts are applied and the process continues in an iterative manner.
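The iterative cycle can be sketched on a two-parameter toy problem. The model $|F_c| = k \exp(-B x)$, with a scale factor k and a single temperature-factor-like parameter B, is an assumption made only to keep the matrix 2 × 2; the data are synthetic and error-free, and the function name is hypothetical.

```python
import math

def refine(x, f_obs, k, b, weights=None, cycles=10):
    """Least-squares refinement of the toy model |Fc| = k * exp(-b * x).

    Each cycle builds the 2 x 2 normal equations from the first-order Taylor
    expansion, solves them for the shifts, and applies the shifts.
    """
    w = weights or [1.0] * len(x)
    for _ in range(cycles):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for xi, fo, wi in zip(x, f_obs, w):
            e = math.exp(-b * xi)
            fc = k * e
            delta = fo - fc          # Delta = |Fo| - |Fc|
            d1 = e                   # d|Fc|/dk
            d2 = -xi * fc            # d|Fc|/db
            a11 += wi * d1 * d1
            a12 += wi * d1 * d2
            a22 += wi * d2 * d2
            g1 += wi * delta * d1
            g2 += wi * delta * d2
        det = a11 * a22 - a12 * a12  # symmetric matrix: only a11, a12, a22 are stored
        eps1 = (g1 * a22 - a12 * g2) / det
        eps2 = (a11 * g2 - a12 * g1) / det
        k, b = k + eps1, b + eps2
    return k, b

# Synthetic "observations" generated from k = 10, B = 0.5 (hypothetical values).
x = [0.0, 0.5, 1.0, 1.5, 2.0]
f_obs = [10.0 * math.exp(-0.5 * xi) for xi in x]
k_fit, b_fit = refine(x, f_obs, k=8.0, b=0.3)
print(k_fit, b_fit)
```

For a real structure the same sums are accumulated over all reflections for every pair of parameters, which is what makes the n = 271 case so much more expensive than this sketch.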

C. Constrained Refinement

There are a number of reasons for wishing to add constraints to a refinement, such as when the resolution is limited, the structure is disordered, or light atoms are being refined in the presence of heavy ones. Examples of where this type of refinement has been used are inorganic coordination compounds with triphenylphosphine as a ligand. The benzene ring can be constrained to the shape of a regular hexagon with attached hydrogen atoms in their correct positions. Only six positional parameters need be refined: three for the center of mass and three for the orientation angles. The model for thermal motion may allow individual thermal parameters for each atom of the group, or a single temperature parameter for the whole group.

The advantage of using rigid-body constraints is that the number of parameters is reduced. In addition, calculation of the coefficients of the derivatives of the individual atom coordinates with respect to the rigid-body coordinates is performed only once (not for each reflection). Thus, the refinement will require less computer storage and less computer time. As an added bonus, the refinement tends to converge more quickly than an unconstrained refinement. However, it must be kept in mind that a group constrained to an incorrect geometry will result in a structure that is systematically wrong. If the number and quality of data will allow, the constraints may be released at the end of the refinement. Hamilton’s R-value ratio test may help to determine whether the change in R-value on release of constraints is significant.

The “hard” constraints described above reduce the number of parameters needed to solve the structure and hence reduce the size of the matrix. “Soft” constraints do not change the size of the matrix but increase the number of observational equations. Bond lengths and angles are permitted to deviate from standard values by a fixed amount, or a ring may be constrained to be planar. This technique can be very useful in dealing with highly correlated parameters.

In general, the addition of constraints at the end of refinement will cause the R-value to rise slightly. Strict enforcement of an inappropriate constraint will increase the effect. It is unusual for constraints to be in place during the whole refinement; if they are, it is important that they be weighted appropriately.

D. Thermal Motion Analysis

Most modern crystal-structure analyses refine values for the anisotropic vibrational parameters $U_{ij}$ for individual atoms in the structure. This model assumes that the vibrations are harmonic and that vibrations of neighboring atoms are uncorrelated. If systematic effects such as absorption and extinction have been ignored, if the quality of the crystals is poor, or if the weighting scheme is inappropriate, these parameters may have little physical meaning.

Hirshfeld has proposed a convenient test for reasonableness of a set of vibration parameters. Since bond-stretching vibrations can be assumed to have much higher energy than other intramolecular modes such as torsion or angle bending, bonds can be considered “rigid” to a first approximation. If two atoms A and B are bonded, it is expected that the component of vibration along the bond will be the same for both atoms. For carbon atoms, agreement within 0.001 Å² is expected for low-temperature structure determinations where bonding effects have been accounted for. In the spherical atom approximation, the discrepancy is more likely to be of the order of 0.005 Å². In molecular crystals, vibrations of neighboring atoms are correlated to some degree. For relatively rigid molecules, translational and vibrational oscillations are the major contributors to the internal modes of vibration. If the rigid body model is a good approximation, it is possible to “correct” bond lengths for thermal vibration.

The reason for the correction can be seen in Fig. 15. If a molecule is undergoing librational motion, with a root-mean-square amplitude of libration ω about an axis through O, then an atom at a distance l from the center of libration sweeps out an arc. However, the X-ray analysis will place the center of the atom at C, the centroid of the electron distribution. Thus, bond distances are systematically shortened by an amount $PC = l - l\cos\omega \approx l\omega^2/2$. In three dimensions, the corrections are additive.
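The size of the shortening is easily estimated. In this sketch the distance l = 1.5 Å and the amplitude ω = 5° are hypothetical values chosen for illustration, and the function name is an assumption.

```python
import math

def libration_shortening(l, omega):
    """Apparent shortening PC for an atom at distance l (same units as the
    result) librating with amplitude omega (radians): exact and approximate."""
    exact = l - l * math.cos(omega)
    approx = l * omega ** 2 / 2
    return exact, approx

# Hypothetical case: atom 1.5 A from the libration axis, amplitude 5 degrees.
exact, approx = libration_shortening(1.5, math.radians(5.0))
print(exact, approx)  # both close to 0.006 A
```

The small-angle form agrees with the exact expression to well below the precision of a typical bond length, so the quadratic approximation is adequate for amplitudes of this size.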

FIGURE 15 Rigid-body motion. An atom at a distance l from the center of libration sweeps out an arc. Least-squares analysis places the atom at C rather than P.

The rigid-body models require 12 parameters for centrosymmetric molecules and 21 for noncentrosymmetric. In the former, both the translational and librational axes pass through the center of gravity of the molecule. In the noncentrosymmetric case, the libration axes do not necessarily coincide, and an additional tensor, S, is required to account for the correlation between the librational and translational motions.

Two indicators are used to test the goodness of fit between the refined values of the $U_{ij}$ parameters and those calculated from the rigid-body model. The root-mean-square difference between observed and calculated values, given by $\langle(\Delta U_{ij})^2\rangle^{1/2}$, would be considered excellent in the region of 2 × 10⁻⁴ Å² but only fair at 2 × 10⁻³ Å². $R = \sum |\Delta U_{ij}| / \sum U_{ij}$ would be considered excellent at 0.04.

Obviously, the interatomic vectors in a crystal structure can be interpreted as bond distances only if thermal vibrations are negligible. To correct for the effects of thermal vibration, a model for that vibration is required. Often it is better to reduce the effects of thermal motion by working at the lowest available temperature than to attempt an a posteriori correction.

SEE ALSO THE FOLLOWING ARTICLES

CRYSTAL GROWTH • CRYSTALLIZATION PROCESSES • FERROMAGNETISM • MICROSCOPY • PHASE TRANSFORMATIONS, CRYSTALLOGRAPHIC ASPECTS • X-RAY SMALL-ANGLE SCATTERING • X-RAY, SYNCHROTRON AND NEUTRON DIFFRACTION



Encyclopedia of Physical Science and Technology EN005E-216 June 15, 2001 20:31

Electrons in Solids

Richard H. Bube
Stanford University

I. A Bond Picture
II. Free-Electron Model of Metals
III. Energy Bands
IV. Optical Properties
V. Electrical Properties
VI. Galvanomagnetothermoelectric Effects
VII. Amorphous Semiconductors
VIII. Superconductors
IX. Conducting Polymers
X. Junctions
XI. Magnetic Properties

GLOSSARY

Amorphous material Material without long-range order as in a crystalline material, but with many optical and electrical properties resembling those of a crystalline material.

Effective mass The apparent mass of an electron in a solid when it is treated as if it were a free electron.

Electrical conductivity Description of the transport of electrical charge in a metal, a semiconductor, or an insulator.

Energy bands The allowed energies for electrons in solids lying in a series of allowed bands separated by forbidden bands.

Fermi-Dirac distribution The appropriate statistics for occupancy of allowed electron states in a solid in view of the Pauli exclusion principle.

Fermi energy The electrochemical potential of electrons in a solid, an important parameter for describing electrical properties.

Galvanomagnetoelectric effects A variety of effects in solids resulting when any two of the following (electric field, magnetic field, and thermal gradient) interact.

Imperfections Impurities or defects in a solid that may play a wide variety of roles in affecting the optical and electrical properties.

Junctions Particular energy configurations occurring as a result of junctions between different kinds of materials, whether metals or semiconductors, which form the basis for many useful devices.

Magnetic effects Effects related either to the orbital motion of electrons in a solid or to the intrinsic angular momentum of an electron, its spin, in interaction with a magnetic field.


Mobility The proportionality constant between the drift velocity of a carrier in a conducting state in a solid and the applied electric field.

Pauli exclusion principle The recognition that no two electrons in the same system can have all of their quantum numbers the same; that is, only one electron can exist in a specific energy state.

Periodic potential The kind of potential characteristic of a crystalline material with atoms located in a regular periodic lattice.

Scattering Processes that interact with an electron moving in a solid under the influence of an electric field to counteract the effects of the field and return the carrier distribution toward the equilibrium condition.

Schroedinger equation The basic quantum-mechanical equation used to describe the energy levels allowed for electrons in a particular potential energy.

SOLIDS may be conveniently separated into those that are crystalline and those that are amorphous. A crystalline material consists of atoms arranged in a periodic array, so that the whole crystal can be constructed by repeating a basic unit called the unit cell. Usually this periodic array is in three dimensions, but special significant cases are known of two-dimensional periodic arrays (as in the plane of an interface). Because the atoms have a periodic arrangement, the electrical potential in the solid also is periodic. The allowed energies for electrons, and many of the optical and electrical properties of electrons in crystalline solids, can be associated with the effects of this periodic potential. An amorphous material does not have a periodic potential or the long-range order typical of a crystalline material, but it does exhibit some of the same basic electrical and optical properties as those of a crystalline material, insofar as these are based on the properties of chemical bonds rather than uniquely on long-range order.

I. A BOND PICTURE

Some of the most basic optical and electrical properties of both crystalline and amorphous materials can be described qualitatively by considering the nature of the chemical bonds between atoms rather than the effects of a periodic potential.

In both kinds of materials, there is a minimum energy $\Delta E$ that must be supplied to electrons in the chemical bonds of the material to give them enough energy to be able to move freely through the material. Thus there is a minimum photon energy equal to $\Delta E$ for optical absorption and an electrical conductivity with a thermal activation energy of $\Delta E$. We consider the behavior specifically in different kinds of crystalline materials as follows.

In an ionic crystal an electron is transferred from the cation to the anion, and the electron is subsequently localized around the anion. So that the electron can be removed from the anion and can move freely in the ionic crystal, a certain basic energy $\Delta E$ must be supplied. If we think of the electron on the anion as being characterized by a range of energies characteristic of these electronic states, then the next allowed state where conductivity is possible lies at higher energies, separated from the allowed energies below by some finite energy gap $\Delta E$. If the electron is to make the transition from the lower states where conduction is not possible to the upper states where conduction is possible, energy must be supplied to overcome this energy gap. This energy may be supplied by photons, in which case the absorption spectrum consists of an absorption edge at the value of the photon energy equal to this energy gap, which rises to large values of absorption when the photon energy exceeds this energy gap.

Since the energy required to remove an electron from a covalent bond (in a covalent material) to allow it to be free is in general less than the energy required to accomplish this in an ionic material, the absorption edge for a covalent material occurs for lower values of photon energy. The largest energy gap for any material known is that for ionic lithium fluoride, for which $\Delta E = 11.5$ eV, whereas in the widely used semiconductor silicon, $\Delta E = 1.1$ eV.

In a material with metallic binding, on the other hand, there is a constantly available "sea" of free electrons capable of absorbing energy from photons to be excited to higher energy states. The metal is not characterized, therefore, by an absorption edge as the ionic and covalent insulators and semiconductors are, but rather has a continuous absorption due to free electrons over a wide photon-energy range.

This same kind of general picture can be connected with the basic features of the electrical properties of ionic, covalent, and metallic materials. In insulators and semiconductors, electrons must be excited thermally from the highest-lying filled states to the next higher-lying empty allowed states for conductivity to be possible. An activation energy for this process, with a value corresponding to that of the optical absorption edge $\Delta E$ described above, can be measured approximately from the slope in a plot of the logarithm of the electrical conductivity versus reciprocal temperature. Therefore ionic materials have a large activation energy and a small electrical conductivity. Covalent materials, in general, have a smaller activation energy and a larger electrical conductivity. Metals, having a ready supply of free electrons, have no activation energy and the largest value of electrical conductivity.
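The slope measurement described above can be illustrated with a minimal sketch. It assumes, for illustration only, that the conductivity follows the simple Arrhenius form σ = σ0 exp(−E_a/kT); the data are synthetic, and the function name `activation_energy_ev` and all numerical values are hypothetical rather than taken from the article.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K

def activation_energy_ev(temps_k, sigmas):
    """Fit ln(sigma) against 1/T by least squares and return -slope * k in eV."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(s) for s in sigmas]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return -(sxy / sxx) * K_B_EV

# Synthetic (made-up) data generated from the assumed Arrhenius form.
TRUE_ENERGY_EV = 0.55
temps = [300.0, 350.0, 400.0, 450.0, 500.0]
sigmas = [1.0e3 * math.exp(-TRUE_ENERGY_EV / (K_B_EV * t)) for t in temps]

fitted_ev = activation_energy_ev(temps, sigmas)
print(f"fitted activation energy: {fitted_ev:.3f} eV")
```

With measured data the fit recovers the activation energy only approximately, since the prefactor is itself weakly temperature dependent.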


II. FREE-ELECTRON MODEL OF METALS

It is often found that the outermost valence electrons of atoms making up a solid can be treated as if they were essentially free electrons. This is particularly true in the class of materials known as metals.

These valence electrons in metals are of course not completely free, since they still move in the presence of the positively charged ions that are located on the crystal lattice sites of the metal. But to an often striking degree it is sufficient to consider the valence electrons to be moving in a potential energy of zero, partially shielded from the positively charged ions by the other electrons, and with the repulsive interactions between electrons canceled out by the residual attractive effects of the positively charged ions. When we consider such electrons from a quantum-mechanical view, it is appropriate to consider them as contained in a box the size of the metal with zero potential energy inside the box and infinite potential energy outside the box. To escape from a real metal, an electron in the metal needs to acquire sufficient energy to overcome the finite potential barrier at the surface, known as the work function of the metal, which is much larger than the average electron energy and therefore approximately infinite.

From calculations based on the Schroedinger equation for free particles in a box, it is possible to derive what energy states are allowed. For a cube with side $L$, the allowed energies are given by

$$E(n_x, n_y, n_z) = \frac{\hbar^2\pi^2}{2mL^2}\left(n_x^2 + n_y^2 + n_z^2\right), \tag{1}$$

where $n_x$, $n_y$, and $n_z$ are positive integers equal to one or greater. This quantization of the allowed energy levels arises from the necessity to have solutions with wavelength $\lambda$ such that $n\lambda/2 = L$ in each of the three dimensions, to satisfy the geometric boundary conditions that the solution must go to zero at the faces of the cube. If we examine the magnitude of $E(1, 1, 1)$ we see that it is of the order of $10^{-14}/L^2$ eV with $L$ expressed in cm, so that for a cube of macroscopic dimensions (e.g., $L = 1$ cm), $E(1, 1, 1)$ can be set equal to zero. The same is true of the energy separation between different energy levels, so that the distribution of allowed energy levels can be treated effectively as a pseudocontinuous distribution.
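As a rough numerical check of the magnitude quoted for Eq. (1), the following sketch evaluates the lowest level and the first level spacing for a cube of side 1 cm. The function name `box_energy_ev` is an illustrative choice; the constants are standard SI values.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # free-electron mass, kg
EV = 1.602176634e-19     # joules per electron volt

def box_energy_ev(nx, ny, nz, side_m):
    """Eq. (1): energy in eV of state (nx, ny, nz) in a cube of side side_m (metres)."""
    prefactor = HBAR**2 * math.pi**2 / (2.0 * M_E * side_m**2)
    return prefactor * (nx**2 + ny**2 + nz**2) / EV

side = 1.0e-2  # L = 1 cm
ground = box_energy_ev(1, 1, 1, side)
spacing = box_energy_ev(2, 1, 1, side) - ground
print(f"E(1,1,1) = {ground:.2e} eV, first level spacing = {spacing:.2e} eV")
```

Both numbers come out near $10^{-14}$ eV, many orders of magnitude below thermal energies at any accessible temperature, which is why the levels can be treated as a continuum.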

The variation of the density of these states $N(E)$ per unit volume [$N(E)\,dE$ is the number of states per unit volume between $E$ and $E + dE$] can be calculated from Eq. (1) with the result that

$$N(E) = \frac{1}{2\pi^2}\left(\frac{2m}{\hbar^2}\right)^{3/2} E^{1/2}, \tag{2}$$

which includes a factor of 2 to account for the two possible spin states (intrinsic angular momentum) of an electron.

When these results are coupled with the postulate consistent with the Pauli exclusion principle that only one electron can occupy a particular energy state, we are led to describe the occupation of a state with energy $E$ by the Fermi-Dirac distribution function

$$f(E, T) = \frac{1}{\exp[(E - E_F)/kT] + 1}, \tag{3}$$

where $E_F$ is a particular energy called the Fermi energy, $k$ is Boltzmann's constant, and $T$ is the temperature. At 0 K the allowed energy states are occupied for all energies up to the Fermi energy, so that if $n$ is the density of free electrons (the number per unit volume), $n = \int_0^{E_F} N(E)\,dE$, and as a result,

$$E_F = \frac{\hbar^2}{2m}\left(3\pi^2 n\right)^{2/3}. \tag{4}$$

From the expression for $f(E, T)$, it can be seen that the Fermi energy is the energy for which $f(E, T) = \tfrac{1}{2}$. The Fermi energy can also be related to thermodynamic quantities by recognizing that it is the electrochemical potential. Using $f(E, T)$ makes it possible to calculate the density of occupied electronic states $n(E)$ as a function of energy by multiplying Eqs. (2) and (3), $n(E) = f(E, T)N(E)$, with the result shown in Fig. 1.
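Equations (2) through (4) can be tied together in a short sketch. The electron density used below is an assumed illustrative value, roughly that of copper, and is not taken from the article; the function names are likewise hypothetical. The check on `density_check` uses the analytic integral of Eq. (2) from 0 to $E_F$, which is $(2/3)E_F N(E_F)$.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # free-electron mass, kg
EV = 1.602176634e-19     # joules per electron volt
K_B = 1.380649e-23       # Boltzmann constant, J/K

def density_of_states(energy_j):
    """Eq. (2): states per unit volume per unit energy (spin included)."""
    return (1.0 / (2.0 * math.pi**2)) * (2.0 * M_E / HBAR**2) ** 1.5 * math.sqrt(energy_j)

def fermi_energy(density_m3):
    """Eq. (4): Fermi energy in joules for an electron density in m^-3."""
    return (HBAR**2 / (2.0 * M_E)) * (3.0 * math.pi**2 * density_m3) ** (2.0 / 3.0)

def fermi_dirac(energy_j, fermi_j, temp_k):
    """Eq. (3), arranged so that the exponential never overflows."""
    x = (energy_j - fermi_j) / (K_B * temp_k)
    if x > 0.0:
        return math.exp(-x) / (1.0 + math.exp(-x))
    return 1.0 / (1.0 + math.exp(x))

density = 8.47e28  # assumed illustrative density in m^-3 (roughly copper)
e_f = fermi_energy(density)
e_f_ev = e_f / EV

# Integrating Eq. (2) from 0 to E_F gives (2/3) E_F N(E_F); this should return n.
density_check = (2.0 / 3.0) * e_f * density_of_states(e_f)

# Occupation exactly at the Fermi energy, and occupied density n(E) = f N there.
half = fermi_dirac(e_f, e_f, 300.0)
occupied_at_ef = half * density_of_states(e_f)
print(f"E_F = {e_f_ev:.2f} eV, f(E_F) = {half}")
```

The Fermi energy of several electron volts is far larger than $kT$ at room temperature (about 0.026 eV), so the distribution in Fig. 1 departs from a step function only in a narrow region around $E_F$.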

This simple model is able to give qualitative and even semiquantitative agreement with a number of phenomena related to electrons in metals, such as photoemission (emission of electrons from a metal as the result of absorption of photons), thermionic emission (emission of electrons from a metal because of high temperatures), field emission (emission of electrons from a metal because of high electric fields), and heat capacity (a measure of how rapidly the energy of a collection of electrons changes with temperature). If the concept of electron spin is included, this same simple model can be used to describe the magnetic properties of free electrons (free-electron or Pauli paramagnetism). A typical energy model for thermionic emission, for example, is given in Fig. 2.

FIGURE 1 The total density of allowed states per unit volume $N(E)$, the Fermi distribution function $f(E)$, and the density of occupied states per unit volume $n(E) = f(E)N(E)$, at a finite temperature in a metal with Fermi energy $E_F$.


FIGURE 2 Free-electron model of a metal surface suitable for the discussion of a variety of electronic interactions with a metal, such as photoemission, thermionic emission, or field emission. The specific diagram shown here is for the case of thermionic emission. Electrons with velocity $v_x > [2(E_F + q\phi)/m]^{1/2}$ in the higher-energy tail of the Fermi distribution can be emitted into vacuum. The result is a thermionic emission current that varies as $J_x \propto T^2 \exp(-q\phi/kT)$.
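The strong temperature dependence implied by the proportionality in the caption of Fig. 2 can be seen in a brief sketch. Only the ratio of currents at two temperatures is computed, so the constant of proportionality is not needed. The work function of 4.5 eV and the two temperatures are assumed illustrative values, not data from the article.

```python
import math

K_B_EV = 8.617333e-5  # Boltzmann constant in eV/K

def emission_ratio(t_hot, t_cold, work_function_ev):
    """J(t_hot) / J(t_cold) for J proportional to T^2 exp(-q phi / kT)."""
    boltzmann = math.exp(-work_function_ev / K_B_EV * (1.0 / t_hot - 1.0 / t_cold))
    return (t_hot / t_cold) ** 2 * boltzmann

# Assumed example: work function 4.5 eV, temperatures 2000 K and 2500 K.
ratio = emission_ratio(2500.0, 2000.0, 4.5)
print(f"J(2500 K) / J(2000 K) is about {ratio:.0f}")
```

Under these assumptions a 25% increase in temperature raises the emission current by more than two orders of magnitude, almost entirely through the exponential factor.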

III. ENERGY BANDS

In the free-electron model, the electrons occupy positive energy levels from $E = 0$ to higher values of energy. We could think of this as a "band" of energies with a lower limit at $E = 0$ but with no upper limit. Because the electrons involved in this band are the valence electrons, we call this the "valence band" of the material.

The free electrons no longer belong to isolated atoms but rather belong to the whole crystal. The allowed band of energies is the result of the fact that the electrons are describable by Fermi-Dirac statistics and obey the Pauli exclusion principle: no two electrons in the whole crystal are allowed to be in exactly the same energy state.

What is true of the valence electrons is also true of the other electrons present in the crystal, corresponding to the more tightly bound inner-shell electrons of the atoms. They also can be thought of as belonging to the whole crystal and therefore requiring that their energies be expressed by a band of energies rather than by the discrete energy-level scheme characteristic of isolated atoms.

There are several complementary ways of showing that the existence of a periodic potential in a crystal leads to the existence of energy bands:

1. It can be shown mathematically that such a periodic potential gives rise directly to the presence of a series of allowed energy bands separated by energy gaps in which electron states are not allowed.

2. We may pursue the consequences for the allowed energy levels of interactions between atoms as these are brought together in a periodic array to form a crystal (the tight-binding approximation). Because of the Pauli exclusion principle such interactions cause the discrete atomic levels to broaden into allowed bands in the crystal, separated by forbidden bands corresponding to the forbidden energies between the discrete levels of the isolated atoms.

3. We may start with effectively free electrons and inquire as to what happens to the allowed energies if we superpose a small periodic potential (the weak-binding approximation). We find that the effect of such a potential is to open up forbidden gaps in the previously continuous energy distribution, which once again produces a series of allowed bands and a series of forbidden bands.

A. Representations of Energy Bands

The major ways of describing energy bands can be seen in Fig. 3. The plot of energy versus distance through the crystal, $E$ versus $x$, in Fig. 3a shows the allowed energy bands separated by the forbidden energy gaps and emphasizes the nonlocalized nature of the band states that extend throughout the whole crystal.

Within an energy band the relationship between the frequency $\omega$ of the electron waves (related to the energy through $E = \hbar\omega$) and the wavelength $\lambda$ of the electron waves (described by the wavenumber $k = 2\pi/\lambda$) is described by giving the specific dependence of $E$ on $k$. Typical variations are shown in Fig. 3b, illustrating the common result that electrons near the top or bottom of an energy band usually behave as if they were free-like electrons for which the relationship between $E$ and $k$ is parabolic (for free electrons, $E = \hbar^2k^2/2m$, where $m$ is the electron mass). The upper band in Fig. 3b corresponds to a typical conduction band in a semiconductor, while the lower band corresponds to a typical valence band. For a simple one-dimensional crystal with lattice constant $a$, the limiting values of $k$ for the energy band are $-\pi/a$ and $+\pi/a$; these values are equivalent to the Bragg reflection conditions in one dimension. They mean that an electron wave cannot propagate with wavenumber $k = \pm\pi/a$. The range from $k = -\pi/a$ to $k = +\pi/a$ is called the first Brillouin zone. This dependence of $E$ on $k$ is illustrated further in Fig. 4 for the weak-binding model for the formation of energy bands as described above.

When the electrons in a band are free-like, then the corresponding density of states is free-like, with $N(E) \propto E^{1/2}$ as stated above. Figure 3c shows the variation of the density of states with energy throughout the bands. Near the top and bottom of the band $N(E) \propto E^{1/2}$ (as measured away from the minimum or maximum point), but in the middle of the band the electrons do not exhibit free-like behavior, and the density of states passes through a maximum.


FIGURE 3 Four ways of describing band properties: (a) flat band $E$ versus $x$, (b) electron wave dispersion relationship $E$ versus $k$, (c) density of states $N(E)$ versus $E$, and (d) equal-energy surfaces in $k$-space.

The basic electrical differences between solids (metals, semiconductors, and insulators) are readily understandable in terms of the band picture. Possibilities are illustrated in Fig. 5. If the highest-energy-allowed band occupied by electrons is only partially occupied, then there are available allowed states at very small energies above occupied states, and the drift of electrons in an electric field can be readily achieved. A partially filled valence band therefore corresponds to a metal.

Even if the highest-energy-allowed band occupied by electrons is totally filled, metallic properties can be found if the next higher-lying band overlaps the filled band (possible in a three-dimensional crystal) to again produce a continuum of allowed states separated by only small energies from occupied states. Overlapping bands also produce metallic behavior.

If the highest-energy-allowed band occupied by electrons is totally filled, and the next higher-lying band lies an appreciable energy above the top of the filled band, insulator-like properties are observed. Only electrons in the upper empty band can contribute to electrical conductivity; their density is very small, since thermal excitation across a large energy gap is required to raise them from the filled valence band to the empty conduction band.

Finally, if the gap between the top of the filled band and the higher-lying empty band is small, appreciable excitation of electrons into the conduction band may occur at normal temperatures, and intermediate conductivity is observed typical of a semiconductor.

If electrons exhibit free-like behavior, then a plot of equal-energy surfaces in $k$-space, as shown in Fig. 3d, will have spherical symmetry, since the magnitude of $E$ does not depend on the direction of $k$. Thus the equal-energy surfaces in many materials are frequently spherical with center at either the minimum or maximum value of $k$ for the energy band. A notable exception is the widely used semiconductor silicon, for which the minimum of the conduction band does not lie at $k = 0$ and for which the equal-energy surfaces are ellipsoidal rather than spherical.

B. Interpretation of E versus k Curves

From a knowledge of the dependence of $E$ on $k$ in an energy band, two significant properties of electronic energy states can be deduced:

1. The velocity associated with an electronic state is the group velocity, given for the one-dimensional case by

$$v_g = \hbar^{-1}\,\partial E/\partial k. \tag{5}$$


FIGURE 4 (a) The free-electron model of electrons as perturbed by a small periodic potential (weak-binding approximation) causing Bragg reflection conditions for $k = n\pi/a$ in the extended zone representation. (b) The reduced representation in the first Brillouin zone achieved by translating band segments in (a) by $2\pi n/a$ to bring them into the basic zone between $k = -\pi/a$ and $k = +\pi/a$. A wavenumber $k' = (k + 2\pi n/a)$ corresponds to the same solutions as does $k$. In the reduced representation, separate bands are designated by a band index $I$.

For free electrons this means that $v_g = \hbar k/m$, which illustrates the fact that the momentum $mv_g = \hbar k$, in agreement with the de Broglie relationship that the momentum $p = h/\lambda$. If this result is applied to the upper conduction band in Fig. 3b, for example, we see that $v_g = 0$ at the top and bottom of the band, is positive for $+k$ and negative for $-k$, and has a maximum somewhere near the middle of the band. We see immediately why the application of an electric field to a totally filled band produces no electric current; the number of occupied states with positive $v_g$ is equal to that with negative $v_g$, and the application of an electric field cannot change this equality.

FIGURE 5 Illustrations of the dependence of electrical properties of solids on energy-band filling and spacing: (a) and (b) metals, (c) insulators, (d) semiconductors.

2. Although electrons near the top or bottom of a band may be said to behave in a free-like manner, it is clear that they are moving in a periodic potential and are not really free. To describe them in a free-like model, we must usually define an effective mass $m^*$ that is different from the free mass $m$. From the result that the force is the time derivative of the momentum, we obtain $F = \hbar\,\partial k/\partial t$. If we also desire to write $F = m^*a$, after the classical form, then it follows that since $a = \partial v_g/\partial t = \hbar^{-1}\,\partial/\partial t\,(\partial E/\partial k) = \hbar^{-1}(\partial^2 E/\partial k^2)(\partial k/\partial t) = \hbar^{-2}(\partial^2 E/\partial k^2)F$,

$$m^* = \frac{\hbar^2}{\partial^2 E/\partial k^2} \tag{6}$$

for the one-dimensional case. If we apply this to the free-electron case, we find, as expected, that $m^* = m$. We see that the effective mass $m^*$ is inversely proportional to the curvature of the $E$ versus $k$ curve. Near the top or bottom of the band, $E$ versus $k$ may be parabolic and we can write $E = \hbar^2k^2/2m^*$ with a constant $m^*$ over a range of $E$ and $k$. At the bottom of the upper band in Fig. 3b the effective mass for electrons is positive, whereas at the top of the lower band in Fig. 3b the effective mass for electrons is negative. It is in this negative sign that we see most strongly the effects of the periodic potential. Physically, a negative effective mass corresponds to an acceleration that is in the opposite direction to the applied force; it corresponds to a condition like that encountered in Bragg reflection. Typical variations of $v_g$ and $m^*$ are shown in Fig. 6. Effective masses in most solids range from about $10^{-2}m$ ($m_e^* = 0.014m$ in InSb, one of the smallest values) to values of the order of $m$ (for free electrons in the alkali metals or for holes in many semiconductors).
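The two results above, Eqs. (5) and (6), can be explored numerically with a minimal sketch. It assumes a model band of the form $E(k) = -2t\cos(ka)$, which is an illustrative choice and not a band taken from the article; the values of $t$ and $a$ are likewise hypothetical. The derivatives are taken by central finite differences.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # free-electron mass, kg
EV = 1.602176634e-19     # joules per electron volt

T_HOP = 1.0 * EV   # assumed band parameter t (hypothetical value)
A_LAT = 3.0e-10    # assumed lattice constant a in metres (hypothetical value)

def band_energy(k):
    """Assumed model band E(k) = -2 t cos(k a); not taken from the article."""
    return -2.0 * T_HOP * math.cos(k * A_LAT)

def group_velocity(k, dk=1.0e6):
    """Eq. (5) by a central difference: v_g = (1/hbar) dE/dk."""
    return (band_energy(k + dk) - band_energy(k - dk)) / (2.0 * dk * HBAR)

def effective_mass(k, dk=1.0e6):
    """Eq. (6) by a central second difference: m* = hbar^2 / (d2E/dk2)."""
    curvature = (band_energy(k + dk) - 2.0 * band_energy(k) + band_energy(k - dk)) / dk**2
    return HBAR**2 / curvature

k_edge = math.pi / A_LAT                  # zone boundary, k = pi/a
m_bottom = effective_mass(0.0) / M_E      # bottom of the band (k = 0), in units of m
m_top = effective_mass(k_edge) / M_E      # top of the band (k = pi/a), in units of m
v_mid = group_velocity(0.5 * k_edge)      # middle of the band, in m/s
print(f"m*(bottom) = {m_bottom:.3f} m, m*(top) = {m_top:.3f} m, v_g(mid) = {v_mid:.3e} m/s")
```

For this model band the effective mass is positive at the bottom and negative at the top, and the group velocity vanishes at both extremes and peaks at mid-band, in line with the qualitative behavior described in the text. The effective mass is not evaluated at mid-band, where the curvature passes through zero and $m^*$ diverges.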

For the filled valence band of a semiconductor to contribute to the electrical conductivity, there must be some empty states corresponding, for example, to electrons that have been excited up to the conduction band. Instead of describing the electrical properties of the valence band in terms of all the electronic states with positive and negative velocities and negative and positive effective masses (depending on which state in the band is being examined), it is conceptually much simpler to describe the conductivity in a partially empty band in terms of the empty states themselves, called "holes." If we treat these holes as if they had a positive charge and a positive effective mass, a symmetric treatment of electrical properties involving them at the top of the valence band can be given, as is commonly given for electrons at the bottom of the conduction band.

FIGURE 6 Typical variations of group velocity $v_g$ and effective mass $m^*$ as a function of $k$.

IV. OPTICAL PROPERTIES

Optical properties of solids include a wide range of phenomena involving either the interaction of light with crystals or the generation of light by crystals under suitable conditions.

The velocity of light $v$ in a material is reduced compared with its value in vacuum $c$ by a factor known as the index of refraction $r = c/v$. In general, the value of $r$ for near-visible light can be expressed by

$$r^2 = \varepsilon_r + c^2\alpha^2/4\omega^2, \tag{7}$$

where $\varepsilon_r$ is the relative dielectric constant associated with polarization of electrons in the material, $\alpha$ is the optical absorption constant (a sample with thickness of $1/\alpha$ reduces the light intensity passing through it by a factor of $e$), and $\omega$ is the light frequency.

A. Optical Reflection

Reflection occurs for any wavelike phenomenon when the wave passes from a material with one set of properties to a material with another set of properties. Light is reflected on passing from a material with value $\varepsilon_{r1}$ and $\alpha_1$ to a second material with value $\varepsilon_{r2}$ and $\alpha_2$. For simple reflection at an air-material interface ($\varepsilon_{r1} = 1$ and $\alpha_1 = 0$ for air) with a material without absorption ($\alpha_2 = 0$), the reflection coefficient $R$ is given by

$$R = \frac{(r_2 - 1)^2}{(r_2 + 1)^2}. \tag{8}$$

If the material does have a finite $\alpha$, the result is that the higher the absorption index of the material, the more light it reflects, until in the extreme case $R$ approaches unity:

$$R = \frac{(r_2 - 1)^2 + \Gamma_2^2}{(r_2 + 1)^2 + \Gamma_2^2}, \tag{9}$$

where $\Gamma_2$ is the absorption index expressed through the complex index of refraction $r^* = r + i\Gamma$. The absorption index is related to the absorption coefficient by the relationship

$$\alpha = 2\omega\Gamma/c. \tag{10}$$

This effect accounts for the very high reflectivity of metals and other materials with high values of $\alpha$ due to specific absorption processes.
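The trend described by Eqs. (8) to (10) can be illustrated with a short sketch. The refractive index, absorption index, and wavelength used below are assumed illustrative values, and the function names are hypothetical.

```python
import math

C = 2.99792458e8  # speed of light in vacuum, m/s

def reflectance(index, absorption_index=0.0):
    """Eq. (9) for light incident from air; reduces to Eq. (8) when the absorption index is 0."""
    numerator = (index - 1.0) ** 2 + absorption_index**2
    denominator = (index + 1.0) ** 2 + absorption_index**2
    return numerator / denominator

def absorption_constant(absorption_index, omega):
    """Eq. (10): absorption constant in m^-1 for angular frequency omega in rad/s."""
    return 2.0 * omega * absorption_index / C

# Assumed example values: index 3.5, with and without a large absorption index.
r_transparent = reflectance(3.5)
r_absorbing = reflectance(3.5, absorption_index=10.0)

omega = 2.0 * math.pi * C / 500.0e-9          # light of 500 nm vacuum wavelength
alpha = absorption_constant(10.0, omega)      # in m^-1
print(f"R = {r_transparent:.3f} (no absorption), R = {r_absorbing:.3f} (absorption index 10)")
print(f"alpha = {alpha / 100.0:.2e} cm^-1")
```

With these assumed numbers the reflectance rises from about 0.31 to about 0.88 as the absorption index grows, and the corresponding absorption constant is of the order of $10^6$ cm$^{-1}$.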

One application of considerable practical importance that involves the reflectivity properties of thin films of one material on another is the antireflecting coating. As a way to reduce the reflection at an air-material A interface, a second material B with suitable index of refraction (ideally the square root of the index of refraction of A) is deposited in thin-film form on A, with a thickness such that light reflected from the air-B interface interferes destructively with light reflected from the B-A interface. This occurs when the film B thickness is an odd multiple of quarter wavelengths of the light in the film, $\lambda_B = \lambda_{\text{air}}/r_B$.
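A minimal sketch of the quarter-wave condition follows. The substrate index of 3.5, the film index of 1.45, and the wavelength of 600 nm are assumed illustrative values. The expression used in `coated_reflectance` is the standard normal-incidence result for a single nonabsorbing quarter-wave film in air; it is not given in the article and is included here only to show why the square-root choice of index is ideal.

```python
import math

def quarter_wave_thickness(wavelength_air, film_index, order=0):
    """Film thickness equal to an odd multiple (2*order + 1) of a quarter wavelength in the film."""
    return (2 * order + 1) * wavelength_air / (4.0 * film_index)

def coated_reflectance(substrate_index, film_index):
    """Normal-incidence reflectance with a quarter-wave film, air on the incident side.

    Standard thin-film result, assumed here; not taken from the article.
    """
    return ((film_index**2 - substrate_index) / (film_index**2 + substrate_index)) ** 2

substrate = 3.5                    # assumed index of material A
ideal_film = math.sqrt(substrate)  # ideal index of material B
thickness = quarter_wave_thickness(600.0e-9, ideal_film)

r_ideal = coated_reflectance(substrate, ideal_film)
r_other = coated_reflectance(substrate, 1.45)   # a non-ideal film index, for comparison
print(f"ideal film index {ideal_film:.3f}, thickness {thickness * 1e9:.1f} nm")
print(f"R with ideal film = {r_ideal:.1e}, R with index 1.45 = {r_other:.3f}")
```

Under these assumptions the uncoated surface reflects about 31% of the light, a film of index 1.45 reduces this to about 6%, and the ideal film removes the reflection at the design wavelength.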

B. Absorption Processes

Optical absorption in solids may be categorized under six principal headings. One of these is the absorption of photons in the excitation of optical mode vibrations of the crystal lattice, known as reststrahlen absorption, which usually occurs in the infrared between 10 and 200 µm. This is the major absorption process that does not involve electronic transitions. Other kinds of optical absorption associated with the excitation of lattice vibrations may occur when an impurity is present in the material and local vibrational modes associated with the impurity become possible.

Figure 7 illustrates the other five major types of electronic absorption processes:

1. Excitation of an electron from the valence band to higher-lying conduction bands, characterized by high-absorption processes, showing only relatively small variations of absorption coefficient with photon energy depending on the density-of-states distributions in the bands involved. The optical absorption constant is usually in the range of $10^5$ to $10^6$ cm$^{-1}$.

FIGURE 7 Characteristic types of optical transitions shown both for the flat-band model and for the $E$ versus $k$ plot. (1) Excitation from the valence band to higher-lying conduction bands, (2) excitation across the band gap, (3) exciton formation, (4) excitation from imperfections, and (5) free-carrier excitation.

2. Excitation of an electron from the valence band to the lowest-lying conduction band with the minimum required energy of the forbidden band gap. The magnitude and variation with energy of the absorption constant depends on whether the transition involves a photon only (direct transition) or whether it involves both a photon and a phonon (indirect transition). The absorption constant decreases by many orders of magnitude as the photon energy drops below the band-gap energy. Direct and indirect optical transitions are illustrated in Fig. 8.

If a direct optical transition is involved, a plot of $(\alpha\hbar\omega)^2$ versus $\hbar\omega$, where $\omega$ is the light frequency, yields a straight line with the energy intercept of the direct band gap. If an indirect optical transition is involved, a plot of $\alpha^{1/2}$ versus $\hbar\omega$ gives two straight-line segments with intercepts of $\hbar\omega_1$ and $\hbar\omega_2$, as shown in Fig. 8b. The upper segment corresponds to an indirect transition with emission of a phonon, whereas the lower segment corresponds to an indirect transition with absorption of a phonon. The indirect band gap is given by $(\hbar\omega_1 + \hbar\omega_2)/2$, and the energy of the participating phonon is given by $(\hbar\omega_2 - \hbar\omega_1)/2$.

3. Excitation of a bound electron-hole pair, known as an exciton, requiring less energy than that needed to produce a free electron-hole pair by excitation across the band gap. The exciton can be thought of as a hydrogenic system, capable of moving and transporting energy through the crystal without transporting net charge. The electron and hole making up an exciton may ultimately be thermally dissociated into free carriers or may recombine with the emission of photons or phonons.

FIGURE 8 (a) Band-to-band direct transitions, and (b) band-to-band indirect transitions. Inserts show the variation of absorption coefficient with photon energy expected for each type of transition, and a typical phonon dispersion curve for the indirect material.

4. Excitation of an electron from an imperfection level to the conduction band, as shown in Fig. 7, or from the valence band to an imperfection. If imperfections are present in the crystal, they create energy levels that lie in the forbidden gap. Therefore, at energies less than the band gap it is still possible to excite electrons to the conduction band from imperfection levels occupied by electrons, or to excite electrons from the valence band to unoccupied imperfection levels, each process giving rise to optical absorption. This absorption in turn ceases when the photon energy is less than the energy required to make a transition from the imperfection level to one of the bands. The absorption coefficient may be expressed as $\alpha = S_o N_I$ (where $N_I$ is the density of suitably occupied or unoccupied imperfections, and $S_o$ is the optical cross section, of the order of $10^{-16}$ cm$^2$). For very high imperfection densities, the corresponding absorption constant may have values as high as $10^3$ cm$^{-1}$, but in general it is considerably smaller. The absorption spectrum for this type of absorption consists of a threshold corresponding to the ionization energy of the imperfection and a relatively slow variation of absorption for higher photon energies.

Another kind of imperfection absorption may occur that is not illustrated in Fig. 7. It is most commonly encountered when an impurity with incomplete inner-shell atomic levels is present in a material. Absorption between two such atomic levels yields an absorption spectrum with a Gaussian shape, peaked at the energy separation between the two levels.

5. Excitation of a free electron (or free hole) to a higher energy state within the same band or to higher bands. This process can occur over a wide range of photon energies. It involves the absorption of photons and the absorption or emission of phonons, since both energy and $k$ must be changed in the transition, and is thus an indirect optical transition. The specific quantitative description for the effect depends on the magnitude of both the electrical conductivity and the frequency of the light; for optical frequency excitation of free carriers in a nondegenerate semiconductor $\alpha \propto \omega^{-n}$ where $n$ is between 2 and 3. The results of a classical calculation for the dependence of absorption coefficient on frequency for materials with different low-frequency conductivity are given in Fig. 9.
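For the band-gap transitions of item 2 in the list above, the graphical analysis of an indirect gap can be sketched as follows. Each straight-line segment of the $\alpha^{1/2}$ versus photon-energy plot is fitted by least squares and extrapolated to its intercept on the energy axis; the two intercepts then give the gap and the phonon energy. The data points are synthetic and constructed to lie exactly on two lines, and the function name `energy_intercept` is hypothetical.

```python
def energy_intercept(energies, values):
    """Least-squares line through (energy, value) points; returns the energy where value = 0."""
    n = len(energies)
    x_mean = sum(energies) / n
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(energies, values))
    sxx = sum((x - x_mean) ** 2 for x in energies)
    slope = sxy / sxx
    return x_mean - y_mean / slope

# Synthetic (made-up) points of sqrt(alpha) versus photon energy in eV.
# Lower segment (phonon absorption) extrapolates to 1.05 eV,
# upper segment (phonon emission) extrapolates to 1.15 eV.
lower_e = [1.06, 1.08, 1.10, 1.12]
lower_y = [2.0 * (e - 1.05) for e in lower_e]
upper_e = [1.18, 1.22, 1.26]
upper_y = [5.0 * (e - 1.15) for e in upper_e]

e1 = energy_intercept(lower_e, lower_y)   # intercept of the lower segment
e2 = energy_intercept(upper_e, upper_y)   # intercept of the upper segment

indirect_gap = (e1 + e2) / 2.0
phonon_energy = (e2 - e1) / 2.0
print(f"indirect gap = {indirect_gap:.3f} eV, phonon energy = {phonon_energy:.3f} eV")
```

In measured spectra the two segments are only approximately linear, so the choice of fitting range affects the extracted values.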

Another kind of optical absorption due to free electrons (or holes) corresponds to the motion of free carriers acting collectively as a kind of "electron gas," which has a characteristic frequency $\omega_p = (nq^2/\varepsilon_0 m^*)^{1/2}$, where $n$ is the density of the carriers, $q$ is the electronic charge, $\varepsilon_0$ is the permittivity of free space in SI units, and $m^*$ is the effective mass of the carriers. When photons with this frequency are incident on the material, resonant absorption of energy occurs. The effect is known as plasma resonance. The plasma frequency occurs in the ultraviolet for metals and in the infrared for most semiconductors.
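The statement about the spectral range of the plasma frequency can be checked with a short sketch that evaluates the expression for $\omega_p$ exactly as written above. The carrier densities and the effective-mass ratio are assumed illustrative values (a metal-like density of $8.47 \times 10^{28}$ m$^{-3}$ with $m^* = m$, and a doped-semiconductor-like density of $10^{24}$ m$^{-3}$ with $m^* = 0.1m$); they are not taken from the article.

```python
import math

Q = 1.602176634e-19      # electronic charge, C
EPS0 = 8.8541878128e-12  # permittivity of free space, F/m
M_E = 9.1093837015e-31   # free-electron mass, kg
C = 2.99792458e8         # speed of light in vacuum, m/s

def plasma_frequency(density_m3, mass_ratio=1.0):
    """omega_p = (n q^2 / (eps0 m*))^(1/2) in rad/s, with m* = mass_ratio * m."""
    return math.sqrt(density_m3 * Q**2 / (EPS0 * mass_ratio * M_E))

def plasma_wavelength(density_m3, mass_ratio=1.0):
    """Vacuum wavelength in metres of light at the plasma frequency."""
    return 2.0 * math.pi * C / plasma_frequency(density_m3, mass_ratio)

metal_wavelength = plasma_wavelength(8.47e28)            # assumed metal-like density
semiconductor_wavelength = plasma_wavelength(1.0e24, 0.1)  # assumed semiconductor values
print(f"metal: {metal_wavelength * 1e9:.0f} nm")
print(f"semiconductor: {semiconductor_wavelength * 1e6:.1f} micrometres")
```

With these assumptions the metal value falls near 115 nm, in the ultraviolet, and the semiconductor value near 10 µm, in the infrared.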

C. Photoelectronic Effects

When light is absorbed by a material so as to raise electrons to higher-energy states, several possibilities occur. If the excited electrons are in the conduction band, then the conductivity of the material is increased as a result of the absorption of light, and the effect is known as photoconductivity. If the excited electrons give up their excess energy when they return to their initial state in the form of photons, then the effect is known as luminescence; in particular, if the initial excitation is by light, the emitted radiation is called photoluminescence emission. Processes described here as being associated with the creation of free electrons by light absorption can, of course, also be associated with the creation of free holes.

FIGURE 9 Dependence of the optical absorption coefficient $\alpha$ on the frequency of the electric field $\omega$, for different values of the low-frequency conductivity $\sigma_0$. The plot is divided into four regions, A through D, corresponding to low and high $\omega$ and low and high $\sigma/\omega$. (A) Low $\omega$, low $\sigma/\omega$: $\alpha \propto \sigma_0$. (B) Low $\omega$, high $\sigma/\omega$: $\alpha \propto (\sigma_0\omega)^{1/2}$. (C) High $\omega$, high $\sigma/\omega$: $\alpha \propto (\sigma_0/\omega)^{1/2}$. (D) High $\omega$, low $\sigma/\omega$: $\alpha \propto \sigma_0\omega^{-2}$. The plot assumes that $\tau = 10^{-12}$ sec and that $\varepsilon_r = 10$.

Luminescence can also be excited by other high-energy sources: excitation by an electron beam produces cathodoluminescence [the conductivity analogue is electron-beam induced current (EBIC)], excitation by friction produces triboluminescence, and luminescence may also be excited by exposure to X-rays or high-energy particles. A material in thermal equilibrium emits radiation due to the recombination of thermally excited electrons and holes, which is commonly known as blackbody radiation and is described by Planck's radiation law. Luminescence is distinguished from this blackbody radiation, and the term is confined to radiation emitted over and above the blackbody radiation.

If photoexcitation produces a free electron–hole pair byexcitation across the band gap of a semiconductor, thenboth the electron and the hole contribute to the increasedconductivity of the material until (a) they are capturedby localized imperfections, (b) they recombine with eachother directly or at localized imperfections, or (c) they passout of the material at one end without being replaced at theother, an effect that depends on the nature of the electricalcontacts to the material. Most often recombination occursnot between a free electron and a free hole (although thisprocess does occur and can be observed through intrinsicluminescence emission when the recombination occurs byphoton emission), but between a free carrier of one typeand a trapped carrier of the other type, or between trappedcarriers of both types that have been trapped near eachother in the crystal.

There are three processes by which the excess energy of an excited carrier can be released during recombination:

1. Radiative recombination with emission of photons such that $\Delta E = \hbar\omega_{pt}$, where $\Delta E$ is the energy released during recombination and $\hbar\omega_{pt}$ is the energy of the photons generated.

2. Nonradiative recombination with emission of phonons such that $\Delta E = n\hbar\omega_{pn}$, where $\hbar\omega_{pn}$ is the energy of the phonons generated, and where normally it will require the simultaneous (or sequential, in the case of a Coulomb-attractive imperfection) release of n such phonons to release the energy $\Delta E$.

3. Nonradiative (Auger) recombination in which the energy is given up to another free carrier such that $\Delta E = E_{carrier}$, which raises it to a higher energy state in the band, from which it can again drop back to its lower energy with the emission of phonons. This process increases in probability with the density of free carriers present.

Photoelectronic effects are frequently described in terms of optical spectra. These are illustrated in Fig. 10. The major types of such spectra are as follows:

1. Absorption spectra, plotted as α versus $\hbar\omega_{inc}$, where $\hbar\omega_{inc}$ is the energy of the incident photons

2. Photoconductivity excitation spectra, plotted as photoconductivity $\Delta\sigma = \sigma_{light} - \sigma_{dark}$ versus $\hbar\omega_{inc}$

3. Luminescence excitation spectra, plotted as luminescence emission intensity at a particular photon energy as a function of $\hbar\omega_{inc}$

4. Luminescence emission spectra, plotted as luminescence emission intensity as a function of the photon energy $\hbar\omega_{emit}$ of the emitted light

FIGURE 10 Optical spectra for an illustrative situation. (a) Chosen band structure involving band-to-band transitions and an impurity with atomic ground state and excited state, in which it is assumed that only recombinations with energy EG or E′ are radiative; (b) optical absorption spectrum; (c) photoconductivity excitation spectrum; (d) extrinsic luminescence excitation spectrum, where Lext is the emission intensity of the extrinsic emission band; (e) intrinsic luminescence excitation spectrum, where Lint is the emission intensity of the intrinsic emission band; and (f) luminescence emission spectrum showing both extrinsic and intrinsic emission bands.

V. ELECTRICAL PROPERTIES

The electrical conductivity of different types of materials varies over a wide range, from values of the order of $10^{8}\ (\Omega\,\mathrm{m})^{-1}$ for metals to less than $10^{-14}\ (\Omega\,\mathrm{m})^{-1}$ for insulators. Semiconductors usually have a room-temperature conductivity of the order of $10^{2}\ (\Omega\,\mathrm{m})^{-1}$, although this value is strongly dependent on both the temperature and the purity of the semiconductor.

A. Ohm’s Law

The basic equation describing electrical properties in elementary discussions is Ohm's law, I = V/R, where I is the electrical current measured in a circuit with resistance R when a voltage difference V is applied. This relation can be rewritten to explicitly indicate the role of electrical conductivity, σ:

J = σE, (11)

where J is the electrical current density (current per unit area), and E is the electric field. Alternatively, we can write J = nqvd, where n is the density of free carriers, q is the charge per carrier, and vd is the drift velocity of the carriers caused by the electric field. We define a quantity called the carrier mobility µ so that

vd = µE; (12)

then the conductivity can be expressed as

σ = nqµ. (13)

If both electrons and holes contribute to the conductivity, then the total conductivity can be expressed as

σ = q(nµn + pµp), (14)

where p is the density of holes and µn and µp are, respectively, the mobilities of electrons and holes.
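As a small numerical illustration of Eqs. (11) through (14), the following Python sketch evaluates the conductivity, drift velocity, and current density. The function names and the carrier density, mobility, and field values are assumptions chosen only for illustration; they are not taken from the text.

```python
Q = 1.602176634e-19  # elementary charge (C)


def conductivity(n, mu_n, p=0.0, mu_p=0.0):
    """Eq. (14): sigma = q (n mu_n + p mu_p), in (ohm m)^-1 for SI inputs."""
    return Q * (n * mu_n + p * mu_p)


def drift_velocity(mu, field):
    """Eq. (12): v_d = mu E, in m/s for mu in m^2/V-s and E in V/m."""
    return mu * field


# Assumed, illustrative values: n = 1e21 m^-3, mu_n = 0.1 m^2/V-s, E = 1e3 V/m.
FIELD = 1.0e3
sigma = conductivity(n=1e21, mu_n=0.1)
v_d = drift_velocity(mu=0.1, field=FIELD)
current_density = sigma * FIELD  # Eq. (11): J = sigma E
print(sigma, v_d, current_density)
```

The same current density follows from J = nqvd, which is a quick consistency check on Eqs. (11) through (13).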

For Ohm's law to hold, it is necessary that neither n nor µ be a function of electric field. This requirement breaks down for high values of electric field, for which n may be increased by the electric field through mechanisms of impact ionization, field emission, or contact injection, or for which µ may become a function of electric field if the mobility is a function of carrier energy that in turn is increased by high electric fields.

The temperature dependence of electrical conductivity is determined by the temperature dependence of the free-carrier density and the temperature dependence of the free-carrier mobility. In a metal, the free-carrier density is independent of temperature, and therefore the temperature dependence of the conductivity for a metal arises totally from the temperature dependence of the mobility. The free-carrier density in a semiconductor or an insulator is thermally activated over a wide temperature range and therefore increases exponentially with temperature over this range. The temperature dependence of the mobility depends on the specific scattering process that limits the drift of carriers in an electric field.

B. Scattering and Mobility

The effect of an electric field on a distribution of free (or, to be more exact, quasi-free) carriers in a material is to shift the distribution so that more carriers have velocity in one direction than in the opposite. The magnitude of this shift is determined by the interaction between the effects of the electric field and the effects of a variety of scattering processes that act to return the distribution to its equilibrium condition. If a force qE acts on an electron for an average time τ, then the net shift in the distribution will be given by $qE\tau = \hbar\,\Delta k$ [see discussion preceding Eq. (6)], so that $\Delta k = qE\tau/\hbar$. We may consider τ to be the average time between scattering events, which acts in such a way that if n0 carriers are unscattered at t = 0, then the density of unscattered carriers n(t) at time t > 0 is given by dn(t)/dt = −n/τ, or n(t) = n0 exp(−t/τ).

Using this approach, we may rewrite the above equation for $\Delta k$ as qEτ = m∗vd, from which it follows that

µ = (q/m∗)τ. (15)

In general the scattering relaxation time is a function of energy, τ(E), and the quantity that enters Eq. (15) as τ is a suitable average of τ(E) over electron energies.

The specific form of τ(E) depends on the scattering mechanism. These mechanisms include scattering by acoustic lattice waves, optical lattice waves, charged imperfections, neutral imperfections, piezoelectric effects, dislocations, grain boundaries, surfaces, and inhomogeneities. Each scattering mechanism is characterized by its own temperature dependence of mobility. As examples of these processes, we consider scattering by acoustic lattice waves and charged imperfections.

Acoustic lattice scattering corresponds to scattering of free carriers by interaction with lattice atoms as they move due to thermal energy. The probability for scattering is proportional to the average energy in the lattice waves, that is, to kT. The mean free path for scattering by acoustic lattice waves (the average distance traveled between scattering events, equal to the product of the thermal velocity of the carrier $\nu_{th}$ and τ) is therefore proportional to $T^{-1}$. The relaxation time $\tau(E) \propto (T\nu_{th})^{-1}$. Since in a semiconductor $\nu_{th} = (2kT/m^*)^{1/2}$, it follows that $\tau \propto \mu \propto T^{-3/2}$. In a metal, on the other hand, scattering events are experienced only by electrons near the Fermi energy, and so the average value of $\tau(E) = \tau(E_F)$; as a result, $\tau \propto \mu \propto T^{-1}$.

Actually, in a metal this is true only at sufficiently high temperatures where scattering by acoustic lattice waves is elastic; in other words, the change in energy on scattering is small compared with the thermal energy of an electron, kT. At low temperatures (temperatures less than the Debye temperature $\theta_D$, which is defined approximately by $k\theta_D = \hbar\omega_{max}$, where $\hbar\omega_{max}$ is the maximum vibrational phonon energy) this is no longer true in a metal, and acoustic lattice wave scattering in a metal becomes inelastic; an appropriate calculation shows that over this range $\mu \propto T^{-5}$.

An illustration of the temperature dependence of the electrical resistivity ρ = 1/σ for a metal is given in Fig. 11, with data for silver with $\theta_D$ = 226 K. Above this temperature the resistivity varies linearly with T, whereas below this temperature ρ changes to a $T^5$ dependence, until at very low temperatures the resistivity is limited by impurity scattering. Since the total resistivity for a metal is equal to the sum of the resistivity due to lattice scattering and the resistivity due to impurities (Matthiessen's rule), a large value of $\rho_{300K}/\rho_{4K}$ indicates a pure metal.

FIGURE 11 Typical temperature dependence of electrical resistivity for silver with a Debye temperature of 226 K.

Scattering of electrons or holes by charged impurities is illustrated in Fig. 12, which shows that scattering of both electrons and holes by a charged impurity, regardless of the sign of the charge on the impurity (i.e., whether the Coulomb interaction is attractive or repulsive), can be treated in the same mathematical way. In a simple model we consider the scattering effect to be large only if the Coulomb-interaction energy is comparable to the thermal energy of the carrier, that is, if

$$\frac{Zq^2}{4\pi\varepsilon_r\varepsilon_0 r_s} = kT, \tag{16}$$

where Zq is the charge on the impurity, εr is the relative dielectric constant of the material, and rs is a critical radius defined by Eq. (16). The effective scattering cross section is $S_I = \pi r_s^2$ and is given by

$$S_I = \frac{Z^2q^4}{16\pi\varepsilon_r^2\varepsilon_0^2k^2T^2}. \tag{17}$$

The physical meaning of SI is that if the carrier comes within the area SI of the scattering center, then scattering occurs; if not, then no scattering occurs. If there are NI charged impurities, then the rate of scattering is given by 1/τ = NI SI ν, where ν is the thermal velocity of the carrier. Since $\nu \propto T^{1/2}$, it follows that $\tau \propto \mu \propto T^{3/2}$. At room temperature and for εr = 10, $S_I \approx 10^{-12}\ \mathrm{cm}^2$; for $N_I = 10^{17}\ \mathrm{cm}^{-3}$, $\tau = 10^{-13}$ sec.
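The order-of-magnitude estimate for SI can be reproduced with a short Python sketch of Eqs. (16) and (17). The function names are hypothetical, and the thermal velocity passed to the relaxation-time helper is an input the reader must supply, since the text does not fix the effective mass used for its estimate of τ.

```python
import math

Q = 1.602176634e-19      # elementary charge (C)
EPS0 = 8.8541878128e-12  # vacuum permittivity (F/m)
K_B = 1.380649e-23       # Boltzmann constant (J/K)


def critical_radius(T, eps_r, Z=1):
    """Eq. (16) solved for r_s, in meters."""
    return Z * Q**2 / (4.0 * math.pi * eps_r * EPS0 * K_B * T)


def cross_section(T, eps_r, Z=1):
    """S_I = pi r_s^2, in m^2; algebraically identical to Eq. (17)."""
    return math.pi * critical_radius(T, eps_r, Z) ** 2


def relaxation_time(N_I, T, eps_r, v_th, Z=1):
    """1/tau = N_I S_I v, with N_I in m^-3 and v_th in m/s (v_th is assumed)."""
    return 1.0 / (N_I * cross_section(T, eps_r, Z) * v_th)


S_cm2 = cross_section(T=300.0, eps_r=10.0) * 1.0e4  # convert m^2 to cm^2
print(S_cm2)  # of the order of 1e-12 cm^2, as quoted in the text
```

Halving the temperature quadruples the cross section, which is the $T^{-2}$ dependence of Eq. (17).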

FIGURE 12 Scattering of an electron by a positively charged impurity by Coulomb attraction, and scattering of a hole by a positively charged impurity by Coulomb repulsion. The hyperbolic paths of electrons and holes are mirror images of each other.

If a semiconductor has scattering by both acoustic lattice waves and charged impurities, then the scattering rates add and

$$1/\mu \approx 1/\mu_L + 1/\mu_I, \tag{18}$$

where µL is the mobility determined by lattice scattering and µI is the mobility determined by charged-impurity scattering. The relationship of Eq. (18) is only approximate, since the need for different averaging procedures in calculating the various mobilities introduces a correction factor of order unity. A typical situation is illustrated in Fig. 13. If $\mu_L = AT^{-3/2}$ and $\mu_I = BT^{3/2}$, a maximum mobility occurs for $T = (A/B)^{1/3}$.

FIGURE 13 Temperature dependence of mobility in a semiconductor with scattering by both acoustic lattice waves and charged imperfections, which results in a maximum mobility at a particular temperature.
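The mobility maximum implied by Eq. (18) can be checked numerically. The sketch below is illustrative only: the prefactors A and B are hypothetical values chosen so that the maximum falls at 300 K, and the function names are our own.

```python
def combined_mobility(T, A, B):
    """Eq. (18) with mu_L = A T^(-3/2) and mu_I = B T^(3/2)."""
    mu_L = A * T ** -1.5
    mu_I = B * T ** 1.5
    return 1.0 / (1.0 / mu_L + 1.0 / mu_I)


def peak_temperature(A, B):
    """Temperature of maximum mobility, T = (A/B)^(1/3)."""
    return (A / B) ** (1.0 / 3.0)


# Hypothetical prefactors, chosen so that (A/B)^(1/3) = 300 K.
A, B = 2.7e3, 1.0e-4
T_peak = peak_temperature(A, B)
for T in (0.5 * T_peak, T_peak, 2.0 * T_peak):
    print(T, combined_mobility(T, A, B))
```

At the maximum the two contributions are equal, so the combined mobility there is half of either one.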

Scattering by optical lattice waves is an inelastic process in semiconductors, since the optical phonon energy is larger in general than kT. Since the scattering probability is proportional to the density of optical phonons present, the temperature dependence for scattering by optical modes is approximately given by the Bose-Einstein distribution for optical phonons of energy $\hbar\omega_{pn}$ at a temperature T: $\mu_0 \propto [\exp(\hbar\omega_{pn}/kT) - 1]$.

Electron mobilities are usually larger than hole mobilities in semiconductors because $m_e^* < m_h^*$. Electron mobilities at room temperature range from about $10^{-2}$ m²/V-s (ZnS) to almost 10 m²/V-s (InSb).

C. Imperfections in Semiconductors

In most practical semiconductors, the electrical conductivity is controlled not by thermal excitation across the band gap of the material, but by thermal excitation from localized imperfections. The electrical behavior of imperfections depends on their location in the crystal and their effective number of valence electrons.

The simplest kind of imperfection is one that differs by one in valence from the atom for which it substitutes in the crystal lattice. Such imperfections give rise to localized energy levels in the forbidden gap of the semiconductor lying close to either the conduction or the valence band edge. They can be treated approximately as if they were miniature hydrogenic systems, corrected for the effective mass and the dielectric constant of the semiconductor.

FIGURE 14 Localized donor and acceptor levels in a semiconductor, showing the donor ionization energy (Ec − ED) and the acceptor ionization energy (EA − Ev).

Such imperfections can be conveniently divided into donors and acceptors. A donor has one more valence electron than the atom for which it substitutes, is able to give up this electron to the conduction band if sufficient thermal excitation energy is available, and is neutral when electron-occupied and positive when ionized. Figure 14 shows typical donor and acceptor energy levels in a band diagram. The amount of energy required to free an electron from the donor is the ionization energy of the donor. The donor energy level is located at an energy ED so that the energy difference (Ec − ED) is the ionization energy of the donor. Ionization of a donor can be described by the equation $D^0 \rightarrow D^+ + e^-$.

An acceptor has one fewer valence electron than the atom for which it substitutes, is able to give up a hole to the valence band if sufficient thermal excitation energy is available, and is neutral when hole-occupied and negative when ionized (electron-occupied). The amount of energy required to free a hole from the acceptor is the ionization energy of the acceptor. The acceptor energy level is located at an energy EA so that the energy difference (EA − Ev) is the ionization energy of the acceptor. Ionization of an acceptor can be described by the equation $A^0 \rightarrow A^- + h^+$. Simple donors and acceptors of this type usually have ionization energies between about 10 and 30 meV.
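The hydrogenic picture mentioned above can be made concrete with a rough estimate: the hydrogen ionization energy (13.6 eV) and Bohr radius are scaled by the effective-mass ratio and the relative dielectric constant. The sketch below assumes this standard scaling; the function names and the sample values m∗/m = 0.2 and εr = 12 are illustrative assumptions, not data from the text.

```python
RYDBERG_EV = 13.605693       # ionization energy of hydrogen (eV)
BOHR_RADIUS_NM = 0.0529177   # Bohr radius of hydrogen (nm)


def hydrogenic_ionization_energy(m_eff_ratio, eps_r):
    """Ionization energy (eV) of a hydrogen-like impurity: 13.6 (m*/m) / eps_r^2."""
    return RYDBERG_EV * m_eff_ratio / eps_r**2


def hydrogenic_radius(m_eff_ratio, eps_r):
    """Orbit radius (nm) of the bound carrier: a_0 eps_r / (m*/m)."""
    return BOHR_RADIUS_NM * eps_r / m_eff_ratio


# Assumed, illustrative parameters: m*/m = 0.2 and eps_r = 12.
E_meV = 1.0e3 * hydrogenic_ionization_energy(0.2, 12.0)
radius_nm = hydrogenic_radius(0.2, 12.0)
print(E_meV, radius_nm)
```

For these assumed parameters the estimate falls within the 10 to 30 meV range quoted above, and the orbit extends over many lattice spacings, which is why the macroscopic dielectric constant is a reasonable correction.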

When donors and acceptors are simultaneously present in the same semiconductor, the electrons donated by the donors may be accepted by the acceptors. By this process, ionized donors and acceptors are produced without the corresponding free electrons or holes. When this happens, the semiconductor is said to be compensated.

An imperfection differing in valence by more than one from the atom for which it substitutes usually gives rise to levels that lie near the middle of the forbidden gap. Different energy levels are needed to describe different states of ionization of the imperfection. For example, Zn (2 valence electrons) in Si (4 valence electrons) can accept either one electron (becoming a $\mathrm{Zn}^{-}$ state) or two electrons (becoming a $\mathrm{Zn}^{2-}$ state); the energy level corresponding to $\mathrm{Zn}^{2-}$ lies 0.58 eV above the valence band in Si (with a band gap of 1.1 eV), whereas the level corresponding to $\mathrm{Zn}^{-}$ lies 0.33 eV above the valence band in Si. In this kind of discussion the charges given on the Zn represent differences in charge between the site occupied by a Zn and the normal site in the Si lattice.

D. Fermi Level in Semiconductors

In a nondegenerate semiconductor (one in which the Fermi level lies in the forbidden gap more than kT away from a band edge), the location of the Fermi level can be calculated in a way similar to that used for a metal; that is, by calculating the total free-electron density n in the conduction band by integrating $n = \int N(E) f(E)\,dE$ from the bottom of the conduction band at Ec to the top of the conduction band, which can be taken effectively as infinity. In this case f(E) can be expressed by a simple Boltzmann factor, since the energies of interest in the conduction band lie well above the Fermi level ($E - E_F \gg kT$). The result is that

n = Nc exp[−(Ec − EF)/kT ], (19)

where Nc is called the effective density of states in the conduction band and is given by $N_c = 2(2\pi m_e^* kT/h^2)^{3/2}$. By performing a similar calculation for the free-hole density by integrating over the valence band, we obtain the result that

p = Nv exp[−(EF − Ev)/kT ], (20)

where $N_v = 2(2\pi m_h^* kT/h^2)^{3/2}$ is called the effective density of states in the valence band. When n becomes comparable to Nc (or p to Nv), the Fermi level lies in the conduction band (valence band) and the semiconductor is said to be degenerate. The consequence is that the full Fermi function, and not just the Boltzmann function tail, must be used to calculate the occupancy of band states.

Equations (19) and (20) show that if any two of the following three quantities are known (free-carrier density, EF, or T), the third can be calculated. The product of Eqs. (19) and (20) gives a constant for the material at a given temperature:

np = Nc Nv exp(−EG/kT ), (21)

since (Ec − Ev) = EG, the band gap of the semiconductor.
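A minimal Python sketch of Eqs. (19) through (21) follows. It evaluates the effective densities of states and shows that the product np does not depend on where the Fermi level lies. The effective-mass ratios and band gap used in the example are assumed round numbers for illustration, not values given in the text.

```python
import math

H = 6.62607015e-34      # Planck constant (J s)
K_B = 1.380649e-23      # Boltzmann constant (J/K)
M0 = 9.1093837015e-31   # free-electron mass (kg)
Q = 1.602176634e-19     # elementary charge (C), used to convert eV to J


def effective_density_of_states(m_eff_ratio, T):
    """N = 2 (2 pi m* k T / h^2)^(3/2), in m^-3."""
    return 2.0 * (2.0 * math.pi * m_eff_ratio * M0 * K_B * T / H**2) ** 1.5


def boltzmann_density(N_band, energy_gap_eV, T):
    """Eqs. (19)/(20): N exp(-dE/kT), with dE = Ec - EF or EF - Ev in eV."""
    return N_band * math.exp(-energy_gap_eV * Q / (K_B * T))


def np_product(m_e_ratio, m_h_ratio, E_G_eV, T):
    """Eq. (21): np = Nc Nv exp(-EG/kT)."""
    Nc = effective_density_of_states(m_e_ratio, T)
    Nv = effective_density_of_states(m_h_ratio, T)
    return Nc * Nv * math.exp(-E_G_eV * Q / (K_B * T))


# Assumed, illustrative parameters: m_e*/m = 1.0, m_h*/m = 0.5, EG = 1.1 eV.
T = 300.0
Nc = effective_density_of_states(1.0, T)
Nv = effective_density_of_states(0.5, T)
n_i = math.sqrt(np_product(1.0, 0.5, 1.1, T))  # intrinsic density, n = p = n_i
print(Nc, Nv, n_i)
```

Moving the Fermi level within the gap raises one carrier density and lowers the other by the same factor, which is why the product is fixed at a given temperature.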

E. Electrical Conductivity in Semiconductors

In an intrinsic semiconductor (a semiconductor with properties controlled by excitation and recombination across the band gap rather than through imperfections), the temperature dependence of the Fermi level (and hence the temperature dependence of the free-carrier densities) can be calculated simply by equating Eqs. (19) and (20), since in an intrinsic semiconductor n = p = ni. The result is that

$$(E_c - E_F) = \frac{E_G}{2} + \frac{3kT}{4}\ln\!\left(\frac{m_e^*}{m_h^*}\right). \tag{22}$$

In an extrinsic semiconductor (a semiconductor with properties controlled by excitation and recombination through imperfections), the situation may be a little more complicated, but a simple rule may be stated that covers all cases. In the intrinsic case above, the requirement that n = p may be recognized as a condition of charge neutrality. This condition of charge neutrality may be used in general to determine the temperature dependence of the Fermi level (and hence of the free-carrier density and the conductivity).

Consider the case shown in Fig. 14 with one kind of donor and one kind of acceptor imperfection present in the material. The general charge-neutrality condition that governs this case is given by

n + nA = (ND − nD) + p, (23)

where the negative species on the left are the free-electron density and the density of ionized (electron-occupied) acceptors, and the positive species on the right are the density of ionized (electron-unoccupied) donors and the free-hole density. Each of these terms can be written in terms of the location of the Fermi level, and hence the Fermi-level position satisfying the equation can be readily determined by a computer even if there are many different donors and acceptors present.

The information needed to accomplish this lies in the appropriate expressions for electron-occupied donors and hole-occupied acceptors. These may be summarized as follows:

$$n_D = N_D\left\{\tfrac{1}{2}\exp[(E_D - E_F)/kT] + 1\right\}^{-1} \tag{24a}$$

and

$$(N_A - n_A) = N_A\left\{\tfrac{1}{2}\exp[(E_F - E_A)/kT] + 1\right\}^{-1}. \tag{24b}$$

In these equations ND is the total density of donors with an ionization energy of (Ec − ED), and NA is the total density of acceptors with an ionization energy of (EA − Ev). Notice that the expressions are similar only if we compare the expression for electron-occupied donors with that for hole-occupied acceptors. Equations (24) are what would be expected from our previous discussion of the Fermi distribution, except for the additional degeneracy factor of 1/2 that appears in Eqs. (24). This factor arises because donor and acceptor states can accommodate an electron with either of two possible spin orientations; taking this into account leads to the insertion of the factors of 1/2 shown in Eqs. (24).
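The statement that the Fermi level satisfying Eq. (23) can be found by computer may be illustrated by a simple bisection search, using Eqs. (19), (20), and (24) for the individual terms. This is only a sketch under stated assumptions: the function names are hypothetical, the Boltzmann forms restrict it to the nondegenerate case, and the numerical parameters (densities of states, level positions, doping) are assumed values of silicon-like magnitude rather than data from the text. Energies are in eV measured from the valence-band edge.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant (eV/K)


def charge_imbalance(E_F, T, Nc, Nv, E_c, E_v, N_D, E_D, N_A, E_A):
    """Left side minus right side of Eq. (23); zero at the true Fermi level."""
    kT = K_B_EV * T
    n = Nc * math.exp(-(E_c - E_F) / kT)                      # Eq. (19)
    p = Nv * math.exp(-(E_F - E_v) / kT)                      # Eq. (20)
    n_D = N_D / (0.5 * math.exp((E_D - E_F) / kT) + 1.0)      # Eq. (24a)
    n_A = N_A - N_A / (0.5 * math.exp((E_F - E_A) / kT) + 1.0)  # Eq. (24b)
    return (n + n_A) - ((N_D - n_D) + p)


def solve_fermi_level(T, Nc, Nv, E_c, E_v, N_D, E_D, N_A, E_A, tol=1e-9):
    """Bisection between the band edges; the imbalance rises monotonically with E_F."""
    lo, hi = E_v, E_c
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if charge_imbalance(mid, T, Nc, Nv, E_c, E_v, N_D, E_D, N_A, E_A) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Assumed parameters: donors only, 45 meV below Ec, at room temperature.
params = dict(T=300.0, Nc=2.8e25, Nv=1.0e25, E_c=1.12, E_v=0.0,
              N_D=1.0e22, E_D=1.075, N_A=0.0, E_A=0.045)
E_F = solve_fermi_level(**params)
n = params["Nc"] * math.exp(-(params["E_c"] - E_F) / (K_B_EV * params["T"]))
print(E_F, n / params["N_D"])
```

Bisection is used because every term in Eq. (23) varies monotonically with EF, so a single sign change is guaranteed between the band edges. Additional donor or acceptor species would simply contribute further terms of the form of Eqs. (24) to the imbalance.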

The general statement of Eq. (23) can be simplified in several special cases with physical significance. For an intrinsic material, nA = (ND − nD) = 0 and n = p, as we have seen. If only donors are present, then nA = 0 and p can be neglected in the extrinsic conductivity range, so that n = (ND − nD). Similarly, if only acceptors are present, then (ND − nD) = 0 and n can be neglected in the extrinsic conductivity range, so that p = nA. If both donors and acceptors are present in approximately equal densities, so that almost complete compensation occurs, then nA = (ND − nD).

Analytical results can be obtained for simple cases. As an example, consider the case of donors only. Two ranges may be defined. In the high-temperature range where all the donors are ionized, n = ND and (Ec − EF) = kT ln(Nc/ND) from Eq. (19). In the low-temperature range, the donors are only partially ionized and

(Ec − EF) = (Ec − ED)/2 + (kT/2) ln(2Nc/ND) (25a)

with

n = (Nc ND/2)1/2 exp[−(Ec − ED)/2kT ]. (25b)

Therefore a plot of $\ln(nT^{-3/4})$ versus 1/T yields a straight line at low temperatures with a slope proportional to one-half the donor ionization energy. These results are summarized in Fig. 15.
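The two temperature ranges for the donors-only case can be joined in a single expression. Eliminating EF between Eq. (19) and the condition n = (ND − nD), with nD from Eq. (24a), gives the quadratic $n^2 + Kn - KN_D = 0$ with $K = (N_c/2)\exp[-(E_c - E_D)/kT]$; this algebra is our own rearrangement rather than a result stated in the text. The Python sketch below solves it and compares the result with Eq. (25b); the effective-mass ratio, donor density, and ionization energy are assumed illustrative values.

```python
import math

H = 6.62607015e-34      # Planck constant (J s)
K_B = 1.380649e-23      # Boltzmann constant (J/K)
M0 = 9.1093837015e-31   # free-electron mass (kg)
Q = 1.602176634e-19     # elementary charge (C), used to convert eV to J


def conduction_band_dos(m_eff_ratio, T):
    """Nc = 2 (2 pi m* k T / h^2)^(3/2), in m^-3."""
    return 2.0 * (2.0 * math.pi * m_eff_ratio * M0 * K_B * T / H**2) ** 1.5


def electron_density_donors_only(N_D, E_ion_eV, T, m_eff_ratio):
    """Positive root of n^2 + K n - K N_D = 0, written in a cancellation-free form."""
    K = 0.5 * conduction_band_dos(m_eff_ratio, T) * math.exp(-E_ion_eV * Q / (K_B * T))
    return 2.0 * K * N_D / (K + math.sqrt(K * K + 4.0 * K * N_D))


def low_temperature_limit(N_D, E_ion_eV, T, m_eff_ratio):
    """Eq. (25b): n = (Nc N_D / 2)^(1/2) exp[-(Ec - ED)/2kT]."""
    Nc = conduction_band_dos(m_eff_ratio, T)
    return math.sqrt(0.5 * Nc * N_D) * math.exp(-E_ion_eV * Q / (2.0 * K_B * T))


# Assumed parameters: N_D = 1e22 m^-3, (Ec - ED) = 45 meV, m*/m = 1.08.
for T in (30.0, 100.0, 300.0):
    exact = electron_density_donors_only(1.0e22, 0.045, T, 1.08)
    approx = low_temperature_limit(1.0e22, 0.045, T, 1.08)
    print(T, exact, approx)
```

At the lowest temperature the two results agree closely, while near room temperature the full expression saturates at n ≈ ND and Eq. (25b) no longer applies.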

The case of one kind of donor with density ND and one kind of acceptor with density NA can also be solved analytically. Consider first the case where ND > NA. At low temperatures for which $n \ll N_A$ or $(N_D - N_A)$, a plot of $\ln(nT^{-3/2})$ versus 1/T has an activation energy of (Ec − ED). For intermediate temperatures for which $N_A \ll n \ll (N_D - N_A)$, a plot of $\ln(nT^{-3/4})$ versus 1/T has an activation energy of (Ec − ED)/2. Finally, all of the uncompensated donors are ionized at high temperatures and n = (ND − NA). If ND = NA, we have exact compensation, and the Fermi level must lie halfway between the donor and acceptor levels to achieve charge neutrality; the observed activation energy is therefore (ED + EA)/2.

FIGURE 15 Electron density and conductivity as a function of temperature for a semiconductor with donors only, with indicated activation energies. The slopes for the ln σ versus 1/T plot differ slightly from those of the $\ln(nT^{-3/4})$ versus 1/T plot because of the temperature dependence of Nc and µn in the extrinsic range. In the intrinsic range, none of the slopes seen will be exactly $-E_G/2k$ because in this range a plot should be made of $\ln(nT^{-3/2})$ versus 1/T.

VI. GALVANOMAGNETOTHERMOELECTRIC EFFECTS

When any two of the following interact (an electric field, a magnetic field, and a thermal gradient), a number of electrical effects result. Some of these are small second-order effects (Nernst, Ettingshausen, and Righi-Leduc effects). Here, we consider three of the major effects: the Hall effect, magnetoresistance, and thermoelectric power.

A. Cyclotron Resonance Frequency

Classically, the effect of a magnetic field on free electrons can be readily described. The force exerted on a moving electric charge q with velocity v by a magnetic field B is given by qv × B. This force causes an electron to move in a circular orbit in the plane orthogonal to B, that is, in the x–y plane for Bz. The radius of the circular orbit can be determined by equating the magnetic force to the centrifugal force of the circular motion: $m_e^*\nu^2/r = q\nu B_z$. The angular frequency $\omega_c = \nu/r$ is called the cyclotron frequency and is given by

$$\omega_c = qB_z/m_e^*. \tag{26}$$

If a circularly polarized electric field with frequency ω = ωc is applied to such a free electron in a magnetic field, resonant absorption takes place. This effect provides a direct method for the measurement of the effective mass of carriers.
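The use of Eq. (26) to extract an effective mass can be sketched in a few lines of Python. The function names are hypothetical, and the field strength and effective-mass ratio in the example are assumed values for illustration.

```python
Q = 1.602176634e-19     # elementary charge (C)
M0 = 9.1093837015e-31   # free-electron mass (kg)


def cyclotron_frequency(B, m_eff):
    """Eq. (26): omega_c = q B / m*, in rad/s for B in tesla and m* in kg."""
    return Q * B / m_eff


def effective_mass_from_resonance(B, omega_res):
    """Invert Eq. (26): m* = q B / omega_c at the observed resonance."""
    return Q * B / omega_res


# Assumed example: m* = 0.2 m in a field of 1 T.
omega_c = cyclotron_frequency(1.0, 0.2 * M0)
m_ratio = effective_mass_from_resonance(1.0, omega_c) / M0
print(omega_c, m_ratio)
```

In practice the field or the frequency is swept until absorption peaks, and the resonance condition then fixes m∗ directly.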

For this to be possible, the product $\omega_c\tau \gg 1$, where τ is the relaxation time for scattering. This in turn implies that large values of Bz must be used, which puts the problem out of the classical realm and into the quantum-mechanical realm, where Schrödinger's equation must now be solved with an energy term involving the magnetic field. The net result is that the resonant absorption that occurs when $\hbar\omega = \hbar\omega_c$ corresponds in the quantum-mechanical picture to a transition between two of the quantized band states that result from the high magnetic field.

B. Hall Effect

In the Hall effect, an electric field and a magnetic field applied at right angles to a material with either free electrons or free holes produce an electric field in the third orthogonal direction, in order to produce zero current in that direction. A magnetic field Bz in the presence of an electric field Ex producing a current density Jx causes electrons to acquire velocity in the y-direction. Since there can be no net current flow in the y-direction, however, it follows that an electric field Ey will form such that qEy = qνxBz. Then the induced Hall field Ey can be measured and the carrier density deduced from it:

Ey = ±(1/nq)JxBz . (27)

The polarity of Ey indicated by the choice of sign tells whether the carriers are electrons (negative sign) or holes (positive sign). Two commonly defined parameters are the Hall coefficient RH = ±1/nq and the Hall mobility µH = σRH = ±µ.

The Hall mobility, always defined as µH = σRH, is a second kind of mobility, in addition to the conductivity mobility defined in Eq. (15). A third kind of mobility often referred to is the drift mobility µd = d/Et, obtained from a direct measurement of the time t required for carriers to travel a distance d in the material under an electric field E. If there are localized trapping states in the material associated with imperfections, a carrier injected at x = 0 may spend a major portion of its time in the material in a trapped state rather than in a free state, and therefore may spend much longer in reaching the detection point at x = d than simply d/(µcon E). We may express the drift mobility in terms of the conductivity mobility, the density of free carriers n, and the density of trapped carriers nt by

µd = µcon[n/(n + nt )]. (28)

If both electrons and holes are present at the same time, the possibility exists for exact cancellation, since the value of Ey associated with the electrons is of opposite sign to that associated with the holes. By considering the total current due to both electrons and holes, we find that in general (for small values of Bz),

$$E_y = \left[\frac{q\,(p\mu_p^2 - n\mu_n^2)}{\sigma^2}\right] J_x B_z, \tag{29a}$$

$$R_H = \frac{E_y}{J_x B_z} = \frac{p\mu_p^2 - n\mu_n^2}{q\,(p\mu_p + n\mu_n)^2}, \tag{29b}$$

and

$$\mu_H = \sigma R_H = \frac{p\mu_p^2 - n\mu_n^2}{p\mu_p + n\mu_n}. \tag{29c}$$

The Hall coefficient and the Hall mobility both become zero if $p\mu_p^2 = n\mu_n^2$.
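The two-carrier expressions of Eqs. (29b) and (29c) are easy to evaluate numerically. The following Python sketch uses hypothetical function names and assumed carrier densities and mobilities; it recovers the single-carrier result RH = −1/nq when no holes are present and shows the cancellation condition.

```python
Q = 1.602176634e-19  # elementary charge (C)


def hall_coefficient(n, p, mu_n, mu_p):
    """Eq. (29b): R_H = (p mu_p^2 - n mu_n^2) / [q (p mu_p + n mu_n)^2], in m^3/C."""
    return (p * mu_p**2 - n * mu_n**2) / (Q * (p * mu_p + n * mu_n) ** 2)


def hall_mobility(n, p, mu_n, mu_p):
    """Eq. (29c): mu_H = (p mu_p^2 - n mu_n^2) / (p mu_p + n mu_n), in m^2/V-s."""
    return (p * mu_p**2 - n * mu_n**2) / (p * mu_p + n * mu_n)


# Assumed values: electrons only, then a mixture chosen so that p mu_p^2 = n mu_n^2.
r_electrons = hall_coefficient(n=1e20, p=0.0, mu_n=0.5, mu_p=0.25)
r_cancelled = hall_coefficient(n=1e20, p=4e20, mu_n=0.5, mu_p=0.25)
print(r_electrons, r_cancelled)
```

Because the mobilities enter squared, a minority of high-mobility carriers can dominate the sign of the Hall coefficient even when the other carrier type is more numerous.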

C. Magnetoresistance

Magnetoresistance is the change in resistance corresponding to an applied electric field because of the effects of a simultaneously applied magnetic field. In the Hall-effect geometry, to be specific, it means a change in electrical resistance corresponding to an electric field Ex because of Bz. To bring about Jy = 0 for all the carriers by the development of a specific value of Ey, as described above for the Hall effect, essentially requires that all the carriers have identical properties. When all the carriers do not have identical properties (e.g., some are electrons and some holes, the scattering relaxation time varies with energy and carriers have a range of energies, or the energy-band structure is such that the effective mass is not a scalar), the development of Ey cannot remove all of the y-component of velocity for all the carriers. Some carriers travel a relatively greater distance than others between end contacts, and an increase in resistance results. Although the effect is small, it has been of research interest because of the dependence of magnetoresistance effects on various kinds of energy-band structures.

D. Thermoelectric (Seebeck) Effect

The application of a temperature gradient to a material causes the average energy of free carriers at the hot end to increase, which thus establishes a concentration gradient along the material. Diffusion associated with this concentration gradient is counteracted by the buildup of an electric field due to the displaced charge, to satisfy the condition that the total current be zero. The magnitude of the voltage per degree of temperature difference is called α, the thermoelectric power in V/K. The effect is much smaller in a metal than in a semiconductor, being of the order of microvolts per degree in the former and of a millivolt per degree in the latter. Typical expressions for α are as follows:

$$\text{Metal:}\quad \alpha = \frac{\pi^2k^2T}{qE_F}, \tag{30a}$$

$$\text{n-type semiconductor:}\quad \alpha = -(k/q)\,[A + (E_c - E_F)/kT], \tag{30b}$$

and

$$\text{p-type semiconductor:}\quad \alpha = +(k/q)\,[A + (E_F - E_v)/kT]. \tag{30c}$$

Here A is a constant depending on the specific scattering mechanism for the free carriers involved; for acoustic lattice scattering A = 2, and for charged-impurity scattering A = 4. Since the thermoelectric power in a semiconductor provides knowledge of the location of the Fermi level at a given temperature, it can also be used to provide direct information about the magnitude of the free-carrier density in the semiconductor.
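The link between thermoelectric power, Fermi-level position, and carrier density can be sketched as follows for an n-type semiconductor, using Eq. (30b) together with Eq. (19). The function names, the Fermi-level offset of 0.2 eV, and the value of Nc are assumptions made for illustration only.

```python
import math

K_OVER_Q = 8.617333262e-5  # Boltzmann constant over elementary charge (V/K)


def seebeck_n_type(ec_minus_ef_eV, T, A=2.0):
    """Eq. (30b): alpha = -(k/q)[A + (Ec - EF)/kT], in V/K."""
    return -K_OVER_Q * (A + ec_minus_ef_eV / (K_OVER_Q * T))


def fermi_offset_from_seebeck(alpha, T, A=2.0):
    """Invert Eq. (30b) to obtain (Ec - EF) in eV from a measured alpha."""
    return (-alpha / K_OVER_Q - A) * K_OVER_Q * T


def electron_density(Nc, ec_minus_ef_eV, T):
    """Eq. (19): n = Nc exp[-(Ec - EF)/kT]."""
    return Nc * math.exp(-ec_minus_ef_eV / (K_OVER_Q * T))


# Assumed example: (Ec - EF) = 0.2 eV at 300 K with acoustic lattice scattering (A = 2).
alpha = seebeck_n_type(0.2, 300.0)
offset = fermi_offset_from_seebeck(alpha, 300.0)
n = electron_density(2.8e25, offset, 300.0)  # Nc = 2.8e25 m^-3 is an assumed value
print(alpha, offset, n)
```

The magnitude obtained is a little under a millivolt per degree, consistent with the order of magnitude quoted above for semiconductors.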

VII. AMORPHOUS SEMICONDUCTORS

Amorphous semiconductors are a class of semiconducting materials that do not show the long-range order typical of the crystalline materials with a periodic potential that are the primary subject of this article. Amorphous materials are generally made by one of three methods: (1) deposition from the vapor phase, (2) cooling from a liquid melt (the resulting materials are called "glasses"), and (3) transformation of a crystalline solid by particle bombardment, oxidation, and so forth.

There are three general categories of amorphous materials: (1) covalent solids such as tetrahedral films of Group IV elements, or III–V materials; tetrahedral glasses formed from II–IV–V ternary materials (e.g., CdGeAs2); or chalcogenide glasses formed from the Group VI elements, or IV–V–VI binary and ternary materials; (2) oxide glasses such as V2O5–P2O5 that have ionic bonds and show electrical conductivity between different valence states of the transition metal ion; and (3) dielectric films such as SiOx and Al2O3.

Because of the lack of long-range order, amorphous semiconductors do not show the same kind of density-of-states band gap as described above for crystalline semiconductors. A schematic density-of-states band diagram for an amorphous semiconductor is given in Fig. 16. The density of states continues through the points Ev and Ec that would mark the band gap in a typical crystalline material, but the mode of transport changes from that characteristic of extended states (somewhat like that in the allowed bands of a crystalline material) for electrons above Ec (holes below Ev) to that characteristic of localized states for electrons immediately below Ec (holes immediately above Ev). Since the mobility for transport in extended states is appreciably larger than for that in localized states (where transport must be by some kind of localized hopping from one state to another), we may say that the crystalline density-of-states gap has become a mobility gap in an amorphous material. Furthermore, the density of localized states remains very high throughout the gap in a typical amorphous semiconductor, to the extent that doping of the semiconductor (changes in its free-carrier density by impurity incorporation) is essentially impossible.

FIGURE 16 Mott-Davis energy-band model for an amorphous semiconductor, with extended states (E < Ev and E > Ec), localized tail states (Ev < E < EB and EA < E < Ec), and localized defect states (EB < E < EA).

In the early 1970s a major breakthrough occurred when W. E. Spear and P. G. LeComber showed that if hydrogen was incorporated into amorphous silicon, the density of the localized states in the gap could be greatly decreased, making it possible to control the conductivity of a-Si:H by incorporation of impurities in the manner normally expected of a semiconductor. Such a-Si:H is found to have a tail of localized states decreasing exponentially in density with energy separation from either the electron or the hole mobility edge, and a residual density of defects in the center of the gap that may be as low as $10^{15}$ cm$^{-3}$ rather than the value of $10^{18}$ cm$^{-3}$ or higher typical of other amorphous semiconductors. The possibility of a thin-film material with electronic properties at least resembling those of crystalline silicon has led to a large research and development effort with applications of the a-Si:H material to solar cells, electrophotography, thin-film transistors, solid-state image sensors, optical recording, and a variety of amorphous junction devices.

Two additional properties of a-Si:H are of interest: (1) an increase in localized defect density caused by doping, and (2) an increase in localized defect density caused by illumination (the Staebler-Wronski effect), a reversible effect that can be removed by low-temperature annealing but constitutes a degradation mechanism for a-Si:H devices.

VIII. SUPERCONDUCTORS

In 1911 H. K. Onnes showed that the electrical resistance of mercury suddenly vanished when the temperature was reduced below 4.15 K, and a new state of matter was discovered. It has been subsequently found that a number of metals and alloys show this zero resistance at sufficiently low temperatures, and a search has been under way since then to produce materials with higher critical temperature, the temperature below which superconductivity becomes possible. Superconductors are characterized not only by a critical temperature, but also by a critical magnetic field and a critical current density, none of which can be exceeded if the superconducting state is to be retained.

The accepted theory that describes superconductivity in these materials was developed in the 1950s by Bardeen, Cooper, and Schrieffer (the BCS theory). It proposes that in certain metals a new state of matter is possible at low temperatures, which results from an attraction between pairs of electrons through a phonon interaction that overcomes the Coulomb repulsion between them. This new state has an energy lower than that of the normal E = 0 state of free electrons and is separated from it by a superconducting energy gap that is larger than the energy of phonons available for scattering at this low temperature. Scattering therefore ceases, since there are no energy-conserving final states for the scattering transition; the scattering relaxation time becomes infinite; and the resistance goes to zero.

A search for materials with higher critical temperature had extended the range up to 22.3 K for Nb3Ge by 1973, an increase of only 4 K over the previous 20 years of effort. Suddenly in 1987 new types of superconducting materials were discovered, not in the family of metals and alloys, but in a variety of materials based on copper oxide. These materials have critical temperatures much higher than any found earlier. The materials involved include such complex systems as La2−xBaxCuO4 and variations produced by replacing La with Y and replacing Ba by Sr or Ca. A representative material is YBa2Cu3O7, which has a critical temperature of 95 K. Since the temperature of liquid nitrogen is 77 K, this material has a critical temperature well into the range where practical applications become feasible, provided that a variety of materials problems can be solved.

A search continues for an adequate theoretical description of these materials that is able to account for the much higher values of critical temperature. The central need is for a mechanism that provides stronger attractive coupling of Cooper pairs than the conventional electron–phonon interaction.

IX. CONDUCTING POLYMERS

Inorganic metals, semiconductors, and insulators are not the only materials that exhibit electrical conductivity effects. Polymers are a class of compounds that were thought to be insulators but have been shown to conduct electricity when treated or prepared in appropriate ways. Unlike crystalline three-dimensional lattices, polymers are more nearly characterized by a chain of repeating units in an approximately one-dimensional layout.

An example of such a material is the compound formed from 7,7,8,8-tetracyano-p-quinodimethane (TCNQ) and tetrathiafulvalene (TTF). In the compound (TTF)(TCNQ) a conduction band is formed by the overlapping of wave functions that allows a transfer of charge between the two parts of the compound, with the TTF playing the role of electron donor and the TCNQ playing the role of electron acceptor. After the charge transfer, both constituents have partially filled bands and can sustain a current in an essentially one-dimensional geometry. When this material is cooled to 60 K, its conductivity is about the same as that of copper.

Other examples of conducting polymers include poly(sulfur nitride), (SN)x, which is an inorganic polymer and has a larger work function than that of any of the elemental metals; polyacetylene, (CH)x, which can be made with conductivities over a wide range by incorporating impurities; poly(p-phenylene) (PPP) and poly(p-phenylene sulfide); polypyrrole; and phthalocyanine. Polymers have also shown superconductivity with low values of critical temperature.

Indirectly related to polymers are developments involving large spherical molecules constructed of carbon atoms, of which C60 is perhaps the most widely known and has been given the name “buckminsterfullerene” after the geodesic dome structures advanced by Buckminster Fuller. Other carbon clusters such as C70 and C84 have also been observed. Such structures have been nicknamed “buckyballs.” They are structurally sturdy but able to accept electrons and hence react with many organic chemicals. Potassium-doped K3C60 has been shown to exhibit superconductivity with a critical temperature of 19.3 K.

X. JUNCTIONS

Since the 1950s, perhaps no single property of materials has been developed as much with such far-reaching consequences as the cornucopia made available by junctions between different kinds of materials. The whole modern solid-state electronics world is made possible by the special properties of junctions in the semiconductor silicon between p-type (free-hole carriers) and n-type (free-electron carriers) portions of the material.

Since the subject of “Junctions” is treated in detail elsewhere in this Encyclopedia, our purpose in this section is simply to give a brief introduction and survey.

At least five types of junctions can be enumerated, with a variety of combinations of these also being of interest: (1) a material surface, representing a junction between a material and vacuum or a gaseous environment; (2) junctions between two different metals with different work functions; (3) junctions between a metal and a semiconductor (an MS junction) as commonly encountered in making ohmic (low-resistance) or blocking (rectifying) electrical contacts to a material; (4) junctions between two portions of the same material (homojunctions) with different electrical properties, most commonly one being p type and the other n type, to form a p–n junction; and (5) junctions between two different materials (heterojunctions) with different electrical properties. Another class of junctions consists of variations of (3)–(5) with a thin insulating layer (I) such as an oxide between the two junction materials to produce such structures as MIS junctions or SIS junctions.

Junctions between two materials occur when the work function (the energy separation between the vacuum level and the Fermi level) in the two materials is different. When junctions are formed, charge transfer takes place between the two different constituents so as to build up an internal electric field that allows the Fermi energy to be constant throughout the two joined materials. When two metals with different work functions are brought into contact, the potential difference between them is equal to the difference in their work functions and is known as the contact potential. When metals or semiconductors are brought into contact, the potential difference between them is known as the diffusion potential φD. In every case, the diffusion potential of a junction is equal to the difference in the work functions of the two constituents forming the junction.

Typical energy-band diagrams for junctions are given in Fig. 17 for a rectifying contact between a metal and an n-type semiconductor (known as a Schottky barrier), in Fig. 18 for an ohmic contact between a metal and an n-type semiconductor, in Fig. 19 for a p–n homojunction, and in Fig. 20 for p–n heterojunctions.


FIGURE 17 A Schottky barrier between a metal and an n-type semiconductor, formed when the work function of the metal is larger than that of the semiconductor. Shown are the work functions of the metal qφM and the semiconductor qφs, the diffusion potential φD of the junction, and χs, the electron affinity of the semiconductor.

The energy-band diagrams in Figs. 17, 19, and 20 all show semiconductor regions that have been depleted of carriers in the formation of the internal fields required to equalize the Fermi levels. These depleted regions are known as depletion layers. Their width $w_d$ can be calculated from Poisson’s equation, which for a depletion layer in an n-type material with a Schottky barrier is $\partial^2\phi/\partial x^2 = -qN_D^+/\varepsilon_r\varepsilon_0$. The solution for the appropriate boundary conditions that $E_x = \partial\phi/\partial x = 0$ and $\phi = \phi_D$ at $x = w_d$, and $\phi = 0$ at $x = 0$, where $x = 0$ marks the junction interface and $\phi_D$ is the diffusion potential, is

$$w_d = \left(\frac{2\varepsilon_r\varepsilon_0\phi_D}{N_D^+ q}\right)^{1/2}. \tag{31}$$

FIGURE 18 An ohmic contact between a metal and an n-type semiconductor, formed when the work function of the metal is smaller than that of the semiconductor. Instead of the depletion layer in the semiconductor shown in Fig. 17, the ohmic contact has an accumulation layer in the semiconductor.

FIGURE 19 Energy-band diagram of a p–n homojunction.

In cases involving homojunctions and heterojunctions, the depletion layer will be shared by the materials on both sides of the junction, depending on the respective charged-imperfection densities and the dielectric constants; in the event that one side of the junction is much more conducting than the other, the depletion layer will be essentially limited to the less-conducting material, as in the case of a metal–semiconductor junction.
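As a rough numerical illustration of Eq. (31), the following sketch evaluates the depletion-layer width in SI units. The function name and the sample values (a silicon-like dielectric constant, a donor density of 10^16 cm^-3, and a 0.5-V diffusion potential) are assumptions chosen only for illustration and are not taken from the text.

```python
import math

Q = 1.602176634e-19       # elementary charge (C)
EPS0 = 8.8541878128e-12   # permittivity of free space (F/m)


def depletion_width(phi_d, n_donor, eps_r):
    """Depletion-layer width w_d in metres, following Eq. (31).

    phi_d   : diffusion potential (V)
    n_donor : ionized donor density N_D+ (m^-3)
    eps_r   : relative dielectric constant
    """
    return math.sqrt(2.0 * eps_r * EPS0 * phi_d / (n_donor * Q))


# Assumed, illustrative inputs: eps_r = 11.7, N_D+ = 1e22 m^-3, phi_D = 0.5 V.
w = depletion_width(phi_d=0.5, n_donor=1e22, eps_r=11.7)
print(f"w_d = {w * 1e9:.0f} nm")
```

With these inputs the width comes out at a few tenths of a micrometer, and because $w_d$ varies as $(N_D^+)^{-1/2}$, a fourfold increase in donor density halves it.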

An extremely useful situation results from the realization that, given normal plane geometry, a junction can be thought of as a parallel-plate condenser with the depletion region as the dielectric between the condenser plates. Since the capacitance $C = \varepsilon_r\varepsilon_0 A/w_d$, where $A$ is the area of the junction,

$$(C/A)^{-2} = \frac{2(q\phi_D - q\phi_{\mathrm{app}})}{\varepsilon_r\varepsilon_0 q^2 N_D^+}, \tag{32}$$

where $\phi_{\mathrm{app}}$ is the magnitude of the voltage applied to the junction from an external source. A plot of $(C/A)^{-2}$ versus $\phi_{\mathrm{app}}$ according to Eq. (32) yields a straight line with intercept on the voltage axis of $\phi_D$ and with slope inversely proportional to $N_D^+$.

FIGURE 20 Energy-band diagrams for p–n heterojunctions. The materials in (a) and (b) have the same band gaps, but in (a) the p-type material has a smaller electron affinity than that of the n-type material, whereas in (b) the situation is reversed. Heterojunctions are generally characterized by abrupt changes in the conduction band ΔEc and valence band ΔEv because of differences in the electron affinities and band gaps of the two materials making up the junction; the sign of these changes is of critical significance for the behavior of the heterojunction.

Figure 18 differs from the other three cases in that for an ohmic contact between a metal and a semiconductor, there is an excess of free carriers in the semiconductor such that an accumulation layer exists. The width of this accumulation layer is determined by the balance between drift and diffusion currents occurring in this region.
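The straight-line analysis described for Eq. (32) can be sketched numerically. The example below generates synthetic $(C/A)^{-2}$ data from Eq. (32) and then recovers $\phi_D$ and $N_D^+$ from a least-squares line. The function names, the sample values, and the sign convention (forward bias positive, so reverse bias is entered as a negative $\phi_{\mathrm{app}}$) are assumptions made for this illustration.

```python
Q = 1.602176634e-19       # elementary charge (C)
EPS0 = 8.8541878128e-12   # permittivity of free space (F/m)


def inv_c2(phi_app, phi_d, n_donor, eps_r):
    """(C/A)^-2 in m^4/F^2 from Eq. (32); phi_app > 0 means forward bias (assumed)."""
    return 2.0 * (phi_d - phi_app) / (eps_r * EPS0 * Q * n_donor)


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extract_parameters(voltages, inv_c2_values, eps_r):
    """Recover (phi_D, N_D+) from a (C/A)^-2 versus voltage line."""
    slope, intercept = fit_line(voltages, inv_c2_values)
    phi_d = -intercept / slope                        # voltage-axis intercept
    n_donor = -2.0 / (eps_r * EPS0 * Q * slope)       # slope is -2/(eps q N_D+)
    return phi_d, n_donor


# Synthetic data for an assumed junction: phi_D = 0.6 V, N_D+ = 1e22 m^-3.
voltages = [-2.0, -1.5, -1.0, -0.5, 0.0]
data = [inv_c2(v, 0.6, 1e22, 11.7) for v in voltages]
phi_d_fit, n_donor_fit = extract_parameters(voltages, data, eps_r=11.7)
print(phi_d_fit, n_donor_fit)
```

Real measurements would add noise and possible deviations from linearity (for example from nonuniform doping), which this idealized sketch does not model.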

The dependence of current on applied voltage for the junctions of Figs. 17, 19, and 20 can be described by a relationship of general form

$$J = J_0\left[\exp(q\phi_{\mathrm{app}}/AkT) - 1\right]. \tag{33}$$

This equation describes a current that increases exponentially with +φapp (forward bias) and that reaches a voltage-independent reverse current of −J0 for −φapp (reverse bias). The actual current transport mechanisms determine the specific dependences of J0 and of the factor A (here a dimensionless constant, not the junction area of Eq. (32)). For a Schottky barrier with transport controlled by thermionic emission, or for an ideal homojunction with transport controlled by diffusion, A = 1.
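The behavior of Eq. (33) under forward and reverse bias can be checked with a few lines of code. This is a minimal sketch; the function name, the default temperature of 300 K, and the sample values are assumptions for illustration.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant (eV/K)


def junction_current(phi_app, j0, ideality=1.0, temperature=300.0):
    """Current density from Eq. (33).

    phi_app     : applied voltage in volts (positive for forward bias)
    j0          : reverse saturation current density (sets the output units)
    ideality    : the dimensionless factor A of Eq. (33)
    temperature : absolute temperature in kelvin
    """
    # With phi_app in volts and k in eV/K, q*phi_app/(A*k*T) is dimensionless.
    return j0 * math.expm1(phi_app / (ideality * K_B_EV * temperature))


# Assumed illustrative values: J0 = 1e-6 A/cm^2 at 300 K.
for v in (-1.0, -0.1, 0.0, 0.1, 0.3):
    print(f"{v:+.1f} V -> {junction_current(v, 1e-6):.3e} A/cm^2")
```

Using `math.expm1` keeps the result accurate near zero bias, where the exponential is close to 1.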

Because of the extreme versatility of the standard p–n homojunction, it is useful for a wide spectrum of applications, including rectifiers, amplifiers (transistors), photodetectors, photovoltaic energy converters, and radiation emitters, both noncoherent and lasers.

Properties and applications of single-crystal or thin-film p–n junction devices are being greatly expanded by the ability, acquired in recent years, to deposit multilayer heterojunctions with atomically abrupt interfaces and controlled composition and doping in individual layers that are only a few tens of nanometers thick. These layers are so thin that the energy levels in them show the quantum effects of small potential well thickness, which causes the continuous allowed levels associated with macroscopic thicknesses to become the discrete allowed spectrum described by Eq. (1) for small values of L. It becomes possible to fine-tune a whole range of properties through choice of the appropriate structural parameters.

Structures called quantum wells are formed by sandwiching a very thin layer of a small-band-gap material between two layers of a wide-band-gap material. The actual values of the discrete levels in a quantum well are determined by the thickness and the depth of the well, which is the band discontinuity (conduction band discontinuity ΔEc for electrons and valence band discontinuity ΔEv for holes). If many quantum wells are grown on top of one another, and the barriers are made so thin that tunneling between them is significant, the result is a superlattice, a concept first proposed by Esaki and Tsu in 1969. Properties of superlattices can be varied, not only by the choice of the materials used to make up the heterojunctions, but also by the spacing of the layers and their thickness. Thus, it becomes possible to fine-tune a whole range of properties through choice of the appropriate structural parameters.
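To give a feel for how strongly the discrete levels depend on layer thickness, the sketch below evaluates the levels of an idealized, infinitely deep well, $E_n = n^2\pi^2\hbar^2/2m^*L^2$. It is assumed here that Eq. (1) of this article has this standard form; the function name, the effective-mass ratio of 0.067, and the 10-nm width are likewise assumptions for illustration. A real quantum well has a finite depth (ΔEc or ΔEv), so its levels lie somewhat lower than these values.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
M0 = 9.1093837015e-31    # free-electron mass (kg)
Q = 1.602176634e-19      # elementary charge (C), used to convert J to eV


def well_levels_ev(width_m, mass_ratio, n_levels=3):
    """Lowest levels (eV) of an infinitely deep well of the given width.

    width_m    : well thickness L in metres
    mass_ratio : effective mass divided by the free-electron mass
    """
    mass = mass_ratio * M0
    return [
        (n * math.pi * HBAR) ** 2 / (2.0 * mass * width_m ** 2) / Q
        for n in range(1, n_levels + 1)
    ]


levels = well_levels_ev(10e-9, 0.067)   # assumed 10-nm well, m*/m0 = 0.067
narrow = well_levels_ev(5e-9, 0.067)    # halving L raises every level fourfold
print([f"{e * 1e3:.0f} meV" for e in levels])
```

The $1/L^2$ scaling is why level spacings become appreciable only for layers a few tens of nanometers thick or thinner.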

XI. MAGNETIC PROPERTIES

In various places in the previous paragraphs we have mentioned the fact that electrons have an intrinsic angular momentum known as spin. In our discussion thus far this property has played little role except in determining appropriate statistics. We have also discussed the interaction between moving free electrons and a magnetic field in such effects as cyclotron resonance, the Hall effect, and magnetoresistance, but these have been interactions between a magnetic field and a moving charge, not with the electron spin or with magnetic moments associated with electron motion. It is a consideration of these latter interactions that leads to the large field known as magnetic properties.

Since the subject of magnetic properties is treated in detail elsewhere in this Encyclopedia, our purpose in this section is simply to give a brief introduction and survey.

There are two sources of an electronic magnetic moment: (1) the orbital electronic motion of electrons in atoms, and (2) the intrinsic electron angular momentum, the spin.

A current I flowing in a loop enclosing an area A gives rise to a magnetic moment

$$\mu = \mu_0 I A, \tag{34}$$

where µ0 is the permeability of free space, and the direction of the magnetic moment is perpendicular to the plane in which A is defined. In the completed electronic shell of an atom (2 s electrons, 6 p electrons, 10 d electrons, etc.), the net magnetic moment in the absence of a magnetic field is zero. If a magnetic field is applied to such a material, however, an induced magnetic moment is set up according to Eq. (34) that opposes the applied field (Lenz’s law). This phenomenon is known as diamagnetism and corresponds to a slight repulsion of a material by a magnetic field. The magnetization M, the magnetic moment per unit volume, is related to the magnetic field H by M = κH, where κ is the magnetic susceptibility; for a diamagnetic effect, κ is small and negative. All materials exhibit some degree of diamagnetism. The classical Langevin equation for diamagnetic susceptibility is

$$\kappa_{\mathrm{dia}} = -\frac{N Z q^2 \mu_0 \langle r^2 \rangle}{6m}, \tag{35}$$

where N is the number of atoms per unit volume, Z is the atomic number of the material, m is the electron mass, and $\langle r^2 \rangle$ is the average value of $r^2$ corresponding to the average electronic orbit radius.
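A quick evaluation of Eq. (35) shows why the diamagnetic susceptibility is described as small and negative. The sketch below uses SI units; the function name and the inputs (5 × 10^28 atoms per cubic metre, Z = 10, and an orbit radius of 0.05 nm) are assumed round numbers for illustration, not data for a specific material.

```python
Q = 1.602176634e-19      # elementary charge (C)
M_E = 9.1093837015e-31   # electron mass (kg)
MU0 = 1.25663706212e-6   # permeability of free space (H/m)


def langevin_susceptibility(n_atoms, z, r2_mean):
    """Dimensionless diamagnetic susceptibility from Eq. (35).

    n_atoms : number of atoms per unit volume (m^-3)
    z       : atomic number (electrons per atom)
    r2_mean : mean-square orbit radius <r^2> (m^2)
    """
    return -n_atoms * z * Q ** 2 * MU0 * r2_mean / (6.0 * M_E)


# Assumed illustrative inputs.
kappa = langevin_susceptibility(n_atoms=5e28, z=10, r2_mean=(0.5e-10) ** 2)
print(f"kappa_dia = {kappa:.2e}")
```

With these inputs the susceptibility is of order −10^-5, consistent with the slight repulsion described above.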

If a permanent magnetic moment is present in the absence of a magnetic field, the magnetic moments line up with an applied magnetic field to decrease the total energy. This phenomenon is known as paramagnetism and corresponds to a slight attraction of a material by a magnetic field and a small positive value of κ. Such magnetic moments associated with the spin of electrons can arise either from free electrons or from bound electrons. Paramagnetism due to free electrons is temperature independent and is given by the free-electron model as

$$\kappa_{\mathrm{para}}(\mathrm{free}) = \frac{\mu_B^2 m}{\pi^2\hbar^2}\,(3\pi^2 n)^{1/3}, \tag{36}$$

where $\mu_B$ is the Bohr magneton, $\mu_B = q\mu_0\hbar/2m$, the magnetic moment associated with the electron spin. Paramagnetism due to bound electrons varies inversely with temperature and is about 100 times larger than paramagnetism due to free electrons; the susceptibility is given by

$$\kappa_{\mathrm{para}}(\mathrm{bound}) = \frac{N\mu_B^2}{kT}. \tag{37}$$
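The factor of about 100 between the two paramagnetic contributions can be checked directly from Eqs. (36) and (37). Dividing Eq. (37) by Eq. (36) with N = n cancels the Bohr magneton and leaves $2E_F/3kT$, where $E_F = \hbar^2(3\pi^2 n)^{2/3}/2m$ is the free-electron Fermi energy. The sketch below evaluates this ratio; the function names and the copper-like electron density of 8.5 × 10^28 m^-3 at 300 K are assumptions for illustration.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J s)
M_E = 9.1093837015e-31   # electron mass (kg)
K_B = 1.380649e-23       # Boltzmann constant (J/K)


def free_over_mu_b2(n):
    """Eq. (36) divided by mu_B^2, for electron density n (m^-3)."""
    return M_E / (math.pi ** 2 * HBAR ** 2) * (3.0 * math.pi ** 2 * n) ** (1.0 / 3.0)


def bound_over_mu_b2(n_atoms, temperature):
    """Eq. (37) divided by mu_B^2, for moment density n_atoms (m^-3)."""
    return n_atoms / (K_B * temperature)


def fermi_energy(n):
    """Free-electron Fermi energy in joules."""
    return HBAR ** 2 * (3.0 * math.pi ** 2 * n) ** (2.0 / 3.0) / (2.0 * M_E)


n = 8.5e28   # assumed copper-like density; one moment per atom, so N = n
T = 300.0
ratio = bound_over_mu_b2(n, T) / free_over_mu_b2(n)
closed_form = 2.0 * fermi_energy(n) / (3.0 * K_B * T)
print(f"bound/free = {ratio:.0f}, 2E_F/3kT = {closed_form:.0f}")
```

For these inputs the ratio is of order 10^2, in line with the statement in the text; it grows as the temperature is lowered because only the bound-electron term depends on T.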

The presence of a large magnetic moment that persists in the absence of a magnetic field due to spontaneous ordering of the moments by direct interaction at temperatures below some critical temperature, called the Curie temperature, is known as ferromagnetism (e.g., Fe, Co, and Ni). The energy that makes ferromagnetism possible is called the exchange energy and is a consequence of interaction between spins on neighboring atoms. The exchange energy between unpaired spins of neighboring atoms in suitable materials is lowest when these spins are parallel. In the ferromagnetic state, the susceptibility is positive, many orders of magnitude larger than in paramagnetism, and is field dependent. Above the Curie temperature, a ferromagnetic material shows only paramagnetism.

Antiferromagnetism is another magnetic effect that resembles ferromagnetism in that it results from internal interactions between magnetic moments, but in these materials an ordered situation is produced where the magnetic moments of nearby atoms are oppositely oriented (e.g., Cr, Mn, MnO, MnS, and NiO). The total moment over a finite volume is zero. The antiferromagnetic ordering is also destroyed above a critical temperature, the Néel temperature, above which the behavior is paramagnetic.

Ferrimagnetism involves an ordered structure not unlike that of antiferromagnetism but involves a case where the numbers of atoms with opposite spins are unequal, which therefore yields a net magnetic moment [e.g., (MO)(Fe2O3) with M = Mn, Fe, Co, Ni, Cu]. Otherwise the behavior is much like that of ferromagnetism, yielding large and field-dependent susceptibilities below a Curie temperature.

All practical ferromagnetic materials consist of magnetic domains in the absence of an applied magnetic field, the orientation of the magnetic moment in a domain corresponding to overall minimization of various forms of magnetic energy. When a magnetic field is applied to such a material, the magnetic field needs to line up the various domains or so alter their extent that the saturation magnetization is achieved. Thus the variation of magnetization with magnetic field involves phenomena related to the motion of domain walls as well as the other magnetic phenomena described above.

The magnetic energies that must be minimized include the exchange energy (which by itself would lead to one domain); the magnetostatic energy, or the energy in the external magnetic field (which by itself would lead to many closure domains to restrict magnetic lines of force within the material); the magnetocrystalline anisotropy energy, or the energy required to align moments in crystalline directions not favored by a particular crystal structure (which by itself would limit the formation of closure domains); and the magnetoelastic or magnetostrictive energy, or an energy related to a change in physical dimensions accompanying magnetization in a particular direction (which by itself would make domains smaller).

Another consideration that limits the formation of domains is the energy required to form a wall between two differently oriented domains. Minimization of the exchange energy favors wide walls with small changes of moment orientation between neighboring atoms, whereas minimization of the magnetocrystalline anisotropy energy favors narrow walls to avoid spin alignment in difficult directions.

A number of these considerations are brought together in a consideration of the form of the variation of magnetization with magnetic field strength, as illustrated in Fig. 21. If the magnetic field applied to a ferromagnetic material is varied, the magnetization of the material exhibits hysteresis. The area of the hysteresis loop is the energy required to traverse one hysteresis cycle and is therefore an indication of the defect structure of the material and of the type of applications for which it would be best suited.

If a material contains many defects and inclusions, domain wall motion is difficult, and large-area hysteresis loops result, with high values of magnetization at zero applied field (the remanent magnetization) and large values of magnetic field required to reduce the remanent magnetization to zero (the coercive field). Such materials are useful as permanent magnets and are called hard magnetic materials. If the material contains few defects and inclusions such that domains can be easily aligned by an applied magnetic field, the area of the hysteresis loop is small. Such materials are useful for transformer cores or for electromagnets and are called soft magnetic materials.

FIGURE 21 Typical ferromagnetic hysteresis curve showing saturation magnetization Ms, remanent magnetization Mr, and coercive field Hc.

Magnetic materials are playing an increasingly prominent role in power distribution, conversion between electrical and mechanical energy, microwave communications, and data storage.

One significant development in magnetic memory has been the magnetic bubble. A cylindrically shaped magnetic domain (a small region of oppositely magnetized material in a matrix of magnetic material with the opposite orientation) is stable for a wide range of magnetic fields. These configurations allow the retention of memory when the power is turned off and also allow a high packing density.

Increasingly stringent requirements are imposed for magnetic recording heads as the recording density increases, and of course magnetic materials are the basis for magnetic recording media themselves. These media are of two types: (1) particulate, in which the magnetic component consists of tiny, single-domain particles in a binder, and (2) thin-film ferromagnetic metals and alloys.

Techniques for growing thin-film magnetic materials have made possible not only high-quality thin films, but also multilayer films in which successive layers are only a few atomic layers thick. Such structures, for example, exhibit (a) “giant magnetoresistance,” a change in resistance as large as 100% with a change in the magnetic state (relative orientation of successive layers) of the system, (b) strong magnetic anisotropy in-plane and out-of-plane, and (c) interesting magneto-optic properties.

SEE ALSO THE FOLLOWING ARTICLES

BONDING AND STRUCTURE IN SOLIDS • POLYMERS, ELECTRONIC PROPERTIES • SOLID-STATE CHEMISTRY • SOLID-STATE ELECTROCHEMISTRY • SUPERCONDUCTIVITY



Excitons, Semiconductor

Donald C. Reynolds
Wright Patterson Air Force Base

Thomas C. Collins
University of Tennessee

I. Intrinsic Exciton Characteristics
II. Extrinsic Exciton Characteristics
III. Interaction of Excitons with Other Systems
IV. Special Properties of Excitons

GLOSSARY

Band gap Energy difference between two allowed bands of electron energy in a solid.

Bound exciton Exciton localized at an impurity or defect in the crystal.

Brillouin zones Those volumes in K space bounded by intersecting surfaces defined by points in K space where the energy is discontinuous and Bragg reflection occurs.

Central-cell correction Differences in the binding energies of different chemical impurity atoms (donors or acceptors) in a host lattice resulting from different core configurations.

Degenerate semiconductor Semiconductor in which a band has orbital as well as spin degeneracy.

Direct semiconductor Semiconductor in which the minimum in the conduction band and the maximum in the valence band occur at the same wave vector.

Hamiltonian Total energy operator of the system, kinetic plus potential energy operator.

Indirect semiconductor Semiconductor in which the minimum in the conduction band and the maximum in the valence band occur at different wave vectors.

Nondegenerate semiconductor Semiconductor in which the bands have only spin degeneracy.

Oscillator strength Measure of the intensity of a particular energy transition.

Phonon Expression for a quantized lattice vibration. Acoustic phonons refer to in-phase motion of neighboring ions in a lattice vibration; optical phonons refer to out-of-phase motion of neighboring ions.

THE EXCITON is a quantum of electronic excitation produced in a periodic structure such as an insulating or semiconducting solid. This quantum of energy has motion, and the motion is characterized by a wave vector. Frenkel was the first to treat the theory of optical absorption in a solid as a quantum process consisting of atomic excitations. The excitation process implies that the excited electron does not leave the cell from which it was excited. In his attempt to gain insight into the transformation of light into heat in solids, Frenkel was able to explain the transformation by first-order perturbation theory of a system of N atoms having one valence electron per atom with the following properties:

(a) The coupling between different atoms in a crystal is small compared with the forces holding the electron within the separate atoms.

(b) The Born–Oppenheimer approximation is valid.

(c) The total wave function is a product of one-electron functions.

Thus, the Frenkel exciton is a tight-binding description of an electron and a hole bound at a single site such that their separate identities are not lost. This model of the exciton emerges as a limiting case of the general theory of excitons and is applicable to insulating crystals. In the case of semiconductors, nonequilibrium electrons and holes are bound in excitonic states at low temperatures by Coulomb attraction. Semiconducting crystals are characterized by large dielectric constants and small effective masses; therefore the electrons and holes may be treated in a good approximation as completely independent particles, despite the Coulomb interaction. This results because the dielectric constant reduces the Coulomb interaction between the hole and electron to the extent that it produces a weakly bound pair of particles that still retain much of their free character. The exciton represents a state of slightly lower energy than the unbound hole–electron. The effective mass theory used to describe such weakly bound particles was developed by Wannier. These weakly bound excitons are appropriately described using the one-band electronic structure picture by adding the Coulomb interaction between the hole and electron. Semiconductor materials are the heart of most modern optical and electronic devices; as a result, the dominant technological interest is focused on these materials. In view of this interest, this article emphasizes the Wannier excitons, which are appropriate for these materials.
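The effect of a large dielectric constant and small effective masses on the binding of a Wannier exciton can be estimated with the simplest hydrogen-like scaling: the hydrogen Rydberg is multiplied by the reduced-mass ratio and divided by the square of the dielectric constant. The sketch below assumes isotropic, parabolic, nondegenerate bands; the function name and the GaAs-like input values are assumptions for illustration and are not taken from this article.

```python
RYDBERG_EV = 13.605693   # hydrogen Rydberg energy (eV)
BOHR_NM = 0.0529177      # hydrogen Bohr radius (nm)


def wannier_exciton(me_ratio, mh_ratio, eps_r):
    """Hydrogen-like estimate of exciton binding energy (eV) and radius (nm).

    me_ratio, mh_ratio : electron and hole effective masses in units of the
                         free-electron mass
    eps_r              : relative dielectric constant of the crystal
    """
    mu = me_ratio * mh_ratio / (me_ratio + mh_ratio)   # reduced-mass ratio
    binding_ev = RYDBERG_EV * mu / eps_r ** 2
    radius_nm = BOHR_NM * eps_r / mu
    return binding_ev, radius_nm


# Assumed GaAs-like inputs: m_e = 0.067, m_h = 0.45, eps_r = 12.9.
binding, radius = wannier_exciton(0.067, 0.45, 12.9)
print(f"binding ~ {binding * 1e3:.1f} meV, radius ~ {radius:.0f} nm")
```

With these inputs the binding energy is a few millielectronvolts and the orbit spans many lattice constants, which is the weakly bound regime described above.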

I. INTRINSIC EXCITON CHARACTERISTICS

A. Introduction

The intrinsic fundamental-gap exciton in semiconductors is a hydrogenically bound hole–electron pair, the hole being derived from the top valence band and the electron from the bottom conduction band. It is a normal mode of the crystal created by an optical excitation wave, and its wave functions are analogous to those of the Bloch wave states of free electrons and holes. When most semiconductors are optically excited at low temperatures, it is the intrinsic excitons that are excited. The energies of the ground and excited states of the exciton lie below the band-gap energy of the semiconductor. Hence, the exciton structure must first be determined in order to determine the band-gap energy. The exciton binding energy can be determined from spectral analysis of its hydrogenic ground and excited state transitions (this also gives central-cell corrections). Precise band-gap energies can be determined by adding the exciton binding energy to the experimentally measured photon energy of the ground-state transition.
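The procedure just described can be sketched for the simplest case of an ideal hydrogenic series, $E_n = E_g - R/n^2$, in which the n = 1 and n = 2 transition energies fix both the binding energy R and the band gap. The function name and the two transition energies used below are made-up values for illustration only; central-cell and other corrections mentioned in the text are ignored.

```python
def gap_from_exciton_lines(e1_ev, e2_ev):
    """Binding energy R and band gap E_g (eV) from the n = 1 and n = 2 lines.

    Assumes an ideal hydrogenic series E_n = E_g - R / n**2, so that
    E_2 - E_1 = (3/4) R and E_g = E_1 + R.
    """
    binding = 4.0 * (e2_ev - e1_ev) / 3.0
    return binding, e1_ev + binding


# Hypothetical measured transition energies (eV).
binding, gap = gap_from_exciton_lines(1.5150, 1.5182)
print(f"R = {binding * 1e3:.2f} meV, E_g = {gap:.4f} eV")
```

When more than two lines of the series are resolved, a least-squares fit of the transition energies against $1/n^2$ would use all of the data rather than just two points.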

Both direct and indirect exciton formation occurs in semiconductors, depending on the band structure. The former is characteristic of many of the II–VI and III–V compounds, and the latter is characteristic of germanium and silicon. For indirect optical transitions, momentum is conserved by the emission or absorption of phonons. The detailed nature of the valence-band structure of degenerate and nondegenerate semiconductors is elucidated by understanding the intrinsic-exciton structure of these semiconductors.

B. Excitation of the n-Particle System

Excitons are excited states of the system in which the number of electrons does not change. Ordinarily in solid-state physics one thinks of calculations of one-body approximations as excitation of the system. However, this only works in metals near the Fermi surface or where one adds an electron or takes one away from the system. In the case of metals, the electrons and holes are very diffuse and the interaction between the excited “particle” and the other “particles” is very small. When an electron is added to the system, one obtains electron affinities or N + 1 solutions, and when an electron is removed one obtains ionization energies or N − 1 solutions. All solutions to one-body calculations such as the one-particle Green’s function method, Hartree–Fock calculations, and similar approximations (unless specifically added) do not contain the interaction of the excited “particle” with the other “particles.” It is necessary to calculate solutions to the two-body Green’s function (or some approximation to the two-body problem) in order to have an exciton.

Another approach is to calculate the many-body excited states by construction of excited wave functions that are orthogonal to the approximate ground-state wave function. The excited-state approximate total energy is calculated, from which is subtracted the approximate total ground-state energy. The formalism gives an effective operator that looks like a scaled or screened hydrogen Hamiltonian. This approach was used by both Frenkel and Wannier. In the case of Frenkel, the excited states were required to be a linear combination of very local excitations. In particular, he required the atoms in his model to all be in the ground state except one. This one was excited, but the excitation was localized to the one atomic site. The many-body solution using these trial wave functions was identical to the normal modes of vibration of the phonon spectra. The solution is also similar to spin waves where one starts with either an up or down spin on each lattice site.

Wannier, on the other hand, let the excitation trial wave functions be more diffuse than one lattice site. This representation he called “exciton waves,” which could vary from very local excitations (Wannier-like functions) to very diffuse (Bloch-like functions). His effective excitation operator was a hydrogenic operator with multipole corrections. The many-body effects were approximated through screening of the Coulomb potential between electrons and the hole left by the excitation. This formalism also has the advantage that when small perturbing fields are turned on, the resulting effective operator adds the perturbing operator, similar to what one finds for the hydrogen atom or molecule case with screened potentials.

C. Systems of Excitons in Various Crystal Symmetries

1. Direct Nondegenerate Semiconductors

Nondegenerate semiconductors are typified by those materials that belong to the wurtzite crystal structure. This is a uniaxial structure having sixfold rotational symmetry and belongs to the $C_{6v}$ crystal point group. In this structure the degeneracy in the valence band is removed by crystal-field interactions.

The tight-binding approximation in conjunction with group theory was first used to describe the irreducible representations, band symmetries, and selection rules for the wurtzite structure. If one considers the absorption (emission) of electromagnetic radiation by atoms, the probability of the occurrence of a transition between two unperturbed states $\psi_i$ and $\psi_f$ as caused by the interaction of an electromagnetic radiation field and a crystal is dependent on the matrix element

$$\int \psi_f^{*}\, H_{\mathrm{int}}\, \psi_i \, d\mathbf{r} \qquad (1)$$

where

$$H_{\mathrm{int}} = \frac{e\hbar}{imc}\, \mathbf{A} \cdot \nabla \qquad (2)$$

and $\mathbf{A}$ is the vector potential of the radiation field, which has the form

$$\mathbf{A} = \mathbf{n}\, |A_0|\, e^{i(\mathbf{q}\cdot\mathbf{r} - \omega t)}$$

Here $e$ is the electron charge, $m$ is the electron mass, $c$ is the velocity of light, $\mathbf{n}$ is a unit vector in the direction of polarization, and $\mathbf{q}$ is the wave vector. Expanding the spatial part of $\mathbf{A}$ in a series gives

$$H_{\mathrm{int}} = \sum_{j=0}^{\infty} H_{\mathrm{int}}^{j}$$

where

$$H_{\mathrm{int}}^{j} \approx (\mathbf{q}\cdot\mathbf{r})^{j}\, (\mathbf{n}\cdot\nabla)$$

and the dipole term is then the first term ($j = 0$). The matrix element in Eq. (1) is now expressed as a series, and for an electric dipole transition to be allowed, the matrix element between the initial and final states must be nonzero.

In the case of transitions between two states of an atom (that is in a crystalline field), the initial and final states of the atom are characterized by irreducible representations of the point group of the crystal field. Also, the dipole moment operator must transform like one of the irreducible representations of the group. If one denotes the representations that correspond to the initial and final states of the transition and to the multipole radiation of order $j$ ($j = 0$ for electric dipole radiation) by $\Gamma_i$, $\Gamma_f$, and $\Gamma_r^{(j)}$, respectively, at $k = 0$, then the matrix element in Eq. (1) transforms under rotations like the triple direct product

$$\Gamma_f \times \Gamma_r^{(j)} \times \Gamma_i \qquad (3)$$

The selection rules are then determined by which of the triple-direct-product matrix elements in question do not vanish.

The dipole moment operator for electric dipole radiation transforms like $x$, $y$, or $z$, depending on the polarization. When the electric vector $\mathbf{E}$ of the incident light is parallel to the crystal axis, the operator corresponds to the $\Gamma_1$ representation. When it is perpendicular to the crystal axis, the operator corresponds to the $\Gamma_5$ representation.
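The triple-direct-product test of Eq. (3) can be carried out with characters: a transition is allowed when the product of the three representations contains the identity representation. The sketch below does this for the single group of $C_{6v}$, taking a fully symmetric initial state and the two polarizations named above. The character table is the standard one for $C_{6v}$; the labels G1, G2, G5, G6 stand for $\Gamma_1$, $\Gamma_2$, $\Gamma_5$, $\Gamma_6$, the identification of $\Gamma_6$ with the second two-dimensional representation is an assumption about the labeling convention, and the function names are illustrative.

```python
# Characters of single-group irreducible representations of C6v.
# Class order: E, C2, 2C3, 2C6, 3 sigma_v, 3 sigma_d.
CLASS_SIZES = [1, 1, 2, 2, 3, 3]
CHARACTERS = {
    "G1": [1, 1, 1, 1, 1, 1],     # transforms like z
    "G2": [1, 1, 1, 1, -1, -1],
    "G5": [2, -2, -1, 1, 0, 0],   # transforms like (x, y)
    "G6": [2, 2, -1, -1, 0, 0],   # assumed labeling of the second 2D representation
}
GROUP_ORDER = sum(CLASS_SIZES)


def identity_count(final, operator, initial):
    """Number of times the identity appears in final x operator x initial."""
    total = sum(
        n * f * o * i
        for n, f, o, i in zip(
            CLASS_SIZES, CHARACTERS[final], CHARACTERS[operator], CHARACTERS[initial]
        )
    )
    return total // GROUP_ORDER  # all characters are real, so no conjugation is needed


def allowed_polarizations(final, initial="G1"):
    """Polarizations for which the dipole matrix element need not vanish."""
    allowed = []
    if identity_count(final, "G1", initial):
        allowed.append("E parallel to c")
    if identity_count(final, "G5", initial):
        allowed.append("E perpendicular to c")
    return allowed


for symmetry in ("G1", "G2", "G5", "G6"):
    print(symmetry, allowed_polarizations(symmetry))
```

Under these assumptions a $\Gamma_1$ final state is reached only with the electric vector parallel to the crystal axis, a $\Gamma_5$ final state only with it perpendicular, and $\Gamma_2$ and $\Gamma_6$ final states are dipole forbidden for both polarizations.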

Since the crystal has a principal axis, the crystal field removes part of the degeneracy of the p levels. Thus, disregarding spin–orbit coupling, the following decomposition at the center of the Brillouin zone is obtained:

conduction band: $S \to \Gamma_1$

valence band: $P_x, P_y \to \Gamma_5$; $P_z \to \Gamma_1$

Introducing the spin doubles the number of levels. The splitting caused by the presence of spin is represented by the inner products

$$\Gamma_5 \times D_{1/2} \to \Gamma_7 + \Gamma_9$$

$$\Gamma_1 \times D_{1/2} \to \Gamma_7$$

and the band structure at $k = 0$ along with the band symmetries is shown in Fig. 1.


FIGURE 1 Band structure and band symmetries for the wurtzite structure.

2. Direct Degenerate Semiconductors

Materials that crystallize in the diamond or the zinc-blende structures are representative of degenerate semiconductors. Two materials that have been extensively investigated and that are characteristic of direct degenerate semiconductors are GaAs and InP. These materials crystallize in the zinc-blende structure, which has $T_d$ point-group symmetry.

The dipole moment operator for electric dipole radiation in zinc-blende structures transforms like $\Gamma_5$. The conduction band is s-like, while the valence band is p-like. This structure does not have a principal axis; therefore, the crystal-field energy is zero and the full degeneracy of the p levels is retained. Thus, disregarding spin–orbit coupling, the following decomposition at the center of the Brillouin zone is obtained:

conduction band: $S \to \Gamma_1$

valence band: $P \to \Gamma_4$

Introducing the spin doubles the number of levels. Consider the $\Gamma_1$ s-like conduction band and the triply degenerate p-like valence band. The states at the center of the Brillouin zone, which belong to the $\Gamma_1$ and $\Gamma_4$ representations of the single group, are shown in Fig. 2. The splitting caused by the presence of spin is represented by the inner product as follows:

$$\Gamma_1 \times D_{1/2} \to \Gamma_6$$

$$\Gamma_4 \times D_{1/2} \to \Gamma_7 + \Gamma_8$$

Physically this result means that the six valence-band states, consisting of the three p-like states, each associated with one or the other of the two spin states, and that are degenerate in the absence of spin–orbit interaction, now split into two levels, one having $\Gamma_7$ symmetry and the other having $\Gamma_8$ symmetry. The $\Gamma_8$ level is fourfold degenerate, while the $\Gamma_7$ level is twofold degenerate.
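A quick consistency check on these decompositions is to count states: the dimension of an orbital representation multiplied by the two spin states must equal the summed degeneracies of the spin–orbit levels it splits into. The sketch below does only this bookkeeping for the wurtzite and zinc-blende cases discussed above; it does not verify the symmetry labels themselves, and the dictionary layout and names are illustrative.

```python
# Degeneracies of the representations named in the text (G stands for Gamma).
DIMENSIONS = {
    "wurtzite": {"G1": 1, "G5": 2, "G7": 2, "G9": 2},
    "zinc blende": {"G1": 1, "G4": 3, "G6": 2, "G7": 2, "G8": 4},
}
SPIN_STATES = 2  # dimension of the spin-1/2 representation

# (structure, orbital representation, resulting spin-orbit levels)
DECOMPOSITIONS = [
    ("wurtzite", "G5", ["G7", "G9"]),
    ("wurtzite", "G1", ["G7"]),
    ("zinc blende", "G1", ["G6"]),
    ("zinc blende", "G4", ["G7", "G8"]),
]


def dimensions_match(structure, orbital, levels):
    """True when orbital degeneracy x 2 equals the summed level degeneracies."""
    dims = DIMENSIONS[structure]
    return dims[orbital] * SPIN_STATES == sum(dims[level] for level in levels)


for structure, orbital, levels in DECOMPOSITIONS:
    status = "ok" if dimensions_match(structure, orbital, levels) else "mismatch"
    print(f"{structure}: {orbital} x spin -> {' + '.join(levels)}: {status}")
```

For the zinc-blende valence band this reproduces the count in the text: three p-like states times two spin states give six states, split into a fourfold and a twofold level.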

The energy of the exciton is described by the Hamiltonian

$$H_{\mathrm{ex}} = H_e + H_h + H_{eh} \qquad (4)$$

where $H_e$ and $H_h$ are the Hamiltonians for the electron and the hole and $H_{eh}$ is the interaction Hamiltonian between the electron and the hole.

3. Indirect Transitions

Two of the most extensively studied indirect materials are the elemental semiconductors Si and Ge. Both of these materials have indirect band gaps, and therefore the lowest energy electronic state is an indirect exciton. For this lowest energy state to be optically excited, momentum must be conserved; thus, additional momentum must be supplied by the creation or annihilation of an appropriate phonon. These materials crystallize in the diamond structure and belong to the $O_h$ point-group symmetry.

The band structures of Si and Ge are similar; as a result, Si will be used as the example in this discussion to describe the indirect exciton. The band structure of Si is shown in Fig. 3. The conduction-band minimum has $\Delta_1$ symmetry and occurs approximately 85% of the way to the zone boundary in the 〈100〉 direction. The corresponding valence band symmetry is $\Delta_5$. Using the group character tables at $\Gamma = (0, 0, 0)$ and $\Delta = (k, 0, 0)$,

$$\Delta_1 \times \Delta_5 = \Gamma_{15}^{+} + \Gamma_{25}^{+} + \Gamma_{15}^{-} + \Gamma_{25}^{-} \qquad (5)$$

FIGURE 2 Band structure and band symmetries for the zinc-blende structure.


FIGURE 3 The band structure of silicon near the energy gap. [As computed by Phillips (1958). Phys. Rev. 112, 685.]

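The presence of the $\Gamma_{15}^{-}$ symmetry means that this is an allowed transition. The maximum in the valence band occurs at $k = 0$, having $\Gamma_{25}^{+}$ symmetry. The $\Gamma_{25}^{+} \to \Gamma_{15}^{-}$ transition is allowed as

$$\Gamma_{25}^{+} \times \Gamma_{15}^{-} = \Gamma_{2}^{-} + \Gamma_{12}^{-} + \Gamma_{25}^{-} + \Gamma_{15}^{-} \qquad (6)$$

The momentum-conserving phonon in the valence band from $\Delta_5$ to $\Gamma_{25}^{+}$ is

$$\Gamma_{25}^{+} \times \Delta_5 = (\Delta_5) + (\Delta_1 + \Delta_2') + \Delta_1' + \Delta_2 = (\mathrm{TO} + \mathrm{TA}) + (\mathrm{LA} + \mathrm{LO}) \qquad (7)$$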

where TO is transverse optical, TA is transverse acoustic, and L indicates longitudinal. All phonons are allowed, as seen from Fig. 4. In this case, the energy denominator in the intermediate state is fairly large where the radiative transition occurs first; therefore the transitions may be weak.

FIGURE 4 The vibration spectrum of silicon. [As determined by Brockhouse (1959). Phys. Rev. Lett. 2, 256.]

The more important phonon for conserving momentum is in the conduction band from $\Delta_1$ to $\Gamma_{15}^{-}$,

$$\Gamma_{15}^{-} \times \Delta_1 = \Delta_1 + \Delta_5 = \mathrm{LA} + (\mathrm{TO} + \mathrm{TA}) \qquad (8)$$

Here only the LO phonon is forbidden.

The same momentum considerations apply to the exciton as apply to the bands, since an exciton that is constructed from bands whose extrema differ by an amount $k_\Delta$ will have a momentum $k_\Delta$, which must be supplied by the phonon field during an optical transition.
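The need for a phonon can be seen from a rough comparison of wave vectors. The sketch below compares the wave vector of the silicon conduction-band minimum, taken as 85% of the way to the zone boundary at 2π/a along 〈100〉, with the wave vector of a photon near the gap energy. The lattice constant of silicon (5.431 Å) and the value of ħc are standard constants; the photon energy of 1.1 eV and the use of the vacuum wave vector are assumptions made for illustration.

```python
import math

HBAR_C_EV_ANGSTROM = 1973.27   # hbar * c in eV * angstrom
SI_LATTICE_CONSTANT = 5.431    # cubic lattice constant of silicon in angstrom


def photon_wavevector(energy_ev):
    """Vacuum photon wave vector in inverse angstrom."""
    return energy_ev / HBAR_C_EV_ANGSTROM


def zone_boundary_100(lattice_constant):
    """Zone-boundary wave vector along <100> for a cubic cell, 2*pi/a."""
    return 2.0 * math.pi / lattice_constant


# Conduction-band minimum about 85% of the way to the zone boundary.
k_minimum = 0.85 * zone_boundary_100(SI_LATTICE_CONSTANT)
# Assumed photon energy close to the silicon gap.
k_photon = photon_wavevector(1.1)

print(f"band-extremum separation: {k_minimum:.3f} 1/angstrom")
print(f"photon wave vector:       {k_photon:.2e} 1/angstrom")
print(f"ratio:                    {k_minimum / k_photon:.0f}")
```

Even allowing for the refractive index of the crystal, the photon carries only a small fraction of the required wave vector, so the balance has to come from the phonon.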

D. Perturbations

1. Magnetic Fields

When the crystal is placed in a uniform magnetic field, there are several new terms in the Hamiltonian, which will be described below. In this description, the band-gap extrema are at $k = 0$ with their shape parabolic at least to second order in $k$ and with only double spin degeneracy. One may write the exciton equation as a simple hydrogen Schrödinger equation including mass and dielectric anisotropy. In general one finds the mass anisotropy is small, allowing first-order perturbation calculations to be made for the energy states as well as for the magnetic-field effects.

Since in this model the valence and conduction band extrema are at $k = 0$, the wave vector of the light that creates the exciton, $\mathbf{k}$, will also represent the position of the exciton in $\mathbf{k}$ space. If one divides the momentum and space coordinates into the center-of-mass coordinates and the internal coordinates, the exciton Hamiltonian can be divided into seven terms as follows:

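$$H = H_1 + H_2 + H_3 + H_4 + H_{k1} + H_{k2} + H_{k3}; \qquad (9)$$

$$H_1 = -\frac{\hbar^2}{2m}\left[\frac{1}{\mu_x}\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right) + \frac{1}{\mu_z}\left(\frac{\partial^2}{\partial z^2}\right)\right] - \frac{e^2}{\varepsilon \eta^{1/2}}\left(x^2 + y^2 + \eta^{-1} z^2\right)^{-1/2} \qquad (10)$$

$$H_2 = -2i\zeta\left[\frac{A_x}{�_x}\left(\frac{\partial}{\partial x}\right) + \frac{A_y}{�_y}\left(\frac{\partial}{\partial y}\right) + \frac{A_z}{�_z}\left(\frac{\partial}{\partial z}\right)\right] \qquad (11)$$

$$H_3 = \frac{e^2}{2mc^2}\left\{\frac{A_x^2}{\mu_x} + \frac{A_y^2}{\mu_y} + \frac{A_z^2}{\mu_z}\right\} \qquad (12)$$

$$H_4 = \frac{\zeta}{2}\sum_{\gamma = x,y,z}\left(g_{e\gamma} S_{e\gamma} + g_{h\gamma} S_{h\gamma}\right) H_\gamma \qquad (13)$$

$$H_{k1} = \frac{i\hbar}{2m}\left[\frac{K_x}{�_x}\left(\frac{\partial}{\partial x}\right) + \frac{K_y}{�_y}\left(\frac{\partial}{\partial y}\right) + \frac{K_z}{�_z}\left(\frac{\partial}{\partial z}\right)\right] \qquad (14)$$

$$H_{k2} = \zeta\left(\frac{K_x}{\mu_x} A_x + \frac{K_y}{\mu_y} A_y + \frac{K_z}{\mu_z} A_z\right) \qquad (15)$$

$$H_{k3} = \frac{\hbar^2}{8m}\left\{\frac{K_x^2}{\mu_x} + \frac{K_y^2}{\mu_y} + \frac{K_z^2}{\mu_z}\right\}, \qquad (16)$$

where $m$ is the free-electron mass and $\mu_\gamma$ is the reduced effective mass of the exciton in the direction $\gamma$. Also,

$$\zeta = \frac{e\hbar}{2mc}, \qquad \eta = \frac{\varepsilon_z}{\varepsilon_x} \qquad (17)$$

$$\mathbf{A} = \tfrac{1}{2}\,(\mathbf{H} \times \mathbf{r}) \qquad (18)$$

$$\frac{1}{�_\gamma} = \left(\frac{m}{m^{*}_{e\gamma}} - \frac{m}{m^{*}_{h\gamma}}\right) \qquad (19)$$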

In the wurtzite structure, $\mu_x = \mu_y$.

The first term is the Hamiltonian for a hydrogenic system in the absence of external fields. This term has the possibility of including the mass and dielectric anisotropies. The second term is an $\mathbf{A}\cdot\mathbf{p}$ term, which leads to the linear (Zeeman) magnetic field term. In this term, the momentum operator $\mathbf{p}_i$ becomes $\mathbf{p}_i - (2e/c)\mathbf{A}_i$, where $\mathbf{A}_i = \tfrac{1}{2}\mathbf{H} \times \mathbf{r}_i$ is the vector potential, $\mathbf{H}$ is the magnetic field, and $\mathbf{r}_i$ is the coordinate of the $i$th electron. The $A^2$ term is the diamagnetic field term proportional to $|\mathbf{H}|^2$. The fourth term is the linear interaction of the magnetic field with the spin of the electron and hole. If one has small effective reduced mass for the electron and a large dielectric constant, the radii of the exciton states are much larger than the corresponding hydrogen-state radii. Hence, since the spin–orbit coupling is proportional to $r^{-3}$ and thus quite small, it is legitimate to write the magnetic-field perturbations in the Paschen–Back limit as done above.

The last three terms are the $\mathbf{K}\cdot\mathbf{p}$, $\mathbf{K}\cdot\mathbf{A}$, and $K^2$ terms; $\mathbf{K}$ is the center-of-mass momentum. Treating the $\mathbf{K}\cdot\mathbf{p}$ term to second order and adding the $K^2$ term, one can obtain an energy term that appears like the center-of-mass kinetic energy. The $\mathbf{K}\cdot\mathbf{A}$ term has little effect upon the energy; however, it has very interesting properties. This term represents the quasi-electric field that an observer riding with the center of mass of the exciton would experience because of the magnetic field in the laboratory. This quasi-field would produce a Stark effect linear in $\mathbf{H}$, and this would give rise to a maximum splitting interpretable as a “g value.”

2. Strain Field

A detailed study has been made of the stress-induced splitting of the exciton states in both wurtzite and zinc-blende structures. The band symmetries for the structures are shown in Figs. 1 and 2, respectively. In these materials the conduction band is s-like and the valence bands are p-like. The theory was developed around the effective Hamiltonian

$$H = H_{v-0} + H_{v-p} + H_c + H_{\mathrm{ex}}, \qquad (20)$$

where $H_{v-0}$ describes the zero-pressure mixing of the three p-like valence bands, $H_{v-p}$ describes the strain-dependent mixing of the valence bands, $H_c$ describes the conduction-band energy, and $H_{\mathrm{ex}}$ describes the valence-hole–conduction-electron interaction. The final term in the Hamiltonian is where the spin-exchange interaction is introduced. This term is written as

$$H_{\mathrm{ex}} = B_E + J\, \boldsymbol{\sigma}_h \cdot \boldsymbol{\sigma}_e, \qquad (21)$$

where $B_E$ is the exciton binding energy and the last term is the exchange interaction term. The exchange constant $J$ can be calculated from known band properties. The Hamiltonians have been formalized for both zinc-blende and wurtzite structures. Solutions of the Hamiltonians lead to the matrix elements for the allowed optical transitions $\langle 0|\boldsymbol{\varepsilon} \cdot \nabla|\psi\rangle$, where $\boldsymbol{\varepsilon}$ gives the polarization of the light and $|0\rangle$ corresponds to a filled valence band and empty conduction band.

Splittings of exciton lines under uniaxial stress were observed experimentally in wurtzite-type II–VI compounds. In wurtzite structures, all of the orbital degeneracies of the valence band are removed by the spin–orbit interaction and by the trigonal crystal field. It was evident that this splitting could not be accounted for by the usual deformation-potential theory based on one-electron energy bands. It was determined that the observed splitting could be attributed to the decomposition of the degenerate $\Gamma_5$ exciton state by the deformation of the wurtzite lattice. It is the combined effect of stress and exchange coupling that gives rise to the splitting. In the wurtzite structure the conduction band has $\Gamma_7$ symmetry while the top valence band has $\Gamma_9$ symmetry, and the next two lower valence bands have $\Gamma_7$ symmetry as shown in Fig. 1. Exciton symmetries associated with optical transitions between these bands are as follows:

$$\Gamma_7 \times \Gamma_9 \to \Gamma_5 + \Gamma_6$$

$$\Gamma_7 \times \Gamma_7 \to \Gamma_5 + \Gamma_1 + \Gamma_2$$

In the above analysis of strain splitting only the splitting of the $\Gamma_5$ exciton was treated. The $\Gamma_6$ and $\Gamma_2$ excitons were not considered since they are forbidden. In a later study of ZnO the excitons formed from the top valence band were observed in emission, and both the $\Gamma_5$ and $\Gamma_6$ excitons were resolved. In the absence of a magnetic field the $\Gamma_6$ exciton was observed in samples containing in-grown strain. It would be expected that strain would relax selection rules since it changes the symmetry of the sample. Not only was the $\Gamma_6$ exciton observed in the presence of strain but it also showed a splitting, resulting from combined strain and electron–hole spin exchange. The inclusion of the spin-exchange interaction also allows one to bridge the gap between the description of excitons in the J–J coupling scheme and the L–S coupling scheme.

II. EXTRINSIC EXCITON CHARACTERISTICS

A. Introduction

The intrinsic exciton may bind to various impurities, defects, and complexes, and the subsequent decay from the bound state yields information concerning the center to which it was bound. Bound-exciton complexes are extrinsic properties of materials. These complexes are observed as sharp-line optical transitions in both photoluminescence and absorption. The binding energy of the exciton to the impurity or defect is generally weak compared to the free-exciton binding energy. The resulting complex is molecular-like (analogous to the hydrogen molecule or molecule–ion) and has spectral properties that are analogous to those of simple diatomic molecules. The emission or absorption energies of these bound-exciton transitions are always below those of the corresponding free-exciton transitions, due to the molecular binding energy.

Bound excitons were first reported in the indirect semiconductor silicon. Here it was found that when group V elements were added to silicon, sharp photoluminescent lines were produced, and these lines were displaced in energy in a regular way. The binding energies of exciton complexes produced by adding different group V donors were described by the linear relation

$$E = 0.1\, E_i \qquad (22)$$

where $E$ is the binding energy of the exciton and $E_i$ is the ionization energy of the donor. The small differences in ionization energies for different effective-mass chemical donors result from central-cell corrections. A similar relationship was found when the group III acceptors were added to silicon. A modified linear relationship has been found for donors and acceptors in compound semiconductors.

The sharp spectral lines of bound-exciton complexes can be very intense (large oscillator strength). The line intensities will, in general, depend on the concentrations of impurities and/or defects present in the sample.

If the absorption transition occurs at $k = 0$ and if the discrete level associated with the impurity approaches the conduction band, the intensity of the absorption line increases. The explanation offered for this intensity behavior is that the optical excitation is not localized in the impurity but encompasses a number of neighboring lattice points of the host crystal. Hence, in the absorption process, light is absorbed by the entire region of the crystal consisting of the impurity and its surroundings.

The oscillator strength of the bound exciton, $F_d$, relative to that of the free exciton, $f_{\mathrm{ex}}$, can be expressed as

$$F_d = \left(E_0/|E|\right)^{3/2} f_{\mathrm{ex}} \qquad (23)$$

where $E_0 = (2\hbar^2/m)(\pi/\Omega_0)^{2/3}$, $E$ is the binding energy of the exciton to the impurity, $m$ is the effective mass of the intrinsic exciton, and $\Omega_0$ is the volume of the unit cell.

It has been shown in some materials that $F_d$ exceeds $f_{\mathrm{ex}}$ by more than four orders of magnitude. An inspection of Eq. (23) reveals that, as the intrinsic exciton becomes more tightly bound to the associated center, the oscillator strength, and hence the intensity of the exciton-complex line, should decrease as $(1/E)^{3/2}$.
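The size of the enhancement in Eq. (23) can be estimated directly. The sketch below evaluates $E_0$ and the ratio $F_d/f_{\mathrm{ex}}$ in units of eV and Å, using $\hbar^2/m_e \approx 7.62$ eV Å². The function name and the input values (a binding energy of 1 meV, an exciton mass of half the free-electron mass, and a unit-cell volume of 45 Å³) are illustrative assumptions, not values from the article.

```python
import math

HBAR2_OVER_ME = 7.61996  # hbar^2 / m_e in eV * angstrom^2


def oscillator_strength_ratio(binding_ev, mass_ratio, cell_volume_a3):
    """F_d / f_ex from Eq. (23).

    binding_ev:     binding energy of the exciton to the impurity, in eV
    mass_ratio:     exciton effective mass in units of the free-electron mass
    cell_volume_a3: unit-cell volume in cubic angstrom
    """
    e0 = (2.0 * HBAR2_OVER_ME / mass_ratio) * (math.pi / cell_volume_a3) ** (2.0 / 3.0)
    return (e0 / abs(binding_ev)) ** 1.5


# Illustrative inputs only.
weak = oscillator_strength_ratio(0.001, 0.5, 45.0)
tight = oscillator_strength_ratio(0.002, 0.5, 45.0)
print(f"ratio at 1 meV binding: {weak:.2e}")
print(f"ratio at 2 meV binding: {tight:.2e}")
```

With inputs of this order the ratio exceeds four orders of magnitude, and doubling the binding energy lowers it by a factor of $2^{3/2}$, as the scaling in the text requires.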

In magnetic fields, bound excitons have unique Zeeman spectral characteristics, from which it is possible to identify the types of centers to which the free excitons are bound. Bound-exciton spectroscopy is a very powerful analytical tool for the study and identification of impurities and defects in semiconductor materials.

B. Bound-Exciton Complexes in Different Symmetries

1. Degenerate Semiconductors

The model of the donor–bound-exciton complex is used to describe bound-exciton complexes in zinc-blende structures. The Hamiltonian of the system may be written

$$H = H_{\mathrm{ex}} + H_d + H_{d\mathrm{ex}} \qquad (24)$$

where $H_{\mathrm{ex}}$ and $H_d$ are the exciton and donor Hamiltonians and $H_{d\mathrm{ex}}$ describes the interaction between the exciton and donor. The Hamiltonian for the exciton is given in Eq. (4). In this equation, $H_{eh}$ is the interaction Hamiltonian between the electron and hole,

$$H_{eh} = -\frac{e^2}{\varepsilon\, |\mathbf{r}_e - \mathbf{r}_h|} + H_{\mathrm{exch}} \qquad (25)$$

Here $\varepsilon$ is the dielectric constant and $H_{\mathrm{exch}}$ is the electron–hole exchange Hamiltonian. The exchange Hamiltonian $H_{\mathrm{exch}}$ is

$$H_{\mathrm{exch}} = A_1\, \boldsymbol{\sigma} \cdot \mathbf{J} + A_2\left(\sigma_x J_x^3 + \sigma_y J_y^3 + \sigma_z J_z^3\right) \qquad (26)$$

where $\boldsymbol{\sigma}$ and $\mathbf{J}$ are the operators for electron spin and effective hole spin, respectively, and $A_1$ and $A_2$ are parameters describing the exchange energy.

The model of the exciton bound to a neutral donor is shown in Fig. 5. In the initial state the two electrons pair to form a bonding state, leaving an unpaired hole. When the exciton collapses from this state, the final state may consist of the donor in the ground state or in an excited state. The donor may pick up energy from the exciton recombination, thus leaving the donor in an excited state.

FIGURE 5 Schematic representation of radiative recombination of an exciton bound to a neutral donor, where the final state is the donor in the ground or in the excited configuration.

FIGURE 6 Schematic representation of radiative recombination of an exciton bound to a neutral acceptor, where the final state is the acceptor in the ground or in the excited configuration.

FIGURE 7 Photoluminescence spectrum of GaAs in the near-bandgap region.

The model of the exciton bound to the acceptor is more complicated than that for the donor. The initial state of the neutral acceptor-bound exciton consists of two $J = \tfrac{3}{2}$ holes and one $J = \tfrac{1}{2}$ electron, as shown in Fig. 6. The two $J = \tfrac{3}{2}$ holes combine to give a $J = 0$ and a $J = 2$ state. The interaction of the electron spin operator $\boldsymbol{\sigma}$ with the effective hole spin operator $\mathbf{J}$ results in three $J + \sigma$ states, $\tfrac{1}{2}$, $\tfrac{3}{2}$, and $\tfrac{5}{2}$. As in the case of the donor, when the exciton collapses, the final state will consist of the neutral acceptor in the ground or an excited state.
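The angular-momentum counting for the acceptor-bound exciton can be reproduced with a few lines of bookkeeping. The sketch below couples two equivalent $J = \tfrac{3}{2}$ holes, keeps only the even total-$J$ pair states (the usual Pauli restriction for two identical fermions in the same shell, stated here as an assumption about the model), and then couples each pair state with the spin-$\tfrac{1}{2}$ electron. The function names are illustrative.

```python
from fractions import Fraction


def couple(j1, j2):
    """All total angular momenta obtained by coupling j1 and j2."""
    j1, j2 = Fraction(j1), Fraction(j2)
    low, high = abs(j1 - j2), j1 + j2
    return [low + step for step in range(int(high - low) + 1)]


def two_equivalent_holes(j):
    """Pair states of two identical fermions in one j shell: even total J only."""
    return [total for total in couple(j, j) if total % 2 == 0]


hole_pair = two_equivalent_holes(Fraction(3, 2))
complex_states = sorted(
    {total for pair in hole_pair for total in couple(pair, Fraction(1, 2))}
)
sublevels = sum(int(2 * total + 1) for total in complex_states)

print("hole-pair J:", [str(j) for j in hole_pair])
print("complex J:  ", [str(j) for j in complex_states])
print("magnetic sublevels:", sublevels)
```

The pair states come out as $J = 0$ and $J = 2$, and adding the electron gives $\tfrac{1}{2}$, $\tfrac{3}{2}$, and $\tfrac{5}{2}$, with twelve magnetic sublevels in total.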

The photoluminescent spectrum for GaAs is shown in Fig. 7, where X is the free exciton transition; the $D^0X$ lines are associated with the neutral donor-bound excitons, and the $A^0X$ lines are associated with the neutral acceptor-bound excitons. The $J = \tfrac{5}{2}$ and $J = \tfrac{3}{2}$ neutral acceptor-bound exciton transitions are clearly observed. The $(D^0X)_{n=2}$ lines are associated with the collapse of the exciton from the neutral donor-bound exciton state, leaving the donor in an excited state. The energy of the transition is expressed as

$$E_T = E_{\mathrm{ex}} - E_{D\mathrm{ex}} - E_D^{*} \qquad (27)$$

where $E_{\mathrm{ex}}$ is the free exciton energy, $E_{D\mathrm{ex}}$ is the binding energy of the exciton to the donor, and $E_D^{*}$ is the energy required to place the neutral donor in an excited state. The analogous neutral acceptor-bound exciton transitions in which the final state of the acceptor is left in an excited state are not shown. These transitions occur at appreciably lower energies, due to the larger binding energies of the acceptors.

2. Nondegenerate Semiconductors

The theory for the nondegenerate case is based on the wurtzite structure, with the salient factors of the band structure such as band symmetries and selection rules being derived from group theory.

Consider any simple optical transition in which an electron bound to an impurity is taken from one band to another. Suppose the initial and final states can be approximately assigned effective mass wave functions. The initial-state wave function can then be written as

$$f_i(r)\, U_{i0}(r) \qquad (28)$$

where $f_i(r)$ is a slowly varying function of $r$ and $U_{i0}$ is the periodic part of the Bloch function of band $i$ for wave vector zero. Similarly, the final-state wave function can be written

$$f_f(r)\, U_{f0}(r) \qquad (29)$$

Since $f$ is a slowly varying function, the optical matrix element

$$\int f_i(r)\, U_{i0}(r)\, P\, f_f^{*}(r)\, U_{f0}^{*}(r)\, d^3r \qquad (30)$$

can be approximately written as

$$\left[\int f_i(r)\, f_f^{*}(r)\, d^3r\right] \frac{1}{\Omega} \left[\int U_{i0}(r)\, P\, U_{f0}^{*}(r)\, d\tau\right] \qquad (31)$$

where the second integration is carried out over the unit cell, whose volume is $\Omega$. In this approximation, the only large optical-matrix elements will arise when the analogous band-to-band transition is allowed.

A similar argument can be made to show that in this effective-mass approximation, large g values can be expected only when the parent energy-band wave functions exhibit large g values.

In the case of weakly bound states at substitutional impurities and energy bands at $k = 0$ in the wurtzite structure, it is reasonable to describe the states as though they belonged to the point group of the crystal rather than to the group of the impurity. Such a description gives the degeneracy of the states correctly. This description neglects certain optical transitions that are technically allowed, but that are weak in the effective mass approximation, and will set equal to zero certain g values that should be much smaller than usual g values. The advantage of the description is that it neglects these small effects and thus permits the full use of group theory without the clutter of what should be small perturbations.

The electron g value, $g_e$, should be very nearly isotropic, since the conduction band is simple and the g shift of the free electron is small, only weakly dependent on the state of binding of the electron. The hole g value, $g_h$, should be completely anisotropic, with $g_h$ equal to zero (for the top $\Gamma_9$) for magnetic fields perpendicular to the hexagonal axis. It is to be expected that the hole g value will be sensitive to its state of binding, since the different valence bands will be strongly mixed in bound-hole states. The model of the exciton bound to a neutral donor for the nondegenerate case is very similar to that for the degenerate case, shown in Fig. 5, for zero applied magnetic field. The unpaired hole for the nondegenerate semiconductor is twofold degenerate, as compared to the fourfold degeneracy for the degenerate semiconductor. The model of the exciton bound to the neutral acceptor for the nondegenerate case is less complicated than for the degenerate case. The initial state of the complex consists of paired doubly degenerate holes, leaving an unpaired electron as shown in Fig. 8. The final state consists of the acceptor, either in the ground state or an excited state. In the absence of an applied magnetic field, a single optical transition is observed.

FIGURE 8 Model of the exciton bound to the neutral acceptor for a nondegenerate semiconductor.

3. Bound-Exciton Excited States

In many materials, on the high-energy side of the neutral donor-bound-exciton complex lines is a similar set of lines, which are excited states of the lower energy complex structure. A rigid-rotator model was proposed to explain these excited states in CdTe. In this model the hole is excited to rotate around the fixed donor, analogous to rotation of diatomic molecules. A non-rigid-rotator model was subsequently proposed, which was successful in predicting the excited-state energies in InP and GaAs. A more sophisticated model followed, which was applied to the $D^0,X$ ground and excited states; this model was successful in predicting the energy ordering of the excited states. A final model was proposed to explain the high-magnetic-field results in InP. In this model $D^0,X$ is considered to be a free exciton orbiting a neutral donor; one electron was considered to be strongly correlated with the hole and the other with the donor. This model was capable of explaining the relative intensities of the photoluminescence transitions in the ground- and excited-state regions of InP.

FIGURE 9

Excited states associated with the $D^0,X$ ground-state transitions were later observed in ZnO. The transitions 3.3662 eV ($\Gamma_6$) and 3.3670 eV ($\Gamma_5$) in Fig. 9 are excited states analogous to rotational states of the H$_2$ molecule. These states are rotational states associated with the 3.3564 eV ground state, and are not electronic excited states. As observed in Fig. 9, these transitions are on the low-energy side of the 3.3772 eV ($\Gamma_5$) and 3.3750 eV ($\Gamma_6$) free exciton transitions. The solid curve in the figure represents spectra with an applied magnetic field of 18 kG. The $\Gamma_6$ exciton is an unallowed transition that becomes allowed in the presence of an applied magnetic field. The dashed curve shows the same transition in zero magnetic field. Note that the rotator state associated with the $\Gamma_6$ exciton was observed. The two lowest energy rotator states are associated with the lowest energy 3.3564 eV $D^0,X$ transition. The next two lowest energy rotator states, 3.3714 eV ($\Gamma_5$) and 3.3702 eV ($\Gamma_6$), are associated with the next lowest energy 3.3594 eV $D^0,X$ transition. It is noted that again one of the rotator states is associated with the $\Gamma_6$ exciton. Other rotator states associated with the $\Gamma_6$ exciton are most likely not resolved since they would come in the energy region where they would not be resolved from other $\Gamma_5$ rotator states. This was the first observation of rotator states associated with the $\Gamma_6$ unallowed exciton, and lends support to the model that the exciton itself rather than the hole is rotating.
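The line positions quoted above can be turned into excitation energies of the rotator states above their respective ground-state lines. The sketch below only performs this subtraction on the numbers given in the text; the variable and function names are illustrative, and no value beyond those quoted is introduced.

```python
# Line positions in eV as quoted in the text for ZnO (Fig. 9).
DONOR_BOUND_GROUND = {"lower": 3.3564, "upper": 3.3594}
ROTATOR_LINES = [
    # (exciton symmetry, line energy in eV, associated ground-state line)
    ("Gamma6", 3.3662, "lower"),
    ("Gamma5", 3.3670, "lower"),
    ("Gamma6", 3.3702, "upper"),
    ("Gamma5", 3.3714, "upper"),
]
FREE_EXCITON = {"Gamma5": 3.3772, "Gamma6": 3.3750}


def rotator_spacings_mev():
    """Energy of each rotator line above its ground-state line, in meV."""
    return [
        (symmetry, ground, round((energy - DONOR_BOUND_GROUND[ground]) * 1e3, 1))
        for symmetry, energy, ground in ROTATOR_LINES
    ]


for symmetry, ground, spacing in rotator_spacings_mev():
    print(f"{symmetry} rotator state of the {ground} line: {spacing} meV above it")

# Every rotator line lies below both free-exciton transitions.
print(all(energy < min(FREE_EXCITON.values()) for _, energy, _ in ROTATOR_LINES))
```

The spacings come out at roughly 10 to 12 meV, and all four lines sit below the free-exciton transitions, consistent with the description of Fig. 9.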

C. Perturbations

1. Magnetic Field

When a magnetic field is applied to the donor-bound exciton complexes in a degenerate semiconductor, the line splitting due to the presence of the magnetic field can be predicted from Fig. 5. In the initial state, the $J = \tfrac{3}{2}$ unpaired hole will split into a quartet while the 1s final state will split into a doublet. This splitting will result in six allowed transitions. When the final state consists of the excited $n = 2$ states, the splitting is much more complicated. The 2s state is doubly degenerate, while the 2p state is sixfold degenerate. In addition to the increased multiplicity of lines, rather large diamagnetic shifts are also observed. The energies of these transitions in a magnetic field have been calculated. In general it is not easy to solve the Hamiltonian for the donor-bound exciton complex in a magnetic field. In the low-field regime, the exciton Hamiltonian of Eq. (4) can be separated into two parts: the spherically symmetric (s-wave-like) and asymmetric (d-wave-like) parts. For the perturbation calculation, in the low-field regime one can treat the s-wave-like part as an unperturbed Hamiltonian and the d-wave-like part as a perturbed Hamiltonian.
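The count of six transitions between the hole quartet and the final-state doublet can be checked by enumeration. The sketch below lists the magnetic sublevels of the $J = \tfrac{3}{2}$ initial state and the spin-$\tfrac{1}{2}$ final state and keeps the pairs whose magnetic quantum numbers differ by 0 or ±1. That dipole selection rule is an assumption of the sketch, and the function names are illustrative.

```python
from fractions import Fraction


def magnetic_sublevels(j):
    """Magnetic quantum numbers -j, -j + 1, ..., +j."""
    j = Fraction(j)
    return [-j + step for step in range(int(2 * j) + 1)]


def dipole_transitions(j_initial, j_final):
    """Pairs (m_initial, m_final) with |m_initial - m_final| <= 1."""
    return [
        (m_i, m_f)
        for m_i in magnetic_sublevels(j_initial)
        for m_f in magnetic_sublevels(j_final)
        if abs(m_i - m_f) <= 1
    ]


transitions = dipole_transitions(Fraction(3, 2), Fraction(1, 2))
for m_i, m_f in transitions:
    print(f"m = {m_i} -> m = {m_f}")
print("allowed transitions:", len(transitions))
```

Of the eight possible quartet-to-doublet pairs, the two that would change the magnetic quantum number by 2 are excluded, leaving six.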

In the high-field regime, i.e., when the magnetic energy is much greater than the Coulomb energy, the solution of Eq. (4) may be obtained by an adiabatic method, which can be written as

$$E_{ij} = L_{ij}\gamma \tag{32}$$

where the L_ij are the linear coefficients for the Landau-type solutions. They turn out to be of the order of L_ij = 0.01. These energies shift much more rapidly with magnetic field than is experimentally observed in the intermediate-field region.

In the intermediate-field regime, where the magnetic energy is of the order of the Coulomb energy, the solution of Eq. (4) is not easily obtained. A phenomenological scheme for solution in this region was used to bridge the gap between the solutions in the low- and high-field regimes. In the intermediate-field region, a variety of functional forms for eigenvalues can be constructed. In the framework of infinite-order perturbation calculations, one may conclude that the dominant correction term will be an even function of an applied magnetic field, provided the linear Zeeman energy term is absorbed in the unperturbed Hamiltonian. For simplicity, the following form for the eigenvalues for all fields was chosen:

$$E_{ij} = E_B + \frac{G_{ij}\gamma + D_{ij}\gamma^2 + \beta_{ij}L_{ij}\gamma^3}{1 + \beta_{ij}\gamma^2} \tag{33}$$

In the low-field regime, the above reduces to the perturbation scheme, and in the high-field regime, it reduces to the adiabatic scheme, that is, the Landau-level-type solutions.
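The two limits of Eq. (33) can be verified numerically. The following is a minimal sketch, not a fit to any measured spectrum: the coefficient values are hypothetical placeholders chosen only to exercise the formula, and the function name is an assumption.

```python
def interpolated_energy(gamma, e_b, g, d, l, beta):
    """Evaluate Eq. (33): E = E_B + (G*gamma + D*gamma^2 + beta*L*gamma^3) / (1 + beta*gamma^2)."""
    numerator = g * gamma + d * gamma**2 + beta * l * gamma**3
    return e_b + numerator / (1.0 + beta * gamma**2)


# Hypothetical coefficients (illustration only, not taken from the article)
params = dict(e_b=-1.0, g=0.5, d=0.2, l=0.01, beta=0.1)

# Low field: the expression approaches the perturbation form E_B + G*gamma + D*gamma^2
small = 1e-3
low_field = params["e_b"] + params["g"] * small + params["d"] * small**2
print(interpolated_energy(small, **params) - low_field)

# High field: the slope E/gamma approaches the Landau-type coefficient L of Eq. (32)
large = 1e6
print(interpolated_energy(large, **params) / large)
```

The single parameter β_ij sets the field scale at which the expression crosses over from the perturbative to the Landau-type behavior.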

The magnetic field splitting of the acceptor-bound exciton is quite complicated. It can be seen from Fig. 6 that the 3 initial states will split into a total of 12 states, and the final 1s state will split into a quartet. These transitions have been observed experimentally; however, the energies of the transitions have not been calculated due to the complexity of the problem, which involves the degenerate valence band.


The theory of bound excitons in nondegenerate semiconductors is based on the wurtzite structure. In considering transitions involving bound excitons formed from holes in the top valence band, the g value of the electron is isotropic. The g value of the hole has the form gh = gh∥ cos θ, where θ is the angle between the c axis of the crystal and the magnetic field direction. The symbols ⊕ and ⊖ refer to ionized donors and acceptors, respectively, and + and − refer to electrons and holes. The neutral donor-bound exciton is very similar to that for the degenerate case in Fig. 5; however, in the initial state, the unpaired hole is only doubly degenerate. Therefore, a total of four transitions is observed in the presence of a magnetic field. For the orientation C ⊥ H, only two transitions will be observed, since the hole g value goes to zero for this orientation. The neutral acceptor-bound exciton will also exhibit a four-line transition in the presence of a magnetic field. In the initial state, the unpaired electron is doubly degenerate, while the final state consisting of the unpaired hole is also doubly degenerate. In this case also, the hole splitting goes to zero for the orientation C ⊥ H, resulting in a two-line transition.

2. Stress Field

In zinc-blende-type semiconductors, the uniaxial strain patterns and electric dipole selection rules have been derived for lines arising from weakly bound exciton complexes. The effect of stress on excitons bound to shallow neutral acceptors in zinc-blende structures has been rather thoroughly investigated.

In the unstrained crystal, a hole from the J = 3/2 (Γ8) valence band in combination with an electron from the J = 1/2 (Γ6) conduction band gives rise to the ground-state exciton. Uniaxial stress splits the J = 3/2 degenerate valence band into two bands, one with Mj = ±1/2, the other with Mj = ±3/2. This splitting is reflected in optical transitions involving holes from the valence band. The shallow acceptor removes an energy state from the valence band and establishes it as a quantum state of lower energy in the gap region. This state is made up of valence-band wave functions and therefore will also reflect valence-band splittings.

When an exciton is captured by the shallow acceptor, an acceptor-bound exciton complex (A0X) is formed that consists of two holes and one electron weakly bound to a negative acceptor ion. In the absence of stress, three transitions are observed, as shown in Fig. 6.

When a uniaxial stress is applied to the (A0X) complex described above, the degeneracy of the states is lifted due to the splitting of the Γ8 hole states. A schematic plot of the resulting energies is shown in Fig. 10. The lower part of the figure shows the splitting of the final (one-hole) state after the collapse of the exciton. The upper part shows the splitting and shifts of the initial energy states prior to exciton decay. The energies of these states have been calculated and compared with experimental observations, as shown in Fig. 11. The lines in Fig. 11 show the predicted energy levels (in the absence of a crystal field) for transitions between the initial and final states, with the center of gravity shift included. The σ lines are for transitions polarized perpendicular to the applied stress, while the π lines show transitions polarized parallel to the applied stress. The agreement between theory and experiment is very good.

FIGURE 10 Schematic plot of the splitting of the (A0X) initial and ground states under uniaxial stress. [From Schmidt, M., Morgan, T. N., and Schairer, W. (1975). Phys. Rev. B 11, 5002.]

FIGURE 11 Experimental (points) and calculated (lines) line shifts under increasing uniaxial stress. The stress is applied in a [100] direction. [From Schmidt, M., Morgan, T. N., and Schairer, W. (1975). Phys. Rev. B 11, 5002.]

D. Multibound Excitons

Sharp photoluminescent lines have been observed at energies less than the energy of the line associated with an exciton bound to a neutral donor in silicon, germanium, and silicon carbide. Similar lines have also been observed that are associated with acceptors in silicon and gallium arsenide. The energies and widths of these lines were such that they could not be explained in terms of any recombination mechanism involving just a single exciton bound to a neutral shallow impurity center. A model involving a multiexciton complex bound to a donor (acceptor) was invoked in which each line was associated with radiative recombinations of an exciton in the bound multiexciton complex.

A series of emission lines was observed in silicon crystals also lightly doped with boron or phosphorus. The series began with the bound-exciton line and converged toward the energetic position of the maximum of emission of the condensed electron–hole state. The emission series is shown for both boron and phosphorus dopants in Fig. 12. The impurities can bind a series of intermediate “multiple-exciton states” containing the single bound exciton and electron–hole droplet state.

A model was proposed in which the multiple-exciton complex is built up by successive capture of free excitons at neutral impurity centers. A multiple-exciton complex having index m can capture another free exciton and then have the index m + 1; the decay of an exciton would decrease the index to m − 1. The observed photon energy hν_m is the difference between the energies of the initial and final states,

$$h\nu_m = E_g - E_{FE} - E_m = h\nu_{FE} - E_m \tag{34}$$

where E_FE is the binding energy of the free exciton and E_m that of an exciton in the m complex. The energy difference between the mth line hν_m and the free-exciton line is a measure of the binding energy E_m.

The model was successful in obtaining an empirical fit to the series of emission lines with the series formulas

$$h\nu^*_m = -18.5\,[1 - \exp(-0.21m)]\ \text{meV} \tag{35}$$

for Si:B (except for the bound exciton line) and

$$h\nu^*_m = -18.5\,[1 - \exp(-0.32m)]\ \text{meV} \tag{36}$$

for Si:P. The calculated line positions are shown in Fig. 12; hν*_0 = 0 corresponds to the free-exciton line.
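The series formulas (35) and (36) are simple enough to tabulate directly. The sketch below evaluates the line shifts relative to the free-exciton line for the first few values of m; the function and dictionary names are hypothetical, and the range of m is chosen only for illustration.

```python
import math


def line_shift_mev(m, rate, depth_mev=18.5):
    """Shift of the m-th line from the free-exciton line, Eqs. (35)-(36), in meV."""
    return -depth_mev * (1.0 - math.exp(-rate * m))


# Exponents quoted in Eqs. (35) and (36)
RATES = {"Si:B": 0.21, "Si:P": 0.32}

for dopant, rate in RATES.items():
    shifts = [round(line_shift_mev(m, rate), 2) for m in range(0, 7)]
    print(dopant, shifts)
```

Both series start at zero shift for m = 0 and converge toward the same limit of −18.5 meV, with the Si:P series converging faster because of its larger exponent.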

FIGURE 12 Emission spectra of (a) Si:B with TO phonon and (b) Si:P without phonon assistance. Excitation intensity 7.5 W cm−2, T = 2 K. The dashed lines indicate the positions of the FEs and the maxima of EHD emission. In Si:P(NP), the FE does not really appear; however, its position is known from the phonon-assisted FE spectrum. The EHD emission in the NP spectrum only appears at higher doping levels and under high excitation. The arrows mark the calculated values hν*_m given by hν*_m = −18.5[1 − exp(−0.21m)] meV for Si:B and hν*_m = −18.5[1 − exp(−0.32m)] meV for Si:P. [From Sauer, R. (1973). Phys. Rev. Lett. 31, 376.]

Other models have been proposed, including a shell model in which all of the electrons and all of the holes in the bound multiexciton complex are assumed to be equivalent and therefore must conform to the Pauli principle. The complex is then built up along the lines of a shell model similar to what has been used to study nuclei and many-electron atoms.

The behavior of these new lines in the presence of magnetic and stress fields helped to establish the viability of the bound multiexciton complex model.

E. Donor–Acceptor Pairs

Donor–acceptor pairs introduce transitions in the bound-exciton region whose behavior is quite different from that of excitons bound to foreign impurities or defects. The pairs can produce bound states distributed in energy. The range of energies results both from the possible impurities or defects interacting as pairs and from a dependence on pair separation. Discrete pair spectra were first observed as a very complicated spectrum in GaP consisting of very many sharp lines. The donors and acceptors will occupy substitutional or interstitial sites. In the case of substitutional sites, both the donor and acceptor can occupy sites on the same sublattice for a compound material such as GaP, or they may be on opposite sublattice sites. Another arrangement is with one impurity at an interstitial site and the other at a particular lattice site. All of these arrangements have been observed.

The energy required to bring a hole and an electron from infinity to an ionized donor–acceptor pair separated by a distance R may be written as

$$E(R) = E_g - E_A - E_D + \frac{e^2}{\varepsilon R} \tag{37}$$

In this expression E(R) is the energy of the pair recombination line, E_g the band gap of the semiconductor, E_A and E_D the acceptor- and donor-binding energies, respectively, R the donor–acceptor separation, and ε the low-frequency dielectric constant. When the donor–acceptor distances become small [R < R0 = (donor–acceptor concentration)^(1/3)], a van der Waals attractive term may become important, and Eq. (37) becomes

$$E(R) = E_g - E_A - E_D + \frac{e^2}{\varepsilon R} - \frac{e}{K}\left(\frac{a}{R}\right)^6 \tag{38}$$

In the case of random pair distribution, it would be expected that over a small range of R, the line intensity would reflect the statistical probability of a specific pair occurring. In considering GaP, which has the zinc-blende structure, and assuming that both the donors and acceptors result from substitutional impurities and that both occupy sites on the same sublattice, it is possible to relate R to a given observed line. For the preceding case, $R_m = a_0(\tfrac{1}{2}m)^{1/2}$, where a0 is the GaP lattice constant and R_m the distance to the mth nearest neighbor, or the radius of the mth shell. The donors and acceptors occupy face-centered cubic sites, and the number of pairs for a given m can be tabulated. The variation in number of pairs allows a correlation with observed spectra. For the case when the donors and acceptors occupy opposite sublattice sites, $R_m = a_0(\tfrac{1}{2}m - \tfrac{5}{16})^{1/2}$ and N(R) > 0 for all m.

The values of E(R) and R are determined from experiment; therefore Eqs. (37) and (38) can be helpful in identifying the donors and acceptors involved in donor–acceptor pair recombination.
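The shell counting described above can be reproduced by brute-force enumeration of lattice vectors. The sketch below counts sites on each shell for the same-sublattice and opposite-sublattice cases and evaluates the Coulomb term of Eq. (37). The lattice constant (5.45 Å) and dielectric constant (11.1) are assumed illustrative values for GaP, not taken from the text, and all function names are hypothetical.

```python
import math

E2_EV_ANGSTROM = 14.3996  # e^2 in eV*Angstrom (Gaussian units)


def same_sublattice_count(m):
    """Sites at R_m = a0*sqrt(m/2): fcc vectors (a0/2)(h,k,l) with h+k+l even."""
    target = 2 * m  # squared length in units of (a0/2)^2
    span = math.isqrt(target)
    count = 0
    for h in range(-span, span + 1):
        for k in range(-span, span + 1):
            for l in range(-span, span + 1):
                if (h + k + l) % 2 == 0 and h * h + k * k + l * l == target:
                    count += 1
    return count


def opposite_sublattice_count(m):
    """Sites at R_m = a0*sqrt(m/2 - 5/16): fcc vectors shifted by (a0/4)(1,1,1)."""
    target = 8 * m - 5  # squared length in units of (a0/4)^2
    span = math.isqrt(target) // 2 + 1
    count = 0
    for h in range(-span - 1, span + 1):
        for k in range(-span - 1, span + 1):
            for l in range(-span - 1, span + 1):
                x, y, z = 2 * h + 1, 2 * k + 1, 2 * l + 1
                if (h + k + l) % 2 == 0 and x * x + y * y + z * z == target:
                    count += 1
    return count


def coulomb_term_ev(m, a0=5.45, eps=11.1, same_sublattice=True):
    """Coulomb term e^2/(eps*R_m) of Eq. (37); a0 and eps are assumed values."""
    offset = 0.0 if same_sublattice else 5.0 / 16.0
    radius = a0 * math.sqrt(0.5 * m - offset)
    return E2_EV_ANGSTROM / (eps * radius)


for m in (1, 2, 3, 14):
    print(m, same_sublattice_count(m), opposite_sublattice_count(m))
```

In this enumeration some same-sublattice shells, such as m = 14, contain no sites at all, which is one way the two site arrangements can be told apart when lines are matched to shell numbers.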

III. INTERACTION OF EXCITONS WITH OTHER SYSTEMS

A. Phonons

Emission from bound exciton complexes has been observed in many materials. These are very sharp transitions which in many cases are replicated by emission lines that are separated in energy from the parent transition by an optical phonon energy for the particular lattice in question. In crystals having the wurtzite symmetry there is a Γ1 and a Γ5 longitudinal optical–transverse optical (LO–TO) splitting due to long-range electrostatic forces as well as a Γ1–Γ5 LO–LO and TO–TO splitting due to anisotropic short-range interatomic forces. The Γ1–Γ5 LO–LO splitting has been observed on the phonon sidebands, due to the interaction with the macroscopic longitudinal optical phonon electric field, in both CdS and ZnO. In CdS, the phonon-assisted transition results from the collapse of the exciton bound to a neutral acceptor with the creation of an LO phonon. Both the Γ1 and the Γ5 LO phonons are created in this process. These two phonons differ in energy by 2.4 cm−1. This small difference in energy is clearly resolved, showing that the phonon-assisted transitions are not appreciably broadened by the phonon interaction.

In the case of CdS, the exciton is rather weakly bound (17 meV) to the acceptor. This results in a localized state in K space. The phonon energies show that it is localized near K = 0. The phonon dispersion curves were calculated for CdS using a mixed binding model. In this model the potential contains a short-range part corresponding to covalent bonding and a long-range part due to Coulomb interactions between point ion charges. The calculations show that the LO phonon dispersion curves are quite flat in the vicinity of K = 0. This would account for the very small line broadening observed in the phonon interaction.

A similar interaction between the LO phonon and an exciton bound to a neutral donor was observed in ZnO. In this material, the energy separation between the Γ1 LO phonon and the Γ5 LO phonon is larger (11 cm−1).

1. Exciton–Bound-Phonon Quasi-Particle

Optical transitions have been observed in a number of ionic crystals in which the energy separating the parent transition and its LO-phonon sideband is less than the LO-phonon energy ħω0 by approximately 10%. In absorption spectra of AgBr:I, transitions associated with the bound exciton occur at energy separations approximately 30% less than ħω0. These results can be explained in terms of a bound-phonon quasi-particle model. The calculated binding energies and oscillator strengths for this new quasi-particle can account for phonon interactions whose energies are less than that of the LO phonon ħω0. LO phonons bound to neutral donors have been observed in both Raman-scattered and luminescence spectra of GaP crystals that were doped with S, Te, Si, and Sn. These results were interpreted as impurity modes associated with dielectric effects of the neutral donors, rather than as local modes associated with mass defects of the substituents.

The spectra due to Raman scattering from the neutral donor LO-phonon bound states are shown in Fig. 13. The binding energies associated with the exciton–bound-phonon quasi-particle for several different donors are given in Table I.

FIGURE 13 Raman scattering of 5145-Å Ar+ laser light from GaP containing ∼10^15 cm−3 neutral Sn, Te, or S donors, recorded just below the k = 0, LO phonon at 50.2 meV, showing the new local modes. These modes can be seen easily at donor concentrations as low as 10^17 cm−3, although their strengths relative to the LO Γ lattice normal mode decrease in proportion to the neutral donor concentrations. [From Dean, P. J., Manchon, Jr., D. D., and Hopfield, J. J. (1970). Phys. Rev. Lett. 25, 1027.]

The virtual process

donor + LO phonon k → excited donor → donor + LO phonon k′

produces the interaction of an LO phonon with a donor site. The effective scattering matrix element H_kk′ is proportional to

$$\sum_j \frac{(E_j - E_0)\,\langle 0|e^{i\mathbf{k}\cdot\mathbf{r}}|j\rangle\langle j|e^{i\mathbf{k}'\cdot\mathbf{r}}|0\rangle}{\left[(E_j - E_0)^2 - (\hbar\omega)^2\right] K K'} \tag{39}$$

where |0⟩ is the donor ground-state wave function and the sum is over excited donor states j. The interaction is attractive when ħω is less than the excitation energy of the donor. For a spherical approximation, the interaction will produce bound states for each angular momentum of the LO phonon around the donor. When the first excited state of the donor is comparable with the phonon frequency, an approximation in which only the lowest donor excited state is kept will give a reasonable lower bound to the binding energy for H_KK′.

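Using this approximation, one obtains the following wave functions and energies for the s and p states. For s states,

$$E_B = \frac{32}{729}\,\hbar\omega_0\left(\frac{\varepsilon_0}{\varepsilon_\infty} - 1\right)\frac{e^2}{2a\varepsilon_0}\,\frac{E_{2s} - E_{1s}}{(E_{2s} - E_{1s})^2 - (\hbar\omega_0)^2} \tag{40}$$

$$\Psi_K \propto \frac{K}{\left[\left(\frac{3}{2a}\right)^2 + K^2\right]^3} \tag{41}$$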

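For p states,

$$E_B = \frac{224}{6561}\,\hbar\omega_0\left(\frac{\varepsilon_0}{\varepsilon_\infty} - 1\right)\frac{e^2}{2a\varepsilon_0}\,\frac{E_{2p} - E_{1s}}{(E_{2p} - E_{1s})^2 - (\hbar\omega_0)^2} \tag{42}$$

$$\Psi_K \propto \frac{\cos\theta}{\left[\left(\frac{3}{2a}\right)^2 + K^2\right]^3} \tag{43}$$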

In these expressions, ε0 and ε∞ are the static and high-frequency dielectric constants of the Fröhlich electron–phonon interaction. The donor Bohr radius is a, and the donor binding energy is e2/2aε0 for a hydrogenic donor. The theoretical binding energies from Eqs. (40) and (42) are included in Table I.
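As a worked illustration of Eqs. (40) and (42), the sketch below evaluates both binding energies for the sulfur donor. It uses the LO-phonon energy of 50.2 meV quoted in the Fig. 13 caption and the sulfur energies listed in Table I, reading the first numerical column of that table as the donor binding energy e2/2aε0 (an interpretation made for this sketch). The dielectric constants ε0 = 11.1 and ε∞ = 9.1 are assumed values for GaP that do not appear in the text, and the function and constant names are hypothetical.

```python
S_STATE_PREFACTOR = 32.0 / 729.0    # numerical factor in Eq. (40)
P_STATE_PREFACTOR = 224.0 / 6561.0  # numerical factor in Eq. (42)


def bound_phonon_binding_mev(prefactor, hw0, eps0, eps_inf, e_donor, e_excitation):
    """Binding energy of an LO phonon at a neutral donor, Eqs. (40) and (42).

    hw0          LO-phonon energy (meV)
    e_donor      hydrogenic donor binding energy e^2/(2*a*eps0) (meV)
    e_excitation donor excitation energy E_2s - E_1s or E_2p - E_1s (meV)
    """
    coupling = eps0 / eps_inf - 1.0
    resonance = e_excitation / (e_excitation**2 - hw0**2)
    return prefactor * hw0 * coupling * e_donor * resonance


# Sulfur donor in GaP; eps0 and eps_inf are assumed values (not from the text)
common = dict(hw0=50.2, eps0=11.1, eps_inf=9.1, e_donor=104.2)
eb_s = bound_phonon_binding_mev(S_STATE_PREFACTOR, e_excitation=82.6, **common)
eb_p = bound_phonon_binding_mev(P_STATE_PREFACTOR, e_excitation=94.0, **common)
print(round(eb_s, 2), round(eb_p, 2))
```

With these assumed dielectric constants, both values come out close to the calculated sulfur entries of Table I. The resonance factor shows why the binding grows as the donor excitation energy approaches the phonon energy from above.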

B. Photons

1. Spatial Resonance Dispersion

For those states in a crystal where the photon wave vector and the exciton wave vector are essentially equal, the energy denominator for exciton–photon mixing is small and the mixing becomes large. These states are not to be considered as pure photon states or pure exciton states, but rather mixed states. Such a mixed state has been called a polariton. When there is a dispersion of the dielectric constant, spatial dispersion has been invoked to explain certain optical effects of crystals. It was originally thought that it would introduce only small corrections to such things as the index of refraction, until it was demonstrated that if there was more than one energy transport mechanism, as in the case of excitons, this was not true. Spatial dispersion addresses the possibility that two different kinds of waves of the same energy and same polarization can exist in a crystal differing only in wave vector. The one with an anomalously large wave vector is an anomalous wave. In the treatment of dispersion by exciton theory, it was shown that if the normal modes of the system were allowed to depend on the wave vector, a much higher order equation for the index of refraction would result. The new solutions occur whenever there is any curvature of the ordinary exciton band in the region of large exciton–photon coupling. These results apply to the Lorentz model as well as to quantum-mechanical models whenever there is a dependence of frequency on wave vector.

TABLE I Experimental and Calculated Binding Energies EB of LO Phonons Localized at Neutral Donors in GaP Observed through Raman Scattering (EB)p, and as Sidebands in the Luminescence of Excitons Bound to Neutral Donors (EB)s^a

| Donor | E_D (meV) | E1s(A1)→E2s(A1) (meV) | E1s(A1)→E2p^b (meV) | (EB)s calc (meV) | (EB)s exp (meV) | (EB)p calc (meV) | (EB)p exp (meV) |
|---|---|---|---|---|---|---|---|
| S | 104.2^c | 82.6^c | 94.0^g | 0.98 | 1.2 ± 0.2 | 0.58 | 0.8 ± 0.2 |
| Te | 89.8^d | 68.2^d | 79.9^d | 1.40 | 1.9 ± 0.2 | 0.70 | 1.2 ± 0.2 |
| Si | 82.5^c | 70.2^f | 72.3^c | 1.16 | ? | 0.82 | 1.3 ± 0.2 |
| Sn | 65.5^e | 53.2^e | 58.5^e | [3.6]^h | ? | [1.35]^h | 1.6 ± 0.2 |

a From Dean, P. J. et al. (1970). Phys. Rev. Lett. 25, 1027.
b Calculated as weighted mean of E1s(A1) → E2p0 and E1s(A1) → E2p±.
c From Onton, A. (1969). Phys. Rev. 186, 786.
d From Onton, A., and Taylor, R. C. (1970). Phys. Rev. B 1, 2587.
e From Dean, P. J., Faulkner, R. A., Schonherr, E. G. (1970). Proc. Int. Conf. Phys. Semicond., 10th, Cambridge, Mass., p. 286.
f Assuming the E2s(A1) is the same as for Sn.
g Assuming the E2p0 and E2p± are the same for S and Te.
h Using degenerate perturbation theory.

It was pointed out early in the investigation of spatial dispersion that the specific dipole moment of polarization of a crystal and the electric field intensity are not in direct proportion. It was found that the two were related by a differential equation that resulted in giving Maxwell equations of higher order. This led to the existence of several waves of the same frequency, polarization, and direction but with different indices of refraction. Subsequent studies of the reflectivity of CdS demonstrated the effects of spatial dispersion. Extensive calculations resulted in the following expression for the index of refraction:

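$$n^2 = \frac{c^2k^2}{\omega^2} = \varepsilon + \sum_j \frac{4\pi\left(\alpha_{0j} + \alpha_{2j}k^2\right)\omega_{0j}^2}{\omega_{0j}^2 + \left(\hbar\omega_{0j}k^2/m_j^*\right) - \omega^2 - i\omega\Gamma_j} \tag{44}$$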

In this equation the sum over j is to include the excitons in the frequency region of interest, and the contributions from other oscillators are included in a background dielectric constant ε. In Eq. (44) one has expanded both the numerator and denominator in powers of k, keeping terms to order of k2; m∗ is the sum of the effective masses of the hole and electron that comprise the exciton, and ω0j is the frequency of the jth oscillator at k = 0. Eliminating k2 from Eq. (44) and neglecting the line width of the oscillators, the working equation is

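$$n^2 = \varepsilon + \sum_j \frac{4\pi\left(\alpha_{0j} + \alpha_{2j}\omega^2n^2/c^2\right)\omega_{0j}^2}{\omega_{0j}^2 + \left(\hbar\omega_{0j}\omega^2n^2/c^2m_j^*\right) - \omega^2} \tag{45}$$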

The sum is over excitons from the top two valence bands, where the “allowed” excitons have been included with α0j ≠ 0, α2j = 0 while the “forbidden” excitons (which are seen only because k ≠ 0) are included with α0j = 0, α2j ≠ 0. The above equation reduces to a polynomial in n2 whose roots give the wavelengths of the various “normal modes” for transfer of energy within the crystal.

In the classical case, α2j = ħω0j/m∗j = 0. For a given frequency, the two roots of n2 for Eq. (45) are −n and +n. Thus in the classical case, for a given principal polarization, frequency, and direction of propagation, only one transverse mode exists.
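To make the mode counting concrete, the sketch below solves Eq. (45) for a single allowed oscillator (α2 = 0), for which the equation becomes a quadratic in n2. Clearing the denominator gives b n4 + (ω0² − ω² − εb) n2 − [ε(ω0² − ω²) + 4πα0 ω0²] = 0, with b = (ħω0/m∗c²) ω². All numerical values are hypothetical and dimensionless, chosen only to show the structure; the function name and the `mass_ratio` parameter are assumptions of this example.

```python
import math


def index_squared_roots(omega, omega0, eps, four_pi_alpha0, mass_ratio):
    """Roots n^2 of Eq. (45) for one allowed oscillator (alpha_2 = 0, no damping).

    mass_ratio stands for hbar*omega0/(m* c^2); setting it to zero gives the
    classical case without spatial dispersion. Non-negative inputs are assumed.
    """
    delta = omega0**2 - omega**2
    strength = four_pi_alpha0 * omega0**2
    b = mass_ratio * omega**2
    if b == 0.0:
        # Classical limit: the equation is linear in n^2, so only one mode exists
        return [eps + strength / delta]
    qa = b
    qb = delta - eps * b
    qc = -(eps * delta + strength)
    root = math.sqrt(qb * qb - 4.0 * qa * qc)
    return [(-qb + root) / (2.0 * qa), (-qb - root) / (2.0 * qa)]


# Hypothetical parameters (frequencies in arbitrary energy units)
print(index_squared_roots(2.54, 2.55, 8.0, 0.01, 0.0))
print(index_squared_roots(2.54, 2.55, 8.0, 0.01, 5e-6))
```

With spatial dispersion switched on, a second root appears in addition to the ordinary one, corresponding to the additional wave discussed above.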

2. Two-Photon Processes

The concept of two-photon processes dates back more than 40 years, and was first treated theoretically. The observation of two-photon transitions occurred approximately 30 years later. The two-photon transition is a nonlinear process, and as such its full potential as a tool for investigating material parameters was not realized until the advent of laser sources.

The process is one in which two quanta are simultaneously absorbed in an electronic transition. The energy sum of the two quanta must be equal to the energy of the electronic transition. In one-photon absorption spectra, the absorption coefficient as a function of photon energy is obtained. In two-photon transitions, the absorption is dependent on two frequencies; therefore, instead of a plane curve, the spectrum is a two-dimensional surface. The two processes are complementary; however, they yield different information. The single-photon process is allowed between states of different parity, while the two-photon transitions are allowed between states of the same parity. The polarization dependence of two-photon absorption is more complicated than it is for one-photon processes. Even for isotropic materials, the mutual orientation of the electric vectors of the absorbed fields is important. Thus, the two-photon spectrum contains more information than the one-photon spectrum. All two-quantum processes require an intermediate state in which one photon is absorbed or emitted and the atom is left in an excited state.

The fine structure of the 2P states of the exciton from the top valence level in CdS has been studied. In this experiment, a visible dye laser and a CO2 laser were used. From the polarization selection rules, it was shown that the visible photon created the virtual 1s exciton, and the absorption of the infrared photon brings it to the final p state. Using these results, the two-photon absorption coefficient was calculated and compared with the experimentally measured results. The agreement between theory and experiment is very good.

IV. SPECIAL PROPERTIES OF EXCITONS

A. Introduction

The excitonic properties of the semiconductor are key to the development of a large number of devices. In this section five of the properties that have direct application will be outlined. The first is the exciton states in quantum wells. The binding energy of the exciton can be adjusted through the thickness of the layers of the different semiconductor materials. This leads to different energy responses, as well as many other electrical and optical properties.

The second area is hole–electron droplets, which are extremely useful in the study of many-body effects and the phase transition from a gas into a liquid. The role of excitons in developing a high-temperature superconductor is given in Section IV.D. When this property is exploited, it could have a major impact on modern electronic devices. The last two areas outlined are lasing transitions and optical bistability. These two areas lead to totally optical switching devices, which again may lead to many useful devices such as the optical computer.

B. Excitons in Quantum Wells and Quantum Dots

Superlattice structures have generated considerable interest for more than a decade because of the novel transport phenomena predicted for such structures. The superlattice is a multilayered periodic structure having dimensions varying from a few angstroms to hundreds of angstroms. Carriers in semiconductor superlattices may be confined to certain layers by the superlattice potential variations, resulting in new conductivity properties. The allowed carrier energy levels are determined by quantization effects when the confined regions are sufficiently small. The confined regions produce new optical effects as well as electrical effects. Superlattices emerged as practical structures when the metal organic chemical vapor deposition (MOCVD) and molecular beam epitaxy (MBE) crystal growth techniques evolved, making high-quality structures feasible. Very thin layers with smooth surface morphology can be grown by these techniques. One of the very common heterostructures produced by these techniques is GaAs/AlxGa1−xAs. By cladding the GaAs layer with GaAlAs barriers, the electrons and holes are confined within the GaAs well, resulting in a modification of their energy levels in the well. Repeating the growth of these layers results in a multiquantum well (MQW) structure, the number of wells being equal to the number of repeated cycles. When the layer thicknesses are small, the electrons are confined as quantized electron waves. If the barrier-layer thicknesses are large enough, tunneling between wells does not occur. The confinement of carriers within the GaAs well results in an effective increase in the bandgap. The low-temperature bandgap of bulk GaAs is 1.5196 eV. AlxGa1−xAs has a direct bandgap for x < 0.45, with a bulk bandgap which is 1.25x eV greater than that of GaAs. In the quantum well structure, the difference in bandgap is divided between the conduction band and the valence band. The percentage contribution to each band is a measure of the confining barrier for that band. Well-size quantization results in a shift of the allowed energies for electrons. If infinite confining barriers are assumed, the allowed minimum energies for electrons are given by

$$E_n = \frac{h^2n^2}{8m^*L^2} \tag{46}$$

where h is Planck’s constant, n is an integer marking the number of half-wavelengths of the confined electron, m∗ is the effective mass, and L is the well thickness. These energy shifts for electrons in the conduction band are shown in Fig. 14.
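A quick numerical evaluation of Eq. (46) gives a feel for the size of the confinement shift. The sketch below assumes an electron effective mass of 0.067 m0 for GaAs and a 100 Å well; both numbers are illustrative assumptions not taken from the text, and the names are hypothetical.

```python
# h^2 / (8 m0) for the free-electron mass, in eV * Angstrom^2
H2_OVER_8M0_EV_A2 = 37.603


def infinite_well_energy_ev(n, width_angstrom, mass_ratio):
    """Eq. (46): E_n = h^2 n^2 / (8 m* L^2), with m* = mass_ratio * m0."""
    return H2_OVER_8M0_EV_A2 * n**2 / (mass_ratio * width_angstrom**2)


# Assumed values: GaAs conduction-band mass 0.067 m0, well width 100 Angstrom
for n in (1, 2, 3):
    print(n, round(1000.0 * infinite_well_energy_ev(n, 100.0, 0.067), 1), "meV")
```

The n2 scaling and the 1/(m∗L2) dependence mean that narrow wells and light carriers give the largest shifts, which is why particles of different mass are shifted by different amounts in the same well.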

The energy levels in the valence band of the quantum well are also modified. In bulk GaAs, the light and heavy hole valence bands are degenerate at K = 0. From Eq. (46) it is seen that the energy of the confined particles is different for different masses. The layered structure has reduced the cubic symmetry of bulk GaAs to uniaxial symmetry. The optical transitions thus become nondegenerate. This feature is shown in the valence band of Fig. 14.

FIGURE 14 Quantum-state energy levels in GaAs/AlxGa1−xAs quantum well. CB and VB refer to conduction and valence bands, respectively.

The line shape of the absorption and photoluminescence bands of GaAs (MQW) is excitonic. Verification of these bands as being assigned to light- and heavy-hole free-exciton transitions has been made from polarization measurements. These measurements include optical spin-orientation measurements and linear polarization measurements of emission emanating from a cleaved edge of the MQW.

The lowest conduction and valence subband levels are shown in Fig. 14. In absorption and photoluminescence, it is the excitons associated with these subband levels that are observed. A summary of a calculation of the energy levels of heavy- and light-hole excitons associated with the lowest electron and hole subbands for finite values of the potential barrier heights is presented.

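The Hamiltonian of an exciton associated with either the heavy- or the light-hole band, in a quantum well structure as shown in Fig. 14, within the framework of an effective mass approximation is given by:

$$H = -\frac{\hbar^2}{2\mu_\pm}\left[\frac{1}{\rho}\frac{\partial}{\partial\rho}\rho\frac{\partial}{\partial\rho} + \frac{1}{\rho^2}\frac{\partial^2}{\partial\phi^2}\right] - \frac{\hbar^2}{2m_e}\frac{\partial^2}{\partial z_e^2} - \frac{\hbar^2}{2m_\pm}\frac{\partial^2}{\partial z_h^2} - \frac{e^2}{\varepsilon_0|\mathbf{r}_e - \mathbf{r}_h|} + V_{ew}(Z_e) + V_{hw}(Z_h) \tag{47}$$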

where me is the effective mass of the conduction electron, ε0 is the static dielectric constant, m± is the heavy (+) or light (−) hole mass along the Z direction, and µ± is the reduced mass corresponding to heavy (+) or light (−) hole bands in the plane perpendicular to the Z axis. Both µ± and m± can be expressed in terms of the Kohn–Luttinger band parameters γ1 and γ2 as

$$\frac{1}{\mu_\pm} = \frac{1}{m_e} + \frac{1}{m_0}(\gamma_1 \pm \gamma_2) \tag{48}$$

and

$$\frac{1}{m_\pm} = \frac{1}{m_0}(\gamma_1 \mp 2\gamma_2) \tag{49}$$

where m0 is the free electron mass. In these equations the upper sign refers to the JZ = ±3/2 (heavy-hole) band and the lower sign to the JZ = ±1/2 (light-hole) band. The positions of the electron and hole are designated by re and rh, respectively; ρ, φ, and Z are the cylindrical coordinates. The potential wells for the conduction electron, Vew(Ze), and the holes, Vhw(Zh), are assumed to be square wells of width L:

$$V_{ew}(Z_e) = \begin{cases} 0, & |Z_e| < L/2 \\ V_e, & |Z_e| > L/2 \end{cases} \tag{49a}$$

and

$$V_{hw}(Z_h) = \begin{cases} 0, & |Z_h| < L/2 \\ V_h, & |Z_h| > L/2 \end{cases} \tag{49b}$$

The values of Ve and Vh are determined from the Al con-centration in the AlxGa1−xAs barrier.
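The mass relations of Eqs. (48) and (49) can be evaluated directly once band parameters are chosen. The sketch below uses γ1 = 6.85 and γ2 = 2.1 with an electron mass of 0.067 m0; these are assumed illustrative values for GaAs that are not given in the text, and the function name is hypothetical. All masses are expressed in units of the free electron mass m0.

```python
def exciton_masses(gamma1, gamma2, electron_mass):
    """Hole mass along Z and in-plane reduced mass from Eqs. (48) and (49).

    Upper sign (+1) is the heavy-hole band, lower sign (-1) the light-hole band.
    All masses are in units of the free electron mass m0.
    """
    result = {}
    for label, sign in (("heavy", +1), ("light", -1)):
        inverse_mu = 1.0 / electron_mass + (gamma1 + sign * gamma2)  # Eq. (48)
        inverse_mz = gamma1 - sign * 2.0 * gamma2                    # Eq. (49)
        result[label] = {"m_z": 1.0 / inverse_mz, "mu": 1.0 / inverse_mu}
    return result


# Assumed Kohn-Luttinger parameters and electron mass (illustration only)
for band, values in exciton_masses(6.85, 2.1, 0.067).items():
    print(band, {name: round(value, 4) for name, value in values.items()})
```

Note that the band that is heavier along Z has the smaller in-plane reduced mass, which follows from the opposite signs of the γ2 terms in Eqs. (48) and (49).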

An exact solution of the Schrödinger equation corresponding to the exciton Hamiltonian Eq. (47) is not possible. A variational approach was used to calculate the ground-state energy E1 of the Hamiltonian. The binding energy of the ground state of an exciton E1s is then obtained by subtracting E1 from the sum of the lowest electron and hole subband energies (Ee + Eh). These subband energies are determined by solving the transcendental equations for finite square wells.
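The subband step can be illustrated with a small root finder. The sketch below solves the textbook even-parity condition k tan(kL/2) = κ for the lowest level of a finite square well by bisection. It assumes the same effective mass in the well and in the barrier, which is a simplification introduced here and not necessarily the treatment used in the calculation described above; the barrier height, well width, and mass in the usage lines are hypothetical.

```python
import math

HBAR2_OVER_2M0_EV_A2 = 3.80998  # hbar^2 / (2 m0) in eV * Angstrom^2


def lowest_subband_ev(barrier_ev, width_angstrom, mass_ratio, iterations=200):
    """Lowest level of a finite square well from k*tan(k*L/2) = kappa.

    Assumes equal effective mass (mass_ratio * m0) inside and outside the well.
    """
    def mismatch(energy):
        k = math.sqrt(mass_ratio * energy / HBAR2_OVER_2M0_EV_A2)
        kappa = math.sqrt(mass_ratio * (barrier_ev - energy) / HBAR2_OVER_2M0_EV_A2)
        return k * math.tan(0.5 * k * width_angstrom) - kappa

    # The root lies below both the barrier height and the infinite-well level
    infinite_level = HBAR2_OVER_2M0_EV_A2 * math.pi**2 / (mass_ratio * width_angstrom**2)
    low, high = 0.0, min(barrier_ev, infinite_level)
    for _ in range(iterations):
        middle = 0.5 * (low + high)
        if mismatch(middle) > 0.0:
            high = middle
        else:
            low = middle
    return 0.5 * (low + high)


# Hypothetical electron well: 0.25 eV barrier, 100 Angstrom width, m* = 0.067 m0
print(round(1000.0 * lowest_subband_ev(0.25, 100.0, 0.067), 2), "meV")
```

Because the mismatch function increases monotonically on the search interval, bisection converges to the single ground-state root; the level always lies below the infinite-barrier value and approaches it as the barrier is raised.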

The binding energy of the ground state of a heavy-hole exciton E1s(h) (solid lines) and a light-hole exciton E1s(l) (dashed lines) as a function of well width L for different values of the potential-barrier heights are shown in Fig. 15. For a given value of X, the value of E1s(h) increases as L is reduced until it reaches a maximum and then drops quite rapidly. Similar behavior is exhibited by E1s(l). The explanation of this behavior is that as L is reduced the exciton wave function is compressed in the quantum well, leading to increased binding. However, beyond a certain value of L the spread of the exciton wave function into the barrier becomes important. This causes the binding energy to approach the value in bulk AlxGa1−xAs as L becomes diminishingly small.

It is seen that for a given value of X, E1s(l) is larger than E1s(h) for L greater than a certain critical value Lc, at which they become equal. For values of L below Lc, E1s(l) is smaller than E1s(h). The value of Lc depends on X; the larger the value of X, the smaller the value of Lc. For X = 0.3, Lc = 50 Å. This behavior can be understood as follows: the value of E1s(l) is greater than that of E1s(h) for large L. Both increase as L is reduced, E1s(l) less rapidly than E1s(h), as proportionately more of the light-hole exciton wave function penetrates into the barrier, thus reducing the increase in E1s(l). At a certain value of L, which depends on X, the two values become equal, and then E1s(l) becomes smaller as L is reduced further. This contrasts with the behavior of E1s(h) and E1s(l) for infinite potential barriers, where E1s(l) is always larger than E1s(h).

FIGURE 15 Variation of the binding energy of the ground state, E1s, of a heavy-hole exciton (solid lines) and a light-hole exciton (dashed lines) as a function of the GaAs quantum-well size (L) for Al concentrations X = 0.15 and 0.3, and for an infinite potential well.

In the case of quantum dots, two general configurations are considered. The energy levels in the first case, type I quantum dots, are similar to those in Fig. 14. Here the conduction band of the quantum dot is lower in energy than the bulk conduction band and the valence-band energy is higher than that of the bulk valence band. If the combination of dot size and potential barrier is large enough, an exciton may be bound to the quantum dot. The overlap of the electron and hole wave functions of the dot exciton will be larger than that of the bulk exciton, leading to very interesting studies of the exchange and correlation energy terms of the exciton. One can also create excitons and other multiexcitons, which are also found within the quantum dot, leading to detailed studies of these many-body quasi-states as well.

In the second case, type II quantum dots, either the valence or the conduction band energy of the quantum dot is lower than the corresponding bulk energy band. This leads to one of the carriers of the exciton being found within the quantum dot, while the other carrier is made up of bulk band states and is, in general, insensitive to the energy states of the dot. As in the case of quantum wells, superlattices of both type I and type II quantum dots will lead to more and novel device applications.

C. Hole-Electron Droplets

Nonequilibrium electrons and holes in semiconductors are bound in excitons at low temperatures by Coulomb attraction. The exciton forms because it represents a state of slightly lower energy than the unbound hole–electron pair. At high exciting intensities the density of hole–electron pairs is increased and excitons are formed at a higher rate. At high concentrations the interaction among electrons becomes very important, and when a certain threshold density is reached liquid droplets are formed. These collective droplets consist of nonequilibrium electrons and holes; therefore, when electron–hole recombination occurs, specific radiation is emitted.

This intense radiation was first observed in Si. The spectra contained, in addition to the well-known peaks due to the annihilation of free excitons with appropriate phonons, some broad bands of radiation shifted toward lower energies.

The formation of a condensed phase of nonequilibrium carriers was considered to account for the new radiation peaks. If the collective interaction predominates in the condensed phase, survival of excitons as quasi-particles is doubtful. It was shown that the condensed phase consists of a degenerate electron–hole plasma characterized by metallic properties.

The most convincing evidence that the new substance is in a separate phase was obtained from light-scattering experiments. The effect is analogous to the scattering of light by drops dispersed in a fog. The range of angles through which the radiation is scattered depends on the size of the particles, the larger scattering angles resulting from smaller particles.

The far-infrared absorption of the condensed phase in Ge was investigated. The results were explained by assuming that the absorption is caused by the excitation of plasmons in the drops of the condensed phase. From these experiments the drop radius was estimated to be R ≈ 10⁻³ to 10⁻⁴ cm.


When the excitons enter the liquid state (electron–hole drops), the electron and hole give up their exclusive association and enter a sea of particles in which they are bound equally to all of the other charge carriers in the droplet. The droplet therefore is made up of independent electrons and holes. Since the density in the droplet is greater than in the exciton gas, it can be qualitatively understood why droplets form by considering the relation of energy to charge carrier distance. Free electrons and holes, which have the greatest separation, recombine with the highest energy. Excitons form only when the electron and hole are coupled, resulting in an appropriate Bohr radius for the particular material being considered (in the case of Ge it is approximately 115 Å). The energy of the resulting recombination radiation from the excitons is less than the energy of the recombination radiation of the free electrons and holes, reflecting the exciton binding energy. In the case of the liquid droplet, the electron–hole distance is still further reduced (100 Å for Ge). Therefore, the liquid state forms because it is a reduced energy state of the system.

The liquid is made up of independent electrons and holes, which gives a metallic character to the liquid, whereas the exciton gas is an insulator. In the exciton gas the particles are far enough apart to behave as a classical gas; they move independently and their velocities are determined by random processes. The probability of finding a particle with a given energy falls off exponentially as the energy is increased. The exciton gas obeys Maxwell–Boltzmann statistics. The combination of the statistical distribution in the gas with the density of states gives the shape of the luminescent line as shown in Fig. 16. In the liquid droplet the electrons and holes are close enough together that the droplet must be considered as a single system. The availability of states for occupation in such a system is determined by the Pauli exclusion principle. The probability of the charge carriers occupying the available states is determined by Fermi–Dirac statistics. The uppermost filled state at the absolute zero of temperature can be considered the Fermi level. Here again, the statistical distribution coupled with the density of states determines the shape of the luminescent peak. Here it is assumed that any electron is equally likely to recombine with any hole. Recombination in which both particles have very high or very low energies is unlikely. The maximum intensity will occur when the difference in energy of the two particles is near the medium value, as shown in Fig. 16. The luminescence spectrum of free excitons and electron–hole drops in Ge is also shown in Fig. 16. The line shapes agree reasonably well with theory.
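The two line shapes described above can be sketched as follows. This is a simplified model, not the calculation behind Fig. 16: it assumes parabolic bands (density of states proportional to the square root of the energy), measures all energies from the low-energy edge of each line, and uses hypothetical function names and parameters.

```python
import math


def exciton_gas_line(energy, kt):
    """Free-exciton line: sqrt(E) density of states times a Boltzmann factor."""
    if energy <= 0.0:
        return 0.0
    return math.sqrt(energy) * math.exp(-energy / kt)


def fermi(energy, fermi_level, kt):
    """Fermi-Dirac occupation, guarded against overflow in exp()."""
    x = (energy - fermi_level) / kt
    if x > 700.0:
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


def drop_line(energy, e_fermi_e, e_fermi_h, kt, steps=2000):
    """Electron-hole-drop line: any electron may recombine with any hole.

    The emitted energy is shared as eps (electron) + energy - eps (hole), and
    each carrier contributes its density of states times its occupation.
    """
    if energy <= 0.0:
        return 0.0
    d_eps = energy / steps
    total = 0.0
    for i in range(steps):
        eps = (i + 0.5) * d_eps  # midpoint rule
        total += (math.sqrt(eps) * math.sqrt(energy - eps)
                  * fermi(eps, e_fermi_e, kt)
                  * fermi(energy - eps, e_fermi_h, kt))
    return total * d_eps
```

In this model the exciton-gas line peaks at kT/2 above its edge, while at low temperature the drop line is confined between zero and the sum of the electron and hole Fermi energies, with its maximum in between.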

D. Exciton Mechanism in Superconductivity

The experimental discovery of superconductivity in La2−xSrxCuO4 with an upper limit of the critical temperature Tc in the 30–40 K range in 1986, and in YBa2Cu3O7−x with Tc in the 90–100 K range in 1987, surpassed the Tc's thought possible for normal phonon processes. Soon afterward Tc's up to 107 and 125 K were discovered in similar compounds containing Bi and Tl, respectively. Analyzing a large number of experiments, one can conclude that the superconductivity in these ceramic materials is due to weakly coupled quasi-particle pairs. This indicates that the Bardeen–Cooper–Schrieffer (BCS) theory can be applied. This model would explain the superconductivity in these materials as resulting from the replacement of the phonon by the exciton as the mediating particle of the coupling field.

FIGURE 16 Line shapes for free excitons and electron–hole drops in Ge ( ), the combination of the statistical distribution with the density of states (– – –), and experimental spectra. [From Lo, T. K. (1974). Solid State Comm. 15, 1231.]

It is generally considered that one must have two different electronic systems closely located in space but separated by energy and character for one to have an excitonic mechanism for superconductivity. Such a situation exhibits itself in the compounds discussed above. For example, the YBa2Cu3O7−x system has some localized electrons, some electrons that are covalently bonded, and some electrons which form a free, quasi-particle structure near the Fermi surface. YBa2Cu3O7−x is made up of layers or layer structures which are predominantly bonded together by ionic forces. In the first layer, Y gives up three electrons; then comes a Cu-O plane, where the oxygen can be assigned nearly two extra electrons and the Cu has given up two electrons and is almost doubly positively charged. Then comes a Ba-O layer followed by a Cu-O chainlike structure. This is followed by a Ba-O plane, another Cu-O plane, and a Y3+ plane. Focusing on the Cu-O plane and chain, it is found that the plane valence and conduction-like states are pushed to lower energy by the close proximity of the Y3+ compared to the chainlike electronic states. One finds that the chain has some holelike quasi-particle states at the Fermi surface. These states are located at the Brillouin zone boundary. These holelike states are itinerant and are the current-carrying states.

There is another state found both in the chain and the Cu-O plane which appears to be caused by approximations that have been made in the electronic band structure calculations which do not account fully for correlation effects. To understand this more fully, one has to resort to atomiclike calculations and focus on the Cu-like states embedded both in the plane and the chain, including correlation. One finds a ground state for the Cu to be Cu2+ with one of the d electronic states unoccupied and at a much higher energy. Namely, one of the d(x² − y²) states is not converged and Cu2+ does not exist in the ground state. Returning now to the approximations that are made in one-electron band structure, one finds this artificial state arising very quickly from the center of the Brillouin zone to 2 to 3 eV above the Fermi surface. In this way the band calculation tries to account for these correlation effects and form Cu2+ (d9). There is no experimental verification for this state, as it is only an artifact of the approximation. Also, using the atomiclike calculations one finds the first excited state (the primary exciton) in the 4–6 eV range for the Cu electronic system. These states are seen in the optical spectra experimentally.

The above discussion demonstrates the two electron-like systems with which to develop an excitonic theory for superconductivity. The holelike states are predominantly related to the O2− electronic states of the chain. As this quasi-particle travels along the chain, it approaches the localized Cu-like states and polarizes the Cu valence states. The polarization states are represented by the mixing in of the 4s and p conduction bands of the Cu electronic system. This is represented in the model by the exciton discussed above.

One would ordinarily expect that the holelike states of O2− would be able to screen out this interaction much more readily than the polarization of the d-like electrons. However, as stated before, the planelike states of Cu-O are suppressed to lower energy than those of the chain. Thus, when the holelike states attempt to readjust they tend to mix in the conductionlike states from the plane, causing their polarization charge to be distributed into other layers away from the chain. This leaves the d-like states polarized without the total damping out of the effect by the quasi-particles at the Fermi surface.

Another important feature is that the polarization disturbance cannot keep up with the quasi-particle, but remains behind at the Cu2+ site and cannot dissipate before the second quasi-particle arrives to absorb the energy from the field. The two quasi-particles thus form a Cooper pair mediated by the excitonic field of the Cu ion.

Starting with the energy band results and the correlation effects related to the Cu2+ (d9), one can add in the two quasi-particle pair states (Cooper pair states) and the polarization field to derive an effective interaction between the two quasi-particle states. These two quasi-particles are two hole states from O2−-like states that are at the Fermi surface. This effective interaction is given by

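$$V_{\mathrm{eff}}(q) = \frac{4\pi e^2}{|q|^2 \kappa(q)} \left[ \frac{\left(\omega_{\mathrm{exc}}^2/\kappa(q)\right)\left(1 - 1/\varepsilon_\infty\right)}{q_0^2 - \omega_{\mathrm{exc}}^2/\kappa(q) + i\delta} + 1 \right] \tag{50}$$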

In the above expression ωexc is the Cu 3d − 4s and p exciton frequency and κ(q) is the screening function caused by the O2− quasi-particles. q0 are the calculated band energies of the hole states and ε∞ is the dielectric constant of the crystal at infinite frequency. (4πe²)/(|q|²κ(q)) represents the Coulomb interaction. One finds that if the first term in the bracket on the right-hand side is negative and has an absolute value larger than one, superconductivity will exist. It is interesting to note that the first term on the right-hand side will have a different sign as long as ω²exc/κ(q) is larger than q0². As more and more hole states are added, κ(q) becomes larger; thus, the ratio ω²exc/κ(q) becomes smaller and the superconductivity will disappear. This happens experimentally in YBa2Cu3O7−x as x increases, and superconductivity is totally gone in these compounds by the time x has reached 0.5. As we remove oxygen from this compound, the vacancy appears in the chain and has the effect of lowering the Fermi energy. This increases the number of holes near the Fermi surface and, in turn, increases κ and diminishes the ratio.
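The sign argument can be checked with a small numerical sketch. It assumes the first bracket term has the form a(1 − 1/ε∞)/(q0² − a) with a = ω²exc/κ(q), takes the limit δ → 0, and works in arbitrary units; the function names and sample values are hypothetical.

```python
def exciton_term(omega_exc_sq, kappa, q0_sq, eps_inf):
    """First term in the bracket of the effective interaction (delta -> 0)."""
    a = omega_exc_sq / kappa
    return a * (1.0 - 1.0 / eps_inf) / (q0_sq - a)


def is_attractive(omega_exc_sq, kappa, q0_sq, eps_inf):
    """Criterion quoted in the text: the term is negative with magnitude > 1."""
    return exciton_term(omega_exc_sq, kappa, q0_sq, eps_inf) < -1.0


# Weak screening: omega_exc^2 / kappa exceeds q0^2, so the term is negative.
weak_screening = is_attractive(omega_exc_sq=2.0, kappa=1.0, q0_sq=1.0, eps_inf=4.0)

# Stronger screening (larger kappa) pushes the ratio below q0^2.
strong_screening = is_attractive(omega_exc_sq=2.0, kappa=4.0, q0_sq=1.0, eps_inf=4.0)
```

With these sample numbers the first case satisfies the criterion and the second does not, mirroring the loss of superconductivity as κ grows.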

There are also four other well-defined experiments that to date are inexplicable without invoking a superconducting state in which the phonon-mediated electron–electron interaction is replaced by excitons.

The first experiment is one in which an attempt was made to measure the effects of gravity on the position of an electron in a copper tube. A large temperature-dependent transition in the magnitude of the ambient axial electric field inside the vertical copper tube was found. Above a temperature of 4.5 K the ambient field was 3 × 10⁻⁷ V/m or greater. Below 4.5 K the magnitude of the ambient field drops very rapidly, to 5 × 10⁻¹¹ V/m at 4.2 K.

These measurements were made using time-of-flight spectra of an electron traveling in the center of the copper tube, and the field effects on the tube were screened out. The equation for the time-of-flight spectra is


$$T = \left(\frac{m}{2}\right)^{1/2} \int_0^h \left[ W - e z E_{\mathrm{amb}}(z) - e z E_{\mathrm{app}} - m g z \right]^{-1/2} dz \tag{51}$$

where W is the related kinetic-energy term of the electron, h is the length of the tube, mgz is the gravitational term, Eamb is the effective ambient electric field [Eamb is assumed to consist of a constant term due to gravitationally induced distortions of the tube and a term due to the patch (roughness of the surface) effect with a complicated z dependence], and Eapp is the uniform applied field. It is believed that this ambient field is screened out by superconducting electrons in the oxide layer that forms on the copper tube.
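A minimal numerical sketch of Eq. (51) is given below, using a midpoint rule in SI units. The ambient field is passed as a function of z; the function names, step count, and sample inputs are assumptions for illustration. For fields that are uniform along the tube the integral has a closed form, which is included as a cross-check.

```python
import math

E_CHARGE = 1.602176634e-19     # C
M_ELECTRON = 9.1093837015e-31  # kg
G = 9.80665                    # m/s^2


def time_of_flight(w_joule, length_m, e_amb, e_app, steps=20000):
    """Midpoint-rule evaluation of Eq. (51); e_amb is a callable E_amb(z) in V/m."""
    dz = length_m / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * dz
        kinetic = (w_joule - E_CHARGE * z * (e_amb(z) + e_app)
                   - M_ELECTRON * G * z)
        if kinetic <= 0.0:
            raise ValueError("electron is turned back before the end of the tube")
        total += dz / math.sqrt(kinetic)
    return math.sqrt(M_ELECTRON / 2.0) * total


def time_of_flight_uniform(w_joule, length_m, e_total):
    """Closed form of Eq. (51) when E_amb + E_app = e_total is constant."""
    a = E_CHARGE * e_total + M_ELECTRON * G  # net retarding force, N
    return (math.sqrt(M_ELECTRON / 2.0) * (2.0 / a)
            * (math.sqrt(w_joule) - math.sqrt(w_joule - a * length_m)))
```

For example, `time_of_flight(1e-25, 1.0, lambda z: 5e-8, 5e-8)` should agree closely with `time_of_flight_uniform(1e-25, 1.0, 1e-7)`.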

The next two experiments were measurements of CuCl that revealed large changes in electrical conductivity and magnetic susceptibility. It was found that when polycrystalline samples of CuCl under hydrostatic pressure of approximately 5 kbar were rapidly cooled (20 K/min), they went through repeated transitions from a state of weak diamagnetism to a state of strong diamagnetism. The diamagnetic susceptibility (χ) varied between 10⁻⁵ and approximately −1. When χ ≈ −1 the Meissner effect, in which the magnetic field is totally excluded from the sample, was observed. This phenomenon is characteristic of superconducting materials. The strong diamagnetic state was accompanied by a sharp increase in electrical conductivity.

Similar experiments were performed on the same material in carefully controlled environments. In these experiments, the diamagnetic anomaly was observed above 90 K over a temperature range of 10–20 K, accompanied by a sharp increase in electrical conductivity.

The above experiment can be explained using the exciton model to describe events within the calculated band structure of CuCl. In zero applied pressure, CuCl is a direct-gap material with minimum energy at the Γ point. When pressure is introduced into the band calculation by reducing the lattice constant, the conduction band at the X point moves down in energy relative to the conduction band at the Γ point. The conduction band at X becomes degenerate with the conduction band at Γ for a lattice constant reduction of 0.2%, which is consistent with the pressures used in the experiments.

Another important point is that oxygen was present in all of the CuCl samples. The calculations reveal that the energy of the oxygen-bound electron is less than the binding energy of the exciton, and oxygen will give up some of its electrons to the conduction band at the experimental temperature.

An upper limit on the critical temperature Tc of an electron–exciton coupled superconductor can be obtained by assuming the static limit of the coupling of the electron and exciton and assuming that Tc can be approximated by

$$T_c = \frac{1.14\,\hbar\omega_{\mathrm{exc}}}{K_B} \exp\left[-\frac{1}{u\,D(E_F)}\right] \tag{52}$$

Here ħωexc is the energy of the exciton, u is the coupling coefficient of the exciton, KB is Boltzmann's constant, and D(EF) is the density of electrons at the Fermi surface. When the conduction bands at Γ and X are degenerate, D(EF) is a maximum. It was found that for carrier densities N(E) = 10⁻¹ e/unit cell, Tc for a nondegenerate conduction band was 38 K, while it was 1745 K for the degenerate case. For N(E) = 10⁻² e/unit cell, the corresponding numbers are 10⁻² K and 55 K, respectively. Therefore it is possible that with 10⁻² e/unit cell, superconductivity close to or above liquid-nitrogen temperatures can be achieved.
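The strong sensitivity of Eq. (52) to the product u D(EF) can be seen with a short sketch. The function name and the inputs below (a 1-eV exciton and dimensionless products of 0.1 and 0.2) are hypothetical and are not the values used to obtain the temperatures quoted above.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def critical_temperature(exciton_energy_ev, coupling_times_dos):
    """Eq. (52): Tc = 1.14 * (exciton energy) / K_B * exp(-1 / (u D(E_F)))."""
    return (1.14 * exciton_energy_ev / K_B_EV
            * math.exp(-1.0 / coupling_times_dos))


# Doubling u*D(E_F) from 0.1 to 0.2 raises Tc by a factor of exp(5).
tc_small = critical_temperature(1.0, 0.1)
tc_large = critical_temperature(1.0, 0.2)
```

Because the dependence is exponential, a modest increase in D(EF), such as the one produced when the Γ and X conduction bands become degenerate, can change Tc by orders of magnitude.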

In the final experiment, studies of CdS samples that had been pressure-quenched at 77 K showed strong diamagnetic effects. A superparamagnetic effect was also detected. In this experiment one must explain the pressure effects, the superparamagnetic effect, the diamagnetic anomaly, and the fact that these are only observed in selected samples. As in the case of CuCl, these effects are interpreted in terms of a superconducting state induced by the interaction of the band structure with applied pressure and specific impurity effects.

E. Lasing Transitions

Semiconductor lasers were first reported in 1962, the first being the GaAs injection laser. Since that time many semiconductor lasers have been produced from III–V compounds, and they cover an appreciable portion of the spectrum from 0.65 to 8.5 µm.

Shortly thereafter, rapid developments were made in the area of II–VI compound lasers. The first report of high-efficiency laser action was in electron-beam-pumped CdS. Even higher efficiencies were later achieved in electron-beam-pumped CdSe. The spontaneous line, centered at 6800 Å for CdSe, corresponds to an emission line that has been observed in photoluminescence experiments and has been attributed to an exciton bound to an acceptor. The spontaneous line in CdS at 4.2 K is the 4888-Å line and was also associated with an exciton bound to a neutral acceptor site.
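For orientation, the wavelengths quoted above can be converted to photon energies with E = hc/λ. The helper below is a small illustrative sketch; its name is hypothetical.

```python
HC_EV_ANGSTROM = 12398.42  # h*c in eV Angstrom


def photon_energy_ev(wavelength_angstrom):
    """Photon energy in eV for a vacuum wavelength given in Angstrom."""
    return HC_EV_ANGSTROM / wavelength_angstrom


cdse_line = photon_energy_ev(6800.0)  # roughly 1.82 eV
cds_line = photon_energy_ev(4888.0)   # roughly 2.54 eV
```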

The recombination radiation from highly excited CdS crystals was investigated. The experimental technique allowed the determination of the spectral dependence of the optical gain. From these investigations, it was concluded that at least three different processes can contribute to laser action. A low-gain process results from the annihilation of a free exciton and the emission of a photon and an LO phonon. A medium-gain process is due to an exciton–exciton interaction, and a high-gain process involves an exciton–electron interaction.

The interpretation of the excitation dependence of the spontaneous emission and of the gain for the free-exciton-related processes is as follows:

1. For the excitation intensity J < 1 A/cm², only the Ex–LO process yielded some gain.

2. For 1 A/cm² < J < 3 A/cm², the low-energy tail resulting from electron–exciton interaction line dominates.

3. For J > 3 A/cm², the low-energy tail resulting from electron–exciton interaction is the dominant gain process.

CdS, CdS:Se, and CdSe lasers provide a tunability from 0.5 to 0.7 µm. In CdS, mode-locked pulses shorter than 4 psec have been obtained.

F. Optical Bistability

An optically bistable system can be defined as any optical system possessing two different steady-state transmissions for the same input intensity. To achieve optical bistability, the optical device must have feedback. This implies that the transmission must have some dependence on the output intensity. Many of the bistable devices have been Fabry–Perot etalons containing materials having nonlinear indices of refraction at high input light intensities. In this type of device, the cavity is tuned so that a transmission maximum lies close to the laser frequency, but the transmission is still low at low input intensities. As the input intensity is increased, the light penetrating the cavity will be sufficient to cause the nonlinear index material to tune the cavity toward the laser frequency. This has been termed intrinsic dispersive optical bistability: intrinsic because the nonlinear index material provides the feedback, and dispersive because the refractive or real part of the nonlinear susceptibility is more important than the imaginary or absorption part.
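The feedback mechanism can be illustrated with a toy model, which is an assumption of this sketch rather than a description of any particular device: a lossless Fabry–Perot etalon with Airy transmission 1/(1 + F sin²φ), whose round-trip phase φ = φ0 + γ·Iout shifts linearly with the transmitted intensity. All names and parameter values are hypothetical, and intensities are in arbitrary units.

```python
import math


def input_intensity(i_out, finesse_coeff, detuning, kerr):
    """Input needed to sustain a given output in the toy nonlinear etalon."""
    phase = detuning + kerr * i_out
    return i_out * (1.0 + finesse_coeff * math.sin(phase) ** 2)


def output_branches(i_in, finesse_coeff=20.0, detuning=-0.5, kerr=1.0,
                    i_max=1.0, steps=20000):
    """All steady-state outputs in (0, i_max] for one input, found by
    scanning for sign changes of input_intensity(i_out) - i_in."""
    roots = []
    prev_i = 0.0
    prev_g = -i_in  # input_intensity(0) is zero
    for n in range(1, steps + 1):
        i = i_max * n / steps
        g = input_intensity(i, finesse_coeff, detuning, kerr) - i_in
        if g == 0.0 or prev_g * g < 0.0:
            roots.append(0.5 * (prev_i + i))
        prev_i, prev_g = i, g
    return roots


low_only = output_branches(0.30)   # below the switching region
bistable = output_branches(0.52)   # inside the switching region
high_only = output_branches(0.70)  # above the switching region
```

Inside the switching region the scan returns three steady states for one input intensity; in such models the outer two are the low- and high-transmission states and the middle one is generally unstable.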

For practical optical bistable devices, attention has been focused on semiconductor materials. Semiconductors have high absorption coefficients, particularly at resonant excitonic transitions. These materials will produce absorptive and dispersive bistable devices. Useful absorption in these materials is achieved in very short traversing paths, making very fast switching devices achievable. Both the free- and bound-exciton transitions in semiconductors have shown promise for high-speed, low-power switching devices. These characteristics make possible fast, all-optical, signal-processing devices.

SEE ALSO THE FOLLOWING ARTICLES

BONDING AND STRUCTURE IN SOLIDS • CRYSTALLOGRAPHY • GROUP THEORY, APPLIED • LASERS, SEMICONDUCTOR • METALORGANIC CHEMICAL VAPOR DEPOSITION (MOCVD) • MOLECULAR BEAM EPITAXY, SEMICONDUCTORS • QUANTUM MECHANICS • SUPERCONDUCTIVITY

BIBLIOGRAPHY

Baldereschi, A., and Lipari, N. O. (1971). Phys. Rev. B 3, 439.

Cardona, M. (1969). "Modulation Spectroscopy," by F. Seitz and D. Turnbull and E. Ehrenreich. Solid State Physics, Suppl. 11. Academic Press, New York.

Craig, D. P., and Walmsley (1968). "Excitons in Molecular Crystals," Benjamin, Elmsford, New York.

Davydov, A. S. (1962). "The Theory of Molecular Excitons," McGraw-Hill, New York.

Dexter, D. L., and Knox, R. S. (1965). "Excitons," Interscience Publishers, New York.

Dimmock, J. O. (1967). "Theory of Exciton States," Semicond. and Semimetals 3, Chap. 7, Academic Press, New York.

Greene, R. L., Bajaj, K. K., and Phelps, D. E. (1984). Phys. Rev. B 29, 1807.

Knox, R. S. (1963). "Theory of Excitons," Solid State Phys. Suppl. 5, Academic Press, New York.

Rashba, E. I., and Sturge, M. D., ed. (1982). "Excitons," North Holland Publishing Co., Amsterdam.

Reynolds, D. C., and Collins, T. C. (1981). "Excitons, Their Properties and Uses," Academic Press, New York.

Thomas, D. G., and Hopfield, J. J. (1959). Phys. Rev. 116, 573.

Thomas, D. G., and Hopfield, J. J. (1962). Phys. Rev. 128, 2135.

Wheeler, R. G., and Dimmock, J. O. (1962). Phys. Rev. 125, 1805.


Encyclopedia of Physical Science and Technology EN005A-240 June 26, 2001 19:22

Ferromagnetism
H. R. Khan
Forschungsinstitut für Edelmetalle und Metallchemie and University of Tennessee at Knoxville

I. Basic Concept of Magnetism
II. Origin of Magnetism
III. Magnetization Curves
IV. The Hysteresis Loop
V. Anisotropic Magnetization
VI. Magnetic Order
VII. Ferromagnetic Domains
VIII. Magnetostriction
IX. Magnons
X. Ferromagnetism and Superconductivity
XI. Magnetoresistance and Giant Magnetoresistance
XII. Ferromagnetic Materials and Their Applications

GLOSSARY

Magnetic field A magnet attracts a piece of iron at a distance and this is caused by the magnetic field or the field of force of the magnet.

Magnetic poles A magnet has north and south poles. Like poles repel and unlike poles attract each other with a force that varies inversely as the square of the distance between them. A unit pole is defined in such a way that two like unit poles placed one centimeter apart in vacuum would repel each other with a force of one dyne.

Magnetic field strength The magnetic field strength may be defined in terms of magnetic poles; for example, one centimeter from a unit pole the field strength is one oersted. In the MKS system, the unit of field strength is one ampere-turn/meter.

Magnetic moment The magnetic moment of a small plane coil is the product of the current I flowing in the coil and the area of the coil A, IA (A m²). The magnetic moment of a small magnet is equal to the magnetic moment of a small coil that would experience the same torque when placed in the same orientation at the same location in the same magnetic field.

Magnetic flux density In the international system of units (SI), the magnetic field intensity H (A/m) and the magnetization M (A/m) are related to the magnetic induction or the magnetic flux density B (Wb/m²) through the relation

B = µ0(H + M)

where µ0 is the permeability of free space and has the value 12.57 × 10⁻⁷ Wb/(A m).

Magnetization (M) Magnetic moment per unit volume.

Coercive force (Hc) Negative value of the magnetizing field H that makes the magnetic induction B of a ferromagnetic material zero.

Remanence (Mr) If the magnetic flux density does not drop to zero upon reducing the magnetic field on the specimen to zero, then the magnetic flux density remaining in the specimen is called the remanence.

Permeability (µ) The ratio B/H is the permeability of a material.

Magnetic susceptibility (χ) The ratio M/H is called the magnetic susceptibility of a material and may be expressed in mass, volume, or molar units.

Curie temperature (Tc) Temperature above which the spontaneous magnetization of a ferromagnetic material vanishes.

Néel temperature (TN) Temperature below which the interaction between the atomic moments effecting antiparallel orientation surmounts the thermal agitation. At the Néel temperature TN, the susceptibility of a material has its maximum.
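The glossary relations B = µ0(H + M), µ = B/H, and χ = M/H can be tied together in a few lines. This is an illustrative sketch in SI units; the function names are hypothetical.

```python
import math

MU_0 = 4.0e-7 * math.pi  # permeability of free space, about 12.57e-7 Wb/(A m)


def flux_density(h, m):
    """B = mu0 (H + M), with H and M in A/m and B in Wb/m^2."""
    return MU_0 * (h + m)


def permeability(chi):
    """mu = B/H = mu0 (1 + chi), using the volume susceptibility chi = M/H."""
    return MU_0 * (1.0 + chi)
```

For example, a material with χ = 999 in a field H = 100 A/m has M = 99 900 A/m, and `flux_density(100.0, 99900.0)` equals `permeability(999.0) * 100.0`.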

A SPECIAL ARRANGEMENT of electrons in the atoms causes a material to become a ferromagnetic material. For example, the incompletely filled M shells of iron, cobalt, and nickel atoms are responsible for the ferromagnetism in these metals. Atoms behave as small magnets ordered in parallel arrangement in ferromagnetic materials. The magnetization curve and the hysteresis loop determine whether a material is a hard or soft ferromagnet. The parameters determined from the hysteresis loop are the permeability, coercivity, remanence, and the area of the loop itself. The area of the loop gives the energy loss per unit volume of the specimen per cycle, which is dissipated as heat and is called the hysteresis loss. The hysteresis loop and the parameters derived from it determine the suitability of a material in a particular application.

I. BASIC CONCEPT OF MAGNETISM

All materials occurring in nature are magnetic. They may be paramagnetic, diamagnetic, ferromagnetic, antiferromagnetic, or ferrimagnetic. The magnetic behavior of a material depends on its electronic structure. For example, the ferromagnetism of iron, cobalt, and nickel in the periodic system is due to the incompletely filled M shells of their atoms. Due to these incompletely filled shells, the atoms behave as magnets ordered in parallel arrangement in ferromagnetic materials. In antiferromagnetic materials, the atomic magnets are ordered in antiparallel arrangement. Ferrimagnetic materials are a special case of ferromagnetic materials. The neighboring atoms interact with each other in a material, and this interaction force is dependent on the distance between neighboring atoms and the diameter of the atomic shell responsible for the atomic magnetic moment. The sign and magnitude of this interaction force cause a material to show different magnetic behavior. The usefulness of a ferromagnetic material is shown by its magnetization curve and hysteresis loop. The hysteresis loop provides information about the "permeability" and "coercivity" of a ferromagnetic material. For example, a soft magnetic material to be used as a transformer core should have a high value of permeability, whereas a hard or permanent magnetic material should have a high value of coercivity. The physical condition, purity, and composition of a material control the useful magnetic properties like permeability and coercivity, and they can be modified by controlling different parameters of a material. By rapid solidification of materials from the melt, very soft magnetic and hard magnetic materials can be produced. Both the hard magnetic and soft magnetic materials find applications in the electrical and electronic industry. Some examples of their uses are transformers, motors, generators, relays, telephone cables, audio and video recording and replaying, and data memory systems and computers.

A magnetic field gradient is generated by a magnet consisting of a flat north pole and a pointed south pole, as shown in Fig. 1. The magnetic field is stronger near the pointed pole. If a piece of material in cylindrical form is suspended with a string between the poles, then a magnetic force is generated on this material. The kind of magnetism of the material is determined from the direction of the magnetic force on this material. If the material is strongly attracted toward the pointed pole, then it is ferromagnetic. A paramagnetic material is weakly attracted toward the pointed pole, and a diamagnetic material is repelled by the pointed pole. For example, iron is ferromagnetic, aluminum is paramagnetic, and bismuth is diamagnetic. Materials can be synthesized that show ferromagnetism, although the constituents may be paramagnetic or diamagnetic. Ferrimagnetism and antiferromagnetism are closely related to ferromagnetism. Many compounds are ferrimagnetic. Ferrimagnetic materials in general are oxides of the ferromagnetic metals.

FIGURE 1 An experimental setup to check the paramagnetism, diamagnetism, and ferromagnetism of a material.

II. ORIGIN OF MAGNETISM

The earliest human experience with magnetism involved the mineral magnetite (also known as lodestone). This is the only material that occurs naturally in a magnetic state. In the classical picture, an atom consists of a nucleus surrounded by a number of electrons that depends on the element and its position in the periodic system. The electrons are distributed in different shells, named K, L, M, and N, outward from the center. Each shell can accommodate only a certain number of electrons, the maximum number being 2n², where n is the number of the shell. The innermost shell K is complete with 2 × 1² = 2 electrons, the second shell L is complete with 2 × 2² = 8, the third shell M with 2 × 3² = 18 electrons, etc. For example, the ferromagnetic element iron has an incompletely filled M shell that contains only 14 electrons. An electron carries a negative charge, and its motion in an orbit gives rise to an electric current. The orbital motion of the electron is equivalent to a thin magnet and produces a magnetic field. Besides the orbital motion, the electron also spins around its own axis. The spinning negative charge also gives rise to an electric current and behaves like a small magnet. The completely filled shells are magnetically neutral, because equal numbers of electrons spin in the clockwise and anticlockwise directions. As mentioned earlier, the M shell of iron contains only 14 electrons instead of 18; nine electrons spin in the clockwise and five in the anticlockwise direction. The four uncompensated electrons produce, at a distance, a magnetic field equal to that of four electron spins.

Therefore iron has a magnetic moment of 4 units. The other ferromagnetic element, cobalt, has only 15 electrons in the M shell and carries a magnetic moment of 3 units. Nickel is also ferromagnetic because its atom has only 16 electrons in the M shell and carries a magnetic moment of 2 units. Thus ferromagnetism in iron, cobalt, and nickel originates from the unfilled M shells of their atoms.
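The shell-filling arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function names and the dictionary are hypothetical, the 9/5 spin split for iron is the one quoted in the text, and the splits for cobalt and nickel are inferred here from the quoted electron counts and moments by assuming that nine electrons spin clockwise in each case.

```python
def shell_capacity(n: int) -> int:
    """Maximum number of electrons in shell n (K = 1, L = 2, M = 3, ...): 2 n^2."""
    return 2 * n * n


def net_spin_moment(clockwise: int, anticlockwise: int) -> int:
    """Number of uncompensated spins, in the 'units' used in the text."""
    return clockwise - anticlockwise


# M-shell occupation as (clockwise, anticlockwise) spins.
# Fe: split quoted in the text. Co, Ni: ASSUMED splits, chosen so that the
# totals (15, 16) and moments (3, 2) agree with the values quoted in the text.
M_SHELL_SPINS = {"Fe": (9, 5), "Co": (9, 6), "Ni": (9, 7)}

if __name__ == "__main__":
    for n, name in enumerate("KLM", start=1):
        print(f"shell {name}: capacity {shell_capacity(n)}")
    for element, (up, down) in M_SHELL_SPINS.items():
        print(f"{element}: {up + down} M-shell electrons, "
              f"moment {net_spin_moment(up, down)} units")
```

These are free-atom values; the measured moments in the metals are lower.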

FIGURE 2 The crystal structures of the ferromagnetic metals iron, cobalt, and nickel.

The free isolated atoms of iron, cobalt, and nickel have magnetic moments equivalent to 4, 3, and 2 units. In metals, the atoms are not isolated but are packed together. This packing influences the distribution of electrons in the M shell, because this shell lies at the outside of the atom. Therefore the experimentally measured magnetic moments of iron, cobalt, and nickel, 2.22, 1.71, and 0.606 units, are lower than those of the isolated free atoms.

Each element in the periodic system has a definite crystal structure. In iron and nickel, the atoms occupy body-centered cubic (bcc) and face-centered cubic (fcc) lattice sites, respectively, whereas in cobalt the atoms occupy hexagonal lattice sites, as shown in Fig. 2. In ferromagnetic elements, each atom carries a magnetic moment and a magnetic axis, and even in the absence of an externally applied magnetic field these atomic magnets point in one direction, as shown in Fig. 3. The internal magnetic field required to order the atomic magnets in one direction is called the Weiss molecular field. In iron, for example, its intensity is 5.5 × 10⁶ A/cm.

The interaction of neighboring atoms with permanent dipole moments causes the atomic magnets to align in the same direction (ferromagnets; for example, iron, nickel, rare earths with 64 < Z < 69, and alloys like Cu2MnAl; the magnitude of χ is large below Tc), in opposite directions (antiferromagnets; for example, MnO, CoO, NiO, Cr2O3, and CuCl2; the magnitude of χ is like that of paramagnetic materials), or in opposite directions but with unequal moments [ferrimagnets; for example, Fe3O4 (magnetite) and γ-Fe2O3 (maghemite); the magnitude of χ is like that of ferromagnetic materials]. The force of interaction is a function of the ratio of the distance D between neighboring atoms to the diameter d of the atomic shell responsible for the atomic magnetic moment. As the force of interaction changes from positive to negative values, a material becomes ferromagnetic, weakly ferromagnetic, paramagnetic, or antiferromagnetic, as shown in Figs. 4 and 5. Iron, cobalt, and nickel are strongly ferromagnetic because the ratio D/d is larger than 1.5, whereas gadolinium is weakly ferromagnetic because the value of D/d is about 3.1.

FIGURE 3 Arrangement of the atomic magnets in a ferromagnetic material.

FIGURE 4 Ratios of the interatomic distance D to the unfilled atomic shell diameter d causing different kinds of magnetism.

When a ferromagnetic material with its atomic magnets pointed in one direction is heated, the thermal vibrations of the atoms become stronger as the temperature increases. When the temperature reaches a value called the magnetic change point or Curie temperature, the atomic magnets orient themselves randomly and the ferromagnetic material transforms to a paramagnetic state. The Curie temperature is 770°C for iron, 1130°C for cobalt, and 360°C for nickel.

FIGURE 5 Variation of the interatomic interaction force with the ratio D/d of the interatomic distance D to the diameter d of the unfilled atomic shell.

FIGURE 6 Experimental setup for plotting the magnetization curve.

III. MAGNETIZATION CURVES

The magnetization M of a material is defined as the magnetic moment per unit volume. The practical usefulness of a ferromagnetic material is determined from its magnetization curve. The experimental setup for plotting the magnetization curve is shown in Fig. 6. A thin toroidal ring of the ferromagnetic material of cross section A is wound with N turns per meter. A current of I amperes flowing through the winding generates a magnetic flux Φ = BA in the ring. A flux meter connected to a secondary coil of a few turns measures this flux Φ. The flux density B = Φ/A is composed of two parts: one arising from the external current flowing in the winding, and the other arising from the internal currents associated with the motion of electrons in the ferromagnetic material.

The flux density arising from the external current NI per meter of the winding is µ0NI, where µ0 is a constant (the permeability of free space). The magnetization M arising from the internal currents in the ferromagnetic material may be considered constant across the cross section. The magnetic field set up by this magnetization is µ0M. The magnetic intensity in the core is the sum of these two contributions:

$$B_{\text{core}} = \mu_0 N I + \mu_0 M \qquad (1)$$

In the absence of a ferromagnetic core

$$B_{\text{no core}} = \mu_0 N I \qquad (2)$$

The ratio Bcore/Bno core is defined as the permeability of a ferromagnetic material. The permeability is thus the ratio of the magnetic intensity in a toroidal core to the intensity that the same current in the same winding would produce in the absence of a ferromagnetic core, and it is dimensionless. The permeability of a material depends on its history and is very high (∼1000) for soft magnetic materials like iron.
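Dividing Eq. (1) by Eq. (2) gives a permeability of 1 + M/(NI). The following Python sketch evaluates both equations and their ratio. The function names are hypothetical, and the numerical inputs (1000 turns per meter, 0.1 A, M = 9.99 × 10⁴ A/m) are made-up illustrative values rather than data from the text; µ0 is taken as the SI value 4π × 10⁻⁷ T·m/A.

```python
import math

MU0 = 4.0 * math.pi * 1e-7  # permeability of free space in T*m/A (SI value)


def b_no_core(turns_per_meter: float, current: float) -> float:
    """Eq. (2): flux density of the winding alone, in tesla."""
    return MU0 * turns_per_meter * current


def b_core(turns_per_meter: float, current: float, magnetization: float) -> float:
    """Eq. (1): flux density with the core, in tesla (magnetization in A/m)."""
    return MU0 * turns_per_meter * current + MU0 * magnetization


def permeability(turns_per_meter: float, current: float, magnetization: float) -> float:
    """Dimensionless ratio B_core / B_no_core = 1 + M / (N I)."""
    return (b_core(turns_per_meter, current, magnetization)
            / b_no_core(turns_per_meter, current))


if __name__ == "__main__":
    # Illustrative (assumed) inputs: H = N I = 100 A/m and M = 99,900 A/m.
    print(permeability(1000.0, 0.1, 99_900.0))
```

With these assumed inputs the ratio comes out near 1000, the order of magnitude quoted above for soft iron.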

The value of the magnetizing field H = NI is increased by increasing the current I in the winding, which increases the value of B measured by the flux meter. A plot of B versus H is called a magnetization curve (B–H curve) and is shown in Fig. 7. The term µ_I is the initial permeability of the material, obtained from the initial slope. When the magnetizing field H is increased, the ratio B/H also increases until a maximum value is reached. The ratio at this point, µ_m = B/H, is called the maximum permeability. B itself also reaches a maximum value, called the saturation magnetic intensity Bs.

FIGURE 7 Plot of the magnetic induction B as a function of the magnetizing field H of a ferromagnetic material, and plot of the permeability µ as a function of the magnetizing field H (ampere turns per meter).

The variation of the permeability with the magnetizing field, from its initial to its maximum value, is shown in Fig. 7. All ferromagnetic materials show this kind of B–H and µ–H behavior, but the magnitudes of the permeability and the scales of B and H are different for different materials. Many practical uses of ferromagnetic materials require a high permeability at low values of the magnetizing field. Typical examples are the cores of low-current transformers, low-current relays, inductive loading of telephone cables, and sensitive detectors of small field changes. The best ferromagnetic materials for these applications are those with the highest µ_I and µ_m values.

IV. THE HYSTERESIS LOOP

The initial magnetization shown in Fig. 7 is not reversible. When H slowly increases, the value of B also increases until it reaches its maximum value at point A, as shown in Fig. 8. Thus 0A represents the initial magnetization curve. When the magnetic field is slowly decreased, the flux density B follows the curve AC. The magnetic field is zero at C, but at this point a flux density equal to 0C remains in the ferromagnetic material. If the magnetic field is reversed to 0D, the flux density is completely removed from the material, and further increasing the field in the negative direction brings the flux density to point E. The flux density follows the curve EFA if the direction of the magnetic field is changed again and the field is slowly increased. If the cycle is repeated a few times, it brings the material into a cyclic state. The loop ACEFA is called the hysteresis loop. The flux density at point C is called the remanence, and the reverse field at point D is called the coercive force.

The total energy required to magnetize a unit volume of the specimen from 0 to A on the initial curve is given by

$$W = \int_0^A H \, dB$$

If the magnetic field is reduced to zero, the return path on the hysteresis loop is AC, and the total energy taken from the magnetizing field H is the area 0AC. If the curve 0A were retraced along the original path, the energy taken from the magnetizing field H would be returned to it and there would be no loss of energy. But in the case of the hysteresis loop shown in Fig. 8, there is a loss of energy. The total energy per unit volume of the specimen taken from the magnetizing field H for one complete cycle of the hysteresis loop is the area of the hysteresis loop, which is

$$\oint H \, dB$$

FIGURE 8 Plot of the magnetic induction B as a function of magnetizing field H, hysteresis loop.

FIGURE 9 Hysteresis loops of some ferromagnetic materials.

This energy is dissipated as heat and is called the hysteresis loss.
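The loop area can be estimated numerically from sampled (H, B) points. The sketch below is a hedged illustration, not a measurement procedure from the text: the function name is hypothetical, the integral of H dB around the closed loop is approximated with the trapezoidal rule, and the example loop is an idealized rectangular one with an assumed coercive force of 50 A/m and an assumed remanence of 1.2 T.

```python
def hysteresis_loss(h_values, b_values):
    """Approximate the closed-loop integral of H dB (J/m^3 per cycle).

    The points must be listed in the order in which the loop is traversed
    (A -> C -> D -> E -> F -> A in the notation of the text); the loop is
    closed automatically from the last point back to the first.
    H is in A/m and B is in tesla.
    """
    if len(h_values) != len(b_values) or len(h_values) < 3:
        raise ValueError("need at least three (H, B) points of equal length")
    loss = 0.0
    n = len(h_values)
    for i in range(n):
        j = (i + 1) % n  # wrap around to close the loop
        mean_h = 0.5 * (h_values[i] + h_values[j])
        loss += mean_h * (b_values[j] - b_values[i])
    return loss


if __name__ == "__main__":
    # Idealized rectangular loop (assumed values, for illustration only).
    H_C, B_R = 50.0, 1.2
    loop_h = [H_C, -H_C, -H_C, H_C]
    loop_b = [B_R, B_R, -B_R, -B_R]
    print(hysteresis_loss(loop_h, loop_b), "J/m^3 per cycle")
```

For the rectangular loop the result equals 4 × (coercive force) × (remanence), which gives a quick sanity check before the function is applied to measured data.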

Hysteresis loops of small area are observed for soft ferromagnetic materials, whereas hard ferromagnetic materials show loops of large area. The hysteresis loop of an ideal soft ferromagnet would be just a line, but in practice such materials do not exist. Some amorphous iron-based materials show hysteresis loops of very small area. A typical example of a hard ferromagnet with a relatively large hysteresis loop area is carbon steel. The hysteresis loop area is also an important parameter in determining the application of a ferromagnetic material. The hysteresis loops of some ferromagnetic materials are shown in Fig. 9.

Good permanent magnets possess high values of residual flux and coercive force. Ferromagnetic materials with high residual flux and coercive force cannot be used in motors or transformers, because the flux changes continuously and there is an energy loss in each cycle. The energy loss is proportional to the area of the hysteresis loop. Therefore a ferromagnetic material subjected to cyclic magnetization, as in motors and transformers, should have as narrow a hysteresis loop as possible.

V. ANISOTROPIC MAGNETIZATION

The energy in single crystals of ferromagnetic material that governs the magnetization along the crystallographic axes is called the magnetocrystalline or anisotropy energy. It is easier to magnetize an iron crystal along the cube-edge directions than along other crystal directions. A nickel crystal, however, is more easily magnetized along the long (body) diagonal than along the cube-edge directions. Cobalt, with its hexagonal crystal structure, is most easily magnetized in the direction of the hexagonal axis. The crystallographic axes along which a ferromagnetic crystal is easily magnetized are called directions of easy magnetization.

VI. MAGNETIC ORDER

The magnetic susceptibility χ per unit volume of a magnetic material is defined as the ratio of the magnetization M to the macroscopic magnetizing field intensity B: χ = M/B. The susceptibility may be referred to unit volume (volume susceptibility), to unit mass (mass susceptibility), or to one mole (molar susceptibility) of a material. The variation of χ with temperature for a paramagnetic material is shown in Fig. 11 and follows the relation χ = C/T, called the Curie law. Here the constant C is called the Curie constant.

A. Ferromagnetic Order

In a ferromagnetic material, the individual magnetic moments are ordered in a parallel arrangement, as shown in Fig. 3. A ferromagnet possesses a magnetic moment even in the absence of an externally applied magnetic field, and this spontaneous magnetic moment is also called the saturation moment. When a ferromagnet is heated, the parallel arrangement disappears above the Curie temperature Tc. The magnetic susceptibility of a ferromagnetic material at temperatures above the Curie temperature is related to the temperature by

$$\chi = \frac{C}{T - T_c}$$

which is also shown in Fig. 11.

B. Antiferromagnetic Order

The atomic magnetic moments in an antiferromagnetic material are ordered in an antiparallel arrangement, as shown in Fig. 10. The resultant moment is zero below the ordering or Néel temperature TN. The variation of the magnetic susceptibility χ of an antiferromagnetic material with temperature is shown in Fig. 11; above TN it is given by the relation

$$\chi = \frac{C}{T + \theta}$$

where θ is of the order of the Néel temperature TN (the two are equal in the simplest molecular-field treatment).

FIGURE 10 The arrangement of the atomic magnets in an antiferromagnetic material.

FIGURE 11 Temperature-dependent magnetic susceptibility χ behavior of a paramagnetic, ferromagnetic material.
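The three susceptibility laws of this section differ only in the temperature offset in the denominator, which a short Python sketch makes explicit. The function names are hypothetical. In the example, C = 1 is an arbitrary unit, Tc = 1043 K corresponds to the 770°C quoted earlier for iron, and θ = 100 K is an assumed value chosen purely for illustration.

```python
def chi_paramagnet(temperature: float, curie_constant: float) -> float:
    """Curie law: chi = C / T (temperature in kelvin)."""
    return curie_constant / temperature


def chi_ferromagnet(temperature: float, curie_constant: float, t_curie: float) -> float:
    """Susceptibility above the Curie temperature: chi = C / (T - Tc)."""
    if temperature <= t_curie:
        raise ValueError("relation applies only above the Curie temperature")
    return curie_constant / (temperature - t_curie)


def chi_antiferromagnet(temperature: float, curie_constant: float, theta: float) -> float:
    """Susceptibility above the Neel temperature: chi = C / (T + theta)."""
    return curie_constant / (temperature + theta)


if __name__ == "__main__":
    C = 1.0           # arbitrary units (assumed)
    T_CURIE = 1043.0  # K, i.e. 770 degrees C for iron as quoted in the text
    THETA = 100.0     # K, hypothetical value for illustration
    T = 1143.0        # K, 100 K above the Curie temperature
    print("paramagnet     :", chi_paramagnet(T, C))
    print("ferromagnet    :", chi_ferromagnet(T, C, T_CURIE))
    print("antiferromagnet:", chi_antiferromagnet(T, C, THETA))
```

At a given temperature and Curie constant, the ferromagnetic expression lies above the Curie law and the antiferromagnetic expression lies below it.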

C. Ferrimagnetic Order

In some ferromagnetic materials, the saturation magnetization does not correspond to the parallel alignment of the individual atomic magnetic moments. These materials are the magnetic oxides (ferrites) with the general chemical formula MO · Fe2O3, where M may be a metal such as zinc, cadmium, iron, nickel, copper, cobalt, or magnesium. These ferrites have the spinel crystal structure, which contains eight occupied tetrahedral sites and 16 occupied octahedral sites. In these ferrites iron has two different ionic states, ferrous (Fe²⁺) and ferric (Fe³⁺). The eight tetrahedral sites in the cubic spinel structure are occupied by Fe³⁺, whereas half of the 16 octahedral sites are occupied by Fe³⁺ and the rest by Fe²⁺ ions. The magnetic moments of the eight Fe³⁺ ions on the tetrahedral sites and of the eight Fe³⁺ ions on the octahedral sites cancel each other, leaving only the magnetic moments of the eight Fe²⁺ ions, as shown in Fig. 12.
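The cancellation described above can be written out as simple bookkeeping. The following Python sketch is illustrative only: the names are hypothetical, and the spin-only moments of 5 and 4 Bohr magnetons for Fe³⁺ and Fe²⁺ are assumed standard values that are not stated in the text. The site occupancies and the antiparallel orientation of the tetrahedral sublattice follow the description in the text.

```python
# ASSUMED spin-only moments in Bohr magnetons (not given in the text).
MOMENT_BOHR_MAGNETONS = {"Fe3+": 5, "Fe2+": 4}

# (site, ion, ions per unit cell, orientation) as described in the text:
# the tetrahedral Fe3+ moments point opposite to the octahedral moments.
SUBLATTICES = [
    ("tetrahedral", "Fe3+", 8, -1),
    ("octahedral", "Fe3+", 8, +1),
    ("octahedral", "Fe2+", 8, +1),
]

# One Fe2+ ion per formula unit MO.Fe2O3 with M = Fe, hence eight per cell.
FORMULA_UNITS_PER_CELL = 8


def net_moment_per_cell() -> int:
    """Net moment of the cubic spinel unit cell in Bohr magnetons."""
    return sum(sign * count * MOMENT_BOHR_MAGNETONS[ion]
               for _site, ion, count, sign in SUBLATTICES)


def net_moment_per_formula_unit() -> float:
    """Net moment per formula unit in Bohr magnetons."""
    return net_moment_per_cell() / FORMULA_UNITS_PER_CELL


if __name__ == "__main__":
    print("per unit cell    :", net_moment_per_cell())
    print("per formula unit :", net_moment_per_formula_unit())
```

Under these assumptions the Fe³⁺ contributions cancel exactly and the net moment is that of the eight Fe²⁺ ions alone.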

VII. FERROMAGNETIC DOMAINS

At temperatures below the Curie point, the magnetic moment of a ferromagnetic specimen may be much less than its saturation moment.

Polycrystalline as well as single-crystal specimens consist of small regions called domains, within each of which the local magnetization is saturated. The magnetic axes of these domains may point in different directions, and for certain arrangements the resultant magnetic moment of the specimen is zero. The application of an external magnetic field saturates the specimen, because the external field orients the domain magnetization in the direction of the applied field. Small cylindrical magnetic domains, called magnetic bubbles, may be stabilized in a thin crystal of uniaxial material by applying a bias magnetic field. The bubble diameter is on the order of 10 µm. These magnetic bubbles are of interest in high-density memory-storage devices.

FIGURE 12 The arrangement of the atomic magnets in a ferrite of cubic spinel structure.

As discussed in Section III, the permeability and coercivity are important parameters that control the practical application of a ferromagnetic material. The domain structure of a ferromagnetic material affects both of these parameters. A pure, well-oriented, and homogeneous material facilitates domain boundary displacement and possesses high permeability. On the other hand, an inhomogeneous material consisting of multiple phases suppresses boundary displacement and possesses high “coercivity.”

F. Bitter developed a simple method to observe the domain boundaries. A drop of a colloidal suspension of a finely divided ferromagnetic material such as magnetite is placed on the surface of a ferromagnetic material. The colloidal particles in the suspension concentrate strongly at the boundaries between the domains, where strong local magnetic fields exist that attract the magnetic particles. A simple domain structure in a silicon-iron single crystal is shown in Fig. 13.

VIII. MAGNETOSTRICTION

When a ferromagnetic material is magnetized, small changes in the physical dimensions of the specimen take place; this effect is called magnetostriction. The magnetostriction of a material is defined as the increase in length per unit length in the direction of the magnetization. Magnetostriction is different for different axes of a ferromagnetic single crystal. The useful parameter “permeability” of a ferromagnetic material is related to the magnetostriction.

IX. MAGNONS

A ferromagnetic material in the ground state has all its spins arranged parallel in one direction, as shown in Fig. 14a. An excited state is obtained if spins are reversed. Figure 14b shows the excited state where one spin is antiparallel. The elementary excitations are the spin waves, and one wavelength is shown in Fig. 14c. These elementary excitations are called magnons and are analogous to the lattice vibrations or phonons. Spin waves are oscillations in the relative orientation of the spins on a lattice, whereas lattice vibrations are oscillations in the relative positions of the atoms on a lattice. Spin waves have been observed by neutron scattering experiments near the Curie temperature or even above the Curie temperature.

FIGURE 13 Simple domain structure in silicon-iron single crystal. [After Williams, Bozorth, and Shockley (1949). Phys. Rev. 75, 155.]

X. FERROMAGNETISM AND SUPERCONDUCTIVITY

Both ferromagnetism and superconductivity involve spin ordering. The difference is that in a ferromagnet the spins order parallel, whereas in a superconductor they order antiparallel below the superconducting transition temperature and form the “Cooper pairs.”

FIGURE 14 (a) The arrangement of spins in a ferromagnetic material. (b) The elementary excitation occurs when one spin is antiparallel. (c) One wavelength of the spin wave.

The possibility of coexistence of superconductivity and ferromagnetism in the same material has been proposed. To observe this coexistence, ferromagnetic impurities were dissolved in superconducting materials, for example, gadolinium (ferromagnetic) in lanthanum (superconducting). The lanthanum-gadolinium compounds were superconducting up to 1 at.% gadolinium and became ferromagnetic for gadolinium concentrations above 2.5 at.%.

Recently some ternary compounds of the formula MRh4B4, with M = thorium, yttrium, neodymium, samarium, gadolinium, terbium, dysprosium, holmium, erbium, or lutetium, having the CeCo4B4 structure, have been discovered. A typical example of a compound showing both ferromagnetism and superconductivity is ErRh4B4, which becomes superconducting at 8.7 K and ferromagnetic at 0.9 K.

XI. MAGNETORESISTANCE AND GIANT MAGNETORESISTANCE

William Thomson showed in 1857 that the electrical resistance of a ferromagnetic material, for example, iron, changes under the influence of a magnetic field. This phenomenon is called magnetoresistance. In most magnetic materials the resistance increases with magnetization when current and magnetization are parallel and decreases when they are at right angles to each other. The change in resistivity caused by magnetizing a material to saturation is usually a few percent and rarely exceeds 5% at room temperature.

This change in resistance may be used for reading magnetically recorded information. Magnetoresistive read heads use permalloy as the magnetoresistive material. The advantages of a magnetoresistive read head over an inductive read head are that it can be made very small, to read high-density recorded information, and that the head does not have to move relative to the medium. In the conventional inductive case, the read head has to move relative to the medium, because electromagnetic induction can only be produced by a changing magnetic flux.

Advanced read heads are based on multilayered thin-film systems. The change in resistance of some multilayer systems in a magnetic field can be as large as 60–70%, and this effect is called Giant Magnetoresistance (Baibich et al. and Binasch et al.). These multilayered systems can be used to make very sensitive and small read heads. Typical examples of multilayer systems are the iron-chromium system, with alternate layers of iron and chromium, and the cobalt-copper system, with alternate layers of cobalt and copper. The thickness of the magnetic and nonmagnetic layers is a few nanometers. In these systems the magnetic layers are magnetized antiparallel via exchange interaction across the nonmagnetic layers, for example, copper or chromium. When a sufficiently large magnetic field is applied, it can align the magnetization of the magnetic layers in a parallel direction. In a magnetic metal, the free electrons that carry the electric current have their spins aligned either parallel or antiparallel to the magnetization of the metal. Electrons experience different resistance to their motion depending on the direction of their spin relative to the magnetization. When current flows in the plane of the multilayers with antiparallel magnetization, the electrons experience a high resistance; but when a magnetic field is applied and the magnetization of the magnetic layers changes to a parallel state, the electrons experience a lower resistance. In the case of conventional magnetoresistance, the resistance increases with the application of a magnetic field, whereas in the case of Giant Magnetoresistance, the resistance decreases with the application of a magnetic field because of the change from antiparallel to parallel magnetization of the magnetic layers.
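The spin-dependent picture above is often illustrated with a simple two-channel resistor network, sketched here in Python. This model is an assumption brought in for illustration and is not taken from the text: spin-up and spin-down electrons are treated as two independent parallel channels, each channel crosses two magnetic layers in series, and a layer presents a low resistance `r_low` or a high resistance `r_high` depending on the spin direction relative to its magnetization. All names and numbers are hypothetical.

```python
def parallel(r1: float, r2: float) -> float:
    """Resistance of two resistors connected in parallel."""
    return r1 * r2 / (r1 + r2)


def multilayer_resistances(r_low: float, r_high: float):
    """Return (R_parallel, R_antiparallel) for two magnetic layers.

    Parallel magnetization: one spin channel sees low + low and the
    other sees high + high.
    Antiparallel magnetization: each spin channel sees low + high.
    """
    r_p = parallel(2.0 * r_low, 2.0 * r_high)
    r_ap = parallel(r_low + r_high, r_low + r_high)
    return r_p, r_ap


def gmr_ratio(r_low: float, r_high: float) -> float:
    """Fractional drop in resistance on going from antiparallel to parallel."""
    r_p, r_ap = multilayer_resistances(r_low, r_high)
    return (r_ap - r_p) / r_ap


if __name__ == "__main__":
    # Illustrative (assumed) channel resistances in arbitrary units.
    r_p, r_ap = multilayer_resistances(1.0, 4.0)
    print("R parallel     :", r_p)
    print("R antiparallel :", r_ap)
    print("fractional drop:", gmr_ratio(1.0, 4.0))
```

In this toy model the antiparallel state always has the higher resistance, and the fractional drop reduces to ((r_high − r_low)/(r_high + r_low))², so the effect vanishes when the two spin directions scatter equally.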

XII. FERROMAGNETIC MATERIALS AND THEIR APPLICATIONS

The hysteresis loop of a ferromagnetic material provides information about its usefulness in technical applications, as discussed in Section IV. The hysteresis loop depends on the physical condition, composition, and purity of a specimen. Depending on the application of a ferromagnetic material, the important properties are the “permeability” and “coercivity.”

When a strain-free material is cold-worked, the permeability of the material is reduced and the hysteresis loss is increased. A strain-relieving heat treatment of the cold-worked specimen restores the original magnetic properties; that is, the permeability is increased and the hysteresis loss reduced. In general, strain-free crystals show the minimum hysteresis loss.

The presence of impurities such as carbon, oxygen, nitrogen, and sulfur affects the permeability and hysteresis loss of a ferromagnetic material. In general, the materials with high permeability and low hysteresis loss are pure materials.

The composition of a ferromagnetic material also influences its magnetic properties. The addition of silicon to iron increases the permeability and reduces the hysteresis loss. However, high concentrations of silicon decrease the saturation magnetization. Therefore, iron-silicon alloys with low concentrations of silicon are desirable in applications like the cores of transformers and in electric motors and generators.

The iron-nickel alloys possess high values of initial and maximum permeability and very low hysteresis loss compared to the iron-silicon alloys. An alloy with 78.5% nickel and 21.5% iron is called permalloy and has an initial relative permeability of ∼10,000, compared to 250 for pure iron. These alloys are in general used for the magnetic screening of electronic equipment. Small additions of chromium or molybdenum further modify the magnetic properties of these materials for use as cores in transformers or inductors working at audio or higher frequencies. Metallic magnetic cores of inductors and transformers working at radio frequencies (∼100 MHz), however, show high eddy current losses. Ferrites, which have high resistivity (∼10⁶ times that of metals) and high permeability, are used here instead.

In other applications, ferromagnetic materials with a high value of “coercivity” and a large hysteresis loop area are required. These materials are magnetically hard, in contrast to the soft magnetic materials discussed so far. The addition of carbon to iron increases the hysteresis loss. Carbon steel was used as a material for permanent magnets in earlier days. However, aging degrades the magnetic properties of carbon-steel magnets. The addition of metals such as cobalt, chromium, or tungsten improves the magnetic properties, and the resulting materials are less susceptible to aging. A large number of alloys composed of iron, nickel, cobalt, aluminum, copper, platinum, manganese, and oxides of iron and rare-earth metals have been developed that show high values of coercivity and are suitable for permanent magnets.

Some oxides like γ-Fe2O3 and CrO2 possess high coercivity and are used in recording tapes in the form of thin layers of fine powders. Permalloys are used to construct the inductive magnetic heads that write signals as residual magnetization on the tapes or reproduce the electrical signals from the magnetized tapes. Magnetoresistive read heads based on permalloy are also used. Significant progress has been made in the development of very sensitive and small read heads based on magnetic-nonmagnetic multilayer systems. Magnetic discs or drums are made for the memory systems in computers. The modification of magnetic properties by rapid solidification of alloys from the melt has created a new field. Rapid solidification affects the microstructure: in some cases the phases may be finely dispersed, and in others the alloys may become noncrystalline or amorphous. The amorphous alloys have no crystal lattice and no magnetic anisotropy.

There are no extended defects that would otherwise interact strongly with the domain walls in these noncrystalline materials. In certain cobalt-based noncrystalline alloys, the magnetostriction can be adjusted to zero, so that internal and applied stresses have minimal effect on the magnetic properties. Amorphous magnetic alloys have high hardness and yield strength and are magnetically soft. In particular, cobalt-containing alloys have vanishingly small magnetostriction. Because they combine good mechanical and magnetic properties, they are very useful materials. They can be strained elastically over wide limits, are insensitive to irreversible magnetic damage, and are very suitable where elastic deformability is desired. They are also useful for making recording heads in audio, video, and data recording systems, because of their good high-frequency response and wear resistance.

Their soft magnetic and highly elastic mechanical behavior has led to the development of flexible magnetic shielding.

The amorphous magnetic materials consist of two main groups. In one group, the materials are composed of transition metals and metalloids; in the second group, they are composed only of different metals. Some amorphous materials used as soft magnetic materials are Fe81(Si, B, C)19; (FeNi)78(Mo, Si, B)22; (Co, Fe)70–76(Mo, Si, B)30–24; and (Co, Mn)70–76(Mo, Si, B)30–24.

Hard magnetic materials can also be produced by rapid solidification techniques. For example, a magnetic material of composition Fe14Nd2B produced by rapid solidification is superior to the Co-Sm material. Other hard magnetic materials (containing some transition and rare-earth metals, and boron) have also been produced by rapid cooling. The rapid cooling technique is also less expensive than the conventional methods.

SEE ALSO THE FOLLOWING ARTICLES

CRYSTALLOGRAPHY • ELECTROMAGNETISM • GEOMAGNETISM • MAGNETIC MATERIALS • SUPERCONDUCTIVITY • TRANSFORMERS, ELECTRICAL

BIBLIOGRAPHY

Baibich, M. N., et al. (1988). Phys. Rev. Lett. 61, 2472.
Binasch, G., Grunberg, P., Saurenbach, F., and Zinn, W. (1989). Phys. Rev. B 39, 4282.
Brailsford, F. (1968). “An Introduction to the Magnetic Properties of Materials,” Longmans Green and Co., London.
Chikazumi, S. (1964). “Physics of Magnetism,” John Wiley and Sons, New York.
Craik, D. J., and Tebble, R. S. (1966). “Ferromagnetism and Ferromagnetic Domains,” North-Holland, Amsterdam.
Della Torre, E., and Bobeck, A. H. (1974). “Magnetic Bubbles,” North-Holland, Amsterdam.
Kittel, C. (1979). “Introduction to Solid State Physics,” John Wiley and Sons, New York.
Morrish, A. H. (1965). “Physical Principles of Magnetism,” John Wiley and Sons, New York.
Standley, K. J. (1972). “Oxide Magnetic Materials,” 2nd ed., Oxford University Press, London.
Steeb, S., and Warlimont, H., eds. (1985). “Rapidly Quenched Metals,” Vol. 11, North-Holland, Amsterdam.
Vonsovski, S. V. (1975). “Magnetism,” Halsted Press, New York.
Wohlfarth, E. P. (1980–1982). “Ferromagnetic Materials,” Vol. I (1980), Vol. II (1980), Vol. III (1982), North-Holland, Amsterdam.
Zeiger, H. J. (1973). “Magnetic Interaction in Solids,” Oxford University Press, London.

Page 127: Encyclopedia of Physical Science and Technology - Condensed Matter

P1: GPJ Final Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology En006G-273 June 29, 2001 21:10

Gamma-Ray Spectroscopy
R. F. Casten
C. W. Beausang
WNSL, Yale University

I. Introduction
II. Gamma-Ray Detection
III. Gamma-Ray Spectroscopy and Nuclear Structure
IV. Conclusions

GLOSSARY

Detector efficiency Loosely defined as the probability of detecting the full energy of a gamma-ray.

Detector resolution Refers to the width of the full-energy gamma-ray peak measured in the detector. Typically the resolution is ∼2 keV for semiconductor and 10–20 keV for scintillator detectors for a 1000-keV gamma-ray.

Doppler shift The shift in frequency or energy of waves emitted from a moving source.

Nuclear level scheme A graph of the excited energy levels of a nucleus and their connecting gamma-ray transitions. The levels are usually labeled by their angular momentum and parity quantum numbers.

Nucleons The protons and neutrons that make up the nucleus.

Pauli exclusion principle Fundamental principle of quantum mechanics. It states that for certain types of elementary particles, including electrons, protons, and neutrons, no two identical particles can be in the same quantum state.

Potential energy surface A contour plot of the potential energy of the nucleus as a function of deformation. Stable deformations correspond to minima in the potential energy.

Scintillator detector A material, liquid or solid, that converts the energy lost by a gamma-ray into pulses of light.

Semiconductor detector Essentially a large diode usually constructed out of either silicon or germanium.

Spin Angular momentum.

I. INTRODUCTION

THE NUCLEUS is a unique, strongly interacting, quantum mechanical system. Consisting of a few to a few hundred protons and neutrons, its structure combines the macroscopic features expected of bulk nuclear matter (shape, size, etc.) with the microscopic properties associated with the motion of a finite number of nucleons in a potential.

Atomic nuclei studied in the laboratory (whether they are produced in reactions or populated via radioactive decay) are often found in excited states. Since any physical system seeks its lowest possible energy level, such excited nuclear configurations are unstable. They generally de-excite to the nuclear ground state in time scales of $10^{-15}$ to $10^{-6}$ sec by the emission of one or more gamma-rays. Hence, the study of the gamma-rays emitted from excited nuclei provides a means of studying the levels, de-excitation rates, and structure of these objects.

The study of the decay properties of the atomic nucleus has provided an enormous quantity of information on the behavior of such systems when stressed by the application of high temperature, high angular momentum, large deformation, or large isospin values (isospin is a quantum number which basically counts the difference in the numbers of protons and neutrons in a nucleus).

In this article we will touch upon some of these topics while attempting to give a flavor of the field of nuclear gamma-ray spectroscopy and charting some possible future directions. We begin by introducing some of the detector types used to detect gamma-rays and briefly discuss some of the design criteria for modern gamma-ray spectrometers. This is followed by a discussion of some features found in excited nuclear states, broadly separated into low-spin and high-spin properties, and chosen to illustrate the variety of macroscopic and microscopic features of the nuclear system. To discuss or even list the enormous number of practical applications of gamma-ray spectroscopy in medicine, in industry (e.g., the oil industry), in other sciences such as archeology and astronomy, and in the areas of security and defense is far beyond the possible scope of this article.

II. GAMMA-RAY DETECTION

In this section we discuss the mechanisms by which gamma-rays interact with matter (i.e., detectors), the different types of detectors and detector systems, and the criteria that go into the design and choice of particular systems. To facilitate this discussion, a simplified example of a nuclear level scheme is shown in Fig. 1.

A. Interaction Mechanisms

For energies ranging from a few kilo-electron volts to a few mega-electron volts, gamma-rays interact with matter via one of three principal mechanisms: the photoelectric effect, Compton scattering, or, for energies above ∼1 MeV, electron-positron pair production. Most gamma-ray detectors exploit one or more of these effects both to detect the gamma-ray and to measure its energy. Of course, gamma-rays are electromagnetic waves and sometimes their wave properties are also used in their measurement, for example, with diffraction techniques. Before we discuss the design of gamma-ray detectors and spectrometers, we first briefly describe these interaction mechanisms. The relative probability for each mechanism is shown schematically in Fig. 2 as a function of gamma-ray energy.

FIGURE 1 A simplified nuclear level scheme showing some of the levels and gamma-ray transitions that might be observed in a typical heavy-ion fusion-evaporation reaction. The levels are labeled by their angular momentum and parity quantum numbers.

The photoelectric effect is the dominant interaction mechanism for low gamma-ray energies, below a few hundred kilo-electron volts. In this case, the gamma-ray interacts with an atomic electron somewhere in the bulk of the detector material. The gamma-ray energy is transferred to the electron, which is ejected from the atom with energy $E_e = E_\gamma - E_{BE}$, where $E_{BE}$ is the electron binding energy. The probability for the photoelectric effect interaction increases very rapidly with the atomic number ($Z$) of the material. This is why high-$Z$ materials are favored both for gamma-ray detectors and for absorbers and shields.

FIGURE 2 Schematic diagram showing the relative probabilities for photoelectric, Compton scattering, and pair production as a function of gamma-ray energy.

The Compton scattering mechanism is similar to the photoelectric effect in that the gamma-ray also interacts with an atomic electron in the detector material. In this case, however, the initial gamma-ray energy is shared between the electron and a scattered (lower energy) gamma-ray. Compton scattering is the dominant interaction mechanism for gamma-ray energies in the range from a few hundred kilo-electron volts up to a few mega-electron volts. It is important to realize that this is the energy range of most gamma-rays produced in typical nuclear structure experiments.
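The sharing of energy in a single Compton scatter can be sketched with the standard Compton formula for a free electron at rest. That formula is not given in the text and is supplied here as an assumption of the illustration, as are the function name and the example energy (the 1332-keV $^{60}$Co line mentioned elsewhere in this article).

```python
import math

ELECTRON_REST_ENERGY_KEV = 511.0  # m_e c^2, rounded as in the text


def compton_split(e_gamma_kev, angle_deg):
    """Return (scattered photon energy, electron kinetic energy) in keV.

    Standard Compton formula for a free electron at rest; the electron
    binding energy is neglected (an assumption of this sketch).
    """
    cos_t = math.cos(math.radians(angle_deg))
    e_scattered = e_gamma_kev / (
        1.0 + (e_gamma_kev / ELECTRON_REST_ENERGY_KEV) * (1.0 - cos_t)
    )
    return e_scattered, e_gamma_kev - e_scattered


for angle in (0, 90, 180):
    photon, electron = compton_split(1332.0, angle)
    print(f"{angle:3d} deg: photon {photon:7.1f} keV, electron {electron:7.1f} keV")
```

Backscattering (180°) gives the electron its largest possible share. If the scattered photon then leaves the detector, only the electron's share is recorded, which is the origin of the continuous background discussed later in connection with escape suppression.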

For gamma-ray energies greater than $2m_ec^2$ ($m_ec^2$ being the rest mass energy of the electron, ∼511 keV), electron-positron pair production is possible, and, for energies significantly higher than this threshold, pair production begins to dominate the interaction cross section. In this case, some of the incident gamma-ray energy is used to create the electron-positron pair while the remainder (in excess of $2m_ec^2$) is shared as kinetic energy between the electron and positron. Eventually, the positron annihilates with another electron in the detector medium, producing two photons, each of energy 511 keV, emitted back to back.

As one can see, all three of these interaction mechanisms result in the production of a single energetic electron (or, in the case of pair production, of an electron-positron pair) with a kinetic energy less than or equal to the incident gamma-ray energy. These energetic electrons recoil through the bulk material of the detector, their range being typically less than a millimeter or two. They rapidly slow down, losing energy through many collisions with other atomic electrons. In an ideal detector all of the incident gamma-ray energy is eventually absorbed in the detector material by a combination of photoelectric, Compton scattering, and (for high enough gamma-ray energies) pair-production processes.

It is intuitively obvious that the number of collisions, and hence the number of secondary electronic excitations produced, is proportional to the primary electron energy and, hence, is directly related to the incident gamma-ray energy. For example, in semiconductor detectors, which are essentially diodes made out of germanium (Ge) or silicon (Si), the electron-hole pairs produced following electron–electron collisions are extracted by a high voltage placed across the detector and produce a current pulse which is proportional to the deposited gamma-ray energy. For scintillation detectors, such as sodium iodide (NaI(Tl)) detectors, the collisions of the primary electron produce excited atomic or molecular states. The subsequent decay of these states produces scintillation photons (typically in the UV range). These photons are converted into a current pulse using a photocathode and photomultiplier tube. The size of the current pulse is again proportional to the deposited gamma-ray energy.

B. Gamma-Ray Detectors

Different types of detectors have quite different efficiencies and energy resolutions. Indeed, generally speaking, gamma-ray spectroscopy is a constant trade-off between these two properties. For example, scintillation detectors, such as NaI(Tl) detectors, typically have high efficiencies but poor energy resolution compared to Ge detectors. Detectors based on the technique of crystal diffraction (see below) have superb energy resolution but very small efficiency.

1. Scintillation Detectors

The detection of gamma-rays (or other types of ionizing radiation) by the scintillation light produced in certain materials is one of the oldest techniques on record, and it is still one of the most useful and common techniques today.

A scintillator, either a solid or a liquid, is a material which converts the energy lost by the gamma-ray into pulses of light. The scintillation light is detected in turn by a light-sensitive material which usually forms the cathode of a photomultiplier tube. The light pulses are converted into electrons in the photocathode. These electrons are then accelerated and their number vastly (and linearly) amplified in the photomultiplier. The resulting current pulse is proportional to the energy of the absorbed gamma-ray.

The energy required to produce a light pulse is fairly large, on the order of 30–50 eV. Thus, the average number of light pulses produced when, say, a 500-keV gamma-ray is absorbed is on the order of 10,000. Fluctuations in this number and in the light collection process limit the resolution obtainable with scintillation detectors.

2. Semiconductor Detectors

A semiconductor detector is essentially a large diode constructed out of either Si or Ge. For gamma-ray spectroscopy, Ge detectors are preferred, as Ge has a larger stopping power. The diode is operated under a reverse bias and is normally fully depleted (i.e., with no free charge carriers). The gamma-ray interaction produces electron-hole pairs in the depletion region, which are collected because of the detector bias voltage and which produce a current pulse proportional to the absorbed gamma-ray energy.

In contrast to scintillator detectors, the average energy required to produce a single electron-hole pair is only about 2–3 eV. Therefore, a 500-keV gamma-ray can produce around 250,000 primary charge carriers, a much larger number than for scintillation detectors, with a corresponding decrease in the statistical fluctuations and improvement in detector resolution. Figure 3a shows a typical spectrum of a $^{60}$Co source obtained using a modern Ge detector. The energy resolution obtained is about 2 keV for gamma-ray energies of about 1000 keV. Using a NaI(Tl) scintillation detector, the two peaks in Fig. 3a (having energies of 1174 and 1332 keV) would be barely resolved. Following a brief discussion of crystal diffraction detectors, we will focus most of the remainder of this article on gamma-ray spectroscopy using Ge detectors.
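The carrier-counting argument for the two detector types can be checked with a few lines of arithmetic. The sketch below assumes pure Poisson counting statistics, which is a simplification: real scintillators lose further resolution in light collection, and the fluctuations in real semiconductor detectors are smaller than the Poisson estimate. The function names are illustrative only.

```python
import math


def n_carriers(e_gamma_ev, ev_per_carrier):
    """Mean number of information carriers for a fully absorbed gamma-ray."""
    return e_gamma_ev / ev_per_carrier


def relative_spread(n):
    """Relative statistical fluctuation, assuming pure Poisson statistics."""
    return 1.0 / math.sqrt(n)


E_GAMMA_EV = 500e3  # the 500-keV example used in the text

n_scint = n_carriers(E_GAMMA_EV, 50.0)  # upper end of the 30-50 eV range
n_ge = n_carriers(E_GAMMA_EV, 2.0)      # lower end of the 2-3 eV range

print(f"scintillator: {n_scint:.0f} carriers, spread {relative_spread(n_scint):.2%}")
print(f"germanium:    {n_ge:.0f} carriers, spread {relative_spread(n_ge):.2%}")
```

With these inputs the carrier counts differ by a factor of 25, so the purely statistical contribution to the resolution differs by a factor of 5.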

3. Ultra-High Energy Resolution Spectroscopy: Crystal Diffraction

Gamma-rays are, after all, electromagnetic photons, and under certain circumstances their wave properties may be used in their detection and measurement. Indeed, the ultimate in current gamma-ray energy resolution and measurement precision is obtained by the use of crystal diffraction. With such techniques it is routine to measure a 1-MeV gamma-ray with an energy resolution of ∼3 eV and an energy precision of better than 1 eV. The cost, however, is low efficiency and the need to scan the energy spectrum one small energy bite at a time.

FIGURE 3 (a) Typical spectrum of a $^{60}$Co source obtained using a modern Ge detector with and without escape suppression. The vertical scale has been greatly expanded in order to show the Compton background. The dramatic reduction in the height of the background when using a Compton suppression shield is obvious. The insert shows the same spectra but now with the full vertical scale to illustrate the height of the photopeaks compared to the background. (b) An example of the very high energy resolution obtainable using a crystal diffraction system. The top spectrum, obtained using a Ge detector, shows seven peaks, some of which are not resolved. The lower spectrum, obtained using the GAMS crystal diffraction system, shows the same portion of the spectrum, but with a dramatic improvement in resolution. In both spectra, the gamma-rays are labeled with their energies in kilo-electron volts.

The technique uses Bragg diffraction from a nearly perfect crystal, usually of Si. As for optical and X-ray transitions, the gamma-ray wavelength $\lambda$ and diffraction angle $\theta$ are related by the Bragg law: $n\lambda = 2d \sin\theta$, where the lattice spacing $d$ is known to an accuracy of 1 part in $10^{10}$ and $n$ is the order of diffraction. Clearly, the resolution scales with $n$. Higher order diffraction gives greater dispersion and, hence, energy precision, although the efficiency generally falls off with $n$. The accuracy depends on the precision of the angle measurement. In the realization of this technique at the Institut Laue Langevin in Grenoble, in the GAMS (GAMma-ray Spectrometer) family of instruments, accuracies of the latter are typically in the milliarcsecond range (Koch et al., 1980). Given the nature of the technique, the gamma-ray energy spectrum is stored energy interval by interval rather than sampled fully at each point in time. An example of a crystal diffraction spectrum compared to the corresponding Ge detector spectrum is shown in Fig. 3b. Generally speaking, crystal spectrometers offer the greatest advantages over Ge detectors for gamma-ray energies below ∼1 MeV. At higher energies their efficiency drops quickly, and hence, lower orders of diffraction are used with poorer energy resolution.
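To get a feel for the angles involved, the Bragg law can be evaluated for a 1-MeV gamma-ray. The sketch below assumes diffraction from Si(220) planes with a spacing of about 0.192 nm; the text does not say which planes are used, so this value and the function names are illustrative assumptions. The angular step that corresponds to a given energy step follows from differentiating the Bragg law at fixed $n$ and $d$, which gives $|d\theta| = \tan\theta \, |dE|/E$.

```python
import math

HC_EV_NM = 1239.841984   # h*c in eV nm
D_SPACING_NM = 0.1920    # assumed Si(220) plane spacing (illustrative)
RAD_TO_MAS = math.degrees(1.0) * 3600.0 * 1000.0  # radians -> milliarcseconds


def bragg_angle(energy_ev, order=1, d_nm=D_SPACING_NM):
    """Bragg angle in radians from n*lambda = 2*d*sin(theta)."""
    wavelength_nm = HC_EV_NM / energy_ev
    return math.asin(order * wavelength_nm / (2.0 * d_nm))


def angle_step_for_energy_step(energy_ev, d_energy_ev, order=1):
    """|d(theta)| in radians for a given |dE|: tan(theta) * dE / E."""
    theta = bragg_angle(energy_ev, order)
    return math.tan(theta) * d_energy_ev / energy_ev


theta_1 = bragg_angle(1.0e6)
step = angle_step_for_energy_step(1.0e6, 3.0)  # 3 eV resolution at 1 MeV

print(f"first-order Bragg angle: {math.degrees(theta_1):.4f} deg")
print(f"angle step for 3 eV:     {step * RAD_TO_MAS:.2f} milliarcsec")
```

Under these assumptions the first-order Bragg angle is only a fraction of a degree, and resolving a few electron volts requires angle measurements at the milliarcsecond level, in line with the accuracies quoted above. Going to higher order increases the angle and hence the dispersion.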

C. Level Scheme Construction

One might ask why the extraordinary energy precision of crystal diffraction techniques is useful, since nuclear models seldom predict nuclear states to accuracies better than many kilo-electron volts. The principal reason relates to the construction of reliable level schemes. One method of constructing level schemes utilizes the Ritz Combination Principle. Here, nuclear level energies are determined by demanding that their energy differences be equal to the energy of the gamma-ray transition connecting them (see Fig. 1). A brief discussion of this technique enables us to see both the role of the ultra-high energy resolution in level scheme construction and the complementary and much more commonly used technique of coincidence spectroscopy with Ge detectors.

To see the point, imagine constructing a level scheme which consists of 50 levels spanning the excitation energy range from zero up to 2.0 MeV simply by using the Ritz Combination Principle in a case where one has detected, say, 500 gamma-rays with energies below 1.2 MeV, with Ge detector energy accuracy of ±0.1 keV. This is a typical situation encountered in the spectroscopy of low-spin states of heavy nuclei. In such a case the probability of an accidental Ritz Combination, that is, a level energy difference that inadvertently coincides within uncertainties with a gamma-ray energy, is about 10%. Even with other experimental input, such as information on the angular momenta of the levels to rule out certain transition placements from conservation of angular momentum, it is clear that a large number of incorrect placements and, hence, incorrect physics will result. There are two ways of resolving this situation: either using time coincidence relations between successive gamma-rays to place them correctly in the level scheme of a nucleus or improving the energy resolution significantly. For the latter, the crystal diffraction approach is ideal. With an energy precision of, say, ±5 eV, the probability of an accidental sum drops to negligible levels. Indeed, data from the GAMS spectrometers at the ILL have often shown that existing Ge detector results (usually data taken without coincidences) are in error.
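The quoted accidental-match probability can be reproduced with a crude estimate. The sketch below assumes that the 500 gamma-ray energies are spread uniformly over the 1.2-MeV range and that each one covers a window of plus or minus the energy uncertainty. This uniform model and the function name are assumptions made for illustration, not the authors' calculation.

```python
def accidental_match_probability(n_gammas, energy_range_kev, tolerance_kev):
    """Chance that one level-energy difference matches some gamma-ray by accident.

    Crude model: gamma-ray energies spread uniformly over the range, each
    covering a window of +/- tolerance_kev.
    """
    covered_fraction = n_gammas * 2.0 * tolerance_kev / energy_range_kev
    return min(1.0, covered_fraction)


N_LEVELS = 50
level_pairs = N_LEVELS * (N_LEVELS - 1) // 2  # upper bound on differences to test

p_ge = accidental_match_probability(500, 1200.0, 0.1)        # Ge: +/- 0.1 keV
p_crystal = accidental_match_probability(500, 1200.0, 0.005)  # crystal: +/- 5 eV

print(f"level pairs:               {level_pairs}")
print(f"Ge detector accuracy:      {p_ge:.1%} per level difference")
print(f"crystal diffraction:       {p_crystal:.2%} per level difference")
```

With Ge detector accuracy the estimate comes out close to the 10% quoted above, and since there are over a thousand level pairs, many accidental placements are to be expected. Tightening the tolerance to ±5 eV reduces the probability by a factor of 20.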

However, the usual solution to this problem is the use of coincidence spectroscopy. This technique exploits the fact that nuclear levels are generally short lived, with typical half-lives in the pico- to nanosecond range, so that successive gamma-ray de-excitations effectively occur simultaneously on the time scale of standard pulse analysis electronics. Therefore, if two or more gamma-rays are observed in separate detectors in time coincidence (within, say, a few nanoseconds), then they must occur in a cascade in the nuclear level scheme. For example, in Fig. 1 the $6^+_1 \to 4^+_1$ and $4^+_1 \to 2^+_1$ transitions would be in coincidence, as would the $3^+_1 \to 2^+_2$ and $2^+_2 \to 4^+_1$ or $2^+_2 \to 2^+_1$ transitions. However, the $3^+_1 \to 2^+_2$ transition is not in coincidence with the $6^+_1 \to 4^+_1$ transition. Such coincidence relations are of inestimable help in constructing complex nuclear level schemes.
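The coincidence logic can be expressed as a reachability test on the level scheme: two transitions can appear in the same cascade if the level fed by one can decay, directly or through intermediate levels, to the level from which the other starts. The sketch below uses a hypothetical scheme built from the transitions named in the text plus an assumed $2^+_1 \to 0^+_1$ ground-state transition. The level labels, the data structure, and the function names are all illustrative.

```python
# Hypothetical level scheme: each transition is (initial level, final level).
# The 2+1 -> 0+1 ground-state transition is an assumption of this sketch.
TRANSITIONS = [
    ("6+1", "4+1"),
    ("4+1", "2+1"),
    ("3+1", "2+2"),
    ("2+2", "4+1"),
    ("2+2", "2+1"),
    ("2+1", "0+1"),
]


def reachable(start, transitions):
    """All levels reachable from `start` by following transitions downward."""
    seen, stack = {start}, [start]
    while stack:
        level = stack.pop()
        for initial, final in transitions:
            if initial == level and final not in seen:
                seen.add(final)
                stack.append(final)
    return seen


def in_coincidence(t1, t2, transitions=TRANSITIONS):
    """True if t1 and t2 can belong to the same de-excitation cascade."""
    return (t2[0] in reachable(t1[1], transitions)
            or t1[0] in reachable(t2[1], transitions))


print(in_coincidence(("6+1", "4+1"), ("4+1", "2+1")))
print(in_coincidence(("3+1", "2+2"), ("2+2", "2+1")))
print(in_coincidence(("3+1", "2+2"), ("6+1", "4+1")))
```

In a real analysis the relation is used in the opposite direction: the observed coincidences constrain which placements of the transitions in the level scheme are allowed.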

While coincidence spectroscopy is a powerful technique, it also has limitations. For example, it does not help place ground state transitions or very weak transitions, and, in some cases, experimental constraints preclude its use. Nevertheless, it is, by far, the most common approach to sorting out the plethora of gamma-rays observed in nuclear de-excitation. We will further discuss coincidence spectroscopy below when we introduce advanced multidetector arrays of Ge detectors.

We briefly mention that another application of ultra-high resolution crystal spectroscopy is in the determination of fundamental constants such as the accepted standard for length measurements (the definition of the meter) through the precise measurement of gamma-ray wavelengths. These applications fall outside the scope of this article.

D. The Evolution of Detector Arrays

To illustrate the increasing power and sophistication of gamma-ray detectors, particularly of Ge detector arrays, it is useful to consider the nuclear reactions by which the nuclei to be studied are formed. One of the most common reaction mechanisms used to populate high-spin states in atomic nuclei is the heavy-ion fusion-evaporation reaction. This type of reaction has the advantage of bringing large quantities of angular momentum into the product nucleus (often up to the limit allowed by fission), while at the same time populating only a few product nuclei with significant probability.

In such reactions, a heavy-ion beam is incident on a target at an energy just above the Coulomb barrier. A typical reaction might involve a $^{48}$Ca ($Z = 20$, $N = 28$) beam incident on a $^{108}$Pd ($Z = 46$, $N = 62$) target at a beam energy of 200 MeV. Following the collision, the beam and target nuclei fuse to form a compound nucleus, in this case $^{156}$Dy ($Z = 66 = 20 + 46$, $N = 90 = 28 + 62$). The compound nuclear system will be produced in a highly excited and rapidly rotating state, with typically 60 MeV of excitation energy and about $70\hbar$ of angular momentum.

The initial decay of the compound system is via the emission of a few (3–5) particles, usually neutrons and less frequently protons or alpha particles. This first stage of the decay process typically removes about 40 MeV of excitation energy and about $10\hbar$ of angular momentum. The remainder of the excitation energy and most of the angular momentum is subsequently removed by gamma-ray emission. Each gamma-ray photon removes either one or two units of angular momentum. Thus, we can expect the emission of cascades of up to 30 gamma-rays following each reaction. Because of this multiplicity of gamma-rays, the study of transitions following production of the compound nucleus imposes stringent requirements on detector systems. Other reactions used, such as Coulomb excitation, (n, γ), or β-γ decay, generally present simpler experimental challenges. The driving force in the impressive developments in gamma-ray detector systems in the last 40 years has been the requirements imposed by the fusion evaporation reaction studies.

Information about the changes in nuclear structure during the decay, as the nucleus loses energy and angular momentum, is obtained by measuring the properties of the gamma-rays in these cascades, such as the gamma-ray energy, angular distribution, linear polarization, emission sequence, etc.

The evolution of gamma-ray spectroscopy with time over the past 40 years or so is illustrated in Fig. 4, which plots the population intensity of various nuclear states as a function of angular momentum or spin of the state. Pioneering experiments in the early 1960s were carried out with one or a few NaI(Tl) scintillation detectors (Morinaga and Gugelot, 1963). The sensitivity of these experiments was limited, both by the poor energy resolution of NaI(Tl) detectors (about 80 keV at 1000 keV) and by the small number and size of the detectors, to spins up to about 8–10 $\hbar$ and to states populated with about 10% of the intensity of the strongest transition.

FIGURE 4 A schematic diagram illustrating the evolution of gamma-ray spectroscopy. The various symbols plot the measured intensity of various nuclear states vs angular momentum, giving an indication of the sensitivity of various detector systems. Early experiments using NaI(Tl) and a few Ge detectors were sensitive to excited states which were populated with intensities down to about one-tenth of the reaction channel (solid symbols). As time went on, more sensitive arrays were developed. The current generation of arrays, the Gammasphere and Euroball arrays, are capable of observing excited states populated with a fraction as small as $10^{-6}$ of the reaction channel. The open symbols and stars plot the intensity of various superdeformed bands as a function of angular momentum.

The introduction of reverse-biased, lithium-drifted Ge detectors in the mid 1960s led to a major increase in sensitivity and major breakthroughs in our physics knowledge. Germanium detectors have very good energy resolution, about 1 keV for $E_\gamma \sim 100$ keV and 2 keV for $E_\gamma \sim 1000$ keV. On the other hand, the detection efficiency of early Ge detectors was often much lower than that of NaI(Tl) detectors. To compensate for the lower efficiency, and also to measure the time coincidence relationships of successive gamma-rays in a cascade, experiments with more than one Ge detector were soon commonplace. The phenomenon of backbending, at spin ∼15 $\hbar$ (see below), was discovered by Johnson, Ryde, and Sztarkier (1971) using just two Ge(Li) detectors, while the structure of $^{160,161}$Yb was investigated by Riedinger et al. (1980) up to spin ∼30 $\hbar$ using only four Ge detectors. In the last three decades, the study of the properties of the atomic nucleus through gamma-ray spectroscopy has evolved through the development of larger and more efficient Ge detector arrays and, indeed, has driven the development of these arrays.

Starting in the 1980s and continuing to today, large arrays of Ge detectors such as TESSA, GASP, Eurogam, Gammasphere (Lee, 1990), and Euroball (Gerl and Lieder, 1992) further revolutionized gamma-ray spectroscopy. Future arrays such as the proposed GRETA (Gamma-Ray Energy Tracking Array) spectrometer (Deleplanque et al., 1999), which promise very large increases in sensitivity resulting from modern manufacturing techniques, electronics, and digital data processing, are in the planning stages.

E. Germanium Detector Performance

1. Peak-to-Total Ratio and Escape Suppression

A major problem encountered with early Ge detectors, and still a problem today, is the poor peak-to-background ratio in the spectrum. The background, clearly seen in Fig. 3a, is caused by incomplete energy collection in the Ge detector occurring when a Compton scattered gamma-ray leaves the active bulk of the detector before being absorbed.

Even with today’s large volume Ge detectors, irradiation with a standard $^{60}$Co source (which emits two gamma-rays with energies of 1174 and 1332 keV) yields a spectrum where only ∼25% of the events lie in the full energy photopeaks. This number, the ratio of the number of counts in the photopeak(s) to the total number of counts in the spectrum, is termed the peak-to-total ratio (PT). For PT = 0.25, the remaining 75% of the events in the detector form a continuous background extending to lower energies.


The preferred solution is to detect these scattered photons in a second, surrounding detector, termed an escape suppression or an anti-Compton shield, and to reject, using fast electronics, coincidence events between the Ge detector and the shield detector. The combination of Ge detector and suppression shield is termed an escape suppressed spectrometer (ESS). The material commonly used in the anti-Compton shield is bismuth germanate (BGO), a dense, high-efficiency scintillator material.

After suppression, typically about 65% of the remaining events are in the photopeaks (PT = 0.65). A typical ESS configuration showing a Ge detector and shield is shown in Fig. 5, while the improvement in the background, and hence spectrum quality, is illustrated in Fig. 3a.

The PT ratio is of prime importance for coincidence spectroscopy. For example, when requiring a coincidence between two Ge detectors, a PT ratio of 0.25 implies that only $(0.25)^2$ or ∼6% of the events will be photopeak-photopeak coincidences. The remaining 94% will be background events. Using an ESS, however, the photopeak-photopeak coincidence fraction increases to $(0.65)^2$ or 42%, an improvement of a factor of 7. Even larger improvements are obtained when three- or higher fold coincidence events are recorded. For example, the improvement is a factor of 17 for triples coincidences, 45 for quadruples, and 120 for quintuples. Today’s largest gamma-ray spectrometers, the Gammasphere array in the United States and the Euroball array in Europe, regularly record even higher fold coincidence events (Lee, 1990; Gerl and Lieder, 1992).
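The improvement factors quoted for higher fold coincidences follow from raising the peak-to-total ratio to the power of the fold. A minimal sketch, assuming that every detector has the same PT value and that the detectors are independent (the function name is illustrative):

```python
PT_BARE = 0.25        # peak-to-total ratio without suppression (from the text)
PT_SUPPRESSED = 0.65  # peak-to-total ratio with an escape suppression shield


def photopeak_fraction(pt, fold):
    """Fraction of fold-coincidence events with every gamma-ray in a photopeak."""
    return pt ** fold


gains = {}
for fold in range(2, 6):
    bare = photopeak_fraction(PT_BARE, fold)
    suppressed = photopeak_fraction(PT_SUPPRESSED, fold)
    gains[fold] = suppressed / bare
    print(f"fold {fold}: bare {bare:.4f}, suppressed {suppressed:.4f}, "
          f"gain x{gains[fold]:.1f}")
```

The computed gains agree with the rounded factors given above for doubles through quintuples.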

The efficiency and sensitivity of ESS arrays improved rapidly, so that by the mid 1980s arrays with more than 20 ESS having total absolute peak efficiencies of up to 1% were constructed. By convention, the total photopeak efficiency is defined as the probability of measuring the full energy of the 1332-keV $^{60}$Co gamma-ray when the source is placed at the center of the array. These ESS arrays enabled nuclear phenomena that occur at an intensity of about 1% of the total intensity of the nucleus to be studied. Worldwide there were about a dozen arrays with this level of sensitivity. One of the earliest of these, the TESSA3 array (Nolan, Gifford, and Twin, 1985), located at Daresbury Laboratory in the United Kingdom, was used in the discovery of the classic discrete line superdeformed band in $^{152}$Dy. Superdeformation will be discussed further below.

In the mid 1990s the latest generation of gamma-ray spectrometers with total photopeak efficiencies of up to ∼10% came on line. These spectrometers, namely the Gammasphere (Lee, 1990) and Eurogam/Euroball (Gerl and Lieder, 1992; Beausang et al., 1992) arrays, contain up to 240 individual Ge elements and have sensitivities of better than 0.001% of the production cross section. Some of the detectors in these arrays are composites formed by closely packing several Ge detectors together as a unit. Two varieties of such units, called clover (Duchene et al., 1999) or cluster (Eberth et al., 1996) detectors, are nowadays the backbone of advanced arrays such as the YRAST Ball array at Yale University (Beausang et al., 2000) or the planned Exogam and Miniball arrays in Europe (Simpson et al., 2000). Even more powerful detectors, termed tracking detectors, are under development. These will be discussed below.

2. Counting Rates

It is informative to look at some of the numbers involved in a typical nuclear physics reaction carried out in the laboratory. Once again, we consider the example of the $^{48}$Ca + $^{108}$Pd reaction, which was used in the experiment in which the first superdeformed band in $^{152}$Dy was discovered (Twin et al., 1986).

Typically, the beam intensity from an accelerator is about $10^{10}$–$10^{11}$ particles per second incident on a target. This corresponds to an electric current on the order of a few nano-Amperes (nA). About one beam particle in a million will actually strike a target nucleus and induce a nuclear reaction. Therefore, we expect about 100,000 reactions per second. About 20% of the reactions produce $^{152}$Dy and about 1% of these will populate the nucleus in the superdeformed state, corresponding to about 200 such events per second.

The array used in the original discovery of the superdeformed band in $^{152}$Dy, the TESSA3 array (Nolan, Gifford, and Twin, 1985), had a total photopeak efficiency of about 0.5%. Assuming that each superdeformed nucleus decays by emitting a cascade of ∼25 gamma-rays, and that we require a coincidence between two detectors ($\gamma^2$) before accepting an event, we might expect to detect about 1 gamma-ray coincidence event per second originating from a superdeformed cascade. Since each cascade is ∼25 transitions long, we expect about 1 count per gamma-ray transition every 20 sec or so. The background rate from other processes is many hundreds of times greater.
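The chain of estimates above can be followed step by step. The sketch below takes the upper end of the quoted beam intensity and makes a simplifying assumption of its own: any of the distinct gamma-ray pairs in a 25-fold cascade can give a double coincidence, with each gamma-ray detected with the total photopeak efficiency of the array. The variable names are illustrative, and the result should be read as an order-of-magnitude check only.

```python
BEAM_RATE = 1e11               # beam particles per second (upper end of range)
REACTION_PROBABILITY = 1e-6    # about one beam particle in a million reacts
FRACTION_152DY = 0.20          # fraction of reactions producing 152Dy
FRACTION_SUPERDEFORMED = 0.01  # fraction of those in the superdeformed state
PHOTOPEAK_EFFICIENCY = 0.005   # total photopeak efficiency, about 0.5%
CASCADE_LENGTH = 25            # gamma-rays per superdeformed cascade

reaction_rate = BEAM_RATE * REACTION_PROBABILITY
sd_rate = reaction_rate * FRACTION_152DY * FRACTION_SUPERDEFORMED

# Assumption of this sketch: count every distinct pair of cascade gamma-rays.
pairs_per_cascade = CASCADE_LENGTH * (CASCADE_LENGTH - 1) // 2
coincidence_rate = sd_rate * pairs_per_cascade * PHOTOPEAK_EFFICIENCY ** 2

# Spread the coincidence events evenly over the transitions in the cascade.
seconds_per_count = CASCADE_LENGTH / coincidence_rate

print(f"reactions per second:           {reaction_rate:.0f}")
print(f"superdeformed events per second: {sd_rate:.0f}")
print(f"double coincidences per second:  {coincidence_rate:.1f}")
print(f"seconds per count per transition: {seconds_per_count:.0f}")
```

With these inputs the estimate gives of order one coincidence event per second and a count in any given transition every few tens of seconds, the same order of magnitude as the figures quoted in the text.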

3. Doppler Effects and Segmentation

The lifetimes of the highest spin states populated via heavy-ion fusion-evaporation reactions are often comparable to, or shorter than, the stopping time of the recoiling nucleus (recoiling due to the momentum imparted by the incident beam nucleus that initiates the reaction) in the target material. Typical recoil velocities are on the order of a few percent of the speed of light. Therefore, Doppler effects play a major role.


FIGURE 5 Escape suppression spectrometer showing Ge detector and shield. This is the type of ESS used for clover Ge detectors in the Eurogam/Euroball array. The clover Ge detector position is indicated inside the suppression shield. The liquid nitrogen storage dewar is also shown.

The Doppler shifted energy of a gamma-ray emitted from a nucleus in flight is given by

$$E_\gamma = E_0 \left[ 1 + \frac{v}{c} \cos\theta \right],$$

where $E_0$ is the unshifted energy and $\theta$ is the detector angle. If a detector records the same gamma-ray emitted from different nuclei having a wide range of velocities or traveling at different angles with respect to the beam direction, the resulting energy resolution can be very poor.

One solution is to use very thin targets in order to minimize slowing down effects and detectors that subtend only a small range of angles. Knowing the detector angles, one can correct for the Doppler shift and recover most of the resolution. The limit on detector resolution now becomes the finite opening angle of the Ge detector itself, in other words the uncertainty in knowing in which part of the Ge detector the gamma-ray actually interacted. For a constant recoil velocity, the Doppler broadening is given by

$$dE = E_0\,\frac{v}{c}\,\sin\theta\,d\theta,$$

where dθ is the opening angle of the Ge detector, typically about 5–10°. For experiments on very high spin nuclear states, the energy resolution is dominated by Doppler broadening effects and is often a factor of two or more worse than the intrinsic resolution of the Ge detector. For very high recoil velocities the problem is much worse. Because of the sin θ dependence, the Doppler broadening is worst for detectors placed at θ = 90° to the beam direction (even though the Doppler shift is zero at 90°).
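To give a feeling for the size of the effect, the broadening formula can be evaluated for representative values. This is a minimal sketch; the function name and the chosen inputs (1000 keV, v/c = 0.03, a 10° opening angle) are illustrative assumptions within the ranges quoted in the text.

```python
import math

def doppler_broadening_kev(e0_kev: float, beta: float,
                           theta_deg: float, opening_angle_deg: float) -> float:
    """dE = E0 * (v/c) * sin(theta) * d(theta), with d(theta) in radians."""
    return (e0_kev * beta * math.sin(math.radians(theta_deg))
            * math.radians(opening_angle_deg))

# Hypothetical example: 1000-keV gamma-ray, beta = 0.03, 10-degree opening angle.
for angle in (0.0, 30.0, 60.0, 90.0):
    width = doppler_broadening_kev(1000.0, 0.03, angle, 10.0)
    print(f"theta = {angle:4.1f} deg  dE = {width:.2f} keV")
```

At 90° this gives a broadening of roughly 5 keV, which shows directly why a smaller opening angle per segment is so valuable.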

Various methods have been developed to minimize Doppler broadening effects. Most involve the concept of detector segmentation in which one determines in what part of a detector a given photon was detected. The development of the clover detector, for example, with four separate Ge detectors closely packaged in a single vacuum vessel, was driven by such concerns (Duchene et al., 1999). A schematic diagram of a clover Ge detector is shown in Fig. 6. The idea is that by using four small detectors, one effectively has a much larger detector while preserving the smaller opening angle for each individual segment.

FIGURE 6 Schematic diagram showing the four Ge crystals of a segmented clover Ge detector. In this type of clover detector, to further improve position information, signals are taken from the center contacts of each crystal (labeled 1–4) and also from the left, right, and middle parts of the outer electrical contacts.

Gamma-rays may interact in only a single element of the clover detector. In this case one takes the angle θ to be that of the center of this element, and dθ is the opening angle of this segment. A gamma-ray may also scatter between two elements of the clover detector. In this case simulations and measurements have shown that the gamma-ray interaction usually takes place close to the boundary between the two crystals. Thus, one is justified in taking θ as the angle of the boundary. The other enormous advantage of the clover detector is that the energy is measured accurately, even for such scattering events. The energies measured in each separate crystal may be added together while preserving the good energy resolution of the individual crystals. Because of this add-back feature, the efficiency of a clover detector, consisting of four individual Ge crystals, is actually about six times the efficiency of the individual detector crystals.
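The add-back procedure can be sketched in a few lines. This is an illustration only and not the analysis code of any particular array: the function name, the event layout (one energy per crystal), and the 10-keV noise threshold are all hypothetical.

```python
from typing import List, Tuple

def add_back(crystal_energies_kev: List[float],
             threshold_kev: float = 10.0) -> Tuple[float, int]:
    """Sum the energies of all clover crystals that fired in one event.

    Returns the summed energy and the number of crystals above threshold.
    The threshold is a hypothetical noise cut, not a value from the text.
    """
    hits = [e for e in crystal_energies_kev if e > threshold_kev]
    return sum(hits), len(hits)

# Hypothetical event: a gamma-ray Compton scatters from crystal 2 into crystal 3.
energy, fold = add_back([0.0, 800.0, 532.5, 0.0])
print(f"add-back energy = {energy} keV from {fold} crystals")
```

Events in which two crystals fire would otherwise fall in the Compton background of each individual crystal; summing them moves those events back into the full-energy peak.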

4. Tracking Detectors

Recently, further advances in detector manufacturing technology allow the electronic segmentation of a single crystal into smaller elements, thus further localizing the interaction site within the volume of the detector. The ultimate goal of these developments is a tracking detector array, which actually allows one to follow the trajectory of each individual gamma-ray as it traverses a detector, even if it undergoes multiple scattering events en route.

Ideally, such an array needs to cover all the available solid angle and localize each gamma-ray interaction to within 1–2 mm in three dimensions. A variety of tracking detectors are under development worldwide, including the Gamma-Ray Energy Tracking Array (GRETA for short) in the United States (Deleplanque et al., 1999).

Such an array would be very efficient. Simulations for the proposed GRETA array indicate that it may be up to a thousand times more sensitive than the best of today's spectrometers. This sensitivity comes about because of the high count-rate capability (the relatively low count rate in each segment is the limiting factor, rather than the high rate in the entire detector), excellent PT ratio, resolution, and efficiency.

A prototype detector for the GRETA array has already been extensively tested at Lawrence Berkeley National Laboratory, Berkeley, CA. One key test involved the determination of the gamma-ray interaction position by use of a closely collimated source. The interaction positions are determined by detailed measurements of pulse shapes on an event-by-event basis. A comparison of measured pulse shapes with calculations shows excellent agreement, which is a major first step in a proof of principle for the detector. The next step in this project is to purchase a mini-array of such detectors. These multiple detectors, assembled into a closely packed array, allow one both to do physics and to prove the principle of practical gamma-ray tracking for the first time. The proposal to construct this array is currently awaiting funding.

5. Pair Spectrometers

High-energy gamma-rays (with energies, say, in the 1–10 MeV range) can interact with matter to produce positron-electron pairs. When the positron annihilates, two photons, each of 511 keV, are emitted at an angle of 180° to each other. When such a pair-production process occurs in a Ge detector, one or both of the 511-keV photons may escape from the detector without being detected. Hence, each gamma-ray transition leads to three peaks in the spectrum, the full energy peak plus the so-called single and double "escape" peaks. This proliferation of peaks can significantly complicate spectral analysis and adversely affect nuclear level scheme construction.
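The positions of the three peaks follow directly from the 511-keV annihilation energy. The short sketch below lists them for a given transition; the function name and the 6000-keV example are hypothetical, and 511 keV is used in the rounded form quoted in the text.

```python
ANNIHILATION_PHOTON_KEV = 511.0  # rounded value used in the text

def pair_production_peaks(e_gamma_kev: float) -> dict:
    """Return the full-energy, single-escape, and double-escape peak positions."""
    if e_gamma_kev <= 2 * ANNIHILATION_PHOTON_KEV:
        raise ValueError("pair production requires more than 1022 keV")
    return {
        "full_energy": e_gamma_kev,
        "single_escape": e_gamma_kev - ANNIHILATION_PHOTON_KEV,
        "double_escape": e_gamma_kev - 2 * ANNIHILATION_PHOTON_KEV,
    }

# Hypothetical 6000-keV primary transition.
for name, energy in pair_production_peaks(6000.0).items():
    print(f"{name:14s} {energy:7.1f} keV")
```

A spectrum with N high-energy transitions can therefore show up to 3N peaks, which is the complication a pair spectrometer is designed to remove.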

To improve such spectra one often uses a pair spectrometer, which, in essence, is the inverse of the anti-Compton shield spectrometer discussed earlier. Whereas in an anti-Compton shield any event detected in the shield is used to veto the coincident event in the Ge detector, in a pair spectrometer the simultaneous detection of 511-keV γ-rays on opposite sides of the Ge detector is used to positively trigger (i.e., to select) the double escape peak in the Ge detector spectrum.

F. Measurements of Nuclear Level Lifetimes

Aside from the measurement of gamma-ray energies and intensities, and the determination of gamma-ray transition placements in nuclear level schemes by coincidence and Ritz Combination techniques, gamma-ray detectors can be used to measure another extremely important observable, namely, nuclear level lifetimes. These lifetimes are inversely proportional to squares of quantum mechanical quantities called transition matrix elements and therefore can directly reveal insights into nuclear structure and the properties of nuclear excitations.

Two general classes of techniques are used: those involved in directly measuring the time difference between successive gamma de-excitations in a nucleus and those based on Doppler effects. The former has traditionally been limited by the rise time of voltage pulses from detectors to the nanosecond range, but advances using faster scintillation detectors have pushed the frontiers of electronic time measurements farther down to nearly the picosecond range. Doppler techniques are typically used to measure lifetimes from hundreds of picoseconds down to the few femtoseconds range. These techniques cover a range of lifetimes characteristic of a wide variety of nuclear decays.

1. Recoil Distance and Doppler Shift Attenuation Methods

Typical recoil velocities following heavy-ion fusion-evaporation reactions are a few percent of the speed of light. The associated Doppler shifts of emitted gamma-rays can be used to obtain level lifetimes. The recoil velocity corresponds to an easily measured, maximum Doppler shift of about 20–30 keV in a 1000-keV gamma-ray. The fraction of the gamma-ray intensity which lies in the Doppler shifted peak can be related to the lifetime of the nuclear state. Two Doppler-based techniques are commonly used. The first, termed the Recoil Distance Method (RDM), utilizes two parallel foils separated by a distance d. The nuclei of interest are produced in the first, thin foil and recoil out of the foil with a well-defined recoil velocity. Having flown a distance d, they are rapidly stopped in the second, thicker stopper foil. If the nuclear state of interest decays while the nucleus is flying between the two foils, then the gamma-ray will be emitted with the appropriate Doppler shift. On the other hand, if the lifetime is long enough that the nucleus reaches the stopper foil and is stopped, the gamma-ray will be emitted from a nucleus at rest, without a Doppler shift. Changing the distance d between the foils can access different lifetime ranges. Typically, the RDM technique is used to probe lifetimes in the nanosecond to picosecond range.
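The dependence of the shifted-peak intensity on foil separation can be sketched for the simplest possible case. The example below assumes a single level that is populated promptly and decays exponentially with mean lifetime τ, a single recoil velocity, and no relativistic time dilation; real analyses must also treat feeding from higher states. The function name and the numerical inputs are hypothetical.

```python
import math

C_MM_PER_PS = 0.299792458  # speed of light in mm per picosecond

def fraction_decayed_in_flight(distance_mm: float, beta: float,
                               mean_lifetime_ps: float) -> float:
    """Fraction of decays occurring between the foils (Doppler shifted peak).

    Simplifying assumptions: prompt population of the level, one recoil
    velocity beta = v/c, and pure exponential decay.
    """
    flight_time_ps = distance_mm / (beta * C_MM_PER_PS)
    return 1.0 - math.exp(-flight_time_ps / mean_lifetime_ps)

# Hypothetical case: beta = 0.03 and a 10-ps level, for several foil separations.
for d_mm in (0.01, 0.05, 0.1, 0.5):
    shifted = fraction_decayed_in_flight(d_mm, 0.03, 10.0)
    print(f"d = {d_mm:5.2f} mm  shifted fraction = {shifted:.3f}")
```

Measuring the shifted fraction at several distances and fitting this curve is what turns a set of spectra into a lifetime; the useful distances are those for which the flight time is comparable to τ.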

The Doppler Shift Attenuation Method (DSAM) is similar to the RDM in that two foils are used. However, in this case, the foils are placed in intimate contact with each other. Now the recoiling nuclei immediately enter the second foil and begin to slow down and stop. If the nuclear lifetime of interest is of the same order of magnitude as the slowing down time of the nuclei in the foil, around 1–2 ps, then the gamma-ray transitions will be emitted with a range of Doppler shifts, ranging from the maximum shift down to zero. Level lifetimes may be extracted by carefully analyzing the resulting, complicated peak shapes and comparing them to model calculations. Of course the calculations also have to include the slowing down process itself. The DSAM is sensitive to level lifetimes on the order of picoseconds to femtoseconds, i.e., somewhat shorter than those accessible with the RDM.

2. The GRID Technique

Another Doppler-based method is used at the ILL in Grenoble, referred to earlier (Koch et al., 1980), using the ultra-high resolution crystal diffraction instruments GAMS4 and GAMS5. In this approach, a thermal neutron from a reactor is captured by a target nucleus which then emits a series of gamma-rays from the capture state (typically lying at an excitation energy of about 6 MeV) to lower lying states. Each emitted gamma-ray carries a small linear momentum, p = E/c. Hence, the emitting nucleus recoils in a direction opposite to that of the gamma-ray. If the same nucleus then emits a subsequent gamma-ray prior to stopping in the target material, the second gamma-ray will be Doppler shifted.

The shifts are exceptionally small. Typical recoil energies are a few electron volts, and therefore, the technique relies on measuring Doppler broadening effects of this order using crystal diffraction techniques. Note that one measures a Doppler broadening rather than a shift because the gamma-ray emission of the ensemble of nuclei is effectively isotropic. The technique is known as the GRID (Gamma-Ray Induced Doppler) technique (Borner and Jolie, 1993) and, like the DSAM, is useful for nuclear level lifetimes shorter than or on the order of the stopping time, typically ∼1 ps.
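The scale of the recoil follows from momentum conservation with p = E/c, which gives a recoil energy of E²/(2Mc²) and a recoil velocity v/c = E/(Mc²). The sketch below evaluates both; it approximates the nuclear mass as A atomic mass units, and the function names and example nuclei are hypothetical.

```python
AMU_MEV = 931.494  # atomic mass unit in MeV/c^2

def recoil_energy_ev(e_gamma_mev: float, mass_number: int) -> float:
    """Recoil kinetic energy E^2 / (2 M c^2), in eV, with M approximated as A u."""
    mc2_mev = mass_number * AMU_MEV
    return e_gamma_mev ** 2 / (2.0 * mc2_mev) * 1.0e6

def recoil_beta(e_gamma_mev: float, mass_number: int) -> float:
    """Recoil velocity v/c = E / (M c^2) from momentum conservation."""
    return e_gamma_mev / (mass_number * AMU_MEV)

# Hypothetical examples: gamma-rays of 1 and 6 MeV emitted by an A = 150 nucleus.
for e_mev in (1.0, 6.0):
    print(f"E = {e_mev:.0f} MeV  recoil energy = {recoil_energy_ev(e_mev, 150):7.2f} eV"
          f"  v/c = {recoil_beta(e_mev, 150):.2e}")
```

Recoil velocities of order 10⁻⁵ c translate into Doppler effects of only a few eV on an MeV transition, which is why crystal diffraction resolution is required.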

3. Fast Timing Spectroscopy

The coincidence measurements with Ge detectors described above are typically used to establish nuclear level schemes. Such measurements utilize coincidence resolving times on the nanosecond time scale (10⁻⁹ sec). However, the time response of BaF2 scintillation detectors is much faster than that of Ge detectors and, with special care, can be reduced to the few picosecond range. Hence, coincidence timing can also be used to directly measure nuclear level lifetimes in the few tens of picoseconds range, which is typical of the lifetimes of many collective excitations in medium mass and heavy nuclei. In practice, the technique, called FEST (Fast Electronic Scintillation Timing) [see Buescher et al. (1990) for a simplified discussion and for references to more technical literature], is most commonly used in β-decay experiments where the time is measured between the emission of a β-ray (detected in a thin, fast plastic scintillator) and the subsequent gamma-ray emission in the daughter nucleus.

The technique must be used with great care. One problem is that the BaF2 detectors have very poor energy resolution (∼10%). Additional gamma-ray selection, by coincidence with cascade gamma-rays using Ge detectors (with "normal" nanosecond timing), is normally needed to simplify the BaF2 spectra to one or two gamma-rays at most. Therefore, most applications are in low multiplicity experiments. Another serious problem relates to the energy dependence of the timing. A gamma-ray moves at the speed of light and in 3 ps travels ∼1 mm. Since typical BaF2 detectors have sizes on the order of centimeters, it is clear that the timing is sensitive to the exact position in the crystal where the gamma-ray absorption occurs. Hence, the time properties of such detectors are energy dependent and must be carefully calibrated. Nevertheless, the technique has proven to be quite useful in studies of nuclei off the line of nuclear stability in β-decay experiments.

III. GAMMA-RAY SPECTROSCOPY AND NUCLEAR STRUCTURE

The atomic nucleus is a unique, many-body quantum mechanical system. When describing nuclei, numbers of the order of 100 seem to occur frequently. For example, the depth of the potential holding the protons and neutrons, collectively known as nucleons, together is about 50 MeV. The maximum angular momentum the nucleus can hold before centrifugal forces break it apart is about 100 ħ, which occurs for nuclei around mass 100.

Typical nuclei have a few hundred constituent nucleons. This number implies that the nucleus occupies a unique position in the plethora of quantum systems found in nature. A few hundred particles grouped together is sufficient to allow one to contemplate macroscopic nuclear properties such as shape, surface area, and thickness. On the other hand, it is few enough that the addition or subtraction of a single proton or neutron can radically change the behavior of the whole system. Indeed, one of the appealing features of the nucleus is that it is a many-body quantal system in which the number of interacting bodies can be precisely controlled, measured, and varied. We will see a stunning example of the microscopic nature of the nucleus below when we discuss the phenomenon of backbending. This mixture of macroscopic and microscopic behavior in a strongly interacting system (the nucleons are after all bound together in the nucleus by the effects of the strong force) is nearly unique in nature.

The behavior of the nucleons inside the nucleus can be likened to the behavior of a herd of wild animals. The herd clusters together for protection, defining a shape and form. (The Hungarian word for such a herd is gulyas, so the nucleus is a bit like a goulash soup of nucleons.) However, the behavior of a single animal can have dramatic effects on the collective motion of the whole system. In the following sections, we will describe some of the features of the excited atomic nucleus and attempt to describe a few of the many manifestations of its macroscopic and microscopic behavior.

Generally speaking, atomic nuclei can be excited from the "bottom up" using reactions such as Coulomb excitation, inelastic scattering, or direct reactions, or from the "top down" using β-decay, neutron capture, and heavy-ion fusion-evaporation reactions. The former approach most often excites states selectively, while the latter approach is much less selective, tending to populate most states along a myriad of possible de-excitation routes, subject only to constraints due to angular momentum selection rules or phase space considerations. Gamma-ray spectroscopy is most often used in this second approach. When gamma-ray spectrometry is used in "bottom up" techniques, such as Coulomb excitation, it is exploited primarily as an indicator of the excitation probability of particular levels rather than as a study of de-excitation modes per se.

In this section, we will discuss a number of aspects of gamma-ray spectrometry. Although the distinction is a bit artificial, it is convenient, and historically pertinent, to break the discussion up into the study of low- and high-spin states.

A. Low-Spin States

The study of the low-spin nuclear states dates back to the beginning of nuclear structure and is the basis for our understanding of the equilibrium structure of nuclei and its evolution with nucleon number. Low-spin states are typically populated following β-decay, neutron capture, Coulomb excitation, or photon scattering reactions.

1. Beta-Decay

Nuclei formed off the valley of stability decay back toward stable nuclei via β-decay (which includes the processes of β−, β+, and electron capture decay). Typically, β-decay populates several excited levels in the daughter nucleus. Half-lives near stability range from seconds to days. Production of β-decay parent nuclei can be achieved by simple reactions such as (p, n) or by heavy-ion reactions. The simpler, lower energy reactions tend to form only one or a couple of parent nuclei, whereas heavy-ion reactions may form many times more, and, in that case, selection techniques are needed to select the decay products of interest.

FIGURE 7 Diagram of the Yale moving tape collector showing the target box, counting area, and tape holding box. Activity is deposited on the tape in the target box, with the beam entering from the left. It is then transported to the counting area. The holding box provides a delay to let unwanted extraneous activity decay away before the tape once more returns to the target box.

A popular technique in β-decay is the use of moving tape collectors in which the activity is collected on a tape (e.g., movie reel tape or aluminized Mylar) for some period of time (typically ∼1.8 times the half-life for the desired β-decay). The tape is then moved to a low background area for detection of gamma-rays following decay. Collection of a new activity at another spot on the tape proceeds simultaneously.
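The build-up of activity on the tape during collection can be illustrated with the standard growth law for a nuclide produced at a constant rate. The sketch below assumes a constant production rate and a single decaying species; the function name is hypothetical, and the calculation only shows how far toward saturation a collection of ∼1.8 half-lives takes the sample, not why that particular value is chosen.

```python
def saturation_fraction(collection_time: float, half_life: float) -> float:
    """Fraction of the saturation activity reached after a given collection time.

    Assumes constant production rate: A(t)/A_sat = 1 - 2**(-t / T_half).
    Both arguments must use the same time unit.
    """
    return 1.0 - 2.0 ** (-collection_time / half_life)

# Collection times expressed in units of the half-life.
for n_half_lives in (0.5, 1.0, 1.8, 3.0):
    print(f"t = {n_half_lives:.1f} T1/2  ->  {saturation_fraction(n_half_lives, 1.0):.3f} of saturation")
```

Beyond roughly two half-lives the activity on the tape grows only slowly, so longer collection mainly adds longer-lived contaminants rather than more of the activity of interest.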

Gamma-ray spectroscopy following β-decay was for many years in the 1950s–1970s a standard technique used to elucidate nuclear structure. Since β-decay itself carries off little or no angular momentum, the spin states accessible with this technique are generally those within ±2–3 ħ of the parent (ground or isomeric) state.

In recent years the technique has enjoyed a renaissance with the use of arrays of much higher efficiency Ge detectors (e.g., clover or cluster detectors). Since the gamma-ray multiplicity following β-decay is low and there is no Doppler effect, the detectors can often be mounted in close geometry to maximize count rates and achieve considerable coincidence efficiencies.

One current setup for such studies is the Yale moving tape collector (Casten, 2000). Illustrated in Fig. 7, it uses up to four Compton suppressed clover detectors that can be positioned at any angle in a horizontal plane. Studies with this instrument have included searches for possible multi-phonon states in 162Dy and 164Er. Nuclei with ellipsoidal shapes can undergo vibrational oscillations (called phonons) of these shapes about their equilibrium position. In principle, it is possible to superpose two or more identical vibrations. However, the effects of the Pauli Principle acting on the particles in the nucleus may destroy such states. One test of their intact character is to study their gamma decay. If they have predominantly a two-phonon character, then they should decay to the one-phonon state. Experimental searches for weak gamma-ray decay branches to the single phonon excitation are being carried out in these two nuclei.

Another application of β-decay exploits the fourfold segmentation of the clover detectors. In the Yale arrangement, four such detectors allow simultaneous coincidence measurements at a large number of different relative angles of emission between the two detected gamma-rays. These angular correlation measurements can be used to constrain spin arguments for levels in the gamma-ray cascade. With clover detectors situated at appropriate angles, it is also possible to exploit their segmentation to measure the linear polarization of the gamma-ray and thereby to deduce the parity relations of the nuclear levels involved (Duchene et al., 1999).

Finally, β-decay measurements are also an important tool in mass measurements, since, often, the daughter or granddaughter mass is known but not that of the parent. Nuclear masses (that is, in effect, binding energies) are of importance in a number of contexts. The binding energy reflects the sum of all the nucleonic interactions. Differences of binding energies for neighboring nuclei give the separation energy of the last nucleon and are therefore sensitive to single particle energies of nucleons in a mean field nuclear potential, as well as to shape and structure changes from one nucleus to the next. Mass measurements are also important for understanding the astrophysical processes occurring in the interiors of stars that lead to nucleosynthesis. Recent studies of nuclei in the mass A ∼ 70 region, for example, are helping to set constraints on the termination of the rapid proton capture process in certain classes of stars.

Nuclear mass measurements are carried out by measuring gamma-ray spectra in coincidence with β-particle detection in order to deduce the β-decay end point, that is, the maximum β-decay energy (where energy sharing with the simultaneously emitted anti-neutrino is insignificant). The end point energy directly gives the mass of the parent nucleus if the daughter mass is known. The gamma-ray coincidence is used to cleanly select the product nucleus of interest.

2. Coulomb Excitation

When a beam particle passes close to a target nucleus, one or both nuclei may be excited by the changing electromagnetic Coulomb field between them (without any nuclear reaction occurring). Usually, a series of low-spin levels of the target nuclei are excited. The excitation probabilities are deduced by observing the subsequent de-excitation gamma-rays. A typical Coulomb excitation experiment involves bombarding a target of the (stable) isotope to be studied with beams of particles (the beams used range from protons to very heavy ions) at beam energies of roughly 80% of the Coulomb barrier.

Coulomb excitation is a powerful technique to study nuclear structure. Since the excitation mechanism is purely electromagnetic, it is known and calculable. Therefore, one can extract nuclear information from the excitation probabilities. This is in contrast, for example, to inelastic scattering processes at beam energies above the Coulomb barrier where nuclear effects enter in both the excitation mechanism and the nuclear structure itself and must therefore be disentangled.

In typical Coulomb excitation experiments, to correctly account for Doppler effects, the gamma-rays are detected in coincidence with the scattered beam particle. As noted, the excitation probability is enhanced by smaller impact parameters, which often result in scattering at backward angles in the laboratory frame of reference. Hence, annular particle detectors are often placed at back angles (say, 140° ≤ θ ≤ 170°). These detectors allow the beam to pass through and then selectively identify those scattering events most likely to have resulted in nuclear excitations.

3. (n, γ ) Reactions

Historically, an immense amount of critical data on medium and heavy mass nuclei came from the study of radiative neutron capture, or (n, γ), reactions with reactor neutrons. Like other reactions, such as heavy-ion fusion-evaporation reactions or β-decay that populate nuclear levels from the top down, the process is non-selective and, therefore, gives access to a wide variety of nuclear states. Indeed, when used in the average resonance capture (ARC) mode, the technique can actually guarantee that all states in a given angular momentum and excitation energy range can be identified, thus providing very sensitive tests of models (Casten et al., 1980). Such states can be directly observed from the so-called primary transitions that de-excite the capture state. The use of pair spectrometers is important here. When low-energy gamma-ray spectra are studied, one typically observes hundreds of transitions. Therefore, gamma-gamma coincidence techniques are crucial. Alternatively, many of the most important (n, γ) studies have used the ultra-high energy resolution GAMS crystal diffraction detectors (see Fig. 3b). Studies of nuclei such as 196Pt (Cizewski et al., 1978) and 168Er (Davidson et al., 1981) with (n, γ) have provided some of the most comprehensive and complete level schemes ever produced and have provided key tests of nuclear models, such as the Interacting Boson Model. Today, most forefront (n, γ) work is carried out using the GRID technique to measure lifetimes with GAMS detectors.

B. High-Spin States

1. Backbending and the Pauli Principle

One of the fundamental questions to ask about a nucleus is: What is the shape? Is the nucleus spherical, like a soccer ball, or deformed, stretched out like an American football or perhaps flattened like a Frisbee? It turns out that some nuclei are spherical; some are deformed like footballs; and some are deformed like Frisbees. The excitation spectrum of deformed nuclei is particularly easy to understand. A deformed system has a defined orientation in space (it is not isotropic), and rotations of this shape can be observed. A quantum mechanical rotor has an excitation energy given by

$$E(I) = \frac{\hbar^2}{2J}\,I(I+1),$$

where I is the quantum number counting the angular momentum of the state (I = 0, 2, 4, . . . for the ground state rotational band; the odd spins are missing for symmetry reasons which are not relevant here) and J is the moment of inertia of the nucleus. The gamma-ray energy (which is measured in the experiment) is just the energy difference between adjacent states:

$$E_\gamma(I \to I-2) = \frac{\hbar^2}{2J}\left[I(I+1) - (I-2)(I-1)\right] = \frac{\hbar^2}{2J}\left[4I-2\right].$$

Thus, the gamma-ray energy increases linearly with angular momentum. For successive gamma-rays in the band, the energy difference is given by

$$\Delta E_\gamma = E_\gamma(I \to I-2) - E_\gamma(I-2 \to I-4) = \frac{\hbar^2}{2J}\left[(4I-2) - (4(I-2)-2)\right] = \frac{4\hbar^2}{J}.$$

If the moment of inertia, J, does not change, then ΔEγ is a constant, independent of spin. Usually, this is not the case in nuclei. A rare example of a nearly ideal rotational band, where the spacing between adjacent transitions is constant, is shown in Fig. 8 (the most intensely populated superdeformed band in 150Gd). However, usually, dramatic changes in structure occur (e.g., due to centrifugal forces or quenching of pairing) as a nucleus rotates faster and faster. These are manifest as deviations from the simple linear dependence outlined above. For example, a spectrum of the ground state rotational band of 158Er is illustrated in Fig. 9, where the lines indicate transitions linking states with increasing spin. Notice that at gamma-ray energies of about 400 keV the transitions double back on themselves. This phenomenon is called backbending and corresponds to a dramatic change in the internal structure of the nucleus.
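The rigid-rotor relations can be checked numerically. The sketch below builds the level energies from E(I) = (ħ²/2J) I(I+1), forms the transition energies, and confirms that successive transitions are equally spaced by 4ħ²/J. The function names are hypothetical, and the rotational constant ħ²/2J = 15 keV is an arbitrary illustrative value, not one taken from the text.

```python
from typing import List

def rotor_energy(spin: int, hbar2_over_2j_kev: float) -> float:
    """Level energy E(I) = (hbar^2 / 2J) * I * (I + 1)."""
    return hbar2_over_2j_kev * spin * (spin + 1)

def transition_energies(max_spin: int, hbar2_over_2j_kev: float) -> List[float]:
    """Gamma-ray energies E(I) - E(I - 2) for I = 2, 4, ..., max_spin."""
    return [rotor_energy(i, hbar2_over_2j_kev) - rotor_energy(i - 2, hbar2_over_2j_kev)
            for i in range(2, max_spin + 1, 2)]

A_KEV = 15.0  # hypothetical value of hbar^2 / 2J
gammas = transition_energies(10, A_KEV)
spacings = [hi - lo for lo, hi in zip(gammas, gammas[1:])]

print("transition energies (keV):", gammas)
print("spacings (keV):", spacings)          # each equals 8 * A_KEV = 4 hbar^2 / J
```

In a real nucleus the spacings drift as the moment of inertia changes with spin, and a backbend appears as a sequence in which the transition energies stop increasing and temporarily decrease.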

The origin of this structural change lies in the effects of the familiar Coriolis force on the microscopic structure of the nucleus. As we have stressed, the nucleus is not a rigid body, but instead is made up of only a few hundred protons and neutrons that orbit the center of mass in orbits characterized by particular angular momenta. We know that many medium mass and heavy nuclei exhibit properties similar to those of a superconductor. In the ground state of an even-even nucleus, all of the protons are coupled pairwise, in identical but time-reversed orbits, so that the total angular momentum of each pair is zero. Similarly, the neutrons are also paired. Hence, the total angular momentum of the ground state of any even-even nucleus is zero. As an interesting aside, it follows that in an odd-proton or odd-neutron nucleus, the ground state spin and parity are usually determined by the quantum numbers of the final unpaired proton or neutron.

The question, therefore, becomes, what happens to these pairs of protons and neutrons as the nucleus as a whole begins to rotate? Just as a person walking on a merry-go-round experiences a force on the moving platform, the so-called Coriolis force, the nucleons in the nucleus also experience the effect of the rotating bulk. Just as with the merry-go-round, the Coriolis force increases the faster the nuclear rotation or the orbital velocity. Indeed, the size of the Coriolis force is such that at moderate nuclear rotational frequencies it perturbs the orbits of the particles sufficiently that the pairs of nucleons will begin to break apart. This has the effect of dramatically changing the excitation energies of the states and the gamma-ray energies for transitions between them. It is this breaking of the superconducting pairs that is responsible for the backbending observed in Fig. 9.

An illustration of the effects of the Pauli exclusion principle can be seen in the rotational spectra of odd-even nuclei. Figure 10 is a plot of the angular momentum as a function of rotational frequency for 133Pr, which has 59 protons and 74 neutrons. The two curves shown in Fig. 10 correspond to rotational bands in which the final unpaired proton is in different orbits about the nucleus. In one of these cases the odd-proton acts like a spectator to the underlying even-even nucleus, and in this case the above backbending phenomenon occurs as before. In the other band, however, the odd-proton occupies one of the orbits of the pair of aligning nucleons. The pair breaking is therefore prohibited by the Pauli exclusion principle, and the backbending is delayed until higher rotational frequencies when it becomes possible to occupy higher lying orbits.

FIGURE 8 Spectrum of the most intensely populated superdeformed band in 150Gd. In addition to the regular picket-fence pattern of gamma-rays associated with decays of superdeformed states, the spectrum also shows, at lower energies, the complex pattern of transitions depopulating the nearly spherical normal deformed states in 150Gd.

2. Superdeformation

One of the forefront areas of research in high-spin nuclear structure physics over the last decade has been the study of superdeformed (SD) nuclei. These states exist in a second minimum in the nuclear potential energy surface in which the nucleus takes on an ellipsoidally deformed shape which roughly corresponds to an integer ratio of major to minor axes, typically 2:1 or 3:2. The observation of the first high-spin SD bands in 152Dy and 132Ce, by the Liverpool University groups of Peter Twin and Paul Nolan, respectively (Twin et al., 1986; Nolan et al., 1985), sparked an enormous worldwide effort to discover additional examples of highly deformed nuclei and to characterize the properties of such highly stressed systems (stressed both by the application of very high angular momenta and by extreme values of deformation). Today, about 40 nuclei in four main mass regions, or islands, have been shown to exhibit SD behavior. Most of these nuclei have more than one known SD band.

Superdeformed rotational bands are generally characterized by extremely regular gamma-ray energy spacing. The energy spacing from one transition to the next in the rotational band is either constant or varies slowly and regularly along the band. For example, the strongest SD band in 150Gd is illustrated in Fig. 8. The regular picket-fence-like pattern of SD transitions is


FIGURE 9 Spectrum of the ground state rotational band of 162Er, illustrating the backbending phenomenon. (Figure courtesy of Mark Riley.)

unmistakable in this spectrum, as is the irregular pattern of transitions de-exciting lower lying normal deformed states (150Gd is nearly spherical in its ground state).

Due to this regularity, which is the rule rather than the exception for SD bands, one can feel confident in predicting where transitions in a given band should occur. However, detailed measurements of the strongest SD band in 149Gd, using the Eurogam Ge detector array, revealed a very small deviation from this smooth behavior (Flibotte et al., 1993). Indeed, it was found that successive energy spacings were alternately larger and smaller than the average. The deviation, illustrated in Fig. 11, is very small, only about 0.25 keV, and is measurable only due to the very high quality spectra available from the Eurogam array. It is believed that the deviation is caused by alternate states in the rotational band being perturbed up and down in energy by very small amounts, on the order of 60 eV. This staggering essentially separates the rotational band into two ΔI = 4ħ sequences. The origin of the perturbation, which affects states differing in spin by 4ħ, is still unclear. Several theoretical models have been proposed to explain this phenomenon, none of which, however, can reliably predict which SD bands should exhibit staggering and which should not.
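The link between a level perturbation of about 60 eV and a spacing deviation of about 0.25 keV can be illustrated with a toy model. The sketch below is only an illustration under stated assumptions: it treats the band as an ideal rigid rotor, E(I) = A I(I + 1), shifts alternate levels up and down by a fixed amount, and uses a hypothetical rotor constant and hypothetical spins that are not taken from the measurements discussed above.

```python
# Toy model: rigid-rotor band with alternate levels shifted by +/- delta.
# A_KEV and the spin list are hypothetical illustration values, not data.
A_KEV = 6.0        # assumed rotor constant hbar^2 / (2 J), in keV
DELTA_KEV = 0.060  # alternating level shift of 60 eV, as quoted in the text


def band_levels(spins, a_kev, delta_kev=0.0):
    """Level energies A*I*(I+1), with every other level shifted by +/- delta."""
    return [a_kev * s * (s + 1) + delta_kev * (-1) ** k
            for k, s in enumerate(spins)]


def differences(values):
    """Differences between consecutive entries of a list."""
    return [hi - lo for lo, hi in zip(values, values[1:])]


spins = list(range(30, 62, 2))  # hypothetical Delta-I = 2 sequence

# Gamma-ray energies are level differences; spacings are gamma differences.
smooth_spacings = differences(differences(band_levels(spins, A_KEV)))
staggered_spacings = differences(
    differences(band_levels(spins, A_KEV, DELTA_KEV)))

# Deviation of each spacing from the smooth rotor value (8*A for this model).
deviations = [s - r for s, r in zip(staggered_spacings, smooth_spacings)]

if __name__ == "__main__":
    print("smooth spacing (keV):", smooth_spacings[0])
    print("spacing deviations (keV):", [round(d, 3) for d in deviations])
```

In this idealized picture each gamma-ray energy shifts by 2δ and each spacing by 4δ with alternating sign, so a 60 eV level shift gives spacing deviations of about 0.24 keV, the same order as the value quoted above.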

3. Magnetic Rotation and Chiral Symmetry

Interesting effects have also emerged from the study of near-spherical nuclei. One of the consequences of quantum mechanics is that the rotation of a spherical shape


FIGURE 10 Component of the angular momentum along the rotation axis vs rotational frequency for different rotational bands in 133Pr. Notice that the backbending observed at a frequency of ∼0.25 MeV is completely missing in one of the bands. The absence of a crossing in this band is a dramatic illustration of the Pauli exclusion principle.

cannot be observed. How then does a spherical or near-spherical nucleus generate angular momentum? Rather than a collective rotation of the whole shape, it does so by rearranging the orbits of its constituent protons and neutrons, by single particle excitations to higher lying excited states with high values of angular momentum. Typically, such excitations have irregularly spaced energies resulting in a gamma-decay spectrum with many irregularly spaced peaks (see the lower energy portion of Fig. 8).

It was a surprise, therefore, when regularly spaced sequences of gamma-rays were observed in some almost spherical light Pb nuclei, near the doubly closed shell 208Pb. Furthermore, these apparently rotational-like cascades were found to consist of very strong magnetic dipole (M1) transitions which change the angular momentum by ΔI = 1ħ, with very weak, or unobserved, ΔI = 2ħ electric quadrupole transitions (E2). In contrast, a rotational band in a well-deformed nucleus consists of a sequence of strong ΔI = 2ħ E2 transitions. The absence of E2 transitions in these new bands is an indication of the near-spherical nuclear shape. However, the regularity of the new band structure implies a type of collective behavior.

The tilted axis-cranking model provides an explanation for these bands (Frauendorf, 1993). For certain near-spherical nuclei with proton and neutron particle numbers close to magic numbers, the angular momentum vectors of the unpaired proton particles and neutron holes prefer to align perpendicular to each other, with one vector pointing along the rotation axis and the other perpendicular to the rotation axis. The vector sum of these two angular momenta then lies at an angle to the nuclear symmetry axis. Furthermore, this vector sum accounts for almost all of the nuclear angular momentum, since the collective rotation of the near-spherical shape is small. Higher angular momentum states are generated by slowly closing these two angular momentum blades, or shears, pushing against the repulsive particle-hole nuclear interaction. The enhanced M1 transitions arise because the magnetic dipole moment is proportional to the component of the individual proton and neutron angular momenta perpendicular to the total angular momentum.

An interesting extension of the idea of tilted axis cranking comes when we consider the possibilities in doubly odd deformed, triaxial nuclei. As for shears bands, for certain favorable particle numbers the angular momenta of the final unpaired proton and neutron align preferentially perpendicular to each other, along the nuclear rotation (short) and symmetry (long) axes. For a triaxial nuclear shape, considerations of irrotational flow indicate that the collective angular momentum should align preferentially with the intermediate length nuclear axis. Thus, the three angular momentum vectors can form either a

FIGURE 11 Energy staggering in the strongest SD band in 149Gd (Flibotte et al., 1993). The figure shows the deviation of the measured gamma-ray energy from a smooth reference as a function of rotational frequency. Notice that the deviation is extremely small, usually less than 0.25 keV. This deviation corresponds to a tiny perturbation in the nuclear energy levels of only about 60 eV.


FIGURE 12 A partial level scheme of the odd-odd nuclei 136Pm (61 protons and 75 neutrons) and 138Eu (63 protons and 75 neutrons). The proposed chiral twin bands are shown on the left of each level scheme.

right- or a left-handed coordinate system. The so-called 3d-tilted axis-cranking model, developed by S. Frauendorf and J. Meng (1997), addresses such a system and predicts a doubling of energy levels, one corresponding to each chirality or handedness. For complete symmetry, the levels of the same spin and parity would be degenerate. If the solutions for different chiralities mix, then the degeneracy will be broken, and one set of states, corresponding to a ΔI = 1ħ rotational band, will be lifted with respect to the second band. Indeed, two ΔI = 1ħ bands in the doubly-odd nucleus 134Pr have been proposed as a possible chiral candidate (Frauendorf and Meng, 1997). Following this suggestion, several other candidate bands have been observed in nearby nuclei (Starosta et al., 2001; Beausang et al., 2001; Hecht et al., 2001), while candidate bands have also recently been reported in doubly-odd 188Ir (Balabanski et al., unpublished). The proposed chiral twin bands in 136Pm and 138Eu are shown in Fig. 12.

C. Spectroscopy in Coincidence with Separators

A great deal of exciting new spectroscopy of nuclei far from stability or with very large Z has been achieved over the last several years when large Ge detector arrays have been coupled to high-transmission magnetic separators. A magnetic separator is a device placed behind the target position which will selectively transport nuclei, produced in a reaction, to its focal plane where they can be detected and identified using a variety of different detectors. Residual nuclei that are not of interest, or scattered beam particles, will not be transmitted through the separator. Very small fractions of the total reaction cross section can be selected using this method. Nuclear structure information is obtained by detecting gamma-rays produced at the target position, in coincidence with recoils detected at the focal plane. One example of the use of this technique is illustrated here.

One of the goals of nuclear physics is to understand the limits of nuclear existence as functions, for example, of angular momentum, isospin, or indeed mass. For example, what are the heaviest nuclei that can exist? For many years now, various models have predicted that an island of superheavy nuclei should exist. However, most models disagree as to the exact proton and neutron numbers characterizing this island and indeed on the extent of the island. Recently, models have predicted that these superheavy nuclei might indeed be deformed. Therefore, it is very relevant to inquire as to what is the structure of the heaviest nuclei accessible to gamma-ray spectroscopy and to ask the simplest type of questions about them, for example, are they spherical or deformed? Unfortunately, the production cross sections for superheavy nuclei are such that, even using very intense beams, only one or two nuclei are produced every week or two. These small numbers are clearly beyond what we can measure with existing gamma-ray facilities. Therefore, we cannot address the spectroscopy of the superheavy elements (yet). However, we can look at the structure of very heavy nuclei lying just below these unattainable regions.

Recently, groups at Argonne National Laboratory in the United States and at the University of Jyväskylä in Finland carried out tour de force experiments to study the excitation spectrum of 254No (Leino et al., 1999). With Z = 102, 254No is the heaviest nucleus for which gamma-ray spectroscopy has ever been carried out. The gamma-ray spectrum of transitions de-exciting states in 254No is shown in Fig. 13. A rotational band structure is clearly visible, indicating that 254No is in fact a deformed nucleus. A very surprising feature of the spectrum is that the rotational band is observed up to very high spin, ∼18ħ (an amazing number for such a heavy, fissile nucleus). The existence of a rotational cascade up to spin ∼18ħ, well beyond the classical fission barrier limit, indicates that 254No is held together primarily by microscopic shell effects, rather than macroscopic liquid drop binding, as in normal nuclei. Shell effects, for certain favorable proton and neutron numbers and for favorable deformation, can provide an additional 1–2 MeV of binding energy. It is this binding energy, which does not depend strongly on angular momentum, that holds 254No together to such high spin.

D. Experiments with Radioactive Beams

Today, a new era in nuclear structure physics is opening up with access to a much wider selection of nuclei, extending far beyond the valley of stability and encompassing nuclei that are expected to be exotic in both proton/neutron composition and structure. The physics opportunities with such beams have been discussed elsewhere (RIA Physics White Paper, 2000) and need not be repeated here. What are relevant are the particular methods of carrying out gamma-ray spectroscopy on exotic nuclei. Basically, the techniques to be used will be familiar ones, such as β-decay, Coulomb excitation, and fusion-evaporation reactions. High-, medium-, and low-spin states will all present topics of interest.

Experiments with radioactive beams differ primarily in two critical respects from their stable beam siblings. First, beam intensities will often be much lower than with stable beams. Instead of beams of 10¹¹ particles per second, many experiments will need to be carried out with intensities that are less than 10⁶ particles per second and, at the limits of accessibility, down to 1 particle per second or even less. Therefore, detectors will have to be correspondingly more efficient. Second, because the nucleus to be studied is sometimes the one produced as a beam by the radioactive beam facility, most experiments will be done in inverse kinematics in which the roles of beam and target are interchanged.

In inverse kinematics, m_b > m_t, where m_b and m_t are the masses of the beam and target nuclei. Therefore, the reaction products all go forward in the laboratory system. For m_b ≫ m_t, this forward focusing results in a quite narrow cone of reaction products. For example, for elastic scattering of 62Ni on 12C, the maximum allowed scattering angle is ∼10°. This has two principal effects. First, measuring angular distributions of reaction products is much more difficult. Second, on the other hand, it is possible to capture much larger percentages of the reaction products in the acceptance angles of various types of charged particle spectrometers and mass separators, thereby enhancing counting rates. These considerations impose design constraints on gamma-ray detectors surrounding the target. First of all, ultra-high efficiency is needed. Second, generally, a forward angled cone needs to be left free of detectors.
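The size of the forward cone can be estimated from standard nonrelativistic two-body kinematics: a projectile heavier than the target cannot be elastically scattered beyond a laboratory angle θ_max given by sin θ_max = m_t / m_b. The sketch below is only an estimate under stated assumptions. It uses mass numbers as stand-ins for the nuclear masses and ignores relativistic corrections; the function name is chosen here for illustration.

```python
import math


def max_lab_scattering_angle_deg(m_beam, m_target):
    """Largest lab angle of an elastically scattered projectile, in degrees.

    Nonrelativistic two-body kinematics: for m_beam > m_target the limit is
    asin(m_target / m_beam); otherwise every angle up to 180 degrees is open.
    """
    if m_beam <= m_target:
        return 180.0
    return math.degrees(math.asin(m_target / m_beam))


# 62Ni beam on a 12C target, with mass numbers used as approximate masses.
theta_max_ni_on_c = max_lab_scattering_angle_deg(62, 12)

if __name__ == "__main__":
    print(f"62Ni on 12C: theta_max = {theta_max_ni_on_c:.1f} degrees")
```

With these inputs the limit comes out at roughly 11°, in line with the ∼10° quoted above.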

The requirement of maximal gamma-ray counting efficiency generally means a close geometry and detectors that subtend large solid angles. However, Doppler effects can then be very large, especially when using inverse kinematics, and high detector granularity will generally be critical. This granularity can currently be achieved in two ways and considerable development in both directions is needed. One is the use of highly segmented tracking arrays such as GRETA discussed earlier. The other is the use of position-sensitive Ge detectors of the type developed by Glasmacher and colleagues for use in intermediate energy Coulomb excitation experiments at MSU (Muller et al., in press). In these detectors a resistive readout at the two ends of a linear Ge crystal allows the localization of the γ-ray interaction to an accuracy of ∼2 mm. This detector is capable of measurements even with beam intensities of ∼1 particle per second or even less. An example of a gamma-ray spectrum (corrected for Doppler effects) from intermediate energy Coulomb excitation taken with an early generation detector system (using NaI(Tl) detectors) is shown in Fig. 14. These data were taken on 40S in order to test predictions of the underlying particle motion in exotic nuclei with a high excess of neutrons over protons (the heaviest stable isotope of sulfur has 20 neutrons, 40S has 24 neutrons). The Coulomb excitation was accomplished in this case using a 197Au target.
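Why granularity matters can be seen from the relativistic Doppler formula for a gamma-ray emitted by a nucleus moving with velocity β (in units of c) and detected at laboratory angle θ to the beam: E_lab = E_0 √(1 − β²) / (1 − β cos θ). The following is a minimal sketch of the shift and of its correction. The transition energy, velocity, and angles are hypothetical illustration values, not parameters of the experiments described above.

```python
import math


def doppler_shifted_energy(e_rest, beta, theta_deg):
    """Lab-frame energy of a gamma-ray emitted by a source moving at beta."""
    cos_theta = math.cos(math.radians(theta_deg))
    return e_rest * math.sqrt(1.0 - beta ** 2) / (1.0 - beta * cos_theta)


def doppler_corrected_energy(e_lab, beta, theta_deg):
    """Invert the shift: recover the rest-frame energy from the lab energy."""
    cos_theta = math.cos(math.radians(theta_deg))
    return e_lab * (1.0 - beta * cos_theta) / math.sqrt(1.0 - beta ** 2)


# Hypothetical example: a 1000 keV transition, beta = 0.3, detector at 30 deg.
E_REST_KEV, BETA, THETA_DEG = 1000.0, 0.3, 30.0
e_lab = doppler_shifted_energy(E_REST_KEV, BETA, THETA_DEG)

# Energy spread across a detector that spans +/- 5 degrees around THETA_DEG;
# without position information this spread cannot be corrected away.
spread_kev = (doppler_shifted_energy(E_REST_KEV, BETA, THETA_DEG - 5.0)
              - doppler_shifted_energy(E_REST_KEV, BETA, THETA_DEG + 5.0))

if __name__ == "__main__":
    print(f"lab energy: {e_lab:.1f} keV, spread over detector: {spread_kev:.1f} keV")
```

The spread shrinks as the angular range covered by a single detector element shrinks, which is the motivation for segmented or position-sensitive detectors.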

For specialized experiments, such as low-energy Coulomb excitation in inverse kinematics designed explicitly to excite only the lowest one or two states,


FIGURE 13 Spectrum of gamma-rays depopulating excited states in 254No. With Z = 102, 254No is the heaviest nucleus for which gamma-ray spectroscopy has ever been accomplished. The gamma-rays are labeled by their transition energies in kilo-electron volts and also by the spin of the state they depopulate. The inserts show the population intensity as a function of spin for the two beam energies, 215 (top) and 219 MeV (bottom). More angular momentum is brought into the system at higher beam energies, and this is reflected in the stronger population of higher spin states in the lower spectrum (Leino et al., 1999).

the gamma-ray spectra are particularly simple. Therefore, energy resolution is not a problem and high efficiency can be obtained with, for example, low-resolution NaI(Tl) detectors placed in close geometry. Here, Doppler effects are both unimportant and undetected. One typical design, the GRAFIK detector (Sheit et al., 1996), actually incorporates the target inside an annular hole in the detector, achieving ∼80% of 4π solid angle coverage.


FIGURE 14 Gamma-ray spectra following intermediate energy Coulomb excitation of a radioactive 40S beam on a 197Au target. The spectra are corrected for Doppler effects for gamma-rays emitted from nuclei at rest in the laboratory frame (top) or from nuclei moving at the beam velocity (bottom). The gamma-ray de-exciting the first excited state of 40S is clearly visible in the lower spectrum. [From Muller et al. (in press). Nucl. Instrum. Methods.]

IV. CONCLUSIONS

Although the dominant interaction binding nucleons into nuclei is the strong force, the electromagnetic interaction, as manifested primarily in gamma-ray spectroscopy, provides an ideal probe of the structure and excitations of the nucleus. Indeed, gamma-ray spectroscopy in nuclear and astrophysics research is a broad and diverse field, utilizing a variety of detector systems that vary greatly (according to the needs of particular experiments) in resolution, efficiency, gamma-ray energy range, and other properties. These detectors are often used alone or in conjunction with auxiliary devices such as charged particle or neutron detectors. The areas of nuclear structure addressed with such instrumentation cover the whole gamut of nuclei spanning the entire nuclear chart and physics problems, ranging from the motion of individual nucleons to collective flows (e.g., vibrations or rotations) of the nucleus as a whole.

ACKNOWLEDGMENT

We are grateful to many colleagues for advice and discussions and, in particular, to Mark Caprio, Thomas Glasmacher, Hans Borner, and Mark Riley for providing the figures we have used. This work was supported in part by the U.S. DOE grant number DE-FG02-91ER-40609.

SEE ALSO THE FOLLOWING ARTICLES

GAMMA-RAY ASTRONOMY • ION BEAMS FOR MATERIAL ANALYSIS • NUCLEAR PHYSICS • POTENTIAL ENERGY SURFACES

BIBLIOGRAPHY

Barton, C. J., et al. (1997). Nucl. Instrum. Methods A391, 289.
Balabanski, D., et al., to be published.
Beausang, C. W., et al. (1992). Nucl. Instrum. Methods A313, 37; Beck, F. A., et al. (1992). Prog. Part. Nucl. Phys. 28, 443; Nolan, P. J. (1990). Nucl. Phys. A520, 657c.
Beausang, C. W., et al. (2000). Nucl. Instrum. Methods A452, 431.
Beausang, C. W., et al. (2001). Nucl. Phys. A682, 394c.
Buescher, M., et al. (1990). Phys. Rev. C41, 1115; Mach, H., et al. (1990). Phys. Rev. C41, 1141.
Borner, H. G., and Jolie, J. (1993). J. Phys. G19, 217.
Casten, R. F., et al. (1980). Phys. Rev. Lett. 45, 1077.
Casten, R. F. (2000). Nucl. Phys. News Int. 10, 4.
Casten, R. F., and Nazarewicz, W. (2000). “White Paper for the RIA Workshop, Raleigh-Durham, North Carolina, July 24–26, 2000.”
Cizewski, J. A., et al. (1978). Phys. Rev. Lett. 40, 167.
Davidson, W. F., et al. (1981). J. Phys. G7, 443, 455.
Deleplanque, M. A., et al. (1999). Nucl. Instrum. Methods A430, 292.
Duchene, G., et al. (1999). Nucl. Instrum. Methods A432, 90.
Eberth, J., et al. (1996). Nucl. Instrum. Methods A369, 135.
Frauendorf, S. (1993). Nucl. Phys. A557, 259c.
Frauendorf, S., and Meng, J. (1997). Nucl. Phys. A617, 131.
Flibotte, S., et al. (1993). Phys. Rev. Lett. 71, 4299.
Gerl, J., and Lieder, R. (1992). “Euroball III,” GSI Darmstadt Report. Darmstadt, Germany.
Hecht, A., et al. (2001). Phys. Rev. C63, 051302(R).
Johnson, A., Ryde, H., and Sztarkier, J. (1971). Phys. Lett. B34, 605.
Koch, H. R., et al. (1980). Nucl. Instrum. Methods 175, 401; Kessler, E. G., et al. (2001). Nucl. Instrum. Methods A457, 187.
Lee, I. Y. (1990). Nucl. Phys. A520, 641c; Deleplanque, M. A., and Diamond, R. M., eds. (March 1988). “The Gammasphere Proposal: A National Gamma-Ray Facility,” LBL, Berkeley, CA.
Leino, M., et al. (1999). Eur. Phys. J. A6, 63; Reiter, P., et al. (1999). Phys. Rev. Lett. 82, 509.
Morinaga, H., and Gugelot, P. C. (1963). Nucl. Phys. 46, 210.
Muller, W. F., et al. (in press). Nucl. Instrum. Methods.
Nolan, P. J., Gifford, D. W., and Twin, P. J. (1985). Nucl. Instrum. Methods A236, 95.
Nolan, P. J., et al. (1985). J. Phys. G11, L17.
Riedinger, L. L., et al. (1980). Phys. Rev. Lett. 44, 568.
Sheit, H., et al. (1996). Phys. Rev. Lett. 77, 3967.
Simpson, J., et al. (2000). Heavy Ion Phys. 11, 159.
Starosta, K., et al. (2001). Phys. Rev. Lett. 86, 971.
Twin, P. J., et al. (1986). Phys. Rev. Lett. 57, 811.


P1: GNB Final Pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN007G-315 June 30, 2001 17:33

High-Pressure Research
E. F. Skelton
A. W. Webb
Naval Research Laboratory

I. Introduction
II. High-Pressure Research Environments
III. Measurement of Pressure
IV. Research at High Pressure

GLOSSARY

Bridgman anvil Anvil design used originally with very hard metallic anvils and most recently with diamond anvils employing the principle of massive support.

Diamond cell Compact device used for generating pressures to 500 GPa in microscopic samples with the use of gem quality diamond anvils.

Equation of state Mathematical expression describing the relationship between the volume, mass, temperature, and pressure of a system under conditions of thermodynamic equilibrium.

Hugoniot curve The locus of points in either the shock velocity-particle velocity plane or the stress-volume plane derived from experimental measurements.

Phase diagram Graph combining two thermodynamical variables, such as pressure and temperature, of a material drawn so that a particular curve represents the boundary between two phases of the material.

Shock compression State in matter achieved by the passage of a very large amplitude mechanical pulse of disturbance through a material for necessarily brief durations.

PRESSURE is an important thermodynamical variable; it provides the most efficient means of altering interatomic distances while leaving the thermal energy of the system invariant. It therefore provides an important mechanism for testing theoretical models that are based upon interatomic separations and crystallographic configurations. Pressure can also be used along with temperature to assist chemical reactions or to bring about crystallographic phase transformations. New allotropes, formed under conditions of extreme pressure or temperature, may have physical properties that are significantly different from those of the material formed under normal conditions. A classic example is carbon: the hardness, electrical and thermal conductivities, and transparency of diamond, the phase of carbon formed at elevated pressures and temperatures, are significantly different from those of graphite, the phase of carbon that is stable under normal conditions. Another example of how pressure has been effective in producing a new and better polymorph is that of Nb3Si. Empirical arguments have suggested that if this material could be formed in the cubic A15 structure, rather than in its normal tetragonal Ti3P phase, it would exhibit superior superconducting properties. High-pressure shock treatments have been successful in producing this transformation. In this cubic phase, the superconducting transition temperature of Nb3Si is 18.5–19 K, as compared to 0.29 K in the tetragonal phase.

The role of pressure in understanding physical properties of materials can also be of importance. An example where it may be vital is in understanding the origin of superconductivity in certain organic charge transfer salts based on the cation molecule ditetramethyltetraselenofulvalenium (TMTSF), since all but one of these salts are superconducting only at elevated pressures.

I. INTRODUCTION

A. Definition of Pressure

Pressure is defined as the ratio of a force to the area over which that force acts; thus, the units of pressure, force per unit area, are newtons per square meter (=1 Pa), dynes per square centimeter (=10⁻⁶ bar), or pounds per square inch (at sea level 1 atm pressure = 1.0133 × 10⁵ Pa = 1.0133 bar = 14.696 lb in.⁻²). In this article we shall use the S.I. unit of pressure, the pascal, abbreviated Pa. Since the pressure, or more correctly the stress state, will vary depending on the direction of the force relative to the area over which it is applied, in a strict sense, it is necessary to consider the six independent components of the stress tensor. In terms of a working definition, however, most researchers usually presume that they are dealing with hydrostatic pressure or something close to it, in which case the three diagonal elements of the stress tensor are the same value and all off-diagonal terms, which represent the shear components, are zero.

In modern high-pressure research, the force is usually transmitted to the sample of interest via some medium. If that medium is a fluid, then a true hydrostatic pressure environment does exist, although today much high-pressure research is carried out in the range of tens of gigapascals or above (1 GPa = 10 kbar), where few materials remain in the fluid state. Consequently, much work is done today under conditions of quasi-hydrostatic pressure (i.e., since the forces are transmitted through solidified fluids or relatively soft solids, shear stress components can be a problem).

Natural pressures found in our universe range from 100 kPa in the atmosphere at sea level, to about 100 MPa at the deepest part of the oceans (the bottom of the Mariana Trench in the Pacific Ocean), to 0.36 TPa at the center of the earth, to tens of terapascals at the center of our sun. Today, synthetic, or man-made, pressure environments can be produced in the laboratory to span this range. In terms of static pressures, the relatively large volume systems (∼50 cm³) can achieve pressures up to 1 GPa, while the more recently developed diamond-anvil cells can be used to subject microscopic-size samples (∼10⁻⁸ cm³) to pressures in excess of 500 GPa. Each of these systems is discussed in more detail below.

Much higher pressures can be achieved for brief periods of time by using conventional or nuclear explosive-driven shock waves. Current research in this area also involves the study of shock waves produced by high-powered laser beams. These shock pressures are usually accompanied by significant elevations in the sample temperature. The pressures achieved by shock waves can be in the range of a few terapascals.

B. Historical Review

The father of modern high-pressure research is Percy W. Bridgman, a man who dedicated his professional life to high-pressure experimentation. Working at Harvard University, he published over 200 research papers and, in addition to a wealth of basic scientific research, was also responsible for several important technological discoveries: the principle of unsupported area, known today as the Bridgman seal, and the principle of massive support, upon which the Bridgman anvil is based. In 1946 he was awarded the Nobel Prize for his pioneering work in this field.

The work following Bridgman can be divided into three parts based on the experimental technique employed: dynamic or shock pressures, large-volume studies, and diamond-anvil cell work. Each of these methods will be discussed in detail in the body of this article.

II. HIGH-PRESSURE RESEARCH ENVIRONMENTS

A. Static Pressures

1. Fluid Systems

The application of pressure by means of a fluid, whether liquid or gas, has generally proven to be the most satisfactory method. These hydrostatic environments will not subject a test specimen directly to shear stresses; however, the reaction in the sample is often anisotropic and internal shear components can exist. The maximum pressure obtainable in a fluid system is limited by the design and strength of the container, often referred to as a “pressure bomb.” The ultimate strength of even compound cylinders poses a limit, but more often the sealing of such systems presents the defining pressure limit.

One of the best solutions to the sealing problem is the “unsupported seal” designed by Bridgman: in this mechanism, the confining pressure is itself used to help seal the system. A mushroom-shaped piston or plug faces the system on its full diametrical face, but in turn transmits the total resultant force to soft annular packing that is placed between the plug and the driving rod. The smaller stem of the plug projects into a well in the rod for alignment purposes, but the well has sufficient depth that the packing must bear the full load. Therefore, the pressure on the packing is always larger than the fluid pressure by a constant factor. This factor is related to the relative areas of the piston face and the annulus. In practice, the packing material may vary from rubber, for lower-pressure systems, to stacks of softer and harder metals, with In, Pb, or even annealed Cu serving as the sealing agent.
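The constant factor mentioned above follows from a simple force balance: the fluid pushes on the full face of the plug, while only the annular packing carries that force. The sketch below assumes an idealized, frictionless seal in which the stem end carries no load, and it uses hypothetical dimensions; the function name is chosen here for illustration.

```python
import math


def packing_pressure(fluid_pressure, face_diameter, stem_diameter):
    """Pressure on the annular packing of an idealized unsupported-area seal.

    The force on the full plug face (fluid_pressure * face area) is assumed
    to be carried entirely by the annulus between the stem and the bore.
    """
    if not 0.0 < stem_diameter < face_diameter:
        raise ValueError("stem diameter must be positive and smaller than the face")
    face_area = math.pi * face_diameter ** 2 / 4.0
    annulus_area = face_area - math.pi * stem_diameter ** 2 / 4.0
    return fluid_pressure * face_area / annulus_area


# Hypothetical dimensions: 20 mm plug face, 10 mm stem, 0.5 GPa fluid pressure.
example_packing_gpa = packing_pressure(0.5, 20.0, 10.0)

if __name__ == "__main__":
    print(f"packing pressure = {example_packing_gpa:.3f} GPa")
```

Because the area ratio always exceeds one, the packing is always at a higher pressure than the fluid it confines, which is why the seal tightens as the pressure rises.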

Although the working pressures obtainable in fluid systems are generally below 1 GPa, even at these modest pressures, the viscosity of many fluids will have increased to the point where small-bore plumbing no longer allows equilibration of the pressure throughout the system. We shall return to this problem in the discussion of the diamond-anvil cells.

2. Means of Pressure Generation

Liquids can be pumped on by any of several methods. Hand-operated piston pumps have served for small-volume systems, although special modifications for the sliding seal and check valves may be needed as the ultimate pumping pressure increases. In some cases, double checks have been found to be effective, and all parts must be machined to extremely close tolerances and must receive fine finishes. For larger systems or very high pressures, these pumps are tedious to operate.

Another manually operated system employs a large hand-operated wheel driving a screw that forces a piston into the pump cylinder. This type of system is limited in maximum pressure by the relative volumes of the pump, the system, and the stroke of the piston.

Pumps adapted from manually operated units and driven electrically are common at hydraulic system pressures. Air-driven diaphragm pumps using normal air are convenient for the operation of relatively large systems to pressures approaching 200 MPa.

Liquid systems are also encountered in which the liquid is sealed in a small cylinder with a piston that is loaded with a hydraulic press and then clamped. Such systems can be made quite small and, for this reason, often find applications in research involving cryogenic temperatures or high magnetic fields.

Gas systems can be pumped directly to pressures approaching 100 MPa; pressures above this must be obtained indirectly. One of the oldest methods is to use a U-shaped, high-pressure chamber with Hg forced into one leg by gas pumped to the system capability. This flooded leg is then pumped with oil, thereby raising the pressure in the gas to the maximum attainable with the liquid system. This often takes several cycles of the system with appropriate valving, since the gas is significantly more compressible than the pump fluid.

A second method of increasing gas pressure is through the use of a stepped piston intensifier. In this device, a large oil-driven piston is mechanically linked to a smaller diameter piston, which uses the mechanical advantage of the ratio of the areas to pump the gas to pressures perhaps as much as 10 times that of the oil. As in the previous case, several cycles may be needed in order to attain the maximum effect.

3. Experimental Probes

One subject receiving extensive study is the fluids that serve as compression agents, as discussed above. Fusion curves, especially at low temperatures, are, of course, of considerable interest. The transformation from the fluid to the solid state requires assumptions concerning the magnitude of the strains present and their effect on the pressure–temperature status.

Experiments designed to map out equations of state for selected fluids are more ambitious. These require simultaneous determination of the pressure, volume, and temperature (P, V, T) of a given system. The measurement of the volume is the most difficult, since even heavy walled containers will undergo some, albeit small, deformation under load. One technique is to seal a known amount of the fluid of interest in a system separated from the pressure fluid by a bellows. A sensing system, either internal or external, monitors the change in the extension of the bellows of the system and, thus, the volume as the pressure and temperature are varied.

If the pressure containers employed are fabricated of nonmagnetic materials, such as Be–Cu, then changes in the magnetic properties of the contained material can be assessed through the walls, thereby obviating the need for direct contact into the pressure chamber. Although this will greatly simplify the experiment by removing the need to pass electrical leads through the pressure walls, it can have the drawback of a large experimental error since the sample usually represents only a small portion of the external sensing coil volume.

The issue of making electrical contact to the contained sample is not a simple one. Insulated wires must be brought through the pressure wall into the fluid-filled volume without pinching off or extruding the wires themselves or destroying the insulation. Extrusion is usually controlled by placing a conical portion of the lead on the high-pressure side with an insulating sleeve that is only driven tighter into its seat with the application of pressure. A preload is useful in setting the seal before the initial application of pressure. Liquids will not pass through the insulation as readily as gases, especially He, and therefore, this type of probe is often used in clamp-type cells.

Another sealing technique involves the use of leads swaged in high-temperature insulation with stainless steel jacketing. These leads are common with thermocouple assemblages; other special conductors can also be used. In these applications, the metal sheathing is silver soldered into a threaded plug that passes the leads into the pressure cavity. The interior end is sealed with epoxy.

A third technique that has been successfully used to pressures up to 1 GPa involves passing the leads from the high-pressure environment through a second length of high-pressure tubing, with the first being used to pressurize the system. A large U-bend is placed in this tubing that is filled with an oil and then submerged in a liquid nitrogen dewar, thus freezing the oil through which the leads pass. Care must be taken to ensure that liquid oil is present above the frozen solid on the high-pressure side in gas-pressurized systems.

4. Piston-Cylinder Devices

a. General. Simpler, in principle, than fluid bombs are the piston-in-cylinder devices. In their simplest form, a hole is drilled in a block of solid material and, after plugging one end or using a blind hole, the test sample is inserted (see Fig. 1). A strong, close-fitting piston is then inserted and the pressure is applied. Solid materials are readily studied in such a device, unless they are very soft, such as Pb, In, or polymers, or have a low coefficient of friction, such as some of the transition metal dichalcogenides, for example, MoS2 or WSe2. In these latter cases, the aforementioned unsupported seal can be employed.

FIGURE 1 Piston and cylinder device. A is the WC piston; B is the WC fixed piston or nib; C is the WC cylinder; D is the shim; E and F are the hardened steel compression binding rings; G is the soft steel safety ring; and H is the press platens. D and E are machined with tapers giving an interference fit when pressed into the final assembly.

b. Design principles. Normally, a piston is ground with a very small clearance in the lapped cylinder, on the order of 0.0005 in. This is usually sufficient to seal most solids, except as noted above. For lower pressure applications, the piston and cylinder are made of hardened tool steel, with the piston made somewhat harder than the cylinder. With quality steels, pressures approaching 2.5 GPa can be attained in these systems. Higher pressures require harder materials, with WC being the most tractable. This is a brittle material, and while it does have a high compressive strength (∼800,000 psi), it is weak in tension; its tensile strength is usually only about 2% of its compressive strength. Therefore, it needs support that is usually provided by two or three interference fit support rings of hardened steel. These will serve to prestress the WC and allow it to be worked to much higher pressures. These binding rings are designed to maintain a compressive load on the WC to its maximum working pressure.

c. Failure modes. Shortly after the internal pressure exceeds the sum of the interference compression and the modest WC tensile strength, radial tensile fracture occurs. If the problems with radial fracture can be forestalled, the next mode of failure is breakage along a plane perpendicular to the cylinder axis. The solution to this is to again prestress the WC, this time with end clamping of the cylinder. Some researchers have utilized sophisticated hydraulic clamping systems to adjust the end loading proportional to the sample pressure; however, the equipment needed for such work can be costly, and the ultimate pressures achieved can usually be attained more easily by other techniques. Heavy clamping bolts and support rings can also be used for such loading.
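The radial-fracture criterion stated above (internal pressure exceeding the interference compression plus the WC tensile strength) can be put into numbers as a rough sketch. The compressive strength and the 2% tensile fraction are the figures quoted in the text; the interference compression of 2 GPa in the example is a purely hypothetical value, and the simple additive criterion ignores the real stress distribution in a thick-walled cylinder.

```python
PSI_TO_PA = 6894.757  # pascals per pound-force per square inch


def wc_radial_fracture_threshold_gpa(interference_compression_gpa,
                                     compressive_strength_psi=800_000.0,
                                     tensile_fraction=0.02):
    """Approximate internal pressure (GPa) at which radial tensile fracture begins.

    Uses the additive rule from the text:
        threshold = interference compression + WC tensile strength
    with the tensile strength taken as a fraction of the compressive strength.
    """
    tensile_strength_gpa = (compressive_strength_psi * tensile_fraction
                            * PSI_TO_PA / 1e9)
    return interference_compression_gpa + tensile_strength_gpa


# Unsupported WC: only its own tensile strength (about 0.11 GPa) resists fracture.
print(f"unsupported: {wc_radial_fracture_threshold_gpa(0.0):.2f} GPa")

# Hypothetical binding rings supplying 2 GPa of interference compression.
print(f"supported:   {wc_radial_fracture_threshold_gpa(2.0):.2f} GPa")
```

The comparison shows why the binding rings matter: on this estimate the tensile strength of the WC itself contributes only about a tenth of a gigapascal, so nearly all of the usable pressure range comes from the prestress.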

d. Multi-staging. Since the failure of the cylinder is due to the difference in pressure between the sample chamber and the outside environment, the achievable internal pressure of the system could be elevated if a means could be found to pressurize the entire assemblage. Thus, if a device capable of generating, for example, 10 GPa were placed in a similar device, then theoretically a maximum pressure of 20 GPa could be attained. Of course, this requires a system large enough to contain the entire second system within its pressure cavity, and a third stage would then go within the second cylinder. Although simple in theory, application has proven difficult. Bridgman built an operating two-stage system and the High Pressure Institute, located near Moscow, Russia, has a very large press with three-stage operation.

FIGURE 2 Bridgman anvils: A is WC anvils with 5–20° angle on the outer portion, providing massive support on the load bearing center; B is the compression binding ring.

5. Other Uniaxial Systems

Second to the Bridgman anvils (Fig. 2), the “belt” is perhaps the most important apparatus used for large-volume, high-pressure work (see Fig. 3). This stems from its development, which provided the increased pressure and temperature conditions necessary for the original synthesis of diamonds. Although few research laboratories have need for this large sample volume, it is used by several manufacturers for the routine, commercial production of diamond grit.

As shown in Fig. 3, elements of both the piston-cylinder and the Bridgman anvil device are used in the belt apparatus. The cylindrical belt contains a large-volume sample, and the truncated conical pistons utilize the massive support and the compressive gasket concepts. With application of internal electrical resistance heaters, temperatures of 2000°C can be maintained at pressures up to 10 GPa. The belt anvils and cylinders are usually designed with a curvature, although some devices have utilized straight conical sections. These employ the same principles and are simply somewhat easier to machine.

FIGURE 3 Belt apparatus: A is WC tapered pistons; B, C, F, and G are hardened steel binding rings; D and H are soft steel safety rings; E is the WC belt cylinder; J is the compressible gasket; K is the cylindrical sample container; and L is the sample end caps. Binding rings are assembled with interference fits to give compressive support to the WC parts.

As noted above, the belt is principally a high-temperature, high-pressure device. It can also be used for electrical measurements at elevated temperatures by passing contact and thermocouple leads out through one or both of the gaskets. An equatorially split belt was once used for X-ray diffraction studies at temperatures up to 1000°C.

6. Multi-Anvil Devices

a. Tetrahedral press. In the hierarchy of pressure-producing apparatus, the tetrahedral press follows the uniaxial devices. It consists of four hydraulically driven rams that are designed to converge on the faces of a regular tetrahedron. The sample container is usually formed of the same material used to form the gaskets in the Bridgman anvils, and either the tetrahedron is formed with an edge face about 25% greater than the anvil face edges or gasket tabs are added. This system relies on compressible gaskets to contain the pressure and allow the ram some additional stroke for pressure generation after contact has been made. These anvils also utilize the principle of massive support.

The tetrahedral concept was employed with a uniaxial press by workers at the National Bureau of Standards by nesting three of the anvils in a cone, inserting a sample, and then driving the fourth anvil down. This had the effect of moving the lower anvils down and in at the appropriate rate.

These four-ram units are difficult to control, and in an effort to ensure equal advance of each ram, an anvil guide was developed. This consists of a linkage of heavy pins between holes in the anvil support plates and forms a nest that forces the system to open or close with complete synchronism.

The four rams of the tetrahedral press are generally mounted so as to be electrically isolated from each other. This allows access to the pressurized area by up to four leads, and as many as ten have been passed through the gaskets; however, as with any gasket passage, pinching off is not uncommon.

The pressure capabilities with tetrahedral presses run to about 10 GPa, although a much longer life of the WC anvils can be attained by limiting operations to below 7 GPa. As with the Bridgman anvils, the working piece is usually supported in an interference fit, high-strength steel retaining ring.

Temperatures in excess of 2000°C can be attained in these presses with internal electrical resistance graphite tube furnaces. By using smaller faced anvils and accepting a higher breakage rate of anvils, the mineral stishovite has been synthesized in this type of device. Stishovite is a metastable form of silica formed at pressures in excess of 9 GPa and temperatures above 2000°C. This is indicative of the upper limits of this system.

In addition to electrical resistance measurements, Mössbauer and X-ray diffraction studies have been carried out in tetrahedral presses. In either case, a portion of the tetrahedral sample container or the gasket is replaced by a light element (i.e., low X- or gamma-ray absorbing material) such as LiH, B, or B-loaded epoxy. In some cases, the entire tetrahedron is made from these materials. Radiation reaches the sample either through a Be-plugged hole in the face of an anvil or through a gasket; the scattered radiation is scanned angularly through the other three or through one opposing gasket. In this mode, the device can be operated to pressure/temperature limits of about 8 GPa/600°C. Sample container sizes are typically about 2.5 cm on an edge.

b. Cubic presses. Following the tetrahedron, the next regular solid is the cube. Cubic symmetry is more easily implemented, and thus, there is a somewhat larger number of cubic presses in operation. As with the tetrahedral press, the six rams of the cubic press are linked together, either with massive tie-bolts or by hinges. Generally, the former offers easier access to the sample area, and, as with the tetrahedral press, the utilization of a guide mechanism speeds and simplifies operation. Other hexahedral presses have been built, generally with one sample axis slightly elongated. These are generally used in an attempt to extend the working volume at minimal cost.

The development of uniaxially powered cubic presses proceeded along two courses. In each case, the top and bottom anvils were driven by a uniaxial press, but they differed in how the four remaining anvils were to be powered. In one case, the top and bottom rams included large blocks with tapered internal faces that forced the side anvils in as the system closed. In the second system, often termed “DIA,” large links were hinged to the rams and the four side anvils, thus generating the desired motion. This latter device has been favored in Japan in recent decades. Pressure and temperature capabilities are about the same as for the tetrahedral devices, while sample sizes range from a few millimeters on an edge to 5 cm. Higher pressures in most of these anvil devices can be attained with harder anvil tips; for this reason, some researchers are employing sintered diamond anvils.

c. Multi-anvil sliding system (MASS). Another large-volume concept that has been proposed but has found only limited application is that of the multi-anvil sliding system or MASS. The principle was independently proposed by R. Epain and M. Kumazawa. The principle of operation may be understood by examining the two-dimensional processes that are possible (Fig. 4). A set of four anvils moves tangentially to enclose an ever-decreasing area. The same idea is carried over to three dimensions where a sample volume is contained: the anvil pieces must slide past one another easily and yet not allow extrusion of the sample. The tangential nature of the force application leads to the description of the first mechanism as rotational. The second mechanism controls the displacement of the two anvils during compression, and, since no rotation is involved, this concept is termed irrotational. Although there appears to be a definite mechanical advantage to MASS, implementation is not easy. Kumazawa has identified 48 possible MASS mechanisms, but most of these require complicated hydraulic rams and/or high load screw systems to operate. Further complicating the issue is that of extracting information from the pressure cavity during operation (i.e., the passage of electrical leads or radiation beam is not a simple matter).

FIGURE 4 Multi-anvil sliding system (MASS). The two basic forms are shown: A is rotational, where the anvils move tangentially to the central void enclosing a decreasing volume; B is irrotational, with some anvils retracting as the others advance to yield a decreasing volume.

d. Split-sphere apparatus. The last of the multiple-anvil systems to be described here also involves the concepts of massive support and compressive gaskets. The strength of the materials used in the anvils and support systems has been the limiting factor: ultimately, breakage occurs. In the case of the split-sphere apparatus, breakage is actually anticipated by cutting a sphere into six, eight, or more equivalent segments. Each segment is formed into an anvil, and the entire assemblage is reformed and held together with a flexible membrane (see Fig. 5).

Pressurization of the sample contained at the center of the split-sphere anvils is achieved by immersing the entire assemblage in a fluid that is, in turn, pressurized. The action of the pressurized bath is to force the advance of the anvils toward the sphere center. Very high pressures, on the order of 100 GPa, have been claimed with this apparatus. Although the pressures actually achieved are in dispute, the advantages of the technique are readily apparent. Power leads have been passed into the system to energize resistive heaters and diamond synthesis has been achieved, thus indicating pressure/temperature conditions in excess of 6.7 GPa and 1500°C, respectively. Typically, the central cubic sample volume is ∼4 mm on an edge. Like the MASS, however, assemblage can often be a tedious task and the extraction of data from the pressure chamber can be difficult.

FIGURE 5 Split-sphere apparatus showing a hardened metal sphere formed of eight equal volume segments, each having a triangular face impinging on the central, octahedral sample. One segment has been omitted for purposes of clarity in the figure.

7. Diamond-Anvil Cells (DAC)

a. Origin of the DAC. The most powerful instrument for the performance of basic research with extreme static pressures is the diamond-anvil cell (DAC). It offers three distinct advantages over the older, larger systems.

1. It is compact: a typical DAC can be held in the hand and for this same reason it can be readily cooled or heated, as desired.

2. It is relatively inexpensive: a DAC can be constructed for a small fraction of the cost of the larger systems (for this reason, many modern high-pressure laboratories usually have several DACs in use).

3. Most importantly, the diamond anvils are themselves transparent to a broad spectrum of electromagnetic radiation; therefore, the pressure chamber can be readily probed in a variety of ways and samples can be readily studied in situ.

The concept of bringing flat surfaces of the hardest known material, diamond, into opposition for the purposes of creating a high-pressure chamber was employed independently and almost simultaneously by researchers at the National Bureau of Standards (NBS) and the University of Chicago in the late 1950s. During the following 15 years, extensive development took place at NBS as well as at other high-pressure laboratories, leading to a device that is capable of producing the highest static pressures, ∼500 GPa as of this writing.

Initially, the DAC was used to visually study phase changes in materials that were partially or totally transparent; even metals could be examined in reflection. This represented a natural extension of optical absorption investigations in the near infrared (IR) and the visible, often with the objective of quantifying earlier observations. The X-ray absorption cross section for carbon is 40,000 barns/atom for low-energy photons (energies ≤1 keV), but it falls more than three orders of magnitude to <10 barns/atom for photons having energies >20 keV. The upshot of this is that high-energy X-ray photons can readily be used to study crystallographic structures at extreme pressures and, because the DAC can be so easily heated or cooled, over a wide range of temperatures as well.
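The practical consequence of the drop in cross section can be illustrated with a simple exponential attenuation estimate. This sketch assumes a nominal diamond density of 3.51 g/cm³ and a hypothetical anvil thickness of 2 mm, neither of which is specified in the text, and it treats the quoted cross sections as total attenuation cross sections per atom.

```python
import math

AVOGADRO = 6.02214076e23    # atoms per mole
DIAMOND_DENSITY = 3.51      # g/cm^3 (assumed nominal value)
CARBON_MOLAR_MASS = 12.011  # g/mol
BARN_IN_CM2 = 1e-24         # one barn expressed in cm^2


def diamond_transmission(cross_section_barns, thickness_mm):
    """Fraction of photons transmitted through diamond of the given thickness.

    Beer-Lambert attenuation: T = exp(-n * sigma * t), where n is the number
    density of carbon atoms, sigma the cross section per atom, t the thickness.
    """
    atoms_per_cm3 = DIAMOND_DENSITY / CARBON_MOLAR_MASS * AVOGADRO
    thickness_cm = thickness_mm / 10.0
    return math.exp(-atoms_per_cm3 * cross_section_barns * BARN_IN_CM2 * thickness_cm)


# Hypothetical 2 mm anvil: low-energy photons versus photons above 20 keV.
print(f"40,000 barns/atom: T = {diamond_transmission(40_000.0, 2.0):.3g}")
print(f"    10 barns/atom: T = {diamond_transmission(10.0, 2.0):.3g}")
```

Under these assumptions the anvil is completely opaque at the low-energy cross section, while at 10 barns/atom roughly 70% of the beam is transmitted. Because the text gives 10 barns/atom as an upper bound above 20 keV, the actual transmission at those energies would be at least this large.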

Comparatively simple in design, the DAC consists of a pair of brilliant-cut, gem-grade diamonds with slightly enlarged culets, or tips (see Fig. 6). Differences in cell design have developed depending on the mechanism for anvil alignment, the method of force production and control, the temperature environment, and the experimental probe.

FIGURE 6 Diamond-anvil pressure cell. A is the diamond anvils; B is the gasket; C is the pressure chamber; D and E are the incident and scattered radiation paths, respectively; and F is the hard WC support surfaces. The mechanism for advancing the diamonds and thus generating the pressure is not shown.

Some designs, such as a cell developed at the University of Rochester for single-crystal X-ray studies, are quite simple, consisting of two triangular plates clamped together by three adjustable screws.

Critical in the design and operation of the DAC is the alignment of the anvils themselves. It is very important that the culet faces be parallel and normal to the uniaxial force direction. In earlier work, one anvil face was typically made much larger than the other, for example, 1.2 vs 0.6 mm diameters. This frequently led, at higher loads, to surface fracture of the larger diamond face due to penetration by the smaller. The modification to avoid this was to use a matched pair of diamond anvils with approximately equal surface areas and to align their axes as well as their planar surfaces.

The principle of massive support is being applied to the diamond anvils by the addition of culet faces that optimally make an angle of 5–10° with the culet flat; the load on these faces decreases as one travels radially out from the center. The highest pressure attained with these beveled anvils is on the order of 550 GPa, as of this writing.

In achieving these two alignment criteria, two designs have become somewhat generally adopted. One design has the anvils mounted in hemicylindrical rockers that can be translated along their cylindrical axes, which are set orthogonally in the cell body. Thus, axial alignment is achieved by appropriate translational adjustment, and facial alignment is achieved by appropriate rocking motions; both of these can be carefully controlled with small adjustment screws. In the other design, one diamond is mounted in a hemispherical support whose orientation can be carefully adjusted for the facial alignment; the other diamond is mounted in a flat plate whose position can be translated in a plane normal to the diamond axes, thus allowing the axial alignment.

It is important to maintain the alignment of the diamonds during operation of the DAC. This is usually accomplished by placing one diamond on a piston that slides in a closely fitting cylinder equipped with a guide pin to prevent rotation during loading. Some designs provide for alignment during operation, but this is generally unnecessary for sub-50 GPa work.

In operation, one of the anvils is usually kept stationary and the other is driven toward it. The magnitude of the force required for this is modest and can be developed in one of several ways. In the earliest design, a lever arrangement is used to provide a mechanical advantage of 3 or 4, and the force is manually generated through compression of a load screw, either directly or through a stack of Belleville spring washers. Similarly, in the three-bolt design referenced above, manual advancement of the screws will generate sufficient load to achieve the limits of the DAC. Other mechanical systems involving levers and pivots with screw adjustments, or a lever operated by a screw-driven wedge, have also proven effective. In order to achieve remote and/or programmable control of the load, the piston anvil can be driven by a hydraulically operated mechanism or by an electric stepping motor. One DAC designed for operation at cryogenic temperatures employs a metal bellows chamber pressurized with He gas cooled from room temperature; this system could achieve pressures of 10 GPa at temperatures as low as 30 mK.
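To see why the required force is modest, the load can be estimated as the mean pressure times the culet area. The sketch below is a rough estimate only: it assumes a uniform pressure over a circular culet (the actual pressure across the faces is not uniform), and the 0.6 mm culet and 50 GPa target are illustrative values chosen for the example.

```python
import math


def dac_load_newtons(mean_pressure_gpa, culet_diameter_mm):
    """Axial load (N) needed to hold a mean pressure over a circular culet face."""
    radius_m = culet_diameter_mm * 1e-3 / 2.0
    culet_area_m2 = math.pi * radius_m ** 2
    return mean_pressure_gpa * 1e9 * culet_area_m2


def screw_force_newtons(load_newtons, mechanical_advantage):
    """Force the load screw must supply when a lever multiplies it (ideal lever)."""
    if mechanical_advantage <= 0:
        raise ValueError("mechanical advantage must be positive")
    return load_newtons / mechanical_advantage


load = dac_load_newtons(50.0, 0.6)      # illustrative: 50 GPa on a 0.6 mm culet
screw = screw_force_newtons(load, 4.0)  # lever with a mechanical advantage of 4
print(f"anvil load:  {load / 1000:.1f} kN")
print(f"screw force: {screw / 1000:.1f} kN")
```

On these assumptions the anvil load is about 14 kN, and a lever with an advantage of 4 reduces the screw force to a few kilonewtons, which is within reach of a hand-tightened screw.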

As with Bridgman anvils, the pressure across the diamond faces is not uniform. If a sample is placed directly between the two anvils, as the load is applied, the sample will extrude laterally from the containment region until the frictional forces between the sample and the anvil faces become sufficient to contain the load. Under these conditions, the sample will be exposed to an extreme pressure gradient, ranging from a maximum near the center to almost atmospheric pressure at the periphery. This carries the added feature of subjecting the material under study to large shear strains. An advantage of this, however, is that the response of the material to a wide range of pressures can be examined at a single setting. In earlier work, phase transitions were detected optically with this technique. In more recent studies to the highest static pressures, researchers are using highly collimated beams of extremely intense radiation, for example, that produced with synchrotron storage rings, to study small portions of samples under pronounced pressure gradients.

It is more common in the operation of DACs, however, to employ a hardened metal gasket for containment of the sample. High-strength metals, such as stainless steel, Inconel, Waspaloy, or hardened Be–Cu, are often used. For purposes of alignment and extra hardening, the gasket is usually prestressed by compressing it between the anvils, frequently to about 50% of its original thickness. A hole with a diameter of 1/2 to 2/3 that of the culet face is then drilled in the center of the indentation. The indentation serves to allow reasonably rapid recentering of the gasket between the anvils. This cylindrical hole, whose linear dimensions are typically a few hundred micrometers, constitutes the sample chamber.

The sample of interest is then loaded in the gasket hole, frequently mixed with some standard material to serve as a pressure calibrant; NaCl or Au are frequently used calibrants for X-ray studies, while the wavelength shift of fluorescence from ruby is also a commonly used pressure gauge. Although the gasket will help in providing a more uniform distribution of pressure, if a truly hydrostatic pressure environment is required, the sample and calibrant must be immersed in a suitable fluid. A mixture of 16 parts methanol, 4 parts ethanol, and 1 part water will remain fluid to pressures just above 10 GPa at room temperature. For hydrostatic conditions at higher pressures, liquefied gases are required. He, Ar, or N2, condensed and sealed at cryogenic temperatures, provide near hydrostatic conditions to pressures well above 10 GPa.
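The ruby fluorescence gauge mentioned above converts a measured wavelength shift into a pressure. The text does not give a calibration, so the sketch below assumes one widely used empirical form, P = (A/B)[(λ/λ0)^B − 1], with A = 1904 GPa, B = 7.665, and an ambient R1 line at λ0 = 694.24 nm. These constants come from the published quasi-hydrostatic ruby scale rather than from this article, and other calibrations differ somewhat.

```python
RUBY_A_GPA = 1904.0       # assumed calibration constant (GPa)
RUBY_B = 7.665            # assumed exponent for quasi-hydrostatic conditions
RUBY_LAMBDA0_NM = 694.24  # assumed ambient-pressure R1 wavelength (nm)


def ruby_pressure_gpa(wavelength_nm, lambda0_nm=RUBY_LAMBDA0_NM):
    """Pressure (GPa) from the ruby R1 fluorescence wavelength.

    Implements P = (A / B) * ((lambda / lambda0) ** B - 1) with the assumed
    constants defined above; valid only for wavelengths at or above lambda0.
    """
    if wavelength_nm < lambda0_nm:
        raise ValueError("wavelength below the ambient R1 line")
    ratio = wavelength_nm / lambda0_nm
    return RUBY_A_GPA / RUBY_B * (ratio ** RUBY_B - 1.0)


# Illustrative reading: R1 line shifted by 3.65 nm from its ambient position.
print(f"{ruby_pressure_gpa(694.24 + 3.65):.1f} GPa")
```

With these assumed constants, a shift of a few nanometers corresponds to roughly 10 GPa, which is near the limit at which the methanol–ethanol–water mixture remains fluid.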

Another application of diamond anvils should also be noted here: researchers at Cornell University have been experimenting with a spherical diamond indentor with a tip radius of about 10 µm that is pressed into a flat diamond face. The sample is contained between the two diamond surfaces, and, as in the case of the ungasketed DAC, the sample will extrude to the point where frictional forces between it and the diamonds cause the sample to effectively form its own gasket. Since the contact area is controlled by the tip radius, contact force, and deformation of the diamonds, it is very small, and modest loads can produce extreme pressures: values in excess of 100 GPa have been claimed. One major drawback with this arrangement is that because the sample is small, it is difficult to detect. The reported pressures have been determined from the model used to represent the tip and flat, and therefore, reported pressures are vulnerable to the accuracy of the model.

b. Temperature. Elevated temperature studies with a DAC must be approached with the realization that the stable form of carbon at atmospheric pressure is graphite, not diamond. At normal temperatures though, the diamond-to-graphite transition is not observed because of the very high activation energy associated with this transformation. But, if sufficient thermal energy is provided, this activation barrier can be surmounted. Therefore, most high-temperature research with DACs is restricted to below 800°C.

High temperatures in a DAC have been attained by two methods: resistive heating and laser heating. In the former case either the gasket itself can be used as a heating element or, more commonly, resistive heaters are used to envelop the diamonds and gasket material. Temperatures are usually limited to below 200°C in the former and to below 800°C with the latter. Most recently, scientists from Los Alamos National Laboratory and the Naval Research Laboratory have collaborated to develop a DAC that is operated in a vacuum oven equipped with two concentric Ta heaters. Pressures in excess of 10 GPa have been attained at temperatures above 1200°C with this system.

The other heating technique employs a high-powered laser beam that is focused on the sample. Using a pulsed laser, a team at Cornell University has achieved temperatures in excess of 5400°C for brief periods of time, actually melting diamond, and using a pulsed YAG laser, temperatures of 2000°C have been sustained at pressures of 2.5 GPa.

Research has also been carried out with DACs at cryogenic temperatures. Researchers at the Naval Research Laboratory have attained temperatures down to 30 mK at pressures up to 10 GPa by coupling a DAC to a liquid-He dilution refrigerator. In this apparatus, the mixing chamber of the refrigerator is built directly into the DAC. A metal bellows chamber pressurized with He gas was used to generate the compressive force for the anvils.

c. Experimental probes. As noted above, the major advantage of the DAC is its virtual transparency to a broad spectrum of electromagnetic radiation. Some of the earlier studies in the DAC were made in the visible portion of the spectrum; refractive index changes, optical absorption, and birefringence are several types of measurements that were performed on samples as they underwent various pressure-induced phase changes.

Optical studies were quantified by incorporating the DAC into a spectrometer system, including appropriate focusing optics. Work in the infrared region is usually performed using type-II diamond anvils; these allow transmission studies to be carried out in the 1–4 and 5.5–15 µm regions, as well as the visible and near ultraviolet. Fluorescence and spectroscopic measurements, both absorption and Raman, have also been carried out in the DAC. An abundance of research has been performed in the higher photon energy regions as well: both angular and energy dispersive Bragg scattering measurements have been used to detect structural phase transitions, as well as to measure thermal expansivities and compressibilities. Extended X-ray absorption fine structure (EXAFS) studies have been undertaken with limited success in the DAC, with the major difficulty being interference in the EXAFS patterns by Renninger scattering from the diamond anvils. Using appropriate gamma-rays, Mössbauer studies have also been performed in a DAC.

Magnetic susceptibility and microwave absorption measurements have been performed in a DAC; this work has been directed primarily toward superconducting materials. In this same context, some researchers have also equipped DACs with electrical leads extending into the pressure cavity, thereby permitting electrical resistance measurements under conditions of varied pressure and temperature. Additional details of the various measurements performed in a DAC are discussed in the later parts of this article.

B. Dynamic Pressures

1. Fundamental Principles

The most obvious difference between static and dynamic pressures is that of duration: in most static systems, once the pressure is set, it is generally considered to be constant in time, whereas shock pressures result from large amplitude waves passing through matter and are necessarily of brief duration, typically on the order of microseconds. Another difference between the two techniques is that static pressures can be applied isothermally, whereas shock waves are generally accompanied by large thermal excursions. Finally, static pressures are three dimensional in character and often hydrostatic, whereas the shock pressure is often considered to be uniaxial, the result of a two-dimensional shock front passing through the sample. The development of pressure results from the inertial response of matter to a rapid acceleration. Although several different shock techniques will be discussed, all depend on the sudden application of force to a surface of the target for the initiation of a shock wave.

A shock wave represents a thin region, typically 1 nm wide, of a material over which there is a discontinuity in stress, density, material velocity, and internal energy. This region travels at supersonic velocity with respect to the material into which it progresses. Although impacts normally generate a region of rapidly increasing pressure, building to a maximum, a shock wave results because the pulse velocity increases with increasing pressure. This is a fundamental requirement for the establishment of a shock wave; it leads to a sharpening of the disturbing pulse to a steplike discontinuity. Since all forms of shock-generating force are short lived, or the target moves away from the disturbed region, a low-pressure region, or release wave, is launched into the target following the shock wave. The release wave travels in a denser material because of the preceding shock wave; therefore, it travels faster and eventually overtakes and destroys the shock front.

Shock studies have depended heavily on the concurrent development of theoretical models or codes that allow the researcher to predict the effects of shock waves on specific materials. Materials are frequently treated as if they were liquids in these codes; however, the errors resulting from such approximations are generally small because of the extreme pressures involved (i.e., generally above 10 GPa).

The parameters that are typically measured in a shock experiment include the shock velocity, the particle velocity behind the shock front, and, more recently, the pressure and temperature. Within the experimental uncertainties, the shock velocity is usually found to be a linear function of the particle velocity.
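The linear relation between shock and particle velocity can be illustrated with an ordinary least-squares fit. The following is a minimal sketch only: the function name `fit_linear_hugoniot`, the symbols `c0` (intercept) and `s` (slope), and the data values are assumptions introduced here for illustration and do not come from any experiment.

```python
def fit_linear_hugoniot(up, us):
    """Least-squares fit of Us = c0 + s * Up.

    up, us: equal-length sequences of particle and shock velocities
    in the same units (e.g., km/sec). Returns the tuple (c0, s).
    """
    n = len(up)
    if n < 2 or n != len(us):
        raise ValueError("need at least two matched (Up, Us) pairs")
    mean_up = sum(up) / n
    mean_us = sum(us) / n
    sxx = sum((u - mean_up) ** 2 for u in up)
    if sxx == 0.0:
        raise ValueError("particle velocities must not all be equal")
    sxy = sum((u - mean_up) * (v - mean_us) for u, v in zip(up, us))
    s = sxy / sxx                 # slope of the Us-Up line
    c0 = mean_us - s * mean_up    # intercept at Up = 0
    return c0, s


# Made-up, exactly linear data (km/sec), used only to exercise the fit.
up_data = [0.5, 1.0, 1.5, 2.0]
us_data = [4.75, 5.5, 6.25, 7.0]
c0, s = fit_linear_hugoniot(up_data, us_data)
print(f"c0 = {c0:.3f} km/sec, s = {s:.3f}")
```

With real data the residuals of such a fit would indicate whether the linear description holds within the experimental uncertainties.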

Using the principles of conservation of mass, energy, and momentum across the shock front, a set of three equations can be derived that relate the final density, internal energy, and pressure (or stress) to the initial values for the material and the shock. The locus of values derived from a number of experiments forms a curve that defines the final states that can be reached for a given material in the shock velocity-particle velocity plane, or the stress-volume plane. This curve is generally referred to as the Hugoniot, since it defines the Rankine-Hugoniot equation of state for the material with respect to its initial state.

Loading methods will influence the precision of three parts of the shock experiment: (1) control of the pressure, (2) uniformity of the disturbance across the sample face, and (3) the decompression or release processes. The most common methods for the production of shock waves are detonation of explosives, either directly in contact or driving a flyer plate, and impact of a projectile from a gun. Exploding foils or wires and imploding magnetic fields have been used occasionally. Laser-generated shock waves are becoming more common, especially in the light of current national defense strategies. Electric rail guns promise interesting results at the upper pressure limits, but development has been sporadic. Finally, nuclear-driven shock studies have resulted in the highest dynamic pressures and have tested the extension of existing equation-of-state theories to pressures in the tens of terapascals regime.

2. Experimental Techniques

a. Explosively driven shock waves. Interest in the effects of explosives on materials originated with military concerns for the consistent performance of these devices and later with the related issue of armor penetration. In the latter case, spalling became a major issue. This is the fracturing of part of the target on the opposite side from the impact due to a tensile failure in the interior where two release waves have met. Interest in this area was markedly increased by the need for predictable, well-controlled use of explosives in the original atomic bomb triggering sequence. It was this factor too that stimulated the accumulation of large amounts of data on the effects of shocks on various materials, and that forms the basis for many of the research programs currently underway.

There are two ways of explosively generating shock waves: one in which the explosive is in direct contact with the target and the other in which the explosive launches a flyer plate that then impacts the target. In the first case, it is important that the explosive impact reach all points on the front of the target simultaneously. Since detonation usually starts at a point, or along a line, the explosive burn will take place along a spherical or cylindrical front, respectively. Some commercial suppliers of explosives provide triangular sheet line generators that are perforated with an array of holes that serve to break up the curved shock front into a series of many smaller fronts that approximate a line.

One form of plane wave generator is called a “mousetrap” (see Fig. 7). It consists of a sheet of explosive material laid on a thin, inert (glass or metal) driver plate that is inclined above the main charge at an angle θ, such that sin(θ) equals v/d, where d is the detonation velocity down the sheet over the driver plate and v is the resultant velocity of the plate from the pressure generated by the detonated gases. When the sheet is initiated at its upper edge, the driver plate strikes the main charge at all points simultaneously and will initiate a plane detonation wave if the plate velocity is great enough. Edge effects and construction variations limit the planarity of this device.

FIGURE 7 Mousetrap plane shock wave generator. Explosive A is detonated along its upper edge and burns toward the hinge, generating detonation products, D, and driving plate B toward the target, C, which may be a main explosive charge. The angle between the driver plate and the target is chosen such that its sine is equal to v/d, where v is the velocity of the plate derived from the detonation of A at velocity d.

Conical explosive lenses can be made that produce a simultaneity of detonation at the driving face to within 0.1 µsec, but the resultant impulse may be nonuniform. These lenses are formed either with a cone of explosive over an inert cone of larger angle or a similar inner cone of explosive with a slower detonation velocity (see Fig. 8). In the latter case, the base angle of the inner, slower explosive, α, is determined by the ratio of the detonation velocity of the slower inner explosive to that of the faster outer explosive (i.e., sin α = d_in/d_out).
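Both the mousetrap and the conical lens rest on the same geometric condition: the sine of the design angle equals the ratio of a slower velocity to a faster one. The sketch below only evaluates that arcsine. The function name `design_angle_deg` and the velocity values are hypothetical, chosen for illustration rather than taken from any real explosive.

```python
import math


def design_angle_deg(slow_velocity, fast_velocity):
    """Angle (degrees) whose sine is slow_velocity / fast_velocity.

    Both velocities must be in the same units; the ratio must lie in (0, 1].
    """
    ratio = slow_velocity / fast_velocity
    if not 0.0 < ratio <= 1.0:
        raise ValueError("velocity ratio must lie in (0, 1]")
    return math.degrees(math.asin(ratio))


# Mousetrap: sin(theta) = v / d, plate velocity over detonation velocity.
# Hypothetical values in km/sec.
theta = design_angle_deg(slow_velocity=2.0, fast_velocity=8.0)

# Conical lens: sin(alpha) = slower over faster detonation velocity.
# Hypothetical values in km/sec.
alpha = design_angle_deg(slow_velocity=4.0, fast_velocity=8.0)

print(f"theta = {theta:.2f} deg, alpha = {alpha:.2f} deg")
```

The check on the ratio reflects the physical constraint that the plate (or the slower explosive) cannot outrun the faster detonation front.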

The target may be directly attached to the surface of the explosive, or it may be set a short distance away with a flyer plate attached to the explosive. The simplest form of the latter case is the mousetrap where the driver plate hits the target rather than a main charge.

FIGURE 8 Conical shock wave plate generator. The detonator, A, ignites the fast burning explosive, B, which in turn ignites the slow burning explosive, C. The conical angle is chosen such that its sine is equal to the ratio of the rate of C to the rate of B, resulting in the formation of a detonation front in C parallel to the base.

The pressures that can be obtained using these techniques range up to a few tens of gigapascals. The primary advantage is the relatively simple setup and correspondingly low cost.

b. Guns. Propellant guns, originally developed for military applications, provide a special example of the explosive-driven flyer plate. These also provide a somewhat more controlled mechanism than the explosive techniques described above. The resultant pressures derived from the impact of the flat-faced projectile on the target are somewhat greater than those attainable with the flyer plate.

Light gas guns offer greater control of the shock conditions. In these devices the projectile is propelled down the evacuated barrel toward the target by the expansion of a pressurized light gas, such as H2 or He. The gas is suddenly released from its high-pressure reservoir by the rupture of a disk behind the projectile in the breech of the gun. These devices can also be operated in two stages, using a larger diameter projectile to compress gas for a second, smaller gun. The projectiles from these devices can achieve velocities as high as 7 km/sec, resulting in target pressures of over 150 GPa.

c. Exploding wire or foil. The force in this system is generated by exploding a thin metal foil by passage of an extremely high electrical current. The resistive heating of the metal will cause vaporization of the foil; this, in turn, accelerates an adjacent, thin dielectric plate to impact the target. There can be serious problems in the planarity alignment of the plate as it impacts the target with severe degradation of the resultant pressure. Associated electrical instrumentation is also heavily impacted by the electromagnetic noise generated by the system.

Exploding foils have been used to charge small guns with barrels only a few millimeters long. Using a flyer plate composed of plastic and metal, pressures in excess of 1 TPa have been achieved, with capabilities up to 5 TPa indicated. Such systems are simple in concept and relatively inexpensive to set up.

d. Laser-driven shock waves. High-energy laser beams impacting the surface of a target will cause very rapid heating, even to the point of forming a plasma. These very high temperatures are formed quite rapidly, and, because material or thermal flow is relatively slow, there is insufficient time to dissipate this energy with the result that shock waves are launched into the material. Although particle beams, for example, electrons, can also be used for this process, attention will be focused exclusively on laser-driven shocks.

One advantage of the laser initiation is that the energy deposited by the laser beam results in a shock wave that is initiated simultaneously over the irradiated area. Typically, the laser beam is focused to a spot size with a diameter of up to 1 mm. Irradiances greater than about 10^8 W/cm^2 are required to ignite a plasma and produce a shock wave. These small impact areas lead to large energy losses due to two-dimensional expansion of the plasma. Edge effects will also be significant in evaluation of the shock. Because laser pulses normally last no more than a few hundred nanoseconds and the plasma dissipates very rapidly after the energy input ceases, the aforementioned release wave is launched, and it rapidly overtakes and destroys the shock front. As a result of these constraints, effective sample thicknesses have been limited to a few tens of micrometers, although the pressures attainable with these systems can range up to 10 TPa. This pressure range is of considerable interest in the development of equations of state.
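Whether a given pulse exceeds the plasma-ignition irradiance quoted above is a matter of simple arithmetic. The sketch below assumes a uniform circular spot and a rectangular pulse in time, which is an idealization. The function name, the constant name, and the example pulse parameters are all hypothetical.

```python
import math

# Approximate plasma-ignition threshold quoted in the text, in W/cm^2.
PLASMA_THRESHOLD_W_PER_CM2 = 1e8


def irradiance_w_per_cm2(pulse_energy_j, pulse_duration_s, spot_diameter_cm):
    """Average irradiance of a uniform circular spot, in W/cm^2."""
    if pulse_duration_s <= 0.0 or spot_diameter_cm <= 0.0:
        raise ValueError("duration and diameter must be positive")
    area_cm2 = math.pi * (spot_diameter_cm / 2.0) ** 2
    power_w = pulse_energy_j / pulse_duration_s
    return power_w / area_cm2


# Hypothetical pulse: 1 J delivered in 10 nsec onto a 1-mm-diameter spot.
example_irradiance = irradiance_w_per_cm2(1.0, 10e-9, 0.1)
exceeds_threshold = example_irradiance > PLASMA_THRESHOLD_W_PER_CM2
print(f"{example_irradiance:.3e} W/cm^2, exceeds threshold: {exceeds_threshold}")
```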

Not all of the laser energy is effective in the generation of the shock wave; some is lost in the formation of the plasma, some is carried off with the plasma, and some is reflected without effect. The absorption efficiency of most materials increases with the photon energy, ranging from about 30% in the near infrared to about 90% in the near ultraviolet. Short wavelengths offer another advantage, namely, the decrease in production of suprathermal electrons. This, in turn, lessens the electron preheat of the target prior to passage of the shock front. X-ray production, however, is enhanced at shorter wavelengths and for higher Z materials, which may cause some target preheating. Laser beams may have local fluctuations in energy density, or “hot spots,” which can lead to nonplanarity of the shock front. These effects are minimized at longer wavelengths by thermal conduction, but are more pronounced at shorter wavelengths because the energy is deposited closer to the ablation surface.

The flyer plate concept has also been applied to laser-generated shocks. Carbon disks have been irradiated with a 3-nsec laser pulse, generating a shock pressure of 0.5 TPa and accelerating the disk to a velocity of 100 km/sec. Impact of this disk on a second disk delivers the energy in a much shorter time, resulting in the production of 2-TPa shock pressures.

A conventional shock diagnostic apparatus is electrical in nature. The intense electromagnetic storm generated by laser and particle beams is an extremely hostile environment for these sensitive detectors. Lasers, however, bring their own solution. Because the timing of the laser pulse can be precisely established, the incoming laser pulse can be used to trip optical diagnostics, or a portion of the laser beam itself can be used for diagnostic purposes at the target.

Plasmas also generate X-radiation with wavelengths varying according to the target composition. This radiation can be used either directly, by recording the shock on an X-ray streak camera, or indirectly, by exciting X-rays from a second target that are then used to monitor the shock passage. Shock velocities can be determined by monitoring the light generated when the shock reaches the back surface of the target using a streak camera.

e. Nuclear-driven shock waves. Nuclear explosions have been used to obtain equation-of-state data for many years. Pressures obtained with this method have approached 7 TPa, and although this seems to be less than those attainable with laser beams, the target area was nearly 30 cm in diameter and the nuclear device was detonated about 3.5 m away from the sample.

Optical techniques are preferred for the initial signal generation because the electromagnetic interference is extreme. Usually, the shock velocity is measured for a reference material and several samples mounted on the reference material target plate. Impedance matching is used to derive the particle velocities of the samples from the known equation of state of the reference material.
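One common way to carry out impedance matching can be sketched as follows. The sketch makes two assumptions that are not stated in the text: the reference material obeys a linear Hugoniot, Us = c0 + s·Up, and its release or reshock path is approximated by the mirror image of that Hugoniot about the reference particle velocity. The function name and all numerical values are hypothetical.

```python
import math


def impedance_match(rho_ref, c0_ref, s_ref, us_ref, rho_sample, us_sample):
    """Return (up_sample, p_sample) from measured shock velocities.

    Densities in g/cm^3 and velocities in km/sec give pressures in GPa.
    Assumes a linear reference Hugoniot Us = c0 + s*Up and the
    reflected-Hugoniot (mirror image) approximation for the reference.
    """
    # Particle velocity in the reference from its known Hugoniot.
    up_ref = (us_ref - c0_ref) / s_ref
    # Let w = 2*up_ref - up_sample. The mirrored reference Hugoniot is
    #   P = rho_ref * (c0 + s*w) * w,
    # and the sample Rayleigh line is
    #   P = rho_sample * us_sample * (2*up_ref - w).
    # Equating the two gives a*w^2 + b*w + c = 0.
    a = rho_ref * s_ref
    b = rho_ref * c0_ref + rho_sample * us_sample
    c = -2.0 * rho_sample * us_sample * up_ref
    w = (-b + math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)  # positive root
    up_sample = 2.0 * up_ref - w
    p_sample = rho_sample * us_sample * up_sample
    return up_sample, p_sample


# Consistency check with hypothetical numbers: if the "sample" is the
# reference material itself, the matched state must equal the reference state.
up_match, p_match = impedance_match(
    rho_ref=2.7, c0_ref=5.0, s_ref=1.3, us_ref=7.6,
    rho_sample=2.7, us_sample=7.6,
)
print(f"Up = {up_match:.3f} km/sec, P = {p_match:.2f} GPa")
```

The mirror-image approximation is only a convenience; a real analysis would use the reference material's actual release or reshock behavior where it is known.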

Access to nuclear tests is difficult, and although the technique appears to be comparatively simple, not only the sample, but most of the expensive signal processing equipment is lost in the process; thus, experiments are limited to materials of the greatest interest.

f. Electric rail guns. These devices represent a variation of the gas gun; however, here the force used to accelerate the projectile is electrical. The “gun” is made of two parallel, electrically conducting rails. The projectile, a conductor shorting the rails, rests between them. The system forms a linear dc motor, and the acceleration of the projectile is enhanced by the formation of a plasma behind it. Although several groups have experimented with this method, difficulties have prevented its fully successful implementation. Theoretical estimates indicate that ultimate impactor velocities as high as 40 km/sec should be achievable; these would lead to shock pressures up to 10 TPa.

3. Measurement Techniques

Evaluation of shock experiments requires the measurement of any two of the four shock variables: Up, particle velocity; Us, shock wave velocity; P, pressure; and V, specific volume. The three conservation laws (i.e., that of mass, linear momentum, and energy across the shock front) are then used to calculate the other two parameters:

\[ \rho_0/\rho_1 = 1 - U_p/U_s \]

\[ P_1 - P_0 = \rho_0 U_s U_p \]

\[ E_1 - E_0 = \tfrac{1}{2}(P_1 + P_0)(V_0 - V_1), \]

where E refers to the energy, ρ refers to the density, and the subscripts 0 and 1 refer to the variables ahead of and behind the shock front, respectively.
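The three conservation relations can be evaluated directly once the shock and particle velocities are measured. The sketch below is illustrative: the function name `hugoniot_state`, the unit choices, and the input values are assumptions made here. With densities in g/cm^3 and velocities in km/sec, pressures come out in GPa and specific energies in kJ/g.

```python
def hugoniot_state(rho0, us, up, p0=0.0):
    """Shocked state from the conservation relations.

    rho0: initial density (g/cm^3); us, up: shock and particle velocity
    (km/sec); p0: initial pressure (GPa). Returns a dict with the final
    density rho1 (g/cm^3), pressure p1 (GPa), specific volume v1 (cm^3/g),
    and specific internal energy change de = E1 - E0 (kJ/g).
    """
    if not 0.0 <= up < us:
        raise ValueError("require 0 <= Up < Us")
    rho1 = rho0 / (1.0 - up / us)        # mass: rho0/rho1 = 1 - Up/Us
    p1 = p0 + rho0 * us * up             # momentum: P1 - P0 = rho0*Us*Up
    v0 = 1.0 / rho0
    v1 = 1.0 / rho1
    de = 0.5 * (p1 + p0) * (v0 - v1)     # energy: E1 - E0
    return {"rho1": rho1, "p1": p1, "v1": v1, "de": de}


# Hypothetical measurement: rho0 = 2.0 g/cm^3, Us = 6.0 km/sec, Up = 2.0 km/sec.
state = hugoniot_state(rho0=2.0, us=6.0, up=2.0)
print(state)
```

For a negligible initial pressure the energy change reduces to Up^2/2, which offers a quick sanity check on any implementation.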

Shock velocities can be measured in several ways. The average velocity can be calculated from the sample thickness and the interval between the entry and exit of the shock front. Electrical pins that short upon the arrival of the shock front are used to obtain timing data. These pins can consist of an insulated conductor separated from a grounded metallic target or a metallic conductor in a plastic, insulated rod that is coated with metal forming a grounded sheath. The end of the rod is capped with a thin insulator, and the central conductor is charged with a dc voltage of up to several hundred volts. When the shock wave encounters the end cap, the center conductor is shorted to ground, generating a rapidly rising pulse on a line, which is then recorded. The separation of the bare contact from the grounded target or the thickness of the insulating cap is minimized, consistent with the need to prevent shorting for the level of readout voltage used. If closing time is to be no more than 100 nsec, then the gaps must be no more than 10 µm wide.
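The timing arithmetic described above is straightforward; a minimal sketch follows. The function names and the sample thickness and timing values are hypothetical. The second function simply divides the quoted gap by the quoted closing time, which corresponds to a closure speed of 10 µm / 100 nsec = 100 m/sec.

```python
def average_shock_velocity(thickness_mm, t_entry_us, t_exit_us):
    """Average shock velocity in km/sec (numerically equal to mm/usec)."""
    transit = t_exit_us - t_entry_us
    if transit <= 0.0:
        raise ValueError("exit time must follow entry time")
    return thickness_mm / transit


def implied_closure_speed_m_per_s(gap_um, closing_time_ns):
    """Speed at which a pin gap closes; 1 um/nsec equals 1000 m/sec."""
    if closing_time_ns <= 0.0:
        raise ValueError("closing time must be positive")
    return gap_um / closing_time_ns * 1000.0


# Hypothetical record: 5-mm sample, pins fire at 0.25 and 1.25 usec.
us_avg = average_shock_velocity(5.0, 0.25, 1.25)

# Figures quoted in the text: 10-um gap, 100-nsec closing time.
closure_speed = implied_closure_speed_m_per_s(10.0, 100.0)

print(f"Us = {us_avg:.2f} km/sec, closure speed = {closure_speed:.0f} m/sec")
```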

When electrical noise is a problem, optical pins, consisting of small microspheres containing a pressurized gas, such as Ar, are affixed to the end of optical fibers. As the shock front passes, the trapped gas emits light that is carried along the fibers to remote detection equipment. These pins can also be multiplexed onto a single streak camera record.

Pins of either type can be placed at various steps in the target or in holes of carefully determined depth. They should, of course, be separated by sufficient distance so that the release wave generated by one hole or step will not interfere with the readings of adjacent pins.

Electrically conducting targets can be used as one plate of a capacitor; in this case, the velocities are determined by measurement of the variations in capacitance as the target moves. This technique provides information about the position, the free-surface velocity, and structure of the wave front.

Application of a strong, uniform magnetic field orthogonal to both the direction of shock propagation and to a conductor through which the shock will pass will generate an electrical voltage when the conductor is moved by the shock. The large magnetic field requirements limit more routine application of this technique.

The shock arrival at the rear surface can be detected optically for higher shocks because of the intense heating of the target; this produces a bright flash of visible radiation that can be detected with a streak camera. Since it may be difficult to establish the timing of the impact, a stepped target is often used that provides two signals of shock arrival with a well-defined separation, which, along with the timing information from the streak record, gives the shock velocity. This concept is also employed by using gaps that are prefilled with a gas, such as air, Ar, or Xe, or a liquid, such as CCl4. These materials all emit light when impacted by a shock wave; the brightness of the light increases with the intensity of the shock, thus making this technique useful in the lower pressure regions. If lucite is used as the transparent rear cover for the flash gap, it becomes opaque when the shock reaches it, quenching the signal to the recording device.

Some polished surfaces change their reflectivity when shocked. This allows the use of mirrors at the impact and rear surfaces of the target. A streak camera is used to record the shock events. The shock-opacity principle can be applied here also: when the shock wave reaches the rear surface, the reflectivity of the polished surface is greatly altered and readily detected.

Optical measurements are frequently carried out in evacuated systems in order to minimize the effects of air shocks around the rear or free-surface of the target. Another type of optical technique involves using the mirrored rear surface of the target as an element in an optical lever system. With this method, a streak camera is used to record reflections of a series of point light sources recording the passage of a shock wave, or the motion of the image of a fine wire by the moving free surface. If a liquid cell is used, a thin foil may be suspended in it at some angle to the expected shock front. The shock velocity in the fluid can be determined, if the fluid remains transparent and the foil remains reflective when shocked.

Lasers, because of their brightness, monochromaticity, and coherence, have led to many new measurement techniques. They can be used as light sources in those techniques in which brilliance is important. Their monochromaticity allows them to be used in interferometers for velocity measurements. Interference fringe patterns are established when partially reflected light from the front surface of the target interferes with light reflected from the moving face at the rear. After the effects of the initial shock pass, the steady fringe reading corresponds to the particle velocity in the window material. The timing of the passage of the shock through the window material gives the shock velocity. These two pieces of data define a point on the Hugoniot curve.

Since X-ray interactions with matter generally involve low-level electronic processes, they are usually considered to be unaffected even by severe shock waves. Pulse X-ray sources are needed to study transient phenomena. Laser-generated plasmas have been used as pulse sources.

Since pressure is also one of the primary variables, several methods have been used to obtain estimates of its value in the shock front. The variation of the electrical resistance of manganin wire has long been used as a means of measuring static pressures; the linearity of this variable with pressure has been shown to be valid to 30 GPa under dynamic conditions. The resistance of other materials will also vary with pressure, but their resistance–pressure curves are either not linear or they involve large temperature coefficients. Both tourmaline and quartz have been used as piezoelectric pressure transducers. They have been calibrated up to 2 GPa, probably the limit of their elastic behavior. Polyvinyl fluoride is a polymer film that can be poled to give a piezoelectric gauge, which is finding application. Ferroelectrics, such as lead zirconate titanate, have also been used.

Another electrical technique is based on the observation that many materials either lose polarization or become polarized upon the passage of a shock. In the first case, electrodes on the two faces of the material being shocked will generate a voltage through an external resistance; the value of the voltage will be proportional to the pressure and sample area and inversely proportional to the sample thickness.

Since each of these techniques requires electrical contact to sensors in the shock regime, they are each prone to many difficulties, including shorting by shock-induced conductivity in the gauge or support material, or loss of contact.

A number of other phenomena have been noted in shock studies; some of these are unique. Two examples are phase transformations, either reversible (and therefore present only in the compressed state) or metastable (such as the graphite-to-diamond transition), and changes in conductivity, particularly of semiconductors and insulators that become conducting in the vicinity of the shock. Shifts in the Curie point with both pressure and temperature lead to shock-induced demagnetization of ferromagnetic materials. Luminescence has been observed for some materials, while some transparent materials such as polymethyl methacrylate (PMMA) and NaCl become opaque during shock compression. In some cases, the mechanical strength of a substance increases markedly upon the passage of a shock. Post-shock studies span the full range of possibilities and are considered below.

III. MEASUREMENT OF PRESSURE

A. Absolute Pressure Scales

An absolute determination of pressure requires a knowledge of either the force and the area over which it is applied or the variation of the free energy with change in volume. The free piston gauge, or dead-weight piston gauge, makes use of the former. Presuming an exact knowledge of the piston area, the force is fixed by a set of calibrated weights balanced on the piston. Friction at the piston-cylinder interface is minimized by rotation of the piston.
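The principle of the dead-weight gauge, pressure as force over area, can be sketched in a few lines. This is a deliberately simplified illustration: it ignores buoyancy, local variations in gravity, and the pressure distortion of the piston and cylinder discussed in the text. The function name and the example load and area are hypothetical.

```python
STANDARD_GRAVITY_M_PER_S2 = 9.80665  # conventional standard value


def piston_gauge_pressure_pa(mass_kg, piston_area_m2,
                             gravity=STANDARD_GRAVITY_M_PER_S2):
    """Pressure (Pa) supported by a dead-weight load on a piston.

    Simplified: P = m * g / A, with no buoyancy or distortion corrections.
    """
    if piston_area_m2 <= 0.0:
        raise ValueError("piston area must be positive")
    return mass_kg * gravity / piston_area_m2


# Hypothetical load: 100 kg on a piston of 1 cm^2 (1e-4 m^2) cross section.
gauge_pressure = piston_gauge_pressure_pa(100.0, 1.0e-4)
print(f"{gauge_pressure / 1e6:.5f} MPa")
```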

Several techniques are used to compensate for distortions at elevated pressures: (1) restriction of the extension of the cylinder under pressure by a separately pressurized jacket surrounding the cylinder; (2) calculation of changes from engineering principles; (3) comparison of two identical systems using different materials. These systems are limited in pressure by the strength of the materials used in their fabrication; with tungsten carbide, measurements are routinely reported to 2.5 GPa, although some work has been carried to pressures as high as 6 GPa. In a recent program to compare pressure scales between 13 international metrology laboratories, variances in raw data amounted to no more than 78 ppm. Although this is within the combined uncertainties, the derived calibration constants (i.e., the slopes of calibration curves) disagreed in a more marked manner, suggesting fundamental differences, particularly between the controlled displacement and other systems.

A thermodynamic pressure scale, similar to the thermodynamically defined temperature scale, would be another approach to an absolute pressure scale. Although this has been proposed, it has not been implemented as of this writing. The volume can be defined as the variation of the free energy, G, with pressure at constant temperature. Using an electrochemical cell, this is also proportional to the variation of the cell potential, φ, with pressure; that is,

\[ V = (\partial G/\partial P)_T = -n(\partial \phi/\partial P)_T, \]

where the parameter n is dependent on the particular electrochemical cell employed. This method has been infrequently used. One of the more difficult problems is the variation in the ionic conductivity in the cell as the pressure is varied.

B. Secondary, Practical Scales

Like its thermodynamical counterpart, temperature, pressure is most frequently determined from the variation of some previously calibrated state coordinate. Variations in crystallographic volume, electrical resistivity, or fluorescence wavelengths have all been used with success.

1. Equations of State

The equation of state (EOS) of a material relates the three thermodynamical variables, pressure, volume, and temperature, over a range of each. Although work has been underway to develop such equations from first principles calculations (i.e., from basic physical laws and interatomic potential functions), most materials today are treated with semi-empirical EOSs. These employ one or a few parameters derived from measurements, for example, the volume dependence of the Grüneisen parameter.

Shock wave data have been used for EOS development at much higher pressures and temperatures than those encountered in static experiments, for example, to temperatures of thousands of kelvins and pressures of hundreds of gigapascals. Researchers have carried out combined studies on the same materials at the upper end of available static pressures and the lower end of the dynamic pressure range to provide reliable calibration materials.

2. Calibrated Fixed Points

Structural phase transitions that occur in materials and can be readily detected from discontinuous changes in their physical properties are often used as specific pressure calibration points. For example, the freezing pressure of Hg at 0 °C, which can be detected from a discontinuous change in the volume, has been accepted to be 0.75692 GPa for many years. Also, Bi exhibits discontinuous changes in its electrical resistivity at room temperature at 2.5499 and 7.7 GPa. One difficulty in employing these phase transitions is that they can sometimes be sluggish and can be affected by the hydrostatic nature of the pressure.

3. Other Pressure Scales

a. Ruby fluorescence. An excellent secondary pressure scale that has become very popular with the extensive use of the diamond-anvil cell is one based on the pressure-induced shift in the wavelength of the R2 fluorescence peak from ruby. Since temperature will also cause a shift in this wavelength, a thermal correction factor of 0.068 Å/K must also be applied. Based on EOS studies with a number of materials (NaCl to 30 GPa and Ag, Cu, Mo, and Pd to 100 GPa), the ruby line shift is given by

\[ P(\mathrm{GPa}) = 380.8\left[(\lambda/\lambda_0)^5 - 1\right], \]

where λ and λ0 are the ruby fluorescence wavelengths at elevated and atmospheric pressures, respectively. Based on Au and Cu EOS studies, the foregoing calibration has been found valid to 200 GPa to within 5%. More recent experiments in a diamond-anvil cell have led to an extension of the ruby scale to ∼500 GPa, although the ruby signal tends to be obscured by diamond fluorescence in the 150–300 GPa range.
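The ruby calibration lends itself to a short numerical sketch. The function below evaluates P = 380.8[(λ/λ0)^5 − 1] with wavelengths in angstroms and optionally applies the 0.068 Å/K thermal correction. The sign convention of the correction (the line is assumed to move to longer wavelength on heating), the function name, and the example wavelengths are assumptions made for illustration only.

```python
def ruby_pressure_gpa(wavelength, wavelength0, temp_k=None, ref_temp_k=None,
                      thermal_shift_per_k=0.068):
    """Pressure (GPa) from the ruby line shift, P = 380.8*[(l/l0)**5 - 1].

    wavelength: measured ruby line at pressure (angstroms).
    wavelength0: ruby line at atmospheric pressure (angstroms).
    temp_k, ref_temp_k: if both are given, the measured wavelength is
    corrected to the reference temperature before the pressure is computed.
    """
    if wavelength0 <= 0.0:
        raise ValueError("reference wavelength must be positive")
    if temp_k is not None and ref_temp_k is not None:
        # Assumed sign: heating lengthens the wavelength, so remove that part.
        wavelength = wavelength - thermal_shift_per_k * (temp_k - ref_temp_k)
    return 380.8 * ((wavelength / wavelength0) ** 5 - 1.0)


# Hypothetical reading: line observed at 6970 A against an assumed
# ambient-pressure value of 6942 A, both at the same temperature.
p_example = ruby_pressure_gpa(6970.0, 6942.0)

# A purely thermal shift (10 K warmer, no applied pressure) should give ~0 GPa.
p_thermal_only = ruby_pressure_gpa(6942.68, 6942.0, temp_k=310.0, ref_temp_k=300.0)

print(f"P = {p_example:.2f} GPa, thermal-only check = {p_thermal_only:.2e} GPa")
```

Passing λ0 explicitly keeps the sketch independent of any particular ruby line or reference value.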

b. Electrical resistance. Calibrated electrical resistance sensors have long been used as temperature gauges. Similar devices have also been introduced as pressure sensors. The ideal material for pressure calibration would have a small thermal coefficient of resistivity, a minimal hysteresis, and preferably a linear pressure response over as large a range as possible. Manganin, a Cu–Mn alloy, is a material that has long been used for this purpose. Unfortunately, the pressure coefficients tend to vary somewhat from sample to sample, and therefore, each lot must be separately calibrated. Manganin pressure gauges have been used in static systems to pressures of 6 GPa. They have also been used in shock work, but significant corrections for shock temperatures and errors derived from the creation of point defects tend to limit the accuracy of the results.

Other materials used for this purpose include Au–Cr with about 2% Cr and zeranin, formed of Cu, Mn, and Ge. These materials were originally used as resistance standards. The first has an appreciable temperature coefficient that limits its utility; the latter has not been in use for as long, but seems to have the same desirable qualities as manganin in addition to a more rapid recovery and a higher resistance to oxidation.

c. Superconducting transition temperatures. Early calibration techniques at cryogenic temperatures grew directly from interest in the effect of pressure on superconducting transition temperatures, Tc. These measurements can be performed by monitoring either the sample resistance, which drops to zero at Tc, or the magnetic susceptibility, which will change abruptly at Tc due to the Meissner effect.

In, Pb, and Sn have all been calibrated and used in this regard. However, one difficulty in this work is the lack of other low-temperature calibrants against which to fix the Tc’s. Typically, a known fixed pressure is sealed in a bomb at room temperature, after which the system is cooled to determine the shift in Tc. The problem is that there is likely to be a change in sealed pressure due to differences in the thermal expansivity of the various components of the system. It is difficult to accurately correct for this change.

d. Semiconductors. Since the resistivity of semiconductors is also sensitive to pressure, they too can be used as pressure calibrants. These have the advantage that they will typically be significantly more responsive than metallic resistors; however, they also tend to have large temperature coefficients. In one case, Sb-doped Te has been shown to have an exponentially decreasing resistivity up to 0.65 GPa: specifically, the ratio of the resistance at pressure P to that at atmospheric pressure has been reported to be exp[−1.122 P(GPa)]. Doping GaAs has the effect of decreasing its temperature sensitivity while increasing its pressure sensitivity to a value approaching that of manganin.
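The exponential relation quoted for Sb-doped Te can be inverted to read a pressure from a measured resistance ratio. The sketch below does only that and refuses values outside the reported 0.65-GPa range. The function and constant names are hypothetical, and temperature effects, which the text notes can be large, are ignored.

```python
import math

TE_COEFFICIENT_PER_GPA = 1.122   # exponent coefficient quoted in the text
TE_MAX_PRESSURE_GPA = 0.65       # upper limit of the reported range


def pressure_from_resistance_ratio(ratio):
    """Pressure (GPa) from R(P)/R(atmospheric) = exp(-1.122 * P)."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("resistance ratio must lie in (0, 1]")
    pressure = -math.log(ratio) / TE_COEFFICIENT_PER_GPA
    if pressure > TE_MAX_PRESSURE_GPA:
        raise ValueError("ratio implies a pressure above the reported range")
    return pressure


# Round-trip check with a ratio constructed for 0.5 GPa.
example_ratio = math.exp(-TE_COEFFICIENT_PER_GPA * 0.5)
example_pressure = pressure_from_resistance_ratio(example_ratio)
print(f"ratio = {example_ratio:.4f} -> P = {example_pressure:.3f} GPa")
```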


e. Capacitance. Pressures can also be determined from variations in capacitance. In this case the dimensions of the solid dielectric used in the capacitor will change in both area and thickness as a function of increasing pressure. These gauges will typically have a resolution as small as 0.07/MPa. However, they too must be operated under conditions of extreme temperature control and necessitate at least three electrical contacts into the pressure chamber. These requirements, along with their relatively low range of applicability, have limited the use of such gauges.

IV. RESEARCH AT HIGH PRESSURE

High-pressure research involves examining the response of materials to the environment of extreme pressures. This examination, usually in the form of a measurement of one or more physical parameters, can be made either in situ (i.e., while the pressure is changing) or upon completion of the pressure excursion. Although the latter experiments are easier to perform, since they do not require access to the high-pressure environment during pressurization, they have the drawback of requiring a metastable product, that is, one that survives the release of pressure.

The objective of most of this research is to assess the effect of altering interatomic distances on a variety of physical phenomena, for example, local, short-range order; crystalline structure; electrical resistivity; and mechanical, magnetic, and optical properties. In short, most phenomena that can be measured at atmospheric pressure can also be studied at elevated pressures.

A. Structural Measurements

One of the most fundamental properties of a condensed matter system is the arrangement of the atoms of which that system is formed. In the majority of materials, this is the crystallographic structure. A wealth of high-pressure research is directed toward the examination of changes in crystal structure under pressure. Such studies generally involve the application of basic X-ray diffraction techniques. A material is illuminated with either monochromatic or heterochromatic X-rays, and the scattered photons are analyzed to determine crystal structure. This can be accomplished through utilization of Bragg’s equation:

$E\,d_{hkl}\,\sin\theta = hc/2,$

where $E$ is the energy of the X-ray photon, $d_{hkl}$ is the spacing between the $(hkl)$ lattice planes, $\theta$ is the Bragg diffraction angle, $h$ is Planck’s constant, and $c$ is the speed of light. If monochromatic radiation is used, then $E$ is fixed and the scattered radiation must be analyzed over a spatial range to determine the values of $\theta$ for which the Bragg equation is satisfied. Alternatively, if heterochromatic or white radiation is used, then the diffraction geometry (i.e., $\theta$) is fixed and the scattered photons are analyzed in terms of their respective energies.
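The two modes of analysis described above can be sketched numerically from the energy form of Bragg’s equation. This is a minimal illustration: the function names are assumptions made here, and the value of hc (about 12.398 keV·Å) is a rounded constant supplied for the example rather than taken from the text.

```python
import math

# hc in keV*angstrom (rounded value of Planck's constant times the speed of light).
HC_KEV_ANGSTROM = 12.39842


def d_spacing(energy_kev: float, theta_deg: float) -> float:
    """Interplanar spacing d_hkl (angstrom) from E * d * sin(theta) = hc / 2."""
    return HC_KEV_ANGSTROM / (2.0 * energy_kev * math.sin(math.radians(theta_deg)))


def bragg_energy(d_angstrom: float, theta_deg: float) -> float:
    """White-beam case: photon energy (keV) diffracted by spacing d at fixed theta."""
    return HC_KEV_ANGSTROM / (2.0 * d_angstrom * math.sin(math.radians(theta_deg)))


def bragg_angle(d_angstrom: float, energy_kev: float) -> float:
    """Monochromatic case: Bragg angle theta (degrees) at fixed photon energy."""
    return math.degrees(math.asin(HC_KEV_ANGSTROM / (2.0 * energy_kev * d_angstrom)))
```

For instance, with these definitions a plane spacing of 2 Å observed at a fixed angle of θ = 5° diffracts photons of roughly 35.6 keV, which illustrates why energy-dispersive work at small fixed angles calls for fairly hard X-rays.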

An important experimental detail in this work is to provide a window into the pressurized region with a relatively low absorption coefficient for X-rays. As noted above, the diamond anvils of the DAC satisfy this requirement very well: the X-ray absorption cross section for carbon decreases by a factor of 4000 over the photon energy range from 1 to 20 keV. Other materials that also have been used as high-pressure X-ray windows are pyrophyllite and B4C. The latter material has the advantage of being amorphous and therefore will not contribute substantially to the diffraction pattern of the sample.

Nevertheless, it is often difficult to bring a sufficient number of X-ray photons in and out of the pressurized environment in a short time period. Consequently, X-ray diffraction experiments can require very long exposure periods. Typically, tens to hundreds of hours are required for a single measurement with a DAC when conventional X-ray tubes are employed.

One means of accelerating this process is to employ a much brighter source of radiation. In the past few years, most of the high-energy synchrotron storage rings in the world have been used for high-pressure structural research. Since the X-ray flux available with these machines is many orders of magnitude greater than that available with conventional radiation sources, the measurements can usually be completed in much shorter time intervals; exposure periods of minutes or seconds are typical. In addition to speeding up the entire process, these brighter X-ray sources also permit measurements that would not otherwise be feasible, for example, phase transition kinetics. Experiments performed recently at the Stanford Synchrotron Radiation Laboratory involved monitoring the structure of a number of alkali halide salts in 1-min time intervals as they were driven through a first-order structural phase transition with increasing pressure. Parameters in the equations describing these phase transitions can be determined from these measurements.

In the quest for higher pressures, researchers using DACs are employing diamond anvils with beveled tips; the consequence of this, in addition to pressures in excess of 500 GPa, is a further diminution of the sample volume. Thus, even brighter X-ray sources will be needed for future work. Efforts have been made to employ devices inserted in the synchrotron rings to further increase the emitted photon flux, so-called wigglers and undulators, for materials studies on samples contained in a region only a few microns in diameter.


Another type of high-pressure X-ray measurement that has recently been advanced through the utilization of synchrotron radiation sources is EXAFS, or extended X-ray absorption fine structure. Analysis of X-ray absorption data is complementary to standard crystallography in that it allows determination of nearest- and next-nearest-neighbor distances of specific atoms, coordination numbers, and thermal vibrational properties. A difficulty encountered in EXAFS measurements performed with a DAC is the presence of Bragg or Renninger scattering in the absorption spectrum. To circumvent these problems, amorphous materials such as B4C are used to contain the pressure and provide an X-ray window.

These structural measurements are pursued for several scientific reasons: to determine compressibilities or, equivalently, bulk moduli; to detect structural phase transitions; and to identify new crystalline phases. Frequently, experimental work is closely coordinated with theoretical model calculations, the latter often predicting the existence of possible new and interesting phases.
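To illustrate the first of these aims, the isothermal bulk modulus, K = −V dP/dV, can be estimated from a set of measured volumes at known pressures. The sketch below uses simple central differences; the function name is an assumption made here, and the data are synthetic values generated for a hypothetical solid with a constant bulk modulus, not measurements.

```python
import math
from typing import List, Sequence


def bulk_modulus(pressures: Sequence[float], volumes: Sequence[float]) -> List[float]:
    """Estimate K = -V dP/dV at interior points by central differences.

    The returned moduli carry the same unit as the pressures (e.g. GPa).
    """
    if len(pressures) != len(volumes) or len(pressures) < 3:
        raise ValueError("need at least three matched (P, V) points")
    moduli = []
    for i in range(1, len(pressures) - 1):
        dp = pressures[i + 1] - pressures[i - 1]
        dv = volumes[i + 1] - volumes[i - 1]
        moduli.append(-volumes[i] * dp / dv)
    return moduli


# Synthetic, illustrative data only: a solid with constant bulk modulus K0,
# for which V(P) = V0 * exp(-P / K0). These are not measured values.
K0_GPA = 100.0
pressures_gpa = [0.5 * i for i in range(11)]               # 0 to 5 GPa
volumes = [math.exp(-p / K0_GPA) for p in pressures_gpa]   # V / V0
estimates = bulk_modulus(pressures_gpa, volumes)
```

In practice the volumes would come from refined lattice parameters, and a fit to an equation of state would usually replace the finite differences used in this sketch.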

B. Electrical Measurements

1. Nonsuperconducting Materials

Electrical measurements are usually directed toward monitoring the electrical resistance of a sample under pressure, although other measurements may also require the incorporation of electrical connections to the pressure chamber. For example, the pressure dependences of elastic moduli have been measured ultrasonically with transducers bonded to the pressurized sample. The most difficult aspect of this work is providing a feed-through that will not significantly attenuate the electrical signal and will contain the pressure. In some cases the anvils themselves are used as probes. For example, as noted above, tetrahedral presses are constructed with each of the four rams electrically isolated, so four-probe resistance measurements can readily be carried out.

The electronic properties of solids are related to the characteristics of the ground state and various excited state energy levels. Since these levels can be affected by pressure, much of the research in this area is concerned with how the states will move under pressure. Professor H. G. Drickamer of the University of Illinois has introduced the phrase “pressure tuning of electronic energy levels.”

An area that is currently receiving considerable attention in this regard involves pressure-induced metallization. Although this phenomenon has been studied in a number of materials, perhaps none holds more interest than the prospect of metallizing hydrogen. Metallic hydrogen was originally predicted by Wigner and Huntington in the 1930s; recent model calculations indicate that at elevated pressures, hydrogen will undergo two transitions from its highly ordered diatomic insulator state, first to a nonmetallic monatomic structure, followed by a transition to a metallic state. Unfortunately, there are many possible high-density phases for hydrogen, all with very similar values for the free energy. Therefore, it is difficult to forecast, with certainty, the properties of metallic hydrogen or whether it may be metastable under normal pressure and temperature conditions once it is formed. The imagination and excitement of scientists throughout the world, however, have been captured by some serious theoretical predictions of very high temperature superconductivity in metallic hydrogen, if it can be created.

As a prelude to this, other systems that are more easily studied are being examined. Iodine, for example, at pressures of 13 to 17 GPa undergoes an insulator–metal transition, and at about 21 GPa it converts from a diatomic to a monatomic conductor. Similar studies on other halogens would also seem appropriate (i.e., the metallization and subsequent dissociation in Br2, Cl2, and possibly F2). Materials that have been metallized under pressure include BrI, HI, CsI, BaTe, BaSe, BaS, and BaO.

Another interesting pressure-induced electronic phenomenon involves valence changes. The valence states in Yb, Eu, Pr, Sm, and Ce can all be shifted through application of pressure. In Ce, this valence change manifests itself in a very interesting, isomorphic phase transition. As the material changes valence state, there is a first-order phase transition with approximately a 15% volume change but no change in crystalline structure (i.e., each phase is face-centered cubic).

2. Superconducting Materials

The practical applications of superconducting materials are almost limitless. Power transmission lines, electrical motors and generators, magnetic levitation (for example, for high-speed transit systems), and computer electronics are but a few. A major problem limiting the utilization of superconducting wires is the simple fact that the best known superconductor, Nb3Ge, has a transition temperature, Tc, no higher than 23.2 K. Any realistic application of superconductivity must therefore also include an appropriate means for cryogenic refrigeration, and in most cases the gains that would otherwise be realized are obviated by the temperature demands.

Since there is no theoretical reason why the cooperative electron–phonon coupling mechanism that is required for normal (BCS) superconductivity cannot take place at higher temperatures, there are major research programs underway worldwide to discover new materials with higher Tc’s. Guidelines for these efforts can be either empirical or theoretical. As an example of the former, in V-based A3B compounds that crystallize in the A15 structure, such as V3Sn, V3Ge, and V3Si, it has been noted that a plot of Tc vs $m_B^{-1/2}$, where $m_B$ is the mass of the B-atom, tends to be linear. On the basis of this it has been predicted that the Tc of Nb3Si should surpass that of Nb3Ge. Since the A15 phase of Nb3Si is expected to be denser than its normal Ti3P structure, it is expected that pressure should favor conversion of Nb3Si to the A15 structure. Shock experiments have produced this transformation. Unfortunately, the measured Tc is not as high as expected; the explanation is believed to lie in defects introduced during the shock conversion.

One of the most promising classes of materials in terms of potential high-Tc superconductors is the recently discovered organic salts. These are often one-dimensional or quasi-one-dimensional and are promising because they do not involve the usual electron–phonon (BCS) mechanism for superconductivity. Rather, the electron–electron interactions are mediated through excitations of a Peierls ground state or excitons. The first organic superconductors are from the ditetramethyltetraselenofulvalenium, (TMTSF)2X, family, where X represents a suitable anion.

At room temperature and ambient pressure, all of these salts, independent of the anion, exhibit the same crystallographic structure, electrical resistivity, and thermoelectric power. However, at low temperature and/or high pressure, a variety of ground states may exist. Many of these salts have a spin density wave at low temperature and ambient pressure, but at elevated pressures of 0.6–0.8 GPa, they become superconducting with a Tc of 1.2 K. One of these salts, (TMTSF)2ClO4, becomes superconducting at ambient pressure with about the same Tc.

Other TMTSF-salts exhibit high-temperature metal–insulator transitions associated with anion ordering. However, it has been demonstrated that pressure can be used to effectively suppress these transitions, thereby leaving the material in the metallic state at low temperatures. In these cases the materials will undergo a superconducting transition. These salts also appear to pass through a glassy phase in which the resistivity coefficient is negative at intermediate temperatures; they then undergo superconducting transitions at lower temperatures. One of these, (TMTSF)2FSO3, exhibits the highest values of Tc known to date for organic compounds, ≥3 K.

The variables that determine under what conditions superconductivity occurs or the nature of the “glassy” phase are not at all understood. The facts that the proposed spin density wave ground state of these systems is very sensitive (1) to pressure and (2) to the type of anion used suggest that this is a structural issue. For this reason, it would be important to carry out extended high-pressure, single-crystal, X-ray diffraction experiments on many of these organic salts at cryogenic temperatures. It is believed that such information would prove extremely useful in clarifying the origin of superconductivity in these materials.

C. Melting/Freezing Phenomena

The transition between the solid and liquid phases of matter is perhaps one of the most important and least understood in the field. It is a problem that touches a broad cross section of disciplines: condensed matter physics, rheology, metallurgy, and the geosciences. From a technological viewpoint, an understanding of these issues is very important for a variety of materials-related industries, such as those dealing with semiconductor devices, ceramics, and optical components. For example, it is still not known why pressure will enhance the crystal growth rate in some systems.

The pressure dependence of the melting temperature, Tm(P), has been measured for a wide variety of materials. It is generally found that dTm/dP is positive and the Tm(P) curve is usually fit to some phenomenological or empirical relation, for example, the Lindemann equation. Although recent advances in first-principles calculations of Gibbs free energies from effective interatomic or intermolecular potential functions have been most encouraging, a microscopic theory explaining the solid–liquid transition is still lacking.

An area related to this concerns crystallization phenomena in amorphous solids, for example, metallic glasses. “Met-glasses” represent an important new class of materials with certain improved physical properties, for example, superior radiation and corrosion resistance. It has been demonstrated that hydrostatic pressure can raise the crystallization temperature of metallic glasses by 10–20°/GPa; this effect is still not understood. Moreover, detailed structural analyses of high-pressure crystallization phenomena have yet to be carried out. A number of outstanding questions in this area remain to be answered.

1. Why does pressure inhibit crystallization? Is this related to the instability of favorable crystalline phases thought to allow formation of metallic glasses from the melt?

2. Are the crystalline phases produced at high pressure different from those formed at ambient pressure, and, if so, what does this imply?

3. What are the changes that occur on cooling and release of pressure, and can they be understood in terms of more subtle structural changes that occur during heating?

4. What is the crystallization nucleation mechanism, and are there precrystallization phenomena that may shed light on the mechanisms involved?


D. Materials Modification by Shock Treatments

It is recognized that the state of some materials can be significantly altered through application of shock pressures. Although no comprehensive theory has yet been developed to explain this phenomenon, it is presumed that the rapid, massive shear deformation induced by the shock waves and the concomitant defect state that follows are responsible for what has been perceived to be a unique state of matter in the post-shock material.

Anomalous behavior in post-shocked materials has been identified in a number of separate areas: enhanced reactivity in, for example, structural phase transitions, chemical reactions, and sintering processes; enhanced atomic migration as seen in radioactive tracers and thin films of materials deposited at interfaces; shock-induced polarization in polymers and ionic crystals; shock-induced opacity in optical materials; bleaching of color centers; formation of color centers; anomalous shifts in absorption bands; and saturation of dislocation densities without deformations.

It is likely that many, if not all, of these anomalies are related to the massive defect state left in the wake of the shock wave. A wide variety of micro- and macrostructural effects have been observed. Dislocation multiplication, twinning, and void formation are examples of some of the small-scale effects; spallation and flow are prominent among large-scale effects of shock loading.

Shock effects influence the surface and near-surface characteristics of materials. The surface hardness of 2024 Al alloys was found to increase with increasing shock pressures produced by flyer plates. Explosive welding leads to hydrodynamic and thermodynamic interpenetration of colliding metallic surfaces. Powders that have high strength and resist ordinary consolidation, or that are chemically unstable and thus cannot be sintered, can be compacted with shocks. Interiors of the particles remain relatively cold, while the surfaces are heated to cause interdiffusion, welding, or melting. More ductile materials may be used to bond together stronger particles or fibers in order to form composite materials. A strong and oscillatory dependence of the hardness of Au–Ge alloys has been observed for shock treatments with durations of 0.1–1 µsec.

Although the precise mechanisms are not yet understood, there is clear evidence to indicate that shock waves do, in fact, produce an altered state in many materials. The classic means of subjecting materials to extreme shock states is chemical explosive techniques: detonation of contacting explosives, impact by explosively driven flyer plates, or impact by gun-driven projectiles. However, shock conditions can also be achieved by subjecting materials to intense pulses of radiation, for example, from electron beam accelerators, X-ray or neutron sources, or high-intensity lasers. The pulsing methods hold an important advantage of relatively high repetition rate, in contrast to methods dealing with explosives.

Perhaps an even more important advantage of laser-driven shock studies is that the same laser pulse can be used to produce both a shock and a high-temperature plasma that will emit X-rays to probe the shock. Exquisite timing will be possible using time-of-flight delay methods. These advantages suggest that laser-driven shock states may be uniquely useful in some material processing or testing purposes. Hardening, welding, and compaction have been demonstrated using shocks generated by explosives; the possibilities of using lasers for these purposes have yet to be assessed.

SEE ALSO THE FOLLOWING ARTICLES

DENSE MATTER PHYSICS • DIAMOND FILMS, ELECTRICAL PROPERTIES • HIGH-PRESSURE SYNTHESIS (CHEMISTRY) • PULSED POWER SYSTEMS • SUPERCONDUCTIVITY

BIBLIOGRAPHY

Akimoto, S., and Manghnani, M. H. (1982). “High Pressure Research in Geophysics,” Center for Academic Publications Japan, Tokyo.

Bridgman, P. W. (1958). “The Physics of High Pressure,” G. Bell and Sons, London.

Drickamer, H. G., and Frank, C. W. (1973). “Electronic Transitions and the High Pressure Chemistry and Physics of Solids,” Chapman & Hall, London.

Ferraro, J. R. (1984). “Vibrational Spectroscopy at High External Pressures,” Academic Press, Orlando, FL.

Hazen, R. M., and Finger, L. W. (1982). “Comparative Crystal Chemistry: Temperature, Pressure, Composition and the Variation of Crystal Structure,” Wiley, New York.

Jayaraman, A. (1983). “Diamond anvil cell and high pressure physical investigations,” Rev. Mod. Phys. 55, 65.

Minomura, S. (1985). “Solid State Physics under Pressure,” KTK Scientific Publishers, Tokyo.

Schilling, J. S., and Shelton, R. N. (1981). “Physics of Solids under High Pressure,” North-Holland, Amsterdam.

Skelton, E. F. (1978). “High Pressure Science and Technology in Japan,” Office of Naval Research, Arlington, VA.

Spain, I. L., and Paauwe, J. (1977). “High Pressure Technology,” Dekker, New York.


Encyclopedia of Physical Science and Technology EN007C-333 June 30, 2001 15:18

Impedance Spectroscopy J. Ross Macdonald University of North Carolina

I. Short History of Impedance Spectroscopy
II. Categories of Impedance Spectroscopy: Definitions and Distinctions
III. Elements of Impedance Spectroscopy
IV. Applications

GLOSSARY

Admittance A complex quantity usually symbolized by $Y = Y' + iY''$. It is the inverse of impedance and is sometimes called complex conductance. Here $i = +(-1)^{0.5}$, and the single and double primes denote in-phase and quadrature components, respectively.

Complex dielectric constant The ratio of the (complex) dielectric displacement to the small-signal AC electric field that induces the displacement. Conventionally written as $\varepsilon = \varepsilon' - i\varepsilon''$. It is given by $Y/(i\omega C_C)$, where $C_C$ is the capacitance of the empty measuring cell.

Complex forms Impedance spectroscopy data may be expressed in two different forms. Rectangular: $I = I' + iI''$, where $I'$ and $I''$ are the real and imaginary parts of $I$, respectively; or Modulus: $I = |I|e^{i\phi}$, where $|I|$ is the modulus, or absolute value, of $I$ and $\phi$ is its phase angle, or argument. Note that the complex conjugate of $I$ is $I^* = I' - iI'' = |I|e^{-i\phi}$.

Complex modulus $M = M' + iM''$. It is the inverse of the complex dielectric constant and is also equal to $i\omega C_C Z$.

Debye length A characteristic length that determines the extent of a space charge region near a discontinuity. It depends on temperature, dielectric constant, and the valence numbers and bulk concentrations of the mobile charges present. The diffuse double-layer capacitance present near a non-ohmic electrode is inversely proportional to the Debye length.

Immittance A general term denoting any of the four basic impedance spectroscopy response quantities: $Y$, $Z$, $\varepsilon$, or $M$.

Impedance The ratio of a sinusoidal voltage, applied across two terminals of a measurement cell, to the sinusoidal component of the current flowing between the terminals that results from the applied potential difference. Unless the system is purely resistive, impedance is a complex quantity because the current will have a different phase from the applied voltage: $Z = Z' + iZ''$.

IMPEDANCE SPECTROSCOPY (IS) is a general term that subsumes the small-signal measurement of the linear electrical response of a material of interest (including electrode effects) and the subsequent analysis of the response to yield useful information about the physicochemical properties of the system. Analysis is generally carried out in the frequency domain, although measurements are sometimes made in the time domain and then Fourier transformed to the frequency domain. Impedance spectroscopy is by no means limited to the measurement and analysis of data at the impedance level (e.g., impedance vs. frequency) but may involve any of the four basic immittance levels; thus, most generally, IS stands for immittance spectroscopy.
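The relations among the four immittance levels given in the glossary can be collected in a short sketch. The function names and the dictionary layout are assumptions made for illustration; only the defining relations (Y as the inverse of Z, M equal to iωC_C Z, and ε equal to Y/(iωC_C)) come from the definitions above.

```python
import cmath
import math
from typing import Dict, Tuple


def immittances(z: complex, freq_hz: float, c_cell: float) -> Dict[str, complex]:
    """Return the four immittance functions at one frequency.

    z       : impedance Z = Z' + iZ'' in ohms
    freq_hz : frequency f in Hz (omega = 2*pi*f)
    c_cell  : capacitance C_C of the empty measuring cell in farads
    """
    omega = 2.0 * math.pi * freq_hz
    y = 1.0 / z                      # admittance, Y = 1/Z
    m = 1j * omega * c_cell * z      # complex modulus, M = i*omega*C_C*Z
    eps = y / (1j * omega * c_cell)  # complex dielectric constant, Y/(i*omega*C_C)
    return {"Z": z, "Y": y, "M": m, "eps": eps}


def polar(value: complex) -> Tuple[float, float]:
    """Modulus form: return (|I|, phi) with I = |I| * exp(i*phi), phi in radians."""
    return abs(value), cmath.phase(value)
```

Because the dielectric constant is conventionally written with a minus sign, ε = ε′ − iε″, the loss term ε″ corresponds to the negative of the imaginary part of the value returned under the key "eps".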

I. SHORT HISTORY OF IMPEDANCE SPECTROSCOPY

Since IS deals directly with complex quantities, its history really begins with the introduction of impedance into electrical engineering by Oliver Heaviside in the 1880s. His work was soon extended by A. E. Kennelly and C. P. Steinmetz to include vector diagrams and complex representation. It was not long before workers in the field began to make use of the Argand diagram of mathematics by plotting immittance response in the complex plane, with frequency an implicit variable. Electrical engineering examples were the circle diagram introduced by C. W. Carter (1925) and the Smith-Chart impedance diagram of P. H. Smith (1939). These approaches were soon followed in the dielectric response field by the introduction in 1941 of the Cole–Cole plot: a plot of $\varepsilon''$ on the y (or imaginary) axis versus $\varepsilon'$ on the x (or real) axis. Such complex plane plots are now widely used for two-dimensional representation of the response of all four immittance types. Finally, three-dimensional perspective plots that involve a log-frequency axis were introduced to the IS area by the author and his colleagues in 1981; these plots allow complete response at a given immittance level to be shown in a single diagram.

Because IS analysis generally makes considerable use of equivalent circuits to represent experimental frequency response, the whole history of lumped-constant circuit analysis, which particularly flowered in the first third of the twentieth century, is immediately relevant to IS. Since then, much work has been devoted to the development of theoretical physicochemical response models and to the definition and analysis of various distributed circuit elements for use in IS-equivalent circuits along with ideal, lumped elements like resistance and capacitance. The preferred analysis method for fitting IS data either to equivalent circuits or to a mathematical model is complex nonlinear least squares fitting (CNLS), introduced to the field in 1977 by Macdonald and Garber. In this procedure, all the parameters of a fitting model are simultaneously adjusted to yield an optimum fit to the data.
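The idea of adjusting all parameters simultaneously against both the real and imaginary parts of the data can be illustrated with a deliberately simplified stand-in. The sketch below is not CNLS itself: it treats only a parallel RC circuit, for which the admittance Y = 1/R + iωC is linear in 1/R and C, so the complex least squares problem has a closed-form solution and no iteration or weighting is needed. The function name and the synthetic, noise-free data are assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple


def fit_parallel_rc(freqs_hz: Sequence[float],
                    z_data: Sequence[complex]) -> Tuple[float, float]:
    """Fit R and C of a parallel RC circuit to complex impedance data.

    Works at the admittance level, where Y = 1/R + i*omega*C is linear in
    G = 1/R and C, so minimizing sum |Y_data - Y_model|^2 over the real and
    imaginary parts together has a closed-form solution.
    """
    omegas = [2.0 * math.pi * f for f in freqs_hz]
    y_data = [1.0 / z for z in z_data]
    g = sum(y.real for y in y_data) / len(y_data)
    c = sum(w * y.imag for w, y in zip(omegas, y_data)) / sum(w * w for w in omegas)
    return 1.0 / g, c


# Synthetic, noise-free data (not measurements): R = 1 kOhm, C = 1 uF.
R_TRUE, C_TRUE = 1.0e3, 1.0e-6
freqs = [10.0 ** k for k in range(0, 6)]  # 1 Hz to 100 kHz
z_synth = [R_TRUE / (1.0 + 1j * 2.0 * math.pi * f * R_TRUE * C_TRUE) for f in freqs]
r_fit, c_fit = fit_parallel_rc(freqs, z_synth)
```

For models that are nonlinear in their parameters, which is the usual situation in IS, the same sum of squared real and imaginary residuals would instead be minimized iteratively, typically with a weighting scheme chosen to reflect the measurement errors.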

Early experimental work in the IS field is discussed in the book on IS listed in the bibliography (Macdonald, 1987). Here it will suffice to mention the work of Grahame on electrolyte double-layer response, the technique of AC polarography pioneered by D. E. Smith, and the electrolyte studies of Randles and Somerton, Sluyters and Oomen, R. P. Buck, and J. E. Bauerle. Since the late 1960s, IS has developed rapidly, in large part because of the availability of new, accurate, and rapid measuring equipment.

II. CATEGORIES OF IMPEDANCE SPECTROSCOPY: DEFINITIONS AND DISTINCTIONS

There are two main categories of IS: electrochemical IS (EIS) and everything else. EIS involves measurements and analysis of materials in which ionic conduction strongly predominates. Examples of such materials are solid and liquid electrolytes, fused salts, ionically conducting glasses and polymers, and nonstoichiometric ionically bonded single crystals, where conduction can involve motion of ion vacancies and interstitials. EIS is also valuable in the study of fuel cells, rechargeable batteries, and corrosion.

The remaining category of IS applies to dielectric materials: solid or liquid nonconductors whose electrical characteristics involve dipolar rotation, and to materials with predominantly electronic conduction. Examples are single-crystal or amorphous semiconductors, glasses, and polymers. Of course, IS applies to more complicated situations as well, for example, to partly conducting dielectric materials with some simultaneous ionic and electronic conductivity. It is worth noting that although EIS is the most rapidly growing branch of IS, nonelectrochemical IS measurements came first and are still of great value and importance in both basic and applied areas.

In the EIS area in particular, an important distinction is made between supported and unsupported electrolytes. Supported electrolytes are ones containing a high concentration of indifferent electrolyte, one whose ions generally neither adsorb nor react at the electrodes of the measuring cell. Such an added salt can ensure that the material is very nearly electroneutral everywhere, thus allowing diffusion and reaction effects for a low-concentration ion of interest to dominate the AC response of the system. Support is generally only possible for liquid electrochemical materials; it is often, but not always, used in aqueous electrochemistry. Solid electrolytes are unsupported in most cases of interest, electroneutrality is not present, and Poisson’s equation strongly couples charged species. Because of this difference, the formulas or models used to analyze supported and unsupported situations may be somewhat or completely different.

Another important distinction is concerned with static potentials and fields. In a material-electrode system without an applied static external potential difference (p.d.), internal p.d.s and fields are, nevertheless, generally present, producing space-charge layers at interfaces. For solids such regions are known as Frenkel layers and arise from the difference in work function between the electrode and the material. Because the static fields and charge concentrations in the material are inhomogeneous, exact small-signal solutions for the impedance of the system are impossible and numerical methods must be used.

In an electrolytic cell such static space-charge regions are only absent when the external static p.d. is adjusted so that the charge on the working electrode is zero, the point of zero charge (PZC), which is a flat-band condition. Such adjustment is impossible for systems with two symmetrical electrodes because an applied static p.d. increases the space-charge region at one electrode while reducing it at the other. But the use of a working electrode of small area and a large-area counter electrode ensures that the overall impedance of the system is little influenced by what happens at the counter electrode; in this situation the PZC can be achieved for the working electrode. In general, the current distribution near this electrode is frequency dependent and thus makes a frequency-dependent contribution to the overall impedance of the system, which is dependent on electrode geometry and character.

Figure 1 shows a flow diagram for a complete IS study whose goal is characterization of important properties of the material-electrode system from its electrical response, one of the major applications of IS. The experimental data are denoted by $Z_e(\omega)$, the impedance predicted by a theoretical fitting model by $Z_t(\omega)$, and that of a possible electrical equivalent circuit by $Z_{ec}(\omega)$, where $\omega = 2\pi f$ and $f$ is frequency. When an appropriate detailed model for the physicochemical processes present is available, it should certainly be used for fitting. Otherwise, one would employ an equivalent electrical circuit whose elements and connectivity were selected, as far as possible, to represent the various mass and charge transport physical processes thought to be of importance for the particular system.
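As a minimal sketch of what an equivalent-circuit impedance looks like in practice, consider one very simple hypothetical circuit: a resistance in series with a parallel resistor-capacitor combination. The circuit choice, the function name, and the parameter names are all assumptions made for illustration; they are not a model recommended in the text for any particular system.

```python
import math


def z_equivalent_circuit(freq_hz: float, r_series: float,
                         r_parallel: float, c_parallel: float) -> complex:
    """Impedance of r_series in series with (r_parallel || c_parallel), in ohms."""
    omega = 2.0 * math.pi * freq_hz  # omega = 2*pi*f, as in the text
    return r_series + r_parallel / (1.0 + 1j * omega * r_parallel * c_parallel)
```

At low frequency this impedance tends to the sum of the two resistances, and at high frequency to the series resistance alone; comparing such a function with the measured data over the whole frequency range is what the fitting step of the flow diagram involves.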

Note that a complete IS analysis often involves more than a single set of measurements of immittance versus frequency. Frequently, full characterization requires that such sets of measurements be carried out over a range of temperatures and/or other externally controlled experimental variables. IS characterization may be used to yield basic scientific and/or engineering information on a wide variety of materials and devices, ranging from solid and liquid electrolytes to dielectrics and semiconductors, to electrical and structural ceramics, to magnetic ferrites, to polymers and protective paint films, and to secondary batteries and fuel cells. Other important applications of IS, not further discussed herein, have been made in the biological area, such as studies of polarization across cell membranes and of animal and plant tissues. Finally, the analysis techniques of IS are not limited to electrical immittance but apply as well to measurements of mechanical and acoustic immittance.

FIGURE 1 Flow diagram for the measurement and characterization of a material-electrode system. (Reprinted by permission of John Wiley & Sons, Inc., from “Impedance Spectroscopy—Emphasizing Solid Materials and Systems,” J. R. Macdonald, ed. Copyright 1987, John Wiley & Sons, Inc.)

III. ELEMENTS OF IMPEDANCE SPECTROSCOPY

A. Measurement Methods

Although IS measurements are simple in principle, they are often complicated in practice. Part of the difficulty arises because the resistive and capacitive components of IS response have ranges, when one considers different materials, electrodes, and temperatures, that span 10 or more orders of magnitude. Measurements require comparison with standard values of these components and are thus only as accurate as the standards. Second, the IS frequency range may extend over 12 orders of magnitude or more: from as low as 10 µHz for adequate resolution of interfacial processes, up to 10 MHz or higher, sometimes needed to characterize bulk response of the material of interest.

Although IS measurements on solids or dielectric liquids usually involve cells with two identical plane-parallel electrodes, the situation is often much more complicated for measurements on liquid electrolytes. There, one usually employs one or more small working electrodes, a very small reference electrode, and a large counter electrode. Such an arrangement ensures that everything of interest (related to immittance) happens at or near the working electrode(s). Further, a rotating-disk working electrode is frequently used to control hydrodynamic conditions near the electrode.

Because the kinetics of electrode reactions often depend strongly on the static (dc) potential difference between the working electrode and the bulk, or, equivalently, the working electrode and the reference electrode, a potentiostat is needed to fix this p.d. to a known and controllable value. The simultaneous application of both ac and dc signals to a three- or four-electrode cell makes it particularly difficult to obtain accurate frequency-response results above 50 kHz or so.

Although a calibrated double-beam oscilloscope, or the use of Lissajous figures with a single-beam instrument, can be used to determine immittance magnitude and phase, such measurements are generally insufficiently accurate, are time consuming, and apply only over a limited frequency range. A superior alternative is the use of audio-frequency or high-frequency bridges. Several such bridges are discussed in the IS book. Of particular interest is the Berberian–Cole bridge, which can cover a wide frequency range and can allow potentiostatic dc bias control. Another important technique using a bridge and special error reduction procedures has recently been developed by Schone and co-workers that allows potentiostatic control and yields very accurate impedance results up to 3 MHz. But manual balancing of a bridge is often disadvantageous because of its slowness, especially for corrosion studies where the properties of the system itself may be slowly changing.

Manual balancing is avoided in various automated network analyzers and impedance analyzers now commercially available. But the measuring instrument that has virtually revolutionized IS measurements and principally led to the burgeoning growth of the field in the past 20 years is the frequency-response analyzer (FRA). Typical examples are FRAs produced by Solartron and by Zahner. Although space does not allow a full description of their many features, such instruments allow potentiostatic control for three- or four-terminal measurements, they are highly digitized, they incorporate automatic frequency sweeps and automatic control of the magnitude of the applied ac signal, they can yield 0.1% accuracy, and they carry out measurements automatically.

Although FRAs such as the Solartron 1260 cover a frequency range from 10 µHz to 32 MHz, impedance results using them are not sufficiently accurate above about 50 kHz when potentiostatic control is used. A typical FRA determines impedance by correlating, at each frequency, the cell response with two synchronous signals, one in phase with the applied signal and the other phase-shifted by 90°. This process yields the in-phase and out-of-phase components of the response and leads to the various immittance components. A useful feature is autointegration, a procedure that averages results over an exact number of cycles, with the amount of such averaging automatically selected to yield statistically consistent results. Recently, a dielectric front end has become available for FRAs. It has an extremely high input impedance and makes possible accurate measurements on dielectrics and on very-high-resistivity solids containing mobile charges.
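The correlation step described above can be illustrated with a short pure-Python sketch. It is not the algorithm of any particular commercial FRA; the function name `correlate`, the sampling parameters, and the synthetic signal are all assumptions chosen for illustration. The response is multiplied by sine and cosine references and summed over an exact number of cycles, which also rejects a dc offset.

```python
import cmath
import math


def correlate(samples, n_per_cycle):
    """Return the complex response amplitude from samples spanning whole cycles.

    samples[k] is the response at time step k; the in-phase reference is
    sin(2*pi*k/n_per_cycle) and the quadrature reference is the cosine.
    """
    n = len(samples)
    if n == 0 or n % n_per_cycle != 0:
        raise ValueError("samples must cover an exact number of cycles")
    in_phase = 0.0
    quadrature = 0.0
    for k, v in enumerate(samples):
        theta = 2.0 * math.pi * k / n_per_cycle
        in_phase += v * math.sin(theta)
        quadrature += v * math.cos(theta)
    # The factor 2/n normalises so that a unit sine returns in_phase = 1.
    return complex(2.0 * in_phase / n, 2.0 * quadrature / n)


# Synthetic response (illustrative): amplitude 0.5, phase -30 degrees,
# superimposed on a dc offset of 0.2.
N_PER_CYCLE = 64
CYCLES = 5
PHASE = math.radians(-30.0)
response = [
    0.2 + 0.5 * math.sin(2.0 * math.pi * k / N_PER_CYCLE + PHASE)
    for k in range(N_PER_CYCLE * CYCLES)
]

result = correlate(response, N_PER_CYCLE)
amplitude = abs(result)
phase_deg = math.degrees(cmath.phase(result))
```

Dividing the complex amplitude obtained for the cell voltage by that obtained for the cell current at the same frequency would then give the impedance at that frequency.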

B. Analysis and Interpretation of Data

1. Graphics

Before carrying out a detailed analysis of IS immittance data, it is a good idea to examine the data graphically, both to search for any outliers and to examine the structure of the data, structure that will usually reflect, at least in part, the physical processes present that led to the data. From the experimental situation one will generally know whether one is dealing with an intrinsically insulating material, such as a nonconducting or a leaky dielectric, or whether the situation is of intrinsically conducting character: mobile charges dominate the response but may be completely or only partially blocked at the electrodes. For complete blocking, no dc can pass, a case that could be confused with dielectric response. In the intrinsically conducting situation, dielectric effects are generally minimal, and Z and M representations of the data are often most useful. In the nonconducting case, Y and ε are frequently most appropriate, but it is nevertheless a good idea initially to examine plots of the data for all four immittance levels, whatever the conducting/nonconducting situation.
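As a rough illustration of examining data at all four immittance levels, the sketch below converts one impedance value to the Y, M, and ε levels. The conversion relations used (Y = 1/Z, M = iωCcZ, ε = Y/(iωCc), with Cc the capacitance of the empty cell) are the customary ones and are assumed here, since they are not restated in this section. The function name and the element values are hypothetical.

```python
def immittance_levels(z, omega, c_c):
    """Return (Z, Y, M, eps) for one complex impedance value.

    omega is the angular frequency and c_c the empty-cell capacitance.
    """
    y = 1.0 / z                      # admittance
    m = 1j * omega * c_c * z         # complex modulus
    eps = y / (1j * omega * c_c)     # complex dielectric constant
    return z, y, m, eps


# Illustrative parallel RC: R = 1 Mohm, C = 100 pF, empty-cell C_c = 1 pF.
R = 1.0e6
C = 1.0e-10
C_C = 1.0e-12
omega = 1.0 / (R * C)                # angular frequency where omega*R*C = 1
z = R / (1.0 + 1j * omega * R * C)

z_val, y_val, m_val, eps_val = immittance_levels(z, omega, C_C)
```

With these definitions M and ε are reciprocals of each other, so a feature that is compressed at one level can be expanded at another, which is why inspecting all four is worthwhile.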

When mobile charges are present, five principal physical processes may influence the data; these are bulk resistive-capacitive effects, electrode reactions, adsorption at the electrodes, bulk generation–recombination effects (e.g., ion-pairing), and diffusion. The double-layer capacitance is the reaction capacitance CR, and the reaction resistance RR is inversely proportional to the reaction rate constant. It is important to distinguish CR from the usually much larger low-frequency pseudocapacitance associated with the diffusion of mobile charge or with adsorption at an electrode. Note that in general a process that dissipates energy is represented in an IS equivalent circuit by a resistance, and energy storage is usually modeled by a capacitance. Detailed CNLS analysis of IS data can lead in favorable cases to estimates of such basic material-electrode quantities as electrode reaction and adsorption rates, bulk generation–recombination rates, charge valence numbers and mobilities, diffusion coefficients, and the (real) dielectric constant of the material.

There are many ways IS data may be plotted. In the IS field, where capacitive rather than inductive effects dominate, conventionally one plots −Im(Z) ≡ −Z″ on the y-axis versus Re(Z) ≡ Z′ on the x-axis to give a complex-plane impedance plot. Such graphs have (erroneously) been termed Nyquist plots. They have the disadvantage of not indicating frequency response directly, but may, nevertheless, be very helpful in identifying conduction processes present. Another approach, the Bode diagram, is to plot log[|Z|] and φ versus log[f]. Alternatively, one can plot Z′ (or any I′) or −Z″ (or −I″), or the logs of these quantities versus log[f].

An important IS building block is Debye response, response that involves a single time constant, τ. A Cole–Cole plot of such response is shown in Fig. 2. The arrow shows the direction of increasing frequency. Debye response can be represented in complex form as ε = ε∞ + [ε0 − ε∞]/[1 + iωτ] and, in circuit form, involves a capacitance ε∞CC in parallel with the series combination of a resistor R, modeling dissipative effects, and a capacitor C ≡ (ε0 − ε∞)CC, representing stored charge. Finally, the time constant or relaxation time is given by τ ≡ RC.
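The Debye expression and its circuit form can be checked numerically. The following sketch, with illustrative parameter values that are not taken from the article, evaluates ε(ω) from the formula, recomputes it from the admittance of the equivalent circuit, and confirms that the complex-plane points lie on a semicircle centered on the real axis.

```python
def debye_epsilon(omega, eps_0, eps_inf, tau):
    """Complex dielectric constant for Debye response."""
    return eps_inf + (eps_0 - eps_inf) / (1.0 + 1j * omega * tau)


def circuit_epsilon(omega, eps_0, eps_inf, r, c_c):
    """Same quantity from the circuit: eps_inf*C_c in parallel with R in series with C."""
    c = (eps_0 - eps_inf) * c_c                 # capacitor representing stored charge
    y = 1j * omega * eps_inf * c_c + 1.0 / (r + 1.0 / (1j * omega * c))
    return y / (1j * omega * c_c)               # eps = Y / (i*omega*C_c)


# Illustrative values (assumptions, not from the article).
EPS_0, EPS_INF = 10.0, 2.0
C_C = 1.0e-12                                   # empty-cell capacitance, F
R = 1.25e8                                      # ohm
TAU = R * (EPS_0 - EPS_INF) * C_C               # tau = R*C = 1e-3 s

center = 0.5 * (EPS_0 + EPS_INF)
radius = 0.5 * (EPS_0 - EPS_INF)
omegas = [10.0 ** (k / 4.0) for k in range(25)]  # 1 to 1e6 rad/s

distances = [abs(debye_epsilon(w, EPS_0, EPS_INF, TAU) - center) for w in omegas]
mismatch = max(
    abs(debye_epsilon(w, EPS_0, EPS_INF, TAU) - circuit_epsilon(w, EPS_0, EPS_INF, R, C_C))
    for w in omegas
)
peak = debye_epsilon(1.0 / TAU, EPS_0, EPS_INF, TAU)  # top of the semicircle
```

Every entry of `distances` equals the radius (ε0 − ε∞)/2, and the point at ωτ = 1 sits at the top of the arc.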

Three-dimensional perspective plots are particularly useful because they allow complete response to appear on a single graph. Figure 3 shows such plots at the impedance level for the analog of Debye response for a conducting system. By including projections of the 3-D curve of the response in all three perpendicular planes of the plot, one incorporates all relevant 2-D plots in the same diagram. Note that the curve in the back plane, the complex-plane impedance plot, is just the usual Debye semicircle, one with its center on the real axis.

FIGURE 2 Complex-plane plot of the complex dielectric constant for Debye frequency response.

FIGURE 3 A simple circuit and 3-D perspective plots of its impedance response. (Reprinted by permission of John Wiley & Sons, Inc., from “Impedance Spectroscopy—Emphasizing Solid Materials and Systems,” J. R. Macdonald, ed. Copyright 1987, John Wiley & Sons, Inc.)

To demonstrate some of the power and weaknesses of 3-D plots, Fig. 4 includes three types of such plots, all for the same EIS data taken on single-crystal Na β-alumina. Graph A is an impedance plot and shows that only two out of the four curves indicate that the lowest frequency point is in error. In this plot, ν denotes frequency f. Clearly, one should not rely on the conventional log[f] curves alone. Since the diagram shows that much high-frequency data are not resolved by this kind of plot, graph B involves the logarithms of the data. Although high-frequency response now appears, the error in the low-frequency point is nearly obscured by the reduced resolution inherent in a log plot.

FIGURE 4 Three-dimensional perspective plots of Na β-alumina data at (A) the impedance level, (B) log impedance level, and (C) complex modulus level. (Reprinted by permission of John Wiley & Sons, Inc., from “Impedance Spectroscopy—Emphasizing Solid Materials and Systems,” J. R. Macdonald, ed. Copyright 1987, John Wiley & Sons, Inc.)

Much improved results appear in graph C, a 3-D M plot. Resolution over the full frequency range is greatly increased; the error in the lowest frequency point is clearly shown; a midfrequency glitch now appears that is not evident in the other plots and arises from a switch of measuring devices without adequate cross-calibration; and nonphysical behavior is now apparent at the highest frequencies. These results make it clear that even when 3-D plots are used, it is always desirable to explore the results of different transformations of the data and to pick the one with the best resolution.

2. Complex Nonlinear Least Squares Data Fitting

a. Strengths and weaknesses. Although graphic examination of IS data is an important analysis step, only in the simplest cases can it be used to obtain even rough estimates of some system parameters. Since good parameter estimates are needed for adequate characterization of the material-electrode system, a fitting technique such as CNLS must be applied to obtain them. In doing so, the data, at any I level, are fitted to a mathematical model involving the parameters or to the response of an equivalent circuit. Such fitting models are discussed in Section IV.A. Not only does CNLS fitting yield estimates of the parameters of the model, but it also provides estimates of their standard deviations, measures of how well they have been determined by the data fit. These standard deviation values are valuable in deciding which parameters are crucial to the model and which are useless, or at least not well determinable from the data.

CNLS fits are produced by a program that minimizes the weighted sum of squares of the real and imaginary residuals. A residual is the difference between a data value at a given frequency and the corresponding value calculated from the model. The weights used are the inverses of the estimated error variance for a given real data value and that for the corresponding imaginary value. Weighting is the most subjective part of least squares fitting, yet it can often have crucial effects on the results of such fitting and is thus of prime importance.

Since individual error-variance estimates are usually unavailable, it has been customary to use simplified variance models to obtain values to use in the fitting. The simplest such model is to take all weights equal to one: unity weighting (UWT). Another popular and important choice is to set the error variance of each data value equal to the square of that value. Since the uncertainty of the value is then proportional to the value itself, this defines proportional weighting (PWT). It has recently been shown, however, that such weighting leads to biased parameter estimates; it should be replaced, when the fitting model is well matched to the data, by function-proportional weighting (FPWT), where the calculated rather than the direct data value is employed in the weighting.

PWT or FPWT is particularly important because the range of typical IS data can be as large as $10^3$ or even $10^6$. When UWT is used in fitting such data, only the largest parts of the data determine the parameter estimates, and the smaller values have little or no effect. Alternatively, with PWT or FPWT, which is equivalent to assuming a constant percentage error, small and large data values contribute equally to the final parameter estimates.
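The effect of the weighting choice on the CNLS objective can be seen in a small pure-Python sketch. The function `weighted_ss` is a hypothetical illustration of the weighted sum of squares of real and imaginary residuals; it is not code from LEVM or any other fitting program, and the sample values are invented. It assumes that no real or imaginary part used for weighting is exactly zero.

```python
def weighted_ss(data, model, weighting="UWT"):
    """Weighted sum of squares of real and imaginary residuals.

    data and model are sequences of complex values at the same frequencies.
    Weights are inverse variances: 1 (UWT), data value squared (PWT), or
    model value squared (FPWT), applied separately to real and imaginary parts.
    """
    total = 0.0
    for d, m in zip(data, model):
        if weighting == "UWT":
            var_re = var_im = 1.0
        elif weighting == "PWT":
            var_re, var_im = d.real ** 2, d.imag ** 2
        elif weighting == "FPWT":
            var_re, var_im = m.real ** 2, m.imag ** 2
        else:
            raise ValueError("unknown weighting: " + weighting)
        total += (d.real - m.real) ** 2 / var_re
        total += (d.imag - m.imag) ** 2 / var_im
    return total


# Two data points differing by a factor of 1000; the model is 1% high at both.
data_large = [complex(1.0e6, -1.0e6)]
data_small = [complex(1.0e3, -1.0e3)]
model_large = [1.01 * d for d in data_large]
model_small = [1.01 * d for d in data_small]

uwt_large = weighted_ss(data_large, model_large, "UWT")
uwt_small = weighted_ss(data_small, model_small, "UWT")
pwt_large = weighted_ss(data_large, model_large, "PWT")
pwt_small = weighted_ss(data_small, model_small, "PWT")
```

With UWT the large point contributes about a million times more to the objective than the small one for the same 1% misfit, whereas with PWT the two contributions are equal.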

Figure 5 presents the results of PWT CNLS fitting of β-PbF2 data using an equivalent circuit with a distributed element, the constant phase element (CPE). Both the original data and the fit results are shown in the 3-D plot. The figure indicates that seven free parameters have been quite well determined by the data, a remarkable result when one considers the apparent lack of much structure in the data themselves.

FIGURE 5 Three-dimensional perspective impedance plot of β-PbF2 data (—— ---) and fitted values and curves (— — —); the fitting circuit used and parameter estimates and estimates of their standard deviations. (Reprinted by permission of John Wiley & Sons, Inc., from “Impedance Spectroscopy—Emphasizing Solid Materials and Systems,” J. R. Macdonald, ed. Copyright 1987, John Wiley & Sons, Inc.)

A detailed physico-chemical model is always preferable to an equivalent circuit for fitting, especially since such models often cannot be expressed in terms of an equivalent circuit involving standard elements. But most IS situations involve many-body problems currently insolvable at the microscopic level. Thus one must usually be satisfied with simpler continuum models, often expressed as equivalent circuits. One weakness of equivalent circuits involving only ideal elements is their ambiguity. The same elements may be interconnected in different ways and yet, with appropriate values, yield exactly the same frequency response at all frequencies. Thus, IS fitting cannot distinguish between the different possible structures, and only other measurements, such as IS fitting of data over a range of temperatures and/or potentials, can help one establish which of the possible circuits is most physically reasonable.

FIGURE 6 Four two-time-constant circuits that exhibit the same impedance response over all frequencies. Units are MΩ for resistances and µF for capacitances.

Figure 6 shows all possible potentially equivalent conducting circuits involving two resistances and two capacitances. Specific parameter value choices that make them all have exactly the same response are also indicated. Here the values for circuit D were taken exact, and approximate values for the other elements are denoted with a ∼ sign. Let the units of these elements be MΩ for resistances and µF for capacitances. Note that the two RC time constants of circuit A, a series connection, differ by less than 17% and are thus very close together. Can IS procedures resolve such a situation? Figure 7 shows the exact complex-plane response of these circuits at both the Z and the M levels, compared to single time-constant Debye response. The M curve shows much better separation of the two response regions than does the Z curve. Thus, adequate graphical resolution is indeed possible. Further, it turns out that CNLS fitting of synthetic data calculated from any of these circuits with appreciable proportional random errors added still yields excellent parameter estimates. In fact, with reasonably good data, CNLS can resolve response involving considerably closer time constants than those involved here.

FIGURE 7 Complex-plane immittance responses, at the Z and M levels, of the circuits of Fig. 6.
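The ambiguity of ideal-element circuits can be demonstrated with a short sketch. Two of the possible topologies are compared: two parallel RC sections in series, and a ladder in which one capacitor spans the whole circuit. The conversion formulas in `series_to_ladder` were derived for this illustration by matching the two admittance expressions term by term, and the element values are arbitrary; they are not the values shown in Fig. 6. Distinct time constants are assumed.

```python
def z_series(s, r1, c1, r2, c2):
    """Two parallel RC sections connected in series; s = i*omega."""
    return r1 / (1.0 + s * r1 * c1) + r2 / (1.0 + s * r2 * c2)


def z_ladder(s, ca, ra, cb, rb):
    """Ca in parallel with [Ra in series with (Cb in parallel with Rb)]."""
    inner = ra + rb / (1.0 + s * rb * cb)
    return 1.0 / (s * ca + 1.0 / inner)


def series_to_ladder(r1, c1, r2, c2):
    """Ladder element values giving the same impedance at all frequencies."""
    tau1, tau2 = r1 * c1, r2 * c2
    ca = c1 * c2 / (c1 + c2)                 # high-frequency limiting capacitance
    tau_b = tau1 + tau2 - ca * (r1 + r2)     # time constant of the inner RC
    ra = (r1 * tau2 + r2 * tau1) / tau_b
    rb = r1 + r2 - ra                        # dc resistance is preserved
    cb = tau_b / rb
    return ca, ra, cb, rb


# Arbitrary illustrative values (Mohm and microfarad, so tau is in seconds).
R1, C1, R2, C2 = 1.0, 1.0, 2.0, 3.0
CA, RA, CB, RB = series_to_ladder(R1, C1, R2, C2)

omegas = [10.0 ** (k / 2.0) for k in range(-8, 9)]   # 1e-4 to 1e4 rad/s
max_diff = max(
    abs(z_series(1j * w, R1, C1, R2, C2) - z_ladder(1j * w, CA, RA, CB, RB))
    for w in omegas
)
```

Because `max_diff` is zero to rounding error at every frequency, no amount of fitting at a single temperature and potential can favor one topology over the other.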

Although several CNLS fitting programs now exist for use on personal computers, two commercially available ones have been especially tailored for the IS field. The first, EQUIVCRT, can be obtained from Dr. B. A. Boukamp, Twente University, P.O. Box 217, 7500 AE Enschede, The Netherlands; the second, LEVM, is now available at no cost, and both its extensive manual and program, including source code, may be downloaded from this home page: http://www.physics.unc.edu/~macd/. The programs to some degree complement each other, but LEVM is more general and flexible in many ways and incorporates much more sophisticated weighting possibilities.

b. Recent developments. Currently, the capability of using various types of weighting involving model predictions instead of data values exists only in LEVM, first released in the summer of 1989. Although weighting such as FPWT is somewhat more complicated than PWT because it varies with each nonlinear least squares iteration as the parameter estimates change during the fitting procedure, its bias reduction potential makes such complexity often worthwhile. Although LEVM allows the fitting of real or imaginary parts of the data separately, fitting both together, as in CNLS, ensures that the best use is made of all the data in determining the parameter estimates and is thus preferred when both parts are available.

Real IS data often have independent random errors that have both an additive term and one that depends on the true model predictions. A rather general error-variance model incorporating these possibilities is included in LEVM. For a specific angular frequency $\omega_j$, the real and imaginary parts of $\nu_j$, the error variance used in determining the weighting, may be written as

$$\nu'_j = U^2 + |F'(\omega_j)|^{2\xi}$$

and

$$\nu''_j = U^2 + |F''(\omega_j)|^{2\xi},$$

where U is associated with the additive random errors and ξ is an arbitrary positive fractional exponent.

When U = 0, ξ = 1, and F is a data value, one has PWT; whereas when F is a model prediction the result is FPWT. Another widely used weighting, modulus weighting, follows when the same values of U and ξ are used but both F′ and F″ in the above equations are replaced by |F|. It is usually inconsistent, however, with the types of errors likely to be present and generally leads to appreciably more bias in parameter estimates even than PWT. CNLS fitting yields a standard deviation Sf of residuals, which is a measure of the overall goodness of fit. For proportional random errors having a proportionality constant of σr, Sf is an unbiased estimate of σr for FPWT, is nearly so for PWT, and is appreciably biased on the low side for FMWT and MWT, types of modulus weighting.
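The variance model and its special cases can be written compactly. The sketch below is an illustration only, with hypothetical function names and sample values; it is not taken from LEVM. Passing a data value for `f` corresponds to PWT-type weighting and passing a model prediction corresponds to FPWT-type weighting.

```python
def error_variances(f, u, xi):
    """Return (real, imaginary) variances U^2 + |F'|^(2 xi) and U^2 + |F''|^(2 xi)."""
    var_re = u ** 2 + abs(f.real) ** (2.0 * xi)
    var_im = u ** 2 + abs(f.imag) ** (2.0 * xi)
    return var_re, var_im


def modulus_variances(f, u, xi):
    """Modulus weighting: F' and F'' are both replaced by |F|."""
    var = u ** 2 + abs(f) ** (2.0 * xi)
    return var, var


F = complex(3.0, -4.0)                            # illustrative immittance value

proportional = error_variances(F, 0.0, 1.0)       # U = 0, xi = 1
modulus = modulus_variances(F, 0.0, 1.0)
poisson_like = error_variances(F, 0.0, 0.5)       # variance proportional to |F|

# Near a zero crossing of the imaginary part, U > 0 keeps the weight finite.
near_zero = error_variances(complex(1000.0, 0.0), 1.0, 1.0)
```

The last case shows why a nonzero U is useful when a component of the data passes through zero: without it the corresponding variance would vanish and the weight would diverge.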

When the data involve one or more inductive-type loops, such as may arise from adsorption of a mobile charge at an electrode, it is desirable to use a nonzero U along with PWT or FPWT or to use modulus weighting. Otherwise, because values of the imaginary part of the data may become very small and even pass through zero in the loop region, PWT or FPWT alone can strongly overemphasize the effect of these values near zero and thus lead to poor fitting.

Although U and ξ may be given fixed values during CNLS fitting, a way has been found to incorporate them as free parameters in LEVM least squares fitting. When this is done, the data themselves determine the most appropriate weighting for their fit, thus removing an appreciable part of the subjective element present in prior weighting approaches. Further, Monte Carlo fitting studies have shown that the statistical uncertainties of U and ξ in CNLS fits with them both taken free, or with only ξ free, are usually quite small compared to their estimates, and their biases are even smaller. Thus, their fit estimates may be used with confidence.

Although for much IS data one would expect that ξ would be close to unity, this need not always be the case. Consider, for example, a set of real data arising from the sum of the radioactive decay of two different species. Now the statistics of such decay follows a Poisson process, one for which ξ = 0.5. The radioactive background count will also involve such a process. Thus the appropriate variance model would involve U = 0 and three terms, each with 2ξ = 1. The first two would be the two exponential decays and the last, the background. In such a fitting situation, where ξ is known absolutely, it should be held fixed at its proper value.

IV. APPLICATIONS

A. Basic Analysis of Material Properties and Electrode Effects

1. Bulk and Reaction Response

Although IS is of great value for the characterization of the electrical properties of material-electrode systems, its use for this purpose requires that connections be known between model and/or equivalent circuit parameters and the basic characterization parameters. One must be able to pass from estimates of macroscopic quantities, such as resistances and capacitances, to estimates of average microscopic quantities. Here only a brief overview will be given of some of the large amount of theoretical IS-related work of the past 40 years. More details appear in the IS book.

Because of the charge decoupling present in a supported situation, it is often an excellent approximation to treat the effects of the various physical processes present independently. On the other hand, for unsupported conditions where strong coupling is present, a unified treatment of all the processes together is necessary. The most complete such theory, which incorporates all five of the processes mentioned in Section III.B.1, was published by Franceschetti and Macdonald in 1978. It is a continuum (i.e., averaged, not microscopic) theory, includes intrinsic and extrinsic charge effects, and applies to either ionic or electronic conduction conditions. Even though it strictly applies only to flat-band conditions, its results are still sufficiently complicated that only in simplified cases does it lead to responses that may be modeled by an equivalent circuit.

It is useful to separate the electric processes present into bulk- and electrode-related groups whenever possible, something which is usually indeed possible using CNLS fitting. The first group includes bulk resistance and dielectric effects, the homogeneous reactions associated with dissociation and recombination of the charges present, and even possible dispersive response. It is generally associated with response effects at the high end of the frequency range, while significant electrode effects often occur near the low end, possibly at very low frequencies. Bulk resistance and capacitance are extensive quantities, dependent on the effective separation between electrodes.

The second group involves what happens in the neighborhood of the electrodes (within a few Debye lengths of them) and is thus intensive. No net charge is transferred to an electrode if it is completely blocking for all mobile charges. The next simplest EIS situation is that where a mobile metallic ion is of the same species as the atoms of a metallic electrode: a parent-ion electrode. Then, in a symmetrical-electrode situation there is a sink/source of ions at each electrode, since electron transfer at an electrode can transform ions into atoms and vice versa, depending on the polarity of the electric field at the electrode. Such a reaction can be written

$$\mathrm{Me} \rightleftharpoons \mathrm{Me}^{z+} + z\,e^{-},$$

where Me denotes a metal atom and z the number of electrons transferred. An example of a symmetrical cell of this type is Ag|AgCl|Ag.

Particularly important for the aqueous electrolyte area is the redox electrode, where charge crosses the interface at the electrode only in the form of electrons. The species Red and Ox are usually soluble in the electrolyte, satisfy

$$\mathrm{Red}^{(z-n)+} \rightleftharpoons \mathrm{Ox}^{z+} + n\,e^{-},$$

and involve the forward and reverse reaction rate constants kf and kr, respectively. If z = n, the Red species is uncharged and may diffuse in the electrode, or may evolve if it is a gas.

2. Distributed Circuit Element Response

a. Diffusion. Since diffusion is not localized at a point in space but is distributed over a finite region, it leads to electrical response characteristic of a distributed circuit element (DCE). Such elements cannot be described by means of a finite number of ideal elements, such as resistances and capacitances. The response of several DCEs important to IS will be discussed.

In addition to possible diffusion of uncharged species within an electrode, diffusion of mobile species in the electrolyte may contribute significantly to the impedance of an IS system. Generally, diffusion response is neither intensive nor extensive. At sufficiently high frequencies, diffusion effects are confined to the immediate neighborhood of the electrode (or within a hydrodynamic boundary layer at a rotating electrode) and so are intensive; whereas at low enough frequencies, diffusion occurs throughout the material between electrodes, and the response becomes extensive as the frequency decreases and the effective diffusion length ld, proportional to $\omega^{-0.5}$, becomes comparable to the size of the cell.

The diffusion impedance, appropriate when there is a fast electrode reaction, is of the form

$$Z_W(\omega) = Z_W(0)\,\frac{\tanh\{[i(l/l_d)^2]^{0.5}\}}{[i(l/l_d)^2]^{0.5}},$$

where l is the separation between symmetrical electrodes and ZW(0) is a resistance proportional to l and thus is extensive. Such response is known as finite-length Warburg behavior. At high enough frequencies that the tanh term goes to unity, ZW(ω) becomes proportional to the intensive quantity ld and is termed (ordinary) Warburg response.
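The two limits of the finite-length Warburg expression can be checked with a few lines of Python. In this sketch the dimensionless variable s stands for (l/ld)², which is proportional to frequency; the function name and the test values are assumptions made for illustration.

```python
import cmath
import math


def z_warburg_finite(s, z0):
    """Finite-length Warburg impedance Z_W(0) * tanh(sqrt(i s)) / sqrt(i s).

    s = (l / l_d)**2 is dimensionless and proportional to frequency;
    z0 is the zero-frequency resistance Z_W(0).
    """
    if s == 0.0:
        return complex(z0)          # limit of tanh(x)/x as x -> 0
    root = cmath.sqrt(1j * s)
    return z0 * cmath.tanh(root) / root


Z0 = 1.0
z_low = z_warburg_finite(1.0e-8, Z0)     # diffusion length much larger than the cell
z_high = z_warburg_finite(1.0e4, Z0)     # diffusion length much smaller than the cell

phase_high_deg = math.degrees(cmath.phase(z_high))
ratio = abs(z_warburg_finite(4.0e4, Z0)) / abs(z_high)
```

At low frequency the impedance tends to the resistance ZW(0). At high frequency the phase approaches −45° and the magnitude falls as the inverse square root of frequency, so quadrupling s halves |ZW|, which is the ordinary Warburg behavior.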

As the electrode reaction rate decreases toward zero, a more complicated expression for ZW(ω) must be used, but it reduces to the form

$$Z_W(\omega) = Z_C\,\frac{\coth\{[i(l/l_d)^2]^{0.5}\}}{[i(l/l_d)^2]^{0.5}}$$

when the electrode is completely blocking (open-circuit diffusion). Here $Z_C$ is given by $(l/l_d)^2/(\omega C_{DOC})$, and $C_{DOC}/C_C$ is the effective low-frequency limiting dielectric constant associated with the process.

For general unsupported situations, those with positive and negative charged species mobile and having diffusion coefficients of Dn and Dp and valence numbers of zn and zp,

$$(l_d)^2 = \frac{4 D_n D_p}{\omega}\left[\frac{z_n + z_p}{z_n D_n + z_p D_p}\right].$$

No diffusion effects appear when only charge of a single sign is mobile; this often is an excellent approximation for solid electrolytes.

For supported conditions, matters are different. Consider a single species with diffusion coefficient D and valence number z (possibly zero). Then $(l_d)^2 = 4D/\omega$, a result that follows from the above expression when one sets Dn = Dp = D and zn = zp = z. Further, when both positive and negative charges are mobile, diffusion under unsupported conditions leads to a single expression involving tanh, as above, but for supported conditions, as in the redox case, two such terms appear, one for each species, in keeping with the lack of coupling between the species.
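The reduction to 4D/ω can be verified numerically. The sketch below takes the weighting factors in the bracket of the unsupported expression to be the valence numbers zn and zp, an assumption that is consistent with the reduction stated in the text; the function name and numerical values are illustrative.

```python
import math


def diffusion_length_sq(omega, d_n, d_p, z_n, z_p):
    """Square of the effective diffusion length for two mobile species."""
    return (4.0 * d_n * d_p / omega) * (z_n + z_p) / (z_n * d_n + z_p * d_p)


OMEGA = 2.0 * math.pi * 10.0        # angular frequency, rad/s
D = 1.0e-9                          # diffusion coefficient, m^2/s (illustrative)

# Equal diffusion coefficients and valence numbers: reduces to 4D/omega.
equal_case = diffusion_length_sq(OMEGA, D, D, 1.0, 1.0)

# Only one sign of charge mobile (D_p = 0): the diffusion length vanishes.
single_sign = diffusion_length_sq(OMEGA, D, 0.0, 1.0, 1.0)

# l_d scales as omega**-0.5: quadrupling omega halves l_d.
length_ratio = math.sqrt(
    diffusion_length_sq(4.0 * OMEGA, D, 2.0 * D, 1.0, 2.0)
    / diffusion_length_sq(OMEGA, D, 2.0 * D, 1.0, 2.0)
)
```

The vanishing result for a single mobile sign corresponds to the statement that no diffusion effects appear in that case.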

b. Other DCEs. A characteristic signature of diffusion is $(i\omega)^{\pm 0.5}$ response, but IS data more often exhibit CPE response $(i\omega\tau)^{\pm\psi}$, where 0 ≤ ψ ≤ 1. But such response is not physically realizable over all frequencies, and so other DCEs have been introduced that approximate such behavior over a limited frequency range. They may be written as impedances or complex dielectric constants, depending upon which I level is appropriate. Here they will be given at the Z level.

An empirical DCE of the above type is Havriliak–Negami (HN) response, written as

$$Z_{HN}(\omega) = \frac{R_{HN}}{[1 + (i\omega\tau)^{\alpha}]^{\beta}},$$

where 0 ≤ α ≤ 1 and 0 ≤ β ≤ 1. It reduces to Cole–Davidson response when α = 1 and to Cole–Cole response (termed ZC response at the Z level) when β = 1. The first of these yields an asymmetric arc in a complex-plane plot and the second one a symmetric arc. Both shapes appear often in practice, and ZC fitting is frequently used to represent data that yield an arc of a semicircle with its center below the real axis. Such behavior is usually ascribed to the presence of a distribution of some physical quantity in space, time, or energy. Rough electrodes are one example. Although fitted values of α and/or β often show appreciable temperature dependence, there exists no theory yielding such dependence for HN response.
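A short numerical sketch makes the depressed-arc statement concrete. It evaluates the HN expression and, for the Cole–Cole case β = 1, checks that the points lie on a circle whose center is displaced from the real axis. The circle center and radius used in the check follow from elementary geometry and are stated here as a worked assumption rather than quoted from the article; parameter values are illustrative.

```python
import math


def z_hn(omega, r_hn, tau, alpha, beta):
    """Havriliak-Negami response at the impedance level."""
    return r_hn / (1.0 + (1j * omega * tau) ** alpha) ** beta


R_HN, TAU, ALPHA = 1.0, 1.0e-3, 0.8

# alpha = beta = 1 recovers Debye-type response: Z = R/(1 + i) at omega*tau = 1.
z_debye = z_hn(1.0 / TAU, R_HN, TAU, 1.0, 1.0)

# Cole-Cole case (beta = 1): the arc is part of a circle through Z = 0 and Z = R.
# In the usual plot of -Z'' versus Z' its center lies below the real axis,
# i.e. at Z'' = +(R/2) cot(alpha*pi/2) in the Z plane.
theta = ALPHA * math.pi / 2.0
center = complex(0.5 * R_HN, 0.5 * R_HN / math.tan(theta))
radius = 0.5 * R_HN / math.sin(theta)

omegas = [10.0 ** (k / 2.0) for k in range(0, 13)]      # 1 to 1e6 rad/s
deviation = max(abs(abs(z_hn(w, R_HN, TAU, ALPHA, 1.0) - center) - radius) for w in omegas)

# Height of the arc at omega*tau = 1: (R/2) tan(alpha*pi/4), less than R/2.
peak_height = -z_hn(1.0 / TAU, R_HN, TAU, ALPHA, 1.0).imag
```

For α = 0.8 the arc height is about 0.36 R rather than the 0.5 R of a Debye semicircle, which is the flattening seen in many experimental complex-plane plots.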

Another important DCE is that of Kohlrausch–Williams–Watts (KWW) response. It yields a stretched exponential in the time domain, response proportional to $\exp[-(t/\tau)^{\psi}]$, with 0 ≤ ψ ≤ 1. Here there are, if anything, too many different theories yielding such response, but again they do not usually predict the temperature dependence of ψ. The corresponding frequency response is very difficult to calculate accurately, but an excellent approximation for it is available in LEVM. Complex-plane plots of KWW response yield an asymmetric arc for any ψ < 1 until ψ = 1, when a Debye semicircle appears.

Another DCE category is associated with the presencein a material of a distribution of activation energies (DAE).

Such distributions are likely in IS materials and may be expected even in single crystals when there are many competing possibilities for the individual motion of mobile charges. Both Gaussian and exponential distributions have been considered in detail and can lead to either symmetric or asymmetric complex-plane arcs. But only an exponential DAE yields CPE-like fractional-exponent frequency response over a finite frequency region. This exponent, φ, is not limited to the range from 0 to 1 but satisfies −∞ < φ < ∞. Further, unlike the other DCEs considered, an exponential DAE predicts temperature dependence of φ in good agreement with many experimental results.

3. Equivalent Circuits

Many different equivalent circuits have been proposed over the years for IS fitting, and no one circuit structure is appropriate for all situations. Figure 8 shows a circuit, however, that has been found useful for a variety of materials and experimental conditions. Bulk properties are represented by Cg, the geometrical capacitance, and R∞, the high-frequency limiting resistance. CR, associated with an electrode reaction, is the double-layer capacitance (possibly including both a compact inner-layer capacitance and a diffuse double-layer capacitance), and RR is the reaction resistance. Finally, CA and RA are associated with adsorption at an electrode. The ZD elements, when present, are DCEs. Also, not all the other elements need be present; for example, in the absence of adsorption CA and RA would not appear.

FIGURE 8 An equivalent circuit of hierarchical structure useful in fitting much IS data. (Reprinted from “Interface Effects in the Electrical Response of Non-Metallic Conducting Solids and Liquids,” J. R. Macdonald, IEEE Trans. on Electrical Insulation, Vol. EI-15, pp. 65–82, Fig. 3. Copyright IEEE 1981.)

For an unsupported, fully dissociated material with charges of only a single sign mobile, the Fig. 8 circuit with all ZDs absent has been found to yield an accurate representation of the impedance resulting from a flat-band theoretical analysis of the situation. Since only Rs and Cs are involved, ambiguity is present, and many other circuit structures with the same elements and the same frequency response are possible. Nevertheless, the present hierarchical ladder-network connection is more physically reasonable than the others for homogeneous material. It ensures that bulk charging and conduction effects take place before reaction/adsorption ones. For polycrystalline materials, however, circuits involving series rather than hierarchical connection of parallel RC subcircuits are often found appropriate.
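The hierarchical idea can be illustrated with a short Python sketch. Since Fig. 8 is not reproduced here, the topology below is an assumption: a ladder in which Cg shunts everything, R∞ leads to CR in parallel with the branch RR plus the parallel combination of CA and RA, with all ZD elements absent. The function names and component values are hypothetical.

```python
def parallel(z1, z2):
    """Impedance of two elements connected in parallel."""
    return z1 * z2 / (z1 + z2)


def z_cap(omega, c):
    """Impedance of an ideal capacitor."""
    return 1 / (1j * omega * c)


def z_ladder(omega, c_g, r_inf, c_r, r_r, c_a, r_a):
    """Assumed ladder: Cg || [Rinf + (CR || [RR + (CA || RA)])]."""
    z_ads = parallel(z_cap(omega, c_a), r_a)          # adsorption stage
    z_rxn = parallel(z_cap(omega, c_r), r_r + z_ads)  # reaction stage
    return parallel(z_cap(omega, c_g), r_inf + z_rxn)  # bulk stage
```

In this arrangement the zero-frequency limit is the sum R∞ + RR + RA, while at very high frequency Cg short-circuits the rest of the network, which is one way of expressing that bulk effects act before reaction and adsorption ones.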

For the conditions above, no diffusion DCE element is present. The ZW(ω) one discussed earlier appears, however, in the ZD3 position when charges of both signs are mobile and at least one of them reacts at an electrode. When static fields are present in the material, whether intrinsic or externally produced, numerical analysis of the nonlinear transport equations governing the IS response shows that the Fig. 8 circuit still applies to good approximation but elements such as CR and RR then depend appreciably on the static p.d. present. Finally, the circuit of Fig. 8 has often been found appropriate for the fitting of data for supported conditions as well as unsupported ones.

The Fig. 8 circuit is particularly appropriate for analyzing electrode–interface effects in low-resistivity EIS situations. For high-resistivity solids, however, such as ion-conducting glasses, one is usually more interested in isolating and interpreting bulk dispersion behavior. Figure 9 shows a circuit appropriate for such materials, where the DCE block is a conductive-system dispersive DE, and the DED block is a dielectric-system dispersive one. The circuit thus allows one to account for electrode effects when important, and for conductive dispersion, dielectric dispersion, or both.

FIGURE 9 An equivalent circuit available in LEVM that is appropriate for analyzing the frequency response of high-resistivity materials. As usual, “DE” identifies possible distributed circuit elements. (Reprinted by permission of Elsevier Science B. V., from “Power-law exponents and hidden bulk relaxation (the word “relaxation” was erroneously printed as “relation”) in the impedance spectroscopy of solids,” J. R. Macdonald, J. Electroanalytical Chem., Vol. 378, pp. 17–29, 1994.)

B. Uses of IS for Evaluation and Control of Electrochemical Processes of Industrial Interest and Importance

1. Corrosion and Surface Protection

Corrosion of metallic structural materials leads to immense damage each year (an estimate for the United States for 1988 is $200 billion); thus its control and amelioration are of tremendous economic importance. EIS has played and is playing a valuable role in quantifying and mitigating corrosion effects. For example, it has been successfully applied to complicated corrosion systems to determine corrosion rates as well as the mechanisms and efficiency of corrosion inhibitors. The use of EIS has broadened the range of corrosion phenomena that can be studied using electrochemical techniques and has been particularly valuable in evaluating the corrosion behaviors of polymer-coated metals and anodized aluminum alloys. In addition, it has been incorporated into a quality control test for anodized aluminum surfaces and for chromate-conversion-coated aluminum alloys.

The application of EIS techniques has resulted in a great deal of information on methods of corrosion protection that are difficult or impossible to study with traditional dc techniques, such as conversion and polymer coatings, anodic films, and inhibitors. Not only can EIS measurements provide greater sensitivity and more information about the processes investigated than can conventional dc methods, they are particularly appropriate when impedances are high and/or when low-conductivity media are used.

EIS measurement and analysis has been used to provide fast and sensitive information on the protection properties of chromated galvanized steel. Such measurements may be used as a quality control procedure, since the charge transfer resistance has been found to be well related to the corrosion rate. EIS has been used to detect corroding areas of large structures accurately and has been applied for corrosion monitoring of steel reinforcing bars in concrete to yield a nondestructive estimate of the amount of corrosion damage.

Since the roughness of an electrode surface is reflected in the results of EIS measurements involving the electrode, EIS may be used to identify surface inhomogeneities produced by corrosion. It provides (averaged) information on surface morphology on a much smaller scale than does even electron micrography. EIS has been employed as a means of nearly continuous evaluation of localized corrosion processes such as pitting, crevice corrosion, stress corrosion, cracking and fatigue corrosion, abrasion, and corrosion under a porous surface layer.

EIS measurements over a relatively wide frequency range have been found to yield valuable detailed information about the properties of aluminum oxide layers formed under different anodizing and sealing conditions. Discrimination was possible between properties of the dense barrier layer and the porous outer layer, and changes arising from aging and from the effects of natural environmental conditions were reflected in the results.

An EIS monitor has been used for the detection of paint degradation under atmospheric exposure. A model is being developed to help predict the lifetime of protective organic coatings on steel based on short laboratory tests. The model includes the steps of defect formation, transport of corrodents, loss of adhesion, and corrosion. EIS helps elucidate how these four processes interact and depend on coating processes and environmental effects.

Although IS analysis should properly be carried out only on time-invariant data (data obtained from a system whose properties are independent of time), some of these properties are often not time-invariant during measurement of a corroding system. If the change is slow compared to the required measurement time and/or if it is approximately linear in time, improved results may be obtained by making a set of measurements from low to high frequencies immediately followed by one from high to low frequencies. Averaging of the results will then eliminate much of the variation with time.
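Why the up-and-down sweep helps can be seen in a small Python sketch. It assumes an idealized case that the article does not spell out: every measured value is scaled by a drift factor 1 + kt that is exactly linear in time, and the points are taken at equal time steps. All names and numbers are hypothetical.

```python
def sweep_times(n_points, dt):
    """Measurement times for an upward sweep followed at once by a downward one.

    Entry j of both lists refers to the same frequency f_j.
    """
    up = [j * dt for j in range(n_points)]
    down = [(2 * n_points - 1 - j) * dt for j in range(n_points)]
    return up, down


def averaged_drift_factor(n_points, dt, drift_rate):
    """Average of the linear drift factor 1 + k*t over the two visits to each frequency."""
    def factor(t):
        return 1.0 + drift_rate * t

    up, down = sweep_times(n_points, dt)
    return [0.5 * (factor(t_up) + factor(t_down)) for t_up, t_down in zip(up, down)]
```

Under these assumptions each frequency is visited at two times that are symmetric about the midpoint of the whole run, so the averaged drift factor is the same for every frequency and the averaged spectrum corresponds to the state of the system at that midpoint. For drift that is only approximately linear the cancellation is correspondingly approximate.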

A test of time invariance can be made by analyzing the data with the Kramers–Kronig (K–K) relations, integral transforms connecting real to imaginary parts of the data and vice versa. They are only applicable for time-invariant systems. All useful fitting models and equivalent circuits are minimum phase and so automatically satisfy the K–K relations. Thus, a good fit is evidence of time invariance. Strong failure of the K–K relations for a given set of data is immediate evidence of unwanted time variation, and, unlike CNLS fitting, no model or circuit is required to carry out such a test. Although ordinary K–K analysis requires much computation and uncertain extrapolations as well, an alternate program available in LEVM avoids such difficulties, uses only measured data, and can readily test for time invariance as well as estimate the imaginary-part response associated with real-part data, or vice versa.
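For orientation, one common form of the K–K transform from real to imaginary part can be sketched in Python as follows. This is an illustration of ordinary K–K quadrature on a model function known at all frequencies, not the LEVM procedure mentioned above. The sign convention Z = Z′ + iZ″, the substitution x = tan θ, and all names are assumptions made for this example.

```python
import math


def kk_imag_from_real(z_real, omega, n=20000):
    """Estimate Z''(omega) from the real part Z'(x) using

        Z''(w) = (2 w / pi) * integral_0^inf [Z'(x) - Z'(w)] / (x^2 - w^2) dx.

    The half-line is mapped to 0 < theta < pi/2 by x = tan(theta) and the
    integral is evaluated with the midpoint rule.
    """
    h = (math.pi / 2) / n
    zr_w = z_real(omega)
    total = 0.0
    for k in range(n):
        x = math.tan((k + 0.5) * h)
        denom = x * x - omega * omega
        if denom == 0.0:
            continue  # removable singularity: skip an exact hit on x = omega
        # dx = (1 + x^2) d(theta)
        total += (z_real(x) - zr_w) / denom * (1.0 + x * x)
    return 2.0 * omega / math.pi * total * h
```

Applied to the real part R/(1 + ω²τ²) of a parallel RC element, the routine should reproduce the known imaginary part −Rωτ/(1 + ω²τ²). With measured data the integral has to be truncated or extrapolated, which is the difficulty the text refers to.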

2. Batteries and Fuel Cells

EIS studies have been made of the kinetics of the insertion reaction in solid-state batteries based on such reactions. A single EIS experiment allows information to be obtained about the electrode-interface reaction and diffusion in the electrolyte or electrode. Measurements at different battery voltages to determine the dependence of the results on the charge of the battery have led to increased understanding of the discharge process and thence to improved battery design.

An interesting EIS study has been carried out on electrochemically impregnated Ni electrodes from four different manufacturers of Ni/H cells. The EIS measurements were made in KOH electrolyte, and large differences were found in the impedance behavior of the electrodes from the different manufacturers. The results indicated a probable correlation between impedance parameters and cell life and performance.

EIS studies of molten carbonate fuel cells have increased the understanding of processes going on under operating conditions of the cell. In particular, they have helped identify and elucidate the reactions that occur both at the anode and at the cathode.

3. Other Devices and Techniques

Electrolyte-insulator-semiconductor sensors meld integrated circuit technology with traditional chemical technology. They can be used to monitor pH changes, for example, and can be constructed with ion-selective membranes to make them sensitive to a specific ion. IS measurements and analysis can yield, in favorable cases, information on the electrical characteristics of the electrolyte, the insulator, the semiconductor, and the various interfaces and on interface states. The IS approach allows very low surface-state densities at the insulator-semiconductor interface to be determined. Measurements have shown, however, that it is the electrolyte–insulator interface that responds to pH changes.

Solid-electrolyte chemical sensors are electrochemical cells designed to measure the concentration or pressure of chemical species in gases or fluids; for example, zirconia-based solid electrolytes have been used to measure oxygen concentration. Such sensors are employed to measure the oxygen concentration in steel melts and the air–fuel ratio in automobile engines. EIS has been found very useful to study (and to help optimize) electrode materials and appropriate pretreatment preparation for such sensors.

In recent years a number of variants on and extensions of IS have been developed. An important one is electrohydrodynamic impedance. Here the speed of a rotating-disk electrode is modulated sinusoidally, resulting in modulation of the mass transport in a liquid electrolyte. Such modulation allows the minimization of the coupling with interfacial kinetics. Modulation of numerous other quantities in an IS experiment is also possible, such as light, temperature, or magnetic field. Thus analysis of other transfer functions, cause and effect relations that go beyond potential and current, can add valuable additional information to IS studies. It is likely that considerable future development will be concerned with such possibilities.

SEE ALSO THE FOLLOWING ARTICLES

BATTERIES • CORROSION • ELECTROCHEMICAL ENGINEERING • ELECTROCHEMISTRY • ELECTROLYTE SOLUTIONS, THERMODYNAMICS • ELECTROLYTE SOLUTIONS, TRANSPORT PROPERTIES • FUEL CELLS, APPLICATIONS IN STATIONARY POWER SYSTEMS

Encyclopedia of Physical Science and Technology EN007D-335 June 30, 2001 15:29

Incommensurate Crystals and Quasicrystals

Uwe Grimm
The Open University

Max Scheffer
Chemnitz University of Technology

I. Aperiodic Crystals
II. The Structure of Quasicrystals
III. Physical Properties of Quasicrystals
IV. Concluding Remarks

GLOSSARY

Aperiodic crystal Crystalline structure without three-dimensional lattice periodicity.

Approximant phase Periodic crystalline phase that approximates a quasicrystalline phase.

Commensurate Having a rational ratio.

Crystal Ordered structure with essentially pure Bragg diffraction pattern; the current definition includes both periodic and aperiodic crystals.

Decagonal phase Quasicrystal with one periodic direction and a 10-fold rotational symmetry.

Dodecagonal phase Quasicrystal with one periodic direction and a 12-fold rotational symmetry.

Icosahedral phase Quasicrystal with icosahedral symmetry and no periodic direction.

Icosahedral symmetry The symmetry of the regular icosahedron. It is the largest symmetry group of a three-dimensional regular polyhedron and comprises two-, three-, and fivefold rotational symmetry axes.

Incommensurate Having an irrational ratio.

Incommensurate phase Aperiodic crystal whose structure is based on periodic lattices. The aperiodicity may be due to an incommensurate modulation of the periodic lattice or an incommensurate combination of several periodic lattices.

Lock-in transformation An incommensurate-to-commensurate phase transition.

Modulated structure Result of a (small) periodic distortion of a periodic pattern. It is periodic when the two periods are commensurate; an incommensurate modulation results in an aperiodic structure.

Octagonal phase Quasicrystal with one periodic direction and an eightfold rotational symmetry.

Phason Degree of freedom associated with relative phases of incommensurate waves.

Quasicrystal Aperiodic crystal without an underlying periodic lattice structure, usually showing noncrystallographic symmetries that cannot occur in periodic crystals.

Quasiperiodic A particular type of aperiodicity where the Fourier transform is pure point, i.e., consists only of Bragg peaks, and is supported on the (usually dense) set of integral linear combinations of a finite set of vectors in Fourier space.

Quasiperiodic tiling Quasiperiodic space-filling tiling that serves as the analogue of the periodic lattice in structure models of perfect quasicrystals.

Random tiling Space-filling tiling in which tiles are arranged randomly. Random tilings are believed to be a more realistic description of the structure of real quasicrystals than perfect quasiperiodic tilings.

SOLIDS have traditionally been classified as either crystalline or amorphous. The basic property that distinguishes a crystal from an amorphous or glassy material is the long-range positional order of its microscopic constituents. Classical crystallography deals with lattice-periodic structures that can be described by a space-filling periodic repetition of a single microscopic building block, the so-called unit cell. However, order does not imply periodicity, and over the last decades it has become evident that aperiodically ordered materials not only are theoretically possible, but commonly realized in nature. Aperiodic crystals can be classified into incommensurate crystals, known since the 1950s, on the one hand, and quasicrystals, discovered in the early 1980s, on the other. In the former, aperiodicity is due to the combination of several periodic structures with incommensurate periods. Quasicrystals are, in a sense, a more radical manifestation of aperiodic order, as the atomic positions cannot be interpreted in terms of underlying periodic lattices in three dimensions. They are usually identified by symmetries that are incompatible with lattice periodicity and hence forbidden in classical crystallography, such as icosahedral symmetry. The beautiful symmetry, the peculiar aperiodic order, the rather intricate and subtle atomic structure, the unique (and only partially understood) physical properties, and, last but not least, the quest for technical applications have made quasicrystals an important topic of crystallography, mathematics, physics, chemistry, and materials science.

I. APERIODIC CRYSTALS

Crystals, in our common perception, are characterized by their morphology, their faceted shape, and have traditionally been classified according to their symmetry. Obviously, the regularity of crystals reflects the underlying order of their microscopic structure. It can be visualized by the beautifully ordered patterns of sharp diffraction spots, so-called Bragg peaks, as observed, for instance, in X-ray diffraction experiments. For a long period of time, it was taken for granted that the microscopic structure of crystals, apart from defects that exist in any real solid, is periodic in space. In other words, associated with a crystal there is a three-dimensional periodic lattice, and this lattice also determines the possible symmetries that may be apparent in the crystal shape.

A. Periodic Crystals

A “conventional,” periodic crystal is thus characterized by a periodic lattice. Once the distribution of atoms in a single fundamental domain of the lattice, a unit cell, is known, the entire structure is determined by periodicity. In addition to this translational symmetry, the crystalline structure may have other symmetries such as a rotational symmetry with respect to a certain axis. However, these symmetries have to be compatible with each other, and this restricts the possible symmetries of a periodic three-dimensional crystal to one of 230 crystallographic space groups classified by Schoenflies and von Fedorow in the late 19th century. In particular, the crystallographic restriction only concedes two-, three-, four-, and sixfold rotational symmetry. Other symmetries, such as fivefold rotational symmetry or icosahedral symmetry, cannot be reconciled with lattice periodicity in three-dimensional space, and thus cannot be accommodated in periodic crystals.
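The crystallographic restriction can be checked with a standard argument that is not spelled out in the text: a rotation that maps a lattice onto itself is represented by an integer matrix in a lattice basis, so its trace, 1 + 2 cos(2π/n) for an n-fold rotation in three dimensions, must be an integer. The short Python sketch below applies this test; the function name is a hypothetical choice.

```python
import math


def lattice_compatible_orders(max_order=12, tol=1e-9):
    """Rotation orders n whose 3D rotation matrix has an integer trace."""
    allowed = []
    for n in range(1, max_order + 1):
        trace = 1.0 + 2.0 * math.cos(2.0 * math.pi / n)
        if abs(trace - round(trace)) < tol:
            allowed.append(n)
    return allowed
```

Apart from the trivial case n = 1, only n = 2, 3, 4, and 6 pass the test; fivefold, eightfold, tenfold, and twelvefold rotations fail it.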

In general, the atomic structure of a material need not manifest itself in the shape of its surface; for instance, the structure of gold cannot be guessed by the morphology of a gold nugget. Only if the surfaces correspond to special planes, for instance, planes parallel to the faces of a cube in a cubic crystal, does the shape of the crystal reflect its atomic structure. This happens if special surfaces of a crystal are energetically favored and if the growth velocity of these surfaces is high. For example, minerals frequently develop beautiful facets.

B. Diffraction

Direct access to the underlying atomic structure and its long-range order is provided by diffraction experiments. An X-ray, electron, or neutron beam is scattered by the sample, and interference gives rise to a diffraction pattern that can be recorded, providing information about the structure of the material. Let ρ(r) denote the density of scatterers in space, and q = k_out − k_in the scattering vector, i.e., the momentum difference between the incoming and the scattered radiation. Provided that the scattering is elastic and that multiple scattering can be neglected, the measured intensity I(q) is proportional to the Fourier transform g(q) of the pair correlation or Patterson function

$$g(\mathbf{r}) = \int d^3r'\, \rho(\mathbf{r} + \mathbf{r}')\, \rho(\mathbf{r}'). \qquad (1)$$

In many situations, I(q) can also be expressed directly in terms of the absolute square of the Fourier transform ρ(q) of the scattering density,

$$I(\mathbf{q}) \sim |\rho(\mathbf{q})|^2. \qquad (2)$$
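The equivalence of the two routes to the intensity, via the Patterson function of Eq. (1) or via the absolute square in Eq. (2), can be checked on a toy example. The Python sketch below uses a discrete, cyclic, one-dimensional density with arbitrary values; the discretization and all names are assumptions made for illustration.

```python
import cmath


def dft(values):
    """Plain discrete Fourier transform of a finite sequence."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * m / n) for m, v in enumerate(values))
        for k in range(n)
    ]


def patterson(rho):
    """Cyclic pair correlation g_m = sum_j rho_(j+m) * rho_j, cf. Eq. (1)."""
    n = len(rho)
    return [sum(rho[(j + m) % n] * rho[j] for j in range(n)) for m in range(n)]


rho = [1.0, 0.0, 0.5, 0.0, 0.0, 2.0, 0.0, 0.0]          # arbitrary real density
intensity_from_g = [z.real for z in dft(patterson(rho))]  # Fourier transform of g
intensity_from_rho = [abs(z) ** 2 for z in dft(rho)]      # |rho(q)|^2, cf. Eq. (2)
```

For a real density the two lists agree to rounding error, which is the discrete counterpart of the statement above.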

If the scattering density ρ is periodic with respect to the three-dimensional lattice spanned by the base vectors $\mathbf{a}_1$, $\mathbf{a}_2$, and $\mathbf{a}_3$, the diffraction intensity I(q) is concentrated on a lattice, the reciprocal lattice in momentum space. This lattice is spanned by the three dual base vectors $\mathbf{a}^*_1$, $\mathbf{a}^*_2$, and $\mathbf{a}^*_3$. The diffraction pattern thus consists of sharp peaks, so-called Bragg peaks, at positions

$$\mathbf{q} = \sum_{j=1}^{3} h_j \mathbf{a}^*_j \qquad (3)$$

in momentum space, where $h_j$, j = 1, 2, 3, are integer numbers indexing the diffraction spots.
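The dual base vectors can be computed from the direct ones by the usual cross-product construction. The Python sketch below assumes the normalization a_i · a*_j = 2π δ_ij, which suits a scattering vector defined as a difference of wave vectors; the text does not fix this convention, and crystallographers often omit the factor 2π. The helper names are hypothetical.

```python
import math


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def reciprocal_basis(a1, a2, a3):
    """Dual basis with a_i . b_j = 2*pi*delta_ij (assumed normalization)."""
    volume = dot(a1, cross(a2, a3))  # signed volume of the unit cell
    scale = 2.0 * math.pi / volume
    b1 = tuple(scale * c for c in cross(a2, a3))
    b2 = tuple(scale * c for c in cross(a3, a1))
    b3 = tuple(scale * c for c in cross(a1, a2))
    return b1, b2, b3
```

A Bragg position in the sense of Eq. (3) is then any integer combination of the three returned vectors.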

Incommensurate crystals and quasicrystals, unlike amorphous solids, also display sharp Bragg diffraction patterns, but in contrast to periodic crystals, the positions of the diffraction spots do not lie on a periodic lattice in three-dimensional momentum space. Indexing the Bragg peaks as linear combinations of three base vectors as in Eq. (3) would require irrational coefficients. However, upon enlarging the number of vectors $\mathbf{a}^*_j$, j = 1, 2, . . . , D > 3, one recovers an indexing scheme

$$\mathbf{q} = \sum_{j=1}^{D} h_j \mathbf{a}^*_j \qquad (4)$$

with integer coefficients $h_j$, but now D > 3 integers are required to index a Bragg spot. Assuming that the base vectors $\mathbf{a}^*_j$ are linearly independent with respect to integral linear combinations (otherwise one could do with fewer vectors), one finds that the set of integral linear combinations of the $\mathbf{a}^*_j$, in general, densely fills space. The generalization from Eq. (3) to Eq. (4) may appear innocent, but the question remains what the aperiodic order in real space looks like that produces such diffraction patterns.
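What "densely fills space" means can be made concrete in one dimension, where two base "vectors" 1 and √2 already give D = 2 > d = 1. The Python sketch below searches for integers h1, h2 such that h1 + h2·√2 comes close to a given target; the choice of base numbers, the search bound, and the function name are illustrative assumptions.

```python
import math


def best_index(q, max_h2, c=math.sqrt(2.0)):
    """Best approximation of q by h1 + h2*c with integers h1, h2 and |h2| <= max_h2.

    Returns the tuple (h1, h2, error).
    """
    best = None
    for h2 in range(-max_h2, max_h2 + 1):
        h1 = round(q - h2 * c)  # nearest integer for this h2
        err = abs(q - (h1 + h2 * c))
        if best is None or err < best[2]:
            best = (h1, h2, err)
    return best
```

Enlarging the search bound makes the error as small as desired for any target, so every point of the line is approached arbitrarily closely by indexed positions. In a real diffraction pattern most of these positions carry negligible intensity, which is why discrete spots are observed nonetheless.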

FIGURE 1 Graphs of f(x) (gray) and f(x + x0) (black) on the interval 0 ≤ x ≤ 30π, with x0 = 58π/(1 + √2); the vertical axis runs from −2 to 2. The gray curve has been widened to a “tube” of four times the width of the black line, indicating the size of the deviation between the two curves on the interval shown.

C. What is Aperiodic Order?

At first sight, the term “aperiodic order” may appear paradoxical. However, there exists a wealth of possible structures that, in a sense, are intermediate between the periodic order of a perfect crystal and the disorder that one may find realized in amorphous intermetallic alloys. Aperiodic order is realized in incommensurate crystals and quasicrystals, which are discussed in detail below; while the discovery of these structures came quite as a surprise at the time, it is now apparent that these are not merely rare caprices of nature, but, on the contrary, occur quite commonly.

The kind of aperiodicity encountered here is known as quasiperiodicity, and may be most easily understood by a one-dimensional example. The paradigm of a periodic function is the trigonometric function sin(x), which is periodic with period 2π, i.e., sin(x + 2π) = sin(x). Now, consider the sum of two sine functions

$$f(x) = \sin(x) + \sin(cx) = 2 \sin\left(\frac{1+c}{2}\,x\right) \cos\left(\frac{1-c}{2}\,x\right), \qquad (5)$$

where c is some fixed number. Is the function f(x) periodic? Well, this depends on the value of c. If c is a rational number, c = m/n with coprime integers m and n, then the periods 2π and 2π/c = 2πn/m of the two sine functions are commensurate, and the function is periodic with period 2πn because sin[c(x + 2πn)] = sin(cx + 2πm) = sin(cx). However, if c is irrational, say c = √2, the two frequencies are incommensurate, and f(x) is aperiodic. This can also be seen from the product form of the function f(x) that is also given in Eq. (5). Looking, for instance, at the set of solutions of f(x) = 0, we see that the sine and the cosine functions in Eq. (5) each contribute zeros at equally spaced positions x_k = 2πk/(1 + c) and x_ℓ = 2π(ℓ + 1/2)/(1 − c), respectively, but the two spacings are incommensurate if c is irrational. Still, the function f(x) retains a lot of its regularity; after all, it is just the sum of two sine functions. In fact, it is almost periodic in a sense that may be inferred from Fig. 1, which shows two different sections of the graph of the function f(x) that almost agree on a large interval.

FIGURE 2 The quasiperiodic function f(x) of Eq. (5) as a cut of the two-dimensional periodic “egg carton” function F(x, y), Eq. (6), for c = √2.
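The near-repetition described for Fig. 1 can be reproduced numerically. The Python sketch below evaluates the largest difference between f(x + x0) and f(x) on the interval 0 ≤ x ≤ 30π for c = √2 and the shift x0 = 58π/(1 + √2) quoted in the caption, and compares it with a rational case, c = 3/2, where the shift 4π is an exact period. The sampling density and the function names are assumptions.

```python
import math


def f(x, c):
    """Sum of two sine functions, cf. Eq. (5)."""
    return math.sin(x) + math.sin(c * x)


def max_deviation(shift, c, x_max, n=3000):
    """Largest |f(x + shift) - f(x)| on n + 1 equally spaced points in [0, x_max]."""
    return max(
        abs(f(x_max * k / n + shift, c) - f(x_max * k / n, c))
        for k in range(n + 1)
    )


c = math.sqrt(2.0)
x0 = 58.0 * math.pi / (1.0 + c)                       # shift used in Fig. 1
dev_irrational = max_deviation(x0, c, 30.0 * math.pi)
dev_rational = max_deviation(4.0 * math.pi, 1.5, 30.0 * math.pi)  # c = 3/2, period 2*pi*n with n = 2
```

For the irrational case the deviation stays small compared with the range of f, which spans values between −2 and 2, but it does not vanish; for the rational case it is zero up to rounding.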

In Fourier space, the function f(x) is represented by two frequencies, either commensurate or incommensurate. In general, a function is periodic if all its frequencies in Fourier space are situated on a periodic lattice, thus are integer linear combinations of d base vectors in d space dimensions. A quasiperiodic function is a generalization of this concept where, again, all frequencies are integer linear combinations of D vectors, but this number may be larger than the spatial dimension, D ≥ d, where equality implies periodicity. Thus, the Fourier transform of a quasiperiodic structure that is not periodic will be supported on the set of all integer linear combinations of D > d vectors in d-dimensional Fourier space. Thus, the diffraction pattern of a quasiperiodic structure consists of Bragg peaks located on a particular dense set of points in Fourier space, and the Bragg peaks can be indexed by D integer numbers as in Eq. (4). In mathematical terminology, the Fourier transform is finitely generated over the integers; its support is a module of rank D. One can also construct ordered structures whose Fourier transforms consist of Bragg peaks that cannot be indexed by a finite number of integers; however, such structures have not yet been observed in nature.

It is instructive to think of the function f(x) as a cut of the two-dimensional periodic function

$$F(x, y) = \sin(x) + \sin(y) \qquad (6)$$

along the line y = cx, i.e., f(x) = F(x, cx). This is shown in Fig. 2. The quasiperiodic function emerges as a section through a higher dimensional periodic function along a direction that induces the incommensurability. It is precisely the same idea that underlies the higher dimensional description of incommensurate crystals and perfect quasicrystals.

D. Incommensurate Crystals

Incommensurate structures in crystals have been known since the 1950s. These are magnetic crystals that exhibit a helical ordering of the spins which is incommensurate with the underlying periodic lattice structure. A sketch of such a situation is shown in Fig. 3a. The existence of these systems may not seem too surprising, as the incommensurability occurs due to an additional degree of freedom, the spin, in an otherwise perfectly periodic crystal. However, it did not take long until evidence for incommensurability of the structure itself was found in the form of so-called satellite reflections in diffraction patterns. The satellite peaks show up next to the main reflections, hence the name, but their coordinates with respect to the lattice of main peaks are not simple fractions, and may even depend continuously on temperature. In particular, this continuity invalidates the interpretation of the satellite peaks in terms of structurally stable periodic superstructures, that is, further periodic structures with a large unit cell superimposed on the original lattice.

FIGURE 3 Examples of incommensurate structures. (a) Incommensurately ordered degree of freedom. (b) Displacive modulation. (c) Occupational modulation. (d) Composite structure.

Sketches of several possible scenarios of incommensurability in a crystalline structure are compiled in Fig. 3. Most important are the modulated structures depicted in Figs. 3b and 3c. These crystals are characterized by a periodic deviation, the modulation, from their underlying periodic lattice structure. The structure is incommensurate if the period of the modulation does not match the lattice periodicity. In Fig. 3b, the modulation is displacive, i.e., the positions of the atoms are shifted. Figure 3c shows another scenario where the deviation is occupational. In this case, the periodic modulation determines the occupation probability of the perfect lattice positions. Finally, Fig. 3d shows an example of an incommensurate composite structure which consists of two subsystems, indicated by two different symbols, each of which by itself is perfectly periodic, but which are incommensurate with each other.

For clarity, the modulations in Fig. 3 are chosen to be one dimensional, i.e., the deviation from the perfect periodic structure occurs in one direction only. For such systems, one would need D = d + 1 base vectors in Eq. (4) to describe their Fourier transform. The additional vector accounts for the periodic modulation in one direction of space, which need not coincide with any lattice direction of the basic structure. There exist many examples of one-dimensional modulations; higher dimensional modulations also occur in nature, where the dimension of the modulation may be defined as the number of additional vectors D − d in Eq. (4) that are needed in order to describe the diffraction pattern of the structure.
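The origin of satellite reflections can be illustrated numerically. The following Python sketch is an illustration only and is not taken from the article: it assumes a one-dimensional chain (d = 1) of point-like scatterers with unit lattice constant, a sinusoidal displacive modulation with an arbitrarily chosen amplitude of 0.05, and a modulation wave number q = (3 − √5)/2, picked merely because it is irrational. The function names `modulated_chain` and `intensity` are hypothetical.

```python
import cmath
import math


def modulated_chain(n_atoms, q, amplitude):
    """Positions x_n = n + amplitude * sin(2*pi*q*n) of a displacively modulated chain."""
    return [n + amplitude * math.sin(2 * math.pi * q * n) for n in range(n_atoms)]


def intensity(positions, k):
    """Normalized diffraction intensity |sum_n exp(-2*pi*i*k*x_n)|^2 / N^2."""
    amplitude = sum(cmath.exp(-2j * math.pi * k * x) for x in positions)
    return abs(amplitude) ** 2 / len(positions) ** 2


Q = (3 - math.sqrt(5)) / 2          # assumed irrational modulation wave number
chain = modulated_chain(2000, Q, 0.05)

main_peak = intensity(chain, 1.0)        # main reflection, index (1, 0)
satellite = intensity(chain, 1.0 + Q)    # first-order satellite, index (1, 1)
background = intensity(chain, 1.2)       # generic position between the peaks

print(f"main {main_peak:.3f}  satellite {satellite:.3f}  background {background:.1e}")
```

Under these assumptions, the intensity is concentrated at the main reflection and at the satellite displaced from it by q, whereas a generic position carries almost none. Indexing the satellite requires a second integer in addition to the lattice index, in line with D = d + 1 = 2.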

E. Quasicrystals

Quasicrystals entered the scene only in the early 1980s when icosahedral symmetry was found in a selected-area electron diffraction analysis of a rapidly cooled Al–Mn alloy (see Fig. 4). Figure 5 shows the two Platonic solids with icosahedral symmetry, the icosahedron and the dodecahedron. The 20 faces of the icosahedron are equilateral triangles; they meet in 30 edges and 12 vertices. The dodecahedron consists of 12 faces that are regular pentagons, and comprises 30 edges and 20 vertices. Both polyhedra show the same symmetry. There are six fivefold axes, connecting opposite vertices of the icosahedron or the centers of opposite pentagons of the dodecahedron, respectively. The 10 threefold axes connect the centers of opposite faces of the icosahedron or opposite vertices of the dodecahedron, respectively; in both cases, the 15 twofold rotational axes connect midpoints of opposite edges. The three different types of symmetry axes and their relative orientations in space are perfectly recovered in the diffraction pattern of Fig. 4.
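The counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below is an added illustration (the dictionary layout and variable names are assumptions made here): it verifies Euler's polyhedron formula V − E + F = 2 for both solids, derives the number of axes from pairs of opposite vertices, faces, and edges of the icosahedron, and sums the proper rotations, which gives the well-known order 60 of the icosahedral rotation group.

```python
# Face, edge, and vertex counts as stated in the text.
solids = {
    "icosahedron": {"faces": 20, "edges": 30, "vertices": 12},
    "dodecahedron": {"faces": 12, "edges": 30, "vertices": 20},
}


def euler_characteristic(solid):
    """V - E + F, which equals 2 for any convex polyhedron."""
    return solid["vertices"] - solid["edges"] + solid["faces"]


ico = solids["icosahedron"]
# Each rotation axis passes through a pair of opposite elements of the icosahedron.
fivefold_axes = ico["vertices"] // 2    # through opposite vertices
threefold_axes = ico["faces"] // 2      # through centers of opposite faces
twofold_axes = ico["edges"] // 2        # through midpoints of opposite edges

# An n-fold axis contributes n - 1 non-trivial rotations; add the identity.
rotation_group_order = 1 + 4 * fivefold_axes + 2 * threefold_axes + 1 * twofold_axes

for name, solid in solids.items():
    print(name, "V - E + F =", euler_characteristic(solid))
print("axes (5-, 3-, 2-fold):", fivefold_axes, threefold_axes, twofold_axes)
print("proper rotations:", rotation_group_order)
```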

Evidently, the diffraction pattern consists of sharp spots, and thus the structure must be ordered. However, as icosahedral symmetry is incompatible with lattice periodicity, it cannot be a conventional periodic crystal. Even though incommensurate crystals had been known for about 30 years and an explanation of the observed diffraction pattern in terms of a quasiperiodically ordered structure was readily available, this discovery came as a surprise and gave rise to prolonged controversy. However, it soon became clear that alternative interpretations of the results in terms of periodic crystals were either inconsistent or required periodic structures with huge unit cells comprising thousands of atoms. Strictly speaking, this question can never be resolved completely because the peaks observed in experiments cannot be infinitely sharp


FIGURE 4 The first reported experimental evidence of a quasicrystal: Selected-area electron diffraction patterns of a rapidly cooled Al–Mn alloy showing icosahedral symmetry. Different symmetries were observed by tilting the sample by the angles given in the figure, exactly corresponding to the angles between different symmetry axes of icosahedral symmetry. [From Shechtman, D., Blech, I., Gratias, D., and Cahn, J. W. (1984). “Metallic phase with long-range orientational order and no translational symmetry.” Phys. Rev. Lett. 53, 1951–1953. Copyright 1984 by the American Physical Society.]

due to the finite size and disorder of the crystal and due to the limited resolution. Thus, one may always describe the experimental data in terms of a hypothetical periodic structure; when the number of atoms in a unit cell becomes too large, however, the description as an aperiodic crystal is not only more elegant and appealing, but also much simpler. In particular, it can easily account for the symmetry of quasicrystals (Fig. 6).

Soon after the discovery of icosahedral quasicrystals, intermetallic alloys with further crystallographically forbidden symmetries were found, showing either a 12-, 10-, or 8-fold symmetry axis. The corresponding diffraction patterns reveal that these quasicrystals are periodic in one direction of space, which coincides with the symmetry axis, i.e., they consist of a periodic stacking of planes with 12-, 10-, or 8-fold rotational symmetry. Accordingly, they are referred to as dodecagonal, decagonal, and octagonal quasicrystals.
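Why these rotation axes are "crystallographically forbidden" can be made explicit with the standard crystallographic restriction, which is not spelled out in the text and is added here as an illustration. A rotation by 2π/n that maps a two- or three-dimensional lattice onto itself is represented by an integer matrix in a lattice basis, so its trace, and hence 2 cos(2π/n), must be an integer. The following sketch tests this condition numerically; the function name and the tolerance are choices made here.

```python
import math


def compatible_with_lattice(n, tol=1e-9):
    """True if an n-fold rotation can map a 2D or 3D periodic lattice onto itself.

    The criterion is that 2*cos(2*pi/n) is an integer (crystallographic restriction).
    """
    value = 2 * math.cos(2 * math.pi / n)
    return abs(value - round(value)) < tol


allowed = [n for n in range(1, 13) if compatible_with_lattice(n)]
forbidden = [n for n in range(1, 13) if not compatible_with_lattice(n)]

print("allowed:  ", allowed)
print("forbidden:", forbidden)
```

Only 1-, 2-, 3-, 4-, and 6-fold axes pass the test; the 5-, 8-, 10-, and 12-fold axes observed in quasicrystals all fail it.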

II. THE STRUCTURE OF QUASICRYSTALS

One would like to reduce the atomic structure of a quasicrystal to a small number of basic building blocks analogous to the unit cell for a periodic crystal. Thus, structure models of quasicrystals consist of two parts: The geometric arrangement of the building blocks, which takes care of the quasiperiodic long-range order, and the location of the atoms within each building block, their “decoration.” The geometric arrangement of building blocks is conventionally encoded in a space-filling tiling with a finite number of prototiles. To some extent, these tilings can be visualized directly by high-resolution electron microscopy of quasicrystals. Whereas much is known about quasiperiodic tilings of space and their symmetry properties, the actual distribution of atoms in quasicrystalline solids remains largely unknown. Diffraction data alone do not suffice to derive the atomic density unequivocally, and electron microscopy methods are just on the verge of reaching the required atomic resolution. For several systems, sophisticated models have been proposed, although many details, for instance, the kind and the amount of inherent disorder, need to be unraveled. For this reason, the following discussion focuses on the geometric part.

A. A One-Dimensional Quasicrystal

It is worth starting with a one-dimensional example of a quasiperiodic structure, not merely because it is easy to understand, but because it is, in fact, reflected in higher


FIGURE 5 Regular polyhedra with icosahedral symmetry: The icosahedron (left) and the dodecahedron (right).

dimensional quasicrystalline structures. The paradigm, dwelled on in almost any introductory text on quasicrystals, is the ubiquitous Fibonacci sequence, related to the celebrated Fibonacci numbers f_n defined by the simple recursion

$$f_{n+1} = f_n + f_{n-1}, \qquad f_0 = 0, \quad f_1 = 1. \tag{7}$$

The Fibonacci sequence consists of two symbols, say L and S, and can be constructed by successive application of the substitution rule S → L, L → LS, starting, for instance, with the single letter S. The resulting sequence

LSLLSLSLLSLLSLSLLSLSL . . . (8)

is aperiodic; the ratio of the numbers of L’s and S’s in the sequence tends to the golden number τ,

$$\tau = \lim_{n \to \infty} \frac{f_{n+1}}{f_n} = \frac{1 + \sqrt{5}}{2}, \tag{9}$$

which is irrational, τ = 1.61803 . . . . The sequence can be made into the one-dimensional “tiling” of Fig. 7a by associating two intervals of different length to the two letters.
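The substitution rule and the limit in Eq. (9) are easy to reproduce. The following Python sketch is an added illustration (the names `substitute`, `fibonacci_word`, and `fibonacci_numbers` are hypothetical): it applies S → L, L → LS repeatedly to the seed S, compares the result after seven steps with the sequence of Eq. (8), and evaluates the ratio of L's to S's in a longer word.

```python
import math

TAU = (1 + math.sqrt(5)) / 2
RULE = {"S": "L", "L": "LS"}


def substitute(word):
    """Apply the substitution S -> L, L -> LS to every letter of the word."""
    return "".join(RULE[letter] for letter in word)


def fibonacci_word(n_steps, seed="S"):
    """Word obtained from the seed after n_steps substitution steps."""
    word = seed
    for _ in range(n_steps):
        word = substitute(word)
    return word


def fibonacci_numbers(count):
    """First `count` Fibonacci numbers f_0, f_1, ... from the recursion of Eq. (7)."""
    numbers = [0, 1]
    while len(numbers) < count:
        numbers.append(numbers[-1] + numbers[-2])
    return numbers[:count]


short_word = fibonacci_word(7)     # 21 letters, the portion printed in Eq. (8)
long_word = fibonacci_word(20)
ratio = long_word.count("L") / long_word.count("S")

print(short_word)
print("L/S ratio:", ratio, " tau:", TAU)
```

After n substitution steps the word contains f_n letters L and f_{n−1} letters S, so the ratio computed above is exactly f_20/f_19.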

Much as the quasiperiodic function f(x) of Eq. (5) was recovered as a cut through the two-dimensional periodic function F(x, y), Eq. (6), the Fibonacci quasicrystal can be constructed from the two-dimensional square lattice as shown in Fig. 8. Here, the shaded strip corresponds to the region swept out by a unit square of the lattice when moved along a line of irrational slope 1/τ. The lattice points within the strip are projected onto that direction, yielding a binary one-dimensional tiling of long and short intervals of length ratio τ. This tiling in “physical space” coincides, apart from a shift that depends on the location of the strip, with the Fibonacci tiling obtained from the two-letter substitution rule.
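The strip projection can be sketched in a few lines. The code below is an illustration under stated assumptions, not the construction used for Fig. 8: a lattice point (m, n) is assigned the unnormalized internal coordinate p = −m + τn, the strip swept out by the unit square accepts a half-open window of width 1 + τ in p, and the position of the strip is fixed by a hypothetical parameter `offset`. The walk starts at the origin and moves from one accepted lattice point to the next.

```python
import math

TAU = (1 + math.sqrt(5)) / 2
NORM = math.sqrt(1 + TAU ** 2)


def fibonacci_word(n_steps, seed="S"):
    """Reference word from the substitution rule S -> L, L -> LS."""
    rule = {"S": "L", "L": "LS"}
    word = seed
    for _ in range(n_steps):
        word = "".join(rule[letter] for letter in word)
    return word


def cut_and_project(n_tiles, offset=0.3):
    """Tiles and projected positions of the lattice points inside the strip.

    The strip accepts points with lo <= p < lo + 1 + TAU, where p = -m + TAU * n.
    """
    lo = -1.0 + offset
    m, n = 0, 0                      # the origin lies inside the strip for this offset
    letters, positions = [], [0.0]
    for _ in range(n_tiles):
        p = -m + TAU * n
        if p < lo + 1.0:             # only the step (0, 1) stays inside the strip
            n += 1
            letters.append("S")
        else:                        # only the step (1, 0) stays inside the strip
            m += 1
            letters.append("L")
        positions.append((TAU * m + n) / NORM)   # projection onto physical space
    return "".join(letters), positions


word, positions = cut_and_project(60)
lengths = sorted({round(b - a, 9) for a, b in zip(positions, positions[1:])})

print(word)
print("tile lengths:", lengths, " ratio:", lengths[1] / lengths[0])
```

Only two tile lengths occur, with ratio τ, and the projected word appears as a contiguous piece of the substitution sequence, which is the sense in which the two constructions agree "apart from a shift."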

By construction, it is clear that the Fourier transform of this one-dimensional structure will have pure Bragg peaks, located on the projected points of the dual lattice, which again is a square lattice. However, when projecting all lattice points, and not just those within a certain strip, the projected points are dense on the line, and one arrives at a dense set of Bragg peaks which can be indexed by two integers. Nevertheless, a measurement would yield a diffraction pattern of peaks that appear to be well separated, as shown in Fig. 7b. This apparent contradiction is resolved by realizing that only those Bragg peaks that carry more than a certain minimum intensity will be visible, and however small the minimum intensity is chosen, the set of peaks with larger intensity is discrete.
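The interplay between a dense set of peak positions and a discrete set of visible peaks can be made concrete. The sketch below is an added illustration that assumes point scatterers on the vertices of the projected tiling, so that the amplitude of the peak with indices (h, h′) is the Fourier transform of the strip cross section, i.e., a function of the form sin(x)/x of the internal component of the wave vector. All names, the interval length, and the intensity threshold are choices made here.

```python
import math

TAU = (1 + math.sqrt(5)) / 2
NORM = math.sqrt(1 + TAU ** 2)
WINDOW = (1 + TAU) / NORM           # width of the strip cross section


def peak(h, hp):
    """Position in physical space and relative intensity of the Bragg peak (h, hp)."""
    k_par = 2 * math.pi * (h * TAU + hp) / NORM
    k_perp = 2 * math.pi * (-h + hp * TAU) / NORM
    x = 0.5 * k_perp * WINDOW
    amplitude = 1.0 if x == 0 else math.sin(x) / x
    return k_par, amplitude ** 2


def peaks_in_interval(index_range, k_max, min_intensity):
    """Count all indexed positions in [0, k_max] and list those above the threshold."""
    candidates, visible = 0, []
    for h in range(-index_range, index_range + 1):
        for hp in range(-index_range, index_range + 1):
            k_par, intensity = peak(h, hp)
            if 0 <= k_par <= k_max:
                candidates += 1
                if intensity > min_intensity:
                    visible.append((k_par, intensity, h, hp))
    return candidates, sorted(visible)


candidates_20, visible_20 = peaks_in_interval(20, 20.0, 0.01)
candidates_40, visible_40 = peaks_in_interval(40, 20.0, 0.01)

print("indexed positions:", candidates_20, "->", candidates_40)
print("visible peaks:    ", len(visible_20), "->", len(visible_40))
```

Doubling the range of indices keeps adding peak positions inside the same interval, as expected for a dense set, while the list of peaks above the chosen threshold does not change.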

This one-dimensional Fibonacci structure can also be found in experiment; a beautiful example is given in Fig. 9a. As shown in Fig. 9b, the ordering is not always perfect, but the sample exhibits a disorder mechanism commonly known as “phason defects.” These are deviations from the ideal quasiperiodic structure that can be interpreted in terms of local flips in the sequence of Fig. 7 or in terms of a slight deformation of the strip in the projection of Fig. 8. In analogy to phonons that describe the motion of atoms from their ideal positions, the motion of the projection strip perpendicular to the “physical space” can be described in terms of quasi-particles called phasons whose experimental verification and characterization is a topic of current research.
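In the one-dimensional sequence, a local flip amounts to exchanging a neighboring pair LS ↔ SL. The following sketch is a hypothetical illustration of such flips (the function names and the random seed are assumptions); it shows that a flip changes the order of the tiles locally while leaving the numbers of long and short tiles, and hence the total length, unchanged.

```python
import random


def phason_flip(word, i):
    """Exchange the tiles at positions i and i + 1 if they differ (LS <-> SL)."""
    if word[i] == word[i + 1]:
        return word                  # LL cannot be flipped
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def randomize(word, n_flips, seed=0):
    """Apply n_flips flips at randomly chosen positions (reproducible via seed)."""
    rng = random.Random(seed)
    for _ in range(n_flips):
        word = phason_flip(word, rng.randrange(len(word) - 1))
    return word


perfect = "LSLLSLSLLSLLSLSLLSLSL"      # the sequence of Eq. (8)
defective = phason_flip(perfect, 3)    # single flip of the pair at positions 3, 4
disordered = randomize(perfect, 10)

changed = [i for i, (a, b) in enumerate(zip(perfect, defective)) if a != b]
print(perfect)
print(defective, "changed positions:", changed)
print(disordered)
```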

B. Quasiperiodic Tilings

The projection approach of Fig. 8 can be generalized in a straightforward way to construct quasiperiodic tilings of space in any dimension. In particular, three-dimensional tilings with icosahedral symmetry and planar tilings with 8-, 10-, or 12-fold symmetry can be obtained. Some of these, such as the celebrated Penrose tiling shown in Fig. 10, had already been known to mathematicians and theoretical physicists before quasicrystals were discovered. As an example, the diffraction pattern of the


FIGURE 6 A single-grain icosahedral HoMgZn quasicrystal, grown from the ternary melt, on a millimeter-scale background. [Figure courtesy of P. C. Canfield and I. R. Fisher; reprinted with permission from Fisher, I. R., Cheon, K. O., Panchula, A. F., Canfield, P. C., Chernikov, M., Ott, H. R., and Dennis, K. (1999). “Magnetic and transport properties of single-grain R–Mg–Zn icosahedral quasicrystals [R = Y, (Y_{1−x}Gd_x), (Y_{1−x}Tb_x), Tb, Dy, Ho, and Er].” Phys. Rev. B 59, 308–321. Copyright 1999 by the American Physical Society.]

pentagon tiling is shown in Fig. 11, exhibiting perfect 10-fold symmetry.

FIGURE 7 (a) The Fibonacci quasicrystal and (b) its diffraction pattern.

It is worth mentioning that the golden mean τ, and in fact the Fibonacci sequence itself, reappears in 10-fold and icosahedral tilings as well as in their diffraction patterns. This is related to the fivefold rotational symmetry common to these structures because τ = 2 cos(π/5). In the diffraction pattern of Fig. 11, it can be recognized as the length ratio of distances between peaks of similar intensity. This is related to a rescaling symmetry of the quasiperiodic tiling, the so-called inflation/deflation symmetry. In essence, it is the higher dimensional version of the substitution rule that was used to construct the Fibonacci sequence: in an inflation step, each tile is dissected into several parts of tiles such that a tiling emerges whose tiles are just scaled copies of the original prototiles and that, upon rescaling, is equivalent to the original tiling. Deflation is the reverse process, in which a number of tiles are replaced by a larger tile. For the Penrose tiling, the linear rescaling factor associated with this symmetry is just τ again. This property means that the quasiperiodic ordering is of the same kind no matter at what length scale it is probed.
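For the one-dimensional Fibonacci tiling, the rescaling property can be checked directly. The sketch below is an added illustration with hypothetical names: it confirms numerically that τ = 2 cos(π/5) and τ² = τ + 1, and that the substitution L → LS, S → L, applied to tiles of lengths τ and 1, produces tiles that are exactly τ times longer, so that the substitution acts as a rescaling by τ.

```python
import math

TAU = (1 + math.sqrt(5)) / 2


def inflate_lengths(long_tile, short_tile):
    """Lengths of the composite tiles after one substitution L -> LS, S -> L."""
    return long_tile + short_tile, long_tile


new_long, new_short = inflate_lengths(TAU, 1.0)
scale_long = new_long / TAU        # growth factor of the long tile
scale_short = new_short / 1.0      # growth factor of the short tile

print("2 cos(pi/5)     =", 2 * math.cos(math.pi / 5))
print("tau             =", TAU)
print("tau^2 - tau - 1 =", TAU ** 2 - TAU - 1)
print("scale factors   =", scale_long, scale_short)
```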

Besides this nice property, the Penrose tiling has another “magic” property that it shares with a number of other tilings used in the description of quasicrystalline structures. It is the existence of perfect matching rules, which means that there exist a marking of the basic prototiles and a set of local rules that determine the possible local neighborhoods of a marked tile such that, if the marked tiles are assembled as in a jigsaw puzzle, the resulting tiling is aperiodic and, in fact, is indistinguishable from a Penrose tiling. However, these matching rules do not provide a constructive instruction to produce a perfect quasiperiodic tiling: in general, after assembling a number of tiles, one meets the situation that it is impossible to add a tile without violating the rules, and there is no information on the location where the arrangement of tiles needs to be altered to rectify the problem.

Quasiperiodic tilings are a natural generalization of periodic lattices that can account for the noncrystallographic symmetries found in diffraction experiments. The


FIGURE 8 Projection of the Fibonacci quasicrystal from the two-dimensional square lattice.

beautiful example of several different modifications of the decagonal phase in AlCoNi is shown in Fig. 12. Unfortunately, the information contained in the diffraction pattern is not sufficient to reconstruct the structure, as the phase information is lost. This is why holographic methods that can access the phase information have attracted increasing interest. For quasicrystals with one periodic direction, however, high-resolution transmission electron microscopy along the periodic direction yields direct information on the spatial distribution of structural units. An example for a decagonal AlCoNi quasicrystal is shown in Fig. 13a. Connecting similar contrasts in the image, one arrives at the tilings of Fig. 13b, at different length scales connected by multiplication with τ. This observation is related to the inflation/deflation symmetry of the Penrose tiling mentioned above. For this particular alloy, the tilings obtained in this way are very close to ideal quasiperiodic tilings. This is corroborated by Fig. 13c, which shows the projection into “internal space” corresponding to the cross section of the strip in the projection of Fig. 8. If the tiling were perfect, all projected points would come to lie inside the decagons. Indeed, for the three experimentally derived tilings of Fig. 13b, only few points fall outside the decagons. For this reason, this particular high-temperature AlCoNi phase is referred to as a highly perfect quasicrystal.

C. Clusters and Coverings

Right after the discovery of quasicrystals, the idea emerged that highly symmetric atomic clusters are the basic constituents of their structure. These may be conceived as particularly stable local configurations of atoms, and a simplistic picture of a quasicrystal would be a conglomerate of such clusters held together by “glue atoms.” While this approach can easily be combined with an underlying tiling picture, an alternative concept has recently attracted growing attention, according to which a covering of space rather than a tiling is employed.

The difference between a covering and a tiling is the possibility of overlaps. An example of a covering by a regular decagon is shown in Fig. 14. Here, the possible overlaps are restricted by the markings of the decagon and the resulting structure is in fact completely equivalent to the Penrose tiling of Fig. 10. This shows one of the advantages of the covering picture: only a single “quasi unit cell” is needed, whereas at least two different tiles are required in a Penrose tiling. Furthermore, the shape of the quasi unit cell resembles the typical motifs observed in electron microscopy, whereas the tiles usually have to be imposed artificially. Last but not least, the picture is very much reminiscent of interpenetrating atomic clusters, which is rather appealing from the physical point of view. In fact, the structures that are equivalent to Penrose tilings are characterized by a maximal density of clusters.

An example of a structure model of a decagonal AlCoNi quasicrystal based on the decagon covering is shown in Fig. 15. The model was chosen such that it fits the features seen in atomic resolution Z-contrast scanning transmission electron microscopy. Here, the size of the basic decagonal cluster is 2 nm. Note the asymmetric decoration, matching the asymmetric contrast in the Z-contrast image.


FIGURE 9 (a) Scanning tunneling microscopy image of a 1.5-nm silver film on a GaAs(110) surface and (b) a detail showing two phason defects marked by arrows. [Reprinted with permission from Ebert, Ph., Chao, K.-J., Niu, Q., and Shih, C. K. (1999). “Dislocations, phason defects, and domain walls in a one-dimensional quasiperiodic superstructure of a metallic thin film.” Phys. Rev. Lett. 83, 3222–3225. Copyright 1999 by the American Physical Society.]


FIGURE 10 The Penrose pentagon tiling.

FIGURE 11 Fourier transform of the Penrose pentagon tiling. The diffraction peaks are represented by disks whose areas are proportional to the intensity.


FIGURE 12 Electron diffraction images of four of the eight known structural modifications of the decagonal phase in AlCoNi. [From Ritsch, S., Beeli, C., Nissen, H.-U., Godecke, T., Scheffer, M., and Luck, R. (1998). Phil. Mag. Lett. 78, 67–75.]

D. Disorder and Randomness

The decagonal AlCoNi quasicrystal is a rather special example in the sense that, at high temperature, its structure is well represented by a perfect quasiperiodic tiling or covering of the plane, whereas most other structures show a large amount of configurational disorder. The fact that the most perfect structures are found at higher temperatures, and the large compositional ranges where metastable quasicrystals can be obtained by rapid solidification of liquid alloys, are indications that entropy plays an important role in the stability of quasicrystals.

One way to incorporate configurational entropy is by considering random tilings rather than perfectly ordered quasiperiodic tilings as the basis of structure models. In a random tiling, all possible space-filling arrangements of a certain set of prototiles are taken into account. A perfect tiling such as in Fig. 10 can be randomized by local


FIGURE 13 (a) High-resolution electron microscopy image of a decagonal AlCoNi quasicrystal, and tiling analysis (b) in physical and (c) in internal space. [From Ritsch, S., Beeli, C., Nissen, H.-U., Godecke, T., Scheffer, M., and Luck, R. (1996). “Highly perfect decagonal Al–Co–Ni quasicrystals.” Phil. Mag. Lett. 74, 99–106.]


FIGURE 14 This covering of the plane by marked regular decagons was shown to be equivalent to the Penrose tiling of Fig. 10 by P. Gummelt.

flips to obtain a random tiling such as shown in Fig. 16. The disorder may not be apparent at first sight, but, for instance, the star-shaped tiles are distributed in a less regular way and may appear at short distances in the randomized tiling, while they are separated by at least two pentagons in the perfect tiling. In other words, the random tiling comprises local configurations that are absent in the perfect case. In the projection setup, this can be interpreted as an arbitrary deformation of the projection strip of Fig. 8. It turns out that arrangements that lead to the highest statistical symmetry are entropically favored, which may be interpreted as an entropic mechanism that stabilizes quasicrystals. Although the tilings are stochastic, their diffraction images are still believed to be point-like. Moreover, most electron microscopy investigations of the local order in quasicrystals support this stochastic picture.

As a further benefit, the random tiling picture can also resolve the somewhat mysterious growth mechanism of quasicrystals. The problem with quasicrystal growth is that a perfect quasiperiodic structure such as the Penrose tiling in Fig. 10 cannot be grown by local growth rules. Thus, even though the perfect matching rules would allow for an assignment of energies to local configurations that result in a perfect quasiperiodic ground state, the random tiling scenario seems to be much more realistic, with the system choosing among many possible local configurations that just differ slightly in energy.

III. PHYSICAL PROPERTIES OF QUASICRYSTALS

In contrast to the incommensurate crystals, quasicrystalsconstitute a fairly coherent class of materials which share


FIGURE 15 Structure model of decagonal AlCoNi and Z-contrast images. (a) Decoration of the Gummelt decagon by transition metal atoms (large circles) and aluminum atoms (small circles). Dark spots refer to positions c = 1/2 along the periodic axis, light symbols to c = 0. The arrows denote positions that moved significantly during a first-principles relaxation of the structure. (b) The structure superimposed on a Z-contrast image. (c) Lower resolution Z-contrast image, several clusters, and their overlaps. [From Yan, Y., and Pennycook, S. J. (2001). “Chemical ordering in Al72Ni20Co8 decagonal quasicrystals.” Phys. Rev. Lett. 86, to appear. Copyright 2001 by the American Physical Society.]

similar physical properties. These are briefly summarized below.

A. Appearance of Quasicrystals in Nature

One may wonder why it took about 30 years after the first investigations of incommensurately modulated crystals until quasicrystals were finally discovered. Arguably, one reason for this is the natural appearance of incommensurate crystals. Whereas incommensurate phases are found in minerals like plagioclase feldspars, the predominant number of the known quasicrystals occur in intermetallic systems, which have to be prepared synthetically. A compilation of the concentrations of the components and the temperature ranges where such structures exist is provided by phase diagrams, which are particularly important for sample preparation. Most of the systems forming quasiperiodic and incommensurate crystals show, in addition, a variety of phases with different structures. Since the complexity of phase diagrams increases with the number of phases, phase diagrams of such systems are generally intricate. An example is given in Fig. 17, which shows a cut through the aluminum-rich part of the three-dimensional phase diagram of the


FIGURE 16 A randomized pentagon tiling.

ternary Al–Pd–Mn system at a constant temperature of 873 K.

B. Morphology

In certain cases, it is possible to grow single-grain quasicrystals from the melt. An example is shown in Fig. 6. The pentagonal surfaces of the dodecahedral crystal are perpendicular to the fivefold axis of the icosahedral structure. The existence of other planes with different symmetries was corroborated by the investigation of voids in quasicrystalline alloys. The magnificent electron micrograph in Fig. 18 depicts the surface of a hole inside an icosahedral quasicrystal. A variety of different polygons emerge, indicating twofold, threefold, and even more complex surfaces. For decagonal quasicrystals, a prismatic morphology prevails. These needle-shaped crystals, which often show a decagonal cross section, form as a result of an anisotropic growth of the quasicrystalline grains, which usually grow considerably faster in the periodic direction than in the quasiperiodic planes.

C. Mechanical Properties

The mechanical properties of metallic alloys are strongly influenced by the type and the concentration of structural defects. For instance, the plastic deformation of the material happens by the migration of defects. Quasicrystalline structures possess special kinds of defects not existing in crystalline structures. Besides dislocations, one finds structural rearrangements, so-called phasons, which do not generate structural misfits but destroy the perfect quasiperiodic order. Quasicrystals are mostly very hard and brittle, a very common property of intermetallic alloys. For example, the Vickers hardness of Al-based quasicrystals is comparable to the hardness of steel and slightly lower than the hardness of silicon. The brittleness of the quasicrystalline alloys is expressed in their low toughness, which is around 40 times lower than that of other Al-based alloys. This circumstance changes at higher temperatures, above about 900 K, where a brittle-to-ductile transition was experimentally observed in which a softening of the material occurs. Explaining this behavior requires knowledge of the structure and the kinetics of the defects. Due to the high symmetry of icosahedral quasicrystals, a higher isotropy of the mechanical properties compared to crystals was expected and was verified by experiments.

D. Electronic Properties

As it is the electronic interaction of the charged constituents that holds a solid together, the electronic structure


FIGURE 17 Cut through the Al-rich part of the three-dimensional phase diagram of the ternary system Al–Pd–Mn at a constant temperature of 873 K. The concentration of the three components is represented by a plane. Lines in the diagram separate regions of thermodynamic equilibria of one or more phases, depending on the concentration. For most compositions, no single-phase structure can exist (light and middle gray). Those alloys will decompose into several phases with different compositions. Regions where only a single phase exists (dark gray) are always separated by regions in which several phases are coexistent. Remarkable is the coexistence of a decagonal (D) and an icosahedral (i) phase in the two-phase region (D + i) in one ternary system. [Figure courtesy of T. Godecke and R. Luck; from T. Godecke and R. Luck (1995). “The aluminum–palladium–manganese system in the range from 60 to 100 at. % Al.” Z. Metallkd. 86, 109–121.]

of a solid is important for stability. If, in turn, the spatial arrangement of the atoms influences the electronic structure, a complex interplay between the electrons and the structure will result. This is often observed in intermetallic phases, where, under special conditions, the electronic system favors special atomic structures. For many quasicrystals, a so-called Hume–Rothery stabilization is assumed. Its main fingerprint is the development of a pseudogap in the electronic density of states at the Fermi level, which may also explain the transport anomalies observed in quasicrystals. For instance, the electrical conductivity is very low, so quasicrystals are poor conductors. The conductivity of quasicrystals, contrary to the conductivity of metals, decreases enormously as the


FIGURE 18 A beautifully faceted hole in an icosahedral quasicrystal. [From Beeli, C., Godecke, T., and Luck, R. (1998). “Highly faceted growth shape of microvoids in icosahedral Al–Mn–Pd.” Phil. Mag. Lett. 78, 339–348.]

temperature is lowered, and it also appears to decrease with increasing structural perfection of the sample. Similar anomalies are also observed in other transport properties, such as thermal conductivity, Hall coefficients, and thermopower.

E. Magnetic Properties

As mentioned above, magnetic moments can form incommensurate phases, even if the moments are situated on a periodic lattice. Ordering phenomena of magnetic moments in quasicrystals could be rather interesting due to the geometric frustration that may be caused by the aperiodic structure. Experimentally, magnetic properties were investigated mainly for Al-based quasicrystals and for quasicrystals with the composition ZnMgRE, where RE denotes a rare earth metal. Besides the approximately 70 at. % Al, many Al-based quasicrystals contain transition metals such as Mn, Fe, Ni, and Co. In the pure metals, these atoms show magnetic moments, which originate from partially occupied 3d states. As a consequence of the changed electronic structure in the quasicrystal, however, these magnetic moments vanish together with the partial occupation of the 3d states. Thus, high-quality Al-based quasicrystals often show diamagnetic behavior even though they contain a fair proportion of transition metal atoms. Concerning the ZnMgRE quasicrystals, the situation is different. In contrast to the 3d states of the transition metals, the 4f states of the rare earth metals cannot be filled by the electrons of the other constituents, and the magnetic moments survive. However, neutron diffraction experiments show that these moments are only short-range ordered; no long-range magnetic order in a quasicrystalline alloy has been found. At very low temperatures, around 4 K, these phases behave like spin glasses, which means that the short-range order of the spins becomes frozen.

F. Applications

An exceptional property of some quasicrystalline phases is their very low surface energy, which results in a wetting behavior that lies in between that of PTFE (Teflon) and that of normal metals. Their high scratch resistance makes quasicrystalline materials well suited for coatings. The hardness as well as the low weight of quasicrystalline materials can be exploited in composite materials, in which advantageous properties of the components can be


combined. Icosahedral quasicrystals based on titanium can store up to two hydrogen atoms per metal atom, which makes them good candidates for use in hydrogen technology.

IV. CONCLUDING REMARKS

Aperiodic crystals not only form a fascinating chapter of modern crystallography, but are also of importance for a variety of scientific disciplines. On the mathematical side, one is interested in the aperiodic ordering and its mathematical description as well as in a characterization of the plethora of possible structures that still may be found to exist. As far as the physics of quasicrystals is concerned, the understanding of the physical properties on the basis of their structure is at the center of the interest. However, this may first require a more detailed knowledge about the structure than is available to date, and, in particular, a thorough account of the type and amount of the inherent disorder in quasicrystals. Current technological applications of quasicrystals, partly still in a preliminary stage, look promising, and further research should be rewarding.

Aperiodic crystals, like other surprising discoveries,have again taught us that even long-held beliefs in sciencemay eventually prove wrong. Who knows—even thoughit appears improbable today, maybe some day someonewill come up with a sevenfold quasicrystal.

SEE ALSO THE FOLLOWING ARTICLES

CRYSTAL GROWTH • CRYSTALLIZATION PROCESSES • CRYSTALLOGRAPHY



Encyclopedia of Physical Science and Technology EN008c-349 June 29, 2001 12:30

Interacting Boson Model
Bruce R. Barrett
Philip Halse
University of Arizona

I. Description of the Model
II. Interacting Boson Model-1 (IBM-1)
III. Neutron–Proton Interacting Boson Model (IBM-2)
IV. IBM-3 and IBM-4
V. Interacting Boson–Fermion Model
VI. Boson–Fermion Symmetries
VII. Other Extensions of the IBM
VIII. Microscopic Interpretations of the IBM

GLOSSARY

Atomic weight or nuclear mass number (A) Integer equal to the sum of the number of protons Z and neutrons N.

Boson Particle possessing integer angular momentum (or spin) and satisfying Bose–Einstein statistics (that is, symmetric under particle interchange).

Fermion Particle possessing half-odd-integer angular momentum (or spin) and satisfying Fermi–Dirac statistics (that is, antisymmetric under particle interchange) and thereby the Pauli exclusion principle.

Isospin Vector operator relating to the charge of particles. For the nucleon, the total isospin is T = 1/2, and the third component is T3 = +1/2 for the proton and T3 = −1/2 for the neutron.

Parity Symmetry of a wave function under inversion of the coordinate system: r → −r. The wave function either remains unchanged (even or + parity) or changes sign (odd or − parity).

Seniority (v) Integer equal to the number of nucleons in a nucleus not coupled pairwise to zero.

FOR OVER 30 years, nuclear structure physics has been dominated by two models, the single-particle shell model, developed by Maria Goeppert-Mayer and J. H. D. Jensen, and the collective model, developed by Aage Bohr and Ben Mottelson. The shell model is successful in explaining the so-called magic numbers (or closed shell values) for protons and neutrons that lead to highly stable nuclei. It is also able to describe the properties of light nuclei and of nuclei near closed shells. However, because of the large number of possible states, shell-model calculations for medium-mass and heavy mass nuclei away from closed shells are prohibitively difficult. On the other hand, the collective model is phenomenologically successful in treating the nucleus as a liquid drop, whose excitations are taken to arise from rotations and small oscillations about an equilibrium shape, with the modes corresponding to quadrupole (angular momentum two) deformations dominating. On quantization, this model can be expressed in terms of angular-momentum-two phonons (that is, bosons).

Although considerable effort has been made to unite these two models since their development, these investigations have met with only partial success. In 1974, Akito Arima and Francesco Iachello introduced a new model, the interacting boson model (IBM), which is an algebraic model and offers the real possibility of providing the missing link between the single-particle shell model and the collective model, in that it contains features of both. Although the IBM was first developed for medium-mass to heavy mass nuclei with an even number of protons and an even number of neutrons (so-called even–even nuclei), it has now been extended to describe odd-mass nuclei (even–odd and odd–even nuclei) and odd–odd nuclei, the latter being the most difficult to understand. For historical as well as practical reasons, the IBM for even–even nuclei will be described first.

I. DESCRIPTION OF THE MODEL

The shell model treats the nucleus as a system of neutrons and protons interacting through the strong interaction. Neutrons and protons are collectively referred to as nucleons and are fermions, because they have an intrinsic spin angular momentum of one-half. As fermions, they satisfy Fermi–Dirac statistics and obey the Pauli exclusion principle, which states that no two fermions can occupy the same state in the same system, that is, they cannot have the same set of classifying quantum numbers. The Pauli exclusion principle leads to the filling of shells (or levels) produced by the mean field of the nucleons. As in atoms, the filling of a shell leads to a highly stable structure, with all the angular momenta of the nucleons in the shell summing up to zero. In the shell model, such structures are assumed to be inert, and nuclear properties are described in terms of the remaining nucleons (that is, the valence nucleons) moving outside the closed shells.

When two alike nucleons occur outside a closed shell, it is observed that their angular momenta couple to zero in the nuclear state of lowest energy, that is, the ground state. In fact, it is found empirically that the ground-state angular momenta (J) of all even–even nuclei are zero. The physical explanation of this result is the short range of the attractive strong interaction, that is, oppositely aligned angular momenta of the alike nucleons produce maximum overlap of the nucleons’ wave functions and so the largest interaction. By similar reasoning, the next lowest energy states of the two alike nucleons are J = 2, then J = 4, etc. Figure 1 shows the relative strength of the alike-nucleon interaction versus the total angular momentum J of the nucleons. For two alike nucleons in the same level, only even total angular momenta can occur because of the Pauli exclusion principle, which requires their total wave function to be antisymmetric under particle interchange. The above empirical observation suggests that building blocks of nucleon pairs of angular momentum zero and two may play an important role in determining low-lying nuclear properties. A system of fermion pairs is symmetric under the interchange of any two pairs. Consequently, such pairs are boson-like objects. These observations, together with the known phenomenological usefulness of angular-momentum-two bosons in the geometrical model, provide the motivation for the IBM.

FIGURE 1 Strength of the two-alike nucleon interaction versus the total angular momentum J (taken from an analysis of pairs in a j = 9/2 level).
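The statement that two alike nucleons in the same level can couple only to even J can be checked by a simple counting argument. The following is a minimal sketch, not taken from the article: the function name `allowed_J` and the use of doubled projections (2m) to stay in integer arithmetic are illustrative choices. It lists every pair of distinct magnetic substates (the Pauli principle forbids equal ones), tallies the total projection M, and reads off how often each J occurs from the difference between the counts for M = J and M = J + 1.

```python
from collections import Counter
from itertools import combinations


def allowed_J(two_j):
    """Return the total angular momenta J allowed for two identical
    fermions in a single level of angular momentum j = two_j / 2.

    Projections are stored as 2*m so that all arithmetic is integer.
    """
    two_m_values = range(-two_j, two_j + 1, 2)
    # Pauli principle: the two nucleons must occupy different m substates,
    # so we enumerate unordered pairs of distinct projections.
    m_counts = Counter((a + b) // 2 for a, b in combinations(two_m_values, 2))
    max_M = max(m_counts)
    result = []
    for J in range(max_M, -1, -1):
        # Number of multiplets with this J = (#states with M = J) - (#states with M = J + 1)
        multiplicity = m_counts[J] - m_counts[J + 1]
        result.extend([J] * multiplicity)
    return sorted(result)


if __name__ == "__main__":
    # j = 9/2 level, as in Figure 1
    print(allowed_J(9))  # [0, 2, 4, 6, 8]
```

For the j = 9/2 level of Figure 1 the odd values J = 1, 3, 5, 7 drop out, leaving J = 0, 2, 4, 6, 8.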

The basic assumption of the IBM is that an even–even nucleus of Np valence protons and Nn valence neutrons in the shell model can be treated as a system of np = Np/2 valence proton bosons and nn = Nn/2 valence neutron bosons, each having angular momentum zero or two. For reasons related to the naming of angular momenta in atomic physics, the angular-momentum-zero bosons are called s bosons, and those with angular momentum two are called d bosons. Since the number of bosons is directly related to the number of valence nucleons, the number of IBM bosons is strictly conserved. The neutron bosons can be in s states (their number given by nsn) or d states (their number given by ndn), such that nn = nsn + ndn, with a similar relation for the protons. This relationship is indicated in Fig. 2. It is assumed that bosons of higher angular momenta, for example, g bosons of angular momentum four, are less probable, because the corresponding fermion pairs are less tightly bound, for the reason given earlier (see Fig. 1).

FIGURE 2 Example of the truncation in the number of levels and the number of particles involved in replacing the shell-model problem by the IBM problem. The 12 nucleons in 5 shell-model levels (left-hand side) become 6 bosons in 2 levels, s or d (right-hand side). In reality, the boson configuration shown would correspond to a superposition of many shell-model configurations.
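The boson counting np = Np/2 and nn = Nn/2 can be illustrated with a short sketch. Two ingredients here are assumptions not stated in the text above: the list of magic numbers used to locate the closed shells, and the common IBM convention that nucleons past the middle of a major shell are counted as holes rather than particles. The function name `boson_number` is hypothetical.

```python
# Assumed closed-shell (magic) numbers; not listed in the article itself.
MAGIC = [2, 8, 20, 28, 50, 82, 126]


def boson_number(nucleons):
    """Number of bosons for `nucleons` protons (or neutrons) of one kind.

    Assumption: valence nucleons are counted from the nearest closed shell,
    i.e. as particles below mid-shell and as holes above it.
    """
    lower = max((m for m in MAGIC if m <= nucleons), default=0)
    uppers = [m for m in MAGIC if m > nucleons]
    if not uppers:
        raise ValueError("nucleon number lies beyond the tabulated shells")
    particles = nucleons - lower
    holes = uppers[0] - nucleons
    valence = min(particles, holes)
    if valence % 2:
        raise ValueError("an odd valence number cannot be paired into bosons")
    return valence // 2


if __name__ == "__main__":
    # 156Gd: Z = 64, N = 156 - 64 = 92
    n_p = boson_number(64)
    n_n = boson_number(92)
    print(n_p, n_n, n_p + n_n)  # 7 5 12
```

Under these assumptions 156Gd, the nucleus of Fig. 3, has 7 proton bosons and 5 neutron bosons, 12 bosons in total.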

The IBM is a model, instead of a theory, because it is known that the nucleus is made up of fermions and not bosons. However, the IBM can be a successful phenomenological model without defining or understanding the fermionic structure of the bosons. At the present time, the exact nature of this underlying structure is not known. Attempts to associate a microscopic structure with the IBM bosons will be discussed in Section VIII.

By building the valence structure of the nucleus from s and d bosons, one gains a twofold truncation of the shell-model problem. First, the bosons exist in only two states, s and d, while the fermions may occupy several single-particle levels with various large angular momentum values, and second, the number of interacting particles is cut in half, as shown in Fig. 2. This double truncation can reduce a shell-model problem involving 10^12 or 10^14 states to a boson problem in 10^2 or 10^3, which can be easily handled on a computer. Thus, it is noted that the interacting boson model is actually a shell model for bosons, but it is much simpler to apply to heavier mass nuclei.

II. INTERACTING BOSON MODEL-1 (IBM-1)

The original version of the IBM does not distinguish between proton and neutron bosons; there are simply (np + nn) s and d valence bosons. This form of the model is referred to as the IBM-1. If one assumes that only one-body and two-body terms are important in describing the interactions among the bosons, one can easily write down a boson Hamiltonian involving all possible interactions to this order. This empirical Hamiltonian contains nine independent terms, only six of which are needed to define a spectrum for each value of the total boson number N. The strength parameters of these terms can be easily determined by fitting experimental data for a given nucleus, a procedure also often used in shell-model calculations.

In their early papers, Arima and Iachello noted that the IBM-1 Hamiltonian possesses three symmetry limits, which could be related to geometrical descriptions in the collective model. Physicists feel that symmetries in nature are very fundamental, since they are often related to conserved quantities and basic principles. In the case of the IBM, the largest symmetry is the unitary group in six dimensions, U(6). The six dimensions come from the one s boson and the five possible states of the d boson [that is, the five possible orientations of its angular momentum (J = 2) along a given axis].
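The size of the boson problem follows directly from this six-dimensional structure. As a minimal sketch (the function names are illustrative, not from the article), the number of ways to distribute N identical bosons over the six single-boson states is the binomial coefficient C(N + 5, 5). The second function cross-checks this by summing, over the number of d bosons, the ways of distributing them among the five d substates. Note that this count includes all magnetic substates, so the number of states for a fixed angular momentum J is considerably smaller.

```python
from math import comb


def u6_dimension(n_bosons):
    """Number of totally symmetric states of n_bosons in 6 single-boson states."""
    return comb(n_bosons + 5, 5)


def u6_dimension_by_counting(n_bosons):
    """Same number, obtained by summing over the number of d bosons n_d.

    For each n_d, the remaining n_bosons - n_d bosons are s bosons (one way),
    and the n_d d bosons are spread over 5 substates in C(n_d + 4, 4) ways.
    """
    return sum(comb(n_d + 4, 4) for n_d in range(n_bosons + 1))


if __name__ == "__main__":
    print(u6_dimension(1))   # 6: one s state plus five d states
    print(u6_dimension(12))  # 6188
```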

This overall U(6) symmetry for the s and d bosons can be broken in three distinct ways that contain the conserved rotation group SO(3), giving rise to three dynamical symmetry chains, as indicated in Eq. (1):

U(6) ⊃ U(5) ⊃ SO(5) ⊃ SO(3)

U(6) ⊃ SU(3) ⊃ SO(3) (1)

U(6) ⊃ SO(6) ⊃ SO(5) ⊃ SO(3)

A dynamical symmetry comes from breaking the larger symmetry by the Casimir operators of groups making up one of the subgroup chains in Eq. (1). The nuclear-physics phenomenology corresponding to the U(5) and SU(3) chains was already known. The U(5) chain is related to a spherical vibrator, while the SU(3) chain displays aspects of rotational motion. Figure 3 shows the spectrum of a nucleus exhibiting SU(3)-like structure. The SO(6) chain was a new prediction, which was later verified by experiment and shown to represent what are known as γ-soft or γ-unstable nuclei. In these symmetry limits, exact analytical solutions can be obtained to the IBM-1 Hamiltonian. Moreover, the general IBM-1 formalism provides numerical solutions for cases between the symmetry limits, known as transitional nuclei.
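A common way to tell the three limits apart is the ratio of the energies of the first 4+ and 2+ states. The sketch below uses simplified leading-order energy formulas that are assumptions added here for illustration, not expressions from the article: a harmonic spectrum E ∝ n_d for U(5), a rotor spectrum E ∝ L(L + 1) for the SU(3) ground band, and E ∝ τ(τ + 3) for SO(6), where τ is the SO(5) label. The strength parameters are hypothetical and cancel in the ratio.

```python
def energy_u5(n_d, epsilon=1.0):
    """Harmonic vibrator: energy proportional to the number of d bosons."""
    return epsilon * n_d


def energy_su3_ground_band(L, b=1.0):
    """Rigid-rotor-like ground band: energy proportional to L(L + 1)."""
    return b * L * (L + 1)


def energy_so6(tau, a=1.0):
    """Gamma-soft limit: energy proportional to tau(tau + 3)."""
    return a * tau * (tau + 3)


def r42_ratios():
    """E(4+)/E(2+) in each limit, using the simplified formulas above."""
    return {
        # 2+ is the one-d-boson state, 4+ belongs to the two-d-boson multiplet
        "U(5)": energy_u5(2) / energy_u5(1),
        "SU(3)": energy_su3_ground_band(4) / energy_su3_ground_band(2),
        # 2+ has tau = 1, 4+ has tau = 2
        "SO(6)": energy_so6(2) / energy_so6(1),
    }


if __name__ == "__main__":
    for limit, ratio in r42_ratios().items():
        print(f"{limit}: {ratio:.2f}")
```

With these assumed forms the ratios come out as 2.00, 3.33, and 2.50 for U(5), SU(3), and SO(6), respectively; transitional nuclei fall between these values.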

FIGURE 3 Comparison of the experimental spectrum of 156Gd (Z = 64) (experiment) with that corresponding to the SU(3) limit of the IBM-1 (theory). Each energy level is labeled by the value of the total angular momentum J.

III. NEUTRON–PROTON INTERACTING BOSON MODEL (IBM-2)

The key feature missing from the IBM-1 approach is the relationship of the bosons to the underlying fermionic structure of the shell model. The work of Igal Talmi has shown the importance of the interaction between valence protons and valence neutrons in producing nuclear deformations. For this reason, the IBM was later expanded to treat separately the proton boson and neutron boson degrees of freedom. This proton–neutron interacting boson model is known as the IBM-2. A completely general one- and two-body IBM-2 Hamiltonian would contain 30 independent terms, so it is usually simplified to basically two components, a term that splits the energies of the s and d bosons, related to the pairing interaction (see Section I), and a quadrupole–quadrupole interaction between the proton bosons and the neutron bosons. The latter term is the lowest order interaction that can mix states containing different numbers of s bosons and d bosons, thought to be appropriate for nuclear deformation. The parameters of the IBM-2 Hamiltonian have been determined for a wide range of medium-mass to heavy mass nuclei, mainly in the rare-earth region, and the model has enjoyed considerable success in describing the low-lying properties of these nuclei.
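The two-component Hamiltonian described in words above is commonly written in the following schematic form. The symbols are notation introduced here for illustration and do not appear in the article: ε is the d-boson energy relative to the s boson, κ is the strength of the proton–neutron quadrupole interaction, and χ_π, χ_ν are structure parameters of the quadrupole operators.

```latex
H = \epsilon \left( \hat{n}_{d\pi} + \hat{n}_{d\nu} \right)
    + \kappa \, \hat{Q}_{\pi} \cdot \hat{Q}_{\nu},
\qquad
\hat{Q}_{\rho} = \left( s^{\dagger} \tilde{d} + d^{\dagger} s \right)^{(2)}_{\rho}
    + \chi_{\rho} \left( d^{\dagger} \tilde{d} \right)^{(2)}_{\rho},
\quad \rho = \pi, \nu .
```

The first term counts d bosons and so produces the s–d energy splitting. The first bracket of the quadrupole operator changes an s boson into a d boson or the reverse, which is how the quadrupole term mixes states with different numbers of s and d bosons.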

Because the IBM-2 contains separately proton and neutron boson degrees of freedom, it yields not only states that are totally symmetric in both the charge and sd spaces (corresponding to the IBM-1 solutions), but also states of mixed symmetry in both spaces. These mixed symmetry states in the IBM-2 lead to predictions regarding new forms of collective motion. The degree of symmetry of the IBM-2 states can be classified according to a quantity known as F-spin, which treats the proton and neutron bosons as two charge states of a single particle in the same way that the isospin T treats protons and neutrons as two charge states of one particle, the nucleon. The states of maximum F-spin are totally symmetric and correspond to the IBM-1 states. States with F-spin less than the maximal value are of mixed symmetry and are believed to lie higher in energy as their F-spin value decreases. For highly deformed nuclei (i.e., in the SU(3) limit), the lowest energy mixed symmetry (or F maximal minus one) state should have a signature of angular momentum 1 and parity plus with a strong magnetic-dipole gamma transition, for which the orbital component is more important than the intrinsic spin, to the ground state. Numerous states with these properties have now been observed in rare-earth nuclei, supporting this prediction of the IBM-2. The theory also predicts mixed symmetry states in other mass regions, including those where the U(5) and SO(6) limits are appropriate. States with characteristic signatures for these limits have now been seen. The detailed study of these states provides us with new information about the structure of nuclear collective states.
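The analogy with isospin makes the allowed F-spin values easy to enumerate. The sketch below assumes the usual convention, which is not spelled out in the article, that each boson carries F = 1/2 with projection +1/2 for a proton boson and −1/2 for a neutron boson. The total projection is then (n_π − n_ν)/2, and F runs from its maximum (n_π + n_ν)/2 down to the magnitude of that projection in integer steps. The function name `f_spin_values` is hypothetical.

```python
from fractions import Fraction


def f_spin_values(n_pi, n_nu):
    """Allowed total F-spin values for n_pi proton and n_nu neutron bosons.

    Assumed convention: F_z = +1/2 per proton boson, -1/2 per neutron boson.
    The first entry is F_max (totally symmetric, IBM-1-like states);
    the remaining entries label mixed-symmetry states.
    """
    f_max = Fraction(n_pi + n_nu, 2)
    f_min = abs(Fraction(n_pi - n_nu, 2))
    values = []
    f = f_max
    while f >= f_min:
        values.append(f)
        f -= 1
    return values


if __name__ == "__main__":
    # Example: 7 proton bosons and 5 neutron bosons
    print([str(f) for f in f_spin_values(7, 5)])  # ['6', '5', '4', '3', '2', '1']
```

In this example the totally symmetric states have F = 6, and the lowest mixed-symmetry states discussed above have F = 5.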

IV. IBM-3 AND IBM-4

In most medium-mass to heavy mass nuclei, the protons and neutrons fill different major shells. In this mass region, the nuclear interaction strongly favors the proton–proton and neutron–neutron like pairs instead of pairs of neutron–proton structure. In light nuclei (mass number A less than 100), the protons and neutrons often fill the same shells. In such cases, it is equally likely to form pairs constructed from a proton and a neutron. A neutron–proton pair can be either symmetric (T = 1) or antisymmetric (T = 0) in its charge state T. The symmetric neutron–proton state has the same space–spin structure as the proton–proton and neutron–neutron pairs, so that together they form a triplet of equivalent states (i.e., the three T = 1 states). The interacting boson model constructed from these three bosons is called the IBM-3. If the antisymmetric neutron–proton charge state is included as a fourth boson, one obtains the IBM-4. The IBM-3 and IBM-4 were developed by J. P. Elliott et al. and have been successfully applied to light nuclei, mainly for 18 ≤ A ≤ 46.

Recent experiments with radioactive ion beams have produced many previously unobserved proton-rich nuclei and have now extended the region of observed N = Z nuclei as far as 94Ag (N = Z = 47). The IBM-4 may be important in understanding the structure of these newly observed N = Z nuclei, because the antisymmetric T = 0 proton–neutron pair appears to play a significant role. For nuclei where they are appropriate, the inclusion of proton–neutron bosons allows the IBM-3 and IBM-4 to describe odd–odd nuclei and β-decay, which cannot be done with the IBM-2.

V. INTERACTING BOSON–FERMION MODEL

The empirical success of the IBM for even–even nuclei encouraged its developers to expand it to odd–even and even–odd nuclei by the addition of one fermion. This odd-A version is known as the interacting boson–fermion model or IBFM. The IBFM Hamiltonian contains a boson term, a fermion term, and a third term representing the interaction (or coupling) between the bosons and fermion. As in the IBM, the IBFM can be discussed in symmetry limits, as in Eq. (1), in which the odd fermion is coupled to the valence bosons in either the U(5), SU(3), or SO(6) limits. The different limits can be related to particular cases in the collective model, such as the strong-coupling or weak-coupling limits.

The addition of an odd fermion greatly increases the number of possible parameters in this model; the number of possible states also greatly increases. For these reasons, the IBFM has been applied mainly to special cases, such as a nucleon in a single j level or in several j levels with the imposition of some boson–fermion symmetry (see Section VI). This model has also been used for studies of β-decay between odd–even and even–odd nuclei.

VI. BOSON–FERMION SYMMETRIES

In 1980, Iachello observed that in certain cases new symmetries, corresponding to simultaneous transformations of the boson and fermion systems, can be introduced. This is possible if some groups in the fermion classification (that is, group chain) coincide with some groups in the boson classification [see Eq. (1)]. Combined bose–fermi groups can then be introduced corresponding to particular couplings of the bosons and fermions.

FIGURE 4 Comparison of the theoretically predicted (theory) and observed (experiment) energy spectra for 190Os (Z = 76) and 191Ir (Z = 77), as an example of a bose–fermi symmetry in nuclei. Each energy level is labeled by the value of the total angular momentum J. The dashed lines enclose levels of the same symmetry. The solid lines indicate levels between which strong electromagnetic radiation occurs.

It was, of course, known that the conservation of the total angular momentum requires the combined system to be invariant under the total angular momentum operator (the sum of parts acting on the bosons and on the fermions), generating SO_BF(3). But it was found that spectra are often closer to those associated with combining the larger groups in Eq. (1), such as SO(6), with their fermion counterparts, implying the conservation of less well-understood quantities.

Figure 4 illustrates the spectrum of an odd-A nucleus related by such a bose–fermi symmetry to the spectrum of its even–even neighbor. It was claimed that cases such as those shown in Fig. 4 are examples of supersymmetric structure in nuclei. However, supersymmetry conventionally refers to a description involving a superalgebra, which is an algebra containing operators that transform bosons into fermions and vice versa. In fact, the cases and examples given then and in later work are for bose–fermi symmetries rather than true supersymmetries. Nevertheless, the fact that the properties of certain neighboring even–even and odd-A nuclei can be related by the same group-theoretical chains and the same Hamiltonian is of significant interest and provides new insight into the structure of complex nuclei. Present investigations regarding high-spin superdeformed bands indicate that superdeformed bands in certain neighboring even–even and odd-A nuclei may prove to be the best examples of bose–fermi symmetries in nuclei. Recent work with superalgebras indicates that examples of real supersymmetries may exist in nuclei.


VII. OTHER EXTENSIONS OF THE IBM

By its basic assumptions, the IBM is a model for low-excitation nuclear structure. For this reason, a number of expanded versions have been developed, so as to describe other nuclear properties. Angular momentum one (p) and three (f) bosons have been introduced to explain negative parity states in nuclei, and aligned pairs coupled to large values of the angular momentum have been used to describe high-spin states. Procedures have also been developed for treating configuration mixing in nuclei, such as the mixing between vibrational-like and rotational-like states.

Particle-like bosons can be combined with hole-like bosons in a similar manner as proton bosons were combined with neutron bosons in the IBM-2, thereby leading to states of different F-spin. In the case of particle and hole bosons, the states of different particle-hole symmetry are classified by I-spin, which can connect states in different nuclei.

VIII. MICROSCOPIC INTERPRETATIONS OF THE IBM

The success of this formalism involving rather abstract bosons suggests that they might represent real objects within the nucleus, in particular that they may be interpreted in terms of the valence protons and neutrons of the shell model (Section I). Investigations of this possibility make up the largest single area of research arising from the IBM.

Since a pair of fermions is bosonlike (a similarity which improves as the number of fermion states increases), a natural proposal is that the s and d bosons are modeling pairs of nucleons coupled to angular momentum 0 and 2 denoted as S and D, respectively; indeed, this idea was used to motivate our discussion of the IBM and is commonly seen as part of the IBM per se. However, concluding that the validity of this interpretation follows from the equality of statistics and angular momentum alone would be a non sequitur. Moreover, many other situations are possible, such as the bosons representing quartets of nucleons (IBM results are generally not very sensitive to the number of bosons), or even having no interpretation of the bosons singly, necessitating a more complicated many-boson–many-nucleon correspondence.

In fact, there can be no automatic answer to the question of what the bosons represent, since their interpretation must depend on the phenomena they are used to describe. For instance, s and d bosons could in principle be used to describe the giant quadrupole resonance; any shell model interpretation of such bosons would have to be very different from one designed to reflect the description of low-energy phenomena with which the model is associated in practice.

FIGURE 5 Pictorial description of one microscopic IBM procedure. The full fermion space (large circle) is truncated to an S and D collective-pair space, 1. An appropriate subset, 2, is associated through the mapping (represented by label 3) to the corresponding states, 4, in the boson space. The IBM-2 interaction can now be computed microscopically, completing the boson picture.

Nevertheless, as described above, an interpretation of the bosons as fermion pairs is almost always the basic postulate of such investigations. It is apparent that the structure of the pairs associated with the bosons must be collective in nature, because the transition from the nucleon shell model space to the collective-pair space is already a significant truncation.

Most attempts to develop a microscopic interpretation have centered around a search for the appropriate collective pairs in the fermion (that is, nucleon) space to be mapped onto (that is, related with) the bosons. This is indicated schematically in Fig. 5. Here, the large circle represents the full fermion shell model space, which is then truncated to the subspace constructed using the S (J = 0) and D (J = 2) pairs. Then, some subset, 2, restricted for computational reasons to consist of only those states with a small number of D pairs, is associated through mapping (represented by label 3) to the corresponding set of states, 4, in the boson space and used to determine the boson operators, corresponding to those of interest in the shell model.

One prescription for the collective fermion pairs is to solve the shell-model problem for two alike nucleons for J = 0 and J = 2 and to equate the lowest J = 0 eigenstate with the collective S pair and the lowest J = 2 eigenstate with the collective D pair. This procedure follows ideas suggested by I. Talmi regarding his work on generalized seniority. Other procedures have been proposed for constructing the collective fermion states to be associated with the IBM states, but there is no general agreement regarding an ideal choice.

However, the impossibility of performing shell model calculations for heavy nuclei (Section I), itself a rationale for use of the IBM, means that the validity of the pair interpretation cannot be definitively tested. It simply is not known whether the many-pair condensate analogues of the many-boson IBM states would indeed allow a reasonable approximation to the shell model eigenstates, nor whether the shell model operators mentioned above would in fact reproduce the data. As discussed above, the shell model interpretation of the bosons must be appropriate for the levels the IBM is used to model: for instance, the interpretation for a description of all the low-energy rotational bands of 156Gd (Fig. 3) would be different from that appropriate for a description of the first, fourth, and fifth bands only. (For example, if the strength of the interaction in the theoretical boson calculations were increased by around 50%, then the second and third model bands obtained would correspond to the fourth and fifth bands in the experiment, while the observed second and third bands would have no IBM counterparts.) A possible failure of the SD pair interpretation is then apparent in exact calculations for lighter nuclei, where it is found that the many-pair states describe only some of the levels that the IBM would be used to model. A similar conclusion has been obtained in an approximate calculation for 156Gd itself. If this situation does indeed persist in heavy nuclei, it would have to be concluded that the simple interpretation of the bosons as pairs is inconsistent with the use of the IBM to model all the collective low-energy levels, as is invariably the case (Fig. 3).

There is much controversy in this area, which only further research can resolve. A truly valid shell model interpretation of the elegantly simple IBM would reveal a correspondingly simple latent structure amid the complexity of realistic shell model calculations.

This said, it is worth noting that after 25 years the Interacting Boson Model approach to describing the properties of medium-to-heavy-mass nuclei has held up extremely well and has proven itself to be quite versatile and robust.

SEE ALSO THE FOLLOWING ARTICLES

GROUP THEORY • NUCLEAR PHYSICS • PARTICLE PHYSICS, ELEMENTARY

BIBLIOGRAPHY

Arima, A., and Iachello, F. (1984). “Advances in Nuclear Physics” (J. W. Negele and E. Vogt, eds.), Vol. 13. Plenum, New York.

Barrett, B. R. (1984). “Nucleon–Nucleon Interaction and Nuclear Many-Body Problems” (S. S. Wu and T. T. S. Kuo, eds.). World Scientific, Singapore.

Bonatsos, D. (1988). “Interacting Boson Models of Nuclear Structure,” Clarendon Press, Oxford.

Casten, R. F. (ed.) (1993). “Algebraic Approaches to Nuclear Structure: Interacting Boson and Fermion Models.” Contemporary Concepts in Physics, Vol. 6. Harwood Academic Publishers.

Casten, R. F., and Feng, D. H. (1984). Nuclear dynamical supersymmetry. In “Physics Today,” Vol. 37. American Institute of Physics, New York.

Casten, R. F., and Warner, D. D. (1988). The interacting boson approximation. In “Reviews of Modern Physics,” Vol. 60. The American Physical Society, New York.

Dieperink, A. E. L., and Wenes, G. (1985). “Annual Review of Nuclear and Particle Science,” Vol. 35. Annual Review Inc., Palo Alto, California.

Iachello, F., and Arima, A. (1987). “The Interacting Boson Model.” Cambridge Univ. Press, London and New York.

Iachello, F., and Talmi, I. (1987). Shell-model foundation of the interacting boson model. In “Reviews of Modern Physics,” Vol. 59. The American Physical Society, New York.

Iachello, F., and Van Isacker, P. (1990). “The Interacting Boson-Fermion Model.” Cambridge University Press, London and New York.

Mizusaki, T., and Otsuka, T. (eds.) (1996). Microscopic Study of the Interacting Boson Model. In “Progress of Theoretical Physics: Supplement,” Number 125. Yukawa Institute and the Physical Society of Japan, Kyoto.

Scholten, O. (1985). “Progress in Particle and Nuclear Physics,” Vol. 14. Pergamon, Oxford.

Talmi, I. (1993). “Simple Models of Complex Nuclei: The Shell Model and Interacting Boson Model.” Contemporary Concepts in Physics, Vol. 7. Harwood Academic Publishers.


Encyclopedia of Physical Science and Technology EN008E-966 July 3, 2001 14:53

Liquid Crystals (Physics)
Paul Ukleja
Southeastern Massachusetts University

I. Brief History
II. Description of Liquid Crystal Phases
III. Properties of Liquid Crystals
IV. Applications

GLOSSARY

Amphiphile Material in which each molecule has one part that is attracted to water and another part that rejects water.

Cholesteric Liquid-crystalline state of matter in which the molecules align in a helical structure that has a regular pitch. Over distances much smaller than the pitch, a cholesteric has a nematic structure.

Director Direction (often denoted by a unit vector, n) about which the long axes of molecules or aggregates of molecules fluctuate in liquid-crystalline phases.

Hexagonal phase Lyotropic phase in which the amphiphilic molecules aggregate into parallel cylinders that pack into a hexagonal array.

Homeotropic Alignment of liquid crystals in which the director is uniformly aligned perpendicular to the opposite, parallel surfaces of a thin, flat container.

Lamellar phase Smectic lyotropic phase commonly consisting of alternate flat layers of water and amphiphile.

Lyotropics Liquid crystals that form in solutions and change phases primarily with concentration.

Nematic State of matter in which molecules or aggregates of molecules align along a common direction, the director, but are otherwise fluid.

Order parameter Parameter that indicates the degree to which the molecules of a liquid crystal align with the director.

Polymer dispersed liquid crystals (PDLC) Composite material made of tiny droplets of liquid crystal in a polymer matrix. The droplets normally scatter light, but an applied electric field aligns them, switching the material from opaque to transparent.

Smectics Phases in which the molecules have orientational order and partial positional order, generally in layers.

Supertwist LCD Similar to the twisted nematic display but with a twist angle of 270°, giving it a faster response time. Cholesteric material is used to create the large twist angle in the relaxed state.

Surfactant Amphiphilic material; the molecules tend to arrange themselves on surfaces with water on one side.

Thermotropics Liquid crystals that change phase with changes in temperature.

LIQUID CRYSTALS are materials that have properties and characteristics of both liquids and crystalline solids. For many liquid-crystalline materials the liquid-crystalline phases, also called mesophases, occur in a range of temperatures between those at which the materials are normal liquids and solid crystals. For other liquid crystals, it is mainly the concentrations of different components of a solution that determine the phases. Many solutions or pure compounds form several distinct phases at different temperatures or concentrations. Several thousand compounds have liquid-crystalline phases in their pure states. Many biological materials, such as cell membranes, also display liquid-crystalline behavior. Since liquid crystals have both the ability to flow and the anisotropy of crystals, they display many properties of great interest and, often, of practical importance.

I. BRIEF HISTORY

The study of liquid crystals began with observations by an Austrian botanist, Friedrich Reinitzer, in 1888. Reinitzer found that cholesteryl benzoate melted from a solid at 145°C into a liquid having a murky or turbid appearance. At 179°C the liquid cleared. The color of the turbid liquid changed from red to blue as the temperature increased. On cooling, the reverse occurred. Reinitzer sent some of this material to O. Lehmann, who was able to make further studies with polarized light on a microscope equipped with a heating stage, which allowed him to vary the temperature of samples being observed. Lehmann discovered that the turbid liquid actually displayed optical anisotropy or birefringence, as do solid crystals. The combination of the ability of the material to flow like a liquid and yet retain the anisotropic optical properties of a crystal led Lehmann to coin the name “liquid crystal” to describe this state of matter.

The turbid appearance, resembling that of a colloidal solution, gave rise to early ideas that the liquid crystal was no more than such a solution, but it was later found that the liquid-crystalline state is a distinct phase of matter with fixed transition temperatures into the solid and normal or isotropic liquid states and that the molecules in a liquid crystal have orientational order.

In 1922, G. Friedel proposed a classification system that is still used extensively, dividing liquid crystals into three classes: smectic, nematic, and cholesteric. In a smectic phase, the molecules are arranged in sets of parallel planes. The smectic phases generally have high viscosity and a soapy appearance; hence the name was derived from the Greek word for “soap.” The word “nematic” is derived from the Greek word for “thread.” A nematic liquid crystal often shows a thread-like pattern when placed between crossed polarizers and viewed through a microscope. The third class was named “cholesteric” since the molecules forming these phases commonly contained cholesterol. A cholesteric has a characteristic iridescent color, which can change dramatically with changes in temperature or other aspects of the environment.

Early attempts to explain the turbid appearance of the nematic liquid crystals included the idea that the molecules in these materials grouped into swarms. The boundaries between swarms would represent variations in the optical properties of the medium that could scatter light. It was later found that nematic liquid crystals, which are most often formed from rod-like molecules, are generally homogeneous throughout, with the long axes of the molecules lining up parallel to one another. Long-wavelength thermal fluctuations in the direction of the alignment scatter visible light.

Classification and identification of liquid crystal phases were first done by using a polarizing microscope with a heating stage. Observations of the textures and how they changed from phase to phase were useful in determining some of the properties of the crystals. By mixing two compounds in various proportions and determining the resulting phase transitions, it was possible to compare the phases they formed in their pure states. The present names for the different smectic states (A, B, C, etc.) were merely assigned as they were observed and do not necessarily bear a logical relation to their structures. For many of the states formed in lyotropic materials (those whose concentrations determine the phases), there are several names arising from different lines of research.

The study of liquid crystals continued fairly strongly into the 1930s, including measurements of viscosity coefficients and development of theoretical models for the elastic and flow properties of nematics. The renewed interest in liquid crystals in the United States owes much to Glenn H. Brown of Kent State University, who organized a series of international conferences starting in the 1960s and founded the Liquid Crystal Institute at Kent State University. Low-power electronic liquid crystal displays (LCDs) are now found in a very wide range of devices. Further developments in displays and other uses of liquid crystals have stimulated a wide range of investigations, in the United States and abroad, into the nature of these phases.

II. DESCRIPTION OF LIQUID CRYSTAL PHASES

The most obvious difference between liquids and solids is the ability of a liquid to flow or to adapt its shape to its container under the influence of small external forces. On the molecular level, a crystalline solid has long-range order, that is, a strong correlation between the positions and orientations of molecules that are far apart, whereas such coordination extends to only a few neighboring molecules in a liquid. One result of this is that the physical properties of normal liquids show no distinctions among different directions; that is, liquids are isotropic. Crystals, on the other hand, often have properties that vary with direction; they are anisotropic. Light or sound, for instance, may travel faster in some directions than in others.

Liquid crystals are characterized by partial ordering; that is, one or more degrees of freedom, but not all, will have long-range order. In particular, the molecules will retain some ability to move throughout the medium, although the motion may be restricted in extent or direction. The molecules may be as little restricted as in the nematic phase where there is a preferred direction for the long axis of a molecule but the other axes and overall location are free to vary. The translational motion may also be restricted as in the smectic phases in which the molecules, besides aligning their long axes, arrange themselves in parallel planes. In plastic (as opposed to liquid) crystals the molecules are well ordered translationally but not orientationally.

Pure compounds which display liquid crystal phases as the temperature is changed are called thermotropic liquid crystals. Homogeneous mixtures of these compounds are generally also thermotropic liquid crystals. When certain substances, such as soaps, are dissolved in a suitable solvent, such as water, liquid crystalline phases are observed. These phases are called lyotropic phases from the Greek root “lyein,” to dissolve. For lyotropic phases, concentration is the main physical variable, although temperature changes can also effect phase changes. Solutions of polymers also display liquid crystalline order. Friedel’s terminology of nematic, smectic, and cholesteric is still used to describe the main classes of liquid crystals.

A. Thermotropic Liquid Crystals

The thermotropics (see Table I) are the most studied and perhaps best understood of the liquid crystals. It is thermotropic liquid crystals that are used in liquid crystal displays in wristwatches, computers, and televisions. Cholesterics, sensitive to temperature, are used to make very thin thermometers and films that change color with temperature. Most thermotropics are formed from organic molecules with a rod-like or lath-like shape. The majority of the molecules having liquid crystalline mesophases have planar and rigid nuclei, typically including two or more benzene rings. Table I shows examples of typical thermotropic liquid crystalline compounds and the temperature ranges over which they are liquid crystalline. Table II lists the most common thermotropic phases along with some of their properties.

FIGURE 1 A nematic liquid crystal. The oval shapes in this and the next two figures represent the positions and average orientations of molecules. The instantaneous directions of the long molecular axes fluctuate about the average direction (the director) by angles as large as 40°. [Courtesy of Nuno Vaz.]

1. Nematic and Cholesteric Phases

a. Ordinary nematic. This is the simplest of the liquid crystal phases. In the nematic phase the long axes of the molecules have a preferred orientation, a director, about which they fluctuate rapidly. In Fig. 1, the average orientations of the molecules are represented by cigar-shaped forms. An instantaneous picture of the molecules would show the long axes tilted at angles up to 40° away from the director. The molecules behave like a fluid in that they can move freely from point to point in the medium. The director responds to very weak external forces and usually varies from point to point in the medium. As the nematic is optically birefringent, this variation in the director over distances on the order of magnitude of the wavelength of visible light is what gives the nematic its turbid or cloudy appearance. It is possible to make a uniformly aligned sample through the use of external magnetic or electric fields or interactions with treated surfaces. This makes it possible to use nematics for displays in which electric fields are used to realign the director and thus change the optical behavior of the sample. Some nematics may be composed of groups of hundreds of molecules, called cybotactic groups, in which the molecular centers are arranged in layers.
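The degree to which the molecular axes follow the director is what the order parameter of the Glossary measures. As a minimal sketch, assuming the conventional definition S = <(3 cos²θ − 1)/2>, where θ is the angle between a molecule's long axis and the director, the Python function below averages this quantity over a list of tilt angles. The function name and the sample angles are hypothetical and serve only as an illustration.

```python
import math


def nematic_order_parameter(angles_deg):
    """Average of (3 cos^2(theta) - 1) / 2 over tilt angles given in degrees.

    theta is the angle between a molecular long axis and the director.
    """
    if not angles_deg:
        raise ValueError("at least one angle is required")
    terms = [
        (3.0 * math.cos(math.radians(angle)) ** 2 - 1.0) / 2.0
        for angle in angles_deg
    ]
    return sum(terms) / len(terms)


# Hypothetical samples: perfect alignment, and every molecule tilted by 40 degrees.
perfect = nematic_order_parameter([0.0, 0.0, 0.0])
tilted = nematic_order_parameter([40.0] * 5)
print(perfect, round(tilted, 3))
```

With this definition S = 1 for perfect alignment and S = 0 for an isotropic distribution of directions. A sample in which every molecule sat at the 40° extreme mentioned above would give S of about 0.38.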

b. Cholesteric. A cholesteric phase is similar to the ordinary nematic but with a natural twist in the director (Fig. 2). Thus the long axes of molecules that are side by side prefer to align at a small angle. On a local scale (distances on the order of tens or hundreds of molecules), the ordering is essentially nematic. Thermodynamically, the cholesteric behaves like a nematic, as the energy of the twist is small compared to the energy associated with parallel alignment of the molecules. Mixtures of cholesterics having opposite twists form cholesterics of infinite pitch, which correspond to nematics. A cholesteric is formed by adding a small amount of a cholesteric, or even a substance that is not in itself liquid crystalline but is optically active, to a nematic substance. The pitch of a cholesteric can vary sensitively with temperature, vapor pressure of certain substances, and other influences. When the pitch corresponds to the wavelength of visible light, the scattered light is highly colored. A cholesteric also rotates the direction of linearly polarized light, that is, is optically active. This activity is roughly 1000 times stronger than the activity of an ordinary optically active substance such as quartz.

TABLE I Some Thermotropic Liquid Crystalline Compounds^a

Name                                                           Liquid crystalline range (°C)

I. Nematic liquid crystals

A. Some ordinary classic nematic liquid crystals
p-Methoxybenzylidene-p′-n-butylaniline (MBBA)                  21–47
p-Methoxy-p′-n-butylazoxybenzene (mixture of isomers)          19–76
p-Azoxyanisole (PAA)                                           117–137
p-n-Hexyl-p′-cyanobiphenyl                                     14–28

B. Cholesteric–nematic liquid crystals
1. Cholesteric esters
Cholesteryl nonanoate                                          145–179
2. Noncholesteryl, chiral-type compound
(−)-2-Methylbutyl-p-(p-methoxybenzylideneamino)cinnamate       76–125

II. Smectic liquid crystals

A. Structured smectic liquid crystals
Smectic B
Ethyl p-ethoxybenzal-p′-aminocinnamate                         77–116
Smectic E
Diethyl p-terphenyl-p,p″-carboxylate                           173–189
Smectic G
2-(p-Pentylphenyl)-5-(p-pentyloxyphenyl)pyrimidine             79–103 (SG)
Biaxial SB (or smectic H)
4-Butyloxybenzal-4-ethylaniline                                40.5–51

B. Unstructured smectic liquid crystals
Smectic A
Ethyl p-(p′-phenylbenzalamino)benzoate                         121–131
Smectic C
p-n-Octyloxybenzoic acid                                       108–147
Smectic D
p′-n-Octadecyloxy-3′-nitrodiphenyl-p-carboxylic acid           159–195
Smectic F
2-(p-Pentylphenyl)-5-(p-pentyloxyphenyl)pyrimidine             103–114 (SF)

[Structural formula diagrams not reproduced.]

^a From Brown, G. H. (1977). J. Colloid Interface Sci. 58, 534.

FIGURE 2 A cholesteric or twisted nematic. On a small scale, the molecules behave as in the nematic phase. Over longer distances the director rotates along a helix whose pitch is sensitive to changes in temperature, pressure, etc. [Courtesy of Nuno Vaz.]
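The link between the pitch of a cholesteric and its color can be illustrated with a short calculation. The following is a minimal sketch, assuming the standard Bragg-type relation for light traveling along the helix axis: the reflected band is centered at the mean refractive index times the pitch, and its width is the birefringence times the pitch. The function name and the numerical values for the pitch and refractive indices are hypothetical.

```python
def reflection_band(pitch_nm, n_ordinary, n_extraordinary):
    """Return (center, width) in nm of the selective-reflection band.

    Assumed relations for light incident along the helix axis:
        center = mean refractive index * pitch
        width  = |n_extraordinary - n_ordinary| * pitch
    """
    if pitch_nm <= 0:
        raise ValueError("pitch must be positive")
    n_mean = 0.5 * (n_ordinary + n_extraordinary)
    center = n_mean * pitch_nm
    width = abs(n_extraordinary - n_ordinary) * pitch_nm
    return center, width


# Hypothetical cholesteric: 350 nm pitch, refractive indices 1.5 and 1.7.
center, width = reflection_band(350.0, 1.5, 1.7)
print(f"band centered near {center:.0f} nm, about {width:.0f} nm wide")
```

Under these assumptions a pitch of a few hundred nanometers places the reflected band in the visible range, and any influence that lengthens or shortens the pitch shifts the color accordingly.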

c. Blue phases. Some cholesteric compounds exhibit a phase between the cholesteric and isotropic phases, usually in a narrow temperature range. The local molecular orientation shows a three-dimensional periodicity and is perhaps a stable lattice of defects in the uniform cholesteric structure. Mechanically, the behavior is similar to that of a cubic crystal with a large resistance to shear (a shear modulus of several hundred to several thousand ergs per cubic centimeter). Several blue phases exist.

d. Discotic. Disklike molecules may have a discotic phase in which the molecules are stacked aperiodically, forming liquidlike columns having nematiclike ordering of the symmetry axes of the molecules. Some of these materials have smectic phases similar to the smectic A in which the columns form a hexagonal array. These phases thus exhibit translational order in two dimensions, but not in the third. The appearance under a microscope is similar to that of an ordinary nematic.

2. Smectic Phases

The various smectic phases have, in addition to the orientational order found in nematics, different degrees of positional and, in some cases, bond-orientational ordering (Fig. 3). On the basis of appearance under a polarizing microscope, miscibility with known phases, and X-ray scattering, at least nine thermotropic smectic phases have been identified, although not all are truly liquid crystals. Of these nine phases, eight have a characteristic packing of the molecules in layers. The ninth phase, known as smectic D (the letters used to denote the different phases were assigned in chronological order as the phases were first observed), has a cubic packing. In the smectic A, B, and E phases, the molecules align with their long axes perpendicular to the layers. In the tilted smectics, C, F, G, H, and I, the long axes are at an angle with the layer normals. The smectic A, C, D, and F phases are unstructured smectics: They do not show an ordered arrangement of molecules within layers. The smectic A, B, and C phases are the best known of the smectic phases.

TABLE II Characteristics of Thermotropic Liquid Crystals^a

I. Nematics

Ordinary nematic
  Optical properties: Uniaxially positive
  Textures: Schlieren; threaded marbled; pseudoisotropic; homogeneous
  Structure: Parallelism of long molecular axes
  Examples: p-Azoxyanisole; p-methoxybenzylidene-p-n-butylaniline

Cholesteric nematic
  Optical properties: Uniaxially negative or isotropic; optically active
  Textures: Focal conic with Grandjean steps; homogeneous; isotropic
  Structure: Nematic packing in planes; superimposed twist in direction perpendicular to long axes of molecules
  Examples: Cholesteryl nonanoate

II. Structured smectics

Smectic B
  Optical properties: Uniaxially or biaxially positive
  Textures: Mosaic; stepped drops; pseudoisotropic; homogeneous; schlieren
  Structure: Layer structure; molecular axes orthogonal or tilted to layers; hexagonal arrangement within layers
  Examples: Ethyl ethoxybenzylideneaminocinnamate; terephthal-bis-butylaniline

Smectic E
  Optical properties: Uniaxially positive
  Textures: Mosaic; pseudoisotropic
  Structure: Layer structure; molecular axes orthogonal to layers; ordered arrangement within layers
  Examples: di-n-Propylterphenyldicarboxylate

Smectic G
  Optical properties: Uniaxially positive
  Textures: Mosaic
  Structure: Layer structure with ordered arrangement within layers
  Examples: 2-(4-n-Pentylphenyl)-5-(4-n-pentyloxyphenyl)pyrimidine

III. Unstructured smectics

Smectic A
  Optical properties: Uniaxially positive
  Textures: Focal conic (fan-shaped or polygon); stepped drops; homogeneous; pseudoisotropic
  Structure: Layer structure; molecular axes orthogonal to layers; random arrangement within layers
  Examples: Diethylazoxybenzoate

Smectic C
  Optical properties: Biaxially positive
  Textures: Broken focal conic; schlieren; homogeneous
  Structure: Layer structure; molecular axes tilted to layers; random arrangement within layers
  Examples: Dodecyloxyazoxybenzene

Smectic D
  Optical properties: Isotropic
  Textures: Isotropic; mosaic
  Structure: Cubic structure
  Examples: 4′-Octadecyloxy-3′-nitrodiphenyl-4-carboxylic acid

Smectic F
  Optical properties: Uniaxially positive
  Textures: Schlieren; broken focal conic with concentric axes
  Structure: Layer structure
  Examples: 2-(4-n-Pentylphenyl)-5-(4-n-pentyloxyphenyl)pyrimidine

^a From Brown, G. H., and Wolken, J. J. (1979). “Liquid Crystals and Biological Structure,” pp. 30–31, Academic Press, New York.

One variation on the ordinary smectic structure can occur in optically active compounds having a tilted smectic phase or in a tilted smectic phase to which a small amount of a chiral compound has been added. A macroscopically chiral structure can form in which the directors of adjacent layers of molecules form a small angle, giving a uniform twist and resulting in a strongly optically active substance. Some of these chiral smectics are ferroelectric and have the potential for applications in displays with fast response times (microseconds).

Extremely thin films, down to one molecular layer thick, have been made with smectics. These films are being used to probe surface effects as well as new thermodynamic phase behavior.

In the discussion that follows, the smectic phases are treated in alphabetic order, which does not always correspond to the sequence of phases observed on heating or cooling.

FIGURE 3 Molecular arrangements in four smectic phases in which molecules form layers. In the A and C phases molecules are aligned but their positions are not ordered. In the C and H phases molecules are aligned in a direction tilted with respect to the layers. [Courtesy of Nuno Vaz.]

a. Smectic A. Smectic A liquid crystals (Fig. 3) are the least ordered of the untilted or orthogonal smectic phases. The molecules are arranged in layers with the director perpendicular to the layers. Except over short distances, the molecules show no correlations in position within the layers. Studies of X-ray scattering by smectic A materials show that the spatial density is best described not in terms of sharply defined layers of molecules but as a one-dimensional sine wave in a three-dimensional fluid with the density wave along the director. The higher spatial harmonics of the density wave are surprisingly weak. Furthermore, the correlation in the positioning of the layers dies away algebraically (as an inverse power of the distance) rather than being constant as in a true crystal. In liquid crystals having both nematic and smectic A phases, the nematic phase is the higher temperature phase. It is possible to make well-aligned single crystals of smectic A by aligning the director in the nematic phase, for instance, with a magnetic field and then cooling into the smectic phase. Once “frozen” into this alignment, the director can no longer reorient to align with the field as it is fixed perpendicular to the layers. Such a sample does not have the turbid appearance of the nematic, as the director does not have the variations in direction characteristic of the nematic.
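The sine-wave picture of the smectic A density can be written down directly. The sketch below assumes a single-harmonic profile, rho(z) = rho0 [1 + psi cos(2πz/d)], with z measured along the director, d the layer spacing, and psi a small modulation amplitude; neglecting higher harmonics follows the observation above that they are weak. The function name, parameter names, and numerical values are hypothetical.

```python
import math


def smectic_a_density(z, layer_spacing, amplitude, mean_density=1.0):
    """Density at position z along the director for a single-harmonic wave.

    Assumed model: rho(z) = rho0 * (1 + amplitude * cos(2*pi*z / d)).
    """
    if layer_spacing <= 0:
        raise ValueError("layer spacing must be positive")
    phase = 2.0 * math.pi * z / layer_spacing
    return mean_density * (1.0 + amplitude * math.cos(phase))


# Hypothetical values: 3 nm layer spacing and a 10% density modulation.
d = 3.0
profile = [smectic_a_density(z, d, 0.1) for z in (0.0, d / 4, d / 2, d)]
print([round(value, 3) for value in profile])
```

The profile repeats with period d along the director and is uniform in the two directions within the layers, which is the sense in which the phase is a one-dimensional density wave in a three-dimensional fluid.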

b. Smectic B. The structure of this phase (Fig. 3) consists of layers of molecules having hexagonal packing within the layers. The director is perpendicular to the layers. The smectic B phases in some materials may not, in fact, be liquid crystals at all, but have long-range order in all three dimensions. There may be a hexatic smectic B phase having short-range positional order combined with long-range bond-orientational order in the plane of the layers.

c. Smectic C. The smectic C phase (Fig. 3) is similar to the smectic A phase except that the director makes an angle, called the tilt angle, with the normal to the layers. The layer thickness deduced from X-ray scattering data is less than the molecular length and the phase is optically biaxial, unlike the nematic and smectic A. Tilt angles up to 45° have been observed and can vary with temperature. Because the orientation of the director can change while the tilt angle is kept constant, the director can change from point to point as in the nematic, and as a consequence there is strong light scattering. Cooling to smectic C from nematic or smectic A phases does not create single crystals. As is the case for the ordinary nematic, the addition of optically active molecules can give a twist to the smectic C phase. Pure compounds with this structure have also been observed. In a compound having the A, B, and C smectic phases, the sequence on cooling is A, C, and then B (Table III).
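One simple way to relate the reduced layer thickness to the tilt is to treat the molecules as rigid rods of length L tilted uniformly by an angle θ, so that the layer thickness is d = L cos θ. This rigid-rod picture is an assumption made here for illustration; it ignores chain flexibility and any interpenetration of neighboring layers. The function name and the numbers below are hypothetical.

```python
import math


def tilt_angle_deg(layer_thickness, molecular_length):
    """Tilt angle in degrees from d = L * cos(theta), a rigid-rod estimate."""
    if not 0.0 < layer_thickness <= molecular_length:
        raise ValueError("expected 0 < layer thickness <= molecular length")
    return math.degrees(math.acos(layer_thickness / molecular_length))


# Hypothetical data: molecules 3.0 nm long forming layers 2.6 nm thick.
theta = tilt_angle_deg(2.6, 3.0)
print(f"estimated tilt angle: {theta:.1f} degrees")
```

In this picture a layer thickness equal to the molecular length corresponds to zero tilt, as in the smectic A, and the 45° upper limit quoted above corresponds to a layer thickness of about 0.71 L.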

d. Smectic D. The smectic D phase does not have the characteristic layers of the other smectics and is optically isotropic. The overall structure has cubic symmetry. One model of the structure has molecules packed hexagonally into roughly spherical shapes, which are then packed into a cubic framework. This kind of structure is also seen in cubic lyotropic phases.

e. Smectic E. The director is perpendicular to the layers in the smectic E as in the A and B phases. Within the layers, the molecules show correlations in both position and the orientation of the benzene ring. The molecules pack into a herringbone type of pattern.

f. Smectic F. This phase is similar to the smectic C phase but is more ordered, having a short-range hexagonal order within the layers.

g. Other smectics. The smectic G and H phases (Fig. 3) correspond to smectics B and E, respectively, differing in that the directors are tilted at an angle with respect to the layer normals. The smectic I is yet another smectic phase with hexagonal correlations in the layers but with a tilt that is uniform with respect to neighboring molecules. New smectic phases have been found in materials in which the molecules have electric dipole moments. These include antiferroelectric phases.


TABLE III Typical Examples of Polymorphic Forms of Thermotropic Liquid Crystals^a

Polymorphic form^b    Example

N                     p-Azoxyanisole
N                     4-Methoxybenzylidene-4′-n-butylaniline
A                     Diallylazoxybenzene-4,4′-dicarboxylate
Ch, A                 Cholesteryl nonanoate
A, B                  Ethyl-4-ethoxybenzylidene-4′-aminocinnamate
N, A, C               4′-n-Hexyloxy-3′-nitrobiphenyl-4-carboxylic acid
A, C, B               n-Amyl-4-n-decyloxybenzylidene-4′-aminocinnamate
N, A, C, B            Diethyl terephthalylidene-bis-(4′-aminocinnamate)

[Structural formula diagrams not reproduced.]

^a From Brown, G. H., and Wolken, J. J. (1979). “Liquid Crystals and Biological Structure,” pp. 32–33, Academic Press, New York.

^b Key: N, nematic; Ch, cholesteric; A, smectic A; B, smectic B; C, smectic C.

3. Polymorphism

Many thermotropic liquid crystals display more than one mesomorphic phase on heating from the solid to the isotropic liquid phase. These substances are said to be polymorphous (Table III). The usual sequence of phases on heating is as follows: solid, smectic B, smectic C, smectic A, nematic, isotropic. A more complete listing of the sequence of smectic phases on heating is E, H, G, F, I, B, C, D, and A. When one or more of the given phases are not present, the remaining phases appear in the established order. When a cholesteric phase is present, it takes the place of the nematic phase in the above sequence. Except for the action of external forces, twisted and ordinary nematic phases do not occur for the same (pure) compound.
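The ordering rule stated above lends itself to a small lookup. The following Python sketch takes the set of smectic phases a compound is known to have and returns the expected sequence on heating, using only the order E, H, G, F, I, B, C, D, A and the nematic or cholesteric substitution described in the text. The function and argument names are hypothetical, and the sketch does not cover the reentrant exceptions.

```python
# Order of smectic phases on heating, as listed in the text.
SMECTIC_ORDER = ["E", "H", "G", "F", "I", "B", "C", "D", "A"]


def heating_sequence(smectics, has_nematic=False, chiral=False):
    """Expected phase sequence on heating for the given smectic letters.

    A cholesteric takes the place of the nematic when `chiral` is True.
    """
    unknown = set(smectics) - set(SMECTIC_ORDER)
    if unknown:
        raise ValueError(f"unknown smectic phases: {sorted(unknown)}")
    sequence = ["solid"]
    sequence += [f"smectic {s}" for s in SMECTIC_ORDER if s in smectics]
    if has_nematic:
        sequence.append("cholesteric" if chiral else "nematic")
    sequence.append("isotropic")
    return sequence


# A compound with smectic A, B, and C phases and a nematic phase.
print(heating_sequence({"A", "B", "C"}, has_nematic=True))
# ['solid', 'smectic B', 'smectic C', 'smectic A', 'nematic', 'isotropic']
```

Reversing the returned list gives the sequence on cooling, which for a compound with the A, B, and C smectic phases reproduces the order A, C, B of Table III.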

For discotics, a tentative sequence on heating is as follows: crystal, discotic nematic, columnar (biaxial), columnar (uniaxial), and isotropic.

Most of the transitions between liquid crystalline phases are reversible or enantiotropic. That is, the transitions reverse direction on switching from heating to cooling. Supercooling is not uncommon for many of the phase transitions. Some phase transitions are apparently monotropic; that is, one of the phases forms only on cooling. One example is the smectic phase of cholesteryl nonanoate.

A variety of reentrant phase transitions have been discovered in which the samples “leave” a phase and then “reenter” again as the temperature is continuously raised. Some examples of sequences found with increasing temperatures are as follows: smectic A, nematic, smectic C, smectic A, nematic, isotropic; smectic C, nematic, smectic C, smectic A, nematic, isotropic; smectic C, smectic A, nematic, smectic A, nematic, smectic A, nematic, isotropic; and cholesteric, smectic A, cholesteric, isotropic.
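Reentrance in a listed sequence can be spotted mechanically: a phase is reentrant if it appears again after the sample has passed through some other phase. The short Python sketch below applies that test to the sequences quoted above. The function name is hypothetical, and the check is purely a matter of bookkeeping on the phase names.

```python
def reentrant_phases(sequence):
    """Return the phases that reappear after the sample has left them."""
    seen = set()
    previous = None
    reentrant = []
    for phase in sequence:
        # A phase counts as reentrant when it was seen earlier and is not
        # simply a repeat of the immediately preceding entry.
        if phase != previous and phase in seen and phase not in reentrant:
            reentrant.append(phase)
        seen.add(phase)
        previous = phase
    return reentrant


first = ["smectic A", "nematic", "smectic C", "smectic A", "nematic", "isotropic"]
last = ["cholesteric", "smectic A", "cholesteric", "isotropic"]
print(reentrant_phases(first))  # ['smectic A', 'nematic']
print(reentrant_phases(last))   # ['cholesteric']
```

An ordinary sequence such as solid, smectic A, nematic, isotropic returns an empty list, since no phase is entered twice.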

B. Lyotropic Phases

1. General Characteristics

Mixtures of two or more components that change phase with changes of concentration are called lyotropic. Although compounds that form thermotropic phases are by no means uncommon, lyotropic mixtures are very familiar, soap and water being one of the prime examples of a two-component system with lyotropic phases. Mixtures of three or more components are also common, one example being the use of an emulsifier to mix the oil and water or vinegar of a salad dressing. The soap-and-water mixture, of course, is most useful when the soap molecules allow grease to “dissolve” in the water, forming a three-component system. Although water is a very common component of lyotropics, other solvents can be used as well. Most lyotropic phases involve the solution of rod-like molecules or aggregates of molecules in a normally isotropic solvent such as water. It is principally the rod-like entities that become ordered.

Soaps are a simple example of a whole class of molecules, called surfactants or amphiphiles, which form lyotropic phases in water (Fig. 4). The name “amphiphile” comes from the Greek amphi, meaning “of both kinds,” and philo, meaning “loving.” One part of an amphiphilic molecule, the polar “head,” has an affinity for polar solvents such as water (hydrophilic), while the other end, the organic “tail,” is relatively insoluble in water (hydrophobic). The result of these opposite tendencies is that the molecules prefer to organize themselves into surfaces (hence the name “surfactants”) with the polar heads pointing toward the water. Such systems may form a number of possible structures, depending on the concentrations of the components and the shapes of the molecules involved (Fig. 5). The amphiphilic molecule is usually represented in a figure with a circle as the polar head and a wiggly line as the organic tail. Simple soaps often have a single hydrocarbon chain forming a hydrophobic tail attached to a polar head. A second type of molecule has two such tails attached to a polar head.

FIGURE 4 Three molecular models of the same lipid in different configurations. The lipid, dipalmitoyl phosphatidylcholine, consists of two hydrocarbon tails, (CH2)14CH3, linked to a head group, (CH3)3N+(CH2)2PO4−, by ester linkages and a glycerol backbone, (OCO)2(CH2)2CH. The molecule on the left has a single jog (gauche rotation) in one tail, while the other tail is all trans. The middle molecule has a kink (gauche–trans–gauche sequence) in one tail. [Courtesy of H. L. Scott, J. F. Nagle, and the American Institute of Physics; from “Biomembrane phase transitions,” Phys. Today 31(2), 38–47 (1978).]

When a crystalline amphiphile is added to water, several mesophases can be observed, ranging from a true solution to the crystal state. Some of the phases may show smectic or lamellar packing (in layers) or even cubic or hexagonal molecular packing. The amphiphile can also aggregate into structures such as spherical or cylindrical micelles. Micelles have the ability to solubilize an otherwise insoluble chemical by encapsulating it. This is what soap does to dissolve grease. Similarly, water can be dissolved in oil by inverse micelles formed by surfactant molecules with the tails pointing outward instead of inward. One common sequence of mesophases obtained on adding water is as follows: solid, lamellar liquid crystal, cubic liquid crystal, hexagonal liquid crystal, micellar, homogeneous solution. Table IV gives some of the properties of some of the common lyotropic phases.

2. Lamellar

These phases, also called “neat” or G phases, correspond to the thermotropic smectic phases; that is, they are characterized by layers having a well-defined thickness but no structure within the layers (Fig. 6a). Different lamellar phases have been found in the same system. In the phase designated Lα, the α is used to indicate that the hydrocarbon chains in the tail of the amphiphile are fluid or flexible. It is possible to observe a transition from one lamellar phase, such as Lα, to another in which the tails “freeze,” that is, lose much of their flexibility. The water and surfactant molecules in the lamellar phase are in alternate layers with the surfactant molecules forming double-thickness layers called bilayers in which the hydrophobic tails are separated from the water layers by planes of polar heads. Single bilayers of lipid molecules form the underlying structure of biological membranes (Fig. 7).

FIGURE 5 Characteristic phase diagrams of amphiphile–water systems. Two-phase regions are shown shaded. (a) Strongly polar amphiphiles (e.g., soaps, alkyl sulfates, quaternary ammonium salts, and lysolecithins). (b) Amphiphiles with relatively large hydrophobic regions (e.g., monoglycerides and lecithins). [From Friberg, S., and Larsson, K. (1976). “Liquid crystals and emulsions.” In “Advances in Liquid Crystals,” Vol. 2, Academic Press, New York, by permission.]

3. Hexagonal

As water is added to a lamellar phase, the layer structure can be replaced by one in which the surfactant molecules apparently form cylindrical structures with the polar heads forming the outer shell. The cylinders line up in hexagonal arrays with the water between (Fig. 6b). This phase is also called the middle, or M1, phase. At concentrations of surfactant greater than that of the lamellar phase, some systems also form an inverted hexagonal phase, also called the inverse middle, or M2, phase, in which the tails point away from the centers of the cylinders. The water occupies the centers of the cylinders.

4. Cubic

A cubic structure, also referred to as the viscous isotropic or V1 phase, sometimes forms at amphiphile concentrations between those producing lamellar and hexagonal phases. Ordinary optical observations show only an isotropic structure. X-ray diffraction studies show that the surfactant molecules pack into spheres, which then pack into a face- or body-centered-cubic lattice. The inverted structure (V2) can also form between the lamellar and inverse hexagonal phases. Another viscous isotropic phase (S1c) has been observed at concentrations of amphiphile lower than that of the hexagonal phase.

5. Nematic

Lyotropic nematics have been observed for which the optical axes are easily oriented, as is the case for thermotropic nematics. The basic units that align are not single molecules, but aggregates of molecules whose sizes are comparable to those of micelles, 20 to 100 Å.

C. Polymeric Liquid Crystals

Examples of liquid-crystalline order have been found in fluid polymer melts and solutions. Such polymers may play an important role in the spatial organization of biological macromolecules, for instance, in the packaging of DNA in chromosomes and in the aggregation of microtubules, which form the structural framework of cells. Kevlar, a fiber formed from the liquid crystalline phase of a polymer, is an ultra-high-strength material that has a stiffness comparable to that of steel with a much lower mass density.

Examples of both lyotropic and thermotropic polymeric phases have been studied. Solutions of synthetic polypeptides can form a helicoidal cholesteric structure in which the polymers form twisted rods with pitches between 10⁻⁷ and 10⁻³ m. The rods may be separated by several millimeters. Unlike the monomeric thermotropics, however, the rods can be untwisted into a nematic phase by changing temperature. Polymers having thermotropic liquid crystalline mesophases have been made by adding single liquid crystalline molecules to polymer chains to form either comb-like or linear polymers. In a comb-like polymer, the liquid crystal monomers are attached by flexible links to the main chain, like the teeth of a comb. All the common thermotropic phases have been obtained in this way, with the possibility of locking the structure by quenching (rapid cooling) in the presence of applied magnetic or electric fields. Linear polymers formed by linking liquid crystals end to end have been formed with stable liquid crystalline phases in the range 100–400 °C. The phases have properties in common with the monomeric liquid crystals, but the response to external stimuli can be much slower.

TABLE IV Some Properties of Lyotropic Systems Composed of an Amphiphile and Water (a)

| Suggested structural arrangement (drawings not reproduced) | | | | | | |
|---|---|---|---|---|---|---|
| Percent water (b) (approximate range) | 0 | 5–22–50 | 23–40 | 34–80 | 30–99.9 | >99.9 |
| Physical state | Crystalline | Liquid crystalline, lamellar | Liquid crystalline, face-centered cubic | Liquid crystalline, hexagonal compact | Micellar solution | Solution |
| Gross character | Opaque solid | Clear, fluid, moderately viscous | Clear, brittle, very viscous | Clear, viscous | Clear, fluid | Clear, fluid |
| Freedom of movement | None | Two directions | Possibly none | One direction | No restrictions | No restrictions |
| Microscopic properties (crossed nicols) | Birefringent | Neat soap texture | Isotropic with angular bubbles | Middle soap texture | Isotropic with round bubbles | Isotropic |
| X-ray data | Ring pattern, 3–6 Å | Diffuse halo at about 4.5 Å | Diffuse halo at about 4.5 Å | Diffuse halo at about 4.5 Å | | |
| Structural order | Three dimensions | One dimension | Three dimensions | Two dimensions | None | None |

(a) From Brown, G. H., and Wolken, J. J. (1979). “Liquid Crystals and Biological Structure,” pp. 30–31, Academic Press, New York.
(b) The different percentages of water show that different amphiphiles require different amounts of water. For soaps, the lamellar structure generally occurs between 5 and 22% water; with some lipophiles the water may be as high as 50%. The cubic structure generally occurs between 23 and 40%.

FIGURE 6 Molecular arrangements in two lyotropic phases, shown in cross section. (a) A lamellar phase, in which the amphiphile molecules form bilayers with their hydrophobic tails toward the layer centers, away from the water. (b) The hexagonal phase, showing cross sections of rod-like structures having their axes perpendicular to the plane of the drawing. Water is in the region between the cylinders. [Courtesy of Ging-Sheng Yu.]


FIGURE 7 Main features of a biomembrane: the bilayer of lipid (fat) molecules and the proteins (shaded). At low temperatures the hydrocarbon tails of the lipid molecules appear as regular zigzag lines, as the cross section indicates. The tails are joined in pairs by a backbone, shown on the top surface. Omitted from the top but shown on the bottom layer are the head groups that are attached to the backbones of the lipid molecules. [From Nagle, J. F., and Scott, H. (1978). “Biomembrane phase transitions,” Phys. Today 31(2), 38–47 by permission.]

III. PROPERTIES OF LIQUID CRYSTALS

A. The Director and the Order Parameter

For many experiments and in many mesophases, a useful model of the orientational motions of a molecule separates the motions into the following classes:

1. Rapid rotations (librations) about the long axis.

2. Rapid fluctuations of the long axis about a local director, designated by a unit vector, n(r, t).

3. Fluctuations of the local director, which represent collective motions of many molecules and are correspondingly slower than the individual molecular motions.

Due to external influences, the average director may then vary over macroscopic distances (Fig. 8). An instantaneous snapshot would show molecules with their long axes at angles with the local director that can average as much as 40° in a nematic. A nematic would also show symmetry with respect to alignment of molecules parallel or antiparallel to the director. One measure of the degree to which molecules align with the director is given by the order parameter s, which is defined by the following equation:

$$ s = \tfrac{1}{2}\,\langle 3\cos^2 A - 1 \rangle = \langle P_2(\cos A) \rangle, \tag{1} $$

where A is the angle between the local director and the instantaneous molecular long axis, P2(x) the second Legendre polynomial, and ⟨ ⟩ indicates the average value. A value for s of 1.0 corresponds to perfect order and a value of 0.0 would indicate complete disorder, as in an isotropic liquid. For a nematic, the values commonly range from 0.4 near a nematic-isotropic transition to as high as 0.8.

FIGURE 8 Static deformations of the director in a nematic liquid crystal showing pure (a) splay, (b) twist, and (c) bend modes. [Courtesy of Nuno Vaz.]
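As a minimal illustration of the order parameter defined in Eq. (1), the sketch below averages P2(cos A) over a set of molecular angles. The function name `order_parameter` and the sample angles are assumptions made for this example; they are not measured data.

```python
import math

def order_parameter(angles_rad):
    """Return s = <(3 cos^2 A - 1) / 2> for angles A (radians) between
    each molecular long axis and the local director."""
    if not angles_rad:
        raise ValueError("need at least one angle")
    total = sum(0.5 * (3.0 * math.cos(a) ** 2 - 1.0) for a in angles_rad)
    return total / len(angles_rad)

# Hypothetical snapshots of molecular orientations.
aligned = [0.0, math.pi, 0.0]                 # parallel and antiparallel count equally
magic = [math.acos(1.0 / math.sqrt(3.0))]     # cos^2 A = 1/3, so this angle contributes s = 0
tilted = [math.radians(a) for a in (10, 25, 35, 40, 20)]

print(order_parameter(aligned))
print(order_parameter(magic))
print(order_parameter(tilted))
```

Because only cos²A enters, molecules parallel and antiparallel to the director contribute identically, which reflects the n and −n symmetry of the nematic.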

B. Elastic Properties

Although a liquid crystal may appear at first glance to be simply a turbid liquid like milk or a very viscous gel like petroleum jelly, the flow properties can be very complicated. For instance, an aligned smectic A in a test tube looks liquid when tilted one way but nearly solid when tilted in another direction. Properties such as elasticity and viscosity are not simple scalar parameters but depend on the relation between the direction of the director, layer normals in smectics, and the distortion or motion.

In the nematic, unlike the solid, there are no permanent forces opposing the change of distance between two molecules or small volume elements. There are, however, torques that oppose the curvature of the director. The assumption that the restoring torques are linearly proportional to the curvature strains then gives rise to a free-energy density that is a quadratic function of the curvature strains. The linear components of curvature are grouped into three modes of deformation: splay, twist, and bend (see Fig. 8). Splay can be demonstrated by the fingers of one’s hand when spread out, that is, “splayed.” The fingers then diverge from a point; in fact, the divergence of the director is the mathematical expression for splay. In the twist mode, the director changes as one moves along a line perpendicular to the original director. The change in director is perpendicular to both directions, unlike the splay deformation. Bend can be displayed by curling the fingers of one’s hand. In this mode, the director changes in direction as one proceeds along it. Taking the z axis along the local director, the first-order derivatives of the director are classed as follows:


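splay: $\partial n_x/\partial x$ and $\partial n_y/\partial y$

twist: $\partial n_y/\partial x$ and $\partial n_x/\partial y$

bend: $\partial n_x/\partial z$ and $\partial n_y/\partial z$.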

Making use of the symmetries of the nematic phase, that is, invariance under rotations about the director and the equivalence of n and −n, the free energy density of a nematic in bulk can be written to second order as follows:

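$$ F = \tfrac{1}{2}\left\{ K_1 (\nabla \cdot \mathbf{n})^2 + K_2 (\mathbf{n} \cdot \nabla \times \mathbf{n})^2 + K_3 (\mathbf{n} \times \nabla \times \mathbf{n})^2 \right\}. \tag{2} $$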

The constants K1, K2, and K3 are referred to as the Oseen–Frank or splay, twist, and bend elastic constants.
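To make Eq. (2) concrete, the following sketch evaluates the free-energy density at a single point of a director field, using central finite differences for the derivatives. The function names, the step size `h`, the two test fields, and the dimensionless elastic constants are illustrative assumptions, not part of the original treatment.

```python
import math

def frank_energy_density(n, r, K1, K2, K3, h=1e-5):
    """Evaluate Eq. (2) at point r = (x, y, z) for a director field
    n(x, y, z) -> (nx, ny, nz), assumed to be of unit length."""
    def d(i, j):
        # Central difference of component i with respect to coordinate j.
        rp, rm = list(r), list(r)
        rp[j] += h
        rm[j] -= h
        return (n(*rp)[i] - n(*rm)[i]) / (2.0 * h)

    nx, ny, nz = n(*r)
    div = d(0, 0) + d(1, 1) + d(2, 2)                       # splay term
    cx, cy, cz = d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1)
    twist = nx * cx + ny * cy + nz * cz                     # n . curl n
    bx, by, bz = ny * cz - nz * cy, nz * cx - nx * cz, nx * cy - ny * cx
    bend_sq = bx * bx + by * by + bz * bz                   # |n x curl n|^2
    return 0.5 * (K1 * div ** 2 + K2 * twist ** 2 + K3 * bend_sq)

def twist_field(x, y, z, q=2.0):
    # Hypothetical pure twist: director rotates about z with wave number q.
    return (math.cos(q * z), math.sin(q * z), 0.0)

def splay_field(x, y, z):
    # Hypothetical pure splay: director points radially outward in the x-y plane.
    rho = math.hypot(x, y)
    return (x / rho, y / rho, 0.0)

print(frank_energy_density(twist_field, (0.1, 0.2, 0.3), 1.0, 2.0, 3.0))
print(frank_energy_density(splay_field, (1.0, 0.0, 0.0), 3.0, 2.0, 1.0))
```

For the pure twist field only the K2 term contributes, giving F = K2 q²/2, and for the radial field at unit distance only the K1 term survives, giving F = K1/2.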

The elastic constants can be determined by several methods. One method involves setting up a uniform alignment of directors by strong anchoring at the parallel boundary surfaces. A magnetic field (or electric field) is then applied at right angles to the director. The alignment remains undeformed until a critical field Bc is reached, at which point there is a transition, called the Frederiks transition, to a state in which the director varies throughout the thickness of the sample. The transition can be detected with a polarizing microscope. The critical field in certain geometries of this experiment is simply related to the elastic constants, sample thickness, and anisotropy of the magnetic susceptibility. For instance, for a homeotropic alignment, the critical magnetic field is proportional to the square root of K3, the coefficient for bending. In practice, the strength of the anchoring must be taken into account. Measured values of the elastic constants are of the order of 10⁻¹¹ N. The twist elastic constant K2 is generally the smallest of the three, typically 3.0–4.0 × 10⁻¹² N.
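For strong anchoring, the relation between the critical field and the material parameters is commonly written in SI units as Bc = (π/d)(μ0 K/Δχ)^(1/2), where d is the sample thickness, Δχ the dimensionless anisotropy of the magnetic susceptibility, and K the elastic constant appropriate to the geometry. This is the standard continuum-theory expression rather than one given explicitly in the text, and the sample values below are assumed purely for illustration.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability in T m / A

def critical_field(K, thickness, delta_chi):
    """Frederiks threshold B_c = (pi / d) * sqrt(mu_0 * K / delta_chi),
    assuming strong anchoring and SI units."""
    return (math.pi / thickness) * math.sqrt(MU_0 * K / delta_chi)

# Assumed values: K3 = 1e-11 N, d = 25 micrometers, delta_chi = 1e-6 (SI).
b_c = critical_field(K=1.0e-11, thickness=25.0e-6, delta_chi=1.0e-6)
print(f"B_c is about {b_c:.2f} T")  # roughly 0.45 T for these assumed inputs

# Inverting the relation gives the elastic constant from a measured threshold.
K_from_bc = delta_chi_times_d2 = (b_c * 25.0e-6 / math.pi) ** 2 * 1.0e-6 / MU_0
print(K_from_bc)
```

The square-root dependence means that a fourfold change in the elastic constant only doubles the threshold, while doubling the thickness halves it.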

Thermal fluctuations in the local director can be described in terms of the continuum theory and hence depend on the elastic constants. As a rigid rotation of the director requires no energy, the energy required to create a long-wavelength fluctuation in the director is small, and the relaxation time for such a fluctuation is long compared to the period of visible light. The result is significant fluctuations in the local optical properties. These fluctuations give rise to light scattering, which can then be studied to give information on the elastic constants. For instance, for scattering at an angle A from an incoming beam polarized at right angles to the sample director and then analyzed with a polarizer at right angles to the original polarization, the differential cross section per solid angle is roughly given by the following:

$$ \frac{d\sigma}{d\Omega} \sim \cot^2(A/2) + \frac{K_1}{K_2}, \tag{3} $$

allowing the ratio K1/K2 to be determined.
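One way to see how the ratio follows from Eq. (3) is to compare the scattered intensity at two angles, since the unknown proportionality constant cancels in the quotient. The sketch below assumes Eq. (3) holds as an exact proportionality, which the text only claims roughly; the function names and the synthetic intensities are hypothetical, and a real analysis would fit many angles.

```python
import math

def cot2_half(angle_rad):
    """Return cot^2(A / 2)."""
    t = math.tan(angle_rad / 2.0)
    return 1.0 / (t * t)

def elastic_ratio(angle1, intensity1, angle2, intensity2):
    """Solve I1 / I2 = (c1 + rho) / (c2 + rho) for rho = K1 / K2,
    where c = cot^2(A / 2) at each scattering angle."""
    c1, c2 = cot2_half(angle1), cot2_half(angle2)
    ratio = intensity1 / intensity2
    if ratio == 1.0:
        raise ValueError("equal intensities do not determine K1/K2")
    return (c1 - ratio * c2) / (ratio - 1.0)

# Synthetic check with an assumed K1/K2 = 2.5 and arbitrary prefactor 7.0.
a1, a2 = math.radians(30.0), math.radians(60.0)
i1 = 7.0 * (cot2_half(a1) + 2.5)
i2 = 7.0 * (cot2_half(a2) + 2.5)
print(elastic_ratio(a1, i1, a2, i2))
```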

Some of the concepts and results of the continuum theory can be applied to other phases. In the smectic A phase, for instance, the smectic layers are easily bent, corresponding to a splay deformation, so K1 has values similar to those found in nematic phases. Twist and bend deformations of the director, on the other hand, are nearly ruled out, as they require changes in the layer thickness comparable to the compression of a normal liquid. One would expect K2 and K3 to increase anomalously in a nematic phase as a smectic A phase is approached, especially when the transition is nearly second order, that is, when the transition has a small latent heat.

C. Flow

The coupling between directors and flow complicates the theoretical and experimental studies of flow properties, even in the “relatively simple” nematic phase. The angles between the local director, flow velocity, and velocity gradient (shear) all affect the flow, and the orientational and translational motions of the molecules are linked. The formulation of the dynamical properties by Leslie and Ericksen is most commonly used in studies of the flow properties of the nematic state. In this formulation the viscous stress tensor is decomposed into the sum of six tensors with coefficients having the dimension of a viscosity (Leslie coefficients). The effective viscosities measured in different experiments are then analyzed in terms of these coefficients, five of which are independent. Experimentally, the direction of alignment must be controlled (by electric fields, typically) and measured. Because of the turbidity, optical measurements are restricted to thin samples. The hydrodynamics and electrohydrodynamics of nematics often lead to many interesting and potentially useful instabilities.

In a technique used as early as the 1930s to study the viscous properties of nematics, a strong magnetic field was used to align the director of a sample in which a shear flow was set up. That is, the velocity of the liquid was directed along the x axis and the shear directed along the y axis. With the director along one of the three orthogonal axes, the measurement of the ratio of shear stress to shear rate gave the apparent viscosity for that geometry. As might be expected from the elongated shape of typical liquid crystalline molecules, the measured viscosity was least when the director was parallel to the direction of flow (Fig. 9a) and greatest when the director was parallel to the shear (Fig. 9b). Typical values for viscosity coefficients in nematic phases at around 130 °C are 1–2 cP for director parallel to the flow direction, 8–9 cP for director parallel to the velocity gradient, and 2–4 cP for director perpendicular to both flow and gradient (1 cP = 1/100 poise = 10⁻³ kg m⁻¹ sec⁻¹).


FIGURE 9 Two examples of shear flow in which the flow of molecules and the velocity gradient (shear) are at right angles. (a) Director parallel to the flow; (b) director parallel to the shear. [Courtesy of Jie Zhu.]

A rotational viscosity can be determined by measuring the torque required to rotate the director of a cylindrical sample of nematic liquid crystal placed in a magnetic field at right angles to the axis of rotation, the latter also being the cylinder axis. Disregarding anchoring effects and the contribution from the bottom of the sample, at low rotational velocities the magnetic and viscous torques balance when the director makes a uniform angle with the magnetic field. In a variation of this method, a cylindrical nematic sample in a magnetic field is twisted suddenly, rotating the director away from the direction of the magnetic field. The director then relaxes to the original orientation, with a characteristic time constant depending on the anisotropy of the magnetic susceptibility, the strength of the magnetic field, and the rotational viscosity of the liquid crystal. A nuclear magnetic resonance (NMR) signal is used to determine the orientation of the director.
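Under the simplest assumptions (a uniform director, no flow, and no anchoring effects), balancing the viscous torque γ1 dθ/dt against the magnetic torque gives tan θ(t) = tan θ0 exp(−t/τ) with τ = μ0 γ1/(Δχ B²), where θ is the angle between director and field. This textbook form and the symbols γ1 (rotational viscosity) and Δχ (dimensionless susceptibility anisotropy, SI) are assumptions introduced here, as are the numerical values in the sketch.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability in T m / A

def relaxation_time(gamma1, delta_chi, B):
    """Time constant tau = mu_0 * gamma1 / (delta_chi * B^2) in seconds."""
    return MU_0 * gamma1 / (delta_chi * B * B)

def director_angle(t, theta0, tau):
    """Angle between director and field at time t, for an initial angle
    theta0 below 90 degrees: tan(theta) = tan(theta0) * exp(-t / tau)."""
    return math.atan(math.tan(theta0) * math.exp(-t / tau))

# Assumed values: gamma1 = 0.1 Pa s, delta_chi = 1e-6, B = 1 T.
tau = relaxation_time(gamma1=0.1, delta_chi=1.0e-6, B=1.0)
for t in (0.0, tau, 3.0 * tau):
    print(f"t = {t:.3f} s, angle = {math.degrees(director_angle(t, 0.3, tau)):.2f} deg")
```

Fitting a measured decay to this form would yield γ1 once Δχ and B are known, which is the logic of the relaxation experiment described above.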

Several other experiments give information about the flow of liquid crystals. The time dependence of the fluctuation of the nematic director can be studied to yield information about the viscosity coefficients. Two such experiments are measurements of the frequency modulation of light scattering and the dependence of the NMR longitudinal relaxation time (T1) on (1) the frequency and (2) the angle between the director and the magnetic field. The reflection of ultrasonic shear waves and the attenuation of such waves as a function of the angle between the wave vector and the director have also been used to determine viscosity coefficients. The sudden application of an electric or magnetic field to cause a Frederiks transition (see above) is sometimes accompanied by flow, often in a complicated way. As this effect (especially in a homeotropic sample) limits the behavior of certain types of liquid crystal displays, it is often of practical importance.

Although a cholesteric liquid crystal behaves locally like a nematic, its flow properties are vastly different. Its apparent viscosity increases by as much as a million times as the shear rate drops to very low values. On the other hand, for some geometries in which the flow is perpendicular to the helical axis, the apparent viscosity is of the same order of magnitude as in a nematic. Apparently, at low shear rates, flow in the direction of the pitch axis takes place along a fixed helical structure, with the molecules constrained to twist as they move along the axis.

The apparent viscosity of a smectic is very high and, like that of a cholesteric, depends drastically on the shear rate, with typical values rising from 10 to 10⁴ poise as the shear rate (velocity gradient) is reduced from 100 to 0.01 sec⁻¹. Furthermore, for acoustic waves a smectic liquid crystal will generally have more than one mode of oscillation at a given frequency, observable from acoustic studies or from Brillouin scattering. One branch is associated with density oscillations and has a velocity essentially independent of direction. The second branch, similar to second sound in superfluids, has a velocity that depends on the sine of twice the angle between the director and the direction of propagation. As is the case with elastic constants, some nematic viscosity coefficients diverge as a nematic–smectic transition is approached.

D. Director Alignment

The typical effect of an electric or magnetic field on an isotropic liquid is weak, as the external forces work on the molecules individually and the thermal motions dominate; that is, the energy difference between the alignments of a molecule parallel and perpendicular to an electric or magnetic field is much smaller than kT. On the other hand, in solid crystals the molecules are fixed in position and orientation and the effect of an aligning field is a torque on the whole crystal. Liquid crystals, having both the fluidity characteristic of liquids and the collective behaviors of crystals, respond in unique and sometimes useful ways to external stimuli. The ability to align the director of a liquid crystal and, in turn, affect its optical or mechanical behavior by using external fields, surface interactions, and flow gives rise to many of the interesting applications of liquid crystals such as displays and high-strength materials. The interactions of the director with external stimuli are most important for the nematics, especially the ordinary nematic, which is fluid in all three dimensions. Tilted smectics such as the smectic C can also respond to external fields, subject to the constraint of having a fixed tilt angle. Most of the remarks below refer to nematic liquid crystals.

1. Surface Interactions

Surface forces are often strong enough to impose a well-defined direction on the director at boundaries of the liquid crystal with other materials. By treating a cleaned glass surface with certain detergents, it is possible to align the nematic director perpendicular to the surface. It is thus easy to prepare a sample of nematic liquid crystal between two parallel plates of glass that has a “homeotropic texture,” that is, a single-domain crystal with its optical axis perpendicular to the walls. Other surface treatments, such as rubbing the glass with a tissue or evaporating films at oblique angles of incidence, can give rise to other anchorings of the director at the surface. In some cases, as with the free surface of MBBA (the first example of Table I) in the nematic phase, there is a continuous set of directions that the director can take on, such as the cone of directions that make a constant angle with the surface normal. In such a case, transitions in the anchoring have been observed on changing the temperature of a sample with a thickness on the order of 50 µm.

Surface alignment of the nematic director is used to form the twisted-nematic liquid crystal used in many LCDs. A nematic liquid crystal having a macroscopic twist is created by placing an ordinary nematic between two surfaces, each of which has been treated to align the director parallel to a particular direction in the surface (see Fig. 17). The two surfaces are parallel to one another, but the alignment directions are at an angle of 90°, causing the director to twist slowly through a right angle.

2. Magnetic Fields

While the interaction between an isolated molecule of liquid crystalline material and a magnetic field of, say, 1 T is several orders of magnitude smaller than thermal energies, even a field 10 times smaller will align a sample of nematic liquid crystal. The molecules in the nematic phase line each other up so that the field acts collectively on a large number of molecules. This provides a method for aligning liquid crystals that have a nematic phase. The time scale for the orientation of the director in a 1-T field depends strongly on the viscosity of the sample and can range from milliseconds to hours. A nematic sample in a cylindrical container, when rotated on its axis with the axis perpendicular to the magnetic field, can align with its local directors sampling all the directions perpendicular to the rotation axis. The competition between the aligning effects of a surface and a field can give rise to a number of geometries. For instance, a simple twist can be created close to a surface at which the director is anchored in one direction contained in the plane, for example, along the z axis of an x–z plane. If the magnetic field is applied along the x axis, the director aligns along a right- or left-handed twist, eventually lining up with the magnetic field far from the surface. In a typical case, this distance can be about 3 µm with a field of 1 T. It is interesting that a weak external perturbation can be used to create a distortion on a scale approaching an optical wavelength. A magnetic field applied at right angles to the helical axis of a cholesteric can distort the structure and eventually “untwist” the cholesteric into a nematic structure.
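The distance over which the director relaxes from its surface orientation to the field direction is usually estimated in continuum theory by the magnetic coherence length ξ = (1/B)(μ0 K/Δχ)^(1/2). This formula and the material values used below (K = 10⁻¹¹ N, Δχ = 10⁻⁶ in SI units) are assumptions brought in for illustration; with them the estimate comes out at a few micrometers for a 1-T field, in line with the figure quoted above.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability in T m / A

def magnetic_coherence_length(K, delta_chi, B):
    """Return xi = sqrt(mu_0 * K / delta_chi) / B in meters (SI units)."""
    return math.sqrt(MU_0 * K / delta_chi) / B

# Assumed elastic constant and susceptibility anisotropy for a typical nematic.
for field in (0.1, 1.0, 10.0):
    xi = magnetic_coherence_length(K=1.0e-11, delta_chi=1.0e-6, B=field)
    print(f"B = {field:5.1f} T, coherence length = {xi * 1e6:.2f} micrometers")
```

The length scales inversely with the field, so a tenfold weaker field spreads the distortion over a tenfold greater distance.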

3. Electric Fields

The director of an insulating nematic liquid crystal tends to align either parallel or perpendicular to an electric field, depending on the structure of the molecule. For a typical nematic that aligns parallel to the electric field, an electric field of about 1 V/cm is equivalent in effect to a magnetic field of 1 G (roughly the strength of the earth’s magnetic field). In some LCDs an electric field is used to switch from the twisted nematic configuration set up by the surface alignment to a nearly homeotropic alignment when the field is turned on. A potential difference of only a few volts is sufficient to cause this realignment. The devices use very little energy as only a rotation of molecules is involved. Two display devices that make use of the alignments due to surfaces and electric fields are discussed in Section IV (see Figs. 17 and 18).

4. Flow Alignment

Flow of a liquid crystal affects the alignment and, conversely, a realignment caused, for instance, by the application of an electric or magnetic field can set up a flow of material. The effects of flow are quite different in different phases. Because of their characteristic elongated shapes, the molecules in a nematic flow more readily along the director than perpendicular to it; thus, if a nematic sample is sheared between two glass plates, the tendency is for the director to align in the direction of the shear. If the same thing is done to a cholesteric having no overall alignment of the helical axes, a planar or “Grandjean” texture arises in which the axes line up perpendicular to the direction of shear. A smectic flows best in directions perpendicular to the planar normals so that a shear tends to align the smectic layers with their normals perpendicular to the direction of the shear.


FIGURE 10 A schlieren texture obtained in a thin sample of nematic liquid crystal. The points from which alternating light and dark areas (brushes) radiate are the end points of line singularities of the director. The picture here is produced in a microscope, as in Figs. 11–16, by viewing the light transmitted through a sample placed between crossed polarizers. [Courtesy of J. W. Doane, Liquid Crystal Institute.]

E. Optical Properties

The optical behaviors of liquid crystals give rise to useful and often spectacular effects such as the optical switching characteristics used for liquid crystal displays and the vivid temperature-dependent colors of the cholesterics. The appearance of a bulk sample of liquid crystal in ordinary light ranges from transparent through translucent and from turbid to brightly and iridescently colored. By using polarized light, a very wide range of appearances or textures can be created by means of surface, flow, and field alignment of the various phases (see Figs. 10–16). Transitions from one phase to another can often have a remarkable appearance.

The microscopic basis for the many optical effects is the elongated shape and electronic structure of the typical liquid crystalline molecule together with the tendency for the long axes of the molecules to align with each other along the director. This makes the electric polarizability and, in turn, the index of refraction of the medium anisotropic. Phases such as the ordinary nematic and the untilted smectic A are optically uniaxial with the optic axis along the director. In fact, the cylindrical symmetry of the phases implies that any macroscopic physical properties, optical ones included, have identical values when measured in any orientation perpendicular to the director. For light propagating along the optic axis, all directions of polarization are equivalent, so there is no birefringence. For light traveling along other paths, birefringence is observed. Most commonly, the uniaxial phases have positive birefringence; that is, the refractive index is at its maximum for light polarized along the director (also the optic axis).

FIGURE 11 Sample undergoing a transition from a nematic (top) to smectic A (bottom). The temperature is lower at the bottom of the picture. One of the “threads” from which the nematic takes its name is indicated by an arrow in the top half of the picture. [Courtesy of Dr. Mary E. Neubert.]

FIGURE 12 Focal conic fan-shaped texture in a smectic A liquid crystal. The fans in different regions may appear in different colors. [Courtesy of Dr. Mary E. Neubert.]

The colors that are produced with liquid crystals can arise in several ways.

1. When the pitch of a cholesteric is in the range of wavelengths of visible light, Bragg reflections occur for visible light, similar to the Bragg scattering of X-rays from solid crystals.

2. Dichroic dye molecules such as methyl red, when dissolved in liquid crystals, tend to orient along the director. These molecules will absorb light polarized in one direction. By aligning the director of such a solution in different directions with respect to the polarization of a light beam, different colors may be produced.

3. For optical studies of the nature of phases and the temperatures of transitions, liquid crystals are placed between crossed polarizers. The beautiful and revealing textures that are observed are due to the interference characteristic of birefringent materials.

FIGURE 13 Fan-shaped texture in a smectic E. The striations often bound areas having different colors. [Courtesy of J. W. Doane, Liquid Crystal Institute.]

FIGURE 14 Mosaic textures in a smectic H. The different areas are usually of different colors. [Courtesy of Dr. Mary E. Neubert.]

FIGURE 15 A cholesteric with large, visible pitch, created by adding an optically active material to a normally nematic material. In the dark areas the molecules are perpendicular to the plane of the picture. [Courtesy of J. W. Doane, Liquid Crystal Institute.]

A cholesteric is uniaxial on a local scale of tens or hundreds of molecules in length with the optic axis rotating to describe a helix identical to that described by the director. Light incident along the pitch axis can be thought of as a sum of two waves, one with electric field rotating in the opposite sense to the helix and one with the electric field rotating in the same sense. The first wave behaves as it would in a normal medium, having an effective index of refraction equal to the average of the refractive indices for light polarized along and perpendicular to the optic axis. The second wave shows anomalous behavior, with nearly perfect reflections in a narrow band of wavelengths close to the pitch of the cholesteric. This band is typically only 25 nm (25 × 10⁻⁹ m) wide, producing a very pure colored appearance similar to the colors sometimes seen on beetle wings. In the blue phase of cholesterics an unusual platelet structure can be observed when different crystal domains have different faces aligned with the sample surface. Distinct colors are reflected from each domain, corresponding to the wavelengths satisfying the conditions for Bragg scattering.
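For light at normal incidence, the reflection band is commonly taken to extend in vacuum wavelength from n_o p to n_e p, where p is the pitch and n_o and n_e are the local refractive indices for light polarized perpendicular to and along the optic axis. The sketch below applies that rule of thumb; the function name and the sample pitch and indices are assumed values chosen to give a band of about the width mentioned above.

```python
def reflection_band(pitch_nm, n_ordinary, n_extraordinary):
    """Return (lower edge, upper edge, center, width) of the selective
    reflection band in nanometers for light along the helical axis."""
    lower = min(n_ordinary, n_extraordinary) * pitch_nm
    upper = max(n_ordinary, n_extraordinary) * pitch_nm
    center = 0.5 * (lower + upper)   # average index times pitch
    width = upper - lower            # birefringence times pitch
    return lower, upper, center, width

# Assumed cholesteric: pitch 350 nm, n_o = 1.50, n_e = 1.57.
lower, upper, center, width = reflection_band(350.0, 1.50, 1.57)
print(f"band {lower:.1f} to {upper:.1f} nm, center {center:.1f} nm, width {width:.1f} nm")
```

Since the pitch of many cholesterics changes with temperature, the same relation accounts for the temperature-dependent colors: the band center moves in proportion to the pitch.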


FIGURE 16 Focal conic defects in a chiral smectic C liquid crystal. The striped bands are due to the c director helix and run parallel to the layers. The typical focal conic ellipse-shaped line defects with the circular layers nested about the control line defect are evident. [Courtesy of N. A. Clark, University of Colorado, Boulder.]

The colorful interference patterns displayed by many birefringent liquid crystals observed between crossed polarizers are useful in determining the symmetry of unknown phases and the kinds of allowable defects. When a beam of linearly polarized monochromatic light travels through a birefringent material, the portion of the beam polarized along the optic axis travels at a speed different from that of the portion polarized perpendicular to that direction. Depending on the retardation in phase of one component with respect to the other, the two components combine to give elliptical, circular, or linearly polarized light at various points along the beam. For instance, the beam returns to its original state of polarization whenever the retardation is a multiple of 360°. It will be absorbed in the second polarizer. When the retardation in phase is 180°, the beam is again linearly polarized, but not in the original direction, and is not entirely absorbed in the second polarizer. In a typical liquid crystal sample, the relative retardation depends on the color of the light and the sample thickness and varies with the alignment of the director with respect to the light beam. Thus, with incident white light, different regions of the sample appear to have different colors, which can be changed by rotating the sample with respect to the polarizers, rotating one or both polarizers, or changing the alignment of the sample by external fields or flow. A uniaxial sample having a homeotropic alignment with the director (and thus the optic axis) parallel to the beam does not exhibit this behavior and appears uniformly dark.
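The dependence of brightness on retardation can be made concrete with the textbook expression for a uniform birefringent slab between ideal crossed polarizers, T = sin²(2φ) sin²(δ/2), where δ is the phase retardation and φ the angle between the optic axis and the first polarizer. The sketch below assumes that idealized formula; the function names, birefringence, and thickness are hypothetical.

```python
import math


def retardation_deg(delta_n, thickness_nm, wavelength_nm):
    """Phase retardation in degrees: 360 * delta_n * d / wavelength."""
    return 360.0 * delta_n * thickness_nm / wavelength_nm


def crossed_polarizer_transmission(retardation_degrees, axis_angle_deg=45.0):
    """Fraction of polarized light passed by the second (crossed) polarizer.

    Assumes a uniform uniaxial slab and ideal polarizers:
        T = sin^2(2*phi) * sin^2(delta / 2)
    """
    delta = math.radians(retardation_degrees)
    phi = math.radians(axis_angle_deg)
    return math.sin(2.0 * phi) ** 2 * math.sin(delta / 2.0) ** 2


# Hypothetical sample: birefringence 0.2, thickness 5 micrometers.
for wavelength_nm in (450.0, 550.0, 650.0):
    delta = retardation_deg(0.2, 5000.0, wavelength_nm)
    t = crossed_polarizer_transmission(delta)
    print(f"{wavelength_nm:.0f} nm: retardation {delta:.0f} deg, T = {t:.2f}")
```

Because the retardation differs for each wavelength, the transmitted fractions differ as well, which is the origin of the colors seen with white light. A retardation of 360° gives zero transmission and 180° gives the maximum, as described above.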

A nematic or smectic A sample of constant thickness, prepared with its director everywhere at a constant angle to the beam of light, will have a uniform colored appearance except at defects, which can then easily be seen. An unaligned sample will show gradual changes in color or brightness except at boundaries between domains or director singularities, the shape and motion of which can be used to probe the nature of an unknown phase. A thin sample of nematic having nearly homogeneous alignment displays a characteristic schlieren texture (see Fig. 10) with point-like singularities that are the end points of thread-like singularities (known as disclinations) at which the directors are undefined. The nature of a singularity, one example of which is a radial pattern of directors leading away from the point, can be deduced from its shape, its movement when the sample is rotated, and the result if it combines with another singularity. Although planar in nature, the smectic C can also form a schlieren texture since the component of the director in the plane of the layers can vary smoothly in direction. Smectics and cholesterics that are not uniformly aligned usually exhibit a form of focal conic texture displaying numerous elliptical or fan-like structures that scatter light strongly in all directions and have a strong depolarizing effect (see Figs. 11–13 and 16).

There are many other possible textures, from mosaic to fingerprint in appearance, some having very regular grid-like appearances. Disruptions in uniform structures can be produced by flow, giving rise to patterns such as the sets of parallel lines or even of feathery “chevrons” seen in Williams domains formed by patterns of flow in nematics caused by electric fields of certain strengths and frequencies. At higher voltages these patterns give rise to turbulence, which is accompanied by intense light scattering. This “dynamic scattering” is used in some display devices.

In bulk, the nematic liquid crystal scatters light strongly due to small-amplitude collective modes of orientational fluctuations with significant components near visible wavelengths. Mention has already been made of the study of light scattering to determine elastic and flow parameters. Light scattering is also measured in the vicinity of phase transitions to study fluctuations and critical phenomena. Interesting studies have been made of nonlinear light scattering in which the intensity of light scattered from an aligned sample was not proportional to the intensity of the original beam of light. Two laser beams of the same frequency can be directed at a liquid crystalline sample to produce a “phase grating,” which causes a spatial variation in the refractive indices and, in turn, diffracts some of the original light. This phenomenon may be usable for holographic imaging.

F. Homologous Series

Since it is ultimately intermolecular forces that give rise to the formation of mesophases and determine which of them form and at which temperatures, it is important to study the relationships between physical properties and chemical structure. Although the shape of a molecule has a large bearing on the phases it forms and the temperatures at which it transforms, other considerations, such as the rigidity of the bonds and the imbalances in attractions between different parts of neighboring molecules, play strong roles. These roles can be investigated by studying series of compounds differing in, for instance, the nature of the terminal groups or the polarizability of different substituents. One of the chemical variables that lends itself to such investigation is the length of the alkyl chains that commonly terminate liquid-crystalline molecules. The compound MBBA is one of a series of molecules differing only in the number of carbon atoms appearing in the end chains. Another molecule in the series is the 10th example of Table I, 4-butyloxybenzal-4-ethylaniline. Typically, the earlier members of such a series of molecules, called a homologous series, will have nematic phases, the later members will have smectic phases, and intermediate members will display both smectic and nematic phases. Plots of quantities such as transition temperatures versus carbon number will often show an even–odd effect. For instance, a plot of the nematic-to-isotropic transition temperature versus carbon number will show an alternation between two curves, one for odd numbers of carbons and the other for even numbers. The separation of the two curves tends to diminish at higher carbon numbers. This suggests that the attraction between ends of molecules plays a role in the stability of nematic versus isotropic phases, since besides affecting the length of the molecule, the addition of one carbon to an alkyl chain changes the orientation of the last carbon–carbon bond with respect to the molecule’s long axis.

G. Other Studies

Many of the tools available to the physicist and chemist have been used to study the properties of liquid crystals and determine the nature of the phases. Many have been mentioned above. Others include measurements of refractive indices, dielectric constants and relaxation, magnetic susceptibilities, neutron scattering, and fluorescence recovery after photobleaching (FRAP, used to study diffusion). Besides the study of optical textures, X-ray scattering, NMR, and thermal measurements are of particular importance in determining the structure and nature of phases and phase transitions and are discussed briefly.

1. X-Ray Scattering

As in the study of solids, X-ray scattering is used to determine the symmetries of the phases as well as to measure the separation of the planes of smectics, intermolecular distances, packing of molecules, and degree of long- and short-range molecular order. For many thermotropic liquid crystals it is possible to study oriented samples, but for the lyotropics the studies are more often restricted to powder methods. The diffraction patterns seen are neither the sharp Bragg peaks of monodomain solid crystals nor the diffuse reflections characteristic of isotropic liquids. In oriented samples one finds combinations of sharp and diffuse rings, arcs of rings, and spots.

The recent availability of synchrotron sources of high intensity and low line width (typically under 10⁻³ Å⁻¹ as opposed to 0.04 Å⁻¹ for a rotating-anode generator) has enabled studies of free-standing films of liquid crystals with thicknesses down to several molecules. This has yielded information relevant to the study of two-dimensional phases and phase transitions and, in fact, evidence that the thin films have the expected properties of a two-dimensional crystal.
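The significance of the line widths quoted above can be illustrated by converting wave vectors to lengths. The sketch below uses d = 2π/q for a layer spacing and takes 2π/Δq as an order-of-magnitude estimate of the longest distance over which order can be resolved. The 2π convention for the resolution length and the example peak position are assumptions, not values from the text.

```python
import math


def layer_spacing_angstrom(q_peak_inv_angstrom):
    """Layer spacing d = 2*pi / q for a smectic Bragg peak at wave vector q."""
    return 2.0 * math.pi / q_peak_inv_angstrom


def resolution_limited_length_angstrom(line_width_inv_angstrom):
    """Rough longest resolvable correlation length, taken here as 2*pi / dq."""
    return 2.0 * math.pi / line_width_inv_angstrom


# Hypothetical smectic layer peak at q = 0.2 inverse angstroms.
print(f"layer spacing: {layer_spacing_angstrom(0.2):.1f} angstroms")

# Instrumental line widths quoted in the text.
for label, dq in (("synchrotron", 1e-3), ("rotating anode", 0.04)):
    length = resolution_limited_length_angstrom(dq)
    print(f"{label}: order resolvable out to roughly {length:.0f} angstroms")
```

On this estimate the synchrotron line width extends the accessible length scale by a factor of 40 over the rotating-anode source.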

2. Nuclear Magnetic Resonance

The phenomenon of NMR offers a number of tools useful in elucidating the nature of phases and the motions of the molecules in those phases. The spectra obtained from the hydrogen nuclei in most liquid crystal phases are not made up of sets of well-defined absorption lines, as in isotropic liquids, but are typically as broad as 40 kHz, with widths proportional to the order parameter, s. This is due to dipole–dipole interactions between neighboring hydrogen nuclei. In isotropic liquids, such interactions are rapidly modulated by the tumbling of the molecules, rapid enough on the NMR time scale (10⁻⁷ sec, say, for proton NMR at 100 MHz) to average to zero. In most liquid crystals the tumbling is not isotropic, and the interactions are not averaged out, although they are somewhat reduced compared to the crystalline solid. In the smectic D and viscous isotropic phases, chemical shift spectra are obtained as in isotropic liquids, apparently because the molecules diffuse rapidly between areas in which they have different orientations.


The NMR spectra and relaxation times observed in liquid crystalline phases depend on the alignment of the sample in the magnetic field. Especially in a smectic A phase, it is possible to create a single-domain liquid crystal for which the director can be rotated with respect to the magnetic field; this drastically changes the spectrum and even reduces it to a single line when the director makes a “magic angle” of 54.74° with the magnetic field. In a tilted smectic, such as smectic C, the limited freedom of the director to reorient can be seen in the variation of the NMR signal (the free induction decay in a pulsed NMR experiment) when the sample is rotated in the magnetic field. The directors can align in any of a cone of directions centered on the layer normal. Thus, when rotating a sample originally having a uniform alignment of directors but not necessarily layers (this is accomplished by cooling from a nematic phase while in a strong magnetic field), the directors “follow” the field, although not to the extent seen in the nematic phase. Unless the sample consists of a single domain (all the layers are parallel), the final spectrum will be made up of a superposition of spectra corresponding to the parts of the sample having various angles between the director and the magnetic field.
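The value of the magic angle follows from the angular factor that scales dipolar splittings when the director is tilted by an angle θ from the field, the second Legendre polynomial P₂(cos θ) = (3cos²θ − 1)/2. The sketch below assumes this standard scaling for a uniaxial phase and locates the angle at which the factor vanishes; the function name is illustrative.

```python
import math


def dipolar_scaling(theta_deg):
    """Second Legendre polynomial P2(cos(theta)) = (3*cos^2(theta) - 1) / 2.

    Assumed here to scale the dipolar splittings when the director makes
    an angle theta with the magnetic field.
    """
    c = math.cos(math.radians(theta_deg))
    return 0.5 * (3.0 * c * c - 1.0)


# The factor vanishes where cos^2(theta) = 1/3.
magic_angle_deg = math.degrees(math.acos(1.0 / math.sqrt(3.0)))
print(f"magic angle = {magic_angle_deg:.2f} degrees")

for theta in (0.0, 30.0, magic_angle_deg, 90.0):
    print(f"theta = {theta:6.2f} deg -> scaling {dipolar_scaling(theta):+.3f}")
```

The splittings are largest with the director along the field, vanish at the magic angle, and return with half the magnitude and opposite sign at 90°.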

For a nematic sample that has been suddenly rotated, the realignment of the director can sometimes be observed in the changing NMR signal, especially near room temperature. An effective rotational viscosity can be measured with this effect.

The NMR signals from other nuclei are also studied. The signals from deuterium nuclei that have replaced hydrogen are especially useful, as the difference in chemical behavior is usually slight and the spectra from aligned samples show many individual, in some cases nonoverlapping, lines that can be assigned to the nuclei at particular locations in the molecules. This allows detailed investigation into the alignment of the molecules in various phases, including the more solidlike smectic phases, in which even the rotations of the molecules about their long axes may be restricted.

Measurements of spin–lattice relaxation times (T1’s) have been a rich source of information pertaining to the motions of the molecules in liquid crystal phases. For nematics at temperatures above roughly 50°C, for instance, the relaxation of hydrogen nuclei is dominated by motions due to cooperative fluctuations in the local director, giving a characteristic frequency and angular dependence to the measured times. Similar effects can be seen in the rotating-frame relaxation rates in smectic liquid crystals. With deuterated samples (samples in which deuterium atoms have replaced some hydrogen atoms), it is possible to study the motions of different parts of the molecules.

It is possible to study translational motions of molecules by using pulsed magnetic field gradients to “label” the nuclei of molecules according to their positions. Such experiments have allowed direct measurements of self-diffusion coefficients in a few nematics and smectics as well as the diffusion coefficients of probe molecules such as benzene or tetramethylsilane (TMS) dissolved in liquid crystals. Since the method is sensitive to the direction of molecular displacements, anisotropies can and have been measured. It is also possible to vary the diffusion times, allowing investigations into nonlinear diffusion, for instance, that occurring between restrictions such as cell walls.
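A diffusion coefficient is commonly extracted from such an experiment with the Stejskal–Tanner relation for rectangular gradient pulses, S/S₀ = exp[−γ²g²δ²(Δ − δ/3)D]. The article does not name this equation, so the sketch below is an assumed model for free, unrestricted diffusion along the gradient direction. The gradient strength, timings, and diffusion coefficient are hypothetical.

```python
import math

GAMMA_PROTON = 2.675e8  # proton gyromagnetic ratio, rad s^-1 T^-1


def b_value(gradient_t_per_m, pulse_s, separation_s, gamma=GAMMA_PROTON):
    """Stejskal-Tanner weighting (gamma*g*delta)^2 * (Delta - delta/3), in s/m^2."""
    return (gamma * gradient_t_per_m * pulse_s) ** 2 * (separation_s - pulse_s / 3.0)


def echo_attenuation(diffusion_m2_s, gradient_t_per_m, pulse_s, separation_s):
    """Echo amplitude ratio S/S0 for free diffusion along the gradient."""
    return math.exp(-b_value(gradient_t_per_m, pulse_s, separation_s) * diffusion_m2_s)


def diffusion_from_attenuation(ratio, gradient_t_per_m, pulse_s, separation_s):
    """Invert the relation to recover D from a measured S/S0."""
    return -math.log(ratio) / b_value(gradient_t_per_m, pulse_s, separation_s)


# Hypothetical experiment: 0.5 T/m gradient, 2 ms pulses, 20 ms separation.
ratio = echo_attenuation(1.0e-10, 0.5, 2.0e-3, 20.0e-3)
recovered = diffusion_from_attenuation(ratio, 0.5, 2.0e-3, 20.0e-3)
print(f"S/S0 = {ratio:.3f}, recovered D = {recovered:.2e} m^2/s")
```

Repeating the measurement with the gradient parallel and then perpendicular to the director would give the two principal diffusion coefficients, and varying the separation probes restricted diffusion, for which this simple exponential form no longer holds.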

3. Thermal Measurements

Most of the measured values of properties of liquid crystals depend on temperature. Studies of these temperature dependencies are of the utmost importance in determining the correctness of models used to predict them. Determinations of phase diagrams and measurements of heat capacities and latent heats are also necessary to discover the nature of phases and the differences between them. Because of the large number of mesophases and possible parameters that can be manipulated to effect transitions, studies of liquid crystals allow investigations into many predictions of statistical physics, even extending to studies of the phases of two-dimensional systems. For example, by mixing a compound that has nematic, smectic A, and smectic C (or only nematic and A) phases with various proportions of another that has only nematic and smectic C, one obtains a phase diagram (with temperature and concentration as variables) having a “multicritical point,” where the nematic–A, nematic–C, and A–C phase boundaries meet. As the point is approached by varying concentrations, the nematic–C transition entropy decreases to zero, as does the “bump” in the specific heat curve near the nematic–A transition, while the corresponding bump near the A–C transition increases. Although the phase diagrams for different mixtures differ, high-resolution investigations close to the multicritical point have shown universal features in the shapes of the diagrams. It has been difficult to make theoretical predictions for these shapes because, among other reasons, the smectic A and C phases lack long-range translational order. As confirmed with X-ray scattering, fluctuations in layer positions diverge logarithmically with the sample size in those phases.
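The logarithmic divergence of the layer fluctuations can be illustrated with the Landau–Peierls estimate ⟨u²⟩ ≈ k_BT/[4π(BK)^½] ln(L/a), where B is the layer compression modulus, K an elastic constant, L the sample size, and a a molecular cutoff length. This expression and its prefactor are an assumed textbook form rather than something given in the article, and the material constants below are order-of-magnitude guesses.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def mean_square_layer_displacement(sample_size_m, temperature_k=300.0,
                                   compression_modulus=1.0e7,  # B in J/m^3 (assumed)
                                   elastic_constant=1.0e-11,   # K in N (assumed)
                                   cutoff_m=3.0e-9):           # molecular length (assumed)
    """Landau-Peierls estimate of <u^2> in m^2 for a smectic of size L."""
    prefactor = K_BOLTZMANN * temperature_k / (
        4.0 * math.pi * math.sqrt(compression_modulus * elastic_constant))
    return prefactor * math.log(sample_size_m / cutoff_m)


for size_m in (1.0e-6, 1.0e-4, 1.0e-2):
    rms_m = math.sqrt(mean_square_layer_displacement(size_m))
    print(f"L = {size_m:.0e} m: rms layer displacement ~ {rms_m * 1e10:.1f} angstroms")
```

With these assumed constants the fluctuations grow without bound but only very slowly with sample size, which is why smectic layering remains well defined in any sample of practical dimensions.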

IV. APPLICATIONS

A. Liquid Crystal Displays

Liquid crystal displays are used in a wide range of applications, from clocks to oscilloscopes. Their popularity is due to the conveniently thin, flat shape and the very low power required. In these devices, a thin layer of liquid crystal (usually nematic) is sandwiched between parallel cell walls, which have been treated to control the alignment of the liquid crystal director. When a potential difference of several volts is applied to transparent electrodes on either side of the liquid crystal, the resulting electric field causes a realignment of the molecules and a change in the optical behavior of the layer. Many different approaches have been tried. In an early type of display, the dynamic scattering display, turbulence was set up in the liquid crystal, causing it to scatter light. Observed with backlighting, the turbulent area would appear darker than the surrounding areas, while it would appear lighter than the surroundings when observed by reflected light. Polarizers are not required for this kind of display.

In a second type of display, a dichroic dye is dissolved into the liquid crystal. The dye molecules, which act like polarizers, are lined up by the liquid crystal molecules so that the application of an electric field changes the direction of polarization and the amount of light absorbed. A polarizer is required in this device.

The twisted nematic display, shown in Fig. 17, is the one commonly used for digital watches and other small displays. The surfaces of the cell are treated so that, in the absence of an electric field, the local directors are all coplanar, but twist through 90° as shown in the top part of the figure. Light entering the cell (the wide arrow going down at the top of the figure) is polarized parallel to the director at the top surface. The polarization follows the twist in the director and the light passes through the polarizer at the bottom. It is reflected by a mirror and reverses its path to emerge at the top surface. This area appears bright. In an area in which the electric field is turned on, the directors align with the field throughout most of the sample. Now the beam’s polarization is not rotated by the liquid crystal and the light is absorbed by the second polarizer. This area appears dark. The flow that accompanied the change of state in early designs typically made the response time of these devices too long to allow the twisted nematic cell to be used in televisions or oscilloscopes.
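How well the polarization follows the twist in the field-off state depends on the cell thickness, birefringence, and wavelength. A common idealized description, not given in the article, is the Gooch–Tarry expression for a 90° twist. Between crossed polarizers the transmitted fraction of polarized light is T = 1 − sin²[(π/2)(1 + u²)^½]/(1 + u²), with u = 2dΔn/λ. The sketch below assumes that formula, ideal polarizers, and a single pass through the cell; the cell parameters are hypothetical.

```python
import math


def tn_off_state_transmission(delta_n, thickness_um, wavelength_um):
    """Field-off transmission of a 90-degree twisted cell, crossed polarizers.

    Assumed Gooch-Tarry form with u = 2 * d * delta_n / wavelength:
        T = 1 - sin^2((pi/2) * sqrt(1 + u^2)) / (1 + u^2)
    """
    u = 2.0 * thickness_um * delta_n / wavelength_um
    leak = math.sin(0.5 * math.pi * math.sqrt(1.0 + u * u)) ** 2 / (1.0 + u * u)
    return 1.0 - leak


# Hypothetical cell: birefringence 0.1, evaluated at 0.55 micrometers.
for thickness_um in (1.0, 3.0, 5.0, 10.0):
    t = tn_off_state_transmission(0.1, thickness_um, 0.55)
    print(f"d = {thickness_um:4.1f} um: off-state transmission = {t:.3f}")
```

For thick or strongly birefringent cells (large u) the transmission approaches unity, which is the regime in which the polarization simply follows the twist. With no birefringence the cell behaves as an isotropic layer and the crossed polarizers block the light.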

The essential features of one of the first practical color displays to use liquid crystals are shown in Fig. 18. Red and green light is emitted by the phosphors of a cathode ray tube (CRT). A pair of color polarizers is used so that the light incident on the liquid crystal consists of red light polarized vertically and green light polarized horizontally. As with the twisted nematic cell, if the light passes through the liquid crystal cell when the electric field is on, it arrives unchanged at the final polarizer. In this case the red light is absorbed and the green is transmitted. On the other hand, with the electric field off, the molecules relax toward the configuration favored by the surface interaction, in which the directors lie along a curve similar to a parenthesis, (.

FIGURE 17 Operation of a twisted nematic LCD. (a) In the off state, the molecules align perpendicular to the incoming light (indicated by wide arrows) and with a twist from top to bottom that rotates the direction of polarization of the light so that it passes through the second (crossed) polarizer and is reflected, giving a bright appearance. (b) With the electric field on, the molecules line up with the field, except for layers very close to the treated surfaces. The incoming light’s polarization is not rotated, and the light is absorbed in the second polarizer, making the area appear dark. [Courtesy of Jie Zhu.]

The thickness and birefringence of the liquid crystal cell are such that the direction of polarization of the light is rotated by 90°, with the result that the red light is transmitted and the green absorbed. The device switches states in milliseconds and produces a multicolored display with excellent contrast, performing well in high ambient light.
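The switching between red and green can be checked with a small Jones-calculus model. The sketch below treats the field-off cell as an ideal half-wave retarder with its optic axis at 45° to the incoming polarizations, and the field-on cell as having no retardation. The 45° orientation, the horizontal final polarizer, and the neglect of the wavelength dependence of the retardation are assumptions made for illustration.

```python
import math


def retarder(retardance_rad, axis_angle_rad):
    """Jones matrix of a linear retarder (assumed ideal and lossless)."""
    c, s = math.cos(retardance_rad / 2.0), math.sin(retardance_rad / 2.0)
    c2, s2 = math.cos(2.0 * axis_angle_rad), math.sin(2.0 * axis_angle_rad)
    return ((complex(c, -s * c2), complex(0.0, -s * s2)),
            (complex(0.0, -s * s2), complex(c, s * c2)))


def apply(matrix, vector):
    """Multiply a 2x2 Jones matrix by a Jones vector (Ex, Ey)."""
    return (matrix[0][0] * vector[0] + matrix[0][1] * vector[1],
            matrix[1][0] * vector[0] + matrix[1][1] * vector[1])


def through_horizontal_polarizer(vector):
    """Intensity passed by a final polarizer that transmits the x component."""
    return abs(vector[0]) ** 2


RED_VERTICAL = (0.0, 1.0)      # red light polarized vertically
GREEN_HORIZONTAL = (1.0, 0.0)  # green light polarized horizontally

FIELD_ON = retarder(0.0, math.pi / 4.0)       # no retardation: light unchanged
FIELD_OFF = retarder(math.pi, math.pi / 4.0)  # half-wave: 90-degree rotation

for label, cell in (("field on", FIELD_ON), ("field off", FIELD_OFF)):
    red = through_horizontal_polarizer(apply(cell, RED_VERTICAL))
    green = through_horizontal_polarizer(apply(cell, GREEN_HORIZONTAL))
    print(f"{label}: red {red:.2f}, green {green:.2f}")
```

In this model the field-on state passes only green and the field-off state passes only red, in agreement with the behavior described above.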


FIGURE 18 Liquid crystal CRT sequential color display. Red and green information is sequentially written on a multicomponent (R, G) phosphor screen. Color polarizers orthogonally polarize the red and green light emitted. The LC switch sequentially rotates colored information into the transmission axis of a linear polarizer. [Courtesy of P. J. Bos, Tektronix.]

Other display devices using liquid crystals include a screen for a miniature color television, storage displays using thin-film transistors, and displays that store information by altering the state of the liquid crystal locally (e.g., destroying the alignment in a small area by heating it with a laser beam). Bistable electro-optic switches with switching times under 1 µsec have been constructed with ferroelectric smectic C liquid crystals. Cholesterics have been used in several bistable displays. In one kind, the cholesteric is initially in a planar configuration. A low-frequency voltage pulse disrupts the alignment into a light-scattering focal-conic structure, which persists after the end of the pulse. A short, higher frequency pulse restores the initial alignment. Alternatively, the planar alignment can be disrupted with heat from an infrared beam. Liquid crystals encapsulated in epoxy have been used to create rugged, fast-switching displays with high contrast.

B. Other Commercial Applications

The pitch of a cholesteric liquid crystal, and thus its colored appearance, is sensitive to such things as temperature, pressure, electric and magnetic fields, and impurities. Cholesterics are used to create continuous maps of temperatures on various surfaces, for instance, to locate circuit board or welding faults and to detect radiation and carcinoma of the breast. In such applications, a coating of liquid crystal can be painted on the area. The range of temperatures to which the coating responds can be widely varied by choice of the liquid crystals used.

Ultrasonic waves have been detected with cholesteric liquid crystals in which the pitch is altered by local heating or by the direct effect of high-intensity waves. In another application the ultrasonic waves directly cause a change from one stable director configuration into another. Such detectors may be usable in sonar devices.

The development of Kevlar, a high-strength polymer competitive with steel on a weight-per-strength basis, has stimulated the study of liquid crystal polymer phases, from which the fibers are spun. Graphitic fibers formed from discotic phases form another class of strong, light materials.

Lyotropic phases are not without applications either. Everyone is familiar with the usefulness of detergents in everyday life. The correct use of systems formed from water, surfactants, and oil may help to recover more of the oil left in the ground after the primary methods of oil recovery have been exhausted.

C. Biological and Medical Uses

Structures having liquid-crystalline order occur in many biological systems. One of the prime examples is the cell membrane, a representation of which is shown in Fig. 8. The lipid bilayer in the membranes has the same basic structure as the lamellar structure found in lyotropic liquid crystals. More knowledge of the nature of such materials should help in understanding the operation of the cell membrane and is actively being sought.

Liquid crystals are a factor in several diseases. The cells in sickle-cell anemia have a liquid crystal structure. Hardening of the arteries is due to the deposition of liquid crystals made from molecules containing cholesterol. It may be possible to convert the material forming gallstones into liquid crystals, which can then be passed out of the body.

The uses of cholesteric liquid crystals to determine temperature distributions have been mentioned above. The ability of these liquid crystals to convert temperatures into a visual pattern provides a unique diagnostic tool that has been useful in studying abnormalities in venous patterns, detecting primary or metastatic carcinoma in the skin, and locating the placenta of a fetus.

D. Statistical Mechanics

The diversity of liquid crystalline phases, the range of materials that form those phases, and their accessibility to a wide range of experimental studies make the study of liquid crystals a rich source of information about statistical mechanics. Many of the behaviors have analogies in, for instance, studies of ferromagnetism, superconductivity, or superfluidity. The blue phase that appears in some cholesterics may be one rare example of a thermodynamically stable array of defects. Even the relatively simple smectic A phase is not completely understood. The study of transitions from smectic A to smectic C or nematic phases has yielded much information relevant to modern ideas of statistical mechanics, such as the concept of spontaneously broken symmetries and the corresponding appearance of new hydrodynamic modes. Studies of the importance of fluctuations for different symmetries, ranges of interactions, and the spatial dimensions of ordering are useful in understanding critical phenomena.

Near a phase transition, many properties of materials should change in ways characteristic of the symmetries of the phases involved as opposed to the specific materials being used. This universality can be tested in many ways with liquid crystals. An example is the shape of phase diagrams for mixtures of a compound that has nematic and smectic A phases in its pure form with another compound that has nematic and smectic A and smectic C phases. On a plot of temperature versus composition, the curves forming the boundaries between nematic, smectic A, and smectic C regions meet at a multicritical point. A second example of this behavior has been found in systems that have a reentrant nematic phase; that is, they display the sequence of phases nematic–smectic A–nematic on heating at constant pressure. Although a phenomenological model for this behavior is successful in predicting certain properties of the phase transitions, the microscopic model to account for such behavior is less certain.

Free-standing films of liquid crystals have been used to investigate theories of two-dimensional phase transitions and have applications to studies of membrane biology and chemical catalysis. Even narrow strands formed by columnar liquid crystals have been studied. These experiments may be of relevance to possible one-dimensional nematic phases.

SEE ALSO THE FOLLOWING ARTICLES

FERROMAGNETISM • LIQUID CRYSTAL DEVICES • LIQUIDS, STRUCTURE AND DYNAMICS • MACROMOLECULES, STRUCTURE • NUCLEAR MAGNETIC RESONANCE • SURFACTANTS, INDUSTRIAL APPLICATIONS • ULTRASONICS AND ACOUSTICS


Page 230: Encyclopedia of Physical Science and Technology - Condensed Matter

P1: GLQ Final Pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN008B-389 June 29, 2001 15:35

Luminescence

J. N. Demas
S. E. Demas
University of Virginia

I. Introduction
II. Origins of Luminescences
III. Excited State Types
IV. Methods of Studying and Characterizing Excited States
V. Processes Affecting Luminescence
VI. Types of Luminescence

GLOSSARY

Color centers Absorbing sites in solids caused by lattice defects, trapped electrons or holes, or the formation of new chemical species.

Excimer Excited complex that does not exist in the ground state and is formed between one excited and one ground-state molecule of the same type.

Exciplex Excimer formed between two molecules of different types.

Fluorescence Luminescence characterized by very short lifetimes; typically a spin-allowed process.

Hole In solids, an electron-deficient center that frequently can move through the lattice.

Internal conversion Relaxation of a system from an upper state to a lower one of the same spin multiplicity.

Intersystem crossing Conversion of a system from a state of one spin multiplicity to another.

Laser Acronym for light amplification by stimulated emission of radiation; a stimulated emission device that produces intense, highly directional, coherent, monochromatic optical radiation.

Luminescence Emission of ultraviolet (UV), visible, or infrared (IR) radiation of excited materials.

Phosphorescence Luminescence characterized by a long lifetime; frequently a spin-forbidden process.

Quenching Deactivation of an excited state by a nonemissive pathway.

Stimulated emission Photon emission from an excited species promoted by the presence of other photons.

Trap Lattice defect or chemical center in solids that can trap an electron or a hole.

LUMINESCENCE is the emission of ultraviolet (UV), visible, or infrared (IR) radiation from materials and arises from a radiative transition between an excited state and a lower state. The classification of the luminescence depends on how the excited state was derived. Photoluminescence arises following excitation by the absorption of a photon of light. Electroluminescence and cathodoluminescence arise from electric current flow in solids or solutions or in gases during an electrical discharge. Chemiluminescence arises during chemical reactions, and bioluminescence is chemiluminescence in biological systems. Radioluminescence arises from the passage of ionizing radiation or particles through matter. Thermoluminescence occurs during gentle sample heating.

I. INTRODUCTION

Since the beginning of recorded history, and undoubtedly much earlier, individuals have been fascinated by luminescence. The cold bioluminescences of glowworms, rotting wood, and sea creatures and the spectacular light shows of the aurora borealis have been particularly intriguing, and a great deal of effort has been made to understand their origins. Until the advent of quantum mechanics, however, the fundamental origins of these emissions could not be satisfactorily explained.

There were numerous ingenious attempts to quantify luminescence phenomena using photographic and manual recording of emission behavior; however, especially for broad molecular emissions, the major breakthroughs in luminescence studies tended to parallel instrumental developments. In particular, the high-sensitivity commercial photomultiplier tube marketed in the 1940s and low-cost spectrofluorimeters of the 1950s can be credited with much of the modern information, theories, and applications of luminescence. More recently, lasers and nanosecond and subnanosecond decay time instruments have revolutionized the types of information that can be extracted.

Any study of luminescence should address the follow-ing key questions:

1. What is the molecular and atomic nature of the origin of the luminescence?

2. What are the detailed paths of molecular excitation and deactivation?

3. What are the structures of excited states?

4. Can one rationally design systems with specific and useful properties or exploit existing properties?

This article is concerned primarily with the phenomenological aspects of each type of luminescence rather than the theoretical underpinnings of the subject. The origins and factors affecting luminescence are described. Some experimental methodologies for studying luminescences are examined. Finally, applications of the various luminescences are described. Emissions of very high-energy photons from nuclear or inner-electron-shell transitions or from the nonspecific incandescence of hot solids or plasmas are excluded.

II. ORIGINS OF LUMINESCENCES

Specifically considered are emissions that arise by radiative transitions between two states of atomic, molecular, or extended molecular systems. A radiative transition is one in which the energy is released as a photon. The nature of the emission depends on the nature of the initial and final states and the route to the excited state.

First, types of excited states are categorized, the factors that influence excited state emission are described, and then the methods of excited state population that define the nature of the emission are discussed.

Figures 1 and 2 show some of the wealth and complexities of atomic and molecular emissions. Figure 1 shows the absorption and emission (relative intensity as a function of wavelength) spectra of anthracene. There are two distinct emissions. The high-energy band at 400 nm is characterized by a short luminescence lifetime of a few nanoseconds, while the lower-energy emission at 700 nm can be characterized by millisecond lifetimes. The overlap between the lowest energy absorption and the high-energy emission is characteristic of this type of system. The regular progression of peaks on both emission systems is also common to many molecular systems.
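The terms "high-energy" and "lower-energy" used for the two anthracene bands follow from the photon energy E = hc/λ. The short Python sketch below converts the two quoted wavelengths to electron volts; the function name is illustrative and the wavelengths are treated as vacuum values, which is an assumption.

```python
# Physical constants (exact SI values).
PLANCK_J_S = 6.62607015e-34         # Planck constant, J s
LIGHT_SPEED_M_S = 2.99792458e8      # speed of light, m/s
ELECTRON_VOLT_J = 1.602176634e-19   # one electron volt in joules


def photon_energy_ev(wavelength_nm: float) -> float:
    """Return the photon energy in eV for a vacuum wavelength given in nm."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / ELECTRON_VOLT_J


fluorescence_ev = photon_energy_ev(400.0)     # short-lived, high-energy band
phosphorescence_ev = photon_energy_ev(700.0)  # long-lived, lower-energy band
print(f"400 nm: {fluorescence_ev:.2f} eV, 700 nm: {phosphorescence_ev:.2f} eV")
```

The 400-nm band corresponds to roughly 3.1 eV and the 700-nm band to roughly 1.8 eV.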

FIGURE 1 Absorption (dashed lines) and emission (solid lines) of anthracene. The lower portion displays the electronic and vibrational assignments of the absorption and emission bands. [Reprinted with permission from Turro, N. (1978). “Modern Molecular Photochemistry,” Benjamin/Cummings, Menlo Park, CA.]

FIGURE 2 Uncorrected emission spectrum of a low-pressure mercury vapor discharge. Wavelengths in nanometers are adjacent to each line. The intensity is on a log scale and the continuous background is from the plasma discharge. The line widths are instrumentally limited, and several of the lines are unresolved multiplets. [Data kindly supplied by W. R. Bare.]

Figure 2 shows the cathodoluminescence of atomic mercury in a low-pressure discharge. Particularly noteworthy is the exceptional narrowness of the atomic versus the molecular emissions. The characteristics and differences of these emissions are discussed in Section III. Note the logarithmic scale used to display the weaker lines. The broad weak continuum is from plasma discharge in the supporting Ar gas.

III. EXCITED STATE TYPES

A. Spin Multiplicity

A simplified excited state diagram is pictured in Fig. 3. Details of the quantum mechanical origins and nature of excited states are not presented here. The system is characterized by a singlet ground state, denoted by S0, and singlet excited states, denoted by Si (i = 1, 2, . . .). Singlet states arise when all the electrons are spin-paired. Also shown in the excited manifold is the lowest triplet state, denoted by T1. Triplets arise when there are two unpaired spins. This type of system corresponds to the vast majority of organic molecular species and occurs when the lowest energy configuration of the system is due to all of the electrons being spin-paired.

FIGURE 3 Schematic energy level diagram, or Jablonski diagram, for a molecule showing the possible paths of energy degradation. Solid lines represent radiative emission processes, and dashed lines represent nonradiative processes. Rate constants and efficiencies of the indicated processes are denoted by k’s and Φ’s. [Reprinted with permission from Demas, J. N. (1983). J. Chem. Ed. 60, 803. Copyright 1983 Division of Chemical Education, American Chemical Society.]

Excited states of such a system generally arise when a paired electron is promoted from a filled to an unoccupied orbital. The electron can remain paired with the electron left behind to form excited singlet states, or it can undergo a “spin flip” and become unpaired; this results in a triplet state. The triplet state derived from a specific orbital promotion is of lower energy than the corresponding singlet state (Figs. 1 and 3).

Oxygen and metal ions are the common stable exceptions to this type of excited state diagram. Atomic species in flames or discharges are also frequent exceptions. Oxygen has a triplet ground state with singlet and triplet excited states. The ordering of excited singlets and triplets is inverted relative to that of Fig. 1, however, with the singlets being below their corresponding triplets. Metal ions can exhibit a multitude of excited state multiplicities, which can range from doublets (one unpaired electron) for Cu²⁺ and Na, through quartets (three unpaired electrons) for Cr³⁺, to octets (seven unpaired electrons) for Eu²⁺.
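The multiplicity names quoted above follow from counting unpaired electrons. The sketch below assumes that all unpaired spins are parallel, so that the total spin is S = n/2 and the multiplicity is 2S + 1; the function and dictionary names are illustrative only.

```python
from fractions import Fraction

# Conventional names for spin multiplicities 1 through 8.
MULTIPLICITY_NAMES = {
    1: "singlet", 2: "doublet", 3: "triplet", 4: "quartet",
    5: "quintet", 6: "sextet", 7: "septet", 8: "octet",
}


def spin_multiplicity(unpaired_electrons: int) -> int:
    """Multiplicity 2S + 1, assuming all unpaired spins are parallel."""
    if unpaired_electrons < 0:
        raise ValueError("electron count cannot be negative")
    total_spin = Fraction(unpaired_electrons, 2)  # S = n/2
    return int(2 * total_spin + 1)


# Species and unpaired-electron counts as quoted in the text.
for species, unpaired in [("Cu2+", 1), ("Cr3+", 3), ("Eu2+", 7)]:
    m = spin_multiplicity(unpaired)
    print(f"{species}: {unpaired} unpaired -> {MULTIPLICITY_NAMES[m]}")
```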

Regardless of the nature of the ground state, however, the excited states can have spin multiplicities that are the same as, or different from, the ground state. Spin selection rules control whether a transition between states is allowed or forbidden. Transitions between states of the same multiplicity are spin-allowed, while all others are forbidden. Forbiddenness does not mean that a transition will not occur at all, but that it will not occur as readily as an allowed one. Allowed transitions are characterized by strong absorptions, large rate constants, and short lifetimes. Spin-forbidden transitions exhibit weak absorptions, long lifetimes, and low rate constants. Compare the allowed 400-nm absorption with the forbidden 650-nm absorption of anthracene (Fig. 1), where the allowed higher-energy transition is 10⁸ times more intense.

Figure 3 is based on the assumption that spin is always a good quantum number. This assumption is not always correct, especially for species of high atomic number. Spin-orbit coupling can mix orbital and spin angular momentum, and then the concept of electron spin fails. It is necessary to discuss the states of the system in terms of the good quantum number J. Pragmatically, spin-orbit coupling scrambles the singlet and triplet states and gives a large component of the other spin character to the state. Thus, the mixing of singlet character into a triplet state can greatly increase the allowedness of spin-forbidden transitions.

B. Fluorescence and Phosphorescence

Traditionally there has been a phenomenological characterization of emission type. Short-lived emissions have been considered fluorescences and long-lived emissions, phosphorescences. One would then infer that phosphorescence arises from spin-forbidden processes and fluorescence from spin-allowed processes.

In the case of many discrete molecular systems this categorization is correct. Figure 1 demonstrates the simultaneous presence of fluorescence and phosphorescence. The short-lived emission is the fluorescence and the long-lived emission the phosphorescence.

This simple model can break down when one considers the more complex case of fluorescent lamp and television phosphors. After use, a television screen glows in a dark room for minutes to hours. In keeping with the empirical classification these glows are called phosphorescences. As it turns out, however, these very long lifetimes are generally associated with slow secondary trapping processes that have nothing at all to do with the fundamental luminescence processes. Indeed, the fundamental luminescence step in many phosphorescences is a spin-allowed process.

One also sees numerous incorrect or misleading references in the literature. The emissions of rare-earth elements (e.g., Tb³⁺ and Eu³⁺) and of uranyl are frequently referred to as fluorescences even though their lifetimes are hundreds of microseconds to milliseconds. Furthermore, quantum mechanically these emissions are best described as spin-forbidden processes. Thus, by the criteria of both lifetime and quantum mechanics, these emissions are actually phosphorescences. If there is doubt about the origins of an emission, it is best referred to as a luminescence.

C. Energy Degradation Pathways (Nonradiative Pathways)

It is impossible to talk about luminescence without considering additional nonradiative processes. The anthracene emission spectrum (Fig. 1) is made up of both a fluorescence and a phosphorescence. These emissions occur with the same efficiencies regardless of whether S1 or an upper singlet state is directly excited. Furthermore, because of the weakness of the S0 → T1 absorption, it is usually extremely difficult to excite the triplet state directly. However, efficient phosphorescences on excitation into the singlet states are common. Finally, there are very few molecules that have emission efficiencies (photons emitted per photon absorbed) of close to 100%. These results imply the existence of both efficient nonradiative deactivation pathways and radiationless interconversions between states of the same, and of different, multiplicities.

Figure 3 shows a simplified representation of these additional processes. Relaxation within a manifold of the same multiplicity is called internal conversion. In condensed media, internal conversion is very fast compared with the rates of radiative emission from upper singlet states and accounts for the rarity of efficient upper-level emission. This rapidity arises because of the closeness of the lower levels, the absence of spin restrictions, and the availability of vibrational levels of the lower states that provide an efficient vibrational cascade mechanism for relaxation. In condensed media the best known example of an upper excited state emission is the S2 → S0 fluorescence of azulenes.

Radiationless deactivation to the ground state from S1 is also a special case of internal conversion. However, since it competes directly with the main emission process, it is given the separate term quenching. The decreased rate of quenching from S1 to the ground state is attributable to the much larger energy gap between these two levels compared with the spacing between upper singlets.

Crossing between states of different multiplicities (e.g., singlet to triplets) is also possible even though the process is spin-forbidden. Conversion between states of different multiplicities is called intersystem crossing. Indeed, in some systems with small energy gaps between the singlet and triplet states, and with reduction of the forbiddenness because of spin-orbit coupling, intersystem crossing can be so fast compared with radiative coupling to the ground state that only phosphorescence is observed.

In the triplet manifold as well as in the singlet manifold internal conversion usually causes rapid relaxation to the lowest triplet state before emission occurs. The emitting triplet state is also susceptible to direct quenching to the ground state. Indeed, because of the forbiddenness of phosphorescence, the long-lived triplet state is very susceptible to quenching; room-temperature phosphorescences are relatively rare and generally not very efficient.

In the gas phase, especially at low pressures, where collisions are infrequent, internal conversion is much less rapid due to the absence of solvent or other molecular vibrations to help carry away the excess energy. This reduced efficiency of internal conversion makes upper excited state emissions much more prevalent.

D. Atomic and Molecular Excited States

The states of discrete atoms are described by first determining the one-electron atomic orbitals, then adding the total number of electrons by filling the lowest energy orbitals with two electrons per orbital. The ground state is derived from this configuration. Excited states are then generally derived by considering the configurations arising from promotion of an electron from an occupied to an unoccupied orbital.

For example, the ground-state configuration of atomic mercury is [Xe](4f)¹⁴(5d)¹⁰(6s)², where [Xe] stands for the closed-shell xenon core. If everything but the 6s electrons is denoted as core, the lowest excited states of atomic mercury are given by (core)(6s)¹(7s)¹ and (core)(6s)¹(6p)¹.

The state diagram for atomic mercury and some of the radiative transitions responsible for emissions of Fig. 2 are shown in Fig. 4. The other transitions can be derived from energy differences between states. The state designations based on the quantum numbers S, L, and J are shown above each set of states. The superscript denotes the spin multiplicity of the state, M, and is related to the spin angular momentum quantum number S by M = 2S + 1. The orbital symmetry of the state is determined by the orbital angular momentum quantum number L and is given by the upper case letter. The J quantum number, which arises from coupling of spin and orbital angular momentum and represents the total angular momentum, is the subscript. For example, the 253.65-nm emission line arises from a transition from ³P₁ to the ¹S₀ ground state (S = 1 → S = 0; L = 1 → L = 0; J = 1 → J = 0).

FIGURE 4 Energy level and state diagram for atomic mercury. The term symbols for the states are indicated across the top. Some of the radiative transitions are indicated by solid lines. The orbital configuration is indicated on each state. For example, 6s 6p denotes a (6s)¹(6p)¹ outer-shell configuration, and 7d denotes a (6s)¹(7d)¹. The core is omitted for clarity. [Reprinted with permission from Leverenz, H. W. (1950). “An Introduction to Luminescence of Solids,” John Wiley & Sons, New York.]
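The term-symbol bookkeeping described above can be sketched in a few lines of Python. The parser below is a hypothetical helper, not part of any library; it assumes the symbol is written in flattened form such as "3P1" (multiplicity, letter, J) and covers only the letters S through I.

```python
import re
from fractions import Fraction

# Letter codes for the orbital angular momentum quantum number L = 0, 1, 2, ...
L_LETTERS = "SPDFGHI"


def parse_term_symbol(symbol: str):
    """Split a term symbol such as '3P1' into the tuple (S, L, J)."""
    match = re.fullmatch(r"(\d+)([SPDFGHI])(\d+(?:/\d+)?)", symbol)
    if match is None:
        raise ValueError(f"unrecognized term symbol: {symbol!r}")
    multiplicity = int(match.group(1))
    spin = Fraction(multiplicity - 1, 2)        # from M = 2S + 1
    orbital = L_LETTERS.index(match.group(2))   # S -> 0, P -> 1, D -> 2, ...
    total = Fraction(match.group(3))            # J, written as the subscript
    return spin, orbital, total


def is_spin_allowed(upper: str, lower: str) -> bool:
    """True when both states have the same spin multiplicity."""
    return parse_term_symbol(upper)[0] == parse_term_symbol(lower)[0]


print(parse_term_symbol("3P1"))        # upper state of the 253.65-nm line
print(parse_term_symbol("1S0"))        # ground state of atomic mercury
print(is_spin_allowed("3P1", "1S0"))   # triplet to singlet
```

For the 253.65-nm line the check returns False, since the transition connects a triplet and a singlet; that such a line is nevertheless observed is consistent with the earlier remark that spin is not always a good quantum number for species of high atomic number.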

Molecular excited states are derived in much the same way, except that the orbitals of the system are described by molecular orbital theory. The single-electron molecular orbitals are made up of combinations of atomic orbitals derived from the different atoms in the molecule. Thus, the molecular orbitals extend over the entire molecule and are not localized on a single atom. This delocalization makes for very rich bonding and spectroscopy. As with the atomic case the electrons are added to fill up the lowest energy orbitals in order to derive the ground-state configuration. Excited states usually arise from orbital promotions of electrons from occupied to unoccupied orbitals.

Excited states of molecular systems are derived from a variety of electron configurations. In organic systems the configurations responsible for the low-energy states generally involve π-π∗ and n-π∗ states. The π-π∗ states are derived from the promotion of an electron from a π-bonding to a π-antibonding orbital (e.g., anthracene). The n-π∗ excited states are derived from the promotion of an electron in a nonbonding orbital to a π∗ antibonding orbital; an example is ketones, where an electron in one of the nonbonding oxygen orbitals is promoted to the antibonding π orbital between the carbon and oxygen atoms.

Metal complexes introduce more new states. The coordinating ligands can contribute low-lying π-π∗ or n-π∗ states. Splittings of the degenerate d orbitals by a nonspherical ligand environment can give rise to metal-localized d-d transitions in metal complexes with d electrons. In addition, there are charge-transfer transitions derived from the promotion of an electron from a metal-localized orbital to a ligand-localized orbital or from ligand to metal orbitals.

A comparison of Figs. 1 and 2 shows a remarkable difference between the molecular and the atomic emission spectra. The atomic spectrum is incredibly sharp, while the molecular spectrum is very broad and exhibits regular progressions. The atomic states are simple because of the absence of any other vibrational or rotational states. In contrast, large molecules have a large number of vibrational and hindered rotational states superimposed on the simple energy level diagrams of Figs. 3 and 4. Furthermore, the molecule can exist in a large number of conformations in the solvent matrix, each with a characteristic absorption and emission. These factors result in a broadening of the molecular transitions.

A more complete energy level diagram is given in Fig. 1, where a dominant molecular vibration has its energy levels superimposed on each electronic state. This figure shows why the absorption and emission tend to overlap with, and be mirror images of, one another. A well-defined vibrational progression is characteristic of systems in which there is little distortion on going from the ground to the excited state. Where distortions occur, the vibrational structure is smeared out and the emission band is broadened and red-shifted.

An interesting type of hybrid atomic molecular system is exemplified by rare-earth ions in crystal lattices or in molecular complexes. The electronic configuration of rare-earth ions is (core)(f)ⁿ (n = 0–14). The lowest excited states are derived, not by orbital promotions, but by rearrangement of the electrons within the f shell. Furthermore, these f electrons are so well shielded within the atom that the excited state transitions are very insensitive to the environment around the atom. Thus, the transitions of rare-earth elements look more like atomic transitions than molecular ones. Atomic-state classifications are used because of the small perturbations on the atomic transitions.

Figure 5 shows emission spectra for a neodymium(III)-doped glass at room and liquid-nitrogen temperatures. The quasi-atomic line spectra are very clear, especially at 77 K; compare these spectra with Figs. 1 and 2. Emission narrowing on cooling is common and one of the reasons why emissions are frequently studied at low temperatures. In this case the 77 K emissions are only 0.16 nm wide.

FIGURE 5 Emission spectra of Nd³⁺ in Y₃Al₅O₁₂ at room and liquid-nitrogen temperatures. [Reprinted with permission from Van Uitert, L. G. (1966). In “Luminescence of Inorganic Solids” (P. Goldberg, ed.), p. 516, Academic Press, New York.]

E. Excimers and Exciplexes

Even if one fully understands ground-state chemistry, one may find surprises in the excited-state manifold, where totally unexpected species suddenly appear. A classic example of this is encountered in electrically excited mixtures of Ar and F2. There are no known stable Ar compounds. However, one sees an intense ∗ArF emission derived from

∗Ar + F2 → ∗ArF + F (1)

where asterisks denote excited species. The reason for the existence of ∗ArF but not of ArF is that ∗Ar is not the same chemical species as Ar; it has a completely different electronic configuration. Ar has a closed-shell [Ne](3s)²(3p)⁶ electronic configuration with no free bonding electrons and so does not form ArF. The lowest excited state of ∗Ar, however, is [Ne](3s)²(3p)⁵(4s)¹, which has unpaired s and p electrons. Chemically this configuration is very similar to that of potassium metal; a free s electron is bound to a singly charged core. Not surprisingly the bonding in ∗ArF is ionic and very much like that of KF.

Rare gas chemistry can be even more complex. At high pressures the rare gas halide can react to give triatomic species

∗RgX + Rg → ∗Rg2X (2)

where Rg stands for a rare gas and X for a halogen. Figure 6 shows the emission spectra of several triatomic rare gas compounds.

Excimers and exciplexes are chemically stable excited-state species that can exist only in the excited state and do not have a corresponding ground-state form. Excimers are excited-state dimers formed by the association of two identical subunits. Exciplexes are excited-state complexes formed of two distinctly different subunits. ∗ArF is an exciplex. If the two reactants are chemically similar, the complex is a mixed excimer.

FIGURE 6 Fluorescence spectra of rare-gas trimers in electron-beam-excited high-pressure rare gas-halogen mixtures. [Reprinted with permission from Huestis, D. L., Marowsky, G., and Tittel, F. K. (1983). In “Excimer Lasers—1983,” AIP Conference Proceedings No. 100, Subseries on Optical Science and Engineering No. 3 (C. K. Rhodes, H. Egger, and H. Pummer, eds.), p. 240, American Institute of Physics, New York.]

The classic excimer, and the first to be discovered, is pyrene. Pyrene exhibits no tendency to associate with itself in the ground state. At higher concentrations, however, excited-state pyrene associates strongly with a ground-state pyrene to form the pyrene excimer, which exhibits an intense emission that is shifted to the red of the monomer emission.

Figure 7 shows the excimer emission of pyrene on silica gel. The high-energy structured emission is the pyrene monomer, while the broad low-energy emission derives from excimers formed from closely located adsorbed pyrenes.

An interesting dimeric emission arises in the chemiluminescent reactions of excited-state singlet oxygen. Under the chemical conditions of generation, high concentrations of ¹O₂ exist. Dimer-like species pool energy to produce higher energy emissions:

$${}^{1}\mathrm{O}_2 + {}^{1}\mathrm{O}_2 \rightarrow ({}^{1}\mathrm{O}_2)_2 \rightarrow 2\,\mathrm{O}_2 + h\nu \tag{3}$$

While the emission of ¹O₂ is in the IR, the “dimol” emission is a spectacular red. Combination bands arising from states derived by simultaneous excitation on both oxygen molecules are observed.

FIGURE 7 Time-resolved fluorescence spectra of pyrene adsorbed on solid silica. The spectra are for the following delays after a short excitation pulse: (A) 7–52 nsec, (B) 108–162 nsec, and (C) 347–404 nsec. The structureless 460-nm band is the pyrene excimer, and the structured high-energy emission is the monomer. The excimer has a short lifetime, which enhances its emission at short times. [Reprinted with permission from Ware, W. R. (1983). In “Time-Resolved Fluorescence Spectroscopy in Biochemistry and Biology” (R. B. Cundall and R. E. Dale, eds.), p. 53, Plenum Press, New York.]

Although not an exciplex, a related area of excited-state behavior is acid-base reactions. Again because of the differences in electronic configurations of the ground and excited states, the ground and excited states can have greatly different pK values. That is, the reaction

$$\mathrm{H}^{+} + {}^{*}\mathrm{A}^{-} \;\overset{pK_a^{*}}{\rightleftharpoons}\; {}^{*}\mathrm{HA} \tag{4}$$

has a different pK than the ground-state pKa. This excited-state pKa, or pKa∗, frequently differs from the ground-state pKa by 5–10 pK units. Thus, a strong acid may become a very weak acid in the excited state, or a weak acid may become a super acid. Bases can behave similarly. As protonated and unprotonated forms of a species can exhibit very different properties, emissions of species exhibiting excited-state acid-base chemistry show remarkable and, to the uninitiated, unexpected variations in spectra with pH. Excited-state acid-base reactions are now being employed in luminescence-based pH sensors in biomedical and industrial applications.
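To give a feel for what a shift of 5–10 pK units means, recall that the acid dissociation constant is Ka = 10^(−pKa), so each pK unit is a factor of ten in Ka. The sketch below uses hypothetical pKa values chosen only for illustration; they do not refer to any specific compound.

```python
def acidity_ratio(pka_ground: float, pka_excited: float) -> float:
    """Ratio Ka(excited) / Ka(ground), using Ka = 10 ** (-pKa)."""
    return 10.0 ** (pka_ground - pka_excited)


# Hypothetical weak acid: pKa 10 in the ground state, pKa* 4 when excited.
ratio = acidity_ratio(pka_ground=10.0, pka_excited=4.0)
print(f"Excited state is {ratio:.0e} times more acidic")
```

A change of 6 pK units therefore corresponds to a millionfold change in the dissociation constant.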

The excited-state reaction of Eq. (4) can involve species other than protons. Metal ions and proteins can react with the excited-state species (or the ground-state species, which is then excited) to produce a new excited-state complex with different emission properties (e.g., wavelength, lifetime, quantum yield, or emission polarization). These changes can be used to quantitate the analyte. Many of the modern methods of analysis and probes of biomolecules, polymers, and surfaces are based on such changes.

F. Band States

In strongly interacting solids, the concept of localized atomic or molecular states fails. Orbitals on adjacent atoms or molecules can interact so strongly that the molecular orbitals of the composite system must be described as extending over the entire lattice rather than being localized on a specific atom or molecule. A consequence of this is that there are no longer discrete states of the system, but a series of very closely spaced levels that make up bands. The lower levels are valence bands, and the upper unoccupied levels are conduction bands. Conduction occurs when the electrons in the valence band are promoted to the conduction band, where they are free to move through the lattice. Depending on the energy gap (forbidden band) between the filled valence band and the conduction band, the solid is an insulator, semiconductor, or conductor. A small gap permits electrons to be thermally excited to the conduction band at room temperature for conductors or semiconductors. In insulators, the gap is too large to yield any appreciable concentration of charge carriers. Luminescence seems to be restricted to semiconductors and insulators.

Excitation of an electron from the valence to the conduction band produces an excited state of the system, which can be treated as any other type of excited state and can give rise to luminescence. Electron promotion leaves behind a positively charged center or hole. Both the electron and the hole can move freely through the solid and are responsible for photoconductivity. This system is shown schematically in Fig. 8.

FIGURE 8 Representation of radiative and nonradiative processes in solids. The lower striped area is the valence band, and the upper is the conduction band. [Reprinted with permission from Sze, S. M. (1981). “Physics of Semiconductor Devices,” 2nd ed., Wiley, New York.]

The electron and the hole can undergo secondary processes that influence emission. Both can be trapped at sites in the lattice. Traps may be defects in the lattice that arise from missing ion sites, interstitial ions, or replacement of normal lattice ions with impurities that may introduce additional lattice defects. Furthermore, other species may function directly as traps if they are easily oxidized or reduced. Figure 8 also shows a schematic representation of electron and hole trapping. An electron is trapped by dropping from the conduction band into a potential energy well. A hole is trapped by pulling an electron from an oxidizable site.

These band states are very sensitive to the size of the semiconductor particle. Emission colors of CdS nanoparticles can vary from the blue to the IR. Such particles are currently being adapted to a variety of analytical systems.

Trapped holes or electrons can be well-defined species with their own spectroscopy, including characteristic absorption and emission spectra. Such systems are called color centers because of their characteristic colors. For example, an electron trapped in a halide ion vacancy in an alkali halide lattice is called an F center. Sodium chloride has a yellow F center, potassium chloride a magenta one, and potassium bromide a blue one. These F centers can undergo reasonably efficient low-temperature emission. Doping of halide matrices with a silver activator can produce a number of new types of centers involving such species as Ag²⁺, Ag⁰, Ag₂⁺, and Ag₂⁰.

Emission can result from direct recombination of the conduction electron with the hole. More commonly, luminescence in band solids arises from impurities. Electrons in the conduction band can relax back to a hole close to the activator; the energy released excites the activator, which luminesces. The transition to the trap itself may be radiative. Alternatively, if the hole is trapped by oxidizing the activator, the recombination is a reduction of the center; the chemical energy released by this reaction can lead to excitation of the center. This is a form of chemiluminescence.

IV. METHODS OF STUDYING AND CHARACTERIZING EXCITED STATES

Excited-state processes are usually studied and characterized by the following general approaches: emission and excitation spectra, luminescence efficiencies, polarization, temporal behavior, temperature effects, interactions with other species, double-resonance methods, fluorescence line narrowing, spin echoes, transient gratings, and site-selective spectroscopy. This information is correlated with absorption processes. Several of the most common approaches are discussed.

An emission spectrum is the relative intensity of emission as a function of wavelength. Data are generally acquired by scanning through the emission with a monochromator. The relative intensity is measured with an optical detector such as a photomultiplier tube or semiconductor detector. These directly obtained data are not corrected for the transmission characteristics of the optics or for the variations in the detector’s sensitivity with wavelength. Uncorrected spectra may bear little resemblance to true luminescence spectra and must be corrected, usually by calibrating the response of the system with a source of known spectral distribution such as a standard lamp.
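The standard-lamp calibration can be sketched as a pointwise ratio. In the Python sketch below, the correction factor at each wavelength is the lamp's known relative output divided by the instrument's reading of that lamp; all names and numbers are hypothetical, and the sketch assumes that the lamp and the sample are recorded on the same wavelength grid with the same instrument settings.

```python
def correction_factors(lamp_known, lamp_measured):
    """Per-wavelength factors: known lamp output / measured lamp signal."""
    if len(lamp_known) != len(lamp_measured):
        raise ValueError("lamp arrays must share one wavelength grid")
    return [known / measured for known, measured in zip(lamp_known, lamp_measured)]


def correct_spectrum(raw_spectrum, factors):
    """Apply the instrument correction to a raw (uncorrected) spectrum."""
    if len(raw_spectrum) != len(factors):
        raise ValueError("spectrum and factors must share one wavelength grid")
    return [signal * factor for signal, factor in zip(raw_spectrum, factors)]


# Hypothetical three-point example (relative units).
lamp_known = [1.0, 2.0, 4.0]      # certified relative lamp output
lamp_measured = [0.5, 2.0, 2.0]   # what the instrument reported for the lamp
factors = correction_factors(lamp_known, lamp_measured)
print(correct_spectrum([3.0, 3.0, 3.0], factors))
```

In this toy case a raw spectrum that looks flat is in fact stronger at both ends of the range, which illustrates how an uncorrected spectrum can misrepresent the true band shape.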

Excitation spectra are obtained by measuring the relative emission intensity at a fixed wavelength while scanning the excitation source. For weakly absorbing solutions, the amount of light absorbed will be directly proportional to the sample absorbance. If the emission efficiency is independent of excitation wavelength, then the excitation spectrum will match the absorption spectrum. As with emission spectra the directly obtained spectra are distorted by the variations in light output of the source versus wavelength. Data are corrected by measuring the excitation source intensity as a function of wavelength.

Excitation-emission matrices (EEM) are two-dimensional plots of excitation and emission spectra. These are invaluable for characterizing complex mixtures. The matrix can provide a unique fingerprint for complex mixtures, and, as such, it is useful in identifying and tracing complex mixtures such as oil spills.
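The proportionality for weakly absorbing solutions follows from the Beer-Lambert law: the fraction of light absorbed is 1 − 10^(−A), which approaches ln(10)·A when the absorbance A is small. The sketch below compares the exact and linearized expressions; the absorbance values are arbitrary illustrations.

```python
import math


def fraction_absorbed(absorbance: float) -> float:
    """Exact fraction of incident light absorbed, 1 - 10**(-A)."""
    return 1.0 - 10.0 ** (-absorbance)


def linear_estimate(absorbance: float) -> float:
    """Small-absorbance approximation, ln(10) * A."""
    return math.log(10.0) * absorbance


def linear_error(absorbance: float) -> float:
    """Relative error of the linear estimate against the exact value."""
    exact = fraction_absorbed(absorbance)
    return abs(linear_estimate(absorbance) - exact) / exact


for a in (0.01, 0.1, 1.0):
    print(f"A = {a}: exact {fraction_absorbed(a):.4f}, "
          f"linear {linear_estimate(a):.4f}, error {linear_error(a):.1%}")
```

The linear form is good to about 1% at A = 0.01 but fails badly by A = 1, which is why the excitation spectrum tracks the absorption spectrum only for dilute samples.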

Excited-state lifetime measurements are extremely useful diagnostic tools of excited-state processes. The standard method is to excite the sample with a pulse that is shorter in duration than the decay phenomena and then watch the relaxation by monitoring the luminescence. It is also possible to monitor the decay by following the excited-state absorption spectrum or the electron spin resonance spectrum. Using mathematical techniques such as deconvolution, one can also measure lifetimes appreciably shorter than the excitation. For extremely short decays, picosecond pulse-probe techniques are used. Here, a sample is probed using an optical delay line where the time between excitation and monitoring is set by adjusting the distance the probe pulse travels before striking the sample. Lifetimes in the low nanosecond range are readily measured using emission relaxation methods, while picosecond methods measure subpicosecond decays.
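The optical delay line works because light covers a known distance in a known time. The sketch below converts an extra probe path length into a time delay, assuming propagation at the vacuum speed of light (a close approximation in air); the function name is illustrative.

```python
LIGHT_SPEED_M_S = 2.99792458e8  # speed of light in vacuum, m/s


def probe_delay_ps(extra_path_mm: float) -> float:
    """Delay in picoseconds produced by an extra probe path length in mm."""
    return extra_path_mm * 1e-3 / LIGHT_SPEED_M_S * 1e12


for path_mm in (0.3, 3.0, 30.0):
    print(f"{path_mm} mm of extra path -> {probe_delay_ps(path_mm):.2f} ps")
```

About 0.3 mm of additional path corresponds to 1 ps, so picosecond timing reduces to positioning an optical element with sub-millimeter precision. If the delay is produced by a mirror that sends the beam out and back, the extra path is twice the mirror displacement.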

An alternative approach to lifetime determinations is the phase shift measurement, where one excites the sample with a sinusoidally modulated excitation. The emission is sinusoidal and phase-shifted from the excitation. The phase shift is related to the lifetime and the modulation frequency. A variation is to use a very short duration, high repetition source such as a mode-locked laser or a synchrotron. Such a source can be decomposed into the fundamental at the repetition frequency and the higher harmonics. The individual Fourier components of the excitation can be used to simultaneously evaluate the decay times at multiple frequencies.
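For a decay that is a single exponential, the standard relation between phase shift and lifetime is tan φ = 2πfτ, where f is the modulation frequency. The text does not state this formula explicitly, so the sketch below should be read as an illustration under the single-exponential assumption; the function names and the sample values are hypothetical.

```python
import math


def phase_from_lifetime(lifetime_s: float, frequency_hz: float) -> float:
    """Phase shift in radians for a single-exponential decay."""
    return math.atan(2.0 * math.pi * frequency_hz * lifetime_s)


def lifetime_from_phase(phase_rad: float, frequency_hz: float) -> float:
    """Invert tan(phi) = 2*pi*f*tau to recover the lifetime in seconds."""
    if not 0.0 < phase_rad < math.pi / 2.0:
        raise ValueError("phase must lie between 0 and 90 degrees")
    return math.tan(phase_rad) / (2.0 * math.pi * frequency_hz)


# Hypothetical case: a 10-ns emitter modulated at 10 MHz.
phase = phase_from_lifetime(10e-9, 10e6)
print(f"phase shift: {math.degrees(phase):.1f} degrees")
print(f"recovered lifetime: {lifetime_from_phase(phase, 10e6) * 1e9:.2f} ns")
```

The phase is most sensitive to the lifetime when 2πfτ is near 1, which is why the modulation frequency is chosen to suit the expected decay time.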

In principle, multifrequency phase shift and pulse measurements provide similar information. Commercial packages are available for both types of measurements. Both types of measurements appear regularly in the literature, and the choice depends on the nature of the problem (i.e., wavelength range, lifetime, required dynamic range, funds) and the researcher’s personal preference. Many workers like to see actual decay curves (the sample impulse response) rather than a set of phase shifts versus modulation frequency. However, with the advent of inexpensive, bright, easily modulated LEDs for frequencies into the megahertz range and sophisticated high frequency signal processing, the phase-shift measurement will certainly dominate analytical instruments based on lifetime measurements for the foreseeable future.

As it turns out, many practical and fundamentally interesting systems are not characterized by a single decay time, but rather by sums of multiple exponentials or even more complex decays. This problem is mirrored in phase-shift measurements where the lifetimes determined at different frequencies differ because the decay is not a single exponential. Such complexity is the rule rather than the exception in biological systems, solid-state composite sensors, or dynamic multicomponent molecules in solution. Fortunately, the mathematical tools and the extraordinary power of inexpensive desktop computers are allowing a successful assault on these important systems.
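The frequency dependence of the apparent lifetime can be seen in a short sketch. It assumes a decay of the form Σ aᵢ exp(−t/τᵢ) and the standard sine and cosine transform expressions for the phase; the amplitudes and lifetimes are made-up illustration values.

```python
import math

def apparent_phase_lifetime(amplitudes, lifetimes, freq_hz):
    """Apparent lifetime tan(phi)/w for a sum-of-exponentials decay.
    tan(phi) is the ratio of the sine to the cosine transform of the decay."""
    w = 2.0 * math.pi * freq_hz
    sine = sum(a * w * t * t / (1.0 + (w * t) ** 2)
               for a, t in zip(amplitudes, lifetimes))
    cosine = sum(a * t / (1.0 + (w * t) ** 2)
                 for a, t in zip(amplitudes, lifetimes))
    return (sine / cosine) / w

amps, taus = (0.5, 0.5), (1e-9, 10e-9)  # hypothetical two-component decay
for f in (1e6, 10e6, 100e6):
    print(f, apparent_phase_lifetime(amps, taus, f))
```

For a single component the function returns the true lifetime at every frequency; for the two-component case the apparent value drops as the frequency rises, which is the signature of a non-exponential decay described above.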

In terms of the rate constants and paths indicated in Fig. 3, the fluorescence and phosphorescence lifetimes are given by

$$\tau_f = \frac{1}{k_f + k_{qS} + k_{isc}} \qquad \text{(5a)}$$

$$\tau_p = \frac{1}{k_p + k_{qT}} \qquad \text{(5b)}$$

where the subscripts f and p denote the fluorescence and phosphorescence processes, respectively, q denotes a quenching path, and S and T denote processes from the singlet and triplet states, respectively; kisc is the rate constant for intersystem crossing between S1 and the triplet manifold.

Luminescence quantum efficiencies (photons emitted per photon absorbed) are given by

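$$\Phi_f(S_1) = k_f \tau_f \qquad \text{(6a)}$$

$$\Phi_p(S_1) = \Phi_{isc}\, k_p \tau_p \qquad \text{(6b)}$$

$$\Phi_p(T_1) = k_p \tau_p \qquad \text{(6c)}$$

$$\Phi_{ic} = \frac{k_{ic}}{k_{ic} + k_{qS}} \qquad \text{(6d)}$$

$$\Phi_{isc} = \frac{\Phi_p(S_1)}{\Phi_p(T_1)} \qquad \text{(6e)}$$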

where Φic is the efficiency of internal conversion from the upper excited singlet to S1 and Φisc is the efficiency of intersystem crossing between S1 and T1. The parenthetical


FIGURE 9 Absorption (solid line) and excitation spectrum (points) of trans-[RhBr2py4]+ (py = pyridine) at 77 K. Curve A is for the scale on the left and curve B for the scale on the right. The excitation spectrum is normalized to the absorption maximum. The relative emission efficiency as a function of excitation energy is in the upper graph. [Reprinted with permission from Demas, J. N., and Crosby, G. A. (1970). J. Am. Chem. Soc. 92, 7626. Copyright 1970 American Chemical Society.]

S1 and T1 denote the state into which the photons are absorbed.

Note that, by measuring the fluorescence efficiency on excitation into the emitting and any upper state, one can determine the internal conversion efficiency. By measuring the phosphorescence efficiency on excitation to S1 and T1, one can determine the intersystem crossing efficiency. Furthermore, from the luminescence yields, the lifetimes of each state, and the intersystem crossing efficiency, one can determine kf, kp, kisc, and kq, which largely define the dynamics of the lower excited-state process.
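The bookkeeping described in this paragraph can be sketched as follows. The sketch takes the intersystem crossing efficiency as the ratio of the phosphorescence yields on S1 and T1 excitation, as stated above, and additionally assumes Φisc = kisc τf, which the article implies but does not write out. The function name and the numbers in the example are hypothetical.

```python
def rate_constants(phi_f, tau_f, phi_p_s1, phi_p_t1, tau_p):
    """Recover rate constants from measured yields and lifetimes."""
    k_f = phi_f / tau_f                   # from Phi_f(S1) = k_f * tau_f
    k_p = phi_p_t1 / tau_p                # from Phi_p(T1) = k_p * tau_p
    phi_isc = phi_p_s1 / phi_p_t1         # ratio of phosphorescence yields
    k_isc = phi_isc / tau_f               # assumption: Phi_isc = k_isc * tau_f
    k_qs = 1.0 / tau_f - k_f - k_isc      # from 1/tau_f = k_f + k_qS + k_isc
    k_qt = 1.0 / tau_p - k_p              # from 1/tau_p = k_p + k_qT
    return {"k_f": k_f, "k_p": k_p, "phi_isc": phi_isc,
            "k_isc": k_isc, "k_qS": k_qs, "k_qT": k_qt}

# Hypothetical measurements: 5-ns fluorescence, 10-ms phosphorescence.
print(rate_constants(phi_f=0.5, tau_f=5e-9,
                     phi_p_s1=0.025, phi_p_t1=0.1, tau_p=0.01))
```

With these inputs the sketch returns kf = 1e8 s⁻¹, kisc = 5e7 s⁻¹, kqS = 5e7 s⁻¹, kp = 10 s⁻¹, and kqT = 90 s⁻¹, up to floating-point rounding.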

An example of these methods is illustrated in Fig. 9, which shows the corrected excitation and absorption spectra as well as the relative luminescence yield of trans-dibromotetra(pyridine)rhodium(III) bromide.

The broad band red emission exhibits a lifetime of 500 µsec, which clearly indicates a spin-forbidden phosphorescence. There is no fluorescence; therefore, kqS + kisc ≫ kf. The relatively intense bands at 25 and 26 kK (1 kK = 1000 cm−1) correspond to an S0 → ¹(d-d) transition, where d-d indicates an excited state derived within the metal-localized d-orbitals. The much weaker 20-kK band is the spin-forbidden S0 → ³(d-d) excited-state transition and is the inverse of the 15-kK emission. The intense band starting at 29 kK is another metal-localized state. If relaxation from all levels to the emitting level proceeds with 100% efficiency (i.e., Φic = Φisc = 1), the excitation spectrum should match the absorption spectrum. In this example, the invariance with wavelength of the emission yield on excitation into different singlet states and the emitting triplet level leaves no doubt as to the unity efficiency relaxation of all upper levels to the emitting level. Radiationless rate constants for deactivation of the emitting level are also easily calculated and can be correlated with theories.

Polarized emission spectra can be obtained in either crystals or randomly oriented samples. The direction and degree of polarization of the emission relative to the polarization of the excitation beam are recorded. From the variations of this polarization as a function of the absorption bands excited, one can frequently infer the molecular axis along which the emission originates or the absorptions arise.

An exceptionally powerful tool for unraveling the dynamics of excited-state geometry changes is time-resolved polarization anisotropy. Basically, one looks at the degree of emission polarization following excitation by a short polarized excitation pulse. If the molecules stay in a fixed orientation during the emission, the polarization will remain constant during the decay. If the molecule rotates during the emission, the degree of polarization will fall as the originally ordered system becomes randomized. From the kinetics of the depolarization one can map out the nature and the rates of such depolarization processes as energy transfer and localized or whole molecule rotations. This method is invaluable for studying the dynamics of motion of large biomolecules.

Rotational anisotropy and steady-state depolarization are also proving to be powerful analytical approaches. Many of the new methods of fluoroimmunoassay are based on changes in the rotational depolarization time as an analyte binds to fluorescently labeled probes. In addition, much of what we are learning about dynamics in biomolecules, including proteins, DNAs, and membranes, involves luminescence measurements, especially dynamic depolarization methods.

Temperature effects on luminescence efficiencies, lifetimes, and spectral distributions are valuable diagnostic tools for finding the energies of excited states and for exploring excited state relaxation processes. For example, Fig. 10 shows the excited-state lifetime and emission yield for [Ru(bpy)3]2+ (bpy = 2,2′-bipyridine). The odd temperature dependence of the emission can be ascribed to the existence of three states, which are all in thermal equilibrium with one another. Each state has a characteristic radiative and radiationless lifetime as well as a different emission yield. The variation in lifetime with temperature arises from the variation in Boltzmann population of the three levels. The fitting of these temperature curves permitted determination of the energy spacing of the levels and their lifetimes. State assignments were then based on these results.
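A minimal sketch of how Boltzmann populations produce a temperature-dependent lifetime follows. It assumes the levels stay in thermal equilibrium, so that the observed decay rate is the population-weighted average of the individual level decay rates. The level energies and lifetimes below are invented illustration values, not the fitted parameters of Fig. 10.

```python
import math

K_B_CM = 0.695  # Boltzmann constant in cm^-1 per kelvin (approximate)

def equilibrium_lifetime(energies_cm, level_lifetimes_s, temperature_k,
                         degeneracies=None):
    """Observed lifetime of a set of thermally equilibrated levels."""
    if degeneracies is None:
        degeneracies = [1.0] * len(energies_cm)
    weights = [g * math.exp(-e / (K_B_CM * temperature_k))
               for g, e in zip(degeneracies, energies_cm)]
    # Population-weighted average of the decay rates 1/tau_i.
    rate = sum(w / tau for w, tau in zip(weights, level_lifetimes_s)) / sum(weights)
    return 1.0 / rate

energies = [0.0, 10.0, 60.0]        # hypothetical level energies, cm^-1
lifetimes = [200e-6, 20e-6, 0.6e-6]  # hypothetical level lifetimes, s
for t in (2, 4, 10, 30, 77):
    print(t, equilibrium_lifetime(energies, lifetimes, t))
```

At the lowest temperatures only the bottom level is populated and its lifetime is observed; as the temperature rises the shorter-lived upper levels are populated and the observed lifetime falls. Fitting such a curve to data is what yields the level spacings and lifetimes mentioned above.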


FIGURE 10 Effects of temperature on the lifetime (A) and luminescence efficiency (B) of [Ru(bpy)3]2+ (bpy = 2,2′-bipyridine) in poly(methyl methacrylate). The solid lines are the calculated curves using the three-level energy diagram shown. For each sublevel, the efficiency as well as the radiative and nonradiative lifetimes used to fit the curves are shown. [Reprinted with permission from Harrigan, R. W., Hager, G. D., and Crosby, G. A. (1973). Chem. Phys. Lett. 21, 487.]

This temperature dependence was suggested as a cryogenic thermometer. Especially in the sub-liquid hydrogen (20 K) region, temperature measurements are difficult. A lifetime-based imaging system would allow a continuous spatial readout of the temperature over a complex object. While we are unaware of measurements at such low temperatures, an area where spatial resolution of temperature is important is in wind tunnels where temperatures may vary from ambient to 100 K. Luminescence intensities of metal complexes in conjunction with pressure-sensitive luminescence paints (PSPs) are now routinely used in temperature-sensitive paints (TSPs) on models in wind tunnels.

Another interesting temperature-related effect is E-type delayed fluorescence. This is a long-lived emission with a spectrum that is indistinguishable from the prompt fluorescence. Delayed fluorescence arises by thermal back-population of the emitting state. Thus, the triplet functions as a storage state, and fluorescence tracks the triplet concentration.

One of the major advances in luminescence has been the advent of fluorescence microscopy (FM). It is now possible to routinely carry out luminescence measurements of objects with submicron resolution. This has revolutionized our understanding of intra- and intercellular processes, membranes, polymers, and surfaces. For example, dyes have been designed that change their luminescence on binding to specific metal ions such as calcium. One can incorporate these dyes into living cells and watch the migration and fluctuations of ion concentrations as the cells go through different processes. Fluorescent-tagged monoclonal antibodies are used in clinical diagnosis. Structural changes and the organization of membranes and monolayers can be followed by staining various domains with dyes. Dynamics on surfaces are easily followed by use of luminescent tracers. The proximity of two proteins can be judged by monitoring energy transfer between a donor and an acceptor on the different species.

Conventional FM is sufficiently sensitive that it is easy to see individual dye-stained DNA molecules by eye. The DNAs are stained with intercalating dyes; each DNA duplex is stained with many dye molecules (about one dye per four base pairs). The large number of fluorophores per strand allows easy visualization. Further, since the DNAs are longer than the optical resolution, one can actually see the long strands. Contrary to the common view of DNAs as rigid, static molecules, they actually look more like very active worms squirming around, coiling and uncoiling.

Conventional FM simply replaces the traditional transmission-illuminating source with a powerful narrow band excitation source. The sample is viewed through filters that block the excitation and monitor only the sample emission. Traditionally, images were just viewed or photographed. However, low cost and/or ultrasensitive charge-coupled device (CCD) image detectors have largely replaced film. The primary distortion of conventional microscopy is that the detector sees contributions from the sample lying above and below the focal plane, which smears the image.

Confocal microscopy minimizes the contribution from out-of-focal-plane images. It uses a raster scan technique to generate the full image. The sample is excited by a tightly focused laser beam. By imaging this small volume through a spatial filter, contributions of the out-of-focus image are reduced or virtually eliminated. This ability to examine a single focal plane allows confocal microscopy to section an image into layers by varying the depth of the image plane. Then, using 3D imaging software, the scientist can slice, dice, and rotate the image in three dimensions, which allows previously impossible examination of details.

The resolution of conventional and confocal microscopy is determined by the diffraction limit of the light, or about λ/2. However, resolution can be further improved by using two-photon excitation with a tightly focused beam. Materials with good two-photon cross sections can be excited at half their absorption wavelength. Since the emission intensity varies as the square of the excitation flux, and the laser beam intensity falls off rapidly away from the maximum, the effective resolution is closer to λ/3. Figure 11 shows a beautiful example of a collection of single Rhodamine B dye molecules on a surface. Each spike represents a single fluorescent molecule. The widths of the peaks are determined by the optical resolution (about 250 nm), but the molecules are far enough apart to give clear peaks. In addition, background fluorescence from


FIGURE 11 Fluorescence image of single immobilized Rhodamine B dye molecules (5 µm × 5 µm field) dispersed on a glass substrate taken with two-photon excitation. Each peak was due to a single molecule with the FWHM being 250 nm (<λ/3). The femtosecond pulse train at the sample had an average power of 400 µW, center wavelength of 840 nm, and pulse width of 180 fs. [Adapted from Sanchez, E. J., Novotny, L., Holtom, G. R., and Xie, X. S. (1997). J. Phys. Chem. A 101, 7020. Used with permission of the copyright holder, the American Chemical Society, 1997.]

impurities and Rayleigh and Raman scattering are much less efficient at the longer wavelengths, and detection limits can actually improve in spite of the small absorption cross section of two-photon processes.
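The resolution gain from the square-law dependence of two-photon excited emission can be checked numerically. The sketch below assumes a Gaussian focal intensity profile (an idealization, not a claim about the actual beam) and shows that squaring the profile narrows its full width at half maximum by a factor of √2, which turns a width of about λ/2 into roughly λ/2.8 for the same excitation wavelength. Function names are hypothetical.

```python
import math

def gaussian_profile(x, fwhm):
    """Gaussian intensity profile with unit peak and the given FWHM."""
    return math.exp(-4.0 * math.log(2.0) * (x / fwhm) ** 2)

def half_max_width(profile, x_max=10.0):
    """FWHM of a symmetric profile peaked at x = 0, found by bisection
    on the positive side for the point where it drops to half the peak."""
    lo, hi = 0.0, x_max
    half = 0.5 * profile(0.0)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if profile(mid) > half:
            lo = mid
        else:
            hi = mid
    return 2.0 * lo

one_photon = half_max_width(lambda x: gaussian_profile(x, 1.0))
two_photon = half_max_width(lambda x: gaussian_profile(x, 1.0) ** 2)
print(one_photon, two_photon, one_photon / two_photon)
```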

Further expanding the capabilities of fluorescence microscopy is the addition of lifetime-imaging capabilities. This area is called fluorescence lifetime imaging (FLIM). One can get two- and three-dimensional imaging of species based not only on their absorption and emission characteristics, but also on their lifetimes. This allows better discrimination in complex mixtures.

Major advances in imaging are in the design of selective and sensitive probes for different analytes. Genetic engineering has proved invaluable in cloning fluorophores or reactive sites for attaching fluorophores in specific target molecules. One common tool is green fluorescent jellyfish protein (GFP) and related proteins. Green fluorescent protein has been cloned into mice, which then glow a beautiful green when exposed to a black light. Figure 12 shows an example of a single cell by FM.

V. PROCESSES AFFECTING LUMINESCENCE

A. Intramolecular Processes

Intramolecular excited-state processes and methods of studying them have been discussed in Sections III and IV. A few additional aspects need to be briefly mentioned.

FIGURE 12 Fluorescence image of a cell in which cyan fluorescent protein (CFP), a mutant form of green fluorescent protein (GFP), labels the paxillin protein, which is one of the focal adhesion proteins. Cyan fluorescent protein targets the paxillin protein. The small lines all over the cell represent paxillin protein, with the nucleus at the center. The proteins are in the cytoplasmic area of the cell. The image is 128 microns × 102 microns and was taken with an Olympus IX-70 microscope with a 60× water immersion lens and a Hamamatsu Orca-2 CCD camera. [Image courtesy of A. Periasamy of the Keck Cellular Imaging Center at the University of Virginia.]

Intramolecular processes are very sensitive to the environment and to excited state types and ordering. For example, n-π∗ excited states exhibit very efficient intersystem crossing, while π-π∗ states do not. Thus, ketones such as benzophenone, the lowest excited states of which are n-π∗, exhibit almost pure phosphorescence, while biphenyl, the lowest excited state of which is π-π∗, exhibits predominantly fluorescence because of less efficient intersystem crossing. Similarly, molecules that phosphoresce strongly because the lowest excited state is n-π∗ can be converted to strongly fluorescent materials by raising the n-π∗ state above the fluorescent π-π∗ state. This can frequently be done by using a hydrogen-bonding solvent, which ties up the nonbonding electrons and makes their removal energetically more difficult.

Rare-earth chelate complexes with organic ligands exhibit a form of intramolecular energy transfer. Rare-earth compounds generally absorb weakly, but strongly absorbing compounds can be made by coupling a strongly absorbing ligand to the metal ion. If the energy levels on the ligand are above the emitting rare-earth level, then efficient energy transfer from the ligand-localized excited state to the metal state can occur. Indeed, in many of these systems, only metal emissions are observed, even though all of the excitation is into the organic ligand.

Another example is radiationless coupling of energy from a species to adjacent or bound solvent molecules.


H2O is a much more efficient deactivator of many molecular systems than D2O, because of its higher vibrational frequencies, which more easily bridge the gap between the ground and excited states. For example, crystalline Eu3+ and Tb3+ hydrates are more than 10 times less luminescent than the corresponding deuterates. This behavior has been used for counting the number of water molecules around metal ions in proteins or the average solvent exposure of metal complex sensitizers in organized media such as micelles.

B. Bimolecular Processes

Excited states that persist for any appreciable time are also susceptible to quenching by other chemical species. Excited-state deactivators are called quenchers. Quenching can occur by a variety of processes:

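$$
{}^{*}\mathrm{D} + \mathrm{Q} \xrightarrow{\;k_2\;}
\begin{cases}
\mathrm{D} + \mathrm{Q} + \Delta & \text{(7a)} \\
\mathrm{D} + {}^{*}\mathrm{Q} & \text{(7b)} \\
\mathrm{D}^{+} + \mathrm{Q}^{-} & \text{(7c)} \\
\mathrm{D}^{-} + \mathrm{Q}^{+} & \text{(7d)} \\
\text{reaction products} & \text{(7e)}
\end{cases}
$$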

where k2 is the bimolecular rate constant for deactivation of the excited state. If more than one deactivation pathway is present, k2 is the sum of all processes affecting the state.

Equation (7a) denotes catalytic deactivation of the excited state. For example, external heavy atoms can increase spin-orbit coupling in a luminescent molecule during a collision and enhance intersystem crossing from a luminescent singlet to a nonemissive triplet or quench a triplet state back to the ground state.

Equation (7b) indicates energy transfer from a donor to an acceptor molecule. Efficient energy transfer quenching must generally be an exothermic reaction. The photodynamic effect is an example of bimolecular energy transfer. A dye in the presence of both light and oxygen will kill organisms. Chemically very reactive singlet oxygen is formed by energy transfer deactivation of dye triplet states. The chemically reactive singlet oxygen then kills the organism.

Electron transfer quenching denoted by Eqs. (7c) and (7d) can be either oxidative or reductive depending on whether the excited species is oxidized or reduced. Electron transfer forms the basis of a number of solar energy conversion schemes and can occur in the excited state even though the ground-state species are thermodynamically stable. This again points out the great differences in chemistry that can arise between ground- and excited-state reactions, and stresses the chemical uniqueness of the excited state relative to the ground state.

Finally, Eq. (7e) indicates any other type of chemical reaction. Examples include the addition of singlet oxygen to an olefin or the abstraction of a hydrogen atom from a protic solvent by a triplet ketone.

The kinetics of Eqs. (7) yield two Stern-Volmer equations relating emission intensity or lifetime to the concentration of the quencher,

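$$\frac{\Phi_0}{\Phi} - 1 = K_{SV}[\mathrm{Q}] \qquad \text{(8a)}$$

$$\frac{\tau_0}{\tau} - 1 = K_{SV}[\mathrm{Q}] \qquad \text{(8b)}$$

$$K_{SV} = k_2 \tau_0 \qquad \text{(8c)}$$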

where the Φ’s are emission intensities and τ’s are excited state lifetimes. The subscript 0 denotes the value in the absence of quencher. The KSV’s and k2’s are Stern-Volmer and bimolecular quenching constants, respectively. Thus, plots of the experimentally determined left-hand side versus [Q] provide KSV’s and k2’s if the unquenched lifetime is known. The k2’s, in particular, provide a great deal of fundamental information concerning the interaction of excited states with quenchers.
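The plotting procedure just described amounts to fitting a straight line through the origin. The following minimal sketch does this by least squares and then divides the slope by the unquenched lifetime to obtain the bimolecular constant. The data are invented for illustration, and the function names are hypothetical.

```python
def stern_volmer_constant(quencher_conc, ratio_unquenched_to_quenched):
    """Least-squares slope through the origin of (ratio - 1) versus [Q].
    The ratio may be an intensity ratio or a lifetime ratio."""
    y = [r - 1.0 for r in ratio_unquenched_to_quenched]
    num = sum(q * v for q, v in zip(quencher_conc, y))
    den = sum(q * q for q in quencher_conc)
    return num / den

def bimolecular_rate_constant(k_sv, tau0_s):
    """k2 from K_SV and the unquenched lifetime tau0."""
    return k_sv / tau0_s

conc = [0.0, 0.001, 0.002]   # hypothetical quencher concentrations, mol/L
ratios = [1.0, 1.5, 2.0]     # hypothetical unquenched/quenched ratios
k_sv = stern_volmer_constant(conc, ratios)
print(k_sv, bimolecular_rate_constant(k_sv, 1e-6))
```

With these made-up numbers the slope is 500 L/mol, and for an assumed unquenched lifetime of 1 µs the bimolecular constant is 5e8 L mol⁻¹ s⁻¹.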

Figure 13A shows a typical intensity Stern-Volmer plot for luminescence quenching by molecular oxygen of a metal complex in a polymer support. In solution, the plot would be a straight line, but in the polymer different molecules are in different environments, which leads to the characteristic downward curvature of the plot. Even though the complex is dissolved in a solid elastomer, the emission is strongly quenched by oxygen (25-fold at 1 atm of pure oxygen and 8-fold at 1 atm of air). This sensitivity to oxygen has led to the development of these systems as

FIGURE 13 (A) Intensity Stern-Volmer quenching plot for [Ru(4,7-Ph2phen)3]2+ (4,7-Ph2phen = 4,7-diphenyl-1,10-phenanthroline) in silicone rubber [Carraway, E. R., Demas, J. N., DeGraff, B. A., and Bacon, J. R. (1991). Anal. Chem. 63, 337.] The solid line is the best fit for a two-site model. (B) Luminescence intensity of [Ru(4,7-Ph2phen)3]2+ in silicone rubber while being breathed over [Bacon, J. R., and Demas, J. N. (1987). Anal. Chem. 59, 2780. Copyright 1991 and 1987 American Chemical Society.]


oxygen sensors in biomedical and industrial applications where they are rapidly replacing the traditional Clark electrode sensors. The term quenchometric sensors is used to describe systems that detect an analyte by quenching of an excited state.
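The downward curvature of the Stern-Volmer plot in a heterogeneous support can be reproduced with a two-site model of the kind named in the caption of Fig. 13. The sketch below assumes two populations of emitters, each obeying its own linear Stern-Volmer law, with fractional intensity contributions f1 and 1 − f1. The parameter values are invented and are not the fitted values from the figure.

```python
def two_site_ratio(q, f1, ksv1, ksv2):
    """Unquenched/quenched intensity ratio for two independent sites."""
    f2 = 1.0 - f1
    return 1.0 / (f1 / (1.0 + ksv1 * q) + f2 / (1.0 + ksv2 * q))

# Hypothetical parameters: half the emitters are easily quenched, half are not.
for q in (0.0, 0.5, 1.0, 2.0):
    print(q, two_site_ratio(q, f1=0.5, ksv1=10.0, ksv2=0.1))
```

With a single site (f1 = 1) the expression reduces to the straight line 1 + ksv1·q; with two dissimilar sites the ratio rises more slowly at high quencher concentration because the poorly quenched site dominates the remaining emission.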

Figure 13B demonstrates the high sensitivity of the system to changes in oxygen concentration. The photoluminescence is being monitored while the film is being breathed on. At one point the breath is held and then breathing is resumed. The burst in emission intensity is caused by the increased exchange time in the lungs and, thus, the lower oxygen concentration and higher emission intensity. The irregularities of the following few breaths are a result of the higher blood carbon dioxide concentration.

Quenchometric oxygen sensors also function as pressure sensors if the gas composition is always the same. This led to the development of PSPs for monitoring pressure over entire aircraft or automobile models. These PSPs are largely supplanting pressure taps, which are expensive and time-consuming to put on models.

If two partners in a collision are excited, new excited-state reactions are observed. For example, if two excited triplets collide, one can have an energy-pooling, spin-conserving reaction that yields an excited singlet state. Also, two excited singlets can annihilate one another to form ground state species:

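$$\mathrm{T} + \mathrm{T} \rightarrow {}^{*}\mathrm{S} + \mathrm{S} \qquad \text{(9a)}$$

$${}^{*}\mathrm{S} + {}^{*}\mathrm{S} \rightarrow \mathrm{S} + \mathrm{S} \qquad \text{(9b)}$$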

Triplet-triplet annihilation is a common form of triplet decay under intense excitation conditions where high triplet concentrations exist. Because of the long lifetimes of the triplet state, triplet-triplet annihilation can also yield a P-type delayed fluorescence, which can persist for many microseconds after the cessation of irradiation. Singlet-singlet annihilation is less common because of the difficulty of achieving sufficiently high concentrations of the short-lived species. However, with the advent of modern high-flux lasers it is readily observed and can be a significant nonradiative pathway.

Energy transfer does not require that the molecules be in contact with one another. Resonance energy transfer can occur at distances far exceeding the physical contact distance if the emission spectrum of the donor overlaps the absorption of the acceptor and the acceptor absorption spectrum is intense or highly allowed. Dipole-dipole resonance energy transfer, which is also known as Förster transfer, can occur at distances approaching 100 Å (10 nm) in favorable cases, and 30- to 50-Å (3- to 5-nm) transfers are common. The energy collection system in the photosynthetic unit generally consists of hundreds of antenna chlorophylls that collect the energy and then transfer it by a resonance mechanism to the active chlorophyll.
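The steep distance dependence of dipole-dipole transfer can be made concrete with the standard Förster expression, E = 1/(1 + (r/R0)⁶), where R0 is the donor-acceptor separation at which transfer is 50% efficient. This expression and the symbol R0 are not given in the article; the sketch below is offered only as an illustration, and the R0 value is hypothetical.

```python
def forster_efficiency(distance, r0):
    """Transfer efficiency for donor-acceptor separation `distance`,
    given the critical distance r0 in the same units."""
    return 1.0 / (1.0 + (distance / r0) ** 6)

R0_NM = 5.0  # hypothetical critical transfer distance, nm
for r in (2.5, 5.0, 7.5, 10.0):
    print(r, forster_efficiency(r, R0_NM))
```

The efficiency falls from nearly unity to a few percent over a factor of four in distance, which is why energy transfer between labeled sites serves as a sensitive proximity gauge.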

In crystals composed of a single component, long-range energy transfer can occur by a contact mechanism. The close proximity of the molecules permits energy to hop from one molecule to its neighbor. By a series of hops the excitation can sample a very large volume. If there are any quenchers in this volume, the energy can be transferred to the quencher, and no luminescence of the major component is observed. The classic example of this is the phosphorescence of naphthalene, where it was necessary to reduce the concentration of β-methylnaphthalene to below 10−7 mole fraction to see the host emission.

C. Stimulated Emission

Light absorption occurs if the photon energy matches the energy gap between the ground and excited states. The photon, in effect, induces a transition between the ground and excited states with the loss of a photon from the radiation field and the production of an excited state. However, the inverse process is also possible. If the excited molecule is exposed to photons the energies of which exactly match the energy gap between the excited state and a lower energy state, the photon induces a transition between the upper and lower states. The net result is the addition of a photon to the radiation field rather than a loss. This process is called stimulated emission. Not only is the emitted photon of the same energy as the stimulating photon, but it also has the same phase and travels in the same direction as the original photon.

Stimulated emission is not generally observed because, for it to occur efficiently, a significant fraction of the molecules must be in the excited state. Furthermore, the upper state must have a higher population than the terminating state (population inversion). Because of the symmetry of the absorption and the stimulated emission process, absorption will always be more efficient than stimulated emission if the terminal state concentration exceeds that of the upper state.
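A quick estimate shows why a population inversion never arises thermally. Assuming a simple two-level system with equal degeneracies in thermal equilibrium, the upper-to-lower population ratio is the Boltzmann factor exp(−ΔE/kT), which is always below 1 and is vanishingly small for optical transitions at ordinary temperatures. The function name and the example wavelength are hypothetical.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 299_792_458.0    # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K

def thermal_population_ratio(wavelength_nm, temperature_k):
    """Equilibrium ratio N_upper / N_lower for a transition at the given
    wavelength, assuming equal degeneracies of the two levels."""
    energy = H * C / (wavelength_nm * 1e-9)
    return math.exp(-energy / (K_B * temperature_k))

print(thermal_population_ratio(600.0, 300.0))    # visible transition, room temperature
print(thermal_population_ratio(600.0, 3000.0))   # still far below 1 at 3000 K
```

Since the ratio cannot exceed 1 at any temperature, an inversion has to be created by nonequilibrium pumping.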

If stimulated emission occurs in a resonant cavity with a high degree of optical feedback, very intense, highly directional monochromatic radiation results. Such systems are called lasers (light amplification by stimulated emission of radiation) and have countless practical and fundamental applications including surveying, weaponry, excited-state lifetime determinations, and luminescence studies.

D. Multiphoton Processes

At the low fluxes obtained with most conventional light sources, the only absorption processes generally noted involve single photons. With the high fluxes available from


FIGURE 14 Two-photon excitation spectrum for naphthalene. The absorption is into states that occur at half the indicated wavelength. Emission is monitored in the UV. [Reprinted with permission from Wirth, M. J., and Lytle, F. E. (1978). “New Applications of Lasers,” ACS Symposium Series 85 (G. M. Hieftje, ed.) p. 24. Copyright 1978 American Chemical Society.]

lasers, however, nonlinear multiphoton processes occur. At high fluxes the absorption criterion is satisfied if the excited state transition energy equals the sum of the energy of two photons. Thus, absorption of two photons produces an excited state that is inaccessible by a single photon absorption. Indeed, efficient two-photon excitations can arise with light having energies less than the lowest excited state of the system.

Generally, because multiphoton processes have much lower cross sections than single-photon processes, it is not possible to monitor the depletion of the beam as in an absorption experiment. However, luminescence from the excited species formed provides a powerful and extremely sensitive tool for monitoring the generation of excited states.

Figure 14 shows a two-photon excitation spectrum of naphthalene using visible excitation. The emission is monitored in the UV. The molecular absorptions are transitions occurring at half the indicated wavelength (twice the energy).

In addition to having different energy requirements from single-photon processes, the selection rules for two-photon absorptions also differ. Thus, multiphoton absorptions provide a valuable tool for locating and studying states otherwise invisible by conventional one-photon spectroscopy.

Multiphoton absorptions are not limited to two photons of the same wavelength. If a sample is subjected to intense irradiation by two beams of different colors, absorption can also occur if the sum of the energies of the two different photons matches the energy of a transition. Even more complex schemes can arise if three or more photons of different wavelengths are required to induce transitions between states to arrive at the final monitored state.
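Because photon energy is inversely proportional to wavelength, the energy-sum condition for simultaneous multiphoton absorption can be restated as a sum of reciprocal wavelengths. The sketch below computes the one-photon wavelength equivalent to a set of simultaneously absorbed photons; the function name and example wavelengths are hypothetical.

```python
def equivalent_one_photon_wavelength_nm(*photon_wavelengths_nm):
    """Wavelength of a single photon carrying the summed energy of the
    given photons: 1/lambda_eq = sum(1/lambda_i)."""
    return 1.0 / sum(1.0 / w for w in photon_wavelengths_nm)

# Two identical visible photons reach a state at half the wavelength.
print(equivalent_one_photon_wavelength_nm(560.0, 560.0))   # about 280 nm
# Two different colors: 500 nm + 1000 nm is equivalent to about 333 nm.
print(equivalent_one_photon_wavelength_nm(500.0, 1000.0))
```

The first case corresponds to the half-wavelength rule used to read two-photon excitation spectra such as Fig. 14.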

E. Photon-Stimulated Emission and Quenching

Quite unrelated to simultaneous two-photon excited luminescence is a form of sequential multiphoton excited luminescence or quenching. This type of behavior is especially prevalent in solid phosphors. Initially, a high-energy photon or charged particle excites the system to a metastable state. This excited system may be a true excited state, but more often it is an electron trapped in a high-energy site. Optical excitation with different colored light can lead to a stimulated emission if the electron is ejected from the trap by light of an energy matching the gap between the trap and the conduction band (Fig. 8). Once in the conduction band the electron is free to recombine with a suitable center to produce luminescence. Generally, because of the small gaps between the traps and the conduction band, emission of this type is IR-stimulated.

Similarly, photons of suitable energy can induce transitions of the excited electron to lower nonemissive excited states. This results in quenching of subsequent emission processes. Because of the larger energy gaps to quenching levels, photon-stimulated quenching usually requires higher energy photons than does photon-stimulated emission. Indeed, it is not uncommon for the source that excites the phosphor also to be capable of quenching the luminescence.

Infrared-stimulated emission has potential application in IR detectors. The most sensitive optical detectors are photomultipliers, which do not respond to photons farther into the IR. However, if a suitable phosphor is pumped up and then exposed to the IR source, stimulated visible photons can be detected by a photomultiplier. A related application is in light-emitting diodes, where an IR-emitting diode is used to pump up a visible-emitting phosphor by sequential photon absorptions.

VI. TYPES OF LUMINESCENCE

Luminescence is categorized by the method of initiation. Although each type of luminescence is described and one or more applications given, the subject is so broad that an exhaustive discussion is impossible.

A. Photoluminescence

Photoluminescence arises when the emitting excited state is generated by the absorption of a photon. Excitation can be either a single-photon or a multiphoton process.

The applications of photoluminescence are legion. They can be as mundane as blacklight posters or as esoteric as multiphoton processes for sophisticated state analyses of molecules or ions. A particularly powerful application of


photoluminescence is in quantitative analysis. The excitation and emission spectra of materials vary greatly. For example, the characteristic excitation and emission spectra provide “fingerprints” of different crude oils and have been used to trace the sources of oil spills or illegal dumping. Furthermore, many nonluminescent materials can be made luminescent by suitable, and frequently highly selective, chemical reactions. Concentrations can then be determined from photoluminescence intensities.

A major advantage of luminescence over absorption methods of analysis is the frequently much greater sensitivity of the emission methods. This sensitivity enhancement relies in part on the nature of the measurement. Absorption measurements of weakly absorbing samples depend on being able to measure small differences in the intensities of large transmitted signals, which is an intrinsically difficult problem. On the other hand, in emission measurements one is looking for small signals on essentially zero backgrounds.

Emission measurements are so sensitive that it is possible to detect single atoms in the gas phase. Indeed, it is possible to hold, study, and cool to cryogenic temperatures a single atom by using optical tweezers. Because of the background, detection in solution has proved much more difficult. The supporting solvent scatters strongly (Rayleigh and Raman scattering), and it is extremely difficult to get impurity emissions down to low enough levels. The solution is to reduce the sampled volume. Scattering and impurity emission scale with the volume, but the emission of a single molecule is independent of the volume. So by reducing the sampled volume to pico- or femtoliters, single-molecule detection becomes possible. Single-molecule detection by luminescence is now routine in solution and on surfaces.
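The volume argument can be illustrated with a minimal numerical sketch. The photon count per molecule and the background per liter used here are hypothetical placeholders, not measured values; only the scaling (background proportional to volume, single-molecule signal constant) is taken from the text.

```python
def signal_to_background(volume_liters,
                         photons_per_molecule=100.0,
                         background_per_liter=1.0e15):
    """Signal-to-background ratio for one molecule in a sampled volume.

    Both default values are illustrative assumptions. The single-molecule
    signal is taken as independent of volume, while scattering and impurity
    emission are taken as proportional to the sampled volume.
    """
    background = background_per_liter * volume_liters
    return photons_per_molecule / background


# Shrinking the sampled volume from microliters to femtoliters
for volume in (1e-6, 1e-9, 1e-12, 1e-15):
    ratio = signal_to_background(volume)
    print(f"{volume:.0e} L -> signal/background = {ratio:.3g}")
```

Whatever the actual numbers, each thousandfold reduction in sampled volume improves the ratio a thousandfold in this simple model.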

In solution, two methods are used. The analyte may be constrained to a hydrodynamically stabilized flowing stream a few microns wide. A tightly focused laser beam intersects the stream, forming an analysis region of a few picoliters or less. As the molecules flow through the focused laser beam, they are excited and fluoresce multiple times. This burst of photons is collected with a high numerical aperture microscope objective, focused through a pinhole spatial filter to remove any unwanted background regions, and then focused onto a single-photon-counting photomultiplier tube or avalanche photodiode. In the alternative implementation, a confocal microscope is used to image an exceptionally small solution volume (sub-femtoliter). Then, as molecules diffuse in and out of the sensing volume, bursts of photons occur. One can use this strategy for determining the diffusion coefficient of the fluorophore, since the time spent in the sensing volume is determined by diffusion. Kinetics of reactions such as ligand-receptor binding on proteins can be studied similarly, since there is a big difference in the diffusion coefficients of the bound and unbound ligand.
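How a residence time translates into a diffusion coefficient can be sketched as follows. The sketch assumes the commonly used relation for a Gaussian observation profile, τ = w²/(4D), where w is the lateral radius of the sensing volume; this relation, the radius, and both diffusion coefficients are assumptions introduced for illustration and do not come from the text.

```python
def diffusion_time(waist_m, diffusion_coeff_m2_s):
    """Characteristic residence time tau = w^2 / (4 D), in seconds."""
    return waist_m ** 2 / (4.0 * diffusion_coeff_m2_s)


def diffusion_coefficient(waist_m, residence_time_s):
    """Invert tau = w^2 / (4 D) to recover D in m^2/s."""
    return waist_m ** 2 / (4.0 * residence_time_s)


WAIST = 0.25e-6          # assumed lateral radius of the sensing volume, m
D_FREE = 3.0e-10         # assumed D of a small free ligand, m^2/s
D_BOUND = 5.0e-11        # assumed D of the ligand bound to a protein, m^2/s

tau_free = diffusion_time(WAIST, D_FREE)
tau_bound = diffusion_time(WAIST, D_BOUND)
print(f"free ligand:  tau = {tau_free * 1e6:.1f} microseconds")
print(f"bound ligand: tau = {tau_bound * 1e6:.1f} microseconds")
```

Under these assumptions the bound ligand lingers several times longer in the sensing volume than the free ligand, which is the contrast that makes binding kinetics measurable.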

Surface detection of single molecules frequently uses near-field scanning optical microscopy (NSOM) to reduce the detection volume or area. The resolution of normal microscopy is limited by diffraction to about half the wavelength of the light. However, by making the source or collector optically smaller, it is possible to excite or collect from a far smaller region (i.e., a few nanometers). The source/collector is a fiber-optic tip drawn down to sub-wavelength dimensions. If the tip is very close to the detection volume, the light does not have distance enough to broaden by diffraction. Imaging is done by raster scanning the tip over the surface, much as a scanning tunneling or atomic force microscope does. Not only can single molecules be readily detected, but motion pictures of surfaces can be recorded to follow such effects as surface diffusion.

The selectivity of luminescence methods is best exemplified by fluoroimmunoassay (FIA), by which specific antigens can be detected even in a messy medium such as serum. Radioimmunoassay (RIA) has dominated the field when the ultimate in sensitivity is required (pico- to femtomoles). The problems of working with and disposing of radioactive tracers have given impetus to the use of the somewhat less sensitive FIA. Furthermore, advances in flow cytometry and laser-based detection schemes promise to make FIA fully competitive with RIA.

B. Chemiluminescence

In chemiluminescence (CL) the energy necessary for excited-state generation is derived from the energy released in a chemical reaction. Excluding flames, probably the first man-made CL was the air oxidation of phosphorus, discovered by Brand in 1669. This discovery is the subject of a classic and beautifully detailed engraving (Fig. 15).

Three basic processes can initiate CL: (1) decomposition of a high-energy species to lower-energy ones; (2) exothermic reaction of two or more components; and (3) electron transfer reactions. These processes are represented by

$$\mathrm{A} \rightarrow {}^{*}\mathrm{B} \quad \text{or} \quad {}^{*}\mathrm{B} + \mathrm{C} \tag{10a}$$

$$\mathrm{A} + \mathrm{B} \rightarrow {}^{*}\mathrm{C} + \mathrm{D} \quad \text{or} \quad {}^{*}\mathrm{C} \tag{10b}$$

$$\mathrm{A}^{+} + \mathrm{B}^{-} \rightarrow {}^{*}\mathrm{A} + \mathrm{B} \quad \text{or} \quad \mathrm{A} + {}^{*}\mathrm{B} \tag{10c}$$

FIGURE 15 Chemiluminescence on air oxidation of phosphorus. A portion of The Discovery of Phosphorus by the alchemist Brand (1669), engraving by William Pether (1775) after the painting by Joseph Wright, Fisher Collection, Pittsburgh. [Reproduction courtesy of the Fisher Collection, Fisher Scientific Co.]

Atom transfer reactions are included in Eq. (10b). While this classification is convenient, it is frequently simplistic. Ternary reactions are possible, and electron transfer steps can also be involved in the first two types.

In all cases the energy released by the reaction must be adequate to generate the excited state. Although a small energy shortfall can be made up by drawing energy from the thermal pool, efficient CLs usually have large excess driving energies. Furthermore, there must be an efficient pathway for the ground-state reactants to pass over to the surface corresponding to excited-state products.
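The energy requirement can be made concrete by converting an emission wavelength into the energy of a mole of photons, E = N_A h c / λ. The sketch below is a rough screening calculation only: the 450-nm wavelength and the reaction energies are illustrative assumptions, and the calculation ignores the thermal contribution and the need for an efficient pathway to the excited-state surface.

```python
AVOGADRO = 6.02214076e23      # 1/mol
PLANCK = 6.62607015e-34       # J s
LIGHT_SPEED = 2.99792458e8    # m/s


def photon_energy_kj_per_mol(wavelength_nm):
    """Energy of one mole of photons of the given wavelength, in kJ/mol."""
    joules_per_photon = PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9)
    return AVOGADRO * joules_per_photon / 1000.0


def energetically_allowed(reaction_energy_kj_per_mol, wavelength_nm):
    """True if the energy released covers the photon energy (energy check only)."""
    return reaction_energy_kj_per_mol >= photon_energy_kj_per_mol(wavelength_nm)


# Hypothetical example: blue emission at 450 nm
needed = photon_energy_kj_per_mol(450.0)
print(f"450 nm emission needs about {needed:.0f} kJ/mol")
print(energetically_allowed(300.0, 450.0))   # assumed 300 kJ/mol reaction
print(energetically_allowed(150.0, 450.0))   # assumed 150 kJ/mol reaction
```

Visible emission thus calls for reactions releasing a few hundred kJ/mol in a single step, which is why strongly exothermic oxidations and radical recombinations dominate the examples that follow.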

It is not necessary for one of the chemical products to be the luminescent species or even to have an available excited state. If the energy is released in a system in which a suitable acceptor is readily available, the energy can be transferred to the acceptor, which then luminesces.

Examples of Eq. (10a) CLs are the reactions of 1,2-dioxetanes. Dioxetanes can give rise to efficient CLs and are implicated in a variety of bioluminescences. Dioxetanes can be formed biochemically by oxidations of olefins or aromatic molecules:

[Structural scheme: >C=C< + oxidizer → 1,2-dioxetane, a four-membered ring of two carbon and two oxygen atoms] (11)

The CL arises on cleavage to ketones, with one being excited and the other being in the ground state:

[Structural scheme: 1,2-dioxetane → ∗(>C=O) + >C=O] (12)

The excited ketone can then luminesce or transfer energy to another luminescent species.

The long-lived afterglow in electrical discharges in nitrogen is an example of recombination CL [Eq. (10b)],

N + N + M → ∗N2 + M (13)

where M is a third body required to carry away excess energy to prevent the hot product from promptly redissociating. This reaction can give rise to pink, yellow, and blue afterglows. In spite of enormous effort, this system is far from being fully understood.

Flames are rich sources of CL. For example, the blue glow of hydrocarbon flames arises from C2 as well as CN and OH radicals. At least some of these emissions are CLs.

Even the luminescence of many trace elements in flames can be attributed in part to chemical excitation of the metal by such processes as

H + OH + M → ∗M + H2O (14a)

H + H + M → ∗M + H2 (14b)

where M represents a metal ion.

Hydrazide CL, typified by the oxidation of Luminol, is given by the following reaction sequence,

[Reaction scheme (15): Fe2+ + H2O2 → Fe3+ + OH− + OH•; the R-substituted cyclic hydrazide reacts in turn with OH−, OH•, OH−, and O2 to give the excited R-substituted phthalate dianion, which relaxes to its ground state with emission of hν.]

where R = NH2 for Luminol. That the emissive species is the phthalate dianion is shown by the agreement between the CL and photoluminescence spectra of the product. Thus, although paths to the excited state are quite different, the final excited state is the same.

Only a few milligrams of Luminol produce an impressive light show, but the CL efficiency (photons emitted per molecule reacted) is, in fact, only about 1%. This result emphasizes the impressive quantity of light that a mole of photons represents.
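The scale of the numbers involved can be checked with a short calculation. The 5-mg sample mass is an arbitrary illustrative choice, and the molar mass used is that of Luminol as C8H7N3O2 (about 177.16 g/mol); the 1% efficiency is the figure quoted above.

```python
AVOGADRO = 6.02214076e23       # molecules per mole
LUMINOL_MOLAR_MASS = 177.16    # g/mol, for C8H7N3O2


def photons_emitted(mass_g, efficiency=0.01, molar_mass=LUMINOL_MOLAR_MASS):
    """Photons emitted if every molecule reacts with the given CL efficiency."""
    molecules = mass_g / molar_mass * AVOGADRO
    return efficiency * molecules


# Illustrative sample: 5 mg of Luminol, fully consumed
count = photons_emitted(5.0e-3)
print(f"5 mg of Luminol at 1% efficiency -> {count:.2e} photons")
```

Even at 1% efficiency, a few milligrams corresponds to more than 10^17 photons, which is why so little material produces so visible a glow.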

The third class of CLs is initiated by electron transfer processes. This type is exemplified by the one-electron reduction of [Ru(bpy)3]3+ (bpy = 2,2′-bipyridine), which yields a highly visible orange CL with hydrazine, water, and a variety of other reductants. Indeed, with NaBH4 as the reductant, the spectacular emission is highly visible in a well-lighted room.

The complexity of seemingly simple CL systems is demonstrated by the [Ru(bpy)3]3+/oxalate system:

$$[\mathrm{Ru(bpy)_3}]^{3+} + \mathrm{C_2O_4^{2-}} \rightarrow [\mathrm{Ru(bpy)_3}]^{2+} + \mathrm{C_2O_4^{-}} \tag{16a}$$

$$[\mathrm{Ru(bpy)_3}]^{3+} + \mathrm{C_2O_4^{-}} \rightarrow {}^{*}[\mathrm{Ru(bpy)_3}]^{2+} + 2\,\mathrm{CO_2} \tag{16b}$$

The initial reduction step does not appear to be energetic enough to excite the complex, and a secondary reaction with the energetic C2O4− free radical is required for excitation. This type of behavior, in which the initial step is not sufficiently energetic to initiate CL, is fairly common, and thus reactive radicals are common in the actual CL step.

A special case of electron transfer CL is that in which the reductant is the ultimate reductant, an electron. [Ru(bpy)3]3+ exhibits an intense CL on reaction with hydrated electrons:

[Ru(bpy)3]3+ + e− → ∗[Ru(bpy)3]2+ (17)

The efficiency of excited-state production approaches 100%.

Applications of CL are diverse. There are commercial emergency lights that require only the breaking of a seal to permit the mixing of solutions. The long-lived CL is bright enough to read by.

One of the simplest and most sensitive NO analyzers utilizes the very efficient CL of the reaction

NO + O3 → ∗NO2 + O2 (18)

With very simple instrumentation, part-per-billion levels of NO can be accurately and precisely measured.

Another useful analytical application uses metastable ∗N2 to transfer energy to luminescent metals or molecules. Concentrations as low as 10^6 atoms per cubic centimeter are readily measured.

Chemical lasers use CLs. In a chemical laser the excited-state population inversion is produced directly by chemical energy. One of the most efficient chemical lasers uses the hydrogen-fluorine reaction, which is initiated by the dissociation of F2,

F2 + hν (or electrical energy) → 2F (19a)

H2 + F → H + ∗HF (19b)

F2 + H → F + ∗HF (19c)

where ∗HF is vibrationally excited HF produced with population inversion. The laser transitions are near-IR emissions arising between vibrational levels of HF.

C. Bioluminescence

Bioluminescence (BL) is a CL arising from living matter. Fireflies and the glow of disturbed microorganisms in a ship’s wake are probably the best known examples. Although grouped separately, BLs are actually CLs in which the reactants are produced by, and organized in, living organisms.

Not surprisingly, one of the best known BLs is the firefly reaction, which involves the enzymatic oxidation of a luciferin. This reaction can have an incredible efficiency, approaching 100%. The luciferin molecule and a number of synthetic analogs have been studied to elucidate the mechanism. The mechanism appears to involve a peroxide decomposition with free radical intervention.

[Reaction scheme (20): firefly luciferin + O2, in the presence of the enzymes of Photinus pyralis, → oxidized (keto) product + CO2 + hν.]

Bioluminescence is much more pervasive than originally suspected. Low-level BLs are common to many fundamental and essential biological processes such as lipid peroxidation, intracellular redox processes, and catalase decomposition of poisonous H2O2.

Bioluminescence has had a crucial role in direct studies of cellular and biochemical processes. For example, there is a CL associated with the formation of the ultimate carcinogen benzo[a]pyrene-7,8-dihydrodiol 9,10-epoxide from benzo[a]pyrene. Also, a very sensitive CL assay for the benzo[a]pyrene-7,8-diol has been developed. Bioluminescence also continues to play a pivotal role in the development of the fundamental concepts of CL.

D. Electroluminescence

Electroluminescence (EL) is luminescence occurring on electron flow in a solid-state device or electrochemical cell. The mechanisms of these two ELs are quite different.

In the electrochemical cell, electrochemical oxidations and reductions occur. It is thus not surprising that most of these ELs are electron transfer CLs, which are also called electrochemiluminescences (ECLs). The latter are usually produced by rapidly reversing the potential on an electrochemical cell. In this manner both oxidized and reduced species are generated near the same electrode. Electron transfer reactions between oxidizer and reductant are responsible for the ECL:

A + e− → A− (21a)

B − e− → B+ (21b)

A− + B+ → ∗A + B or A + B (21c)

Except for chemical side reactions these systems do not run down and, thus, provide light indefinitely. Many such reactions have been studied as light sources, but light-emitting diodes have proved too simple, stable, and efficient to compete against. Among the ELs that have been studied the most thoroughly and for the longest time are those of the polycyclic aromatic hydrocarbons in aprotic solvents. These systems are difficult to work with, however, because of the great reactivity of the high-energy organic radicals.

Metal complexes also provide efficient ECLs. For example, [Ru(bpy)3]2+ is readily oxidized to the +3 and reduced to the +1 oxidation states. Reaction of the +1 and +3 species gives a very efficient CL. Because of the robustness and relatively low reactivity of the +3 and +1 states, which are stable chemical species rather than free radicals, these inorganic systems are much easier to work with than the highly reactive free radical aromatic hydrocarbons.

An ECL need not be reversible. If the solution contains an electro-inactive reactant that exhibits CL with a product of the electrode reaction, then one has an EL. The system will eventually run down when the reactant is consumed. Figure 16 shows the ECL of [Ru(bpy)3]3+ with oxalate. The Ru(III) is generated electrochemically at the electrode and yields a CL with the electro-inactive oxalate. The spiking is caused by the pulsing of the electrode potential to form the Ru(III), with periods of low potential to permit more oxalate to diffuse up to the electrode before the next pulse. Also shown is the photoluminescence of the Ru(II) complex. The agreement between the ECL and the photoluminescence spectra leaves no doubt that the Ru(II) complex is the active species.

Solid-state EL devices fall into two categories: (1) film-type semiconductor devices and (2) solid-state diodes.

The earliest EL was obtained by subjecting a ZnS phosphor to an AC potential. The resultant current flow in the semiconductor excited the electrons to the conduction band. Recombination of the electrons and holes either directly or at traps gave rise to the luminescence.

FIGURE 16 Electrochemiluminescence (a) and photoluminescence (b) spectra of [Ru(bpy)3]2+. The ECL is from electrogenerated [Ru(bpy)3]3+ in the presence of oxalate. The spiking of (a) is caused by pulsing of the electrode potential. [Reprinted with permission from Rubinstein, I., and Bard, A. J. (1981). J. Am. Chem. Soc. 103, 512. Copyright 1981 American Chemical Society.]

Electroluminescent panels are a common source of low-level night lighting. Their efficiencies are much too low for general lighting.

Solid-state diode emitters are called light-emitting diodes (LEDs). An LED consists of a broad band-gap semiconductor diode formed by bonding a p-type (hole-rich) semiconductor material to an n-type (electron-rich) semiconductor. The resultant pn junction has both rectifying and emitting properties.

Light-emitting diodes exhibit light emission when forward-biased, which causes current flow through the junction. Current flow arises by the flow of holes from the hole-rich p material to the n material, and by the flow of electrons from the n- to the p-type semiconductor. The electrons, for example, are now in the conduction band of the p semiconductor material, which has recombination sites suitable for radiative transitions from the conduction to the valence band. The light emission processes arise from direct band-gap radiative recombination with these holes or with suitable sensitizer centers. A similar fate potentially awaits the holes that traverse the junction and find themselves in an environment rich in conduction electrons. However, radiative recombination in the p semiconductor is more efficient, and LEDs are physically built so that emitted radiation can most freely escape from the p material.

FIGURE 17 Emission spectra of LEDs. The spectral sensitivity of the human eye is shown for comparison. [Reprinted with permission from Sze, S. M. (1981). “Physics of Semiconductor Devices,” 2nd ed., John Wiley & Sons, New York.]

Commercially available LEDs are blue/violet, green, yellow, red, and different ranges of IR. Intense, expensive blue-emitting LEDs are now available, and some have reasonable intensities in the near UV. Figure 17 shows typical emission spectra for different LEDs. The emission wavelength is controlled by the band gap of the semiconductor material. By supplying adequate optical feedback, solid-state LED lasers can be made. Very inexpensive, intense laser diodes are available down to 630 nm and at a variety of IR wavelengths. Modern laser pointers, as well as CDs, CD-Rs, CD-RWs, and DVDs, are based on these lasers.
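The dependence of emission wavelength on band gap follows from λ = hc/E_g for band-edge recombination. The sketch below applies that relation; the band-gap values are illustrative (1.42 eV is the commonly quoted room-temperature gap of GaAs, and the others are hypothetical), and real devices emit slightly off this value because of thermal and impurity effects.

```python
PLANCK = 6.62607015e-34          # J s
LIGHT_SPEED = 2.99792458e8       # m/s
ELECTRON_VOLT = 1.602176634e-19  # J


def emission_wavelength_nm(band_gap_ev):
    """Wavelength in nm of a photon whose energy equals the band gap."""
    return PLANCK * LIGHT_SPEED / (band_gap_ev * ELECTRON_VOLT) * 1e9


def band_gap_ev(wavelength_nm):
    """Band gap in eV needed for band-edge emission at the given wavelength."""
    return PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9 * ELECTRON_VOLT)


for gap in (1.42, 1.97, 2.75):   # eV; illustrative values
    print(f"E_g = {gap:.2f} eV -> about {emission_wavelength_nm(gap):.0f} nm")

print(f"630 nm requires a gap of about {band_gap_ev(630.0):.2f} eV")
```

Moving from the IR through the red to the blue therefore requires roughly doubling the band gap, which is the materials problem behind the late arrival of efficient blue emitters.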

Shorter wavelength laser diodes are only a matter of time. There is enormous commercial pressure for such development since information packing density on optical storage media scales as the inverse square of the wavelength (i.e., halving the wavelength quadruples the storage capacity). Already LEDs and laser diodes are used in a variety of commercial scientific instrumentation. In addition, red LEDs are so efficient and inexpensive that they are replacing incandescent bulbs in red traffic lights, and the green and yellow colors will shortly follow. When blue LEDs become efficient enough, look to the replacement of fluorescent lights with banks of LEDs emitting in the three primary colors.
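The inverse-square scaling of packing density is easy to quantify. The sketch below assumes that wavelength is the only thing that changes; the 780-nm and 650-nm values are the laser wavelengths commonly associated with CD and DVD drives and are introduced here only as an illustration, as is the 405-nm case.

```python
def density_gain(old_wavelength_nm, new_wavelength_nm):
    """Relative packing density if spot area scales as wavelength squared."""
    return (old_wavelength_nm / new_wavelength_nm) ** 2


print(density_gain(780.0, 390.0))            # halving the wavelength
print(round(density_gain(780.0, 650.0), 2))  # assumed CD-to-DVD laser change
print(round(density_gain(650.0, 405.0), 2))  # assumed move to a violet diode
```

Wavelength alone accounts for only part of the capacity difference between real disc formats; tighter focusing optics and more efficient data encoding contribute as well.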

LEDs are used as state indicators and numeric displays on consumer and commercial electronics. For battery-operated equipment, however, the lower power consumption of liquid crystal displays (LCDs) has resulted in their superseding LEDs except where self-luminosity is required, and even here LCDs backed by electroluminescent panels are more common than LEDs.

Electrochemiluminescence devices are not commercialized. They do, however, have applications in some analytical work. The ECL of [Ru(bpy)3]3+ with oxalate is a sensitive analytical method for oxalate.

E. Cathodoluminescence

Cathodoluminescence is luminescence arising in gas or low-pressure electrical discharges. The luminescence can arise either in the electrical discharge itself or on the electrode or target.

Luminescences from electrodes result from excitation of electrode material by the impact of electrons or charged particles on the surfaces. Electronic excitation can result directly from inelastic collisions of particles with atoms or molecules. Chemical bonds can be broken, and high-energy electrons can be ejected. These high-energy secondary electrons can produce other electronic excitations, chemistry, or ionizations. The recombination of electrons with oxidized sites can release enough energy to excite either the product of the electron transfer reaction or excitable species in the neighborhood. Chemical reactions of decomposition products can produce CL.

Gas-phase excitation arises from the same processes as surface excitation. Inelastic collisions can excite molecules or atoms, and species can be decomposed or ionized. Reactions between products can yield CL or new species that can be excited in turn, giving rise to new emissions.

Gas-phase emission need not be limited to species originally in the gas phase. The energy transferred to the surface by electron or ion impact can eject or sputter surface material. This sputtered material, once introduced into the discharge, can be excited by the processes just described.

Secondary processes such as energy transfer and excited-state processes can also occur. These affect the nature and efficiency of the emission.

Cathodoluminescence forms the basis of innumerable practical devices. Color and black-and-white televisions are examples of phosphors excited by electron impact on solids. The ubiquitous hollow cathode source of elemental emission lines uses electrode sputtering to introduce the desired element into the discharge. Advertising signs and discharge lighting are very common.

The fluorescent lamp represents an interesting application of both cathodoluminescence and photoluminescence. The discharge is in a low-pressure rare gas with a trace of mercury vapor. The 254-nm atomic mercury emission dominates the optical output (Fig. 2). The inside of the discharge tube is coated with a UV-excitable, visible-emitting phosphor, which transforms the invisible UV into usable visible light. By changing the phosphor, the emission color tint can be varied, as in the reddish greenhouse lamp. The long-wavelength blacklight works similarly except that the phosphor emits in the 300- to 400-nm region. The color TV or monitor is a beautiful example of three different color phosphors excited independently to produce an image. Close inspection of a color screen in operation will clearly show the three color elements. The red phosphor is based on the intense red line emission of Eu(III); the ghastly flesh tones of early color TV were the result of the poor red characteristics of earlier phosphors.

F. Radioluminescence

Radioluminescence arises from interactions between ionizing radiation and matter. This radiation may be particles, such as α or β particles and cosmic rays, or photons, such as X-rays or γ rays. Materials exhibiting radioluminescence are called scintillators, named after the scintillations, or bursts of light, that arise when radiation strikes the material.

The mechanisms of excitation by ionizing radiation are very similar to those listed under cathodoluminescence. The X- and γ-ray excitations are initiated by energy loss of the photons as they interact with matter. For higher energies, energy loss occurs by photoionization. Substrates of high atomic number absorb energy more efficiently than those of low atomic number because of the high concentration of electrons.

Charged particles and secondary electrons arising from γ rays behave like the charged particles of gas discharges striking a surface, and the processes of excited species generation are the same. Because of the penetrating power of the radiation, however, excitation can penetrate much more deeply.

An important by-product of ionizing radiation in solids is that electrons are frequently trapped in lattice sites (Fig. 8). Depending on the depth of the trap sites, the electron may either recombine rapidly or be trapped for an extended period of time, even for millions of years. Release of these electrons can then yield a delayed recombination luminescence. This delayed luminescence, thermoluminescence, is discussed in Section VI.G.

Radioluminescence forms the heart of many radiation detectors. The traditional γ detector is thallium-activated sodium iodide. The high atomic number of iodine increases the efficiency of energy absorption, and the energy is used to excite the Tl+ luminescent center.

β-Particle detectors are crystals of organic molecules (e.g., anthracene) or solutions of highly fluorescent organic scintillators dissolved in solid aromatic polymers (e.g., polystyrene) or aromatic solvents. The substrate of an aromatic polymer or solvent is an essential component. The fluorescent indicator is generally present in relatively low concentrations, and much of the excitation occurs away from the scintillator. The bulk of the initial excited states formed are, thus, those of the easily excited bulk aromatic solvent. Because of the closeness of the aromatic species, the excitation can migrate or hop from molecule to molecule until the energy comes close enough to the scintillator molecule for efficient energy transfer. The scintillator molecule must have an excited singlet state that is below the energy levels of the solvent molecules. Thus, the scintillator traps the energy and emits.

Organic scintillators form the basis of the important liquid scintillation counter. The penetrating power of β particles is very limited. Therefore, counting weak β emitters such as tritium is very difficult with windowed detectors because the window or the sample itself absorbs most of the radiation. The penetration problem is solved by homogeneously dissolving the material to be counted in a “cocktail” consisting of the solvent and the scintillator. Since the particles are emitted directly in the presence of the scintillator, the penetrating power becomes irrelevant.

G. Thermoluminescence

Thermoluminescence (TL) is luminescence that arises on gentle warming of a material and usually occurs below incandescence. Reading the sample by heating it destroys the activation, so the readout is a single-shot experiment. Most TL materials can, however, be reactivated and used repeatedly.

The mechanism of TL is shown in Fig. 18. The activation process consists of trapping an electron in a trap site. The trap site must be deep enough to prevent rapid removal of the electron at the storage temperature. Activation can be accomplished by ionizing radiation or by light. Once the sample is activated, the readout is performed by heating the sample. When the temperature becomes sufficiently high, the electrons are thermally excited into the conduction band. These conduction electrons can migrate through the lattice until they find a hole to relax into. If the hole is an emissive center, or is near one, the center is excited and emits. Alternatively, the relaxation to the hole can be emissive.

FIGURE 18 Thermoluminescence glow curve (a) of quartz extracted from pottery for use in TL dating. The sample has been artificially irradiated with 550 rads of β radiation as part of the calibration process. Curve (b) is the red-hot glow measured on reheating the sample. The TL is composed of both the stored and irradiated components. [Reprinted with permission from Aitken, M. J., and Fleming, S. J. (1972). In “Topics in Radiation Dosimetry,” Supplement 1 (F. H. Attix, ed.), p. 1, Academic Press, New York.]

The term thermoluminescence is misleading since it does not mean thermal generation of the excited system. Thermally stimulated luminescence much more accurately describes the phenomenon.

In TL measurements or development, readout is performed by heating the sample at a uniform rate and viewing the sample emission. A filter may be added over the detector to restrict the viewing wavelength. Figure 18 shows a typical TL glow curve. The development supplies information on the distribution and depth of traps. Deeper traps require higher temperatures to boil the electrons out. For example, a 375°C glow peak corresponds to a trap lifetime of several million years. Indeed, elaborate models have been developed to extract quantitative information about trap depth and distribution from glow curves.
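The link between trap depth, glow-peak temperature, and storage lifetime can be illustrated with the simplest of these models, first-order (Randall–Wilkins) kinetics. This is a sketch under stated assumptions, not the model used for Fig. 18: the trap depth E, the frequency factor s, and the heating rate β below are hypothetical, the lifetime is taken as τ = exp(E/kT)/s, and the peak temperature T_m is taken from the first-order condition βE/(kT_m²) = s·exp(−E/kT_m).

```python
import math

BOLTZMANN_EV = 8.617333262e-5   # eV/K


def trap_lifetime_s(depth_ev, freq_factor_hz, temp_k):
    """Mean lifetime of a trapped electron at a fixed storage temperature."""
    return math.exp(depth_ev / (BOLTZMANN_EV * temp_k)) / freq_factor_hz


def glow_peak_temperature_k(depth_ev, freq_factor_hz, heating_rate_k_per_s,
                            low_k=200.0, high_k=2000.0):
    """Solve beta*E/(k*T^2) = s*exp(-E/(k*T)) for T by bisection."""
    def residual(temp_k):
        left = heating_rate_k_per_s * depth_ev / (BOLTZMANN_EV * temp_k ** 2)
        right = freq_factor_hz * math.exp(-depth_ev / (BOLTZMANN_EV * temp_k))
        return left - right          # positive below the peak temperature

    for _ in range(100):
        mid_k = 0.5 * (low_k + high_k)
        if residual(mid_k) > 0.0:
            low_k = mid_k
        else:
            high_k = mid_k
    return 0.5 * (low_k + high_k)


SECONDS_PER_YEAR = 3.156e7
for depth in (1.2, 1.5, 1.7):       # eV; hypothetical trap depths
    peak_k = glow_peak_temperature_k(depth, 1.0e13, 5.0)
    years = trap_lifetime_s(depth, 1.0e13, 293.0) / SECONDS_PER_YEAR
    print(f"E = {depth} eV: peak near {peak_k - 273.15:.0f} C, "
          f"lifetime at 293 K about {years:.1e} years")
```

The calculated lifetimes are extremely sensitive to the assumed E and s, so the output should be read only for its trend: deeper traps give both higher glow-peak temperatures and vastly longer storage times.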

In most examples of TL, one is examining a delayed photoluminescence. Particularly in solids subjected to radiation chemistry, the nature of the traps and of the emissive centers is modified by irradiation. Therefore, the TL emissions may look quite different from the photoluminescence of the unactivated solid.

The principal use of TL is in radiation dosimetry. The TL signal is proportional to dose over a very wide range. Modern TL dosimeters are sensitive in the microrad region. Thermoluminescence dosimeters have a number of advantages. They are inexpensive, robust, reusable, and sensitive, and they intrinsically integrate the dose. Furthermore, they require no power source or connections. At one time they were widely used for in vivo monitoring during therapy since they could be easily implanted or swallowed.

Lithium fluoride is one of the most popular personal dosimeters. It is sensitive down to 10^−2 rads and linear to >10^3 rads. The average atomic number of LiF is similar to that of tissue, which gives the dosimeter a response indicative of tissue irradiation. The traps are deep enough to give long storage times without appreciable fading.

The early work on TL demonstrates the difficulty of studying emissions that intrinsically rest on impurity sites. The first attempts at TL dosimetry date back to the early 1950s, when Farrington Daniels attempted to use LiF as a dosimeter. The original work was abandoned owing to problems of sensitivity and storage times. When the work was resumed in the early 1960s, it was discovered that all the earlier TLs of LiF arose from impurity centers that were no longer present in commercially purer LiF. This led to a considerable effort to elucidate the impurity problems and eventually led to usable systems. Also, other useful systems were discovered by other groups.

Another type of sensitive, and at one time widely used, dosimeter utilized radiophotoluminescence. Irradiation of a crystal produces color centers. When photoexcited, many of these color centers emit. Since the emission intensity is directly proportional to the concentration of color centers, a dosimeter is available. An advantage of photoluminescent dosimeters is that the readout is not destructive and can be repeated.

Another important use of TL is in archeological and geological dating when radiocarbon dating is unsuitable (Fig. 18). Heating destroys any TL and initializes the phosphor. As all rocks have some degree of radioactivity from 238U, 232Th, and 40K, the TL can be used to measure the length of time that the sample has been irradiated since cooling. For igneous rocks this is the time since formation. For pottery it is the time since the pottery was fired. Such dating can give results accurate to better than 10%.
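The arithmetic behind TL dating can be sketched in two steps: the stored (natural) TL signal is converted to an equivalent dose by comparison with the signal from a known laboratory dose, and the age is that dose divided by the annual dose rate at the site. The sketch assumes a strictly linear dose response and a constant dose rate; the signal values and the dose rate are hypothetical, and only the 550-rad calibration dose is taken from the caption of Fig. 18.

```python
def equivalent_dose_rad(natural_signal, calibration_signal, calibration_dose_rad):
    """Dose that would produce the natural signal, assuming a linear response.

    calibration_signal is the TL produced by calibration_dose_rad alone.
    """
    return natural_signal * calibration_dose_rad / calibration_signal


def tl_age_years(accumulated_dose_rad, annual_dose_rad_per_year):
    """Time since last heating, assuming a constant dose rate."""
    return accumulated_dose_rad / annual_dose_rad_per_year


# Hypothetical readings in arbitrary units; 550 rad is the laboratory dose
dose = equivalent_dose_rad(natural_signal=2.0,
                           calibration_signal=1.0,
                           calibration_dose_rad=550.0)
age = tl_age_years(dose, annual_dose_rad_per_year=0.5)
print(f"equivalent dose = {dose:.0f} rad, age = {age:.0f} years")
```

In practice most of the uncertainty comes from the annual dose rate, which depends on the radioactivity of both the sample and its burial surroundings.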

H. Flame Emissions

With the exception of hydrogen-oxygen flames, virtually all flames exhibit pronounced visual emissions. We shall discuss briefly the origin of some of these emissions. Luminescence can arise from the major components of the reaction or from trace materials. The orange glow of candle flames and oxidant-starved gas flames arises, not from luminescence, but from the incandescence of carbon particles.

Luminescence from a state is independent of how the state was populated. Many flames are hot enough, and the excited states of elements and compounds low enough, that a significant excited-state population can be achieved thermally. For example, the emitting state of atomic sodium lies at an energy corresponding to 589 nm. For flames of various temperatures the percentages of atoms in the excited state are as follows: 2000 K, 1 × 10^−3%; 3000 K, 6 × 10^−2%; and 4000 K, 0.4%. While these excited-state populations may seem very small, they are in fact very large in comparison with the concentrations that could be achieved by all but the most intense laser sources in photoluminescence experiments. This efficiency is readily seen in the intense yellow sodium emission when even tap water is introduced into a relatively cool Bunsen burner flame.
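The quoted percentages can be reproduced from the Boltzmann distribution, N*/N0 = (g*/g0) exp(−hc/λkT). The sketch below assumes a statistical-weight ratio g*/g0 of 2, which is an assumption made here (it corresponds to an upper level of degeneracy 4 over a ground level of degeneracy 2) and is not stated in the text.

```python
import math

PLANCK = 6.62607015e-34       # J s
LIGHT_SPEED = 2.99792458e8    # m/s
BOLTZMANN = 1.380649e-23      # J/K


def excited_percent(wavelength_nm, temp_k, weight_ratio=2.0):
    """Thermal excited-state population as a percentage of the ground state.

    weight_ratio is the assumed ratio of statistical weights g*/g0.
    """
    energy_j = PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9)
    return 100.0 * weight_ratio * math.exp(-energy_j / (BOLTZMANN * temp_k))


for temp in (2000.0, 3000.0, 4000.0):
    print(f"{temp:.0f} K: {excited_percent(589.0, temp):.1e} %")
```

With that assumption the calculation gives roughly 1 × 10^−3%, 6 × 10^−2%, and 0.4%, in line with the figures in the text, and shows how steeply the excited-state population climbs with flame temperature.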

An interesting aspect of flame spectroscopy is that a flame can be too hot to yield good elemental emissions. Too hot a flame can thermally ionize many of the atoms and reduce the population of the emissive element. This propensity of luminescent species to ionize can be suppressed by increasing the concentration of free electrons in the flame. For example, ionization of calcium can be suppressed by spiking the flame with cesium or potassium, which have lower ionization potentials and increase the concentration of free electrons. Many of the products of the chemical reactions contributing to the flame are produced in excited states and emit (see Section VI.B).

While not, strictly speaking, flames, inductively coupled plasma torches produce plumes that appear to be flames. Furthermore, these plasmas behave like extremely high temperature flames, which are capable of exciting all but the most recalcitrant elements.

One of the most useful applications of flames is for elemental quantitative analysis. High-temperature flames reduce most complex matrices to their elemental components. Many elements, especially metals, then emit in the flames, and the emission can be used for analytical quantification. Even many elements that do not emit directly can still be analyzed in flames by photoluminescence or by atomic absorption.

Because of the chemical generation of many of the excited states in flames, population inversion over the ground state can result. This population inversion can form the basis of a laser. The best known chemical laser is the HF or DF laser, made by burning hydrogen or deuterium in fluorine. Visible or UV chemical lasers are still being sought.

I. Sonoluminescence

Sonoluminescence (SL) was first observed in 1934 by H. Frenzel and H. Schultes at the University of Cologne. As an indirect result of wartime research on sonar or acoustic radar, they observed SL in an ultrasonic water bath. Very strong ultrasonic fields were found to yield clouds of chaotic flashing bubbles, which are now termed “multi-bubble sonoluminescence.” Such systems were not amenable to systematic study until D. Felipe Gaitan succeeded in trapping a single sonoluminescing bubble at the acoustic resonance in the center of a flask. This single-bubble sonoluminescence (SBSL) opened the way for an explosion of research on SL.

The majority of SL is a broad continuum, with effective blackbody temperatures in excess of that of the sun (e.g., well beyond 10,000 K). In many cases the continuum emission is still rising at the 200-nm cutoff of air. Indeed, some have suggested that temperatures suitable for fusion may be achieved this way, but this seems an extremely unlikely possibility. In addition, under some conditions, broad lines are present. Even though the typical ultrasonic transducer operates at about 25 kHz, the light pulses are much shorter than 50 ps and occur during the collapse phase of the bubbles. Each optical pulse can generate more than a million photons, and the phenomenon is readily visible to the eye as a steady star-like point.
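The figures in this paragraph can be put in perspective with a few lines of arithmetic. The sketch below, in Python, uses Wien's displacement law to locate the emission peak of a 10,000 K blackbody and estimates the duty cycle and photon rate. It assumes one flash per acoustic cycle and treats the quoted 50 ps and one million photons as round bounds; the function names are illustrative.

```python
WIEN_B = 2.897771955e-3  # Wien displacement constant, m K


def wien_peak_nm(temperature_k):
    """Wavelength (nm) of the blackbody emission peak at a given temperature."""
    return WIEN_B / temperature_k * 1e9


def duty_cycle(pulse_s, drive_hz):
    """Fraction of each acoustic period during which light is emitted."""
    return pulse_s * drive_hz


peak = wien_peak_nm(10_000.0)      # falls in the ultraviolet
duty = duty_cycle(50e-12, 25e3)    # upper bound, since pulses are < 50 ps
photons_per_second = 1e6 * 25e3    # assumes one flash per acoustic cycle

print(f"blackbody peak: {peak:.0f} nm")
print(f"duty cycle: {duty:.2e}")
print(f"photon rate: {photons_per_second:.1e} per second")
```

A peak in the ultraviolet is consistent with a continuum that is still rising at the 200-nm cutoff when the effective temperature is well beyond 10,000 K.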

Water is far and away the best solvent for SL, and there is a strong temperature dependence. Sonoluminescence is 100-fold brighter at 0°C than at 40°C. Dissolved rare gases appear to be critical to SL; it is the 1% Ar in air that is responsible for SL under ambient air conditions.

Although the phenomenon has been known for over 60 years, its precise origins are still uncertain. While there are several explanations for SL, the area is still amazingly contentious, with articles appearing regularly in support of different mechanisms. The fact that the emission is a continuum with no unique molecular or atomic character hampers definitive modeling.

While there seem to have been no direct practical applications of SL to date, there are needs for short-lived, inexpensive, very compact light sources in such areas as luminescence lifetime measurements. Since SBSL systems can be built extremely inexpensively, SL may find use in these areas.

Although not a direct application of SL, ultrasound-induced chemistry is a growing area with potentially significant applications. Ultrasound has been found to accelerate and control a number of chemical reactions. Some of this chemistry may be induced by the high-energy photons of SL.

J. Fracto-Emission or Triboluminescence

Fracto-emission (FE) is the emission of particles and photons before, during, and following the propagation of a crack in a stressed material. Particles include electrons, ions, and neutral atoms and molecules. Light emission under mechanical stimulation is often called triboluminescence (TL). Grinding and breaking of crystals and glasses can produce FE. However, even pulling tape off of a surface can produce FE.

Sir Francis Bacon in 1605 first reported FE by grinding sucrose, although FE must have been noticed innumerable times earlier. Amusing, if not practical, is the parlor demonstration of grinding a Wint-o-Green Life Saver® between one’s teeth or with a pair of pliers in a darkened room to produce bright blue-green flashes. The Wint-o-Green Life Saver® is pressed sucrose flavored with methyl salicylate. Triboluminescence is, in many cases, due to the production of strong electric fields along fracture lines with a concomitant electrical breakdown. The observed luminescence is then a mechanically induced cathodoluminescence. Another easily observed example of TL arises when you strike two pieces of quartz together. Bright orange flashes result that can be viewed in a dimly lighted room. For either demonstration eye protection should be worn because of the likelihood of flying particles.

Figure 19A shows the photoluminescence of methyl salicylate, the flavoring of a Wint-o-Green Life Saver®, which is otherwise pressed sucrose. Figure 19B is the TL from grinding pure sucrose. The UV-rich band structure can be clearly associated with the emission of nitrogen gas. Figure 19C shows the TL of a Wint-o-Green Life Saver®. The emission clearly is the nitrogen band structure augmented by a broad visible emission. This broad spectrum is the same as the PL of methyl salicylate, the flavoring.

There can be an enormous amount of chemistry going on in FE. Extensive bond breaking can occur. For example, carbon dioxide is formed during the grinding of calcite. Atomic sodium emission can arise from the breaking of sodium silicate glasses.

FIGURE 19 (A) Photoluminescence and excitation spectrum of methyl salicylate, the flavoring of a Wint-o-Green Life Saver®. (B) Triboluminescence of sucrose. (C) Triboluminescence of a Wint-o-Green Life Saver®. [Adapted with permission from Linda Sweeting. Copyright 1998.]

Fracto-emission is a valuable tool for studying the physical and chemical processes that occur before, during, and following fracture. It allows examination of failure mechanisms including fatigue, microcrack initiation, and growth. Such tools are invaluable in materials research for studying the failure mechanisms of such materials as glasses, metal ceramics, polymers, and composites. In addition, FE is of interest to geologists and geophysicists for detecting fractures in mines and along geological faults.

Crystalloluminescence is emission during the growth of crystals and may arise from cleavage. It is, thus, a form of TL from internally generated breakage.

ACKNOWLEDGMENT

We gratefully acknowledge support of the National Science Foundation (CHE 82-06279, 86-00012, and 97-26999) and the donors of the Petroleum Research Fund, administered by the American Chemical Society.

SEE ALSO THE FOLLOWING ARTICLES

ENERGY TRANSFER, INTRAMOLECULAR • LASERS • LASERS, DYE • MOLECULAR MICROWAVE SPECTROSCOPY • POTENTIAL ENERGY SURFACES • SONOLUMINESCENCE AND SONOCHEMISTRY • THERMOLUMINESCENCE DATING

BIBLIOGRAPHY

Adam, W., and Cilento, G., eds. (1982). “Chemical and Biological Generation of Excited States,” Academic Press, New York.

Alkemade, C. Th. J., Hollander, Th., Snellman, W., Seeger, P. J., and ter Harr, D., eds. (1982). Metal vapours in flames, Int. Ser. Nat. Philos. 103, Pergamon Press, New York.

Ambrose, W. P., Goodwin, P. M., Jett, J. H., Van Orden, A., Werner, J. H., and Keller, R. (1999). Single molecule fluorescence spectroscopy at ambient temperature. Chem. Rev. 99, 2929–2956.

Cundall, R. B., and Dale, R. E. (1983). “Time-Resolved Fluorescence Spectroscopy in Biochemistry and Biology,” Plenum Press, New York.

Demas, J. N. (1983). “Excited State Lifetime Measurements,” Academic Press, New York.

De Silva, A. P., Gunaratne, H. Q., Gunnlaugsson, T., Huxley, A. J. M., McCoy, C. P., Rademacher, J. T., and Rice, T. E. (1997). Signaling recognition events with fluorescent sensors and switches. Chem. Rev. 97, 1515–1566.

Dickinson, J. T. “Fracto-Emission, Encyclopedia of Materials,” Elsevier, in press.

Dunn, R. C. (1999). Near-field scanning optical microscopy. Chem. Rev. 99, 2891–2927.

Fouassier, J. P., and Rabek, J. F., eds. (1990). “Lasers in Polymer Science and Technology: Applications,” CRC Press, Boca Raton, FL.

Harvey, E. N. (1957). “A History of Luminescence,” American Philosophical Society, Philadelphia.

Herman, B. (1998). “Fluorescence Microscopy,” 2nd ed., Springer-Verlag, New York.

Horowitz, Y. S., ed. (1984). “Thermoluminescence and Thermoluminescent Dosimetry,” Vols. 1–3, CRC Press, Boca Raton, FL.

Lakowicz, J. R. (1999). “Principles of Fluorescence Spectroscopy,” 2nd ed., Plenum Press, New York.

Pawley, J. (1995). “Handbook of Biological Confocal Microscopy,” 2nd ed., Plenum Press, New York.

Periasamy, A. (2001). “Methods in Cellular Imaging,” Oxford Univ. Press, New York, in preparation.

Schaefer, F. P. (1990). “Dye Lasers,” 3rd ed., Springer-Verlag, New York.

Schulman, S. G., ed. (1988). “Molecular Luminescence Spectroscopy,” Vol. 2, Wiley, New York.

Sweeting, L. M., Cashel, M. L., Dott, M. L., Gingerich, J. M., Guido, J. L., Kling, J. A., Pippin III, R. F., Rosenblatt, M. M., Rutter, A. M., and Spence, R. A. Spectroscopy and mechanism in triboluminescence, Mol. Cryst. Liq. Cryst. 211, 389.

Sze, S. M. (1981). “Physics of Semiconductor Devices,” 2nd ed., John Wiley & Sons, New York.

Weber, M. J. (1982). “CRC Handbook of Laser Science and Technology,” Vols. 1–3, CRC Press, Boca Raton, FL.

Yen, W. M., and Selzer, P. M., eds. (1981). Laser spectroscopy of solids. Top. Appl. Phys. 49.

P1: GSS/GJK P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN008N-393 June 29, 2001 15:46

Magnetic Materials
R. C. O’Handley
Massachusetts Institute of Technology

I. Interactions and Energies in Magnetic Materials
II. Magnetic Materials, Fundamental Properties
III. Technical Properties
IV. Related Magnetic Phenomena and Applications

GLOSSARY

Antiferromagnetic material A magnetic material in which the exchange interaction favors antiparallel alignment of neighboring spins. Such interactions are limited to certain crystal structures so that frustration does not occur as in a three-membered set of nearest neighbors. Antiferromagnetic interactions often occur for smaller interatomic spacings relative to d-orbital diameter.

Coercivity, coercive field The negative magnetic field required to reduce the magnetization (intrinsic coercivity, $_iH_c$) or flux density ($_BH_c$) to zero after saturation in a positive field. There is no significant difference between $_iH_c$ and $_BH_c$ in soft magnetic materials.

Curie temperature The temperature above which long-range magnetic order vanishes. A ferromagnetic or antiferromagnetic material becomes paramagnetic above its Curie or Néel temperature, respectively.

Demagnetizing factor The numerical factor that indicates the extent to which a given state of magnetization in a bounded magnetic material produces an internal field that opposes the state of magnetization. The demagnetizing factor is a diagonal tensor for an ellipsoid of revolution.

Domain A region in a magnetic material in which all magnetic moments have essentially the same orientation. The magnetization within the domain is the saturation magnetization of the material.

Domain wall The surface across which the direction of magnetization rotates from that in one domain to the adjacent one. If the magnetization rotates about the normal to the domain wall, the wall is called a Bloch wall. If the magnetization rotates so that it has a component parallel to the wall normal, it is called a Néel wall.

Exchange energy The energy associated with the relative alignment of neighboring spins in a magnetic material. In a ferromagnet, exchange energy is more positive if the angle between the directions of neighboring spins is increased.

Exchange interaction The preference in a magnetic material for neighboring spins to align parallel or antiparallel to each other. This interaction is quantum mechanical in origin.

Ferromagnetic material A magnetic material in which the exchange interaction favors parallel alignment of neighboring spins.

Fundamental magnetic properties Those properties of a magnetic material that are largely independent of microstructure, defects, and processing conditions. Examples are the saturation magnetization and Curie temperature. Fundamental properties reflect the electronic structure that in turn is a consequence of chemistry and crystallography.

Hard magnetic material See permanent magnet.

Magnetic anisotropy The preference for the magnetization vector in a material to point along one or more specific axes. The preferred direction(s) of magnetization can be governed by sample shape (magnetostatic energy), crystal structure (magnetocrystalline anisotropy energy), strain in the structure (magnetoelastic energy), as well as surface or interfacial interactions (surface or Néel energy).

Magnetic glasses Noncrystalline or amorphous materials that exhibit strong or useful magnetic properties. Magnetic glasses can be either oxide glasses or metallic glasses (amorphous magnetic alloys such as Fe40Ni40B20).

Magnetization The process by which a magnetic material is magnetized, generally either by domain wall motion or by rotation of the magnetization within the domains. The term magnetization is also used as a noun to describe the state of a material in terms of the vector sum of its magnetic moments per unit volume.

Magnetoresistance The change in resistance of a material depending upon its state of magnetization. Like magnetostriction, it can have isotropic components that show up on heating through the Curie temperature as well as anisotropic components that vary with the relative direction of current and magnetization.

Magnetostatic energy The energy of a magnetic moment in a field caused by other moments. This energy, sometimes referred to as dipole energy, is particularly strong and positive inside a material magnetized normal to a thin dimension.

Magnetostriction The strain associated with the state of magnetization of a material. The isotropic part of this strain, which decreases sharply on heating through the Curie temperature, is called the volume magnetostriction. The anisotropic part of the strain, called the Joule magnetostriction, is a function of the direction of magnetization relative to the crystal axes.

Nanomagnetic material A material with variations in magnetic properties over a length scale of a few nanometers to hundreds of nanometers. Nanomagnetic materials may be manmade by thin film deposition and/or lithographic techniques or they may be natural consequences of thermodynamics and kinetics, such as the fine microstructures that can result from heat-treating certain amorphous magnetic alloys.

Permanent magnet A magnetic material which retains a large magnetization upon removal of an applied field after saturation. Further, permanent magnets are characterized by large coercivities so that the remanent magnetization is reduced only by a very large field. Materials with coercivities in the range of about 25–500 Oe (2–40 kA/m) are considered to be semihard magnetic materials. Greater and smaller coercivities generally characterize hard and soft magnetic materials, respectively.

Remanence The magnetization or flux density that remains in a sample at zero field after being exposed to a specific field.

Soft magnetic materials Materials that are relatively easily magnetized and demagnetized. They are typically characterized by high relative permeabilities and low coercivities (from a few tens of Oe down to as low as milli-Oe in some cases).

Superparamagnetism A type of magnetic behavior characterized by zero remanence and zero coercivity. It is observed in ferromagnetic particles or clusters of atoms that are so small that thermal energy overrides the tendency of the magnetization to lie in a particular direction due to crystallography or shape. Specifically, $K_uV < k_BT$.

Technical magnetic properties Those properties of magnetic materials that depend strongly on processing and microstructure. Examples are coercivity, remanence, and permeability.

Zeeman energy The potential energy of a magnetic moment in a field.

A MATERIAL is said to be magnetic when it possesses atomic-scale magnetic moments that show long-range ordering (ferromagnets, antiferromagnets, and ferrimagnets) below a Curie temperature. In a ferromagnetic material, all magnetic moments order parallel to each other. In antiferromagnetic and ferrimagnetic materials the moments on one crystallographic site order antiparallel to those on another, crystallographically different site. In antiferromagnets, the atomic moments on the two types of sites are equal and no net moment results; in ferrimagnets, the atomic moments are not equal and a net moment results. A material might also be said to be magnetic if its atomic-scale magnetic moments show no spontaneous ordering but do respond to an applied magnetic field with an increase in magnetic moment density (paramagnetic materials). The class of magnetic materials could as well include diamagnetic materials, which need not have atomic-scale magnetic moments but do respond to an applied field by producing a weak magnetization directed opposite the applied field. Given this broad definition, it can be said that all materials are magnetic.

A more restrictive definition of magnetic materials might start with those that show ferromagnetic ordering at room temperature, that is, they have Curie temperatures which are greater than room temperature. Of all the elements, only four, Fe, Co, Ni, and Gd, are ferromagnetic at room temperature. Several other elements show antiferromagnetism (e.g., Cr and α-Mn) or other forms of magnetic ordering such as helical or spiral moment arrangements (Tb, Sm, Ho, etc.). Yet, starting with these few elements, the class of ferromagnetic, antiferromagnetic, and ferrimagnetic alloys and compounds expands to include as significant components about two-thirds of the elements of the periodic table. At least three materials, Cu2MnAl, ZrZn, and InSb, are ferromagnetic (the latter two only at cryogenic temperatures), although they contain none of the above-mentioned strongly ferromagnetic elements. This article surveys elements, alloys, and compounds that are ferromagnetic or ferrimagnetic at room temperature.

Magnetic materials are said to be soft if they respond to relatively weak external fields by changing their state of magnetization. This change in magnetization is accompanied by a change in force on the material if the external field is inhomogeneous. Magnetic materials are said to be hard when they retain a state of net magnetization in the presence of a significant opposing field. Hard magnetic materials produce a field that can attract other soft or hard magnetic materials. Half a century ago, most of the interest in magnetic materials was associated either with these forces (motors, electromagnetic actuators) or with their ability to concentrate magnetic flux (shielding) or enhance flux change (inductors and transformers). Today, engineering interest in magnetic materials has expanded to include their ability to store information at high density in a nonvolatile way (the information is retained when the power to the storage device is off), as well as their varied electrical and spin transport properties.

While exciting new magnetic compositions are still being discovered, the range of properties exhibited by existing magnetic compositions is now often expanded and/or tailored to specific needs by making them in thin film, multilayer, or nanoparticle form. In these reduced-dimension structures, novel magnetic properties can be obtained because of the altered atomic and chemical structure near an interface and by arranging magnetic and sometimes nonmagnetic components at small length scales so that new magnetic interactions and transport effects can be engineered. Thus a modern overview of magnetic materials must include materials whose properties vary not only with composition but also with bond type and/or crystal structure. Reference should also be made to the exciting new developments achieved by rendering old compositions in new structures.

This article begins with an overview of some of the important manifestations of magnetism in materials (magnetic domains, magnetic anisotropy, magnetostriction) and outlines the underlying science that explains the properties of magnetic materials. Magnetic materials can be classified either by their functional properties (e.g., magnetically soft or hard) or by the nature of their bonding and structure (oxides, alloys, intermetallic compounds). Table I summarizes some of the important crystal structures and magnetic materials systems that occur in a matrix of functional classes and bonding types. In this article, the major magnetic functional classes are described and material examples are given. Also, an attempt is made to include descriptions of the major bonding and structure types of magnetic materials. Magnetic effects at surfaces, in thin films, and fine particles are also described.

The magnetic fields B and H are related in MKS units by $\mathbf{B} = \mu_0(\mathbf{H} + \mathbf{M})$, where H is the applied field due to macroscopic current densities, J = I/area. Ampère’s law relates the H field to the current:

$$\nabla \times \mathbf{H} = \mathbf{J} \quad \text{or} \quad \oint \mathbf{H} \cdot d\mathbf{l} = I, \tag{1}$$

thus H has units A/m. M is the magnetic moment density or magnetization in the material, $M = N\langle\mu_m\rangle/V$, where $\langle\mu_m\rangle$ is the average magnetic moment per atom or molecule in the system and N/V is the number of such entities per unit volume. The magnetization, M, while having the same MKS units as the H field, is the field due to microscopic currents. B is the flux density, φ/A (with units of tesla), whose time rate of change induces a voltage by Faraday’s law:

$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \quad \text{or} \quad \oint \mathbf{E} \cdot d\mathbf{l} = V = -\frac{\partial \phi}{\partial t}. \tag{2}$$

The microscopic currents that give rise to the magnetization are associated with the spin and orbital angular momenta of the electrons.
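The definition $M = N\langle\mu_m\rangle/V$ can be checked against a familiar case. The sketch below, in Python, estimates the low-temperature magnetization of BCC iron from the magneton number of 2.22 Bohr magnetons per atom listed in Table II. The lattice constant of 2.866 Å and the two atoms per BCC cell are assumptions brought in from outside this article, and the function name is illustrative.

```python
import math

MU_B = 9.2740100783e-24   # Bohr magneton, J/T
MU_0 = 4e-7 * math.pi     # permeability of free space, T m/A


def magnetization(n_bohr, atoms_per_cell, lattice_const_m):
    """M = N <mu_m> / V in A/m for a cubic cell of edge lattice_const_m."""
    number_density = atoms_per_cell / lattice_const_m ** 3
    return n_bohr * MU_B * number_density


# Assumed inputs: BCC cell (2 atoms) with a = 2.866 angstrom; nB from Table II.
m_fe = magnetization(2.22, 2, 2.866e-10)
print(f"M = {m_fe:.3e} A/m, mu0*M = {MU_0 * m_fe:.2f} T")
```

With these inputs the estimate comes out close to the 0 K value of $\mu_0 M_s$ for Fe in Table II, as it should if the tabulated magneton number and magnetization are mutually consistent.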

The magnetic susceptibility $\chi_m$ is usually used to describe weak magnetic responses to H, as in paramagnetic and diamagnetic materials. The magnitude of $\chi_m$ is typically $\pm 10^{-4}$ to $10^{-6}$ (dimensionless in MKS units). Diamagnetism is not a matter of aligning preexisting atomic magnetic moments. Rather, it is an electronic response to an applied field that creates a new component of orbital angular momentum and thus a magnetic moment. The diamagnetic response is always negative, as it can be traced classically to Faraday’s law or Lenz’s law. A material with a paramagnetic (diamagnetic) susceptibility of $\chi = +(-)10^{-5}$ would show a magnetization M = 10 A/m in a field of $H = 10^6$ A/m (applied field $B = \mu_0 H \approx 1.25$ T). A magnetization of 10 A/m corresponds to a flux density $B = \mu_0 M$ of $1.25 \times 10^{-5}$ T, which is five orders

TABLE I Important Examples of Magnetically Soft and Hard Materials Can Be Found in Each of the Major Bonding Types: Oxides, Metallic, and Covalent

Crystal structure and examples

Bond type Soft Hard

Ionic (oxides) Spinel, TO·Fe2O3 (T = Mn, Ni. . .) Spinel, CoO·Fe2O3

Defect spinel, γ -Fe2O3 Hexaferrites, BaO·6Fe2O3

Metallic (alloys)

3d BCC Fe, FeCo, FCC NiFe, NiCo HCP Co-base

4 f HCP R metals, R-T alloys

Intermetallic compounds

Ordered alloys and Ni3Fe (CuAu) CoPt
covalent (p-d) Heusler (L21) Cu2MnAl Co5Sm, Co17Sm2, Fe14Nd2B

Borides: Fe2B, Co3B Laves TbFe24, DyCo2 MnAl

Amorphous alloys

(metallic + p-d covalent) Fe80B20, Co70Fe5Si15B10 Co80Gd20 TbFe2, DyCo2

T = transition metal; R = rare-earth metal.

of magnitude less than the spontaneous magnetization of the ferromagnetic materials listed in Table I.
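The numbers in the susceptibility example above are easy to reproduce. A minimal sketch in Python, using only the relations $M = \chi H$ and $B = \mu_0 H$ from the text; the function name is illustrative.

```python
import math

MU_0 = 4e-7 * math.pi  # permeability of free space, T m/A


def weak_response(chi, h_a_per_m):
    """Return (M, mu0*H, mu0*M) for a linear material with susceptibility chi."""
    m = chi * h_a_per_m
    b_applied = MU_0 * h_a_per_m
    b_from_m = MU_0 * m
    return m, b_applied, b_from_m


m, b_app, b_m = weak_response(1e-5, 1e6)
print(f"M = {m:.1f} A/m, mu0*H = {b_app:.3f} T, mu0*M = {b_m:.3e} T")
```

The ratio of the two flux densities is just the susceptibility, which is why the contribution of the material is five orders of magnitude below the applied field and, equally, below the roughly 1 T spontaneous $\mu_0 M_s$ of a typical ferromagnet.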

The reason for the weak magnetic response of paramagnetic materials to an external field is that thermal energy $k_BT$ overwhelms the energy that favors alignment of a paramagnetic moment with the field, $g\mu_B B$. The field and temperature dependence of the magnetization in paramagnets (or ferromagnets, see below) is described by the Brillouin function, $B_J(x)$:

$$\frac{M(H,T)}{M(\infty,0)} = B_J(x) = \frac{2J+1}{2J}\coth\left(\frac{2J+1}{2J}\,x\right) - \frac{1}{2J}\coth\left(\frac{x}{2J}\right), \tag{3}$$

as shown in Fig. 1. In Eq. (3), J is the total (spin plus orbital) angular momentum quantum number and $x = g\mu_0\mu_B J H/k_BT$ expresses the energy of the magnetic moment in the H field relative to the thermal energy $k_BT$.
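Equation (3) is straightforward to evaluate numerically. The sketch below, in Python, implements $B_J(x)$ and compares it with its two familiar limits: $\tanh x$ for J = 1/2 and the classical Langevin function $L(x) = \coth x - 1/x$ for very large J. The small-x branch is a numerical safeguard added here, using the leading term $(J+1)x/3J$ of the series.

```python
import math


def coth(y):
    """Hyperbolic cotangent for nonzero y."""
    return 1.0 / math.tanh(y)


def brillouin(j, x):
    """Brillouin function B_J(x) as written in Eq. (3)."""
    if abs(x) < 1e-4:
        # Leading term of the small-x series avoids cancellation of 1/x terms.
        return (j + 1.0) * x / (3.0 * j)
    a = (2.0 * j + 1.0) / (2.0 * j)
    b = 1.0 / (2.0 * j)
    return a * coth(a * x) - b * coth(b * x)


def langevin(x):
    """Classical (infinite-spin) limit L(x) = coth(x) - 1/x."""
    return coth(x) - 1.0 / x


for x in (0.5, 1.0, 2.0, 5.0):
    print(f"x={x}: B_1/2={brillouin(0.5, x):.4f}  "
          f"B_7/2={brillouin(3.5, x):.4f}  L={langevin(x):.4f}")
```

For a given x the curves for larger J lie lower, approaching the Langevin function from above, which matches the ordering of the curves described for Fig. 1.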

Ferromagnetic materials are characterized by a long-range ordering of their atomic moments, even in the absence of an external field. The observed field dependence of the magnetization extrapolates from its high-field values to a nonzero value, called the spontaneous magnetization (Fig. 2a). The spontaneous, long-range magnetization of a ferromagnet is observed to vanish above an ordering temperature called the Curie temperature $T_C$ (Fig. 2b). Experimental curves of M versus H do not always show true saturation. Thermodynamics suggests that M(H) approaches saturation like $1 - H^{-1}$. For polycrystalline systems, M(H) approaches saturation like $1 - H^{-2}$.

In a ferromagnetic material, the H field in Eq. (3) includes not only the applied field but also the strong internal field called the Weiss molecular field, λM: $B = \mu_0(H + \lambda M)$. The Weiss molecular field expresses the tendency for long-range magnetic order due to the exchange interactions between local moments. The constant λ is called the molecular field constant. Because $\lambda M \gg H$, the energy of the magnetization in the exchange field, namely $g\mu_B J\mu_0\lambda M$, is comparable in magnitude to $k_BT$ and magnetic ordering occurs spontaneously in the absence of an applied field. Thus, for a ferromagnet, Eq. (3) becomes a transcendental equation that can be solved graphically or numerically to describe the results in Fig. 2b. Figure 2b also compares the experimental data for Ni with the form derived from Eq. (3) [L(x) for J = ∞].
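The transcendental equation mentioned above can be solved numerically in a few lines. The sketch below, in Python, works in reduced variables $m = M(T)/M(0)$ and $t = T/T_C$ with zero applied field. It relies on the standard mean-field result that the argument of the Brillouin function then becomes $x = [3J/(J+1)]\,m/t$; that reduction is not written out in this article and is stated here as an assumption, and the function names are illustrative. Bisection is used as one simple numerical method among several.

```python
import math


def brillouin(j, x):
    """Brillouin function B_J(x) as written in Eq. (3)."""
    if abs(x) < 1e-4:
        return (j + 1.0) * x / (3.0 * j)  # leading small-x term
    a = (2.0 * j + 1.0) / (2.0 * j)
    b = 1.0 / (2.0 * j)
    return a / math.tanh(a * x) - b / math.tanh(b * x)


def reduced_magnetization(t, j=0.5, tol=1e-12):
    """Nonzero solution of m = B_J(3J/(J+1) * m/t) for t = T/T_C, by bisection."""
    if t >= 1.0:
        return 0.0          # no spontaneous magnetization above T_C
    if t <= 0.0:
        return 1.0          # full saturation at absolute zero
    scale = 3.0 * j / (j + 1.0)

    def mismatch(m):
        return m - brillouin(j, scale * m / t)

    lo, hi = 1e-9, 1.0      # mismatch < 0 near zero, >= 0 at m = 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mismatch(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


for t in (0.2, 0.5, 0.8, 0.95, 1.1):
    print(f"T/Tc = {t}: M/M0 = {reduced_magnetization(t):.4f}")
```

Plotting the result against t for J = 1/2 gives a curve of the kind drawn as the solid line in Fig. 2b.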

FIGURE 1 Brillouin function versus $x = \mu_m B/k_BT$ for various values of J. The infinite-spin limit is given by the classical Langevin function, L(x). [From O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

FIGURE 2 (a) Magnetization of a ferromagnetic material versus field. (b) Reduced magnetization versus reduced temperature for nickel (open data points, from Weiss and Forrer, Ann. Phys. 5, 153, 1926) and Brillouin function, $B_{1/2}(x)$ (solid line). [From O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

Saturation magnetic moments and Curie temperatures are listed in Table II for some representative ferromagnetic materials.

The relative magnetic permeability $\mu_r = \mu/\mu_0 = B/(\mu_0 H)$ is used more often than susceptibility ($\chi_m = M/H$) to describe the response of ferromagnetic materials to H. This is because ferromagnets are useful in electromagnetic devices where it is B rather than H that is important in inducing a voltage [Eq. (2)].

TABLE II Fundamental Magnetic Data for Various Magnetic Materials

                                  290 K                0 K
Substance     Structure       µ0Ms    Ms          µ0Ms    Ms          nB        TC (TN)
                              (T)     (emu/cm3)   (T)     (emu/cm3)   (µB/FU)   (K)
Fe            BCC             2.1     1707        2.2     1740        2.22      1043
Co            HCP, FCC        1.8     1440        1.82    1446        1.72      1388
Ni            FCC             0.61    485         0.64    510         0.606     627
Ni80Fe20      FCC             1.0     800         1.17    930         1.0       —
Gd            HCP             —       —           2.6     2060        7.63      292
Dy            HCP             —       —           3.67    2920        10.2      88
Ni2MnGa       Heusler         0.6     480         —       —           0.6       373
CrO2          —               0.65    515         —       —           2.03      386
MnOFe2O3      Spinel          0.51    410         —       —           5.0       573
FeOFe2O3      Spinel          0.6     480         —       —           4.1       858
Y3Fe5O12      Garnet (YIG)    0.16    130         0.25    200         5.0       560
Nd2Fe14B      Tetragonal      1.6     1280        —       —           —         —
a-Fe80B20     Amorphous       1.6     1260        1.9     1480        2.0       650

The quantity nB = Ms/(µB Nv), evaluated at 0 K, is called the magneton number, the number of Bohr magnetons per atom or per formula unit (FU) in a material.
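The two magnetization columns in Table II carry the same information in different units, so one can be checked against the other. The sketch below, in Python, converts $M_s$ in emu/cm$^3$ to $\mu_0 M_s$ in tesla using 1 emu/cm$^3$ = 10$^3$ A/m; the function name is illustrative. For example, the 0 K entry of 1446 emu/cm$^3$ for Co corresponds to about 1.82 T.

```python
import math


def emu_per_cm3_to_tesla(ms_emu_cm3):
    """Return mu0*Ms in tesla for Ms given in emu/cm^3 (1 emu/cm^3 = 1e3 A/m)."""
    ms_a_per_m = ms_emu_cm3 * 1e3
    return 4e-7 * math.pi * ms_a_per_m


# Ms values in emu/cm^3 taken from Table II.
for name, ms in (("Fe, 290 K", 1707.0), ("Co, 0 K", 1446.0), ("Ni, 290 K", 485.0)):
    print(f"{name}: {emu_per_cm3_to_tesla(ms):.2f} T")
```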

I. INTERACTIONS AND ENERGIES IN MAGNETIC MATERIALS

A. Magnetostatics

Magnetostatics refers to the consequences of the magnetic fields that appear near the surfaces of magnetized bodies. If the magnetization of a polycrystalline ferromagnetic sample measuring, for example, 1 × 1 × 5 mm is measured, the approach to saturation is different for application of the field along the long axis as opposed to the short direction of the sample (Fig. 3). A greater external field is needed to achieve the same degree of magnetization for fields applied in the short direction compared to the long direction.

FIGURE 3 Magnetization curves for a polycrystalline, ferromagnetic sample with field applied in different directions. [From O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

This shearing effect on the B-H loop is related to the magnetic poles produced at the surfaces across which the magnetization has a discontinuity in its normal component. A magnetic field, H, emanates from the north poles and terminates at the south poles. These magnetostatic fields are responsible for the effect of sample shape on the magnetization process. The shape and aspect ratio of the sample affect the strength of the magnetostatic field inside the sample and opposing M. The magnetostatic H field can be derived from a scalar magnetic potential, $\phi_m$, as $\mathbf{H} = -\nabla\phi_m$, because $\nabla \times \mathbf{H} = 0$ in the absence of macroscopic currents.

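The scalar potential results from volume as well as surface magnetic “charges” or poles, $\rho_m = \nabla \cdot \mathbf{M}$ and $\sigma_m = \mathbf{M} \cdot \mathbf{n}$, respectively, according to the relation

$$\phi_m(\mathbf{x}) = -\frac{1}{4\pi}\iiint \frac{\nabla' \cdot \mathbf{M}(\mathbf{x}')}{|\mathbf{x} - \mathbf{x}'|}\,d^3x' + \frac{1}{4\pi}\iint \frac{\mathbf{M}(\mathbf{x}') \cdot \mathbf{n}}{|\mathbf{x} - \mathbf{x}'|}\,d^2x'. \tag{4}$$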

The consequences of this result are readily seen by theexact two-dimensional solution for the field due to a lineof magnetic poles. If the line of magnetic charge subtendsan angle θ from the observation point and the distanceto the ends of the pole distribution are r1 and r2 then thefield components parallel and perpendicular, respectively,to the linear charge distribution are, in SI units:

$$H_\parallel = \frac{\sigma}{2\pi}\ln\!\left(\frac{r_2}{r_1}\right), \qquad H_\perp = \frac{\sigma}{2\pi}\,\theta \qquad (5)$$
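The two components in Eq. (5) can be checked numerically. The sketch below is only an illustration under stated assumptions: the line of poles is placed on y = 0 between x = a and x = b, the function names `strip_field` and `strip_field_numeric` are hypothetical, and the sign chosen for the parallel component is a convention (Eq. (5) fixes only its magnitude).

```python
import math

def strip_field(sigma, a, b, x0, y0):
    """Closed-form field of a uniform line of poles on y = 0, from x = a to x = b.

    sigma is the pole density (A/m in SI); (x0, y0) is the observation point.
    Returns (h_x, h_y): components parallel and perpendicular to the line.
    """
    if y0 == 0:
        raise ValueError("observation point must lie off the line of poles")
    r1 = math.hypot(x0 - a, y0)  # distance to the end at x = a
    r2 = math.hypot(x0 - b, y0)  # distance to the end at x = b
    # Signed angle subtended by the line of poles at the observation point.
    theta = math.atan((b - x0) / y0) - math.atan((a - x0) / y0)
    # Magnitude is (sigma / 2 pi) ln(r2 / r1); with this sign h_x points away
    # from the larger share of a positive pole distribution.
    h_x = sigma / (2 * math.pi) * math.log(r1 / r2)
    h_y = sigma / (2 * math.pi) * theta
    return h_x, h_y

def strip_field_numeric(sigma, a, b, x0, y0, n=20000):
    """Midpoint-rule sum of line elements, each giving sigma*dx / (2 pi r)."""
    dx = (b - a) / n
    h_x = h_y = 0.0
    for k in range(n):
        xp = a + (k + 0.5) * dx
        rx, ry = x0 - xp, y0
        r_sq = rx * rx + ry * ry
        h_x += sigma * dx * rx / (2 * math.pi * r_sq)
        h_y += sigma * dx * ry / (2 * math.pi * r_sq)
    return h_x, h_y
```

Just above the middle of the line (θ close to π) the perpendicular component approaches σ/2, the familiar field of a charged sheet.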

Figure 4 provides an excellent pedagogical summary of the important issues in magnetostatics. This figure plots the results of micromagnetic calculations of the field distribution inside and around a uniformly magnetized bar. Only the upper right quadrant of the bar, of infinite extent out of the paper, is shown. Note that the surface charges are sources for the H field inside and outside the sample. An equivalent current through the surface windings can be considered the source of the B field inside and outside the sample. Further, outside the sample the B and H fields are parallel to each other: $\mathbf{B} = \mu_0 \mathbf{H}$ (SI) and $\mathbf{B} = \mathbf{H}$ (c.g.s.). Note also that the boundary conditions on B and H are properly satisfied in these calculated fields. M is proportional to B − H, which in this case is held uniform inside the bar. The H field inside the magnetized sample in Fig. 4 opposes the state of magnetization in which the sample is held. In a sample that is not so constrained, this internal field would tend to demagnetize the sample.

The internal field due to surface magnetic poles is called the demagnetizing field $\mathbf{H}_d$. This field is proportional to the magnetization of the sample, $\mathbf{H}_d \propto -\mathbf{M}$ [Eq. (5)]. In general, $\mathbf{H}_d$ is a strong function of position inside a uniformly magnetized body. The exceptions are ellipsoids of revolution, in which case

Hd = −N M (6)

where N is called the demagnetizing factor. It is a second-rank tensor with unit trace,

$$\sum_{i=1}^{3} N_i = 1 \qquad (7)$$

where i indicates the direction of magnetization. The values of $N_x$, $N_y$, $N_z$ for an infinite sheet in the x–y plane are $N_x = N_y = 0$, $N_z = 1$. For a rod infinite along z, $N_x = N_y = 0.5$, $N_z = 0$. For a sphere, $N_x = N_y = N_z = 1/3$.

FIGURE 4 B and H fields in and around a uniformly magnetized bar of length L and width W (infinite extent out of paper). At the left is sketched the magnetized bar and its surface poles (or the equivalent surface current around the bar), which are the sources for the H and B fields, respectively. Only one quarter of the bar is sketched at right because of the symmetry of the situation. [From Bertram, H. N. (1994). “Theory of Magnetic Recording,” Cambridge Univ. Press, Cambridge.]

FIGURE 5 Crystal structures for Fe, Ni, and Co showing easy and hard magnetization directions with respective magnetization curves below. [From O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]
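The limiting values above (sheet, rod, sphere) bracket the general ellipsoid of revolution. As a minimal sketch, the code below uses the standard closed-form expression for a prolate spheroid, which is not given in this article and is brought in here as an assumption; the function names and the aspect-ratio symbol `m` are illustrative. It also applies Eq. (6) to obtain the internal field that shears the loop.

```python
import math

def prolate_demag_factors(m):
    """Demagnetizing factors (Nx, Ny, Nz) of a prolate ellipsoid of revolution.

    m = (long axis) / (short axis) >= 1, with the long axis along z.
    Uses the standard closed-form result for a prolate spheroid (an assumption
    here; the article quotes only the sheet, rod, and sphere limits).
    """
    if m < 1.0:
        raise ValueError("aspect ratio m must be >= 1 for a prolate spheroid")
    if abs(m - 1.0) < 1e-6:
        return (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)  # sphere limit
    root = math.sqrt(m * m - 1.0)
    n_z = (m / root * math.log(m + root) - 1.0) / (m * m - 1.0)
    n_xy = 0.5 * (1.0 - n_z)  # unit trace, Eq. (7)
    return (n_xy, n_xy, n_z)

def internal_field(h_applied, n, magnetization):
    """Field inside the sample: applied field plus Hd = -N M from Eq. (6)."""
    return h_applied - n * magnetization
```

For a needle-like sample the factor along the long axis falls toward zero and the two transverse factors approach 0.5, matching the infinite-rod values quoted above.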

B. Magnetic and Magnetoelastic Anisotropy

The shape of a material is an extrinsic factor affecting the form of the M–H curve for a field applied along different directions in a sample. Several intrinsic factors can also give rise to anisotropy in magnetic properties. These are (1) magnetocrystalline anisotropy (the preference for M to lie along certain high-symmetry crystallographic directions), (2) magnetoelastic anisotropy (the preference for M to lie in a direction dictated by the symmetry of an imposed anisotropic deformation of the material), and (3) field-induced or directed-pair anisotropy (the preference for magnetization along an axis determined by a field present during prior heat treatment).

1. Magnetocrystalline Anisotropy

The physical properties of a material must exhibit a symmetry no lower than that of the crystal structure. The symmetry of the crystal structure affects the preferred direction of magnetization at the local level. A magnetic atom senses the crystal symmetry through the Coulomb electric field of its nearest neighbors. This crystal field can be expanded in harmonic functions that reflect the symmetry about a given site. The crystal field symmetry is correlated with the orbital symmetry of the bonding valence electrons, which are characterized by their orbital angular momentum, L. The spin, S, of the magnetic atom couples to the electron’s orbital motion via quantum mechanical spin-orbit interactions, $\xi\,\mathbf{L} \cdot \mathbf{S}$. Hence, the spin is coupled to the symmetry of the crystal field. Thus, for strong magnetic anisotropy, a low-symmetry crystal field and strong spin-orbit coupling are required.

Figure 5 depicts the hexagonal-close-packed crystal structure of cobalt and shows the M–H curves for fields applied in a “hard,” base-plane direction [1000] as well as along the “easy” c axis, [0001]. The uniaxial anisotropy of HCP cobalt or other uniaxial magnetic materials can be described phenomenologically by the anisotropic free energy density

$$f_a^{\mathrm{hex}} = \sum_{l=0} K_{ul}\,\sin^{2l}\theta. \qquad (8)$$

The field energy density, $-\mu_0 \mathbf{M} \cdot \mathbf{H}$, must be added to Eq. (8) to give the total free energy

$$f_{\mathrm{Tot}} = -\mu_0 M_s H \sin\theta + K_{u1}\sin^2\theta \qquad (9)$$

where θ is the angle between $\mathbf{M}_s$ and the c axis. The zero-torque condition, namely $-\partial f/\partial\theta = 0$, gives for the first-order, uniaxial hard-axis (base plane) magnetization process:

$$M = M_s \sin\theta = \frac{\mu_0 M_s^2 H}{2K_{u1}} \qquad (10)$$

Fitting this form to the cobalt magnetization data gives $K_{u1} = 4.1 \times 10^5$ J/m$^3$. Inclusion of the second-order anisotropy, $K_{u2}\sin^4\theta$, describes the negative curvature of the hard-axis M–H curve near the approach to saturation. Fitting experimental curves gives $K_{u2} \approx 1.5 \times 10^5$ J/m$^3$.
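The hard-axis process described by Eqs. (9) and (10), extended by the second-order term, can be sketched numerically. With both anisotropy terms the zero-torque condition becomes $\mu_0 M_s H = 2K_{u1} m + 4K_{u2} m^3$ with $m = \sin\theta$. The code below solves it by bisection; it assumes $K_{u1} > 0$, $K_{u2} \ge 0$, and $H \ge 0$, and the function names are illustrative. Any value of $M_s$ passed in is the caller's assumption, since this passage does not quote one.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, T m/A

def hard_axis_m(h, ms, ku1, ku2=0.0):
    """Reduced magnetization m = M/Ms = sin(theta) for a base-plane field h (A/m).

    Solves mu0*Ms*h = 2*Ku1*m + 4*Ku2*m**3 on 0 <= m <= 1 by bisection.
    Assumes Ku1 > 0, Ku2 >= 0, h >= 0, so the right-hand side is monotonic.
    """
    drive = MU0 * ms * h

    def restoring(m):
        return 2.0 * ku1 * m + 4.0 * ku2 * m ** 3

    if drive >= restoring(1.0):
        return 1.0  # saturated along the hard direction
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if restoring(mid) < drive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def saturation_field(ms, ku1, ku2=0.0):
    """Field (A/m) at which m first reaches 1 along the hard direction."""
    return (2.0 * ku1 + 4.0 * ku2) / (MU0 * ms)
```

With `ku2 = 0` the result reduces to the linear law of Eq. (10); a positive `ku2` lowers m at a given field and pushes saturation to a higher field, which is the negative curvature mentioned above.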


FIGURE 6 Reduced magnetization versus reduced field applied at an angle $\theta_0$ to the easy axis. The linear m-h curve is for $\theta_0 = 90°$, and the other curves of increasing remanence are for $\theta_0 = 80°$, $60°$, and $30°$. Possible magnetization distributions are shown as inserts for nucleation-inhibited, single-domain particles. [From O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

For fields applied at arbitrary angles $\theta_0$ relative to the easy axis of a uniaxial material, the M–H loops vary in a richer way between the square easy-axis ($\theta_0 = 0$) limit and the hard-axis ($\theta_0 = 90°$) limit (Fig. 6).
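Two features of such loops can be written in closed form if one assumes coherent rotation of a single-domain particle with first-order uniaxial anisotropy (the Stoner–Wohlfarth picture, which the text does not name and which is used here as an assumption). The sketch gives the reduced remanence and the reduced switching field, with fields normalized to the anisotropy field; the function names are illustrative.

```python
import math

def reduced_remanence(theta0_deg):
    """m = M/Ms along the field direction at zero field.

    At zero field M lies along the easy axis, so its projection on a field
    axis tilted by theta0 is |cos(theta0)|.
    """
    return abs(math.cos(math.radians(theta0_deg)))

def reduced_switching_field(theta0_deg):
    """Switching field divided by the anisotropy field for coherent rotation.

    Standard astroid result: h_sw = (cos^(2/3) + sin^(2/3))^(-3/2).
    """
    t = math.radians(theta0_deg)
    c = abs(math.cos(t)) ** (2.0 / 3.0)
    s = abs(math.sin(t)) ** (2.0 / 3.0)
    return (c + s) ** -1.5
```

The remanence grows as the field direction approaches the easy axis, in line with the ordering of the 80°, 60°, and 30° curves described for Fig. 6.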

The cubic crystal structures of α-Fe (BCC) and γ-Ni (FCC) require inclusion of φ dependence in the harmonic expansion of the anisotropy energy:

$$f_a^{\mathrm{cubic}} = K_0^c + K_1^c\left(\alpha_1^2\alpha_2^2 + \mathrm{cycl}\right) + K_2^c\,\alpha_1^2\alpha_2^2\alpha_3^2 + \cdots \qquad (11)$$

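Here the $\alpha_i$ are the direction cosines of the magnetization relative to the cube axes. Values of $K_1^{\alpha}$ and $K_2^{\alpha}$ (α = u or c) are listed in Table III for some common magnetic materials. Positive (negative) $K_1$ implies ⟨100⟩ (⟨111⟩) easy axes if $K_2 = 0$.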
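The easy-axis rule can be checked directly by evaluating Eq. (11) along the three principal directions. This is a minimal sketch with illustrative function names; the constants used in any call are whatever the caller supplies (for example, the room-temperature values of Table III).

```python
import math

def cubic_anisotropy(direction, k1, k2=0.0, k0=0.0):
    """Evaluate Eq. (11) for magnetization along `direction` (any length)."""
    norm = math.sqrt(sum(c * c for c in direction))
    a1, a2, a3 = (c / norm for c in direction)  # direction cosines
    pair_sum = a1 * a1 * a2 * a2 + a2 * a2 * a3 * a3 + a3 * a3 * a1 * a1
    triple = (a1 * a2 * a3) ** 2
    return k0 + k1 * pair_sum + k2 * triple

def easy_axis(k1, k2=0.0):
    """Return which of <100>, <110>, <111> has the lowest anisotropy energy."""
    candidates = {"<100>": (1, 0, 0), "<110>": (1, 1, 0), "<111>": (1, 1, 1)}
    return min(candidates, key=lambda name: cubic_anisotropy(candidates[name], k1, k2))
```

Along ⟨100⟩, ⟨110⟩, and ⟨111⟩ the energy above $K_0$ is 0, $K_1/4$, and $K_1/3 + K_2/27$, so the sign of $K_1$ decides the easy axis whenever $K_2$ is negligible.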

Figure 7 shows the variation of $K_1$ with composition in the FCC Fe-Ni alloys. Compositions exist near 80% Ni for which $K_1 = 0$, that is, the magnetization is equally stable along ⟨100⟩ and ⟨111⟩ directions. These materials are generally easily magnetized and hence make excellent soft magnetic materials.

2. Magnetoelastic Interactions

The most common manifestation of magnetoelastic (ME) interactions is magnetostriction, $\lambda = \Delta l/l$, the strain in a material that accompanies a change in its direction of magnetization. Figure 8 shows the field dependence of the magnetostrictive strain measured parallel and perpendicular to the applied field direction, starting from a randomly demagnetized, isotropic sample. Below saturation, the strain is generally quadratic in the field-induced magnetization; above the anisotropy field, the strain saturates. The difference between the parallel and perpendicular strain curves is 3/2 of the saturation magnetostriction constant. When the strain parallel to the field direction is positive, the magnetostriction is positive.

FIGURE 7 First-order anisotropy constant for FCC Ni-Fe alloys. [From Bozorth, R. M. (1993). “Ferromagnetic Materials,” IEEE Press, New York.]

Because magnetocrystalline anisotropy reflects the crystal field symmetry, a distortion of the crystal field results in a new, strain-induced anisotropy. For example, a uniaxial strain applied to a cubic crystal causes a uniaxial ME anisotropy that adds to the cubic anisotropy [cf. Eq. (11)]:

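$$f_{me}^{c} = B_1\left[\varepsilon_{11}\left(\alpha_1^2 - \tfrac{1}{3}\right) + \mathrm{cycl}\right] + B_2\left[\varepsilon_{12}\,\alpha_1\alpha_2 + \mathrm{cycl}\right] + \cdots \qquad (12)$$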

Here $B_1$ and $B_2$ are axial and shear magnetic stress coefficients that, when multiplied by the appropriate strain component, give the ME anisotropy energy density. Thus, if $B_1 > 0$, an extensional strain in the z direction ($\varepsilon_{zz} > 0$) increases the energy for magnetization in that direction ($\alpha_z = 1$) relative to magnetization in the x–y plane. Even though the anisotropic strain may be small, e.g., $|\varepsilon_{ii}| \approx 10^{-3}$, the ME anisotropy may be comparable to or exceed the magnitude of the cubic anisotropy. For example, in Ni, $B_1 \approx 6.2 \times 10^6$ N/m$^2$, so for $\varepsilon \approx 10^{-3}$, $|f_{me}^{c}| \approx 6.2 \times 10^3$ J/m$^3$, which is comparable to $|K_1| \approx 4.8 \times 10^3$ J/m$^3$.

FIGURE 8 Representations of a demagnetized sample (left) and two states of magnetization with the sense of magnetostrictive strain shown for positive magnetostriction constant: $(\Delta l/l)_{\parallel} > 0$. At the bottom is illustrated the field dependence of strain parallel and perpendicular to the magnetization direction. [From O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

TABLE III Magnetic Anisotropy Constants K1 and K2 for Selected Materials

                    T = 4.2 K                          RT
Material            K1 (−k2)         K2 (−k4)          K1                 K2

3d metals
  Fe                5.2 × 10^4       −1.8 × 10^4       4.8 × 10^4         −1.0 × 10^4
  Co^u              7.0 × 10^5       1.8 × 10^5        4.1 × 10^5         1.5 × 10^5
  Ni                −12 × 10^4       3.0 × 10^4        −4.5 × 10^3        −2.3 × 10^3
  Ni80Fe20          -                -                 −3 × 10^2          -
  Fe50Co50                                             −1.5 × 10^4 (a)

4f metals
  Gd^u              −1.2 × 10^5      +8.0 × 10^4       +1.3 × 10^4        -
  Tb^u              −5.65 × 10^7     −4.6 × 10^6       -                  -

Spinel ferrites
  Fe3O4             −2 × 10^4        -                 −0.9 × 10^4        -
  CoFe2O4           +10^6            -                 2.6 × 10^5         -

Garnets
  YIG               −2.5 × 10^3      -                 1 × 10^3           -

Hard magnets
  BaO·6Fe2O3^u      4.4 × 10^5       -                 3.2 × 10^5         -
  SmCo5^u           7 × 10^6         -                 1.1–2.0 × 10^7     -
  Fe14Nd2B^u        −1.25 × 10^7 (b) -                 5 × 10^6           -

(a) Disordered; K1 ≈ 0 for ordered phase.
(b) Uniaxial materials are designated with a superscript “u” and their values $K_1^u$ and $K_2^u$ are listed under K1 and K2, respectively. The sign convention for the uniaxial materials is based on the $\sin^2\theta$ notation of Eq. (8): K1 > 0 implies easy axis. Units are J/m$^3$; multiply these values by 10 to get erg/cm$^3$.
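The Ni estimate above follows from the $B_1$ term of Eq. (12) alone. The sketch below keeps only a strain along z and compares magnetization along z with magnetization in the x–y plane; the function names are illustrative.

```python
def me_energy(b1, strain_zz, alpha_z):
    """B1 term of Eq. (12) for a single strain component along z (J/m^3)."""
    return b1 * strain_zz * (alpha_z ** 2 - 1.0 / 3.0)

def me_energy_difference(b1, strain_zz):
    """Energy for M along z minus energy for M in the x-y plane (J/m^3)."""
    return me_energy(b1, strain_zz, 1.0) - me_energy(b1, strain_zz, 0.0)
```

The difference is simply $B_1 \varepsilon_{zz}$, so `me_energy_difference(6.2e6, 1e-3)` reproduces the figure of about $6.2 \times 10^3$ J/m$^3$ quoted for Ni.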

Some values of the coefficients B and λ are listed in Table IV for representative materials.

The ME free energy in Eq. (12) also implies that if the direction of magnetization changes (i.e., the $\alpha_i$’s change) then the material may change its equilibrium state of strain. This is the phenomenon of anisotropic magnetostriction shown in Fig. 8. The dependence of the magnetostrictive strain on the direction of magnetization may be derived by adding to Eq. (12) the form of the elastic energy for a cubic crystal

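$$f_{el} = \tfrac{1}{2}C_{11}\left(\varepsilon_{11}^2 + \mathrm{cycl}\right) + C_{12}\left(\varepsilon_{11}\varepsilon_{22} + \mathrm{cycl}\right) + \tfrac{1}{2}C_{44}\left(\varepsilon_{12}^2 + \mathrm{cycl}\right). \qquad (13)$$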

Minimization of the sum of Eqs. (12) and (13) with respect to the $\varepsilon_{ij}$’s leads to the form of the magnetostrictive strains $\lambda_{100}$ and $\lambda_{111}$:

$$\lambda_{100} = -\frac{2}{3}\,\frac{B_1}{C_{11} - C_{12}} \qquad (14)$$

and

$$\lambda_{111} = -\frac{1}{3}\,\frac{B_2}{C_{44}} \qquad (15)$$

Eqs. (14) and (15) show that the magnetostrictive strains $\lambda_{ijk}$ are related to the ME stress coefficients $B_i$ by the elastic constants in a way that parallels the mechanical stress-strain relations, $\varepsilon_{ij} = s_{ijkl}\sigma_{kl}$ (except for the minus sign, which results from the definitions of the $B_i$’s).

TABLE IV ME Coupling Coefficients or Stresses (MPa) and Magnetostriction Constants λ100 and λ111 in PPM at Room Temperature for Several Materials

Material               λ100 (λγ,2)   B1 (MPa)   λ111 (λε,2)   B2 (MPa)   Polycrystal λs

3d metals
  BCC-Fe               21            −2.9       −21           +2.9       −7
  HCP-Co^u             (−140)        +6         (50)          +13        (−62)
  FCC-Ni               −46           +6.2       −24           +4.3       −34
  a-Fe80B20            -                        -                        +32

4f metals
  TbFe2                                         2600                     1753

Oxides
  Fe3O4                −15                      56                       +40
  CoFe2O4              −670                     120                      −110
  Yttrium-iron garnet  −1.4                     −1.6                     −2

See Eqs. (14) and (15). Some polycrystalline magnetostriction values are also listed. The prefix “a-” designates an amorphous material. For uniaxial materials (superscript “u”) where λγ,2 or λε,2 was reported, their values are given in parentheses in the λ100 and λ111 columns, respectively.

FIGURE 9 Room temperature magnetostriction constants for FCC Fe-Ni alloys. [From Bozorth, R. M. and Wakiyama, T. (1969). J. Phys. Soc. Jpn. 17, 1669; Hall, R. C. (1960). J. Appl. Phys. 31S, 5157.] Crossover behavior of anisotropy constant K1 for slow-cooled (labeled SC) and quenched (labeled Qu) Fe-Ni alloys is also shown for reference.
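Eqs. (14) and (15) are simple enough to evaluate directly. The sketch below does so with illustrative function names. The elastic constants in the usage note are approximate textbook values for Ni that are not given in this article and are introduced here purely as an assumption.

```python
def lambda_100(b1, c11, c12):
    """Eq. (14): magnetostriction along <100> from B1 (Pa) and C11, C12 (Pa)."""
    return -(2.0 / 3.0) * b1 / (c11 - c12)

def lambda_111(b2, c44):
    """Eq. (15): magnetostriction along <111> from B2 (Pa) and C44 (Pa)."""
    return -(1.0 / 3.0) * b2 / c44
```

With $B_1 = 6.2$ MPa from Table IV and assumed values $C_{11} \approx 2.5 \times 10^{11}$ Pa and $C_{12} \approx 1.6 \times 10^{11}$ Pa, `lambda_100(6.2e6, 2.5e11, 1.6e11)` gives roughly $-46 \times 10^{-6}$, the order and sign listed for FCC-Ni.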

Figure 9 shows the variation of $\lambda_{100}$ and $\lambda_{111}$ with composition in the FCC Ni-Fe alloys. Note that the magnetostriction coefficients vanish near 80% Ni, close to the composition at which $K_1 = 0$ in these alloys (Fig. 7). This further establishes the soft magnetic character of these alloys.

C. Exchange Energy, Magnetic Domain Walls, Domains

The exchange interaction, which couples the directions of the spins in a material to each other, is a quantum-mechanical phenomenon based on the Coulomb interaction between electrons. The result is that spins behave as if there were an interaction of the form

$$\mathcal{H}_{\mathrm{Heisenberg}} = -2\sum_{i<j}\mathcal{J}_{ij}\,\mathbf{S}_i \cdot \mathbf{S}_j. \qquad (16)$$

This form of exchange coupling is called the Heisenberg interaction and $\mathcal{J}$ is the exchange integral over the wave functions responsible for transferring the spin information.

On a macroscopic, continuum level, Eq. (16) can be expressed as an exchange energy density

$$u_{\mathrm{exch}} = \frac{U_{\mathrm{exch}}}{V} = A\left(\frac{\partial\theta(x)}{\partial x}\right)^2 = A\left(\frac{\nabla M}{M}\right)^2. \qquad (17)$$

Here, $A = \mathcal{J}S^2a^2/(2\Omega)$ is the exchange stiffness constant (with a being the lattice constant and Ω the atomic volume) and θ(x) is the position-dependent orientation of magnetization in the material. Eq. (17) expresses the fact that there is an energy cost (proportional to A) associated with local departures from uniform magnetization orientation.

The magnetostatic energy (Section II.A) can be reduced by the formation of magnetic domains (each domain is uniformly magnetized to saturation along one of the anisotropy easy axes). These magnetic domains are separated by magnetic domain walls, which are surfaces over which the magnetization direction gradually changes from one easy direction to another. The equilibrium domain wall thickness is determined by minimizing the integral over the wall of its position-dependent magnetic energy. This energy is made up of local magnetic anisotropy and exchange energy densities. The result is that a 180-degree domain wall in a uniaxial material has domain wall thickness and energy density given, respectively, by

$$\delta_{dw} = \pi\sqrt{\frac{A}{K_u}} \qquad (18)$$

and

$$\sigma_{dw} = 4\sqrt{AK_u}. \qquad (19)$$

These parameters for Ni and Fe are $\delta_{dw} = 72$ and 30 nm, respectively, and $\sigma_{dw} = 0.7$ and 3.0 mJ/m$^2$, respectively. The thickness and energy density of domain walls are the fundamental parameters that determine the mobility of a domain wall or, on the other hand, the ability of various defects to pin or impede domain wall motion.
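Eqs. (18) and (19) give order-of-magnitude estimates once A and $K_u$ are chosen. The sketch below uses illustrative function names. The exchange stiffness passed in is the caller's assumption (a value of order $10^{-11}$ J/m is typical of 3d metals but is not quoted in this article), so the output should not be expected to reproduce the numbers above exactly.

```python
import math

def wall_width(a_ex, k_u):
    """Eq. (18): 180-degree wall thickness (m) from A (J/m) and Ku (J/m^3)."""
    return math.pi * math.sqrt(a_ex / k_u)

def wall_energy(a_ex, k_u):
    """Eq. (19): wall energy per unit area (J/m^2)."""
    return 4.0 * math.sqrt(a_ex * k_u)
```

The two expressions pull in opposite directions: raising $K_u$ narrows the wall while raising its energy, and their product, $4\pi A$, depends only on the exchange stiffness.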

When the energy of a domain wall fluctuates gradually with position in a material, its motion can be impeded because a gradient in surface energy is equivalent to a pressure on the wall. The Zeeman energy difference across the domain wall is also a pressure working against the wall energy gradient. Balancing these pressures leads to an expression for the coercivity due to defects that are large relative to the wall width ($D > \delta_{dw}$) and across which the wall energy varies gradually:

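$$H_c \approx \frac{2H_a}{\pi}\,\frac{\delta_{dw}}{D}\left[\frac{\Delta A}{A} + \frac{\Delta K}{K} + \frac{3}{2}\lambda_s\frac{\Delta\sigma}{K} + \cdots\right]. \qquad (20)$$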

Eq. (20) indicates that the upper limit to the coercivity is governed by the anisotropy energy density (expressed here through the anisotropy field, $H_a = 2K_u/M_s$) and the ratio of domain wall thickness to defect size. The extent to which $H_c$ approaches this limit is proportional to the magnitude of the relative fluctuation, $\Delta X/X$, in the strength of a parameter X (such as A or K) over a defect size D. A more general theory of coercivity treats defects whose properties differ from those of the host material in a step-like fashion. This model predicts that in the small defect regime ($D < \delta_{dw}$), the coercivity increases linearly with defect size and for $D > \delta_{dw}$, the coercivity is independent of defect size [rather than dropping off with the inverse of the defect dimension as for a gradual-fluctuation defect, Eq. (20)]. The predictions of these models are compared schematically in Fig. 10.

FIGURE 10 Schematic variation of coercivity with normalized defect size spanning two regions, small defects and large defects relative to wall thickness. The predicted behavior in each case is shown. [From O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

II. MAGNETIC MATERIALS, FUNDAMENTAL PROPERTIES

A. Electronic Structure of Magnetic Oxides and Metals

In oxides, the electronic wave functions are atomic-like; quantum numbers can be used to describe the energies and spatial distributions of the electronic states. Oxygen (Fig. 11a) has a valence electronic structure of $2p^4$; the $2s^2$ states are at much lower energy. For transition (T) metal atoms (Fig. 11b) the valence electrons include those in the $3d^n$ states (0 < n < 10) as well as the lower-energy $4s^2$ states. Figure 11c depicts what happens to these atomic energy levels (T atom, oxygen atom) when transition metal atoms bond with oxygen to form stable ionic compounds (T-oxide, c) or to form a transition metal solid (T-metal, d, e).

FIGURE 11 Schematic valence electronic structures of atomic oxygen (a), T atom (b), and how they combine to form T-metal oxides (c), and T metals (d, e).

When 3d T atoms interact with oxygen (whose $2p^4$ electrons are more electronegative than the $3d^n$ states) to form a T-metal oxide, bonding and antibonding s-p-(d) orbitals are formed; 4s (and possibly some 3d) electrons are transferred to the oxygen atom to completely fill the bonding s-p-(d) orbital, which is localized at the oxygen site. The remaining $3d^{n-x}$ electrons on the T ion assume lower energies than before the charge transfer and are localized more strongly there because the core potential now exceeds the valence electronic charge. This electronic stabilization and the Coulomb attraction between the O$^{2-}$ and T$^{x+}$ ions are responsible for the binding energy of the T-metal oxide. The $3d^{n-x}$ states lie in the energy gap between the bonding and antibonding p-d states (Fig. 11c). These localized d states may be split by the symmetry of the crystalline electric field of the surrounding oxygen ions. The crystal field and exchange splittings of these states determine the magnetic moment per T ion, $n_T = (n_d^{\uparrow} - n_d^{\downarrow})\,\mu_B$, as well as other physical properties of the oxide such as color and electrical conductivity.

When transition metal atoms are brought together to form a solid T metal, the s states (l = 0) of different atoms begin to interact first as interatomic distance decreases and the d states (l = 2) interact at smaller atom separations. This interaction of the electronic states over many atoms spreads their energies over a range (energy band) due to bonding and antibonding interactions among the variously spaced atoms (Fig. 11d). The energy bands of the 4s and 3d states of the approaching atoms can overlap so that the available valence electrons ($4s^2\,3d^n$) can be redistributed over the lowest energy states, giving $4s^{2-y}3d^{n+y}$ per atom. This decreased electronic energy accounts for the stability of the metal at a particular interatomic spacing $r_0$ (Fig. 11d). At $r_0$ the energy overlap of the d and s bands can be plotted versus the density of states (states/energy/atom) as shown in Fig. 11e. The partial filling of the delocalized 4s band accounts for the electrical conductivity; the unequal filling of the exchange-split, narrow $3d^{\uparrow}$ and $3d^{\downarrow}$ bands accounts for the metallic magnetic moment per atom, $n_B = (n_d^{\uparrow} - n_d^{\downarrow})$.

The bonding in intermetallic compounds generally has both metallic and covalent character. At the metallic end of the spectrum one finds magnetically soft or hard ordered compounds such as Ni3Fe (Cu3Au structure) and CoPt (tetragonal CuAu structure). With increasing covalent bond character (i.e., more directional p-d or p-f bonds) the transition metal borides and the Heusler alloys based on Cu2MnAl are found. A mixture of metallic and covalent bonding is found also in amorphous magnetic alloys such as Fe80B18Si2 or Fe40Ni40B20. Here, the introduction of p character from glass-forming elements (such as boron or phosphorus) adds directional character to the bonding and tends to stabilize the liquid state, forming a eutectic in the phase diagram, in composition ranges between stable compounds (e.g., between BCC Fe and Fe2B).

B. Selected Fundamental Magnetic Properties

The magnetic properties at a given site in a material are a function of the number, type, distance, and symmetry of the nearest neighbors. This dependence is expressed in terms of the Stoner criterion for magnetic moment formation

$$\mathcal{J}(E_F)\,\mathcal{Z}(E_F) > 1. \qquad (21)$$

Here $\mathcal{J}$ is the strength of the intra-atomic exchange energy and $\mathcal{Z}$ is the density of electronic states, both evaluated at the Fermi level; they are both functions of the local environment. Bonding tends to reduce $\mathcal{Z}(E_F)$ and weaken magnetism. The Curie temperature can also be expressed as a function of the local environment by the mean-field expression,

$$T_C = 2z\,\mathcal{J}\,s(s+1)/k_B \qquad (22)$$

In this case, $\mathcal{J}$ is the interatomic exchange energy, z is the coordination number, and s is the spin quantum number.

1. Spinel Ferrites

Oxides (including the magnetically hard magnetoplumbite hexaferrites) are very stable and mechanically brittle; they are typically used in polycrystalline (sintered ceramic) form. The most commonly used soft magnetic oxides are the ferrites of the spinel structure, having the formula TO·Fe2O3, where T is a divalent transition ion. In this structure the Fe$^{3+}$ and T$^{2+}$ ions can occupy two types of sites having different oxygen coordination: tetrahedral “A” sites and octahedral “B” sites. Because of the dominant negative superexchange interaction between the moments on A and B sites, these materials are typically ferrimagnetic. For the TO·Fe2O3 series with T = Mn$^{2+}$ ($3d^5$, $\mu_m = 5\,\mu_B$), Fe$^{2+}$ ($3d^6$, $\mu_m = 4\,\mu_B$), ... Cu$^{2+}$ ($3d^9$, $\mu_m = 1\,\mu_B$), the two Fe$^{3+}$ ions occupy different types of sites so their moments ($5\,\mu_B$ each) cancel. The net magnetization is given by the moment on the T$^{2+}$ species (see Fig. 12). As Zn$^{2+}$ is substituted for the T$^{2+}$ species, the Zn ion displaces the Fe$^{3+}$ ions from A to B sites because, among the most common ions, Zn$^{2+}$ has the strongest chemical affinity for the A site. As a result, the net moment per formula unit (FU) of the compounds of the series T$_{1-x}$Zn$_x$O·Fe2O3 increases with x even though Zn$^{2+}$ bears no moment. At higher Zn concentrations, the moment on the A sites becomes small enough to allow the antiferromagnetic exchange interaction that exists among the moments on the B sites to dominate, thus reducing the net moment per FU. Of the oxides in these series, Ni-Zn ferrites and Mn-Zn ferrites are widely used for soft magnetic applications because of their relatively large magnetization densities (few ferrites exhibit $B_s > 0.4$ T at room temperature), high electrical resistivity, and small values of K and λ.

FIGURE 12 Magnetic moments in transition metal-zinc ferrites as T = Cu, Ni, Co... are substituted for divalent iron. [From Guillaud, P. C. (1951). J. Phys. Rad. 12, 239; Gorter, E. W. (1954). Phillip. Res. Rpt. 9, 295.]
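The initial rise of the moment with Zn content follows from simple bookkeeping of the two sublattices. The sketch below assumes a fully inverse spinel with collinear, antiparallel A and B sublattices, so it applies only at low x, before the B-B interaction described above takes over. The function name and site-occupancy model are illustrative assumptions, not the article's stated procedure.

```python
FE3_MOMENT = 5.0  # moment of Fe3+ (3d5) in Bohr magnetons, as quoted above

def ferrite_moment_per_fu(mu_t, x):
    """Net moment (Bohr magnetons) per formula unit of T(1-x)Zn(x)O.Fe2O3.

    mu_t is the moment of the divalent T ion; x is the Zn fraction.
    Assumes each Zn2+ takes an A site and pushes one Fe3+ onto a B site,
    with all B moments parallel and all A moments antiparallel to them.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("Zn fraction x must lie between 0 and 1")
    a_site = (1.0 - x) * FE3_MOMENT                     # Zn2+ carries no moment
    b_site = (1.0 - x) * mu_t + (1.0 + x) * FE3_MOMENT  # T2+ plus Fe3+ on B
    return b_site - a_site
```

In this picture the net moment is $(1 - x)\mu_T + 10x$ Bohr magnetons, which rises with x for every T ion in the series because each has a moment below $10\,\mu_B$.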

The temperature dependence of the specific magnetization in ferrites is generally more complicated than that of a single-sublattice ferromagnet (cf. Fig. 2). In ferrimagnetic materials the moments on the two sublattices may show different temperature dependences, leading the net moment to show compensation temperatures and $dM/dT > 0$ over some temperature ranges (schematics at right in Fig. 13).


FIGURE 13 Temperature dependence of the magnetization in NiFe$_{2-x}$V$_x$O$_4$. [From Blasse, G., and Gorter, E. W. (1962). J. Phys. Soc. Jpn. 16, Suppl. B-1, 176.] The schematic M–T curves at right illustrate how the magnetization of two sublattices having different temperature dependences combine to give the unique features observed in the data.

2. Transition Metals and Alloys

In 3d transition metal alloys, the saturation magnetic moment (in $\mu_B$ per average T species) varies as indicated in Fig. 14, which is called the Slater-Pauling curve. For alloys to the right of the peak moment, the Fermi level cuts across only the spin-down d band. As the d electron concentration, $n_d = n_v - n_s \approx n_v - 1$, increases from 7.5 e/atom toward 10 e/atom, the number of holes per atom (unpaired spins), and hence the moment per atom, decreases linearly toward zero at Ni40Cu60. The systematic behavior on the left-hand side of the curve involves holes in both spin subbands and is not as easily explained in a simple model. Band structure calculations (Diederichs, P. H. et al., 1991) are able to explain most of the features in Fig. 14.

FIGURE 14 The Slater-Pauling curve showing moment per atom (in Bohr magnetons) for metallic 3d alloys as a function of valence electron concentration or alloy composition. [From Diederichs, P. H. et al. (1991). In “Magnetism in the Nineties.” (Freeman, A. J. and Gschneider, K. eds.), North-Holland.] The inset schematic band structures illustrate the main difference between alloys to the left and right of the peak in the curve.

Figure 14 could lead one to expect that every atom, for example, each Fe and Ni atom in a Ni50Fe50 alloy, has the same magnetic moment, namely $\mu_{T\text{-ave}} \approx 1.6\,\mu_B$. This is, in fact, not the case. Spin-polarized neutron scattering shows that there are different local magnetic moments on the different T species, with their composition-weighted average given by the data in Fig. 14. For example, Ni50Fe50 shows $\mu_{Fe} \approx 2.5\,\mu_B$ and $\mu_{Ni} \approx 0.7\,\mu_B$.
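The right-hand branch of the curve and the averaging of local moments can be illustrated with a few lines of arithmetic. The sketch assumes a straight line of slope −1 Bohr magneton per valence electron that reaches zero at the Ni40Cu60 composition; the valence count of 11 for Cu and all function names are assumptions added for illustration.

```python
VALENCE = {"Fe": 8, "Co": 9, "Ni": 10, "Cu": 11}  # 4s + 3d electrons per atom

def mean_valence(composition):
    """Average valence electron count for atomic fractions {element: fraction}."""
    total = sum(composition.values())
    return sum(VALENCE[el] * frac for el, frac in composition.items()) / total

def right_branch_moment(n_v, n_zero=10.6):
    """Moment per atom (Bohr magnetons) on the right branch of Fig. 14.

    Assumes a linear fall of one Bohr magneton per added electron, reaching
    zero at n_zero, the valence of Ni40Cu60; not valid left of the peak.
    """
    return max(0.0, n_zero - n_v)

def average_moment(composition, local_moments):
    """Composition-weighted average of the local moments."""
    total = sum(composition.values())
    return sum(local_moments[el] * frac for el, frac in composition.items()) / total
```

For Ni50Fe50 the mean valence is 9, the linear branch gives about 1.6 Bohr magnetons, and the weighted average of the local moments quoted above (2.5 on Fe, 0.7 on Ni) gives the same figure.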

3. Intermetallic Compounds and Amorphous Alloys

The dependence of magnetic moment on the local environment (number, type, distance, and symmetry of nearest neighbors) is shown clearly for a series of Mn-containing intermetallic compounds. The magnetic moment per Mn atom shows a linear decrease with increasing Pauling valence (Fig. 15). The Pauling valence is a function of interatomic distance and coordination and is a measure of covalent (as opposed to ionic) bonding. Increased covalent bonding hinders magnetic moment formation because covalent bonds have spin-paired electrons. A similar model (Corb, B. W. et al., 1983) has been shown to apply to 3d-metalloid alloys; it also shows that increasing the number of bonding orbitals per T atom decreases its moment.

The magnetic moment in crystalline transition-metal borides TB and T2B shows a variation with valence electron concentration that appears similar to that of the Slater-Pauling curve but with the peak moment suppressed and shifted more toward lower T-valence with increasing boron content (Cadeville, M. C. et al., 1966). This shift is not due to the effect of the presence of boron on the valence electron concentration (the shift would require boron to increase the valence electron concentration). The shift is more subtle than that: these compounds show an entirely different electronic structure than the 3d metallic alloys.

FIGURE 15 Variation of Mn magnetic moment with Pauling valence. [From Mori, N., and Mitsui, T. (1968). J. Phys. Soc. Jpn. 25, 82.]

FIGURE 16 Variation of magnetic moment per transition metal atom in crystalline and amorphous alloys as a function of number of valence electrons, $n_v$. $n_v$ = 8, 9, and 10 correspond to Fe, Co (or Fe0.5Ni0.5), and Ni, respectively. The data for crystalline materials are based on Fig. 14. [From O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

Amorphous magnetic alloys combine some of the bonding characteristics of both 3d alloys and intermetallic compounds. Figure 16 shows the variation of the low-temperature saturation moment per transition metal atom as a function of T content for amorphous alloys based on boron, T80B20, and on phosphorus, T80P20. The variation of magnetic moment in crystalline alloys (cf. Fig. 14) is shown as a dotted line for reference. Amorphous T80B20 alloys show magnetic moments that are shifted relative to the Slater-Pauling curve in a way that is consistent with data for crystalline TB and T2B compounds and alloys.

In rare-earth-transition metal intermetallics (R-T), the magnetic moments of transition metals couple ferromagnetically with light rare-earth moments ($\mathbf{J}_R \cdot \mathbf{s}_T > 0$) and antiferromagnetically with heavy rare-earth moments ($\mathbf{J}_R \cdot \mathbf{s}_T < 0$). Thus the spin-spin coupling between R and T species is always antiferromagnetic (Fig. 17, left). This coupling can be ascribed to the 5d conduction electrons of the rare-earth (whose spin is always parallel to that of the 4f electrons) and their interaction with the symmetry-compatible $3d^n$ electrons of the transition metal. Exchange between these two sets of d states is invariably antiferromagnetic with respect to the d electrons involved; there are only minority-spin holes in the TM 3d orbitals, and spin is conserved in the dominant hopping process (Fig. 17, right). The antiferromagnetic R-5d–T-3d interaction explains the ferromagnetic coupling of T moments to light (J = L − S) R moments and antiferromagnetic coupling to heavy (J = L + S) R species. The net result of R-T exchange coupling is generally larger magnetic moments for intermetallics of light rare-earths and transition metals.

FIGURE 17 Simplified schematic representation of spin and angular momentum coupling at rare-earth site and antiferromagnetic exchange coupling between R and TM spins. Right, schematic band structure that accounts for antiferromagnetic R-TM spin coupling. [From O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

III. TECHNICAL PROPERTIES

A. B-H Loops and Magnetic Domains

Consider a magnetic material in the demagnetized state(B = 0, H = 0 in Fig. 18). Even though the local magneticmoments show long-range order (typically over severalmicrons), the demagnetized state can be achieved by theformation of magnetic domains. Domains are regions ofhomogeneous magnetization separated by domain walls,surfaces over which the orientation of the atomic moments

FIGURE 18 Hysteresis loop of a magnetic material showing thevariation of B with changing H . Initial magnetization curve fromthe demagnetized state is shown with the initial permeability µi ,indicated. The remanence Br and coercive field, Hc, are indi-cated. The approximate domain structures are indicated at rightfor demagnetized state and for approach to saturation. [FromO’Handley, R. C. (2000). “Modern Magnetic Materials, Principlesand Applications.” Reprinted with permission of John Wiley andSons.]

Page 269: Encyclopedia of Physical Science and Technology - Condensed Matter

P1: GSS/GJK P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN008N-393 June 29, 2001 15:46

Magnetic Materials 933

changes relatively abruptly (Section II.C). The vector magnetizations of the domains sum to zero in the demagnetized state. Application of a field to a demagnetized sample results in the motion of domain walls so as to expand the volume of those domains having the largest component of M along H. The initial flux density, B = µ0(H + M) = φ/area, produced in response to a small field, H, defines the initial permeability, µi = (B/H) evaluated at H ≈ 0. At stronger fields, the permeability increases to its maximum value, µmax. The relative initial permeability, µr, can be as great as 10^5, or µr µ0 ≈ 10^−1, in some materials. When most domain wall motion has been completed, there often remain domains with nonzero components of magnetization at right angles to the applied field direction. The magnetization in these domains must be rotated into the field direction to minimize the potential energy, −M · B. This process generally costs more energy than wall motion.
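
As a quick numerical check of the permeability figures quoted above, the following minimal sketch evaluates B = µ0(H + M) for a linear response with µr = 10^5. The function names and the 1 A/m test field are illustrative assumptions, not values from the article.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # vacuum permeability, T*m/A


def flux_density(h, m):
    """Flux density B = mu_0 * (H + M); H and M in A/m, B in tesla."""
    return MU_0 * (h + m)


def absolute_permeability(mu_r):
    """Absolute permeability mu = mu_r * mu_0 in T*m/A."""
    return mu_r * MU_0


# Assumed illustrative case: linear response with mu_r = 1e5 in a 1 A/m field.
mu_r = 1e5
h_small = 1.0                      # applied field, A/m (assumed)
m_small = (mu_r - 1.0) * h_small   # M = chi * H with chi = mu_r - 1
b_small = flux_density(h_small, m_small)
mu_abs = absolute_permeability(mu_r)

print(f"mu_r * mu_0 = {mu_abs:.3f} T*m/A")
print(f"B at H = 1 A/m: {b_small:.3f} T")
```

The product µr µ0 comes out at about 0.13, the order 10^−1 quoted in the text, so a field of only 1 A/m would already produce a flux density of roughly 0.1 T in such a material.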

Upon decreasing the magnitude of the applied field from saturation, the magnetization rotates back toward its “easy” directions, generally without hysteresis. As the applied field decreases further, new domains may nucleate and domain walls begin moving back across the sample. Because energy is lost when a domain wall jumps abruptly from one local energy minimum to the next (Barkhausen jumps), wall motion is a hysteretic or lossy process. The flux density and magnetization remaining in the sample when the applied field is zero are called the residual flux density, Br, and remanence, Mr, respectively. The reverse field needed to restore B to zero is called the coercivity, Hc. (The field needed to restore M to zero is called the intrinsic coercivity, iHc. The distinction between Hc and iHc is important only in permanent magnets, because in a soft magnetic material Hc ≪ M so that M = 0 for essentially the same field that gives B = 0.) Hc is a good measure of the ease or difficulty of magnetizing a material.
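
The difference between Hc and iHc can be made concrete with a toy model. The sketch below assumes a descending loop branch of the form M(H) = Ms tanh((H + iHc)/w), which is a hypothetical shape chosen only for illustration, and finds the field at which B = µ0(H + M) vanishes by bisection. All numerical inputs are assumed values of the right order of magnitude for a soft and a hard material.

```python
import math


def descending_branch(h, m_s, i_hc, width):
    """Toy descending branch M(H) in A/m; M = 0 at H = -i_hc (assumed shape)."""
    return m_s * math.tanh((h + i_hc) / width)


def coercivity_b(m_s, i_hc, width, iterations=200):
    """Magnitude of the reverse field where B = mu_0 (H + M) = 0, by bisection."""
    lo, hi = -i_hc, 0.0  # H + M < 0 at lo and > 0 at hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid + descending_branch(mid, m_s, i_hc, width) < 0.0:
            lo = mid
        else:
            hi = mid
    return -0.5 * (lo + hi)


# Assumed soft material: Ms = 8e5 A/m, iHc = 1 A/m.
hc_soft = coercivity_b(m_s=8e5, i_hc=1.0, width=0.5)
# Assumed hard material: Ms = 8e5 A/m, iHc = 2.9e6 A/m.
hc_hard = coercivity_b(m_s=8e5, i_hc=2.9e6, width=2e5)

print(f"soft: Hc / iHc = {hc_soft / 1.0:.6f}")
print(f"hard: Hc / iHc = {hc_hard / 2.9e6:.3f}")
```

In the soft case the two coercivities are indistinguishable. In the hard case B reaches zero when the reverse field is about equal to the magnetization, well before M itself reverses, so Hc is much smaller than iHc.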

In some soft magnetic materials, domain walls can be moved with fields of order 0.1 A/m. In general, defects such as grain boundaries and precipitates cause the wall energy to depend on position, so in most soft magnetic materials, higher fields (of order 10–1000 A/m) are required to move domain walls or rotate the magnetization vector.

B. Soft Magnetic Materials

A soft magnetic material is one for which Hc is less than or equal to about 5 × 10^3 A/m. Pure iron is the prototypical soft magnetic material. It has a very high saturation flux density, Bs = µ0 Ms = 2.2 T, and its cubic crystal structure leaves it with a small magnetocrystalline anisotropy, K1 = +4.8 × 10^4 J/m^3, and small magnetostriction, λ100 = +21 × 10^−6. Domain images from a (100) iron-3% Si single crystal taken by scanning electron microscopy with spin polarization analysis (SEMPA) are shown in Fig. 19. The magnetization within each domain follows the easy ⟨100⟩ direction (dictated by K1 > 0), leading to 90° and 180° domain walls. Addition of small amounts of silicon improves the usefulness of iron for a number of soft magnetic applications at modest frequencies (typically for 50–60 Hz transformers).

FIGURE 19 Magnetic domains at the surface of a 3% Si-Fe crystal taken by scanning electron microscopy with spin polarization analysis (SEMPA). Crystallographic ⟨100⟩ axes lie in the image plane along the horizontal and vertical directions. Left panel shows magnetic contrast when the instrument is sensitive to the horizontal component of magnetization: dark is magnetized to the left, light to the right. In the right panel, the contrast is sensitive to the vertical component of magnetization: dark is magnetized down, light is magnetized up. (Courtesy of Celotta, R. J. et al., unpublished.)

There are three major Fe-Ni or permalloy compositions of technical interest:

1. 80% nickel permalloys (e.g., Supermalloy, Mumetal, Hi-mu 80). The 80% nickel permalloys are very important because the magnetostriction and magnetocrystalline anisotropy both pass through zero near this composition (see Figs. 7 and 9). They show a saturation flux density of about 1 T. These alloys are used where the highest initial permeability is required. This includes inductors for power supplies and circuits as well as magnetic recording read and write heads.

2. 65% nickel permalloys (e.g., A Alloy, 1040 Alloy). The 65% nickel permalloys show a strong response to field annealing while maintaining small anisotropy.

3. 50% nickel permalloy (e.g., Deltamax). The 50% nickel permalloys are important because of their higher flux density (Bs = 1.6 T) as well as their responsiveness to field annealing to give a very square loop.

The equiatomic BCC FeCo alloys (called Permendurs, Fig. 20a) have very high saturation flux density (Bs ≈ 2.4 T) as well as relatively low magnetic anisotropy. While the magnetocrystalline anisotropy (as well as stress-induced anisotropy) limits the soft magnetic properties


FIGURE 20 (a) Magnetic properties of BCC Fe-Co alloys. Anisotropy and magnetostriction. [After Hall, R. C. (1959). J. Appl. Phys. 30, 816; dotted lines are for CsCl-ordered phase.] (b) Magnetic properties of amorphous Fe-Co-B alloys for which the magnetocrystalline anisotropy is essentially zero over the entire composition range. [From O’Handley, R. C. et al. (1979). J. Appl. Phys. 50, 3603; O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

of these alloys, grain size and texture are the primary factors determining the technical magnetic properties actually attained.

Amorphous metallic alloys are materials that are rapidly quenched from the melt or deposited from the vapor state as thin films so that their atomic structure lacks the long-range order of a crystalline solid. Without long-range order, amorphous alloys have no magnetocrystalline anisotropy. Thus amorphous metallic alloys based on transition metals can show a very easy magnetization process. Magnetoelastic anisotropy remains as the major impediment to easy magnetization in amorphous alloys. Figure 21 shows that the coercivity of amorphous Co_{80−x}Fe_xB_20 alloys is minimized near the composition x = 4 for which λ ≈ 0.

The presence in amorphous magnetic alloys of significant concentrations of non-magnetic, glass-forming

FIGURE 21 Variation of coercivity (left scale) and magnetostriction (right scale) with Fe/Co ratio in amorphous (CoFe)80B20 alloys. [From O’Handley, R. C. et al. (1976). IEEE Trans. MAG-12, 924.]

species reduces the saturation magnetization but often has other beneficial effects (increased resistivity, decreased magnetostriction). The high electrical resistivity of amorphous alloys (120–150 µΩ·cm) compared to Si-Fe (30–50 µΩ·cm) and iron-nickel alloys (20 µΩ·cm) makes them attractive for high-frequency operation. Reasonably strong magnetization can be realized in a variety of amorphous alloys based on iron, cobalt, and/or nickel (cf. Figs. 16 and 20b for Fe-Co-B amorphous alloys).

The most widely used soft magnetic ferrites are based on manganese-zinc ferrite and nickel-zinc ferrite. These materials have the spinel structure of Fe3O4. Figure 22 shows the dependence of magnetostriction and permeability on Fe2O3 content in [(MnO)_0.7(ZnO)_0.3]_{1−x}·(Fe2O3)_x. In these compositions, K1 is small so the permeability peaks near the composition at which λs = 0 (cf. Fig. 21). The insulating character of oxide magnets makes them useful at frequencies in excess of the MHz range.

FIGURE 22 Variation of permeability and magnetostriction with iron oxide content in MnZn ferrites. [After Guillaud, P. C. (1957). Proc. IEEE 104B, 165.]


FIGURE 23 Pulse permeability versus maximum flux swing for three classes of soft magnetic materials: amorphous metallic alloys (Co-Fe-Nb-B-Si and Co-Mn-Fe-Mo-B-Si), crystalline Mo-permalloy (Ni-Fe-Cu-Mo), and ceramic Mn-Zn ferrites. Data for two thicknesses are given in each case. [From Boll, R., and Hiltzinger, H. R. (1983). IEEE Trans. MAG-19, 1946.]

Figure 23 compares the AC performance of representative soft magnetic materials in a plot of pulse permeability versus flux density with thickness (or particle size for ferrites) given as a parameter. While ferrites generally have an advantage at higher frequencies, in pulse power applications, where large dB/dt is important, the other classes of materials show an advantage.

C. Nanomagnetic Materials

Composite materials in which one of the component microstructures has one, two, or three nanoscale dimensions allow new properties and functions to be realized that may not be achievable in simpler structures or by changing composition in a single-phase alloy. Controlling structure and feature sizes at the nanometer scale in magnetic materials is particularly effective because many of the important magnetic length scales that govern magnetic properties fall in the nanometer range.

1. The domain wall thickness, Eq. (18), can range from 5 nm to several hundred nanometers on going from hard to soft magnetic materials.

2. The exchange length, l_ex = \sqrt{2A/(\mu_0 M_s^2)}, is the minimum distance over which the local moment direction can change to reduce magnetostatic energy. Exchange lengths are typically less than 10 nm.

3. The critical size below which magnetic particles are unable to support a domain wall, the single-domain limit, is typically in the range of 20–40 nm.

4. Below a certain particle size, called the superparamagnetic limit, thermal energy, kBT, is sufficient to demagnetize a particle that would otherwise retain a set direction of magnetization by virtue of its anisotropy. Because the time scale over which thermal demagnetization occurs depends exponentially on particle volume, this size depends on the observation time, but is typically less than 10 nm.
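
Two of the length scales listed above are easy to estimate numerically. The sketch below evaluates the exchange length from the expression given in item 2 and estimates the superparamagnetic diameter from the Néel-Arrhenius relaxation law, τ = τ0 exp(KV/kBT). The relaxation law, the attempt time τ0 = 1 ns, and all material parameters (roughly iron-like A and Ms, roughly cobalt-like K) are assumptions chosen for illustration, not values from this article.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # vacuum permeability, T*m/A
K_B = 1.380649e-23         # Boltzmann constant, J/K


def exchange_length(a_ex, m_s):
    """l_ex = sqrt(2A / (mu_0 Ms^2)); A in J/m, Ms in A/m, result in m."""
    return math.sqrt(2.0 * a_ex / (MU_0 * m_s**2))


def superparamagnetic_diameter(k_u, temperature, time_scale, attempt_time=1e-9):
    """Diameter of a sphere whose Neel-Arrhenius relaxation time equals time_scale.

    tau = tau0 * exp(K V / kB T)  =>  V = kB T ln(tau / tau0) / K
    """
    volume = K_B * temperature * math.log(time_scale / attempt_time) / k_u
    return (6.0 * volume / math.pi) ** (1.0 / 3.0)


# Assumed, roughly iron-like exchange stiffness and magnetization.
l_ex = exchange_length(a_ex=2e-11, m_s=1.7e6)

# Assumed, roughly cobalt-like uniaxial anisotropy at room temperature.
d_100s = superparamagnetic_diameter(k_u=4.1e5, temperature=300.0, time_scale=100.0)
d_10yr = superparamagnetic_diameter(k_u=4.1e5, temperature=300.0, time_scale=3.15e8)

print(f"exchange length: {l_ex * 1e9:.1f} nm")
print(f"superparamagnetic diameter (100 s): {d_100s * 1e9:.1f} nm")
print(f"superparamagnetic diameter (10 yr): {d_10yr * 1e9:.1f} nm")
```

With these inputs both lengths fall below 10 nm, and the limiting diameter changes only modestly when the observation time is stretched from 100 s to ten years, because the time scale enters through a logarithm.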

Nanostructured soft magnetic materials can be engineered by creating single-domain crystalline particles in an amorphous magnetic matrix. The prototype of this class of materials is the nanocrystalline magnet Fe-Si-B-Nb-Cu, which consists of α-Fe3Si particles in a matrix of residual amorphous phase (Yoshizawa, Y. et al., 1988). The Cu and Nb are added to what is otherwise a common glass-forming composition in order to enhance nanocrystal nucleation and to retard growth of those nuclei, respectively. In these materials, the properties can vary widely depending upon the size of the nanocrystalline particles and the dimensions and magnetic properties of the intervening amorphous matrix.

The effect of the intergranular amorphous phase as an exchange-coupling medium between the single-domain particles is illustrated by experiments shown in Fig. 24. Above TC of the Nb-rich, amorphous matrix, the material is an assembly of noninteracting, single-domain Fe3Si particles; as such, it shows low coercivity. Below TC of the intergranular phase, the single-domain particles are exchange coupled to each other so that they switch coherently in an applied field. Here too, the coercivity is relatively small. A peak in coercivity appears near the decoupling temperature. It is due to the independent switching of adjacent grains, creating domain walls between them. The presence of domain walls leads to discontinuous response and hysteresis in the system.

FIGURE 24 Schematic summary of the results of measurements on nanocrystalline Fe-B-Si-Nb-C alloys having different Nb contents. [From Skorvanek, I. et al. (1995). J. Magn. Magn. Mater. 140–144, 467.]


Starting with amorphous Co-Nb-B alloys, the coercivity increases by nearly four orders of magnitude through various stages of devitrification corresponding to various nanostructure length scales. The coercivity initially increases consistent with the D^6 power law derived theoretically (Herzer, 1993). Above a peak, Hc drops off more gradually, consistent with the 1/D behavior predicted for defect sizes greater than the domain wall width (Fig. 10).
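
The steepness of the D^6 law can be illustrated with the random-anisotropy estimate commonly associated with Herzer's model, in which the anisotropy averaged over exchange-coupled grains scales as ⟨K⟩ ≈ K1^4 D^6 / A^3. The formula as written here and the input values (K1, A, grain sizes) are assumptions for illustration, not taken from this article.

```python
def averaged_anisotropy(k1, grain_size, a_ex):
    """Random-anisotropy estimate <K> = K1^4 D^6 / A^3 (assumed form).

    k1 in J/m^3, grain_size D in m, a_ex in J/m; result in J/m^3.
    """
    return k1**4 * grain_size**6 / a_ex**3


# Assumed illustrative inputs for a soft nanocrystalline alloy.
K1 = 8e3      # local magnetocrystalline anisotropy, J/m^3 (assumed)
A_EX = 1e-11  # exchange stiffness, J/m (assumed)

k_10nm = averaged_anisotropy(K1, 10e-9, A_EX)
k_20nm = averaged_anisotropy(K1, 20e-9, A_EX)

print(f"<K> at D = 10 nm: {k_10nm:.2f} J/m^3")
print(f"<K> at D = 20 nm: {k_20nm:.1f} J/m^3")
print(f"ratio for doubling D: {k_20nm / k_10nm:.0f}")
```

Doubling the grain size raises the averaged anisotropy, and with it the coercivity in this picture, by a factor of 2^6 = 64, which is why a modest coarsening of the nanostructure can change Hc by orders of magnitude.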

Iron-rich amorphous and nanocrystalline alloys generally show larger magnetization than those based on Co or Ni. However, iron-rich amorphous alloys generally have fairly large magnetostriction, limiting their permeability (Fig. 20b). Formation of a nanocrystalline iron-rich alloy can lead to a dramatic reduction in magnetostriction, thus favoring easy magnetization (Hasegawa, N. et al., 1993).

Most magnetically soft nanocrystalline systems are based on D0_3 crystallites (e.g., Fe3Si) in an amorphous matrix. However, the presence of 25 atom% Si significantly reduces the saturation flux density of the alloy (Hasegawa, N. et al., 1991). Spinodal decomposition in metastable amorphous transition metal-carbon alloys can be used to form nanocrystalline alloys of the general formula (Fe, Co)81Ta9C10. Annealing at 550°C for 20 min results in primary crystallization of α-Fe (or α-FeCo) particles measuring 5–10 nm in diameter and dispersed transition metal carbide nanocrystals (generally at triple junctions). The primary nanocrystals share grain boundaries, making grain-to-grain exchange coupling stronger. The softest magnetic properties are obtained for the smallest nanocrystalline grain sizes. In these alloys the stable carbide grain boundary phase inhibits grain growth just as Nb does in FeBSiCuNb.

Other magnetic nanocrystalline materials show potential for increased saturation flux density. Alloys of the type Fe88Zr7B4Cu1 (Suzuki, K. et al., 1991) and

FIGURE 25 (a) Second quadrant M-H loops of some common permanent magnets. (b) Increase in (BH)max of permanent magnets over recent decades. [From O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

Fe44Co44Zr7B4Cu1 (Willard, M. A. et al., 1998) crystallize to high-magnetization α-Fe nanostructures. The latter show saturation flux densities in excess of 2 T.

D. Hard Magnetic Materials

Permanent magnets are used to produce strong fields without having to apply a current to a coil. Hence they should exhibit a strong net magnetization. It is also important that the magnetization be stable in the presence of external fields. These two conditions indicate that the B-H loop should have large values of remanent induction, Br, and coercivity, Hc, respectively. Permanent magnets have coercivities in the range of 10^4–10^6 A/m.

The shapes of the M-H loops in the second quadrant (which determines demagnetization behavior) are compared in Fig. 25 for some common permanent magnets. The permanent magnets with the highest energy products are based on Fe14Nd2B. Figure 25b shows the evolution of the maximum energy product in permanent magnets over recent decades.

The earliest permanent magnets were the natural forms of magnetite, Fe3O4. More recently, the high-anisotropy hexagonal ferrites (barium or strontium hexaferrite) (Kojima, H. et al., 1982) and the magnets of the alnico family were developed (McCurrie, R. A., 1982). The hexagonal ferrites derive their magnetic hardness from their very large magnetocrystalline anisotropy (Table I) and relatively weak magnetization. The alnico magnets achieve high coercivity by the formation of a high-aspect-ratio structure consisting of columns of nonmagnetic Ni3Al in a matrix of α-Fe by spinodal decomposition from a Heusler composition such as Fe2NiAl. Alnico magnets can achieve high remanence by processing in a magnetic field to achieve strong orientation of the columnar microstructure.


Hard magnets of the type SmCo5 boast the highest uniaxial anisotropies of any class of magnets, Ku ≈ 10^7 J/m^3. On the other hand, phases in the class Sm2Co17 exhibit higher flux density and Curie temperature. Most Co-R magnets are, in fact, multiphase composites of these two structures and sometimes other phases.

Because of the large magnetic anisotropy of SmCo5, a 180° Bloch wall in this material should have a width of only 3.1 nm. Further, the domain wall energy density is 40 mJ/m^2 (40 erg/cm^2), 100 times that of a soft material. Such walls are not easily nucleated and thus the magnetization process in single-phase RCo5 intermetallics is limited by reversal domain nucleation. Once nucleation occurs, domain walls move relatively easily until they reach a grain boundary or other defect. Hence, initial efforts to produce cobalt rare-earth magnets focused on the fabrication of single-domain SmCo5 particles. Small substitutions of Cu for Co lead to the precipitation of a nonmagnetic phase that increases the coercivity. It was shown that heat treatment of R(CoCu)5 magnets results in precipitation of a dispersion of fine (d ≈ 10 nm) second-phase, Cu-rich particles in a R2Co17 matrix having a grain size of order 10 µm (Fig. 26). The coercivity mechanism becomes domain wall pinning on the small nonmagnetic SmCu5 particles.
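
The wall width and wall energy quoted for SmCo5 can be checked against each other using the standard 180° Bloch wall expressions δ = π sqrt(A/K) and σ = 4 sqrt(AK). These two expressions are assumed here (the article's Eq. (18) is not reproduced in this section); Ku = 10^7 J/m^3 and δ = 3.1 nm are the values given in the text. The exchange stiffness A is not quoted, so the sketch infers it from the wall width.

```python
import math


def exchange_stiffness_from_width(width, k_u):
    """Invert delta = pi * sqrt(A / K) for A (J/m); width in m, K in J/m^3."""
    return k_u * (width / math.pi) ** 2


def wall_energy(a_ex, k_u):
    """Bloch wall energy per unit area, sigma = 4 * sqrt(A * K), in J/m^2."""
    return 4.0 * math.sqrt(a_ex * k_u)


K_U = 1e7        # uniaxial anisotropy of SmCo5 from the text, J/m^3
WIDTH = 3.1e-9   # wall width from the text, m

a_ex = exchange_stiffness_from_width(WIDTH, K_U)  # inferred, not from the article
sigma = wall_energy(a_ex, K_U)

print(f"inferred A: {a_ex:.2e} J/m")
print(f"wall energy: {sigma * 1e3:.1f} mJ/m^2")
```

Under these assumptions the wall energy comes out close to 40 mJ/m^2, so the width and energy quoted above are mutually consistent, and the inferred exchange stiffness is of order 10^−11 J/m.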

The more recently developed magnets based on Fe14Nd2B exhibit the highest energy products achieved so far in permanent magnets. Their development came as

FIGURE 26 Precipitation microstructures of 1–5 type R-Co magnets, bulk-hardened with copper: SmCo3.5Cu1.0Fe0.5 homogenized at 1100°C for 3 hr, quenched and aged at 525°C. The fine (10-nm) dark precipitates are platelets of Cu-rich Sm(CuCo, Fe)5. [After Strnat, K. (1988). In “Ferromagnetic Materials,” Vol. 4 (Wohlfarth, E. P. and Buschow, K. H. J., eds.), Elsevier Press, North Holland, Amsterdam.]

FIGURE 27 X-ray composition micrograph of a sintered Nd0.15Fe0.77B0.08 magnet; T1, T2, and Nd denote Nd2Fe14B, Nd_{1+ε}Fe4B4, and an Nd-rich phase, respectively. [From Sagawa, et al. (1987). Jpn. J. Appl. Phys. 26, 785.]

a result of the cost and limited world supply of cobalt. Commercial Fe-Nd-B magnets based on sintering and melt spinning are available.

The attractive permanent magnet properties of Fe14Nd2B magnets arise from several factors: (1) the large uniaxial magnetic anisotropy (Ku = +5 × 10^6 J/m^3) of this tetragonal phase; (2) the large magnetization (Bs = 1.6 T) owing to the ferromagnetic coupling between the Fe and Nd moments; and (3) the stability of the 14-2-1 phase, which allows development of a composite microstructure characterized by 14-2-1 grains separated by nonmagnetic B- and Nd-rich phases (Fig. 27) which tend to decouple the magnetic grains. Figure 28 shows

FIGURE 28 The easy and hard axis magnetization curves for the two principal Co-Sm magnets and Fe14Nd2B. [After Strnat, K. (1988). In “Ferromagnetic Materials,” Vol. 4 (Wohlfarth, E. P. and Buschow, K. H. J., eds.), p. 131. Elsevier Press, North Holland, Amsterdam.]


TABLE V Comparison of Some Magnetic Properties for SmCo5, Sm2(CoFe)17, and Fe14Nd2B Permanent Magnets at 25°C

               µ0Ms      TC        Ku          iHc (MA/m)          (BH)max (MGOe)
               (T)       (°C)      (MJ/m^3)    Isotr.    Aligned   2-D       3-D

SmCo5          1.0       685–700   10          0.8–1     2.9       14–16     18–24

Sm2(CoFe)17    1.2–1.5   810–970   3.3         1–1.3     2.4       16–20     24–30

Fe14Nd2B       1.6       312       5           —         1.2–1.6        34–45

the easy and hard axis magnetization curves for the two principal Co-Sm magnets and Fe14Nd2B (see also Table V).
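
The energy products in Table V can be compared with a simple upper bound. For an idealized magnet whose magnetization stays fixed at Br/µ0 throughout the second quadrant, B = Br + µ0H and the product |BH| is largest at H = −Br/(2µ0), giving (BH)max = Br^2/(4µ0). This idealization (a perfectly square M-H loop with sufficient coercivity and Br equal to the saturation value) is an assumption made for illustration; the 1.6 T input is the µ0Ms value listed for Fe14Nd2B.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # vacuum permeability, T*m/A

# 1 MGOe = 1e6 * (1e-4 T) * (1000 / (4 pi) A/m), expressed in J/m^3.
J_PER_M3_PER_MGOE = 1e6 * 1e-4 * 1e3 / (4 * math.pi)


def ideal_max_energy_product(b_r):
    """Upper bound (BH)max = Br^2 / (4 mu_0) in J/m^3 for a square M-H loop."""
    return b_r**2 / (4.0 * MU_0)


bh_max = ideal_max_energy_product(1.6)  # Fe14Nd2B, Br taken equal to mu_0 Ms
bh_max_mgoe = bh_max / J_PER_M3_PER_MGOE

print(f"(BH)max bound: {bh_max / 1e3:.0f} kJ/m^3 = {bh_max_mgoe:.0f} MGOe")
```

The bound of about 64 MGOe lies above the 34–45 MGOe listed in Table V, as expected for real magnets with imperfect grain alignment and nonmagnetic intergranular phases.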

Many permanent magnets are essentially nanostructured materials consisting of high-anisotropy, single-domain particles (or multidomain particles with pinned domain walls) that are magnetically decoupled from each other.

IV. RELATED MAGNETIC PHENOMENA AND APPLICATIONS

A. Thin Film and Surface Magnetism

Research in magnetic thin films has grown in sophistication as advances in ultrahigh vacuum technology and surface characterization techniques kept pace with improvements in theoretical understanding and computational accuracy.

At a clean surface, the atoms of a material have reduced nearest-neighbor coordination, reduced symmetry, and generally different bond lengths compared to interior atoms. These changes generally result in narrower d bands and increased density of states, which in turn favor increased saturation magnetic moment and lower Curie temperature. First principles band structure calculations at surfaces bear this out. It is very difficult experimentally to isolate the magnetic moment of surface atoms sufficiently to detect a 10% increase in moment.

Near a surface, the narrower, more atomic-like d bands give rise to increased angular momentum; the reduced symmetry near a surface generally leads to stronger magnetic anisotropy. The ratio L_z^2/(L_x^2 + L_y^2) is increased because L = r × p and momentum perpendicular to the surface must be reduced. The surface magnetic anisotropy can be expressed as an energy density in powers of the direction cosines of M [cf. Eq. (11)] having the appropriate uniaxial symmetry:

\[ \sigma = K_1^s \alpha_3^2 + K_2^s \alpha_3^4 + K_3^s \alpha_1^2 \alpha_2^2 + \cdots \qquad (23) \]

With this convention, K_1^s > 0 (<0) favors in-plane (perpendicular) magnetization near the surface. The first term in Eq. (23) is most often measured in thin film systems and is clearly a function of the nature of the material across the interface from the magnetic species. Compared to enhanced surface magnetic moments, surface anisotropy is more readily determined experimentally but is harder to calculate from first principles. The literature abounds with observations of surface anisotropy effects in thin films and at surfaces.

The first term in Eq. (23) is often expressed as K^eff sin^2 θ, with the contributions to the effective anisotropy defined by

\[ K^{\mathrm{eff}} = -2\pi M_s^2 + 2 B_1 \varepsilon_{xx} + \frac{2K^s}{t}. \qquad (24) \]

Here B1 is the magnetoelastic coupling coefficient, ε_xx is the in-plane film strain, t is the film thickness, and K^s is the surface anisotropy [−K_1^s in Eq. (23)]. Measured values of K^s are typically of order 0.1–0.5 mJ/m^2.
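
The competition between the terms of Eq. (24) sets the thickness below which a film can be magnetized perpendicular to its plane. The sketch below is a minimal illustration in SI units, where the magnetostatic term −2πMs^2 of the Gaussian expression becomes −µ0Ms^2/2. The magnetization (roughly cobalt-like) and the surface anisotropy (upper end of the range quoted above) are assumed values, the magnetoelastic term is set to zero by default, and the helper assumes K^s > 0.

```python
import math

MU_0 = 4 * math.pi * 1e-7  # vacuum permeability, T*m/A


def k_eff(thickness, m_s, k_s, b1=0.0, strain=0.0):
    """Eq. (24) in SI form: -mu_0 Ms^2 / 2 + 2 B1 eps_xx + 2 Ks / t (J/m^3)."""
    return -0.5 * MU_0 * m_s**2 + 2.0 * b1 * strain + 2.0 * k_s / thickness


def critical_thickness(m_s, k_s, b1=0.0, strain=0.0):
    """Thickness at which K_eff changes sign, assuming k_s > 0 (m)."""
    volume_term = -0.5 * MU_0 * m_s**2 + 2.0 * b1 * strain
    if volume_term >= 0.0:
        return math.inf  # perpendicular magnetization favored at any thickness
    return 2.0 * k_s / (-volume_term)


M_S = 1.4e6   # saturation magnetization, A/m (assumed, roughly Co)
K_S = 0.5e-3  # surface anisotropy, J/m^2 (assumed)

t_c = critical_thickness(M_S, K_S)
print(f"critical thickness: {t_c * 1e9:.2f} nm")
print(f"K_eff at t_c / 2: {k_eff(0.5 * t_c, M_S, K_S):.3e} J/m^3")
print(f"K_eff at 2 t_c:   {k_eff(2.0 * t_c, M_S, K_S):.3e} J/m^3")
```

With surface anisotropy alone the crossover falls below 1 nm for these inputs, which suggests why a large positive magnetoelastic term is needed to sustain perpendicular magnetization in thicker films.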

Figure 29 shows the measured effective, first-order uniaxial anisotropy for Cu/Ni/Cu(001) epitaxial films. The positive values of K^eff over a relatively wide thickness range, 2 nm < t_Ni < 140 nm, imply perpendicular magnetization there. Here the perpendicular magnetization results from small magnetostatic energy and large positive magnetoelastic energy.

There has been considerable interest in the switching of the easy axis of magnetization in ultrathin, epitaxial Fe films (t_Fe < 1 nm). This phenomenon has been studied as a function of film thickness and temperature, and the results point to the origin of perpendicular magnetization in surface anisotropy.

Co thin films layered with Au (den Broeder, F. J. A. et al., 1988) or with Pd (Engle, B. et al., 1991) can show strong perpendicular magnetic anisotropy up to 2 nm in thickness. In these systems both magnetoelastic energy and interface anisotropy appear to be important for perpendicular magnetization.

B. Electronic Transport in Magnetic Materials

The electrical transport properties of magnetic materials (called galvanomagnetic effects) are interesting because the response of the conduction electrons can depend on


FIGURE 29 Variation of effective anisotropy times Ni thickness (K^eff × t) versus Ni thickness in Cu/Ni/Cu(001) sandwiches [after Jungblut, R. et al. (1994). J. Appl. Phys. 75, 6424; Bochi, G. et al. (1995). Phys. Rev. B 52, 7311; and Ha, K. et al. (1999). J. Appl. Phys. 85, 5282], using in situ magneto-optic Kerr effect, ex situ vibrating-sample magnetometry, and ex situ torque magnetometry, respectively, to determine the effective anisotropy. K^eff × t > 0 implies perpendicular magnetization in the uniaxial approximation. [From O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

their spin direction. The unique features result from the presence of both s and d electrons at the Fermi energy (Fig. 11e) as well as differences in the density of 3d↑ and 3d↓ states at E_F. In magnetic transition metals, the two equally populated sub-bands of s electrons (↑ and ↓) carry most of the current in two parallel spin channels. In each spin channel there is a small resistivity associated with s electron scattering and a larger resistivity associated with scattering of s electrons into localized d states, s-d scattering. To a first approximation, there is no mixing between the two spin channels. For iron, the Fermi energy cuts across both spin-up and spin-down d bands, hence there is a large s-d contribution to the resistivity in each channel. In Ni, the Fermi energy is above the top of the majority-spin d band and hence the resistivity in that band is less than in the minority-spin sub-band where s-d scattering can occur.

Magnetotransport effects in a given magnetic material can arise from two types of magnetic interactions: exchange coupling and spin-orbit interaction. A conduction electron generally has a greater scattering cross section at sites having spin opposite its own. This is referred to as exchange scattering or spin disorder scattering.

Conduction electrons can also interact with a scattering site by the spin-orbit interaction. The total angular momentum must be conserved during the scattering process.

Thus, an electron of spin-up before scattering can only scatter into a spin-up state (no change in spin-angular momentum) if it does not change its orbital angular momentum; it can scatter into a spin-down state if it increases its orbital angular momentum. These two spin-dependent scattering mechanisms can therefore serve to mix the carriers in the two spin sub-bands. Opening a path between two parallel conduction paths always lowers the net resistivity.

The “ordinary” galvanomagnetic effects, i.e., the Hall effect, E = R_H(J × µ0H), and magnetoresistance, are observed in most materials. These effects are classical and arise from the Lorentz force, F = q(v × B), on the charge carriers. In ferromagnetic materials, the ordinary effects are present but are generally overshadowed by phenomena with similar symmetries. The galvanomagnetic effects unique to ferromagnets are called “extraordinary,” “spontaneous,” or “anomalous” because of their greater strength relative to the ordinary effect. The extraordinary galvanomagnetic effects derive their strength from the fact that the role of the external field is replaced by the internal field, which is proportional to the magnetization and hence is much stronger than an applied field. The mechanism by which M couples to the electron trajectory in ferromagnets is the spin-orbit interaction between the current carrier (orbit) and the magnetization (spin).


In ferromagnetic materials the Hall resistivity may be written

\[ \rho_H = \frac{E_H}{J} = \rho_H^{o} + \rho_H^{s} = R_o B + R_s \mu_0 M \qquad (25) \]

where the first term is the ordinary Hall resistivity, proportional to the external field, and the second term is the spontaneous effect, proportional to the magnetization. We can write these two terms as a simple sum because the vector symmetry of the spin-orbit energy, L · s, responsible for the spontaneous Hall effect, is compatible with the energy of the classical Lorentz force, r · F ∝ r · (v × B) = (r × v) · B ∝ L · M ∝ L · s.

The anisotropic magnetoresistance (AMR) may be determined by extrapolation of high field MR (ordinary MR) data to H = 0 (see Fig. 30). The fractional change in resistance with field due to anisotropic magnetoresistance, while only a few percent in the best cases (Ni90Fe10), is used in numerous sensor applications. While the Hall effects are linear in B = µ0(H + M), the MR effects are quadratic in B.

There is a more recently discovered galvanomagnetic effect that is unique to electronic transport in thin films. Called giant magnetoresistance (GMR), it is of a different physical origin than AMR. GMR arises from spin-spin exchange scattering and requires small dimensions in the components of a ferromagnetic/noble metal/ferromagnetic sandwich or multilayer. Baibich, M. N. et al. (1988) reported an MR ratio of order 50% at 4.2 K in Fe/Cr multilayers. This magnetoresistance was approximately an order of magnitude greater than the highest values known to that time. The Fe layers in these experiments were typically 30–60 Å thick and separated by Cr layers from 9 to 60 Å. The iron layers are strongly coupled

FIGURE 30 (a) Resistivity of Ni0.9942Co0.0058 at room temperature versus applied field (McGuire, T. 1975). (b) Low-field magnetoresistance for cobalt thin film showing even field symmetry and hysteresis. [After Parkin, S. S. P. (1994). In “Ultrathin Magnetic Structures,” Vol. II (Heinrich, B. and Bland, J. A. C., eds.), Springer-Verlag, Berlin; O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

antiferromagnetically through the Cr layers and hence are difficult to saturate. If electrons bearing the polarization of one ferromagnetic layer drift into the other (through a thin, 2–3 nm, layer of noble metal), the scattering probability is greater or less depending on whether the polarizations of the two layers are antiparallel or parallel. In order for the difference in scattering probability to show up as a large ΔR/R, two general conditions should be satisfied.

1. The thicknesses of the three layers must be small enough that a large fraction of the charge carriers from one ferromagnetic component diffuse into the other before experiencing spin-dependent scattering.

2. The magnetization in the two components should be able to be controlled independently, either by having different coercivities or different anisotropy fields.
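
The dependence of the resistance on the relative alignment of the two layers can be sketched with the simplest two-current resistor picture: each spin channel sees a low resistance r in a layer magnetized parallel to its spin and a high resistance R in a layer magnetized antiparallel, and the two channels conduct in parallel without mixing. This series-resistor model and the numerical values of r and R are assumptions for illustration; they are not taken from the article.

```python
def gmr_two_current(r_low, r_high):
    """Two-current series-resistor estimate for a two-layer GMR structure.

    Returns (R_parallel, R_antiparallel, (R_AP - R_P) / R_P).
    """
    # Parallel alignment: one channel sees r + r, the other R + R.
    r_parallel = (2.0 * r_low) * (2.0 * r_high) / (2.0 * r_low + 2.0 * r_high)
    # Antiparallel alignment: each channel sees r + R.
    r_antiparallel = 0.5 * (r_low + r_high)
    ratio = (r_antiparallel - r_parallel) / r_parallel
    return r_parallel, r_antiparallel, ratio


# Assumed illustrative channel resistances in arbitrary units.
r_p, r_ap, gmr_ratio = gmr_two_current(r_low=1.0, r_high=4.0)

print(f"R_P = {r_p:.3f}, R_AP = {r_ap:.3f}, (R_AP - R_P) / R_P = {gmr_ratio:.4f}")
```

In this picture the ratio equals (R − r)^2/(4rR), which vanishes when the two spin channels scatter equally and grows with the spin asymmetry of the scattering. The first condition above ensures that carriers actually sample both layers, and the second makes it possible to switch between the two alignments.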

A spin valve is a simple embodiment of the GMR effect in which there are only two magnetic layers separated by a nonmagnetic conductor. The magnetic layers are uncoupled or weakly coupled, in contrast to the generally strong AF exchange operating in Fe/Cr-like multilayer systems. The magnetoresistance can be made to change in fields of a few tens of Oe rather than tens of kOe. One of the layers is magnetically soft and the other is magnetically hard. Thus, a modest field can cause a change in the angle between the moments of these two magnetic layers.

In operation of the spin valve, cycling the field causes M1 and M2 to lie antiparallel or parallel to each other. The resulting M-H loops are shown schematically in Fig. 31. The sharp magnetization reversal near H = 2 Oe is due to the switching of the soft NiFe layer 2 in the presence of its weak coupling to layer 1. The more rounded magnetization reversal near 100 Oe is the switching of the hard layer. The relative orientations of layers 1 and 2 are indicated by the pairs of arrows in each region of the M-H curve. In


FIGURE 31 Room temperature magnetization and relative change in resistance for Si/(NiFe 150 Å)/(Cu 26 Å)/(NiFe 150 Å)/(FeMn 100 Å)/(Ag 20 Å). Current is perpendicular to the easy axis determined by the FeMn film, which is exchange coupled to the adjacent permalloy layer. [After Dieny, B. et al. (1991). Phys. Rev. B 43, 1297.]

the lower panel, the change in resistance during the same magnetization cycling is shown. The GMR resistance is larger for antiparallel alignment of the two magnetic layers, whereas the classic AMR of a single magnetic layer generally shows ρ‖ > ρ⊥.

Spin tunnel junctions have some similarities to spin valves and spin switches in their structure and field dependence. However, in a tunnel junction, the nonmagnetic spacer layer is an insulator.

A tunnel junction is conveniently formed from a crossed pair of metal film stripes. The first deposited stripe, e.g., aluminum, may be oxidized partially to form a barrier before deposition of the second electrode. A voltage applied across such a junction can result in a current if there are occupied states in one electrode at the same energy as unoccupied states in the other, plus or minus kBT.

Moodera et al. (1996) have shown that by using thin magnetic films of different coercivities, the ferromagnet-insulator-ferromagnet (F-I-F) tunnel junction can form a sensitive magnetic field probe. Figure 32 shows the fractional resistance change (ΔR normalized to the high-field value of resistance) in FeCo-Al2O3-Co junctions. Also shown is the AMR measured in each individual electrode. These AMR measurements show that the small value of the AMR effect contributes little to the tunneling MR effect. They also clearly indicate the coercivities of the two uncoupled ferromagnetic layers. The field dependence of the tunneling MR ratio then appears much like that of a spin valve or a spin switch, with higher resistance when the two ferromagnetic electrodes are magnetized antiparallel to each other.

Very large magnetic-field-induced changes in resistivity have been observed in the doped perovskite lanthanum-strontium manganate, (La1−xSrx)MnO3 (Jin et al., 1994). This so-called colossal magnetoresistance (CMR) occurs in a region where a metal-insulator transition coincides with the ferromagnetic-paramagnetic transition. The resistivity increases with increasing temperature in the metallic magnetic phase. Application of a field expands the ferromagnetic phase, displacing the metal-insulator transition to higher temperatures and hence displacing the sharp increase in metallic resistivity. Thus, by the chain rule, the MR ratio is proportional to the temperature derivative of the R(T) curve times the field derivative of the metal-insulator transition temperature. That is, the sharper the resistivity transition and the stronger the field dependence of that transition, the greater will be the CMR ratio.
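The chain-rule argument can be written out explicitly. The following is only a sketch under a simplifying assumption that is not stated in the text: the applied field is taken to shift the zero-field curve R_0(T) rigidly along the temperature axis by the displacement of the metal-insulator transition temperature T_MI(H).

```latex
R(T,H) = R_0\bigl(T - [\,T_{MI}(H) - T_{MI}(0)\,]\bigr)
\quad\Longrightarrow\quad
\frac{1}{R}\frac{\partial R}{\partial H}
  = -\,\frac{1}{R}\,\frac{\partial R}{\partial T}\,\frac{dT_{MI}}{dH},
\qquad
\frac{\Delta R}{R} \approx -\,\frac{1}{R}\,\frac{\partial R}{\partial T}\,\frac{dT_{MI}}{dH}\,\Delta H .
```

Within this model the resistance change is negative wherever R rises steeply with temperature and the field moves the transition upward (∂R/∂T > 0, dT_MI/dH > 0), and its magnitude grows with both derivatives, as the paragraph states.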

Although CMR is a different physical effect from GMR, it has a similar formal dependence on magnetization orientation: aligning the moments on adjacent cation sites (AF to F) causes the resistance to decrease. This is due to the increase in hopping conductivity of the cation e_g electrons for parallel spins. Fields of tens of kOe are needed to saturate the effect because they are working against thermal energy, which overcomes the long-range ferromagnetic coupling above the Curie temperature.

C. Magnetic Recording

A variety of magnetic materials, transport effects, and thin-film and nanostructured materials find applications in magnetic data storage systems.

FIGURE 32 Above, anisotropic magnetoresistance in each individual CoFe and Co electrode, and below, junction magnetoresistance in CoFe/Al2O3/Co spin tunnel junction versus applied field. Measurements were done at room temperature, and arrows indicate the relative directions of magnetization in the two magnetic layers. [After Moodera, J. et al. (1996). Appl. Phys. Lett. 69, 708.] Note similarity with Fig. 30.


FIGURE 33 Schematic representation of longitudinal, digital magnetic recording write processes. Insert, upper right: the sequence of transitions constitutes the bits which are read as binary information. [From O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

In the digital recording process (Fig. 33), the magnetic recording medium (tape or disk) moves relative to an electromagnetic transducer that is essentially a magnetic circuit with a gap. When a current passes through windings about the head, the head is magnetized and a fringe field appears in the gap. In the write process, the fringe field in the gap magnetizes the medium alternately in one direction or the other as the drive current changes polarity. Because the head and the medium move relative to each other, information can be described in the head reference frame in terms of the variable ωt (e.g., exp(−iωt)) or in the medium reference frame by the variable kx, where k = 2π/λ. The sequence of binary states has digital information significance (Fig. 33, inset). A clock sets the system frequency, indicating when or where a transition might occur. The presence or absence of a transition at expected intervals (called bits) is read as a “one” or a “zero” to represent binary coded information.
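The transition-based coding described above can be illustrated with a short sketch. It assumes, for illustration only, that the magnetization of each clocked bit cell is represented as +1 or −1 and that a "one" is written by reversing the drive-current polarity; the function names are hypothetical and no real channel code (with its run-length constraints) is modeled.

```python
def write_bits(bits, initial=+1):
    """Return the magnetization direction (+1 or -1) left in each bit cell.

    A 1 reverses the drive-current polarity (a transition is written);
    a 0 leaves the polarity unchanged (no transition).
    """
    state = initial
    cells = []
    for bit in bits:
        if bit == 1:
            state = -state
        cells.append(state)
    return cells


def read_bits(cells, initial=+1):
    """Recover the bits from the presence or absence of a transition."""
    bits = []
    previous = initial
    for magnetization in cells:
        bits.append(1 if magnetization != previous else 0)
        previous = magnetization
    return bits


data = [1, 0, 1, 1, 0, 0, 1]
cells = write_bits(data)
recovered = read_bits(cells)
print(cells, recovered)
```

The point of the sketch is that the information lives in the transitions, not in the absolute magnetization direction, which is why the clock that marks the expected transition positions is essential.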

The write head must have adequate magnetic permeability at high frequency so that it can be driven to saturation with minimal current. The write head must have a high enough saturation magnetization so that its fringe field exceeds the coercivity of the medium, typically 40–250 kA/m (500–3000 Oe). Ni81Fe19 (μ0Ms ≈ 1.0 T) is generally used in thin-film write heads, but higher induction permalloys, such as Ni50Fe50, and iron nitrides based on Fe16N2 (μ0Ms ≈ 3 T) are beginning to be used. The film thickness in the write head is typically 2–3 μm, and the air gap between the head and medium is of order 100 nm for high density recording.
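The paragraph mixes SI and cgs units, so a small conversion helper may be useful. This is a minimal sketch; the function names are hypothetical, and the only physics used is 1 Oe = 1000/(4π) A/m and Ms = (μ0Ms)/μ0.

```python
import math

MU0 = 4 * math.pi * 1e-7  # vacuum permeability in T·m/A (classical value)


def oe_to_ka_per_m(field_oe):
    """Convert a magnetic field from oersted to kA/m."""
    return field_oe / (4 * math.pi)  # (1000/(4*pi) A/m per Oe) / 1000


def tesla_to_ka_per_m(mu0_ms):
    """Convert a saturation induction mu0*Ms (tesla) to Ms in kA/m."""
    return mu0_ms / MU0 / 1000.0


for h_oe in (500, 3000):
    print(h_oe, "Oe =", round(oe_to_ka_per_m(h_oe), 1), "kA/m")
print("mu0*Ms = 1.0 T corresponds to Ms =", round(tesla_to_ka_per_m(1.0), 1), "kA/m")
```

The conversion reproduces the quoted coercivity range (500–3000 Oe is roughly 40–240 kA/m) and shows that the saturation magnetization of a 1.0 T head material, about 800 kA/m, comfortably exceeds that range.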

A read head, on the other hand, operates from its quiescent or demagnetized state. A read head should respond to the fringe field of the medium by magnetization rotation rather than wall motion (wall motion generates noise). Thus, the read head material should be able to be field annealed to develop a weak, cross-track uniaxial anisotropy in order to define the demagnetized domain state. The read head must have low coercivity, low noise, and extremely high permeability in order to respond with a substantial change in flux to the weak fringe field above the medium. Near-zero-magnetostriction permalloy is generally used in thin-film read heads. The read and write functions can be filled by the same inductive head, but there are advantages to separating these functions.

Thompson et al. (1975) described the use of the MR effect in magnetic recording heads. The resistance versus field for the anisotropic MR effect follows the general form shown in Fig. 30a: Δρ(H)/ρ = [Δρ/ρ](cos²θ − 1/2). A bias field is needed to allow the head to operate on the steep, nearly linear portion of the curve. Shield layers on either side of the MR element were found to increase its sensitivity and reduce signal pickup from adjacent transitions.
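The need for a bias field can be seen from the angular dependence alone. The sketch below assumes the usual cos²θ form of the AMR response, with θ taken as the angle between the magnetization and the current (an assumption of this example); the function names and the unit amplitude are illustrative.

```python
import math


def amr_ratio(theta, amplitude=1.0):
    """Relative resistance change, amplitude * (cos^2(theta) - 1/2)."""
    return amplitude * (math.cos(theta) ** 2 - 0.5)


def amr_sensitivity(theta, amplitude=1.0):
    """Derivative of amr_ratio with respect to theta: -amplitude * sin(2*theta)."""
    return -amplitude * math.sin(2 * theta)


for degrees in (0, 15, 30, 45, 60, 90):
    theta = math.radians(degrees)
    print(degrees, round(amr_ratio(theta), 3), round(amr_sensitivity(theta), 3))
```

At θ = 0 the slope vanishes, so a small signal field produces almost no first-order change in resistance. Near θ = 45° the slope is largest and the response is close to linear, which is the operating point that a bias field is meant to establish.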

Spin valves are also used as magnetic read heads. They show increased sensitivity compared to AMR heads and allow for higher recording density.

Magnetic recording media are ideally composed of a regular array of isolated single-domain magnetic elements. These elements should be capable of being magnetized using a reasonable field strength. Further, they should be bistable, i.e., when the field is removed, the elements should have a large remanent magnetization. The medium should be composed of small, independent magnetic entities (grains or single-domain particles) which can retain their direction of magnetization across a sharp transition. A bit ideally should consist of a single-domain, isolated magnetic particle. Because this is generally not practical, approximately N = 10^3 particles should constitute a bit in order to ensure a sharp transition.
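The figure of about 10^3 particles per bit can be related to a simple counting estimate. This is a sketch that assumes the grains contribute statistically independently (an assumption of this example, not a statement from the text), so that the signal scales with N and the random fluctuation with √N.

```latex
\frac{\text{signal}}{\text{noise}} \sim \frac{N}{\sqrt{N}} = \sqrt{N},
\qquad
\mathrm{SNR}_{\text{power}} \sim N
\;\Rightarrow\;
10\log_{10} N = 30\ \mathrm{dB}\quad\text{for } N = 10^{3},
\qquad
\frac{1}{\sqrt{N}} \approx 0.03 .
```

Under this assumption a bit of 10^3 grains carries a relative amplitude fluctuation of roughly 3%, and reducing the grain count per bit by a factor of ten costs about 10 dB.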

Currently used particulate magnetic media include chromium dioxide, CrO2, so-called metal particles (iron with an unavoidable oxide surface layer), variants of γ-Fe2O3 usually doped with cobalt, as well as Ba or Sr hexaferrite (BaO·6Fe2O3).

The most widely used magnetic recording media today are the thin-film media found in hard disk drives. Longitudinal thin-film media are based on CoCr with Pt and Ta additions. Pt is used to increase the magnetic anisotropy of the cobalt-rich film. It also improves the epitaxial relation between the cobalt film and the Cr underlayer. Cr plays a role in isolating the magnetic grains. The use of Ta as an alloying addition in CoCr longitudinal media is found to enhance segregation of Cr to the grain boundaries as well as to improve epitaxy to the Cr underlayer.

One way to achieve low-noise, high-density media is to make each bit consist of a single piece or grain of magnetic


FIGURE 34 Schematic of the read and write processes in a pseudo-spin-valve random access memory. The write process involves current pulses through both the word line and the sense line such that the field at the PSV exceeds Hc2. The read process involves a field pulse that takes the device to the high-resistance state without switching the semihard layer (Hc1 < H < Hc2). [From O’Handley, R. C. (2000). “Modern Magnetic Materials, Principles and Applications.” Reprinted with permission of John Wiley and Sons.]

material. Such bits should be arranged periodically in order to be synchronized with the signal channel. This can be achieved using high resolution lithography. The term patterned media is used to refer to media for which each bit consists of a single, lithographically defined grain. Such a recording medium eliminates the random √N noise associated with multigrain bits. It also eliminates the noise associated with the irregular or saw-tooth transitions found in thin-film media. Patterned media will allow for high bit densities because the superparamagnetic limit will thus apply to a single bit, not to each of the many grains in a multigrain bit. Finally, the patterning process defines a sharper transition between bits, and dispersion of easy axes can be minimized relative to that in thin-film media. Thus patterned media have relaxed conditions on coercivity and Mr·t product.

Considerable current interest is focused on a class of storage devices called magnetic random access memories (MRAMs). They have several advantages over hard disk drives: they have no moving parts, need no read or write heads, and offer the ability to access information at an arbitrary sequence of addresses (random access) as opposed to sequential access as in tape and disk storage. An MRAM is basically an array of spin valves, pseudo-spin valves, or spin tunnel junctions (Fig. 34), each of which can be set to a given bistable state by a relatively large write field; that state can be read without destroying it by using a smaller read field.

The write process consists of magnetizing both the free and semihard layers in one direction or another by an appropriately directed word current pulse (and simultaneous sense-line current). After the write process, the two layers are in their remanent states and the resistance of either state has the same minimum value (Fig. 34).

The MRAM read process consists of applying a bipolar current pulse to the word line. This pulse produces a field sufficient to switch the soft layer but not the hard layer: Hc1 < Hword < Hc2. Thus, depending upon the state of the element, “0” or “1,” the resistance in the sense line changes in phase or out of phase, respectively, with the word current pulse. After application of the read pulse, the MR element reverts to its original remanent state (↑↑ or ↓↓). This is only possible if the two magnetic layers are ferromagnetically exchange coupled through the Cu spacer, or if the read pulse is followed by a small reset pulse of opposite polarity.
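The read scheme can be illustrated with a toy model. The sketch below is an idealization, not the device physics: each layer is a single macrospin of value +1 or −1, the soft layer switches instantly to the field direction once |H| exceeds Hc1, the hard layer never switches because Hword < Hc2, and the resistance takes one of two values. All names and numerical values are hypothetical.

```python
def soft_layer_after_field(field, m_soft, hc1):
    """Soft layer follows the field direction once |field| exceeds Hc1."""
    if abs(field) > hc1:
        return 1 if field > 0 else -1
    return m_soft


def read_cell(m_hard, h_word, hc1, hc2, r_low=1.0, r_high=1.1):
    """Apply a bipolar word-line pulse (+h_word, then -h_word).

    Returns the sense-line resistance during each half of the pulse.
    The hard layer (stored state m_hard) is left untouched.
    """
    if not hc1 < h_word < hc2:
        raise ValueError("read field must satisfy Hc1 < Hword < Hc2")
    m_soft = m_hard  # remanent state after writing: layers parallel
    resistances = []
    for field in (+h_word, -h_word):
        m_soft = soft_layer_after_field(field, m_soft, hc1)
        # Antiparallel alignment gives the high-resistance state.
        resistances.append(r_low if m_soft == m_hard else r_high)
    return resistances


up_state = read_cell(+1, h_word=10.0, hc1=5.0, hc2=50.0)
down_state = read_cell(-1, h_word=10.0, hc1=5.0, hc2=50.0)
print(up_state, down_state)
```

The two stored states give resistance sequences of opposite phase relative to the word pulse, which is the signature the sense line detects. Which sequence is labeled "0" and which "1" is a matter of convention and is left open here.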


The materials requirements for MRAMs are similar to those of spin valves and spin tunnel junctions, plus the additional restrictions imposed by nanofabrication and multimaterial compatibility.

D. Postscript

The recent history of magnetic materials research and development is characterized by frequent reinvention of the field. The decade of the 1960s was dominated by work on permalloys, ferrites, transport phenomena, and magnetic resonance in bulk and thin film samples. In the 1970s, magnetic “bubble films” (perpendicularly magnetized domains in which information could be stored and manipulated) and rapid developments in amorphous magnetic alloys attracted considerable attention. The 1980s saw continued improvements in amorphous magnetic alloys, rapidly accelerating activity in magnetic thin films and surfaces, and the magnetic recording developments those activities supported, as well as the emergence of Fe-Nd-B permanent magnets. The 1990s witnessed enormous growth in magnetic recording research, development of nanocrystalline magnetic materials, and extensive basic and applied work on a host of new magnetotransport phenomena.

The immediate future of magnetic materials will involve greater use of nanofabrication (thin films, phase separation) to engineer new properties and devices. Many of these new magnetic materials technologies will converge in the emerging field of spintronics, where spin-dependent transport phenomena hold the potential to supplant semiconductors in some microelectronic applications.

SEE ALSO THE FOLLOWING ARTICLES

FERROMAGNETISM • GEOMAGNETISM • MAGNETIC FIELDS IN ASTROPHYSICS • MAGNETIC RECORDING • THIN FILM TRANSISTORS

BIBLIOGRAPHY

Baibich, M. N. et al. (1988). Phys. Rev. Lett. 61, 2742.
Bozorth, R. M. (1955, 1993). “Ferromagnetic Materials,” Van Nostrand, New York; IEEE Press, New York.
Cadeville, M. et al. (1996). J. Phys. (Paris) 27, 29.
Chikazumi, S. (1997). “Physics of Ferromagnetism,” Oxford University Press, Oxford.
Corb, B. W. et al. (1983). Phys. Rev. B 27, 636.
Diedrichs, P. H. et al. (1991). In “Magnetism in the Nineties,” (Freeman, A. J. and Gschneider, K. eds.), North-Holland, Amsterdam.
den Broeder, F. J. A. et al. (1988). Phys. Rev. Lett. 60, 2769.
Engle, B. et al. (1991). Phys. Rev. Lett. 67, 1910.
Hasegawa, H. et al. (1993). J. Mater. Eng. Perf. 2, 181.
Hasegawa, H. et al. (1991). J. Appl. Phys. 70, 6253.
Herzer, G. (1993). J. Mater. Eng. and Perf. 2, 193.
Jin, S. et al. (1994). Science 264, 413.
Livingston, J. D. (1996). “Driving Force: The Natural Magic of Magnets,” Harvard Univ. Press, Cambridge MA.
Moodera, J. et al. (1996). Appl. Phys. Lett. 69, 708.
O’Handley, R. C. (2000). “Modern Magnetic Materials; Principles and Applications,” John Wiley and Sons, New York.
Willard, M. A. et al. (1998). J. Appl. Phys 84, 1.
Yoshizawa, Y. et al. (1988). J. Appl. Phys. 64, 6044.


Encyclopedia of Physical Science and Technology EN011B-553 July 25, 2001 16:54

Permittivity of Liquids
J. Barthel
R. Buchner
University of Regensburg

I. Phenomenological Aspects
II. Analysis of Complex Permittivity Spectra of Liquids
III. Experimental Methods
IV. Molecular Interpretation of Relaxation Modes

GLOSSARY

Dielectric relaxation Delayed response of the electric polarization of a material system to a perturbation of the electric field.

Dispersion Frequency dependence of a material property, here the relative permittivity or specific conductivity of a solution.

Kinetic depolarization Reduction of polarization of the dipolar solvent molecules of electrolyte solutions (and in turn the reduction of ionic mobilities) resulting from the torque produced by an ion in its adjacent solvent molecules, counteracting the force of the external electric field.

Libration Partial reorientation (small-angle oscillation) of molecular dipoles or the translational (linear) oscillations of ions in the cage produced by the adjacent molecules.

Loss angle Phase angle between the polarization and electric field vectors of a dissipative system, characterizing the energy absorption of the sample.

Relaxation time Time constant of a relaxation process; inverse of the rate for approach to equilibrium.

Rotational diffusion Random (Brownian) movement of a molecular probe vector, the dipole moment in the case of dielectric relaxation, referred to an initial state.

Step response function Characteristic time-dependent function controlling the development of polarization toward equilibrium after a jump of the electric field.

THE APPLICATION of an electric field E (E, electric field strength) to a liquid polarizes the molecules of the fluid and produces a charge transport if the investigated sample is an electrolyte solution.

The polarization of the molecules results from an alignment of their permanent dipole moments μ against thermal motion (orientation effect, orientational polarization Pμ) and the induced dipole moments, μind = αE, because of the action of the electric field on their polarizability α (deformation effect, induced polarization Pα).

The total polarization P of the sample relates the macroscopically measurable relative permittivity ε (ε0, permittivity of the vacuum) to the microscopically defined molecular quantities μ and α, owing to its definitions at both macroscopic and microscopic levels.


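$$\text{macroscopic:}\quad \varepsilon_0(\varepsilon - 1)\mathbf{E} = \mathbf{P} = \mathbf{P}_\mu + \mathbf{P}_\alpha \quad \text{:microscopic} \tag{1}$$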

The flow of electric charges in electrolyte solutions is caused by the gradient of the galvanic potential ψ, E = −grad ψ. It satisfies Ohm’s law, j = κE (j, electric current density; κ, specific conductivity).

The material properties ε and κ depend on the frequency ν (ν, linear frequency; ω = 2πν, circular frequency) of the applied electric field E. Separate measurements under static and quasi-static (very low frequency) conditions yield the static values of permittivity ε and specific conductivity κ; but note that for κ > 0 the electric current prevents the direct determination of ε.

Static permittivity ε is related to the equilibrium polarization of the sample. In this case, no energy is dissipated in the system. With increasing frequency, situations are reached in which the polarity change of the electric field causes significant variation of the electric field strength within periods that are characteristic of molecular motions (dipole rotation, ionic mobility, etc.) or reactions (formation of ion pairs or hydrogen bonds, etc.). Then the polarization lags behind the electric field, and energy is dissipated in the system. This effect is commonly called dielectric dispersion or dielectric relaxation. The energy dissipation, expressed as dielectric loss, is used in practice for dielectric heating devices, such as the common microwave oven. At optical frequencies, the relaxation processes are paralyzed, and only the resonance processes (intramolecular atomic and electronic movements) underlying induced polarization are observable.

The ultimate aim of dielectric relaxation spectroscopy (DRS) is the deconvolution of the observed dielectric relaxation behavior into individual contributions which can be interpreted on a molecular scale. Such an analysis yields a wealth of information about molecular and cooperative motions, kinetic processes, and liquid structure which is often not available by other methods. In principle, processes ranging in time scale from several hundred femtoseconds to hours are accessible to dielectric relaxation measurements. Figure 1 gives a survey of processes and corresponding frequencies relevant for the dielectric relaxation of common solvents and electrolytes around ambient temperature. In Section IV, a brief introduction to some relaxation mechanisms will be given.

Note that for conducting samples there is always a contribution to energy dissipation from Ohm’s law. The characteristic time constant of conductivity is the relaxation time for the reestablishment of electroneutrality.

I. PHENOMENOLOGICAL ASPECTS

FIGURE 1 Frequency regions of relaxation processes of the solvent and of the electrolyte which may contribute to the dielectric permittivity and relaxation of liquids and solutions around room temperature.

The phenomenological description of the interactions between material systems and electromagnetic fields is based on the Maxwell equations

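$$\operatorname{curl}\mathbf{H} = \mathbf{j} + \frac{\partial \mathbf{D}}{\partial t} \tag{2}$$

$$\operatorname{curl}\mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t} \tag{3}$$

$$\operatorname{div}\mathbf{D} = \rho \tag{4}$$

$$\operatorname{div}\mathbf{B} = 0 \tag{5}$$

and the relations

$$\mathbf{D} = \varepsilon_0 \varepsilon \mathbf{E} \tag{6}$$

$$\mathbf{B} = \mu_0 \mu \mathbf{H} \tag{7}$$

$$\mathbf{j} = \kappa \mathbf{E} \tag{8}$$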

which define the material properties ε (relative permittivity; ε0, permittivity of the vacuum), μ (relative permeability; μ0, permeability of the vacuum), and κ (specific conductivity), using the electric vectors E (electric field strength), D (dielectric displacement), and j (electric current density) and the magnetic vectors H (magnetic field strength) and B (magnetic flux density); ρ is the electric charge density. Polarization P is related to the electric field strength E via Eq. (1). The contributions Pμ and Pα to the total polarization, Eq. (1), are considered to be linearly independent. This allows us to express Pα with the help of the “infinite frequency” permittivity ε∞, separating relaxation and resonance processes so that the total polarization P can be split into two parts

$$\mathbf{P}_\alpha = \varepsilon_0(\varepsilon_\infty - 1)\mathbf{E} \tag{9}$$

$$\mathbf{P}_\mu = \varepsilon_0(\varepsilon - \varepsilon_\infty)\mathbf{E} \tag{10}$$

A time domain (TD) experiment, Fig. 2(a), shows the response of Pα and Pμ when the static field E applied


FIGURE 2 Methods for the determination of complex permittivity: (a) time domain method: application of a jump in the field strength |E| and measurement of the response $P(t) = P^{\mathrm{eq}} \cdot F_p(t)$; (b) frequency domain method: application of a harmonic field $E(t) = E_0 \cdot \exp(i\omega t)$ and measurement of amplitude and phase of the response $P_\mu(t, \omega) = \varepsilon_0(\varepsilon - \varepsilon_\infty)E(t)$.

initially to a liquid sample is switched off at time t = 0. Pα breaks down without a time lag, that is, it is always in equilibrium with E, whereas Pμ decreases monotonically with time to its final value Pμ(∞) = 0. Formally, this can be expressed by the relationship

$$\mathbf{P}_\mu(t) = \mathbf{P}_\mu^{\mathrm{eq}} \cdot F_p^{\mathrm{or}}(t) \tag{11}$$

with the static (equilibrium) orientational polarization $\mathbf{P}_\mu^{\mathrm{eq}} = \mathbf{P}_\mu(0)$ and the step response function (time autocorrelation function) of the orientational polarization

$$F_p^{\mathrm{or}}(t) = \frac{\langle \mathbf{P}_\mu(t) \cdot \mathbf{P}_\mu(0) \rangle}{\langle \mathbf{P}_\mu(0) \cdot \mathbf{P}_\mu(0) \rangle} = \frac{\langle \mathbf{M}_\mu(t) \cdot \mathbf{M}_\mu(0) \rangle}{\langle \mathbf{M}_\mu(0) \cdot \mathbf{M}_\mu(0) \rangle} \tag{12}$$

which can be shown to be identical with the autocorrelation function of the macroscopic dipole moment of the sample, $\mathbf{M} = \sum_i \boldsymbol{\mu}_i(t)$. M is defined as the vector sum of all constituting molecular dipole moments, μi(t).

Conversion to the frequency domain (FD) is made by considering that an arbitrary time dependence of the field strength E can be expressed as a sequence of infinitely small steps. In the case of a harmonically changing electric field, E(t) = E0 exp(iωt), that is, a monochromatic electromagnetic wave of circular frequency ω, the relation

$$\mathbf{P}_\mu(\omega, t) = \varepsilon_0(\varepsilon - \varepsilon_\infty)\,\mathbf{E}(t) \int_0^\infty \left( -\frac{\partial F_p^{\mathrm{or}}(t')}{\partial t'} \right) \exp(-i\omega t')\, dt' = \varepsilon_0(\varepsilon - \varepsilon_\infty)\, F(\nu)\, \mathbf{E}(t) \tag{13}$$

is obtained, which expresses the phase shift and amplitude modulation of Pμ(ω, t) relative to E(ω, t), Fig. 2(b). The relaxation function F(ν) is the frequency domain analogue of the autocorrelation function $F_p^{\mathrm{or}}(t)$; note that F(ν) is the one-sided Fourier transform of the negative time derivative of $F_p^{\mathrm{or}}(t)$, commonly called the pulse response function, $f_p^{\mathrm{or}}(t) = -\partial F_p^{\mathrm{or}}(t)/\partial t$. With the field written as exp(iωt), the kernel exp(−iωt′) corresponds to the convention ε = ε′ − iε″ for the complex permittivity.
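As a numerical illustration of the transform defining F(ν), the sketch below integrates the pulse response of an exponential decay, f(t) = exp(−t/τ)/τ, and compares the result with the closed form 1/(1 + iωτ). It assumes the sign convention ε = ε′ − iε″, so the kernel is exp(−iωt′); the function name, the trapezoidal rule, and the 10 ps relaxation time are illustrative choices, not taken from the text.

```python
import cmath
import math


def relaxation_function(omega, tau, n_steps=100_000, t_max_factor=40.0):
    """Trapezoidal estimate of the integral of f(t) * exp(-i*omega*t) over t >= 0,

    where f(t) = exp(-t/tau) / tau is the pulse response of an
    exponential step response F(t) = exp(-t/tau).
    """
    t_max = t_max_factor * tau  # the integrand is negligible beyond this
    dt = t_max / n_steps
    total = 0j
    for k in range(n_steps + 1):
        t = k * dt
        weight = 0.5 if k in (0, n_steps) else 1.0
        total += weight * (math.exp(-t / tau) / tau) * cmath.exp(-1j * omega * t)
    return total * dt


tau = 10e-12          # illustrative relaxation time of 10 ps
omega = 1.0 / tau     # circular frequency at the loss maximum
numeric = relaxation_function(omega, tau)
analytic = 1.0 / (1.0 + 1j * omega * tau)
print(numeric, analytic)
```

The negative imaginary part of the result is what becomes the positive loss ε″ in the convention ε = ε′ − iε″; choosing the kernel exp(+iωt′) instead would flip that sign.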

To incorporate the observed energy absorption, the treatment of dissipative systems exposed to electromagnetic waves of circular frequency ω may be advantageously based on the use of electric and magnetic vectors of the type A(t) = A0 · exp(iωt) in the Maxwell equations (A = E, D, P, j, H, B) and complex material properties such as ε = ε′ − iε″, κ = κ′ − iκ″, and μ = μ′ − iμ″. Note that permeability can usually be set equal to unity in liquids. The comparison of Eqs. (9), (10), and (1), after transformation to complex notation, then yields the frequency dependence of permittivity:

$$\varepsilon(\omega) = \varepsilon'(\omega) - i\varepsilon''(\omega) = \varepsilon_\infty + (\varepsilon - \varepsilon_\infty)\, F(\nu) \tag{14}$$

On the frequency scale, the quantity ε∞ is the permittivity corresponding to a nonpolar liquid (zero dipole moment) between zero frequency and the infrared (IR) range (before the onset of intramolecular vibrations) and to a polar liquid at IR frequencies only where Pμ equals zero. The fluctuation of the induced polarization, Pα, is very rapid and is affected only by resonant (quantum mechanical) transitions in the IR and ultraviolet (UV)/visible regions.

Processes linked to structural rearrangements take place on the nanosecond to subpicosecond time scale for most liquids at room temperature, corresponding to frequencies in the megahertz to terahertz region; they contribute only to $F_p^{\mathrm{or}}$ and hence to Pμ. Such processes are dipole reorientation, breaking and reforming of hydrogen bonds, the return of the system to chemical equilibrium (for example, ion pair or complex formation), and reestablishment of electroneutrality in electrolyte solutions (see Fig. 1). Figure 3 shows schematically the typical behavior of the frequency dependence of the real, ε′, and imaginary, ε″, parts of the complex permittivity spectrum of a nonconducting liquid.

The real part ε′(ω), the permittivity spectrum, is a measure of the polarization at frequency ν; the imaginary part ε″(ω), the dielectric loss or absorption curve, characterizes energy dissipation in the system and is the relevant quantity for the assessment of dielectric heating effects. The phase shift between electric field and polarization, δ = arctan[ε″(ω)/ε′(ω)], is called the loss angle; the dissipated energy per unit volume and time is $W = E_0^2\,\omega\,\varepsilon_0\,\varepsilon''(\omega)/2$.

Electrically conducting systems require special consideration. The appropriate combination of Maxwell equations yields the wave equation

$$\operatorname{div}\operatorname{grad}\mathbf{A} + k^2 \mathbf{A} = 0 \tag{15}$$


FIGURE 3 Dielectric permittivity, ε′(ν), and loss, ε″(ν), spectra for a polar liquid with a single Debye relaxation process in the microwave region and two resonant transitions in the IR and UV/visible range; nD is the refractive index in the visible spectral range.

where the vector A is either the electric, E, or the magnetic field strength, H, both propagating perpendicularly to one another in the medium, which is characterized by the complex propagation coefficient k

$$k^2 = k_0^2 \left( \varepsilon(\nu) + \frac{\kappa(\nu)}{i\omega\varepsilon_0} \right) = k_0^2\, \eta(\nu) \tag{16}$$

where k0 is the propagation coefficient of the vacuum. The dissipated energy per unit volume and time is $W = E_0^2\,\omega\,\varepsilon_0\,\eta''(\omega)/2$. Equation (16) clearly shows that the only measurable quantity is the generalized permittivity η(ν), which reduces to ε(ν) for nonconducting samples where j = 0. It is known from experiments that the dispersion of conductance, the so-called Debye–Falkenhagen effect, due to the relaxation of the diffuse ion cloud surrounding each ion, is rather small for electrolyte solutions and may be neglected at high frequencies, that is, κ″(ν) = 0, κ′(ν) = κ. It then follows that

$$\eta'(\nu) = \varepsilon'(\nu) \tag{17}$$

$$\eta''(\nu) = \varepsilon''(\nu) + \frac{\kappa}{\omega\varepsilon_0} \tag{18}$$

from which ε″(ν) can be calculated using the measured quantity η″(ν) when κ is known from conductance measurements under quasi-static conditions. For systems in which interfacial charges are important (dispersions of charged polymers or colloids, ionic micelles), the consideration of the frequency dependence of conductivity is always required.
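The conductivity correction of Eq. (18) is a one-line calculation, sketched below. The function names are hypothetical and the sample values (κ = 1 S/m at 1 GHz, a measured total loss of 30) are purely illustrative.

```python
import math

EPS0 = 8.8541878128e-12  # permittivity of the vacuum in F/m


def conductivity_loss(kappa, nu):
    """Ohmic contribution kappa / (omega * eps0) to the total loss, Eq. (18)."""
    return kappa / (2 * math.pi * nu * EPS0)


def dielectric_loss(eta_loss, kappa, nu):
    """Dielectric loss eps'' recovered from the measured total loss eta''."""
    return eta_loss - conductivity_loss(kappa, nu)


ohmic = conductivity_loss(kappa=1.0, nu=1e9)
print(round(ohmic, 2))
print(round(dielectric_loss(eta_loss=30.0, kappa=1.0, nu=1e9), 2))
```

Because the Ohmic term scales as 1/ν, it dominates η″ at low frequencies, which is why even a small error in κ strongly affects the low-frequency part of the corrected loss spectrum.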

II. ANALYSIS OF COMPLEX PERMITTIVITY SPECTRA OF LIQUIDS

The complex permittivity spectrum of ethanol, Fig. 4, is typical for the dielectric response of liquids in the microwave region. The aim of DRS is the interpretation of the observed relaxation behavior on the molecular scale. A prerequisite for such an endeavor is the fitting of ε(ν) by an appropriate model to extract the independent relaxation modes contributing to dielectric permittivity. Generally, this analysis is performed in the frequency domain. Due to the often limited frequency range and the large bandwidth associated with each process, this is a nontrivial task. Although a satisfactory fitting of the spectra, that is, with a superposition of dispersion steps, is always possible, the

FIGURE 4 Dielectric permittivity, ε′(ν), and loss, ε″(ν), spectra of ethanol in the temperature range −25 ≤ ϑ/°C ≤ 55.


obtained result is not necessarily correlated with physical processes. If possible, only extended series of measurements, for example, as a function of temperature or electrolyte concentration, should be considered to check the self-consistency of the applied model. To minimize the impact of systematic errors, which affect ε′ and ε″ differently, permittivity and loss spectra should be fitted simultaneously.

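The simplest case of response function, generally found for the dipole rotation of small symmetric molecules like acetonitrile, is a first-order exponential decay of orientational polarization with response function

$$F_p^{\mathrm{or}} = \exp(-t/\tau) \tag{19}$$

governed by the relaxation time τ. In the frequency domain this corresponds to the Debye equation

$$\varepsilon(\nu) = \frac{\varepsilon - \varepsilon_\infty}{1 + i2\pi\nu\tau} + \varepsilon_\infty \tag{20}$$

with the dispersion of permittivity

$$\varepsilon'(\omega) = \frac{\varepsilon - \varepsilon_\infty}{1 + \omega^2\tau^2} + \varepsilon_\infty \tag{21}$$

between $\varepsilon = \lim_{\nu \to 0} \varepsilon'$ and $\varepsilon_\infty = \lim_{\nu \to \infty} \varepsilon'$ (a centrosymmetric curve with respect to the critical frequency νc = 1/(2πτ)), and a Lorentzian bandshape of the dielectric loss

$$\varepsilon''(\omega) = \frac{(\varepsilon - \varepsilon_\infty)\,\omega\tau}{1 + \omega^2\tau^2} \tag{22}$$

which peaks at νc.

Generally, however, even simple molecular liquids exhibit a more complex relaxation behavior and require more complex bandshape models and/or superposition of n individual relaxation processes j of amplitude (relaxation strength) $S_j = \varepsilon_j - \varepsilon_{j\infty}$

$$\varepsilon(\nu) = \sum_{j=1}^{n} S_j F_j(\nu) + \varepsilon_\infty \tag{23}$$

where

$$\varepsilon = \varepsilon_1 = \sum_{j=1}^{n} S_j + \varepsilon_\infty; \qquad \varepsilon_{j\infty} = \varepsilon_{j+1}; \qquad \varepsilon_{n\infty} = \varepsilon_\infty \tag{24}$$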
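The single Debye process of Eqs. (20)–(22) is easily evaluated numerically. The sketch below uses the convention ε = ε′ − iε″; the function name and the parameter values (a static permittivity of 78.4, an assumed ε∞ of 5.0, and τ = 8.3 ps) are illustrative inputs chosen for this example, not fitted results.

```python
import math


def debye(nu, eps_static, eps_inf, tau):
    """Complex permittivity eps' - i*eps'' of a single Debye process, Eq. (20)."""
    return eps_inf + (eps_static - eps_inf) / (1 + 2j * math.pi * nu * tau)


eps_static, eps_inf, tau = 78.4, 5.0, 8.3e-12  # illustrative values
nu_c = 1.0 / (2 * math.pi * tau)               # critical frequency

eps_c = debye(nu_c, eps_static, eps_inf, tau)
eps_real = eps_c.real        # eps'(nu_c)
eps_loss = -eps_c.imag       # eps''(nu_c), positive in this convention
loss_angle = math.atan(eps_loss / eps_real)

print(f"nu_c = {nu_c / 1e9:.1f} GHz")
print(f"eps' = {eps_real:.2f}, eps'' = {eps_loss:.2f}, delta = {loss_angle:.3f} rad")
```

At νc the permittivity has fallen halfway between the static and high-frequency limits, ε′ = (ε + ε∞)/2, and the loss reaches its maximum, ε″ = (ε − ε∞)/2, which gives a quick consistency check on any fitted Debye process.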

The relaxation functions Fj of the individual dispersion steps may generally be represented by modifications of the Havriliak–Negami equation

$$F_j(\nu) = \left[ 1 + (i2\pi\nu\tau_j)^{1-\alpha_j} \right]^{-\beta_j} \tag{25}$$

each one with relaxation time τj and relaxation time distribution parameters, 0 ≤ αj < 1 and 0 < βj ≤ 1. Special cases of Eq. (25) are the asymmetric Davidson–Cole relaxation time distribution, αj = 0, and the symmetrically broadened Cole–Cole distribution, βj = 1. The limiting case of αj = 0 and βj = 1 is the Debye equation. Note that for κ ≠ 0 only an analysis based on Eq. (23) permits the determination of the static permittivity ε, a quantity important for many thermodynamic properties of electrolyte solutions.

FIGURE 5 Propylene carbonate–1,2-dimethoxyethane (PC-DME) mixture (mole fraction xPC = 0.2; 25°C) as an example for two superposed Debye processes with characteristic parameters ε = ε1, ε1∞ = ε2, ε2∞ = ε∞, τ1 and τ2. The contribution of PC is characterized by the dispersion amplitude S1 = ε1 − ε1∞ = 9.74 and relaxation time τ1 = 22.0 ps; that of DME by S2 = ε2 − ε∞ = 5.79 and τ2 = 4.7 ps.
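Equations (23) and (25) translate directly into a short routine. The sketch below is a minimal illustration with hypothetical function names; the amplitudes and relaxation times are those quoted in the caption of Fig. 5, whereas the value of ε∞ is not given in the text and is set to an arbitrary placeholder.

```python
import math


def havriliak_negami(nu, tau, alpha=0.0, beta=1.0):
    """Relaxation function F_j(nu) of Eq. (25).

    alpha = 0, beta = 1 gives Debye; alpha = 0 gives Davidson-Cole;
    beta = 1 gives Cole-Cole.
    """
    return (1 + (2j * math.pi * nu * tau) ** (1 - alpha)) ** (-beta)


def permittivity(nu, processes, eps_inf):
    """Superposition of dispersion steps, Eq. (23).

    processes is a list of (S_j, tau_j, alpha_j, beta_j) tuples.
    """
    return eps_inf + sum(
        s * havriliak_negami(nu, tau, alpha, beta)
        for s, tau, alpha, beta in processes
    )


# Two Debye processes with the parameters quoted for the PC-DME mixture.
pc_dme = [(9.74, 22.0e-12, 0.0, 1.0), (5.79, 4.7e-12, 0.0, 1.0)]
EPS_INF = 2.0  # placeholder only; the text does not state this value

for nu in (1e8, 1e9, 1e10, 1e11):
    eps = permittivity(nu, pc_dme, EPS_INF)
    print(f"{nu:.0e} Hz: eps' = {eps.real:.2f}, eps'' = {-eps.imag:.2f}")
```

In the low-frequency limit the real part approaches S1 + S2 + ε∞, as required by Eq. (24). Passing nonzero α or β < 1 for a process switches it to the Cole–Cole or Davidson–Cole form without changing the rest of the routine.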

Examples of spectra with a superposition of dispersion steps where the individual contributions to ε are indicated are given in Figs. 5–8. The significance of the quantities εj, εj∞, and Sj becomes obvious in a plot of ε″(ν) versus ε′(ν), the so-called Argand diagram or Cole–Cole plot, Fig. 5. The example is typical for a mixture of dipolar aprotic liquids without specific intermolecular interactions. For such systems, dominated by dipole–dipole interactions and packing requirements, the contributions of the individual components are detectable over the entire mixture range, with their dynamics reflecting the smooth change from only A:A interactions in the pure liquid A to A:B interactions in dilute solutions of A in B.
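For a single Debye process the shape of the Argand diagram follows directly from Eqs. (21) and (22). As a short worked step, write Δ = ε − ε∞ and x = ωτ and eliminate x:

```latex
\varepsilon' - \varepsilon_\infty - \frac{\Delta}{2}
  = \frac{\Delta\,(1 - x^2)}{2\,(1 + x^2)},
\qquad
\varepsilon'' = \frac{\Delta\,x}{1 + x^2}
\quad\Longrightarrow\quad
\left(\varepsilon' - \frac{\varepsilon + \varepsilon_\infty}{2}\right)^{2}
  + \left(\varepsilon''\right)^{2}
  = \left(\frac{\varepsilon - \varepsilon_\infty}{2}\right)^{2}.
```

The plot is therefore a semicircle of diameter ε − ε∞ centered on the ε′ axis, with its apex at the critical frequency. When several Debye steps are superposed, each contributes an arc whose span along the ε′ axis is approximately its amplitude Sj, provided the relaxation times are sufficiently separated; this is a qualitative guide rather than an exact statement for overlapping processes.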

A disadvantage of the Argand diagram is the lack of a frequency coordinate, so that assignment to the dynamics of the system is not immediately obvious. Such information is brought out especially by the loss spectra, ε′′ = f(ν), since the relaxation times are directly related to the peak frequencies of the individual contributions. Figure 6 shows the complex permittivity spectrum of an associating electrolyte in water. The two observed dispersion steps, which are of Debye type and well separated on the frequency scale, can be attributed to the cooperative relaxation of water's hydrogen-bond network, τH2O ≈ 8.3 ps, and to the tumbling motion of long-lived La[Fe(CN)6] ion pairs formed by this electrolyte, 290 ≤ τIP/ps ≤ 630. Note that pure water, Fig. 7, and most aqueous electrolytes show an additional fast process, τH2O,2 ≈ 1 ps, which is not resolved for the La[Fe(CN)6] solutions. The spectrum of ethanol, Fig. 4, is a superposition of three dispersion


FIGURE 6 Dielectric permittivity, ε′(ν), and loss, ε′′(ν), of 0.0345 mol dm−3 La[Fe(CN)6] in water at 25 °C. Experimental data from FD (•) and TDR measurements are fitted to a superposition of two Debye processes (solid lines) attributed to the ion pair (IP) and to the solvent (H2O). Also indicated is the total loss, η′′(ν), of the solution.

steps with relaxation times on the order of τ1 = 164 ps, τ2 = 7.9 ps, and τ3 = 1.5 ps at room temperature.
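For a Debye process the loss maximum lies at ν = 1/(2πτ), which is how peak positions in a loss spectrum translate into relaxation times. A minimal sketch using the relaxation times quoted in the text follows; the function name and dictionary labels are hypothetical.

```python
import math

def loss_peak_frequency(tau):
    """Frequency (Hz) of the loss maximum of a Debye process, 1/(2*pi*tau)."""
    return 1.0 / (2.0 * math.pi * tau)

# Relaxation times quoted in the text, in seconds.
TAUS = {
    "ethanol, tau1": 164e-12,
    "ethanol, tau2": 7.9e-12,
    "ethanol, tau3": 1.5e-12,
    "water, network relaxation": 8.3e-12,
}

for label, tau in TAUS.items():
    print(f"{label}: {loss_peak_frequency(tau) / 1e9:.2f} GHz")
```

The slow ethanol process peaks just below 1 GHz and the fastest one near 100 GHz, so a single instrument rarely covers all three dispersion steps.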

Figure 8 shows the spectrum of the monohydrate of trifluoromethanesulfonic acid, which is of potential interest as a solvent in fuel cells. Its spectrum is rather featureless, and the extraction of four Debye processes pushes the band-fitting procedure with a model based on Eq. (23) to its numerical limits. For such spectra, a physically meaningful deconvolution into molecular-level processes is only possible with information from other methods. Spectra of similar shape are usually observed for liquids of flexible molecules with high molecular weight, like polymer melts. Here a broad distribution of relaxation times must be assumed, which significantly complicates a detailed analysis.

FIGURE 7 Dielectric permittivity, ε′(ν) (•), and loss, ε′′(ν) (○), of water at 5 °C. Experimental data are fitted to a superposition of two Debye processes (solid lines) attributed to the cooperative relaxation of the hydrogen-bond network (1) and to the rotation of mobile H2O molecules (2).

FIGURE 8 Dielectric permittivity, ε′(ν), and loss, ε′′(ν), of the monohydrate of trifluoromethanesulfonic acid at 40 °C. Experimental data from FD (•) and TDR measurements are fitted to a superposition of four Debye processes (solid lines).

III. EXPERIMENTAL METHODS

For electrolyte solutions of common solvents around room temperature the time scale of dielectric relaxation processes is on the order of 0.1 ps to 10 ns, meaning that ideal experiments should span from megahertz to terahertz (far-infrared, FIR) frequencies to obtain the full information on the dynamics of the investigated system. The relevant equation for the construction of measurement devices is Eq. (15). However, over the broad frequency range to be covered, the ratio of the characteristic dimension, l, of the measurement cell to the wavelength, λ, of the applied electromagnetic radiation changes considerably. At low frequencies, l/λ ≪ 1, broadband coaxial transmission lines can be applied. Broadband experiments are again possible with free-space methods from the FIR region upward, where l/λ ≫ 1, but for the intermediate microwave range, where l/λ ≈ 1, narrow-band waveguide equipment is necessary, which makes experiments cumbersome and expensive.
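The three regimes can be made concrete by evaluating l/λ for a given cell. The sketch below uses the vacuum wavelength and two cut-off values (`low`, `high`) that are purely illustrative assumptions; the text specifies only l/λ ≪ 1, ≈ 1, and ≫ 1. Function names are hypothetical.

```python
C0 = 299_792_458.0  # speed of light in vacuum, m/s

def size_to_wavelength_ratio(nu, cell_size):
    """l/lambda for frequency nu (Hz) and cell dimension l (m); vacuum wavelength."""
    return cell_size * nu / C0

def suggested_technique(nu, cell_size, low=0.1, high=10.0):
    """Map l/lambda onto the three regimes; low and high are assumed cut-offs."""
    ratio = size_to_wavelength_ratio(nu, cell_size)
    if ratio < low:
        return "coaxial transmission line"
    if ratio > high:
        return "free-space method"
    return "waveguide"
```

For a cell dimension of 5 mm, `suggested_technique(100e6, 5e-3)` falls in the coaxial regime, 30 GHz in the waveguide regime, and 3 THz in the free-space regime.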

Below 50 MHz, impedance bridges with the sample enclosed between the electrodes of a parallel plate condenser are used for the determination of resistance and capacitance of the sample, to yield the complex permittivity η. Limitations of the method result from electrode polarization and from stray fields at high frequencies. Low-loss samples permit the use of heterodyne beat measurements linking the capacitance of the measuring cell to the frequency shift of a megahertz oscillator.
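How resistance and capacitance translate into a complex permittivity can be sketched under simplifying assumptions that the text does not state: the filled cell is treated as an ideal parallel RC circuit, the empty-cell capacitance `c_empty` is known, and electrode polarization and stray fields are neglected. With η = η′ − iη′′ the cell admittance is iωC0η, which gives the two relations used below. The function name is hypothetical.

```python
import math

def permittivity_from_bridge(nu, resistance, capacitance, c_empty):
    """Return (eta', eta'') from a parallel R-C reading at frequency nu (Hz).

    Assumes an ideal parallel RC equivalent circuit of the filled cell and a
    known empty-cell capacitance c_empty (F).
    """
    omega = 2.0 * math.pi * nu
    eta_real = capacitance / c_empty                   # eta' = C / C0
    eta_loss = 1.0 / (omega * resistance * c_empty)    # eta'' = 1 / (omega R C0)
    return eta_real, eta_loss
```

As an illustration with made-up readings, a 10 pF cell showing 785 pF and 1 kΩ at 1 MHz corresponds to η′ = 78.5 and η′′ of about 15.9.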

Such resonance circuits are also applied above 50 MHz. However, owing to recent progress in electronic instrumentation, coaxial transmission lines probing the amplitude attenuation and phase shift of a transmitted or (more conveniently) reflected wave have become far more attractive. Signal generation and detection can now be performed in the FD with the help of accurate vector network analyzers


FIGURE 9 Schematic diagram of a coaxial-line time domain reflectometer (TDR). SO: digital sampling scope (20 GHz bandwidth); SH1, SH2: sampling heads; Z: matched pairs of reflection cells; T: precision thermostat; R: personal computer with access to a work station for data analysis.

(VNA). Alternatively, in a TD experiment, a fast-rising voltage pulse can be applied to the sample-filled cell, and η(ν) is obtained from the Fourier transform of the reflected [time domain reflectometer (TDR)] or transmitted pulse. A typical TDR instrument is shown in Fig. 9. Stringent geometric requirements for cell construction currently limit the maximum frequency for coaxial-line techniques to 20 GHz with VNA and to somewhat less with TDR.

Above about 10 GHz the wavelength of the electromagnetic wave is comparable to the dimensions of the measuring equipment, making the use of waveguides unavoidable. Resonator techniques as well as methods based on the transmission or reflection of propagating waves are in use. For the investigation of lossy liquids, transmission line techniques yield superior results, but resonators are more easily adapted to measurements at high temperatures

FIGURE 10 Waveguide apparatus for the determination of ε(ν) in the E-band (60–90 GHz) range with transmission measurements. 1–9: waveguide interferometer with cell C and movable probe P; PLO, PLO-D, PLO-P: microwave signal source and control unit; 8, MMC, S, RE: signal detection unit; HH, MT, SMD, SM, PM, SP: probe position control unit; PD: interface enabling the control of four interferometers (E-, A-, Ku-, X-band) in the frequency range 8.5 to 90 GHz; MC: personal computer.

and pressures. The drawback with waveguides is their limited frequency band. Several setups are needed to bridge the gap between coaxial-line techniques and free-space methods. For instance, in the authors' laboratory four interferometers of the type sketched in Fig. 10 are used to cover the range 8.5 ≤ ν/GHz ≤ 89.

Above 100 GHz, the small wavelengths (≤3 mm in vacuum) permit free-space propagation with optical lenses and mirrors, as commonly used in conventional IR spectroscopy. The technique of Fourier transform spectrometers based on an asymmetric Michelson interferometer is rather mature. In this arrangement the sample is placed in one of the active arms of the interferometer, which allows the simultaneous determination of the refractive index and the absorption coefficient, both necessary for the calculation of ε. However, measurements in the FIR region, which are of high potential interest as a source of information on the short-time dynamics (0.1–1 ps) of the sample, have so far been rather limited owing to the low intensities of conventional FIR sources. This situation is now changing with the rapid development of femtosecond-laser-pumped terahertz emitters and detectors.
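The step from refractive index and absorption coefficient to the permittivity follows from ε = (n − ik)², so that ε′ = n² − k² and ε′′ = 2nk. The sketch below assumes that the absorption coefficient is the power (intensity) absorption coefficient α in m−1, so that the extinction coefficient is k = αc/(4πν); the function name is hypothetical.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def permittivity_from_optical_constants(n, alpha, nu):
    """Return (eps', eps'') from refractive index n and absorption coefficient.

    alpha is taken to be the power absorption coefficient in 1/m (an
    assumption); nu is the frequency in Hz.
    """
    k = alpha * C0 / (4.0 * math.pi * nu)  # extinction coefficient
    return n * n - k * k, 2.0 * n * k
```

Both quantities are needed: the refractive index alone fixes neither ε′ nor ε′′ once absorption is appreciable.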

IV. MOLECULAR INTERPRETATION OF RELAXATION MODES

The step response function $F_P^{\mathrm{or}}$ underlying the interpretation of the complex permittivity spectrum is a macroscopic property related to the fluctuations of the macroscopic dipole moment $M = \sum_i \mu_i(t)$. The relation to molecular dynamics is made obvious by the rearrangement of its correlation function as


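$$ \langle M(t) \cdot M(0) \rangle = \underbrace{\sum_i \langle \mu_i(t) \cdot \mu_i(0) \rangle}_{\text{self}} + \underbrace{\Big\langle \sum_i \mu_i(t) \cdot \sum_{j \neq i} \mu_j(0) \Big\rangle}_{\text{distinct}} \tag{26} $$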

The self term gives the orientational correlation function, that is, the dynamics of individual molecules in a continuous “bath” of fluctuating frictional forces. It can be compared with results from NMR relaxation or IR and Raman bandshapes. The distinct term explicitly expresses the coupling of the motion of molecule i with the structure and dynamics of surrounding molecules, that is, it probes cooperative effects. Such terms also contribute to Rayleigh scattering, Kerr-effect relaxation, and quasi-elastic neutron scattering data. In combination with models and computer simulations this allows the deduction of intermolecular interaction potentials.
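The split of Eq. (26) into self and distinct parts is an exact identity for every pair of configurations, before any ensemble averaging. The sketch below verifies this for arbitrary dipole vectors; it omits the average ⟨…⟩, and all function names are hypothetical.

```python
import random

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def total_moment(dipoles):
    """Macroscopic dipole moment M as the vector sum of molecular dipoles."""
    return [sum(mu[k] for mu in dipoles) for k in range(3)]

def self_and_distinct(dipoles_t, dipoles_0):
    """Self (i = j) and distinct (i != j) parts of M(t).M(0) for one configuration pair."""
    n = len(dipoles_0)
    self_part = sum(dot(dipoles_t[i], dipoles_0[i]) for i in range(n))
    distinct_part = sum(dot(dipoles_t[i], dipoles_0[j])
                        for i in range(n) for j in range(n) if i != j)
    return self_part, distinct_part

# Arbitrary example configurations (random vectors, not simulation data).
random.seed(0)
mu_0 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(5)]
mu_t = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(5)]
s, d = self_and_distinct(mu_t, mu_0)
```

In a simulation the two parts would be averaged separately over time origins, which is what allows the single-molecule and cooperative contributions to be compared.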

In liquids lacking specific long-range interactions, like hydrogen bonds, the reorientation of the molecular dipole moment can generally be described as rotational diffusion. The molecular relaxation time deduced from 〈µi(t) · µi(0)〉 is proportional to molecular size and viscosity. For simple dipolar fluids of highly symmetric rigid molecules, like acetonitrile, the long-time behavior in the microwave region is reproduced by an exponential correlation function, and the influence of the distinct term can be expressed by a static correlation factor of orientational correlations, the well-known Kirkwood factor g, and a dynamic correlation factor ġ. The short-time dynamics available from FIR spectra, however, deviates because of the predominance of libration and inertial effects. Reduction of molecular symmetry, or possible intramolecular rotations of polar groups, leads to the emergence of additional Debye-type relaxation processes. Examples are the observed second relaxation process when going from Cn symmetry (acetonitrile) to C2v symmetry (benzonitrile) or the high-frequency relaxation process of butylene carbonate due to fast rotation of the ethyl side chain. As can be seen in Fig. 11 for N,N-dimethylformamide and N,N-dimethylacetamide, for this class of dipolar aprotic solvents the influence of dissolved electrolytes on the static permittivity is moderate (strictly, on the relaxation strength, but usually ε∞(c) ≈ ε∞(0)). Additionally, the relaxation time parallels the observed increase of solution viscosity with electrolyte concentration.
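The proportionality of the rotational-diffusion relaxation time to molecular size and viscosity can be illustrated with the Stokes–Einstein–Debye expression for a sphere with stick boundary conditions, τ = 3Vη/(kBT). This particular form is an assumption made for the sketch; the text states only the proportionality. The radius and viscosity in the usage note are illustrative values, and the function name is hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def debye_rotation_time(radius, viscosity, temperature):
    """Stokes-Einstein-Debye estimate tau = 3 V eta / (k_B T) for a sphere.

    radius in m, viscosity in Pa s, temperature in K. Stick boundary
    conditions and a spherical molecule are assumed.
    """
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    return 3.0 * volume * viscosity / (K_B * temperature)
```

For a sphere of radius 0.2 nm in a liquid of viscosity 1 mPa s at 298.15 K, `debye_rotation_time(2e-10, 1e-3, 298.15)` is of the order of tens of picoseconds; doubling the viscosity doubles the estimate.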

The distinct term attains a predominant role in hydrogen-bonding liquids. Here, the intermolecular dynamics dominates the relaxation behavior with a slow cooperative process of large amplitude (relaxation strength), and the topology of the hydrogen-bond system is reflected in the number of observed relaxation processes. This can be nicely exemplified with the series N,N-dimethylformamide (no H bonds), N-methylformamide (chains), and formamide (network). When winding chains are formed, as with alcohols or N-methylformamide, three relaxation processes are found, τ1 > τ2 > τ3. The magnitude of the cooperative relaxation time, τ1 = 164 ps for ethanol at 25 °C, exceeds the relaxation time expected for molecular rotation (which is roughly equal to the observed τ2 = 7.92 ps) at least by a factor of five. τ1 and the amplitude S1 are very sensitive to added solutes, especially electrolytes, as can be seen in Fig. 11. This process essentially probes the interchain dynamics. τ2 probes the reorientation of monomers and that of molecules with a single H bond, and the fast process with relaxation time τ3 = 1.54 ps is indicative of partial reorientation within the H-bonded chains. For molecules able to form hydrogen-bond networks, like water and formamide, only two relaxation processes are found (polyols show additional intramolecular dynamics). Compared to the value expected for rotational diffusion from molecular size and viscosity, the relaxation time τ1 of the dominating slow process (S1/(ε − ε∞) > 0.95) is increased by a factor of 3–5. For water (Fig. 7), data suggest that τ1 (8.33 ps at 25 °C) is a measure of the rate at which H2O molecules are released from the three-dimensional network before they can rapidly reorient with time constant τ2 ≈ 1 ps. Experiments with dissolved electrolytes and nonelectrolytes reveal that up to intermediate concentrations, c < 1 mol dm−3, such hydrogen-bond networks can accommodate solutes without much disturbance of the bulk solvent structure and dynamics (see Fig. 11).

FIGURE 11 Solvent permittivity εs as a function of the concentration of LiClO4, NaClO4, and Bu4NClO4 (•) in N,N-dimethylformamide (1), N,N-dimethylacetamide (2), N-methylformamide (3), and formamide (4) at 25 °C.

It is well known that especially liquids formed by large,asymmetric and flexible molecules tend to supercool and


to form a glass instead of crystallizing. It is thought that glass formation is closely connected with the crossover from molecular reorientation, which is dominated by the self term at high temperatures, to cooperative relaxation close to the glass transition temperature, Tg. Currently such processes are a ‘hot topic’ in liquid state research. Investigations were mainly motivated by the emergence of the so-called mode coupling theory, which predicts a universal scaling law for all dynamical variables in the system, thus linking viscosity, mechanical relaxation, and dielectric relaxation.

The influence of dissolved nonelectrolytes in polar solvents is mainly controlled by volume effects. The dissolution of electrolytes, however, yields a decrease of the static solvent permittivity that significantly surpasses the volume requirements of the dissolved ions (Fig. 11). Two additional effects occur in electrolyte solutions: ion solvation and kinetic depolarization. Kinetic depolarization, a dynamic feature, arises from the interaction of the solvent molecules and the solvated ions moving in the external field. The moving ion exerts a torque on the surrounding solvent dipoles opposite to that of the external field, leading to a reduction of the orientational polarization of the solvent molecules and of the mobility of the ions. The continuum theory of Hubbard and Onsager approximately accounts for this contribution. Ion solvation, a static effect, is caused by the high field strength at an ion surface aligning the solvent molecule dipoles in its vicinity so that they cannot contribute to the solvent relaxation. This permits one to deduce effective solvation numbers, ZIB, of ions from the solvent permittivity. As expected, the higher the charge density at the ionic surface, the greater the solvation effect (Li+ > Na+ > K+ > Rb+ ≈ Cs+). It should be noted that the ZIB determined in this way may differ from first-shell coordination numbers deduced from scattering experiments or computer simulations, as the ability of the ions to align solvent dipoles may be rather small (ClO4−) or extend beyond the first solvation shell (Mg2+).

Another effect characteristic of electrolyte solutions is ion association (Fig. 12), which may lead from encounter complexes (2SIP), where both ions keep their primary solvation shell, via subsequent desolvation steps to solvent-shared (SSIP) and contact ion pairs (CIP), depending on the relative balance of ion–ion and ion–solvent interactions. Such speciation processes are important for many electrolyte systems of biological, geochemical, and technological interest. For chemical reactions involving ions, generally the CIP state has to be reached before the product can be formed. Despite the importance of ion association, many systems are so far only poorly characterized because thermodynamic methods yield only the overall equilibrium constant, KA, whereas spectroscopic

FIGURE 12 Scenario of possible ion association equilibria in solution involving free ions, doubly solvent-separated (2SIP), solvent-shared (SSIP), and contact ion pairs (CIP), as well as possible higher aggregates. The Kij are equilibrium constants and the kij rate constants relating different steps i and j.

techniques are generally only sensitive to CIP. As 2SIP, SSIP, and CIP have permanent dipole moments, DRS is sensitive to all ion pair types, provided the lifetime of the species is at least comparable to their rotational correlation time. An example of ion pair relaxation can be seen in Fig. 6. In favorable cases dielectric studies, possibly combined with other techniques, not only permit the determination of the concentrations of the formed species but also allow one to estimate the rate constants of ion pair formation and decay from the concentration dependence of the ion pair relaxation time(s). Dielectric investigations of the dynamics of micelles fall into the same category.
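Why an overall constant KA hides the speciation can be seen from a simplified stepwise scheme. The sketch below assumes a linear sequence free ions ⇌ 2SIP ⇌ SSIP ⇌ CIP with stepwise constants labeled K1, K2, and K3 (labels chosen for this illustration, not the Kij of Fig. 12), ideal behavior (activity coefficients of unity), and no higher aggregates. The function names are hypothetical.

```python
def overall_association_constant(k1, k2, k3):
    """K_A = ([2SIP] + [SSIP] + [CIP]) / ([cation][anion]) for the linear scheme."""
    return k1 * (1.0 + k2 * (1.0 + k3))

def ion_pair_fractions(k2, k3):
    """Fractions of the total ion-pair population present as each pair type."""
    total = 1.0 + k2 + k2 * k3
    return {"2SIP": 1.0 / total, "SSIP": k2 / total, "CIP": k2 * k3 / total}
```

Many different combinations of stepwise constants give the same `overall_association_constant`, while `ion_pair_fractions` differs between them, which is the information a method sensitive to all pair types can add.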

In biological systems, both hydrogen bonding and ion–dipole interactions are of crucial importance. Dielectric relaxation studies can be carried out to study the interaction of water with biomolecules and its modulation by electrolytes. For instance, information on the flexibility of proteins or nucleic acids is available. Differences in solvent mobilities allow the identification of several binding states for water on biomolecules with the help of permittivity studies. DRS also becomes increasingly important in


material analysis and the characterization of pharmaceutical systems.

SEE ALSO THE FOLLOWING ARTICLES

ELECTROLYTE SOLUTIONS, THERMODYNAMICS • ELECTROLYTE SOLUTIONS, TRANSPORT PROPERTIES • HYDROGEN BOND • INFRARED SPECTROSCOPY • LIQUIDS, STRUCTURE AND DYNAMICS • MICROWAVE COMMUNICATIONS



P1: GSS Final Pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN011A-554 July 14, 2001 21:50

Perovskites
C. N. R. Rao
Nehru Center for Advanced Scientific Research

I. Crystal Chemistry
II. Electrical and Magnetic Properties of Perovskite Oxides
III. Magnetic Properties of Perovskite Fluorides
IV. Metal–Insulator Transitions
V. Oxides of K2NiF4 Structure
VI. Ferroics
VII. High-Temperature Superconductors
VIII. Colossal Magnetoresistance
IX. Why Are Perovskites Special?

GLOSSARY

Ferroics Materials possessing two or more orientation states or domains that can be switched from one to another through the application of one or more appropriate forces.

Intergrowths Structures in which unit cells of two related materials occur randomly or recurrently.

Magnetoresistance Phenomenon whereby the resistance of a solid varies by the application of a magnetic field.

Polytypism Phenomenon that may be regarded as polymorphism in one dimension, exhibited by solids with close-packed and layered structures where the primary coordination around an atom is satisfied in more than one way (for example, cubic versus hexagonal close packing).

Superconductivity Phenomenon whereby the electrical resistance of a material vanishes below a critical temperature, accompanied by exclusion of the magnetic field.

Transfer energy Measures the strength of interaction between two orbitals of neighboring atoms and is proportional to the orbital overlap.

PEROVSKITES constitute one of the most fascinating classes of materials, exhibiting diverse properties ranging from ferroelectricity to superconductivity. Oxides and fluorides are the most commonly found materials of perovskite structure with the general composition ABX3. A variety of other structures, especially in complex metal oxides, contain the perovskite unit. The structure and properties of perovskites constitute an excellent case study of the chemistry and physics of materials.


FIGURE 1 (a) The ABO3 perovskite structure. Without the large A atom in the body center position, the structure becomes that of cubic ReO3; (b) layer sequence in the perovskite structure parallel to (001).

I. CRYSTAL CHEMISTRY

Perovskites of the general formula ABX3 may be regarded as derived from the ReO3 structure, as shown in Fig. 1. The BX3 framework in the perovskite is similar to that in the ReO3 structure, consisting of corner-shared BX6 octahedra. The large A cation occupies the body-center, 12-coordinate position. In an ideal cubic perovskite structure, where the atoms are just touching one another, the B–X distance is equal to a/2 and the A–X distance is √2(a/2), where a is the cube unit cell length, and the following relation between the radii of the ions holds:

$$ R_A + R_X = \sqrt{2}\,(R_B + R_X) $$

Goldschmidt found that the perovskite structure is retained in ABX3 compounds even when this relation is not exactly obeyed and defined a tolerance factor, t, as

$$ t = \frac{R_A + R_X}{\sqrt{2}\,(R_B + R_X)} $$

FIGURE 2 (a) Close-packed AO3 layer in perovskites. (b)–(f) BO6 octahedra in different perovskite polytypes: (b) 3C, (c) 2H, (d) 6H, (e) 4H, and (f) 9R.

For the ideal perovskite structure, t is unity. The perovskite structure is, however, also found for lower values of t (∼0.75 < t ≤ 1.0). In such cases, the structure distorts to tetragonal, rhombohedral, or orthorhombic symmetry. This distortion arises from the smaller size of the A ion, which causes a tilting of the BX6 octahedra in order to optimize A–X bonding. Perovskite oxides, ABO3, can be thought of as consisting of alternating BO2 and AO layers stacked one over the other in the [001] direction. An alternative description of the ABO3 structure in terms of close packing of A and O ions is one where close-packed AO3 layers [Fig. 2(a)] are stacked one over the other with the B cations occupying octahedral holes surrounded by oxygen.
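The tolerance factor is simple to evaluate once ionic radii are chosen. The sketch below implements the definition given above and checks that hard spheres that just touch in a cubic cell give t = 1. The function names are hypothetical, and the radii in the usage note are approximate tabulated ionic radii supplied here as assumptions, not values from the text.

```python
import math

def tolerance_factor(r_a, r_b, r_x):
    """Goldschmidt tolerance factor t = (R_A + R_X) / (sqrt(2) (R_B + R_X))."""
    return (r_a + r_x) / (math.sqrt(2.0) * (r_b + r_x))

def ideal_radii(a, r_x):
    """Radii (R_A, R_B) for spheres just touching in a cubic cell of edge a.

    Uses the ideal distances A-X = a/sqrt(2) and B-X = a/2.
    """
    return a / math.sqrt(2.0) - r_x, a / 2.0 - r_x
```

With assumed radii of about 1.44 Å (Sr2+, 12-coordinate), 0.605 Å (Ti4+, 6-coordinate), and 1.40 Å (O2−), `tolerance_factor(1.44, 0.605, 1.40)` lies very close to unity, consistent with the cubic structure of SrTiO3. A smaller A cation lowers t toward the distorted range.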

Several ABO3 oxides, where A is a large cation such as Ba and B is a small cation of the d-transition series, are known to exhibit polytypism. The stacking of an AO3 layer in the structure may be cubic (c) or hexagonal (h) with respect to its two adjacent layers, depending on whether it is in the middle of an ABC or an ABA sequence, respectively. If the stacking is entirely cubic, the B-cation octahedra share only corners in three dimensions to form the perovskite (3C) structure (Fig. 2). If the stacking is all hexagonal, the


FIGURE 3 The K2NiF4 structure of oxides, A2BO4.

B-cation octahedra share opposite faces, forming chains along the c-axis as in BaNiO3 (2H). In between the two extremes, there can be several polytypic structures consisting of mixed cubic and hexagonal stacking of AO3 layers; for example, the 6H and 4H polytypes have the stacking sequences cch cch and chch, respectively. Typical ABO3 oxides showing polytypism are BaCrO3, BaMnO3, and BaRuO3.

Oxides of the general formula A2BO4 crystallize in the K2NiF4 structure, which is closely related to the perovskite structure. The tetragonal structure of K2NiF4 (Fig. 3) can be regarded as consisting of KNiF3 perovskite slabs, one unit cell thick, which are stacked one over the other along the c-direction. The adjacent slabs are displaced relative to one another by (1/2, 1/2, 1/2), such that the c-axis of the tetragonal structure is roughly equal to three times the cell edge of the cubic perovskite. The structure is two-dimensional in the sense that only the equatorial anions of the NiF6 octahedra are linked through corners. Tolerance factors for the K2NiF4 structure can be worked out just as for perovskites, and oxides of this structure often show orthorhombic distortion. Oxides of the K2NiF4 structure (for example, La2CuO4, La2NiO4) have been investigated extensively.

The perovskite structure can tolerate vacancies at the A or X sites, giving rise to nonstoichiometric compositions, A(1−x)BX3 and ABX3−x. B-site vacancies are energetically not favored unless there are compensating factors such as B–B interaction. Typical examples of A-site vacancies are the tungsten bronzes, AxWO3 and Cu0.5TaO3; brownmillerite, CaFeO2.5, is an example of an anion-deficient perovskite. Perovskite-type oxides also show anion excess nonstoichiometry, as in the case of LaMnO3+x, where the apparent anion excess probably arises from La vacancies. Examples of B-site vacancy hexagonal perovskites are Ba3Re2O9 and Ba5Nb4O15.

There are many other interesting oxide systems possessing perovskite units. There is a family of oxides, first described by Aurivillius, of the general formula Bi2An−1BnO3n+3 containing (Bi2O2)2+ layers and (An−1BnO3n+1)2− perovskite layers. Typical members of this family are Bi4Ti3O12 (n = 3) and BaBi4Ti4O15 (n = 4). These oxides form disordered as well as ordered intergrowth structures. There are other intergrowth structures in oxides derived from the perovskite structure. Thus, the AnBnO3n+2 family consists of slabs of An−1BnO3n+2 obtained by cutting the perovskite structure parallel to the (110) planes; a series of oxides with n between 4 and 4.5 is known in the Na–Ca–Nb–O system. The An+1BnO3n+1 family (for example, in Sr–Ti–O and La–Ni–O systems) is generated by cutting the perovskite structure into slabs along the (100) planes.

II. ELECTRICAL AND MAGNETIC PROPERTIES OF PEROVSKITE OXIDES

Perovskite oxides exhibit a variety of electronic properties. Thus, BaTiO3 is ferroelectric, SrRuO3 is ferromagnetic, LaFeO3 is weakly ferromagnetic, BaPb1−xBixO3 is superconducting, and LaCoO3 shows an insulator–metal transition. Properties of known perovskites have been compiled by Goodenough and Longo as well as Nomura. Several perovskite oxides exhibit metallic conductivity, typical examples being ReO3, AxWO3, LaTiO3, AMoO3 (A = Ca, Sr, Ba), SrVO3, and LaNiO3. Metallic conductivity in perovskite oxides is caused by strong cation–anion–cation interaction.

We have listed important perovskite oxides containing B-site transition metal atoms in Fig. 4, where oxides with the same d-electron configuration are grouped together in the columns. Entries in each column are arranged in decreasing order of B cation–anion transfer energy b (B–O covalency) from top to bottom. Covalent mixing parameters λσ and λπ (and hence transfer energies bσ and bπ) increase with the increasing valence state of the B cation; for the same valence state, mixing varies as 5d > 4d > 3d. The influence of A cations on the covalency of the B–O bond is indirect. Acidic A cations decrease B–O covalency; λσ > λπ in all the compounds.

The dotted lines in Fig. 4 representing bπ = bm (bm is the critical value for spontaneous magnetism), bπ = bc,


FIGURE 4 Periodic table of perovskites. [From Goodenough, J. B. (1974). In “Solid State Chemistry” (C. N. R. Rao, ed.), Dekker, New York.]

and bσ = bc (bc is the critical value of the transfer energy) separate oxides exhibiting localized electron behavior from those with collective electron properties. Compounds in column 1 are insulators because the B cations are of d0 electron configuration. Most of the compounds in column 2 (spin S = 1/2) are metallic and Pauli paramagnetic; the line bπ = bm separates LaTiO3 from GdTiO3 because GdTiO3 is a semiconductor with a ferromagnetic Curie temperature (Tc) of 21 K. AMoO3 (A = Ca, Sr, Ba) and SrCrO3 in the third column (S = 1) are metallic and Pauli paramagnetic. Other compounds in this column are semiconducting and antiferromagnetic. The line bπ = bm separates metallic and Pauli paramagnetic SrCrO3 from the antiferromagnetic semimetal CaCrO3. The line bπ = bc separates PbCrO3 from LaVO3 because the latter exhibits a crystallographic transition at a temperature lower than the Néel temperature (TN) characteristic of localized electrons. The region bm > bπ > bc appears to be quite narrow, as revealed by electrical, magnetic, and associated properties. Pressure experiments are valuable in the study of this region; thus, dTN/dP < 0 in CaCrO3, while dTN/dP > 0 in YCrO3 and CaMnO3. Since increasing pressure increases bπ (by decreasing lattice dimensions), dTN/dP > 0 for bπ < bc (localized behavior) and dTN/dP < 0 for bm > bπ > bc (collective behavior). Compounds in columns 4, 5, and 6 are antiferromagnetic insulators. Since the intraatomic exchange ≈ S(S + 1) decreases the covalent mixing, it is natural that maxima in the curves bπ = bc and bσ = bc corresponding to smallest values of bπ and bσ occur in the middle of the columns with S = 5/2. Rare earth orthoferrites with S = 5/2 are antiferromagnetic insulators and exhibit parasitic ferromagnetism. The important contributions here are: (a) Fe3+ spins canted in a common direction either by cooperative buckling of oxygen octahedra or by anisotropic superexchange and (b) canting of an antiferromagnetic rare earth sublattice because of interaction between the two sublattices.

LaCoO3 is shown twice in Fig. 4, both in the S = 2 and S = 0 columns at the end, since Co3+ in this solid can have either the low-spin or the high-spin configuration. The compound exhibits a transition from a localized electron state to a collective electron state (metal–insulator transition). In the ninth column of Fig. 4, perovskites containing d4 cations are placed. Of the three compounds in this column, SrRuO3 is a ferromagnetic metal (Tc = 160 K); CaRuO3 is antiferromagnetic (TN = 110 K) with a weak ferromagnetism. Since both the compounds have the same RuO3 array, the change from ferromagnetic to antiferromagnetic coupling is of significance. SrFeO3 is placed in the same column on the assumption that Fe4+ (3d4) is in the low-spin state, but recent work suggests that Fe4+ in this oxide is in the high-spin state down to 4 K. CaFeO3, on the other hand, shows disproportionation of Fe4+ to Fe3+ and


Fe5+ below 290 K. In the last but one column, containing S = 1/2 B cations, metallic and Pauli paramagnetic LaNiO3 should be separated from antiferromagnetic YNiO3 and LuNiO3, indicating that in LaNiO3 bσ > bm and in YNiO3 bσ < bm. Similarly, in the last column, LaCoO3 should be separated from LaRhO3 because the latter is a narrow-gap semiconductor with a filled t2g(π*) band and an empty eg(σ*) band.

III. MAGNETIC PROPERTIES OF PEROVSKITE FLUORIDES

In oxides, ferrimagnetism is common in spinel, garnet, and magnetoplumbite structures because of the occupation of tetrahedral, octahedral, or dodecahedral sites by magnetic ions. In fluorides, the transition metal cations invariably occupy octahedral sites. Ferrimagnetism therefore results from the manner in which the octahedra are linked. In fluorides of hexagonal BaTiO3 structure, for example, one of the sublattices consists of octahedra linked by corners and the other of octahedra linked by faces; one-third of the metal ions are present in corner-shared octahedra and two-thirds in face-shared octahedra. The two sublattices have different magnetic moments that are coupled antiferromagnetically, leading to ferrimagnetism. CsFeF3 is a typical material possessing this structure; Fe–Fe interaction between face-shared octahedra is ferromagnetic and that between corner-shared octahedra is antiferromagnetic, giving rise to ferrimagnetism with a Tc of 60 K. Isostructural RbMnF3 is, however, antiferromagnetic since all interactions between the neighboring octahedra are antiferromagnetic in this compound.

KCrF3 is an antiferromagnetic solid adopting an antiferro-distortive structure, the magnetic structure being similar to that of LaMnO3 (A type). Superexchange interaction d⁰(x²−y²)–p–d¹(z²) leads to ferromagnetic layers that are antiferromagnetically coupled through the empty dx²−y² orbitals. KCuF3 is a 1D antiferromagnet (A type) with the spins lying in the ab plane. The magnetic behavior is again a consequence of antiferro-distortive ordering of distorted octahedra. Interaction between two half-filled dx²−y² orbitals occurs along the c-axis, while interchain coupling is through filled dz²–half-filled dx²−y² interaction in the ab plane. There are two forms of tetragonal KCuF3, one with I4/mcm symmetry and the other with P4/mbm symmetry; the 1D character is more pronounced in the latter. CsNiF3 is the only fluoride crystallizing in the hexagonal 2H perovskite structure, where infinite chains of face-sharing NiF6 octahedra parallel to the c-axis exist. It exhibits 1D ferromagnetism at high temperatures (70 < T < 300 K) and 3D antiferromagnetism at low temperatures. Neutron diffraction and specific heat measurements show that the 3D transition occurs at 2.65 K. The 3D magnetic structure consists of ferromagnetic planes parallel to the c-axis that are coupled antiferromagnetically.

Among the fluorides with layered structures, the K2NiF4 family has been widely investigated. K2NiF4 is a two-dimensional Heisenberg antiferromagnet with TN = 97 K and J/k = −50 K. The isostructural K2CoF4 behaves as a 2D, S = 1/2, Ising system (TN = 107.8 K, J/k = −97 K). K2CuF4 (and its rubidium and cesium analogs), crystallizing in an orthorhombically distorted K2NiF4 structure, are 2D Heisenberg ferromagnets. The distortion of the structure and the ferromagnetic properties arise from the ordering of the elongation axis of the CuF6 octahedra alternately in the a and b directions. The intralayer exchange constant J/k is ∼11 K and the value for interlayer coupling is ∼0.03 K.

IV. METAL–INSULATOR TRANSITIONS

LaCoO3, which is an insulator at ordinary temperatures, becomes metallic at high temperatures (∼1200 K). More interesting are the transitions from the metallic to the insulating state brought about by compositional changes. Oxides of the type La1−xSrxMO3 (M = V or Co) show metal–insulator transitions with an increase in x. Thus, La1−xSrxCoO3 becomes metallic at all temperatures when x > 0.3, while LaCoO3 (x = 0) is an insulator at room temperature. When M = Mn or Co, the oxide becomes ferromagnetic at the same composition at which the d-electrons become itinerant. Another interesting system showing compositionally controlled metal–insulator transitions is LaNi1−xMxO3, where M = Cr, Mn, Fe, or Co. In this system, the metallic resistivity of the x = 0 oxide gives way to semiconducting or insulating behavior above a particular value of x. Such a changeover occurs essentially at a universal value of resistivity (2000 µΩ cm) in all these oxide systems, the value corresponding closely to Mott’s minimum metallic conductivity. A metal–insulator transition is brought about in the La4−xBa1+xCu5O13+δ system by a change in the La:Ba ratio or oxygen stoichiometry.
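The universal resistivity quoted above can be restated as a conductivity for comparison with Mott's minimum metallic conductivity. The short sketch below only performs that unit conversion; the function name is illustrative and not part of the original article.

```python
def resistivity_to_conductivity(rho_micro_ohm_cm: float) -> float:
    """Convert a resistivity in micro-ohm cm to a conductivity in S/cm."""
    rho_ohm_cm = rho_micro_ohm_cm * 1e-6  # micro-ohm cm -> ohm cm
    return 1.0 / rho_ohm_cm


# Changeover resistivity quoted in the text: 2000 micro-ohm cm
sigma_changeover = resistivity_to_conductivity(2000.0)
print(f"Conductivity at the changeover: {sigma_changeover:.0f} S/cm")
```

A resistivity of 2000 µΩ cm thus corresponds to a conductivity of about 500 S/cm.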

V. OXIDES OF K2NiF4 STRUCTURE

Quasi-two-dimensional oxides of the type A2BO4 possessing the K2NiF4 structure contain the ABO3 perovskite layers in between the rock-salt AO layers, with B–O–B interaction occurring only in the ab plane. Electrical and magnetic properties of the A2BO4 oxides are considerably different from those of the ABO3 perovskites. Accordingly, LaNiO3 is metallic and Pauli paramagnetic, while La2NiO4 exhibits two-dimensional antiferromagnetic ordering around 200 K and a semiconductor–metal transition around 600 K. In the LaO(LaNiO3)n family, the electrical resistivity decreases with an increase in n, the oxide becoming essentially metallic when n = 3. SrRuO3 is a metallic ferromagnet, but Sr2RuO4 is a paramagnetic insulator. A more interesting comparison is provided by LaCuO3 and La2CuO4. The former is a metal and the latter is an antiferromagnetic insulator. Oxygen-excess La2CuO4 and La2−xSrx(Bax)CuO4 are superconductors (Tc ∼ 30–40 K), although in the normal state they are marginally metallic. A strict comparison of the properties of three- and two-dimensional oxides can be made only when the d-electron configuration of the transition metal ion B is the same. A comparative study of the two systems has been made with respect to their electrical and magnetic properties. For example, members of the La1−xSr1+xCoO4 family are all semiconductors with a high activation energy for conduction, unlike La1−xSrxCoO3 (x ≥ 0.3), which is metallic; the latter oxides are ferromagnetic. La0.5Sr1.5CoO4 shows a magnetization of 0.5 µB at 0 K (compared to 1.5 µB for La0.5Sr0.5CoO3), but the high-temperature susceptibilities of the two systems are comparable. In SrO(La0.5Sr0.5MnO3)n, both magnetization and electrical conductivity increase with an increase in n, approaching the value of the perovskite, La0.5Sr0.5MnO3. LaSrMn0.5Ni0.5(Co0.5)O4 shows no evidence of long-range ferromagnetic ordering, unlike the perovskite LaMn0.5Ni0.5(Co0.5)O3; high-temperature susceptibility behavior of these two insulating systems is, however, similar. LaSr1−xBaxNiO4 exhibits high electrical resistivity, with the resistivity increasing proportionately with the magnetic susceptibility (note that LaNiO3 is a Pauli paramagnetic metal). High-temperature susceptibilities of LaSrNiO4 and LaNiO3 are comparable. Susceptibility measurements show no evidence for long-range ordering in LaSrFe1−xNixO4, unlike that in LaFe1−xNixO3 (x ≤ 0.35), and the electrical resistivity of the former is considerably higher.

VI. FERROICS

Ferroics are materials possessing two or more orientation states or domains that can be switched from one to another through the application of one or more appropriate forces. In a ferromagnet, the orientation state of magnetization in domains is switched by the application of a magnetic field. In a ferroelastic, the direction of spontaneous strain in a domain is switched by the application of mechanical stress. In a ferroelectric, spontaneous electric polarization is altered by the application of an electric field. These three ferroics are primary ferroics since they are governed by switchability of the properties. Perovskites provide many examples of ferroics. BaTiO3, Bi4Ti3O12, and KNbO3 are well-known ferroelectrics, while PbZrO3 and NaNbO3 are antiferroelectric. Some of the perovskites exhibit paired properties: ferroelectric-ferroelastic, KNbO3; ferroelectric-antiferromagnetic, HoMnO3; ferroelectric-superconducting, SrTiO3; and antiferroelectric-antiferromagnetic, BiFeO3.

There are several secondary ferroic properties that occur as induced quantities, and the orientation states differ in derivative quantities that characterize the induced effects (for example, induced electric polarization by dielectric susceptibility). Thus, SrTiO3 is a secondary ferroic showing ferrobielectricity.

Materials such as Pb(Mg1/3Nb2/3)O3 are relaxor ferroelectrics. Materials such as Pb(Zr1−xTix)O3 or PZT are electro-optic materials. Aurivillius oxides of the formula Bi2An−1BnO3n+1 are high-Tc ferroelectrics.

VII. HIGH-TEMPERATURE SUPERCONDUCTORS

Superconductivity in perovskite oxides has been known for some time, the highest Tc observed until 1987 being ∼13 K in Ba(Bi, Pb)O3. The oxide system where high Tc was first reported in the 30–40-K region, La2−xBax(Srx)CuO4, has the tetragonal K2NiF4 structure at ordinary temperatures, but becomes orthorhombic around 180 K. Oxygen-excess La2CuO4 also shows superconductivity in the 30-K region. Superconducting oxides of the 123 type (Tc ∼ 90 K) with the general formula LnBa2Cu3O7−δ (Ln = Y, La, Nd, Sm, Eu, Gd, etc.) are defect perovskites containing Cu–O sheets as well as Cu–O chains, the latter imparting the orthorhombic structure. The 123 oxides are the x = 1 members of the Ln3−xBa3+xCu6O14+δ defect perovskites. The bismuth and thallium cuprate superconductors conforming to the formulas Bi2(Ca, Sr)n+1CunO2n+4, Tl2Can+1Ba2CunO2n+4, and TlCan−1Ba2CunO2n+3, with Tc values reaching 125 K, contain defect perovskite layers and rock-salt-type layers.

In Fig. 5, the schematic structures of n = 2 cuprates of different families are shown in order to illustrate how they arise from the intergrowth of perovskite and rock-salt layers. The family of mercury cuprates is also related to these intergrowths. The highest Tc to date is in a Hg cuprate of the formula HgBa2Ca2Cu3O8, which becomes superconducting at 165 K under pressure. It may be noted that the highest Tc found in a copper-free oxide material is ∼30 K in Ba1−xKxBiO3, which is a three-dimensional perovskite.

FIGURE 5 Schematic structure of n = 2 cuprates (ACuO3−x)m(AO)n: (a) LaSrCaCu2O6, (b) TlBa2CaCu2O7, and (c) Tl2Ba2CaCu2O8 or Bi2Sr2CaCu2O8.

VIII. COLOSSAL MAGNETORESISTANCE

Rare earth manganates of the formula Ln1−xAxMnO3 (Ln = rare earth, A = alkaline earth) with the perovskite structure show many interesting properties. Because of the double-exchange mechanism of electron hopping between Mn3+ and Mn4+ ions, these materials exhibit ferromagnetism and an insulator–metal transition at the ferromagnetic transition temperature, Tc. Application of a moderate or high magnetic field (1–6 T) causes a large decrease in the resistivity, particularly around Tc. The negative magnetoresistance can be as high as 100%, and hence the term colossal magnetoresistance. In Fig. 6, typical magnetoresistance data are shown in the case of La0.7Ca0.3MnO3. The Tc in these manganates is extremely sensitive to the average size as well as the size mismatch of the A-site cations. Charge-ordering exhibited by some of the manganate compositions is also sensitive to the average size of the A-site cations. The competing interactions present in the manganates, such as double-exchange and Jahn–Teller distortion (and also charge-ordering), are responsible for the fascinating phenomena and properties of these materials.

FIGURE 6 Electrical resistivity data of La0.7Ca0.3MnO3 in the absence and presence of magnetic field. Temperature variation of magnetoresistance (MR) is also shown.
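As a rough illustration of the quantities discussed in this section, the sketch below computes a percentage magnetoresistance and the average Mn valence for a given doping level x. It assumes the common convention MR = [ρ(H) − ρ(0)]/ρ(0), which bounds a negative MR at 100% in magnitude, and it assumes a stoichiometric oxide with Ln3+, A2+, and O2− ions. The resistivity values are hypothetical and are not read from Fig. 6.

```python
def magnetoresistance_percent(rho_zero_field: float, rho_in_field: float) -> float:
    """Percentage MR under the assumed convention [rho(H) - rho(0)] / rho(0)."""
    return 100.0 * (rho_in_field - rho_zero_field) / rho_zero_field


def mn_average_valence(x: float) -> float:
    """Average Mn valence in Ln(1-x)A(x)MnO3, assuming Ln3+, A2+, and O2-."""
    # Charge balance: 3(1 - x) + 2x + v = 6, so v = 3 + x
    return 3.0 + x


# Hypothetical resistivities near Tc, in ohm cm (illustrative only)
rho_0 = 0.050  # zero field
rho_h = 0.005  # in an applied field

mr = magnetoresistance_percent(rho_0, rho_h)
valence = mn_average_valence(0.3)  # composition La0.7Ca0.3MnO3
print(f"MR = {mr:.1f} %, average Mn valence = {valence:.2f}")
```

Under these assumptions, a fraction x of the Mn ions is formally Mn4+, which is what makes the Mn3+/Mn4+ double-exchange hopping possible.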

IX. WHY ARE PEROVSKITES SPECIAL?

There is hardly any other class of solids that exhibits the variety of fascinating properties shown by the perovskites. Thus, perovskite oxides show high-temperature superconductivity, colossal magnetoresistance, and a variety of ferroic properties. This is because of the unique structure of perovskites wherein the B–O interaction, B–O–B angle, B–B transfer integral, and other important factors can be sensitively varied by changing the A-site or B-site cations. We should remember that there is no B–B interaction in perovskites. It is by and large the B–O–B interaction and the nature of the BO6 octahedra that determine the properties. The sensitivity of the properties of perovskites to the cations in the A- and B-sites illustrates why perovskites are special. Thus, varying the A-site cations can affect the properties through changes in (a) the tolerance factor and lattice distortion, (b) B–O–B angle and related parameters, (c) disorder due to A-site ion size mismatch, and (d) B–O bonding through competitive interaction, involving σ or π bonds. Varying the B-site cations can change the properties because of changes in (a) the tolerance factor, (b) spin and electronic configuration, and (c) disorder and size effects.
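The tolerance factor mentioned above can be evaluated from ionic radii. The sketch below assumes the usual Goldschmidt definition, t = (rA + rO)/[√2 (rB + rO)]; the radii are approximate literature values chosen only for illustration and are not taken from this article.

```python
import math


def tolerance_factor(r_a: float, r_b: float, r_o: float = 1.40) -> float:
    """Goldschmidt tolerance factor for ABO3 (radii in angstroms)."""
    return (r_a + r_o) / (math.sqrt(2.0) * (r_b + r_o))


# Approximate ionic radii in angstroms (assumed illustrative values):
# 12-coordinate A-site cations and 6-coordinate Ti4+.
R_SR = 1.44
R_CA = 1.34
R_TI = 0.605

t_srtio3 = tolerance_factor(R_SR, R_TI)
t_catio3 = tolerance_factor(R_CA, R_TI)
print(f"t(SrTiO3) = {t_srtio3:.3f}, t(CaTiO3) = {t_catio3:.3f}")
```

With these radii, t is close to 1 for SrTiO3 and smaller for CaTiO3, consistent with the idea that a smaller A-site cation lowers t and promotes lattice distortion.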

SEE ALSO THE FOLLOWING ARTICLES

CRYSTALLOGRAPHY • FERROMAGNETISM • MATERIALS CHEMISTRY • SOLID-STATE CHEMISTRY • SUPERCONDUCTIVITY • SUPERCONDUCTORS, HIGH TEMPERATURE

BIBLIOGRAPHY

Goodenough, J. B. (1974). In “Solid State Chemistry” (C. N. R. Rao,ed.). Dekker, New York.

Goodenough, J. B., and Longo, J. M. (1970). “Landbolt-BornsteinTabellen, New Series III/4a,” Springer-Verlag, Berlin and NewYork.

Nomura, S. (1978). “Landbolt-Bornstein Tabellen, New Series III/12a,”Springer-Verlag, Berlin and New York.

Rao, C. N. R., and Gopalakrishnan, J. (1997). “New Directions in SolidState Chemistry,” Cambridge Univ. Press, London and New York,Second Ed.

Rao, C. N. R., and Raveau, B. (1998). “Transition Metal Oxides,” VCH-Wiley, New York, Second Ed.

Rao, C. N. R., and Raveau, B. (1999). “Colossal Magnetoresistance,Charge-Order and Related Properties of Manganese Oxides,” WorldScientific, Singapore.


Encyclopedia of Physical Science and Technology EN013A-634 July 26, 2001 19:56

Radiation Physics
John H. Hubbell
National Institute of Standards and Technology

I. Description of Radiation Physics
II. Radiation Sources
III. Quantifying Radiation
IV. Interaction with Matter
V. Useful Radiation Data

GLOSSARY

Activity The activity, A, of an amount of radioactive nuclide in a particular energy state at a given time is the quotient A = dN/dt, in which dN is the expectation value of the number of spontaneous nuclear transitions from that energy state in the time interval dt. The unit for activity is the becquerel (Bq), in which 1 Bq = 1 s⁻¹. The activity unit in the older literature is the curie (Ci), in which 1 Ci = 3.7 × 10¹⁰ s⁻¹ (exactly) = 37.0 GBq.

Alpha ray (α ray) Radiation in the form of particles equivalent to helium nuclei, ejected from radioisotopes undergoing nuclear transitions according to this mode.

Attenuation coefficient (µ) Sometimes called absorption coefficient. Coefficient in Lambert’s law I = I0 exp(−µt), which describes the attenuation to intensity I of a narrow beam of radiation of incident intensity I0, after penetrating a thickness t of material. Units of µ: inverse of units of t (e.g., cm⁻¹). Data tabulations are usually in terms of the mass attenuation coefficient (µ/ρ), where ρ is the density of the material. Units of µ/ρ: cm² g⁻¹. Primarily applicable to photon radiation (x rays, gamma rays, bremsstrahlung).

Beta ray (β ray) Radiation in the form of electrons ejected from radioisotope nuclei undergoing nuclear transitions according to this mode.

Bremsstrahlung Photons produced when charged particles (e.g., electrons or protons) are slowed by interactions with atoms in passing through matter. Bremsstrahlung (German, braking radiation) is sometimes called white radiation, or continuous spectrum, to distinguish it from fluorescence (x-ray) line spectra characteristic of each element.

Cross section (σ) Effective area of a target particle (e.g., atom, electron, nucleus, etc.) for intercepting a photon or a unit of particle radiation, resulting in absorption or deflection of the incident radiation. Units of σ: cm² or b (barns), where 1 b = 10⁻²⁸ m² = 10⁻²⁴ cm². The total cross section σtot (probability for any interaction with the target particle) is related to the mass attenuation coefficient µ/ρ according to the relation σtot = (µ/ρ) · (Ar/NA), where Ar is the relative atomic mass (atomic weight) and NA is the Avogadro constant.

Dose (D) Used broadly for energy deposited in matter from radiation. Used in dosimetry for the energy absorbed per unit mass of material, usually by ionization processes. Units are the rad and the gray (Gy), which are equivalent, respectively, to 100 ergs/g and 1 J/kg. Therefore, 1 rad = 1/100 gray or 1 cGy.

Exposure (X) Exposure, X, related to the air ionization properties of photon radiation, is the quotient dQ/dm, in which dQ is the amount of charge of the ions of one sign produced when all the electrons (negatrons and positrons) liberated by photons in a volume element of air having mass dm are completely stopped in air. X has the units C/kg. In the older literature, and in the calibrations of many existing instruments, one finds the special unit roentgen (R), in which 1 R = 2.58 × 10⁻⁴ C/kg (exactly).

Fluence Time-integrated flux of particles or photons. Unit: cm⁻².

Flux Number of particles or photons passing through some defined zone per unit time. For parallel beams, this is a unit area; for omnidirectional radiation, the zone chosen is usually a sphere with cross section of 1 cm². In both cases, the unit is cm⁻² s⁻¹.

Gamma-ray (γ-ray) Photon resulting from a transition in an atomic nucleus, either from natural decay of a radioisotope, or from an induced nuclear transition.

Gray Radiation absorbed dose unit of the Système International (SI), of value 1 J kg⁻¹ and equal to 100 rad.

Mean-free-path (mfp) For photons in an attenuating medium, distance over which the primary (unscattered) beam intensity is reduced by a factor 1/e; it is equal to the reciprocal, 1/µ, of the attenuation coefficient µ.

Photon Quantum of electromagnetic radiation. Can be from any region of the electromagnetic spectrum including radio waves and visible light, but in this article referring to quanta in the energy (or wavelength) region of x rays, gamma rays, or bremsstrahlung.

X ray (x-ray when used as an adjective) Specifically referring to characteristic (line-spectra: fluorescence) photons resulting from atomic (extra-nuclear) transitions, but often used more broadly, for photons from any source, over the energy range from tens of eV (electron volts) through the GeV region.
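Several of the glossary relations lend themselves to a short numerical sketch: the curie and rad conversions, Lambert's law, the mean free path 1/µ, and σtot = (µ/ρ)(Ar/NA). The lead inputs below (mass attenuation coefficient near 1 MeV, density, atomic weight) are approximate values assumed for illustration; they are not data from this article, and the function names are hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # Avogadro constant, mol^-1


def curie_to_becquerel(activity_ci: float) -> float:
    """1 Ci = 3.7e10 Bq (exact, by definition)."""
    return activity_ci * 3.7e10


def rad_to_gray(dose_rad: float) -> float:
    """1 rad = 0.01 Gy."""
    return dose_rad / 100.0


def transmitted_fraction(mu_per_cm: float, thickness_cm: float) -> float:
    """Narrow-beam transmission I/I0 = exp(-mu * t) from Lambert's law."""
    return math.exp(-mu_per_cm * thickness_cm)


def mean_free_path_cm(mu_per_cm: float) -> float:
    """Mean free path is the reciprocal of the attenuation coefficient."""
    return 1.0 / mu_per_cm


def total_cross_section_barn(mu_over_rho_cm2_g: float, atomic_weight: float) -> float:
    """sigma_tot = (mu/rho) * (Ar / NA), converted from cm^2 to barns."""
    sigma_cm2 = mu_over_rho_cm2_g * atomic_weight / AVOGADRO
    return sigma_cm2 / 1e-24


# Assumed illustrative inputs for lead and photons near 1 MeV
MU_OVER_RHO_PB = 0.0710  # cm^2/g (approximate)
DENSITY_PB = 11.35       # g/cm^3
AR_PB = 207.2            # relative atomic mass

mu_pb = MU_OVER_RHO_PB * DENSITY_PB  # linear attenuation coefficient, cm^-1
mfp_pb = mean_free_path_cm(mu_pb)
sigma_pb = total_cross_section_barn(MU_OVER_RHO_PB, AR_PB)

print(f"mu = {mu_pb:.3f} cm^-1, mfp = {mfp_pb:.2f} cm, sigma_tot = {sigma_pb:.1f} b")
print(f"Transmission through one mfp: {transmitted_fraction(mu_pb, mfp_pb):.3f}")
```

The last line simply confirms the glossary definition: after one mean free path the unscattered beam is reduced to 1/e of its incident intensity.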

RADIATION PHYSICS ties together a variety of otherwise separate and compartmentalized scientific, medical, and engineering disciplines, all involving aspects of radiation including radiation sources, radiation transport (penetration), radiation detection, and radiation effects. These disciplines include, for example:

1. Atomic and nuclear physics, theory and measurements
2. Medical radiation physics: imaging, therapy
3. Environmental radiation dosimetry
4. Nuclear power engineering, shielding, radiation transport theory, Monte Carlo
5. X-ray crystallography
6. Industrial radiation processing, radiometric gauging, on-stream monitoring and control
7. Fluorescence XRS, XRF materials analysis, radiation archeometry, dating
8. X-ray, γ-ray astronomy, astrophysics, space vehicle dosimetry
9. Radiation damage to electronic circuitry

This cross-disciplinary feature of radiation physics resulted in the formation in 1985 of the International Radiation Physics Society (IRPS), which meets triennially to bring researchers from the above and other radiation-related disciplines together to share their experiences and results, bound together by the common thread of radiation physics. Information on this Society and its triennial International Symposia may be obtained from the IRPS Secretariat, c/o Department of Physics, University of Pittsburgh, Pittsburgh, PA 15260.

I. DESCRIPTION OF RADIATION PHYSICS

The field of radiation physics is distinct from the fields of atomic physics, nuclear physics, particle physics, and other material sciences, all of which focus on the nature of tangible matter, with radiation playing the role of the probe. In radiation physics, the roles are reversed, with the focus on the radiation, and matter playing the role of the probe.

The radiation in question arises from both natural and artificial sources. Natural sources include, for example, space radiation and radium and uranium in rocks, as well as trace radionuclides in the human body and in other organisms. Artificial sources include, for example, concentrations of radioisotopes such as in cobalt-60 plaque irradiators, accelerators such as x-ray and electron-beam machines, and nuclear chain reactions in reactors and weapons.

The technologies of radiation physics are used in academic research, industrial technology, aerospace technology, and medicine. In some machines, materials and devices have to withstand high doses of radiation. In the treatment of tumors, beams of radiation have to be generated and controlled with great accuracy. Thus, while the effects of radiation on living tissue fall into the field of radiobiology, the task of irradiating tissue is often carried out by experts in radiation physics.

The field of radiation effects is an important subtopic of radiation physics. Radiation deposits energy in matter. This energy is then distributed among atoms and molecules in a great variety of ways. The energy excites atoms to higher energy states or moves them about in the material. The scientific disciplines employed in radiation physics center around the interactions of photons or particles with solids, liquids, and gases. The primary transfer of energy is complex but well understood; the energy transferred then dissipates via secondary interactions with the target material. Secondary processes are complex and not well understood except in a few cases, such as the various results of the displacements of atoms in silicon by an electron beam.

The technological aspects of radiation physics are very wide. Instruments for measuring radiation (dosimeters and counters) are widely used. The prevention of degradation in electronic and optical materials is important. Furthermore, in some cases, controlled irradiation can improve commercial products. For example, food preservation and toughening of plastics are sometimes most economically achieved by irradiation with noncontaminating radiation (electrons or γ rays). Infectious materials such as sewage can be sterilized by radiation, and in-package sterilization of medical supplies using radiation is now done routinely.

An equally important subtopic of radiation physics is noninvasive interrogation of systems. In addition to medical imaging by conventional film and by the increasing number of tomographic modalities, hot-rolling of steel now employs noncontact radiometric monitoring of the thickness and control of the rollers, and voids (bubbles) in liquids flowing inside closed pipes can be monitored by radiometric transmission and backscattering arrangements, to name a few examples. The design of equipment for airport x-ray inspection of luggage utilizes radiation physics to produce useful images with minimal damage to camera film and other sensitive items.

FIGURE 1 Energy values for various radiation environments and threshold energies. Approximate thresholds are indicated for producing ionization, for atomic displacements in solids, and for nuclear transmutations. Note: F & F = fission and fusion. (Courtesy of Hughes Aircraft Co.)

II. RADIATION SOURCES

A. Overview

With reference to radiation effects, Fig. 1 shows a survey of the broad range of energy values that have to be considered under the term radiation. Most of the radiations of interest have energies above 1 keV (10³ electron volts), but neutrons with much lower energies are still described as particle radiation. Ultraviolet photons also have sufficient energy to cause the same chemical effects as we see with x rays and very high energy photons. The shaded areas in Fig. 1 show that different fundamental effects in matter occur at different radiation energies.

B. Photons

Photons that have energies in the keV and MeV (megaelectron volt) range (x rays and γ rays) have the ability to penetrate matter deeply and, when absorbed, to produce strong effects. X rays are generated when an electron beam strikes matter; an x-ray generator consists of a powerful electron gun and a metal target in which the photons are generated. For x-ray crystallography applications, the photons will be in the energy range from 5 to 30 keV, including line energies characteristic of the atomic number of the metal in the target, superimposed on a bremsstrahlung continuum spectrum. For imaging and irradiation applications, the energies commonly range up to 3 MeV, mostly in the form of bremsstrahlung. For research applications, electron accelerators such as synchrotrons and linacs (linear accelerators) produce photons up to the GeV (10⁹ electron volts) region, but for imaging and irradiation purposes the energy is usually kept below 5 MeV to avoid producing radioactivities in the sample due to the photonuclear effect, which has a resonance peak in the region 5 to 40 MeV.

γ Rays are high-energy monoenergetic photons. The term is used specifically for photons created during the disintegration of atomic nuclei. A well-known example is the pair of photons created in the spontaneous disintegration of the cobalt-60 atomic nucleus. These photons have energies of 1.1732 and 1.3325 MeV, and cobalt-60 (60Co) is frequently used in plaque and other geometrical configurations in irradiation facilities. γ Rays are also created by nuclear chain reactions such as those that occur in a nuclear reactor core or a nuclear explosion. The isotopes contained in nuclear fuel represent concentrated sources of γ rays. These can be used for experimental irradiation in spent-fuel ponds. Table I presents data on 60Co and some of the other radioisotopes useful in medical therapy and industrial irradiation applications, and also on 90Sr, which can be important as an environmental hazard.

TABLE I Radioisotopes Important in Medical Therapy and Industrial Irradiation Applications, also as Environmental Hazards (90Sr)a

Nuclide | Half-life (year) | Type of decay | Photon energy (MeV) | Photons emitted (%) | Particle energy (MeV) | Transition probability (%)
60Co | 5.27 | β− | 1.173 | 99.86 | 0.318 | 99.9
 | | | 1.333 | 99.98 | 1.491 | 0.1
 | | | (av 1.25) | | |
192Ir | 0.526 (192 d) | β− | 0.296 | 29.6 | |
 | | | 0.308 | 30.7 | |
 | | | 0.316 | 82.7 | 0.530 | 42.6
 | | | 0.468 | 47.0 | 0.670 | 47.2
 | | | 0.604 | 8.2 | |
 | | | 0.612 | 5.3 | |
137Cs | 30 | β− | 0.662 | 85.1 | 0.512 | 94.6
 | | | 0.032−0.038 (137Ba K x rays) | 8 | 1.174 | 5.4
 | | | | | Plus internal conversion electrons, 0.65 MeV |
90Sr | 28 | β− | — | — | 0.54 | 100
+ daughter 90Y | 0.176 | β− | — | — | 2.27 | 100
85Kr | 10.6 | β− | 0.51 | 0.7 | 0.15 | 0.7
 | | β− | — | — | 0.67 | 99.3
252Cf | 2.65 | Spontaneous fission | — | — | Neutrons, 2 MeV; γs, 5.9–6.1 MeV; fission fragments, 80 and 104 MeV |

a Main emission energies.
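The half-lives in Table I determine how quickly a source weakens. A minimal sketch, assuming simple exponential decay A(t) = A0 · 2^(−t/T½) and using the 60Co half-life and γ-ray energies quoted above (the function name and the starting activity are illustrative):

```python
def remaining_activity(initial_activity: float, elapsed_years: float,
                       half_life_years: float) -> float:
    """Activity left after elapsed_years, assuming simple exponential decay."""
    return initial_activity * 0.5 ** (elapsed_years / half_life_years)


CO60_HALF_LIFE_YEARS = 5.27                 # from Table I
CO60_GAMMA_ENERGIES_MEV = (1.1732, 1.3325)  # from the text

# Mean energy of the two gamma rays (Table I lists "av 1.25")
mean_gamma_mev = sum(CO60_GAMMA_ENERGIES_MEV) / len(CO60_GAMMA_ENERGIES_MEV)

# Hypothetical source: 1000 Ci at installation, checked after 10 years
activity_after_10_y = remaining_activity(1000.0, 10.0, CO60_HALF_LIFE_YEARS)

print(f"Mean 60Co gamma energy: {mean_gamma_mev:.3f} MeV")
print(f"Activity after 10 years: {activity_after_10_y:.0f} Ci")
```

This kind of estimate is one reason irradiation facilities periodically replenish their 60Co sources.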

In recent decades, synchrotron radiation has become a major high-flux source of photons for research and analytical applications. This radiation is produced in high-energy accelerators from the bending of electron orbital trajectories in the confining magnetic field, sometimes by magnetic “wigglers and undulators” interposed in the electron path. The photon energies thus produced range from tens of eV from accelerators in the hundreds of MeV range, to above 100 keV for electron accelerators in the multi-GeV range. Another recently developed source of photons in the γ-ray energy range is inverse Compton scattering. In such devices, intense laser beams in the visible or ultraviolet (eV range) are collided with GeV-range electrons in an accelerator, boosting the eV laser photons up to MeV energies.
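The boost from eV to MeV energies in inverse Compton scattering can be estimated from standard relativistic kinematics. The sketch below assumes a head-on collision with an ultra-relativistic electron, for which the maximum backscattered photon energy is approximately 4γ²E_L/(1 + 4γE_L/mec²). The 1 GeV beam energy and the 2.33 eV (green laser) photon energy are illustrative choices, not values from this article.

```python
ELECTRON_REST_ENERGY_EV = 0.51099895e6  # electron rest energy, eV


def max_backscattered_energy_ev(electron_energy_ev: float,
                                laser_photon_ev: float) -> float:
    """Maximum photon energy for head-on inverse Compton scattering.

    Assumes an ultra-relativistic electron of total energy electron_energy_ev
    and a photon scattered straight back along the electron direction.
    """
    gamma = electron_energy_ev / ELECTRON_REST_ENERGY_EV
    recoil = 4.0 * gamma * laser_photon_ev / ELECTRON_REST_ENERGY_EV
    return 4.0 * gamma**2 * laser_photon_ev / (1.0 + recoil)


# Illustrative case: 1 GeV electrons and 2.33 eV laser photons
e_max_ev = max_backscattered_energy_ev(1.0e9, 2.33)
print(f"Maximum scattered photon energy: {e_max_ev / 1e6:.1f} MeV")
```

For these assumed inputs the result is a few tens of MeV, consistent with the statement that eV laser photons are boosted to MeV energies.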

C. Electrons

Electron beams for research and for irradiation purposes are produced by a variety of orbital and linear accelerators, with output energies from the keV to the GeV regions, and beam currents from microamperes to kiloamperes. Linear accelerators (linacs) consist of an electron gun injector at one end, followed by a series of accelerating cavities containing rf (radio frequency) power. Natural sources of high-energy electrons include β rays from radioisotope decay, and also electrons in space that have been accelerated in magnetic fields. There is a particularly high concentration of electrons trapped around planets that have high magnetic fields, such as Earth and Jupiter. These trapped-electron concentrations, called Van Allen belts, pose a radiation damage hazard to components in unmanned space vehicles traversing these altitudes, and are avoided as much as possible in the case of manned flights.

D. Positrons

The emissions from decay of radionuclides can also include electrons of charge sign opposite to that of the orbital atomic electrons and β rays (β−). These are called positrons (e+ or β+), and they can also be accelerated to provide positron beams. Positrons are also produced in the process of pair production by photons of energy higher than that equivalent to the rest mass of two electrons (1.022 MeV total). The signature of positrons is the annihilation radiation resulting from the collision between an electron and a positron; if the positron has come to rest, this usually consists of two photons emitted in opposite directions, each with one rest-mass energy of 0.511 MeV. This radiation is important in medical diagnostic imaging by PET (positron emission tomography). Variants of this method include SPECT (single-photon emission computed tomography).
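The two energies quoted above follow directly from the electron rest mass. A small sketch using CODATA values for the electron mass, the speed of light, and the elementary charge (the constant names are illustrative):

```python
ELECTRON_MASS_KG = 9.1093837015e-31
SPEED_OF_LIGHT_M_S = 299792458.0
JOULES_PER_MEV = 1.602176634e-13

# Rest energy of one electron (or positron), E = m c^2, expressed in MeV
rest_energy_mev = ELECTRON_MASS_KG * SPEED_OF_LIGHT_M_S**2 / JOULES_PER_MEV

# Pair production must supply the rest energy of an electron and a positron
pair_threshold_mev = 2.0 * rest_energy_mev

print(f"Annihilation photon energy: {rest_energy_mev:.3f} MeV")
print(f"Pair-production threshold:  {pair_threshold_mev:.3f} MeV")
```

The annihilation photon energy and the pair-production threshold computed this way reproduce the 0.511 and 1.022 MeV values in the text.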

E. Protons

Proton beams can also be produced by accelerators. One especially well-known form of proton accelerator is the cyclotron, which uses radio-frequency energy. Nuclear reactions also produce protons in a material sample. Protons are emitted from the sun, and protons are also found trapped in planetary magnetic fields. High-energy protons are emitted from the sun in bursts associated with solar flares. A less energetic, steady stream of protons emitted from the sun is called the solar wind.

F. Ions

α Particles (which are ions) consist of high-energy helium nuclei; these too are emitted by radioisotopes and stars and are thus found in interplanetary space. In addition to helium ions, very energetic ions of all atomic masses are found in space. These are called cosmic rays. Ion beams can be generated in accelerators. One use for ion beams is the ion implantation of solids to modify their properties. High-current ion implanters are available for industrial use. These beams cause large amounts of radiation damage in the solids so treated.

G. Neutrons

The main sources of neutrons are nuclear fission and nuclear fusion reactions. Of these, the fission of uranium is the most common. The primary product of fission is fast neutrons having an energy distribution described as a fission spectrum. This spectrum has a large content with energies above 1 MeV. This is the spectrum that would be observed near a nuclear explosion. In a nuclear reactor, interaction with surrounding material, most efficiently with light nuclei, reduces the neutron energy. Thermal neutrons or cold neutrons of much lower energy are produced. On collision with matter, fast neutrons produce much damage while thermal neutrons produce radioactivation. Neutrons reduced to cryogenic temperatures, with their long effective wavelength, are useful in surface physics studies. Nuclear fusion reactions produce neutrons having a much higher energy than fission. For example, one commercial generator of fusion neutrons produces a beam of D-T neutrons of a single energy of 14 MeV.
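The "long effective wavelength" of slow neutrons can be made concrete with the nonrelativistic de Broglie relation λ = h/√(2mE), taking the characteristic energy as kT. This is a minimal sketch; the two temperatures (room temperature and an assumed 25 K cold moderator) are illustrative and not taken from this article.

```python
import math

PLANCK_J_S = 6.62607015e-34
NEUTRON_MASS_KG = 1.67492749804e-27
BOLTZMANN_EV_PER_K = 8.617333262e-5
JOULES_PER_EV = 1.602176634e-19


def thermal_energy_ev(temperature_k: float) -> float:
    """Characteristic energy kT of neutrons moderated to temperature_k."""
    return BOLTZMANN_EV_PER_K * temperature_k


def de_broglie_wavelength_angstrom(energy_ev: float) -> float:
    """Nonrelativistic neutron wavelength, lambda = h / sqrt(2 m E)."""
    momentum = math.sqrt(2.0 * NEUTRON_MASS_KG * energy_ev * JOULES_PER_EV)
    return PLANCK_J_S / momentum * 1e10  # metres -> angstroms


e_thermal = thermal_energy_ev(293.15)  # room-temperature moderator
e_cold = thermal_energy_ev(25.0)       # assumed cold-source temperature

lambda_thermal = de_broglie_wavelength_angstrom(e_thermal)
lambda_cold = de_broglie_wavelength_angstrom(e_cold)

print(f"Thermal: kT = {e_thermal:.4f} eV, wavelength = {lambda_thermal:.2f} angstrom")
print(f"Cold:    kT = {e_cold:.4f} eV, wavelength = {lambda_cold:.2f} angstrom")
```

Thermal neutrons come out with wavelengths comparable to interatomic spacings, and cold neutrons with several times that, which is why they suit surface and structural studies.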

III. QUANTIFYING RADIATION

A. Flux and Fluence

A parallel beam of radiation passing through free space can be quantified by quoting the number of particles passing through a unit area. See flux and fluence in this article’s Glossary. If the particles come from a variety of directions (as in outer space), then we quote the number intersecting a sphere of unit cross-sectional area.

B. Energy Spectrum

A curve plotting the population of particles in a given energy range is called the energy spectrum of the particle.

C. Exposure

In radiation physics, we are often concerned with the end result of the absorption of radiation-borne energy in solids. In order to calculate this energy, we first need to quantify the fluence and energy of the particles that impinge on the solid. Even if we cannot do this, we can at least specify an observable effect in a familiar medium such as air. These ways of stating a quantity of radiation do not describe the exact energy absorbed in a given material. They are, however, a useful measure of radiation quantity and are called “exposure” units.

We express exposure in two ways:

1. We note the fluences impinging on the solid of interest and the energies of the particles or photons concerned, such as $10^{15}$ cm$^{-2}$ of 1 MeV electrons or $10^{12}$ cm$^{-2}$ of fission-spectrum neutrons.

2. We measure the quantity of ionization produced by such a fluence in a standard medium, usually air.

The units used in (1) are often used in the radiation testing of silicon devices with radiation beams such as electrons. The units used in (2) are often used in medicine, where x-ray generators and isotopes are the most common sources of radiation, and air ionization chambers are the most common measuring instruments. Calculation methods are available to convert units of exposure into units of absorbed energy, namely, dose and kerma, which are discussed later.

D. Dose and Kerma

The term dose is a useful general description of the energy per unit mass that has been deposited in a material by a high-energy particle on its way through. Given a value for dose, we can calculate biological and nonbiological radiation effects. For crystalline solids such as silicon and metals, some further complications arise. It may be necessary to divide the energy deposition into two fractions: ionization and atomic displacement. A quantity defined to assist with these distinctions is kerma, meaning kinetic energy released in a material. To distinguish between the two forms of energy deposition, we can speak, for example, of ionization kerma when referring to the fraction of energy going into ionization.

Radiation dose and kerma are both measures of energy deposited. The SI unit for either is the gray (Gy), which represents an energy deposition per unit mass of 1 J/kg. Many authoritative publications still employ the older practical unit, the rad, representing 100 ergs/g; a dose of 1 Gy thus equals 100 rad. One roentgen represents 86.9 ergs per gram in air.

The term equivalent dose is a useful description of the biological effects of different kinds of radiation on different organs and tissues of the human body. The equivalent dose is defined as the absorbed dose multiplied by an appropriate radiation weighting factor, $w_R$. For x rays and gamma rays the weighting factor is about unity. The SI unit for the equivalent dose is the sievert (Sv), where 1 Sv = 1 J/kg. The older unit is the rem, representing 100 ergs per gram. An equivalent dose of 1 Sv thus equals 100 rems.
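The unit relations above can be collected in a few lines of Python. This is only a sketch of the arithmetic stated in the text (1 Gy = 1 J/kg = 100 rad, 1 Sv = 100 rem, equivalent dose = absorbed dose times weighting factor); the function names are illustrative and the weighting factor must be supplied by the user from an appropriate reference.

```python
# Unit relations quoted in the text; function names are illustrative only.
RAD_PER_GRAY = 100.0      # 1 Gy = 100 rad
REM_PER_SIEVERT = 100.0   # 1 Sv = 100 rem
ERG_PER_JOULE = 1.0e7
GRAM_PER_KG = 1.0e3


def gray_to_rad(dose_gy):
    """Convert an absorbed dose from gray to rad."""
    return dose_gy * RAD_PER_GRAY


def gray_to_erg_per_gram(dose_gy):
    """Convert an absorbed dose from J/kg (Gy) to erg/g."""
    return dose_gy * ERG_PER_JOULE / GRAM_PER_KG


def equivalent_dose_sv(absorbed_dose_gy, w_r):
    """Equivalent dose (Sv) = absorbed dose (Gy) x radiation weighting factor."""
    return absorbed_dose_gy * w_r


def sievert_to_rem(h_sv):
    """Convert an equivalent dose from sievert to rem."""
    return h_sv * REM_PER_SIEVERT
```

For x rays and gamma rays, where the text gives a weighting factor of about unity, `equivalent_dose_sv(d, 1.0)` simply returns the absorbed dose value.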

E. Range–Energy Relations

Photons and particles, when passing through a slab of material, are attenuated by collisions and other interactions. We can plot the intensity of the emergent radiation as a function of the slab thickness d. In many cases, the attenuation can be expressed by an exponential law of the form

$$ I/I_0 = e^{-\mu d}, $$

where $I_0$ is the incident radiation intensity, I is the emergent intensity, and µ is the attenuation coefficient in units reciprocal to those of the thickness d of the slab.
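As a minimal numerical sketch of the exponential law, the following Python functions evaluate the transmitted fraction and invert the law for the thickness that gives a chosen transmission. The function names are illustrative, and any numerical value of µ passed in is the user's assumption, not data from this article.

```python
import math


def transmitted_fraction(mu, d):
    """Return I/I0 = exp(-mu * d); mu and d must use reciprocal units."""
    if mu < 0 or d < 0:
        raise ValueError("mu and d must be non-negative")
    return math.exp(-mu * d)


def thickness_for_fraction(mu, fraction):
    """Invert the exponential law: d = -ln(I/I0) / mu."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must lie in (0, 1]")
    return -math.log(fraction) / mu
```

With µ in cm$^{-1}$ and d in cm, a slab one attenuation length thick (µd = 1) transmits a fraction 1/e of the incident intensity.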

For photons (x rays and γ rays), this law is followed exactly for an idealized “narrow beam” geometry in which secondary radiations produced in the absorber are not seen by the detector. For electrons, the law is followed over the early part of the curve but, after a certain distance, the practical range, no electrons emerge. For neutrons, the law is followed approximately, although the interactions with the atoms of the material are very different from those for electrons and photons.

For electrons and other charged particles there is a minimum slab thickness W that stops all the particles. This is called the “stopping range.” We can plot curves of W versus energy. These are useful for calculations of shielding. Depth-dose curves, which plot dose as a function of penetration depth, such as those shown in Fig. 2, are useful in designing radiation cancer-therapy treatment plans.

IV. INTERACTION WITH MATTER

A. Phenomena Observed during Absorption of Radiation

1. General

The laws by which radiation is attenuated by and absorbed in matter are derived from the several competing types of interactions of photons or particles with the atoms of the material. This section describes the primary processes for the various types of radiation.

FIGURE 2 Central axis depth dose curves for some typical photon, electron, neutron and proton radiotherapy treatment beams. [Reproduced courtesy of D. T. L. Jones, “Present status and future trends of heavy particle radiotherapy,” pp. 13–20 in Cyclotrons and their Applications 1998 (Proceedings of the 15th International Conference on Cyclotrons and their Applications, Caen, France, 14–19 June 1998) (E. Baron and M. Lieuvin, eds.), Copyright Institute of Physics, Bristol 1998.]

2. Photons (e.g., X Rays, γ Rays, Bremsstrahlung)

Photons are electromagnetic wave trains, differing from visible light only by having a shorter wavelength and a higher energy, with the same velocity $c = 2.99792458 \times 10^{8}$ m s$^{-1}$ in a vacuum. Photons can also be treated as particles, with energies E (usually in multiples of eV) inversely related to the wavelength λ (usually in angstroms [Å], where 1 Å = $10^{-10}$ m = 0.1 nm) according to

$$ E(\mathrm{eV}) = 12398.42/\lambda(\mathrm{\AA}). $$
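The energy-wavelength relation can be applied in either direction. The short Python sketch below uses the constant 12398.42 eV Å quoted in the text; the function names are illustrative.

```python
# Product hc in eV * angstrom, as quoted in the text.
HC_EV_ANGSTROM = 12398.42


def photon_energy_ev(wavelength_angstrom):
    """Photon energy in eV for a wavelength given in angstroms."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_EV_ANGSTROM / wavelength_angstrom


def photon_wavelength_angstrom(energy_ev):
    """Photon wavelength in angstroms for an energy given in eV."""
    if energy_ev <= 0:
        raise ValueError("energy must be positive")
    return HC_EV_ANGSTROM / energy_ev
```

For instance, a 1 Å photon carries about 12.4 keV, which places wavelengths comparable to interatomic spacings in the x-ray region.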

As a particle, a photon can transfer momentum to a target particle such as an electron, in a variety of scattering and absorption collision processes. Compared with particles such as electrons and protons, the probability of collision is low, so that photons in the energy range considered here are regarded as penetrating radiation. For each element, the probability of interaction with incident photons is a function of the energy of the photon and the atomic number Z of the material. Interaction is primarily by four processes: the photoelectric process, the Compton (inelastic) scattering process, the Rayleigh (elastic) scattering process, and the pair (and triplet) production process. Among the other less-probable processes is photonuclear interaction, which can induce radioactivity.

a. Photoelectric absorption (τ). In this interaction a photon is completely absorbed by an atom, and an electron is ejected. For a given atomic electron shell or subshell to participate in this process, the photon energy hν must be greater than the binding energy B of electrons in that shell or subshell. The ejected electron has a kinetic energy of hν − B. The highest value of B for a given element is for the two (one for hydrogen) innermost and most tightly bound electrons, in the K shell, ranging from 13.598 eV for hydrogen (Z = 1) up to 88.005 keV for a high-Z element such as lead (Z = 82). Progressing outward from the nucleus, the L shell has three subshells, $L_{\mathrm{I}}$, $L_{\mathrm{II}}$, and $L_{\mathrm{III}}$, each with a slightly different binding energy B. These threshold energy values result in a characteristic “sawtooth” shape of a plot of the photoelectric-effect cross section as a function of incident photon energy.

Although the regions at photon energies just above these “sawtooth” absorption edges (thresholds) can exhibit considerable fine structure, with oscillations of 10% or more, from both matrix and atomic (outer unfilled shells, etc.) effects, this fine structure is generally ignored in general-purpose compilations of x-ray attenuation coefficients for medical and shielding applications. However, EXAFS (extended x-ray absorption fine structure) is a widely used and important analytical tool.

i. Fluorescence (characteristic) x rays. In addition to the ejection of a photoelectron, this process results in the emission of one or more fluorescence x rays, due to an outer-shell electron falling into the inner-shell vacancy created by the departure of the ejected photoelectron. The fluorescence x-ray energy is equal to the difference in binding energy between the participating inner and outer electron shells or subshells; hence a fluorescence x-ray spectrum consists of a number of discrete lines, each corresponding to one of the many transition possibilities.

ii. Auger effect. The fluorescence x rays do not necessarily emerge from the atom in which they originate, but can instead dislodge outer electrons, similar in effect to a secondary photoelectric absorption. This additional emission of electrons is called the Auger effect, and the emitted electrons are called Auger electrons. The probability that the primary fluorescence x ray will escape from the atom without undergoing the Auger process is called the fluorescence yield, $\omega_i$, in which i refers to the electron shell or subshell from which the primary photoelectron was dislodged.

The photoelectric effect (τ) is the dominant interaction process for low photon energies.

b. Compton scattering ($\sigma_{\mathrm{inc}}$). In this interaction, also called incoherent or inelastic scattering, only part of the energy of the photon is transferred to an electron. Both the electron and photon are scattered from the collision, with energies and directions determined by conservation of momentum and energy between the deflected photon and the recoiling electron. The energy E′ of the deflected photon is reduced from the incident photon energy E according to the relation

$$ E' = \frac{E}{1 + (E/mc^2)(1 - \cos\theta)} $$

in which $mc^2$ = 0.5110 MeV is the rest-mass energy of an electron (or positron) and θ is the deflection angle of the scattered photon. This equation can also be written in terms of the shift in wavelength of the photon,

$$ \lambda' - \lambda = 1 - \cos\theta $$

in which the wavelengths λ′ and λ of the deflected and incident photons are in Compton units,

$$ \lambda = mc^2/E = 0.5110/E[\mathrm{MeV}]. $$

It can thus be seen that the maximum shift in photon wavelength is two Compton units, at the photon backscatter angle of θ = 180°, at which angle, no matter how high the incident photon energy E, the backscattered photon energy E′ will never exceed $mc^2/2$ = 0.2555 MeV. In collimated gamma-ray sources, this energy can show up in a spectrum of the primary beam, in addition to high-energy photons from the source itself, due to 180° Compton scattering from material behind the source.
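The Compton relation and the backscatter limit can be checked numerically. The following Python sketch evaluates the scattered-photon energy for a free electron at rest, using the rest-mass energy 0.5110 MeV quoted in the text; function names are illustrative.

```python
import math

MC2_MEV = 0.5110  # electron rest-mass energy, as quoted in the text


def scattered_energy_mev(e_mev, theta_deg):
    """Energy E' of the Compton-scattered photon for deflection angle theta."""
    theta = math.radians(theta_deg)
    return e_mev / (1.0 + (e_mev / MC2_MEV) * (1.0 - math.cos(theta)))


def wavelength_shift_compton_units(theta_deg):
    """Wavelength shift (lambda' - lambda) in Compton units."""
    return 1.0 - math.cos(math.radians(theta_deg))
```

Evaluating `scattered_energy_mev(E, 180.0)` for increasing E shows the backscattered energy approaching, but never reaching, $mc^2/2$.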

Over most of the region where the Compton cross section is a major part of the total cross section, the target electron can be assumed to be free and at rest, in which case the elegant Klein-Nishina equations apply. At lower energies, where electron motion and binding energies are a significant fraction of the incident photon energy, the Klein-Nishina theoretical cross section can be modified by use of an incoherent scattering function S(x, Z), in which x is a momentum transfer variable related to the incident photon energy E and its subsequent deflection angle θ, and Z is the atomic number of the target atom. Calculations of S(x, Z) require knowledge of atomic wave functions, and values of S(x, Z) are usually taken from available tables of the incoherent scattering function, an example of which is provided in the bibliography.

Compton scattering ($\sigma_{\mathrm{inc}}$) is the dominant interaction process in the intermediate energy range from a few tens or hundreds of keV up to a few MeV, this range being the broadest for the lowest-Z elements.

c. Rayleigh scattering ($\sigma_{\mathrm{coh}}$). In this interaction, also called coherent or elastic scattering, a photon is scattered by the atomic electron cloud as a whole, with the entire atom, including the nucleus (heavy compared to the electrons), taking up the recoil. Thus, from energy and momentum conservation considerations, the deflected photon undergoes negligible loss of energy or change in wavelength.

Since the angular distribution of Rayleigh scattering is sharply peaked in the forward direction for photon energies in the gamma-ray region, and the energy loss of the deflected photon is negligible, the contribution of this process is sometimes omitted in radiation transport calculations. However, at the lower photon energies (4 to 30 keV) utilized in x-ray crystallography, the photon wavelength is comparable to the lattice spacing between atoms in crystals, and this effect becomes of prime importance. Due to the coherence between the incident and deflected photon, interference between waves scattered from successive crystalline layers results in reflections only into sharply discrete directions, making possible studies not only of inorganic crystalline materials, but also of biological structures such as DNA.

Rayleigh (coherent) scattering ($\sigma_{\mathrm{coh}}$) is never the dominant photon interaction process, but for heavy elements in the energy region just below the photoelectric K-shell absorption edge (threshold) it can account for up to 10% of the total attenuation coefficient.

d. Pair and triplet production ($\kappa_n$ and $\kappa_e$). In this type of interaction, which can happen only for photons with energy in excess of two electron rest-mass energies ($2mc^2$ = 1.022 MeV), the photon interacts with the electrostatic field of a charged particle, such that the photon disappears and in its place is created an electron–positron pair ($e^-$, $e^+$). If this process happens in the field of the positively charged atomic nucleus ($\kappa_n$), with negligible recoil due to its effectively infinite mass in comparison with the electron or positron, only these two particles emerge from the atom as the products of the primary interaction. Any photon energy in excess of 1.022 MeV appears as kinetic energy, divided between the electron and the positron.

i. Triplet production. If the target charged particle for this process is an atomic electron ($\kappa_e$), the target electron recoils in the forward direction, along with the created electron–positron pair, with all three equal-mass particles ($2e^-$, $e^+$) sharing as kinetic energy the excess photon energy. In the early cloud-chamber observations of this process, the three particles appeared as a three-pronged trident, or “triplet.” Due to the kinematics, with the recoiling electron target, the threshold energy for triplet production ($\kappa_e$) is four electron rest-mass energies ($4mc^2$ = 2.044 MeV).
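The two thresholds quoted above follow from one kinematic relation. As a sketch that goes slightly beyond the text, the standard relativistic result for pair creation in the field of a target of rest mass M initially at rest is $E_{\mathrm{th}} = 2mc^2(1 + m/M)$; this formula is supplied here as an assumption for illustration and is not stated in the article. The function name is likewise illustrative.

```python
MC2_MEV = 0.511  # electron rest-mass energy in MeV


def pair_threshold_mev(target_mass_in_electron_masses):
    """Threshold photon energy for e-/e+ creation in the field of a target
    at rest, E_th = 2 m c^2 (1 + m/M), with M given in electron masses."""
    if target_mass_in_electron_masses <= 0:
        raise ValueError("target mass must be positive")
    return 2.0 * MC2_MEV * (1.0 + 1.0 / target_mass_in_electron_masses)
```

With M = m (an electron target) the function returns $4mc^2$, the triplet threshold; for a nucleus, where M is thousands of electron masses, it returns a value only marginally above $2mc^2$.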

ii. Annihilation radiation. The produced positron ($e^+$), as was mentioned earlier, eventually collides catastrophically with an electron ($e^-$) and both are annihilated, reverting usually to a pair of photons, each with an energy of one electron rest-mass energy ($mc^2$ = 0.511 MeV), or higher if the positron annihilates while still in flight, and (rarely) to a single photon or to three photons.

The probability for pair (or triplet) production is approximately proportional to the square of the charge of the target particle. Hence, the cross section for pair production in the field of the nucleus ($\kappa_n$) varies as $Z^2$, whereas for triplet production ($\kappa_e$) the cross section varies simply as Z, the number of unit-charge electron target particles per atom. Thus for hydrogen (Z = 1), at photon energies well above the thresholds, the cross sections for $\kappa_e$ and $\kappa_n$ are approximately equal, whereas for higher-Z elements $\kappa_e$ is smaller, approximately as

$$ \kappa_e/\kappa_n \approx Z/Z^2 = 1/Z. $$

Pair and triplet production ($\kappa_n$ and $\kappa_e$) are the dominant processes for photon interactions with atoms at high energies, from a few MeV upwards.

e. Photonuclear absorption ($\sigma_{\mathrm{ph.n.}}$). This type of interaction, with a threshold of the order of 5 MeV or higher, is somewhat the analog of the atomic photoelectric effect, but with the photon absorbed by the atomic nucleus, rather than by an electron in the shells surrounding the nucleus. The most likely result of such an interaction is the emission of a single neutron, in which case we can express the interaction as

$$ N_i(\gamma, n)N_f $$

in which $N_i$ is the target nucleus and $N_f$ is the final nucleus, which has one less neutron and may be a radioactive isotope of the target nucleus $N_i$. Besides single neutron emission, other possibilities include the emission of charged particles, gamma rays, or more than one neutron. Emission of charged particles, such as protons, changes the atomic number Z of the target element, as well as its mass.

The most characteristic feature of the photonuclear absorption cross section is the “giant resonance.” This is a broad peak in the absorption cross section centered at about 24 MeV for light nuclei, decreasing in energy with increasing mass number to about 12 MeV for the heaviest stable nuclei. The width Γ (energy difference between the points at which the cross section drops to one half its maximum value) varies from about 3 MeV to 9 MeV depending on the detailed properties of individual nuclei.

The magnitude of the photonuclear cross section, even at the resonance peak energy, is small in comparison with the sum of the above “electronic” cross sections, and in no case contributes more than 10% to the total cross section. However, photonuclear absorption can be important in shielding technologies since the emitted neutrons are usually far more penetrating than the incident photons, and it is obviously important in irradiation technology due to the induced radioactivities in the target materials.

f. Other photon interactions. A number of other, less probable interactions can occur as photons traverse and encounter tangible matter, but the interactions listed above are at present the only ones of practical interest in medical and technological applications. One of these less-probable but interesting interactions is Delbrück scattering, which is scattering of photons in the Coulomb (electrostatic) field of nuclei as a consequence of vacuum polarization. Delbrück scattering is considered to be “scattering of light by light,” and has been observed for photon energies from a few hundred keV up to a few GeV. Other small effects include, for example, resonance scattering and Thomson scattering by the nucleus, Compton scattering by nucleons, meson production (at a few hundred MeV), resonance scattering associated with meson production, and nucleon–antinucleon pair production. Nucleon–antinucleon pair production, of interest in cosmogenic modeling, has a threshold which varies from 3.75 GeV (4 nucleon masses) in the field of a proton at rest down to 3.0 GeV (3.2 nucleon masses) for protons moving in a complex nucleus.

FIGURE 3 Contributions of (a) atomic photoeffect, τ, (b) incoherent (Compton) scattering, $\sigma_{\mathrm{inc}}$, (c) coherent (Rayleigh) scattering, $\sigma_{\mathrm{coh}}$, (d) nuclear-field pair production, $\kappa_n$, (e) electron-field pair production (triplet), $\kappa_e$, and (f) nuclear photoabsorption, $\sigma_{\mathrm{ph.n.}}$, to the total measured cross section, $\sigma_{\mathrm{tot}}$ (circles) in carbon over the photon energy range 10 eV to 100 GeV. The measured $\sigma_{\mathrm{tot}}$ points, taken from 90 independent literature references, are not all shown in regions of high measurement density.

g. Total cross section ($\sigma_{\mathrm{tot}}$) and the mass attenuation coefficient (µ/ρ). The total photon interaction cross section $\sigma_{\mathrm{tot}}$ for a given material and a given photon energy is given as the sum over the photoelectric effect cross section τ, the incoherent (Compton) and coherent (Rayleigh) scattering cross sections $\sigma_{\mathrm{inc}}$ and $\sigma_{\mathrm{coh}}$, the pair and triplet production cross sections $\kappa_n$ and $\kappa_e$, and the photonuclear absorption cross section $\sigma_{\mathrm{ph.n.}}$:

$$ \sigma_{\mathrm{tot}} = \tau + \sigma_{\mathrm{inc}} + \sigma_{\mathrm{coh}} + \kappa_n + \kappa_e + \sigma_{\mathrm{ph.n.}} $$

The relative importance of all six of these different types of interactions of photons with atoms, in the different regions of photon incident energy, can be seen in Figs. 3 and 4 for carbon (Z = 6) and for lead (Z = 82).

The photonuclear cross section $\sigma_{\mathrm{ph.n.}}$ is at present not amenable to systematic compilation and tabulation, due to its complex dependence on the irregular variations in nuclear structure as a function of the atomic number Z and of the mass numbers A of the variable isotopic mixes of the stable elements. Hence, despite its importance (production of radioactivities) in biological and technological applications involving photon energies above 5 or 10 MeV, this cross section is still omitted from systematic general-use tabulations of x-ray attenuation coefficient and energy-absorption coefficient data. In currently available tabulations, the mass attenuation coefficient µ/ρ is therefore composed as

$$ \mu/\rho = (\tau + \sigma_{\mathrm{inc}} + \sigma_{\mathrm{coh}} + \kappa_n + \kappa_e)/(u \cdot A) $$

with µ/ρ in units of cm$^2$/g if the five component cross sections are in units of cm$^2$/atom, where u (= 1.660 538 73(13) × $10^{-24}$ g) is the atomic mass unit (1/12 of the mass of an atom of the nuclide $^{12}$C) and A is the relative atomic mass of the target element.

FIGURE 4 Contributions of (a) atomic photoeffect, τ, (b) incoherent (Compton) scattering, $\sigma_{\mathrm{inc}}$, (c) coherent (Rayleigh) scattering, $\sigma_{\mathrm{coh}}$, (d) nuclear-field pair production, $\kappa_n$, (e) electron-field pair production (triplet), $\kappa_e$, and (f) nuclear photoabsorption, $\sigma_{\mathrm{ph.n.}}$, to the total measured cross section, $\sigma_{\mathrm{tot}}$ (circles) in lead over the photon energy range 10 eV to 100 GeV. The measured $\sigma_{\mathrm{tot}}$ points, taken from 121 independent literature references, are not all shown in regions of high measurement density.
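The conversion from atomic cross sections to a mass attenuation coefficient is a single division. The Python sketch below uses the value of u quoted in the text; the function name is illustrative, and any cross-section values passed in are the user's inputs (none are taken from a real tabulation here).

```python
U_GRAMS = 1.66053873e-24  # atomic mass unit in grams, as quoted in the text


def mass_attenuation_cm2_per_g(tau, sigma_inc, sigma_coh, kappa_n, kappa_e,
                               relative_atomic_mass):
    """mu/rho in cm^2/g from the five component cross sections in cm^2/atom."""
    if relative_atomic_mass <= 0:
        raise ValueError("relative atomic mass must be positive")
    total = tau + sigma_inc + sigma_coh + kappa_n + kappa_e
    return total / (U_GRAMS * relative_atomic_mass)
```

As an order-of-magnitude illustration with a hypothetical total of 1 barn/atom ($10^{-24}$ cm$^2$) in carbon (A = 12.011), the result is about 0.05 cm$^2$/g.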

i. Additivity. For compounds and mixtures, the mass attenuation coefficient µ/ρ at a particular photon energy E can be obtained according to simple additivity:

$$ \mu/\rho = \sum_i w_i (\mu/\rho)_i $$

in which $w_i$ is the fraction by weight of the ith elemental constituent and $(\mu/\rho)_i$ is the mass attenuation coefficient for that element for photons of energy E.
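The additivity rule is a weighted sum, sketched below in Python. The helper that derives weight fractions from a chemical formula is an added convenience, not part of the text, and the function names are illustrative. The elemental coefficients must come from a tabulation at the photon energy of interest; none are supplied here.

```python
def weight_fractions_from_formula(atom_counts, atomic_masses):
    """Weight fractions w_i from atoms per formula unit and atomic masses."""
    total = sum(n * atomic_masses[el] for el, n in atom_counts.items())
    return {el: n * atomic_masses[el] / total for el, n in atom_counts.items()}


def mixture_mass_attenuation(weight_fractions, elemental_mu_rho):
    """mu/rho of a compound or mixture: sum over i of w_i * (mu/rho)_i."""
    if abs(sum(weight_fractions.values()) - 1.0) > 1e-6:
        raise ValueError("weight fractions must sum to 1")
    return sum(w * elemental_mu_rho[el] for el, w in weight_fractions.items())
```

For water, `weight_fractions_from_formula({"H": 2, "O": 1}, {"H": 1.008, "O": 15.999})` gives roughly 0.112 for hydrogen and 0.888 for oxygen, which can then be combined with elemental coefficients taken at a single common energy.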

ii. Narrow-beam attenuation. For an idealized pencil beam of photons and a detector shielded by a collimator such that secondary radiations (e.g., scattered and fluorescence photons) from the interposed target material cannot be seen by the detector, the exponential attenuation law mentioned earlier holds exactly:

$$ I/I_0 = e^{-(\mu/\rho)x} $$

in which x is in units of g/cm$^2$ if µ/ρ is in units of cm$^2$/g, as mentioned above.

h. Other factors governing photon attenuation. Although the idealized pencil-beam (collimated source, collimated detector) geometry is useful for performing basic measurements of the mass attenuation coefficient µ/ρ in support of theoretical modeling of the individual types of interaction cross sections and their sum $\sigma_{\mathrm{tot}}$, in practical situations two other factors modify the exponential attenuation law. One of these is the inverse square law, in which the radiation flux at various radial distances r from a point isotropic (PTI) source of photons varies according to $1/r^2$. For a point isotropic source embedded in an absorbing and scattering medium, the flux detected by an uncollimated detector will also be modified by a buildup factor $B_{\mathrm{PTI}}(E_0, Z, r)$, due to scattering and other secondary radiations reaching the detector. In this representation of the buildup factor, $E_0$ is the photon source energy, Z represents the element or mixture comprising the medium, and r is the distance from the source, usually in units of mfp (mean free path), in which 1 mfp = $1/\{(\mu/\rho)[E_0, Z]\}$. For applications in radiation field modeling for extended sources, $B_{\mathrm{PTI}}(E_0, Z, r)$ has been parametrized as a function of r in a variety of empirical and semi-empirical analytical formulations, as will be discussed in a later section on radiation fields from point and extended sources. However, to a large extent Monte Carlo calculations have replaced the use of buildup factors in radiation transport predictions, except for some benchmark limiting cases.
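One common way to combine the three factors named above (exponential attenuation, inverse square law, and buildup) is the point-kernel expression $\Phi(r) = S\,B\,e^{-\mu r}/(4\pi r^2)$. The Python sketch below assumes this form; it is not written out in the text, the function names are illustrative, and the buildup factor is left as a caller-supplied number because its parametrization depends on energy, medium, and distance.

```python
import math


def distance_in_mfp(mu, r):
    """Distance from the source expressed in mean free paths (mu * r)."""
    return mu * r


def point_source_flux(source_strength, mu, r, buildup=1.0):
    """Flux at distance r from a point isotropic source emitting
    source_strength photons per second in a medium with linear attenuation
    coefficient mu. buildup=1.0 reproduces the narrow-beam (unscattered) case."""
    if r <= 0:
        raise ValueError("r must be positive")
    return source_strength * buildup * math.exp(-mu * r) / (4.0 * math.pi * r ** 2)
```

Setting `mu=0.0` recovers the pure inverse square law, and any `buildup` greater than 1 raises the detected flux above the unscattered value.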

i. Mass energy-transfer coefficient $\mu_{\mathrm{tr}}/\rho$ and mass energy-absorption coefficient $\mu_{\mathrm{en}}/\rho$. For computing the energy deposited at a given point within an absorbing medium, subject to a photon flux traversing that point, the mass energy-transfer coefficient $\mu_{\mathrm{tr}}/\rho$ and mass energy-absorption coefficient $\mu_{\mathrm{en}}/\rho$ are useful quantities. These coefficients account for the energy carried away from a collision site by secondary photon radiation, instead of the entire photon energy being deposited at the collision site in the form of charged-particle kinetic energy. The difference between these quantities, and how they differ also from the total mass attenuation coefficient µ/ρ, is illustrated pictorially in Fig. 5, in which the upward-branching arrows represent the energy of the departing secondary photon radiations. The mass energy-transfer coefficient, used for computing kerma, is defined as

$$ \mu_{\mathrm{tr}}/\rho = (f_{\sigma_{\mathrm{inc}}}\sigma_{\mathrm{inc}} + f_{\tau}\tau + f_{\kappa_n}\kappa_n + f_{\kappa_e}\kappa_e)/(u \cdot A) $$

in which the factors $f_i$ are the average fractions of the incident photon energy, for each type of interaction, left at the collision site in the form of ionization, excitation, and kinetic energy of charged particles, the remainder departing from the collision site, as shown, in the form of Compton (incoherent) scattered photons, fluorescence photons from photoelectric absorption or Compton vacancies, and annihilation radiation from the catastrophic encounters of pair- and triplet-produced positrons with electrons. For Rayleigh (coherent) scattering, the deflected photon loses negligible energy, so that $\sigma_{\mathrm{coh}}$ does not contribute to the mass energy-transfer coefficient $\mu_{\mathrm{tr}}/\rho$. A more-detailed version of this energy-deposition quantity, as shown in Fig. 5, is the mass energy-absorption coefficient $\mu_{\mathrm{en}}/\rho$, which takes into account the bremsstrahlung (photon) radiation produced by the energetic charged particles, including positrons, which can depart from the region around the original collision site without depositing energy. A further refinement takes into account the fact that the combined energy of the annihilation radiation photons from positrons annihilating in flight is greater than $2mc^2$. The mass energy-absorption coefficient $\mu_{\mathrm{en}}/\rho$ is expressed as

$$ \mu_{\mathrm{en}}/\rho = (1 - g)\,\mu_{\mathrm{tr}}/\rho $$

in which g is the average bremsstrahlung yield (fraction) for all of the charged particles produced in the original collision in the form of photoelectrons (and Auger electrons), Compton-ejected electrons, and electron–positron pairs (and triplets) from pair production events. A further refinement could include the effects of secondary photons and charged particles from photonuclear absorption $\sigma_{\mathrm{ph.n.}}$, but insufficient systematic information is available to include this process in currently available data tables.

FIGURE 5 Schematic representation of the mass attenuation coefficient µ/ρ, the mass absorption coefficient $\mu_a/\rho$, the mass energy-transfer coefficient $\mu_{\mathrm{tr}}/\rho$, and the mass energy-absorption coefficient $\mu_{\mathrm{en}}/\rho$ in terms of the cross sections for coherent ($\sigma_{\mathrm{coh}}$) and incoherent ($\sigma_{\mathrm{incoh}}$) scattering, atomic photoeffect (τ), pair production (κ), and photonuclear reactions ($\sigma_{\mathrm{ph.n.}}$). The upward-branching arrows represent the fraction of the incident photon energy lost to the volume of interest in the form of secondary photons such as positron annihilation radiation (ANN. RAD.), bremsstrahlung ($e^-$, $e^+$ BREMSS.), fluorescence x rays (FLUOR. γ) and scattered photons (SCATT. γ). The enhancement of annihilation photon energies due to positron annihilation in flight ($e^+$ ANN. IN FLT.) at the expense of positron bremsstrahlung and energy deposition is also indicated.

3. Neutron Absorption Mechanisms

Neutrons are uncharged particles. Because they are not repelled by the positively charged atomic nucleus, they can approach and interact with it, and various capture processes can occur. The probability of such an interaction is nevertheless low, and hence neutrons are regarded as highly penetrating radiation. In elastic scattering the neutron loses part of its energy by displacing atoms. It can be shown that the energy transferable during collisions is greatest for atoms of low atomic mass, especially hydrogen. Therefore, hydrogenous materials are useful for neutron shielding and detection. Energetic protons are created during elastic collisions with hydrogenous materials such as polymers and water.
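The statement that light atoms take up the most energy can be illustrated with the standard nonrelativistic result for a head-on elastic collision: a neutron striking a stationary nucleus of mass number A can transfer at most a fraction $4A/(1 + A)^2$ of its kinetic energy. This formula is supplied here as an assumption for illustration (the text only says "it can be shown"), with the neutron mass taken as one mass unit; the function name is illustrative.

```python
def max_fraction_transferred(mass_number):
    """Maximum fraction of neutron kinetic energy transferred in one elastic
    collision with a nucleus of the given mass number: 4A / (1 + A)^2."""
    a = float(mass_number)
    if a <= 0:
        raise ValueError("mass number must be positive")
    return 4.0 * a / (1.0 + a) ** 2
```

The fraction is 1 for hydrogen (A = 1), about 0.28 for carbon (A = 12), and about 0.02 for lead (A = 207), which is consistent with the usefulness of hydrogenous materials for slowing neutrons.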

After a number of collisions in a material at room temperature, the neutrons become thermalized, that is, their kinetic energy is of the order of 0.025 eV, the average vibrational energy of the atom in a solid at room temperature. If the solid is cooled below room temperature, then neutrons of even lower energies are produced. These are called cold neutrons and have uses in materials physics. The National Institute of Standards and Technology, from its reactor, produces neutrons at cryogenic temperatures, valuable in surface studies due to the effective long wavelength of the neutrons.
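The thermal energy scale and the "long effective wavelength" of slow neutrons can be estimated as follows. This Python sketch assumes the characteristic energy kT and the nonrelativistic de Broglie relation $\lambda = h/\sqrt{2 m_n E}$, neither of which is written out in the text; the physical constants are standard reference values, and the function names are illustrative.

```python
import math

K_B_EV_PER_K = 8.617333262e-5   # Boltzmann constant, eV/K
H_J_S = 6.62607015e-34          # Planck constant, J s
M_N_KG = 1.67492749804e-27      # neutron mass, kg
EV_J = 1.602176634e-19          # joules per electron volt


def thermal_energy_ev(t_kelvin):
    """Characteristic thermal energy kT in eV at temperature T."""
    return K_B_EV_PER_K * t_kelvin


def neutron_wavelength_angstrom(energy_ev):
    """Nonrelativistic de Broglie wavelength of a neutron, in angstroms."""
    if energy_ev <= 0:
        raise ValueError("energy must be positive")
    momentum = math.sqrt(2.0 * M_N_KG * energy_ev * EV_J)
    return H_J_S / momentum * 1.0e10
```

At room temperature kT is close to the 0.025 eV quoted above, and the corresponding wavelength is of the order of interatomic distances; lowering the energy lengthens the wavelength as $1/\sqrt{E}$.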

Thermal neutrons can be captured by atomic nuclei. This usually leads to the emission of a γ photon from the excited nucleus and the creation of a new isotope, a process known as transmutation. The new isotope may or may not be radioactive. If it is, then the process is termed radioactivation. If the neutron capture results in the emission of more neutrons, and a moderator material is present, a chain reaction can occur. In particular, in the case of heavy elements such as uranium which can undergo fission from neutron capture, large amounts of energy can be released in either controlled or uncontrolled modes.

4. Electron (β-Ray) Absorption

The two principal mechanisms by which electrons shed their kinetic energy are collision, which yields ionized or excited atoms, and the production of bremsstrahlung radiation. A third mode of interaction of electrons with matter is the Cherenkov effect, which plays a minor role in energy deposition but is quite visually dramatic in refractive transparent media.

a. Collisions. As they pass through matter, electrons gradually lose energy by means of collisions with atoms, along a wandering path. The collision energy is then usually taken up by other electrons; these ejected electrons are termed secondary electrons. Occasionally, atoms are displaced from their lattice positions by the collision, and a defect is left in the solid. Secondary electrons, after further collisions, generate electron-hole pairs or electron-ion pairs, depending on the medium in question. In a liquid or gas, a complex track of electron-ion pairs is produced, which has a core of intense ionization and small spurs called delta rays, where secondary electrons have departed radially from the original track.

b. Electromagnetic (bremsstrahlung) radiation. High-energy electrons, when deflected or slowed in Coulomb (electrostatic) fields of nuclei, produce photons known as bremsstrahlung (“braking” radiation). Electron–electron bremsstrahlung is also possible, but this is a less-probable process. Thin-target bremsstrahlung has energy and angular distributions well described mathematically according to equations developed by Schiff, and this broad continuum is the major photon output of x-ray machines for imaging and for therapy, particularly in the case of high-energy machines using electron linacs or other high-energy accelerators as sources of electrons to impinge on heavy-element bremsstrahlung converter targets.

c. Cherenkov radiation. The speed of light has a constant value of c in a vacuum. However, in a transparent material with a refractive index n > 1, the velocity of light in the material is reduced to c/n. Thus for a high-energy particle traveling at velocity v = βc, if β > 1/n, we can say that the particle is “traveling faster than light” in the material. As first observed experimentally by Cherenkov in 1934 and explained by Frank and Tamm in terms of Maxwell’s theory, when a charged particle traverses matter at a velocity in excess of the speed of light in the medium, radiation in the optical-wavelength region is emitted. This light, called “Cherenkov radiation,” is subject to constructive interference such that it is confined to a cone whose axis coincides with the direction of propagation of the charged particle, in a manner analogous to the production of a shock wave from a supersonic aircraft. The aperture half-angle θ of this propagating cone of light is given by

cos θ = 1/(βn)

in which β is the ratio of the particle’s velocity to that of light in a vacuum and n is the refractive index of the medium. In water-moderated and water-shielded research reactors, where the fuel elements are visible, the eerie blue glow in the water surrounding the elements is Cherenkov radiation, mostly from Compton electrons induced by high-energy γ rays.
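The condition β > 1/n and the cone relation cos θ = 1/(βn) translate directly into a threshold energy and an emission angle. The following sketch assumes an electron rest energy of 0.511 MeV and a refractive index of 1.33 for water; both are standard round values supplied here for illustration and are not taken from the text.

```python
import math

ELECTRON_REST_ENERGY_MEV = 0.511  # assumed standard value


def cherenkov_threshold_kinetic_mev(n, rest_energy_mev=ELECTRON_REST_ENERGY_MEV):
    """Kinetic energy at which beta = 1/n, i.e. the Cherenkov threshold."""
    if n <= 1.0:
        raise ValueError("Cherenkov emission requires n > 1")
    beta_min = 1.0 / n
    gamma_min = 1.0 / math.sqrt(1.0 - beta_min ** 2)
    return (gamma_min - 1.0) * rest_energy_mev


def cherenkov_half_angle_deg(beta, n):
    """Half-angle theta from cos(theta) = 1 / (beta * n), in degrees."""
    cos_theta = 1.0 / (beta * n)
    if cos_theta > 1.0:
        raise ValueError("particle is below the Cherenkov threshold")
    return math.degrees(math.acos(cos_theta))


N_WATER = 1.33  # assumed refractive index of water
print(cherenkov_threshold_kinetic_mev(N_WATER))  # threshold for electrons, MeV
print(cherenkov_half_angle_deg(1.0, N_WATER))    # limiting angle as beta -> 1
```

Under these assumptions, Compton electrons of a few hundred keV or more are already above threshold in water, which is consistent with the glow being attributed to electrons set in motion by high-energy γ rays.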


5. Proton Radiation Absorption

Protons interact primarily with atomic electrons, hence their effects, including biological effects, are similar to those of x rays. However, as seen in Fig. 2, their depth-dose pattern is quite different. The proton depth-dose curve, here shown for 200-MeV protons in water, is characterized by a relatively low entrance dose plateau region, followed by a sharp dose peak (the Bragg peak) near the end of the range. Beyond the Bragg peak is a very sharp cut-off, with no protons and no dose deposition reaching the areas deeper into the material. Protons scatter very little, hence the lateral dose fall-off is also very sharp. Thus, in cancer therapy involving tumors adjacent to critical structures which need to be spared, use of protons can avoid the use of isocentric treatments in which the source beam is rotated around the patient. If the tumor is larger than the typically 1–2 cm width of the Bragg peak, the peak can be spread by rotating variable-thickness absorbers intercepting the incident beam.

6. Charged Nuclei

Besides protons (hydrogen nuclei, Z = 1), nuclei of heavier atoms moving at high energy can cause trails of intense ionization. Examples of such particles include α particles from radon and its short-lived daughter products (5, 6, and 7.7 MeV), cosmic ray particles with energies extending up to the TeV region, ions in accelerator beams, and fission fragments. Light atoms produce a track of significant length. For example, a neon (Z = 10) ion at 40 MeV has a track length in polymer plastics approaching 100 µm. Over a core region having a diameter of a few nanometers, the dose imparted to the material in the track is 10^6 rad. Over a penumbra region having a diameter of about 10 µm the dose falls off to about 100 rad. Thus, the details of the effects of different energetic atoms on a solid vary strongly according to atomic weight. Heavy atoms in motion are encountered in cosmic rays and in fission fragments. During the fission of an atomic nucleus, a large amount of momentum is given to the two fragments of the nucleus. If the fission occurs in a solid, then the fragments are stopped rapidly. The resulting bulk damage to the solid is heavy and is localized in a small volume.

7. Neutrinos, ν

It is beyond the scope of this article to list and classify the entire “zoo” of known charged and uncharged particles and anti-particles which could be considered to be within the domain of “radiation physics.” However, the neutrino is here singled out for mention since it plays a role in the nuclear fusion processes energizing stars including our Sun as well as in supernovae and other astrophysical events. The neutrino was first postulated by W. Pauli in 1930 to avoid an apparent non-conservation of energy and linear momentum in radioactive β-decay. This elusive “massless” uncharged particle interacts with matter principally through the weak nuclear force, hence the cross section is so small that it can pass through the Earth with only a small probability of collision. Detectors consist of large tanks of water with the surface studded with inward-looking photomultiplier tubes to observe Cherenkov or other radiations from secondary products of neutrino collisions.

There are three known types of neutrinos, associated with electrons (νe), muons (νµ), and tau particles (ντ), although the last has not been observed directly.∗ The Sun’s power is thought to be generated principally by hydrogen nuclei (protons, p) undergoing fusion into helium (He, Z = 2) and two positrons (e+), according to

$$4p \rightarrow {}^{4}\mathrm{He} + 2e^{+} + 2\nu_e + 26.7\ \mathrm{MeV}$$

from which the neutrinos νe, even those produced deep in the core of the Sun, can easily escape to enable terrestrial detection and analysis. Currently, the observed solar neutrino flux is less than the theoretical predictions.
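The 26.7 MeV figure can be reproduced from a mass balance. The sketch below assumes standard atomic masses for hydrogen-1 and helium-4 and the usual conversion of 931.494 MeV per atomic mass unit; none of these values appear in this article. Using atomic (rather than nuclear) masses means the energy released when the two positrons annihilate is included in the total.

```python
U_TO_MEV = 931.494     # energy equivalent of 1 u, MeV (assumed standard value)
M_H1_U = 1.00782503    # atomic mass of hydrogen-1, u (assumed standard value)
M_HE4_U = 4.00260325   # atomic mass of helium-4, u (assumed standard value)


def fusion_energy_release_mev():
    """Energy released per helium-4 formed from four hydrogen atoms, in MeV."""
    mass_defect_u = 4.0 * M_H1_U - M_HE4_U
    return mass_defect_u * U_TO_MEV


print(fusion_energy_release_mev())
```

Part of this energy leaves the Sun with the two neutrinos, so the energy deposited locally is somewhat smaller than the total computed here.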

A rare example of neutrino participation in radioactivity, observed in geochemical experiments, is double β-decay, which would be forbidden were it not for the accompanying neutrino production:

$$\mathrm{Nucleus}(Z, A) \rightarrow \mathrm{Nucleus}(Z + 2, A) + 2e^{-} + 2\bar{\nu}_e$$

in which Z is the atomic number, A is the mass number, and the neutrinos emitted with the two electrons are electron antineutrinos.

8. Radiation Fields from Point and Extended Sources

This section applies mostly to photon sources. However, the analytical geometrical treatments given here are also applicable to other sources embedded in a medium with negligible scattering or absorption and, in the case of charged particles, negligible magnetic or electrostatic fields.

a. Point source. For a point isotropic (PTI) source of monoenergetic photons embedded in a homogeneous absorbing and scattering medium, as schematized in Fig. 6, the total response D of an isotropic detector at distance r from the source is

$$D = D^{\circ} + D_s$$

in which D° represents the response of the detector to photons arriving from the source without suffering a scattering interaction, and Ds the response of the detector to photons arriving indirectly, via scattering or other modes of secondary radiations. The unscattered response can be calculated according to

$$D^{\circ} = \frac{\sigma}{4\pi}\,\frac{\exp(-\mu r)}{r^{2}}$$

in which the source-strength constant (σ/4π) is the detector response at unit distance from a unit-strength point source in the absence of attenuation [e.g., σ/4π = 1.27 rad/hr in tissue 1 m from a 1-Ci 60Co source, from Table V.C]. Defining the buildup factor BPTI for a given photon energy E0, material Z, and distance r as

$$B_{\mathrm{PTI}} = D/D^{\circ} = (D^{\circ} + D_s)/D^{\circ}$$

we have

$$D = D^{\circ} \times B_{\mathrm{PTI}} = \frac{\sigma}{4\pi}\,B_{\mathrm{PTI}}\,\frac{\exp(-\mu r)}{r^{2}}.$$

FIGURE 6 Schematization of a point isotropic monoenergetic radiation source, and an isotropic radiation detector, embedded in a medium characterized by an attenuation coefficient µ and a buildup factor BPTI which accounts for scattered and other secondary radiations “seen” by the detector.

∗See, however, B. Schwarzschild (2000). “The tau neutrino has finally been seen,” Physics Today 53(10), 17–19.

The buildup factor BPTI has been expressed in a variety of parametrized formulations suitable for different applications, most of them listed and compared in the articles by Hubbell (1963) and by Harima (1993) in the bibliography.
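As a minimal sketch of the point-source relation D = (σ/4π) BPTI exp(−µr)/r², the function below takes the source-strength constant, attenuation coefficient, distance, and buildup factor as plain numbers. The 1.27 rad/hr constant is the 60Co example quoted in the text; the attenuation coefficient and buildup factor in the second call are hypothetical values chosen only to show the effect of a medium.

```python
import math


def point_source_response(source_constant, mu, r, buildup=1.0):
    """Detector response D = (sigma/4pi) * B_PTI * exp(-mu * r) / r**2.

    source_constant: response at unit distance without attenuation (sigma/4pi)
    mu:              attenuation coefficient, in reciprocal units of r
    r:               source-detector distance, in the same length unit
                     used to define source_constant
    buildup:         buildup factor B_PTI (1.0 means unscattered only)
    """
    if r <= 0.0:
        raise ValueError("distance must be positive")
    return source_constant * buildup * math.exp(-mu * r) / r ** 2


SIGMA_4PI_CO60 = 1.27  # rad/hr in tissue at 1 m from 1 Ci of 60Co (from the text)

# Bare source at 2 m: pure inverse-square fall-off.
print(point_source_response(SIGMA_4PI_CO60, mu=0.0, r=2.0))

# Hypothetical medium: mu = 0.5 per metre and B_PTI = 1.3 are illustrative only.
print(point_source_response(SIGMA_4PI_CO60, mu=0.5, r=2.0, buildup=1.3))
```

In practice the buildup factor depends on photon energy, material, and µr, so it would be looked up or computed from one of the parametrized formulations rather than passed as a constant.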

b. Finite plane isotropic source with exponential and inverse-square-law attenuation, and buildup factor. The detector response D to a finite plane isotropic source S such as shown in Fig. 7 can be expressed as the integral

$$D = \frac{\sigma}{4\pi}\int_{S}\frac{\exp(-\mu r)}{r^{2}}\,B_{\mathrm{PTI}}(\mu r)\,dS.$$

FIGURE 7 Finite plane isotropic source S uniformly covered with isotropically radiating material.

FIGURE 8 Schematization showing the geometry parameters for a rectangular plaque source, S, such as used in irradiation processing, in which the detector is at a perpendicular distance h from a corner of the rectangle of width w and length l.

FIGURE 9 Schematization for obtaining the detector response when the detector is not over a corner, but is displaced by distances α and β, respectively, from the corner of a plaque source with width a and length b.

For regular geometries such as the rectangular plaque source shown in Fig. 8, found in irradiation facilities, a convenient formulation for the buildup factor, which separates the shape (geometry) variables from the material-dependent penetration variables, is

$$B_{\mathrm{PTI}}(\mu r) = \exp(+\mu r)\sum_{n=0}^{N\ \mathrm{or}\ \infty} b_n\,\frac{(-\mu r)^{n}}{n!}$$

in which the bn coefficients are tabulated in Hubbell (1963), which also provides transforms for obtaining the bn’s from other formulations given in terms of polynomials or exponential combinations. The detector response from the extended source S can then be computed according to

$$D = \frac{\sigma}{4\pi}\sum_{n=0}^{N\ \mathrm{or}\ \infty} b_n\,q_n(\mathrm{geom})\,(\mu x)^{n}$$

in which x is a fixed distance for a given geometry. The geometry (shape) coefficients qn can be calculated analytically for a regular shape such as the rectangular source in Fig. 8 according to

$$q_n(\mathrm{geom}) = \int_{S}\left(-\frac{r}{x}\right)^{n}\frac{1}{r^{2}\,n!}\,dS$$

or in polar coordinates

$$q_n(\mathrm{geom}) = \int_{\Omega_S}\frac{(-1)^{n}\sec^{n+1}\theta}{n!}\,d\Omega_S$$

in which ΩS is the solid angle subtended by the source S with respect to the detector, and θ is the angle between r and x as indicated in Fig. 7 or between r and h as in Fig. 8.
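The power-series form of the buildup factor is easy to evaluate once a set of bn coefficients is available. The sketch below is a direct transcription of BPTI(µr) = exp(+µr) Σ bn (−µr)^n/n!; the coefficient lists used in the calls are hypothetical and serve only as sanity checks, not as tabulated data from Hubbell (1963).

```python
import math


def buildup_power_series(b_coeffs, mu_r):
    """Evaluate B_PTI(mu r) = exp(mu r) * sum_n b_n * (-mu r)**n / n!."""
    series = sum(
        b_n * (-mu_r) ** n / math.factorial(n)
        for n, b_n in enumerate(b_coeffs)
    )
    return math.exp(mu_r) * series


# Sanity check 1 (hypothetical coefficients): b_n = 1 for every n makes the
# series equal to exp(-mu r), so the buildup factor collapses to 1.
print(buildup_power_series([1.0] * 40, 2.0))

# Sanity check 2: keeping only b_0 = 1 gives B = exp(mu r), which exactly
# cancels the exponential attenuation (a "bare" source).
print(buildup_power_series([1.0], 2.0))
```

The second case is why the zeroth term of the series, with q0 alone, describes a source in a medium with negligible absorption and scattering.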

c. Rectangular plaque source. For a detector opposite a corner of a rectangular plaque source, analytical expressions and tabulations of the geometry coefficients qn are available in Hubbell et al. (1960). For the detector not opposite a corner, but displaced by distances α and β from the corner of a rectangular source of width a and length b, in units of the detector distance from the source plane (h = 1), as shown in Fig. 9, corner-position data can be combined according to

$$D(a, b; \alpha, \beta) = D(\alpha, \beta) + D(a - \alpha, b - \beta) + D(\alpha, b - \beta) + D(a - \alpha, \beta).$$

For a “bare” rectangular plaque source (embedded in a medium with negligible absorption and scattering), the corner-position (h = 1) detector response is given by the zeroth term of the above series,

$$D(a, b) = \frac{\sigma}{4\pi}\,q_0(a, b),$$

in which

$$q_0(a, b) = \int_{0}^{b}\frac{1}{\sqrt{\beta^{2}+1}}\tan^{-1}\frac{a}{\sqrt{\beta^{2}+1}}\,d\beta.$$

Although the rectangular source geometry coefficients qn(a, b) are available in closed form (Hubbell et al., 1960) for all n ≥ 1, the above integral for n = 0 is not soluble in closed form. However, a series solution which is rapidly convergent for all values of a and b, i.e., for 0 ≤ a ≤ b ≤ ∞, given in Hubbell et al. (1960), is


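$$q_0(a, b) = \frac{\pi}{2}\sinh^{-1}a \;-\; \frac{b}{a}\sum_{i=0}^{\infty}\frac{1}{2i+1}\left(\frac{a^{2}}{a^{2}+1}\right)^{i+1}\left\{\frac{\sqrt{a^{2}+1}}{b}\tan^{-1}\frac{\sqrt{a^{2}+1}}{b} \;-\; \sum_{j=0}^{i-1}\frac{2^{2j}(j!)^{2}}{(2j+1)!}\left(\frac{a^{2}+1}{a^{2}+b^{2}+1}\right)^{j+1}\right\}$$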

For all a ≤ b, only the terms i = 0 to i = 6 of the outer sum are required for 0.01% accuracy. The inner finite summation consists of the leading terms of the series expansion of the tan⁻¹ expression (with its factor) just before it, hence the very rapid convergence of the outer infinite summation.
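The convergence claim can be checked numerically by comparing the truncated series with a direct quadrature of the defining integral for q0(a, b). The sketch below uses a plain composite Simpson rule as the reference; the function names, the number of Simpson panels, and the test geometry are choices made here for illustration, not part of Hubbell et al. (1960). The series form requires a > 0.

```python
import math


def q0_quadrature(a, b, n=2000):
    """Reference value of q0(a, b) by composite Simpson's rule (n must be even)."""
    def integrand(beta):
        s = math.sqrt(beta * beta + 1.0)
        return math.atan(a / s) / s

    step = b / n
    total = integrand(0.0) + integrand(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * step)
    return total * step / 3.0


def q0_series(a, b, n_terms=7):
    """Truncated series for q0(a, b); n_terms = 7 keeps outer terms i = 0..6."""
    x = math.sqrt(a * a + 1.0) / b
    lead = x * math.atan(x)                        # factor times arctangent
    ratio = (a * a + 1.0) / (a * a + b * b + 1.0)
    outer = 0.0
    inner = 0.0                                    # inner sum over j = 0..i-1
    for i in range(n_terms):
        weight = (a * a / (a * a + 1.0)) ** (i + 1) / (2 * i + 1)
        outer += weight * (lead - inner)
        # Append the j = i term so the inner sum is ready for the next i.
        inner += (4.0 ** i * math.factorial(i) ** 2
                  / math.factorial(2 * i + 1) * ratio ** (i + 1))
    return 0.5 * math.pi * math.asinh(a) - (b / a) * outer


a, b = 1.0, 1.0  # illustrative geometry, in units of the detector height
print(q0_series(a, b), q0_quadrature(a, b))
```

Because each bracketed difference is the remainder of the arctangent expansion after i terms, successive outer terms shrink quickly, which is what makes a seven-term truncation adequate.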

Infinite strip “bare” source. For large b (b → ∞) the summation vanishes, leaving only

$$q_0(a, \infty) = (\pi/2)\sinh^{-1}a.$$

Hence the response of a detector at unit height centered above an infinite strip source of width 2a and surface activity σ, in a nonattenuating medium, is

$$D = 4(\sigma/4\pi)(\pi/2)\sinh^{-1}a = (\sigma/2)\sinh^{-1}a.$$
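The corner-superposition rule for a detector that is not opposite a corner can be illustrated for the bare-source coefficient q0. In this sketch the corner value is obtained by Simpson quadrature of the q0 integral, and the superposed result is compared with a brute-force midpoint integration of 1/(1 + x² + y²) over the whole rectangle. The geometry values are hypothetical, and the construction assumes the foot of the detector lies within the rectangle (0 ≤ α ≤ a, 0 ≤ β ≤ b).

```python
import math


def q0_corner(a, b, n=400):
    """Bare-source q0 for a detector at unit height over a corner (Simpson rule)."""
    if a <= 0.0 or b <= 0.0:
        return 0.0

    def integrand(beta):
        s = math.sqrt(beta * beta + 1.0)
        return math.atan(a / s) / s

    step = b / n
    total = integrand(0.0) + integrand(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * step)
    return total * step / 3.0


def q0_displaced(a, b, alpha, beta):
    """Superpose the four corner rectangles that meet under the detector."""
    return (q0_corner(alpha, beta) + q0_corner(a - alpha, b - beta)
            + q0_corner(alpha, b - beta) + q0_corner(a - alpha, beta))


def q0_direct(a, b, alpha, beta, n=400):
    """Independent check: 2D midpoint rule over the full rectangle."""
    dx, dy = a / n, b / n
    total = 0.0
    for i in range(n):
        x = -alpha + (i + 0.5) * dx
        for j in range(n):
            y = -beta + (j + 0.5) * dy
            total += 1.0 / (1.0 + x * x + y * y)
    return total * dx * dy


# Hypothetical plaque of 2 x 3 (in units of h) with the detector displaced
# by 0.5 and 1.0 from one corner.
print(q0_displaced(2.0, 3.0, 0.5, 1.0), q0_direct(2.0, 3.0, 0.5, 1.0))
```

The same decomposition applies term by term to the higher coefficients qn, since each is an integral over the source area.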

d. Circular disk plane isotropic source, off-axis. Similarly, for a unit-radius (r = 1) circular disk plane isotropic source as shown in Fig. 10, the response of a detector at height h (in disk radii) above the plane of the source and at distance ρ (in disk radii) off-axis can also be computed using disk source (off-axis) geometry coefficients qn(ρ, h) given analytically and in tables in Hubbell et al. (1961).

FIGURE 10 Source-detector geometry for a circular disk source, with the detector at an off-axis position, showing the relevant parameters ρ, h, φ, r and R, in which the linear dimensions are measured in radii of the source disk.

Analogous to the detector response coefficient for a bare rectangular source q0(a, b), the corresponding coefficient q0(ρ, h) for a detector at height h (in disk radii) from the plane of a disk source of unit radius, displaced from the disk axis by distance ρ (in disk radii), can be expressed in the closed form

$$q_0(\rho, h) = \pi\ln\!\left(\frac{1 + h^{2} - \rho^{2} + \sqrt{(1 + h^{2} - \rho^{2})^{2} + 4\rho^{2}h^{2}}}{2h^{2}}\right)$$

Formulas and tabulations of higher terms through n = 9 of qn(ρ, h) are given in Hubbell et al. (1961), along with formulas and data for treating anisotropic angular source distributions.
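The closed form for the bare disk coefficient q0(ρ, h) can be verified against a direct integration of 1/r² over the disk. The sketch below evaluates the closed form and compares it with a midpoint rule in polar coordinates centred on the disk axis; the grid sizes and the test position are illustrative choices made here.

```python
import math


def q0_disk(rho, h):
    """Closed-form q0 for a unit-radius disk; rho and h are in disk radii."""
    t = 1.0 + h * h - rho * rho
    return math.pi * math.log(
        (t + math.sqrt(t * t + 4.0 * rho * rho * h * h)) / (2.0 * h * h)
    )


def q0_disk_numeric(rho, h, n_s=400, n_phi=400):
    """Midpoint-rule integral of dS / r**2 over the unit disk."""
    ds = 1.0 / n_s
    dphi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_s):
        s = (i + 0.5) * ds                      # radial position on the disk
        for j in range(n_phi):
            phi = (j + 0.5) * dphi
            # Squared distance from the source element to the detector.
            r2 = h * h + s * s + rho * rho - 2.0 * s * rho * math.cos(phi)
            total += s / r2
    return total * ds * dphi


print(q0_disk(0.0, 1.0))   # on-axis value, equal to pi * ln(1 + 1/h**2)
print(q0_disk(0.5, 1.0), q0_disk_numeric(0.5, 1.0))
```

On the axis (ρ = 0) the expression reduces to π ln(1 + 1/h²), which follows from integrating 2πs ds/(s² + h²) over the disk radius.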

9. Monte Carlo Simulations of Radiation Transport

Although analytical methods of mapping radiation fields, such as in the examples above, are sometimes useful as benchmarks for simple or limiting situations, the availability of modern high-speed and large-memory computers has resulted in the wide use of Monte Carlo simulation. In this method, trajectories of photons and/or particles are determined by random numbers at each collision point in the medium, weighting the azimuthal and deflection angles by the probabilities, or differential cross sections, for the interactions with target atoms. For more extensive information on Monte Carlo simulations, see the reviews by Morin (1988) and by Jenkins et al. (1988).
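A toy example conveys the flavor of the method. The sketch below is deliberately much simpler than a real transport code: it samples only the distance to the first collision from the exponential free-path distribution and counts the histories that cross a slab without interacting, so the estimate should approach the analytical benchmark exp(−µx). Angular sampling from differential cross sections is not modeled. The attenuation coefficient, thickness, history count, and seed are hypothetical.

```python
import math
import random


def uncollided_fraction(mu, thickness, n_histories=200_000, seed=12345):
    """Monte Carlo estimate of the fraction of photons crossing a slab uncollided."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(n_histories):
        # Sample a free path from p(s) = mu * exp(-mu * s) by inversion;
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        path = -math.log(1.0 - rng.random()) / mu
        if path > thickness:
            passed += 1
    return passed / n_histories


mu, x = 1.0, 1.0  # hypothetical: one mean free path of material
print(uncollided_fraction(mu, x), math.exp(-mu * x))
```

A full simulation would continue each history after the collision, choosing the interaction type and the scattering angles from the relevant cross sections, and would tally the detector response from all surviving and secondary radiations.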

V. USEFUL RADIATION DATA

A. Radiation Units

1 rad = 100 erg g^-1 = 6.25 × 10^13 eV g^-1 = 10^-2 Gy

1 Mrad = 6.25 × 10^19 eV g^-1 = 10 kGy

1 Gy = 1 J kg^-1 = 100 rad

1 kGy = 10^5 rad = 100 krad

1 MGy = 10^8 rad = 100 Mrad

10^20 eV g^-1 = 1.6 Mrad = 16 kGy

1 roentgen (R) = 86.9 erg g^-1 (air) = 2.58 × 10^-4 coulombs kg^-1 (air)

1 R of 1-MeV photons ≡ 1.95 × 10^9 photons cm^-2.


This fluence deposits
0.869 rad in air
0.965 rad in water
0.865 rad in silicon
0.995 rad in polyethylene
0.804 rad in LiF
0.862 rad in Pyrex glass (80% SiO2).

One curie (Ci) of radioactive material produces 3.700 × 10^10 disintegrations per second.

A 1-Ci point source emitting one 1-MeV photon per disintegration gives an exposure of 0.54 R hr^-1 at 1 m.

A 1-Ci 60Co source gives 1.29 R hr^-1 at 1 m.

Photon flux at 1 m from a 1-Ci point source = 1.059 × 10^9 cm^-2 hr^-1 (assuming one γ-ray photon per disintegration).

SI units recommended by the International Commission on Radiation Units and Measurements (I.C.R.U.) [Brit. J. Radiology 49, 476 (1976)] are the following:

- Absorbed dose: the gray (Gy) = 100 rad = 1 J/kg
- Exposure: the coulomb per kilogram (no name given) = 1 C/kg
- Activity: the becquerel (Bq) = 1 sec^-1 = 2.703 × 10^-11 Ci, with the old units to be abandoned over 10 years.

Basic units of biological dose from radiation are the rem and the sievert (Sv). One rem (roentgen equivalent man) is an absorbed dose to the body of 1 rad weighted by a quality factor Q.F. that depends on the type of radiation involved. The weighting is needed because the energy absorbed from radiation is, by itself, an insufficient measure of biological damage.

1 rem = Q.F. × 1 rad

1 Sv = Q.F. × 1 Gy

Q.F. = 1 for X rays, γ, or β.

The values of Q.F. for α, neutrons, and heavy particles are greater than 1.
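The unit relations above lend themselves to small conversion helpers, which also provide a cross-check on the tabulated figures. The sketch below uses only relations stated in this section plus the electron-volt value from Section V.B; the function names are hypothetical, and the quality factor is left as an input because only the photon and β value (Q.F. = 1) is given numerically here.

```python
import math

RAD_PER_GY = 100.0               # 1 Gy = 100 rad
J_PER_EV = 1.602176462e-19       # from Section V.B
BQ_PER_CI = 3.700e10             # disintegrations per second per curie


def rad_to_gy(dose_rad):
    return dose_rad / RAD_PER_GY


def rad_to_ev_per_gram(dose_rad):
    # 1 rad = 100 erg/g = 1e-5 J/g
    return dose_rad * 1.0e-5 / J_PER_EV


def ci_to_bq(activity_ci):
    return activity_ci * BQ_PER_CI


def rem_from_rad(dose_rad, quality_factor=1.0):
    return quality_factor * dose_rad


def photon_flux_per_cm2_hr(activity_ci, distance_cm, photons_per_decay=1.0):
    """Unattenuated photon flux from an isotropic point source."""
    emitted_per_hr = ci_to_bq(activity_ci) * 3600.0 * photons_per_decay
    return emitted_per_hr / (4.0 * math.pi * distance_cm ** 2)


print(rad_to_ev_per_gram(1.0))             # compare with 6.25e13 eV/g
print(photon_flux_per_cm2_hr(1.0, 100.0))  # compare with 1.059e9 per cm^2 per hr
```

Both printed values agree with the tabulated figures to within the rounding used in the table.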

B. Useful General Constants

7 years = 3.682 × 10^6 min = 2.209032 × 10^8 sec

1 year = 5.25960 × 10^5 min = 3.155760 × 10^7 sec

1 day = 1.440 × 10^3 min = 8.6400 × 10^4 sec

1000 Å = 0.1 µm = 100 nm

1 mm = 0.03937 in.

0.001 in. = 25.4 µm

1 m^3 = 10^6 cm^3

1000 cm^3 = 10^-3 m^3

1 l = 1000.028 cm^3 = 0.219 976 gal

1 g cm^-3 = 1000 kg m^-3

∗1 eV = 1.602 176 462(63) × 10^-19 J

∗1 MeV = 1.602 176 462(63) × 10^-13 J

1 J = 10^7 erg

1 cal = 4.187 J

1 eV/molecule = 23.1 kcal/mol

Permittivity of free space ε0 = 8.86 × 10^-14 F cm^-1 = 8.86 × 10^-12 F m^-1 = 55.4 electronic charges V^-1 µm^-1

Permeability of free space µ0 = 1.26 × 10^-6 H m^-1

∗Electronic charge q = 1.602 176 462(63) × 10^-19 C

∗1 C cm^-2 = 6.241 509 74 × 10^18 electrons cm^-2

1 µA cm^-2 = 6.24 × 10^12 electrons cm^-2 sec^-1

∗Velocity of light = 2.997 924 58 × 10^8 m sec^-1

1 N = 10^5 dyn

1 mm Hg = 133.3224 N m^-2

∗Boltzmann’s constant k = 1.380 650 3(24) × 10^-23 J K^-1 = 8.617 342(15) × 10^-5 eV K^-1

kT at room temperature = 0.0259 eV

∗Planck’s constant h = 6.626 068 76(52) × 10^-34 J sec

∗Avogadro’s number = 6.022 141 99(47) × 10^23 mol^-1

∗Electron rest mass me = 9.109 381 88(72) × 10^-31 kg

∗Proton rest mass mp = 1.672 621 58(13) × 10^-27 kg

∗From P. J. Mohr and B. N. Taylor (2000). The Fundamental Physical Constants, Physics Today 53(No. 8, Part 2), BG6–BG13.


C. Dose Rates From γ Emitters^a

| Nuclide | Half-life | Principal γ energies (MeV) | Dose rate at 1 m from 1 Ci (rad/hr in tissue) |
|---|---|---|---|
| Antimony-124 | 60 day | 0.60; 0.72; 1.69; 2.09 | 0.94 |
| Arsenic-72 | 26 hr | 0.51^b; 0.63; 0.835 | 0.97 |
| Arsenic-74 | 18 day | 0.51^b; 0.596; 0.635 | 0.42 |
| Arsenic-76 | 26.5 hr | 0.56; 0.66; 1.21; 2.08 | 0.23 |
| Barium-140^c | 12.8 day | 0.16; 0.33; 0.49; 0.54; 0.82; 0.92; 1.60; 2.54 | 1.19 |
| Bromine-82 | 35.4 hr | 0.55; 0.62; 0.70; 0.78; 0.83; 1.04; 1.32; 1.48 | 1.40 |
| Caesium-137 | 30 yr | 0.662 | 0.32 |
| Cobalt-58 | 71 day | 0.51^b; 0.81; 1.62 | 0.53 |
| Cobalt-60 | 5.26 yr | 1.17; 1.33 | 1.27 |
| Gold-198 | 2.70 day | 0.412; 0.68; 1.09 | 0.22 |
| Iodine-131 | 8.04 day | 0.28; 0.36; 0.64; 0.72 | 0.21 |
| Iodine-132 | 2.3 hr | 0.52; 0.65; 0.67; 0.78; 0.95; 1.39 | 1.13 |
| Iridium-192 | 74 day | 0.296; 0.308; 0.316; 0.468; 0.605; 0.613 | 0.46 |
| Iron-59 | 45 day | 0.19; 1.10; 1.29 | 0.61 |
| Manganese-52 | 5.7 day | 0.51^b; 0.74; 0.94; 1.43 | 1.79 |
| Manganese-54 | 314 day | 0.84 | 0.45 |
| Potassium-42 | 12.4 hr | 1.52 | 0.13 |
| Radium-226^d | 1620 yr | 0.05–2.43 | 0.79 |
| Sodium-22 | 2.6 yr | 0.51^b; 1.28 | 1.15 |
| Sodium-24 | 15.0 hr | 1.37; 2.75 | 1.77 |
| Tantalum-182 | 115 day | 0.068; 0.100; 0.222; 1.12; 1.19; 1.22; 1.23 | 0.64 |
| Thulium-170 | 127 day | 0.052; 0.084 | 0.002 |
| Zinc-65 | 245 day | 0.51^b; 1.11 | 0.26 |

a Reprinted with permission from “The Radiochemical Manual.” The Radiochemical Centre, Amersham, England, 1966.
b 0.51-MeV γ rays from positron annihilation.
c Barium-140 in equilibrium with lanthanum-140.
d Radium-226 in equilibrium with daughter products; radiation filtered through 0.5 mm platinum; dose rate from 1 g.

SEE ALSO THE FOLLOWING ARTICLES

ATOMIC PHYSICS • COSMIC RADIATION • DOSIMETRY • HEALTH PHYSICS • NEUTRINOS • NUCLEAR PHYSICS • NUCLEAR REACTOR MATERIALS AND FUELS • NUCLEAR SAFEGUARDS • RADIATION SHIELDING AND PROTECTION • SOLAR PHYSICS

BIBLIOGRAPHY

Berger, M. J., and Hubbell, J. H. (1987). “XCOM: Photon Cross Sections on a Personal Computer,” National Bureau of Standards (now National Institute of Standards and Technology) Report NBSIR 87-3597. Current version of this database available at: http://physics.nist.gov/PhysRefData/Xcom/Text/XCOM.html

Chilton, A. B., Shultis, J. K., and Faw, R. E. (1984). “Principles of Radiation Shielding,” Prentice-Hall, Englewood Cliffs, NJ.

Christophorou, L. G. (1971). “Atomic and Molecular Radiation Physics,” Wiley, New York.

Franklin, A. (2000). “The Road to the Neutrino,” Physics Today 53(2), 22–28.

Fuller, E. G., and Hayward, E. (1976). “Photonuclear Reactions,” Dowden, Hutchinson & Ross, Stroudsburg, PA.

Greening, J. R. (1981). “Fundamentals of Radiation Dosimetry,” Adam Hilger Ltd., Bristol, UK.

Harima, Y. (1993). “An historical review and current status of buildup factor calculations and applications,” Rad. Phys. Chem. 41(4/5), 631–672.

Henke, B. L., Gullikson, E. M., and Davis, J. C. (1993). “X-ray interactions: Photoabsorption, scattering, transmission, and reflection at E = 50–30,000 eV, Z = 1–92,” Atomic Data Nucl. Data Tables 54(2), 181–342.

Hubbell, J. H., Bach, R. L., and Lamkin, J. C. (1960). “Radiation field from a rectangular source,” J. Res. Nat. Bureau Stand. 64C(2), 121–138.

Hubbell, J. H., Bach, R. L., and Herbold, R. J. (1961). “Radiation field from a circular disk source,” J. Res. Nat. Bureau Stand. 65C(4), 249–264.

Hubbell, J. H. (1963). “A power-series buildup factor formulation. Application to rectangular and off-axis disk source problems,” J. Res. Nat. Bureau Stand. 67C(4), 291–306.


Hubbell, J. H. (1969). “Photon Cross Sections, Attenuation Coefficients, and Energy Absorption Coefficients from 10 keV to 100 GeV,” National Bureau of Standards Reference Data Series NSRDS-NBS 29.

Hubbell, J. H., Veigele, Wm. J., Briggs, E. A., Brown, R. T., Cromer, D. T., and Howerton, R. J. (1975). “Atomic form factors, incoherent scattering functions, and photon scattering cross sections,” J. Phys. Chem. Ref. Data 4(3), 471–538, Erratum in 6(2), 615–617 (1977).

Hubbell, J. H., Gimm, H. A., and Øverbø, I. (1980). “Pair, triplet, and total atomic cross sections (and mass attenuation coefficients) for 1 MeV–100 GeV photons in elements Z = 1 to 100,” J. Phys. Chem. Ref. Data 9(4), 1023–1147.

Hubbell, J. H. (1982). “Photon mass attenuation and energy-absorption coefficients from 1 keV to 20 MeV,” Int. J. Appl. Rad. Isotopes 33(11), 1269–1290.

Hubbell, J. H. (ed.) (1993). “Radiation physics at 1993: A topical compendium,” Rad. Phys. Chem. 41(4/5), 579–789.

Hubbell, J. H., Trehan, P. N., Singh, N., Mehta, D., Garg, M. L., Garg, R. R., Singh, S., and Puri, S. (1994). “A review, bibliography, and tabulation of K, L, and higher atomic shell x-ray fluorescence yields,” J. Phys. Chem. Ref. Data 23(2), 339–364.

ICRU (1969). “Neutron Fluence, Neutron Spectra and Kerma,” ICRU Report 13, ICRU Publications, Bethesda, MD.

ICRU (1984). “Stopping Powers for Electrons and Positrons,” ICRU Report 37, ICRU Publications, Bethesda, MD.

ICRU (1993). “Stopping Powers for Protons and Alpha Particles,” ICRU Report 49, ICRU Publications, Bethesda, MD.

ICRU (1998). “Fundamental Quantities and Units for Ionizing Radiation,” ICRU Report 60, ICRU Publications, Bethesda, MD.

Jenkins, T. M., Nelson, W. R., and Rindi, A. (eds.) (1988). “Monte Carlo Transport of Electrons and Photons,” Plenum, New York.

Johns, H. E., and Cunningham, J. R. (1983). “The Physics of Radiology,” 4th ed., Charles C Thomas, Springfield, IL.

Knoll, G. F. (2000). “Radiation Detection and Measurement,” 3rd ed., Wiley, New York.

Koch, H. W., and Motz, J. W. (1959). “Bremsstrahlung cross-section formulas and related data,” Rev. Mod. Phys. 31(4), 920–955.

Lamarsh, J. R. (1983). “Introduction to Nuclear Engineering,” 2nd ed., Addison-Wesley, Reading, MA.

Morin, R. L. (ed.) (1988). “Monte Carlo Simulation in the Radiological Sciences,” CRC Press, Boca Raton, FL.

Motz, J. W., Olsen, H. A., and Koch, H. W. (1969). “Pair production by photons,” Rev. Mod. Phys. 41(4), 581–639.

Schwarzschild, B. (2000). “The tau neutrino has finally been seen,” Physics Today 53(10), 17–19.

Seltzer, S. M. (1993). “Calculation of photon mass energy-transfer and mass energy-absorption coefficients,” Rad. Res. 136(2), 147–170.

Winter, K. (ed.) (1991). “Neutrino Physics,” Cambridge University Press, Cambridge, U.K. and New York.


Encyclopedia of Physical Science and Technology en016b-749 July 31, 2001 15:25

Superconductivity

H. R. Khan
FEM and University of Tennessee at Knoxville

I. Introduction
II. Superconducting Materials
III. Correlation: Tc with the Electronic Structure of a Solid
IV. Flux Quantization
V. London Equation and Coherence Length
VI. Coherence Length and Energy Gap
VII. Thermodynamics of Superconductivity
VIII. Magnetic Superconductors
IX. Tunneling and the Josephson Effect
X. Theory of Superconductivity
XI. Applications of Superconductivity
XII. Recent Developments: High-Transition Temperature Superconductivity

GLOSSARY

Coherence length Correlation distance of the superconducting electrons.

Critical magnetic field Above this value of an externally applied magnetic field, a superconductor becomes nonsuperconducting (normal).

Energy gap Gap in the low-energy excitations of a superconductor.

Type I superconductor When an external magnetic field is applied on this superconductor, the transition from a superconducting to a normal state is sharp.

Type II superconductor When an external magnetic field is applied, the transition from a superconducting to a normal state occurs after going through a broad “mixed-state” region.

SUPERCONDUCTORS are materials that lose all their electrical resistivity below a certain temperature and become diamagnetic. High values of an externally applied magnetic field are required to destroy the superconductivity. These electrical and magnetic properties of superconducting materials have found applications in lossless electrical transmission and generation of high magnetic fields. Superconducting magnets are used where normal iron magnets are inadequate. These magnets are used as exciter magnets for homopolar generators or rotors in large alternators, and much gain in efficiency and power density is obtained. Future fusion reactors will use superconducting magnets for confining deuterium and tritium plasma. Superelectron pairs in a superconductor can tunnel through a nonconducting thin layer. Based on this “Josephson effect,” superconducting Josephson junctions are used as sensors, as high-energy electromagnetic radiation detectors, and in high-speed digital signal and data processing.

I. INTRODUCTION

A. Discovery of Liquid Helium Gas and Superconductivity

In 1908 Kamerlingh Onnes succeeded in liquefying helium gas, and this enabled him to measure the electrical resistivity of metals at lower temperatures, down to 4.2 K, the boiling temperature of liquid helium. He measured the electrical resistivity of gold, platinum, and mercury and found that the electrical resistivity of mercury disappeared almost completely below 4.2 K, as shown in Fig. 1. This state of a material in which the resistance is zero is called the superconducting state. The current flows without any attenuation in this state, and it has been estimated that the decay time of a current in a superconductor is about 100,000 years. The temperature below which a material loses its resistance is called the superconducting transition or critical temperature Tc.

FIGURE 1 Electrical resistance R (Ω) as a function of temperature for mercury metal (Hg).

B. Effect of a Magnetic Field on Superconductivity and the Meissner Effect

Meissner discovered that a bulk superconducting material behaves like a perfect diamagnet with zero magnetic induction in its interior. If a paramagnetic material is placed in a magnetic field, then the magnetic lines of force penetrate through the material. But when the same material is made superconducting by cooling to lower temperatures, then all the lines of force are expelled from the interior of this material. This is called the Meissner effect. Figure 2 shows a material in the normal and superconducting states in an externally applied magnetic field. When the strength of this externally applied magnetic field is increased slowly, a value is reached where the magnetic lines of force begin to penetrate the material and it becomes nonsuperconducting or normal. This particular value of the magnetic field above which the superconductivity is destroyed is called the critical magnetic field Hc(T) and is also a function of the temperature of the material. A typical example of the variation of Hc(T) with temperature T is shown in Fig. 3 for the metal mercury (Hg).

The variation of Hc(T) with temperature can be expressed by the equation

$$H_c(T) = H_c(0)\left[1 - (T/T_c)^{2}\right],$$

which has a parabolic form. This expression can also be derived using thermodynamics.
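The parabolic law is simple to tabulate. In the sketch below, the transition temperature of 4.2 K follows the approximate value quoted for mercury in the text, while the zero-temperature critical field of 0.041 T is an assumed illustrative figure that does not come from this article.

```python
def critical_field(temperature, t_c, h_c0):
    """Parabolic critical field H_c(T) = H_c(0) * [1 - (T/T_c)**2].

    Returns 0 at and above T_c, where the material is normal.
    """
    if temperature >= t_c:
        return 0.0
    return h_c0 * (1.0 - (temperature / t_c) ** 2)


T_C = 4.2      # K, approximate transition temperature of mercury (from the text)
H_C0 = 0.041   # tesla, assumed illustrative zero-temperature critical field

for temperature in (0.0, 1.0, 2.1, 3.0, 4.0, 4.2):
    print(temperature, critical_field(temperature, T_C, H_C0))
```

At half the transition temperature the critical field is three quarters of its zero-temperature value, and it falls steeply to zero as T approaches Tc.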

FIGURE 2 Normal and superconducting states of a material in an external magnetic field.


FIGURE 3 Variation of the superconducting transition temperature of mercury in an externally applied magnetic field.

C. Type I and Type II Superconductors

Based on the Meissner effect, superconducting materials are classified as type I and type II. When the magnetization 4πM of a superconducting material in the form of a cylinder with its axis parallel to the applied magnetic field is measured with an increasing magnetic field, and there is a sharp transition to the normal state above a certain value of the magnetic field Ba as shown in Fig. 4, then this type of material is called a type I superconductor. This kind of behavior is shown in general by pure metals. In contrast, the 4πM versus Ba behavior of a type II superconductor is shown in Fig. 5. The magnetic flux begins to penetrate the material slowly at a field value of Hc1 and continues to do so up to Hc2, where the material is transformed to a normal state. The superconducting state between the field value Hc1 and the value Hc2 is called the vortex or mixed state. The Hc2 value can be 100 or more times greater than Hc. This type II superconducting behavior is shown in general by alloys and compounds that are called dirty superconductors.

FIGURE 4 Magnetization 4πM as a function of externally applied magnetic field Ba of a type I superconductor.

FIGURE 5 Magnetization 4πM as a function of applied magnetic field Ba of a type II superconductor.

Some of the superconducting alloys and compounds of special structures possess very high values of Hc2. For example, the Hc2 value of a compound of composition Pb1Mo5.1S6 with a Chevrel phase structure is about 51 T. Very high magnetic fields can be generated by solenoids wound from wires made of superconducting materials of high Hc2. Commercial superconducting magnets capable of producing magnetic fields of more than 10 T are available and use wires of Nb–Ti and Nb–Sn alloys. The variation of Hc2(T) with temperature of some high-Hc2 alloys is shown in Fig. 6. A type I superconductor can be transformed to a type II superconductor by alloying. A typical example is shown in Fig. 7. Here lead (Pb) is a type I superconductor, and when it is alloyed with indium (In), the alloys show type II behavior and the values of Hc1 and Hc2 are a function of the composition. The current flowing through a superconducting wire produces a magnetic field, and when the value of the current is increased slowly, a value is reached where the magnetic field becomes equal to the critical magnetic field. This value of current is called the critical current.
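The definition of the critical current can be made quantitative for one simple case. The sketch below assumes a long, straight wire of circular cross section, for which Ampère's law gives a surface field B = µ0 I/(2πr); setting B equal to the critical field gives I_c = 2πr B_c/µ0. This relation is not stated in the text and applies most directly to a type I wire. The radius and critical field are hypothetical illustrative values.

```python
import math

MU_0 = 4.0e-7 * math.pi  # permeability of free space, H/m


def critical_current(radius_m, critical_field_t):
    """Current at which the self-field at the wire surface equals B_c.

    Assumes a long straight round wire: B_surface = mu_0 * I / (2 * pi * r).
    """
    return 2.0 * math.pi * radius_m * critical_field_t / MU_0


# Hypothetical wire: radius 0.5 mm, critical field 0.08 T.
print(critical_current(0.5e-3, 0.08))
```

For a fixed critical field the critical current scales with the wire radius rather than its cross-sectional area, because the limiting field is reached first at the surface.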

II. SUPERCONDUCTING MATERIALS

A. Elements, Compounds, and Alloys

The distribution of superconducting elements in the periodic system is shown in Fig. 8. Some elements do not

FIGURE 6 Upper critical magnetic fields Hc2 as a function of temperature for some superconductors of β-W structure. [From Berlincourt, T. G., and Hake, R. H. (1963). Phys. Rev. 131, 140; Fonner, S., McNiff, E. J., Jr., Matthias, B. T., Geballe, T. H., Willens, R. H., and Corenzwit, E. (1970). Phys. Lett. 31A, 349.]

become superconducting at all and others become so only under pressure. The superconducting transition temperatures along with their crystal structures and melting temperatures are listed in Table I. Niobium (Nb) metal has the highest superconducting transition temperature (9.2 K). The elements that become superconducting under pressure are listed in Table II.

The magnetic elements Mn, Fe, Co, and Ni do not become superconducting down to the lowest temperature available. Also, their presence in small amounts (of the order of parts per million) suppresses the superconducting transition temperature of other superconducting materials. The superconducting elements are classified into two groups. One group is the nontransition elements and consists of the elements Si, Ge, P, As, Sb, Bi, Se, and Te, which are not superconducting under normal conditions but become superconducting under pressure. The other group consists of the transition elements, which have unfilled 3d, 4d, and 5d shells. The crystal structure plays an important role in superconductivity. For example, as shown in Tables I and II, Bi is not superconducting, but different crystal modifications of it obtained by applying pressure exhibit superconductivity at temperatures ranging between 3.9 and 8.5 K.

FIGURE 7 Magnetization 4πM as a function of applied magnetic field Ba and change of type I (Pb) to type II (Pb–In) alloy superconductors.

Multicomponent alloys and compounds of different crystal structures exhibit superconductivity. High-transition temperature superconductivity occurs with cubic structure, and the most favorable is the one with the β-W structure. Compounds and alloys with superconducting transition temperatures above 20 K form this structure. The β-W structure is cubic and is shown in Fig. 9. Each face of the cubic lattice is occupied by two atoms that form orthogonal linear atomic chains. The highest superconducting transition temperature is 23 K, and it is exhibited by a compound of composition Nb3Ge with the β-W structure. Here the Nb atoms form the linear atomic chains, and the Ge atoms occupy the center and corner sites of the cubic lattice.

The reason materials of this particular structure show such high superconducting transition temperatures was explored by Labbe and Friedel. Their theoretical calculations, based on the tight binding approximation, suggest that materials of this structure possess an unusually high electron density of states at the Fermi surface; this is also experimentally confirmed. In addition, the d-band of these materials is narrower and taller compared with that of the transition metals. These are the factors that cause the enhancement of the superconducting transition temperature. Some of the high superconducting transition temperature materials are listed in Table III.

There are other kinds of superconducting materials, including low-carrier density superconductors (semimetal or semiconductor), intercalated compounds, amorphous superconductors, and organic superconductors, and they are described separately as follows.

B. Low-Carrier Density Superconductors

A class of materials that have carrier densities in the range of 10^18 to 10^21 cm^−3 are called semimetals because their carrier densities are between those of metallic conductors and semiconductors. Many of these materials are superconducting. For example, Fig. 10 shows that La3Se4, GeTe, SnTe, and SrTiO3 are superconducting, and the superconducting transition temperature increases with increasing carrier density except in the case of SrTiO3. For SrTiO3, the superconducting transition temperature begins to decrease above a carrier density of 10^20 cm^−3. This decrease is explained by the occurrence of the magnetic effect. All the above-mentioned materials investigated were in the form of single crystals.

FIGURE 8 Distribution of the superconducting elements in the periodic table. [From Khan, H. R. (1984). Gold Bull. 17(3), 94.]

C. Intercalated Compounds

A typical example of this class of materials is TaS2(C5H5N)1/2. This compound is formed when TaS2 is intercalated with pyridine (C5H5N), and metallic layers about 6 Å thick are separated by pyridine layers of the same thickness. This intercalated compound becomes superconducting at 3.5 K. A large number of transition metal chalcogenides exist that crystallize in layered structures. These types of compounds show anisotropic superconducting properties parallel and perpendicular to the layer surface. The critical magnetic field is about 30 times higher in the direction parallel to the layer surface compared with the perpendicular direction, as shown in Fig. 11.

D. Amorphous Superconductors

Unlike crystalline materials, amorphous or noncrystalline materials consist of atoms that do not form regular arrays and are randomly distributed. These amorphous materials can be obtained in the form of thin films by evaporation deposition on cold substrates. Amorphous materials in bulk can also be obtained by rapidly cooling an alloy melt. The amorphous materials obtained in this way are called metallic glasses. Materials of this class also exhibit superconductivity but are completely different from their crystalline counterparts. The superconducting transition temperature Tc and the electron per atom ratio (e/a) of some amorphous nontransition metals and alloys are listed below. One sees that the Tc values in the amorphous state are higher than those in the crystalline state.

Alloy        Tc (K)    e/a
Be           9.95      2.0
Be90Al10     7.2       2.1
Ga           8.4       3.0
Pb90Cu10     6.5       3.7
Bi           5         5

Amorphous films of transition metals and alloys have also been obtained, and the Tc values are lower than those of the

TABLE I Superconducting Transition Temperature Tc, Melting Temperature, and Crystal Structure of Elements

Element   Tc (K)        Crystal structure (a)   Melting temperature (°C)
Al        1.19          f.c.c.                  660
Be        0.026         Hex.                    1283
Cd        0.55          Hex.                    321
Ga        1.09          Orth.                   29.8
          (6.5, 7.5)
Hg        4.15          Rhom.                   −38.9
          (3.95)
In        3.40          Tetr.                   156
Ir        0.14          f.c.c.                  2450
La        4.8           Hex.                    900
          (5.9)
Mo        0.92          b.c.c.                  2620
Nb        9.2           b.c.c.                  2500
Os        0.65          Hex.                    2700
Pa        1.3           -                       -
Pb        7.2           f.c.c.                  327
Re        1.7           Hex.                    3180
Ru        0.5           Hex.                    2500
Sn        3.72          Tetr.                   231.9
          (5.3)         Tetr.
Ta        4.39          b.c.c.                  3000
Tc        7.8           Hex.
Th        1.37          f.c.c.                  1695
Ti        0.39          Hex.                    1670
Tl        2.39          Hex.                    303
U(α)      0.2           Orth.                   1132
V         5.3           b.c.c.                  1730
W         0.012         b.c.c.                  3380
Zn        0.9           Hex.                    419
Zr        0.55          Hex.                    1855

(a) f.c.c., face-centered cubic; hex., hexagonal; orth., orthorhombic; rhom., rhombohedral; tetr., tetragonal; b.c.c., body-centered cubic.

same alloys in crystalline form. Metallic glass superconductors are classified into two main groups. One group consists of metal–metal compositions and the other of metal–metalloid compositions. The superconducting transition temperatures of some of the metallic glass superconductors are listed in Table IV. These metallic glass superconductors show some desirable properties. For example, they are ductile and possess a high strength, whereas their crystalline counterparts are brittle. The metallic glass superconductors also possess very high values of the critical magnetic field.

A practical superconductor capable of producing a magnetic field of about 10 T should have a critical current density of about 10^6 A/cm^2. In general, the amorphous superconductors have low critical current densities. The current density can be increased by introducing some kind of inhomogeneities into the amorphous matrix. Some binary and ternary pseudoamorphous alloys of vanadium, hafnium, and zirconium metals possess reasonably high superconducting transition temperatures and very high values of critical magnetic fields and critical current densities. At the same time they also have good mechanical properties such as ductility and high tensile strength. These kinds of materials are promising future superconducting materials for generating magnetic fields above 10 T.

TABLE II Superconducting Transition Temperature Tc of Elements under Pressure

Element   Tc (K)      Pressure (kbar)
As        0.5         120
Ba        5.1         140
          (1.8)       55
Bi II     3.9         26
Bi III    7.2         27
Bi V      8.5         78
Ce        1.7         50
Cs        1.5         1000
Ge        5.4         110
Lu        0.1–0.7     130
P         4.6–6.1     100
Sb        3.6         85
Se        6.9         130
Si        6.7         120
Te        4.5         43
Y         1.5–2.7     120–160

E. Organic Superconductors

In 1964 Little proposed that an organic polymer can also become a superconductor. His theory of superconductivity is based on a mechanism entirely different from that of the Bardeen et al. (1957) theory of metals and alloys. Little suggested that the electrons on the spine of the polymer chain are attracted to each other by an indirect process

TABLE III Superconducting Transition Temperature Tc of Some β-W Structure Compounds

Compound           Tc (K)
V3Ga               14.2–14.6
V3Si               17.1
Nb3Au              11.0–11.5
Nb3Sn              18.0
Nb3Al0.8Ge0.2      20.7

FIGURE 9 The β-W lattice structure.

involving the fixed polar groups on the side branches of the polymer. He predicted a superconducting transition temperature of about 100 K using the molecular polarization mechanism. Experimentally, there are indications that polymers and organic compounds can become superconducting. For example, a polymer of formula (SN)x shows superconductivity at 0.3 K, and another organic compound, tetramethyltetraselenafulvalene (TMTSF), at 1 K under a pressure of 12 kbar.

FIGURE 10 Superconducting transition temperature Tc as a function of carrier density. [From Hulm, J. K., Ashkin, M., Deis, D. W., and Jones, C. K. (1970). Prog. Low Temp. Phys. VI, 205.]

FIGURE 11 Upper critical magnetic field Hc2 as a function of temperature parallel and perpendicular to the layered surface in TaS2(C5H5N)1/2. [From Gamble, F. R., et al. (1971). Science 174, 493.]

III. CORRELATION: Tc WITH THE ELECTRONIC STRUCTURE OF A SOLID

Matthias proposed empirically that the superconducting transition temperature Tc and the electron per atom ratio e/a of a solid are related. This Matthias empirical rule suggests that the maximum values of Tc for transition metals occur at e/a values of 5 and 7, as shown in Fig. 12. In the case of solid solutions of transition metals, a slight shift of the first maximum to an e/a value of 4.5 occurs, as shown in Fig. 13. Amorphous materials consisting of transition metals show a different behavior. Amorphous materials based on the transition metals of an unfilled 4d shell show only one maximum at an e/a ratio of 6.4, whereas materials

TABLE IV Superconducting Transition Temperature Tc of Some Metallic Glasses

Metallic glass            Tc (K)
La80Au20                  3.5
La80Ga20                  3.8
Zr75Rh25                  4.55
Zr70Pd30                  2.4
Nb60Rh40                  4.8
(Mo0.8Ru0.2)80P20         7.31
(Mo0.6Ru0.4)82B18         6.05
(Mo0.8Ru0.2)80P10B10      8.71

FIGURE 12 Superconducting transition temperature Tc as a function of electron per atom ratio e/a for the transition elements.

based on transition metals with an unfilled 5d shell have a maximum at an e/a of 7, as shown in Fig. 14. At these peak values of the superconducting transition temperatures, the values of the electron density of states are also maximum.

IV. FLUX QUANTIZATION

In 1950 F. London suggested that the magnetic flux trapped by a superconducting ring is quantized and the flux quantum is given by Φ0 = ch/2e = 2 × 10^−7 G cm^2, where c is the velocity of light, h is Planck's constant, and e is the electronic charge. The flux trapped in a superconductor is quantized and is equal to nΦ0, where n is an integer. In the case of type I superconductors where the Meissner effect is perfect, the value of n is zero. The flux quantization is observed only in the case of multiply connected geometries such as a superconducting ring. When the external magnetic field is removed, the magnetic flux trapped is equal to nΦ0. The flux quantization is expected to exist even in singly connected geometries in the case of type II superconductors because a mixed state exists in which the superconducting regions surround the lines of force and form a multiply connected system of filaments.

FIGURE 13 Superconducting transition temperature Tc as a function of electron per atom ratio e/a for solid solutions of the transition elements.

FIGURE 14 Superconducting transition temperature Tc as a function of the electron per atom ratio e/a for the amorphous 4d and 5d transition metals.
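The size of the flux quantum can be checked numerically. This is a minimal sketch assuming Gaussian (CGS) units, in which ch/2e comes out directly in G cm^2; the function names are hypothetical and chosen only for illustration.

```python
# Physical constants in Gaussian (CGS) units.
H_PLANCK = 6.62607015e-27  # Planck's constant, erg*s
C_LIGHT = 2.99792458e10    # velocity of light, cm/s
E_CHARGE = 4.80320471e-10  # electronic charge, esu


def flux_quantum_cgs():
    """Flux quantum ch/2e in G*cm^2."""
    return C_LIGHT * H_PLANCK / (2.0 * E_CHARGE)


def trapped_flux(n):
    """Flux trapped in a superconducting ring holding n flux quanta (G*cm^2)."""
    return n * flux_quantum_cgs()


phi0 = flux_quantum_cgs()
print(f"flux quantum = {phi0:.3e} G cm^2")
print(f"flux for n = 3: {trapped_flux(3):.3e} G cm^2")
```

The result is close to the rounded value of 2 × 10^−7 G cm^2 quoted in the text.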

V. LONDON EQUATION AND COHERENCE LENGTH

The magnetic field H and the supercurrent Js in a superconductor are related by the equation

$$\nabla \times \mathbf{H} = \frac{4\pi}{c}\,\mathbf{J}_s,$$

where c is the velocity of light. The free energy F of a system is given as

$$F = F_s + E_{kin} + E_{mag},$$

where Fs is the free energy of the electrons in the superconducting state. The kinetic energy Ekin is

$$E_{kin} = \frac{1}{2}\int_{vol} m V^2 n_s \, d\mathbf{r},$$

where V is the drift velocity for a parabolic band, ns is the number of superconducting electrons per unit volume, and m is the effective mass of the electrons. The magnetic energy Emag in a magnetic field H is

$$E_{mag} = \int \frac{H^2}{8\pi}\, d\mathbf{r}.$$

The free energy F can be written

$$F = F_s + \frac{1}{8\pi}\int \left[ H^2 + \lambda_L^2\, |\nabla \times \mathbf{H}|^2 \right] d\mathbf{r},$$

where λL is a constant,

$$\lambda_L = \left[ \frac{m c^2}{4\pi n_s e^2} \right]^{1/2},$$

and is called the London penetration depth. London obtained an equation by minimizing the free energy with respect to the field distribution:

$$\mathbf{H} + \lambda_L^2\, \nabla \times \nabla \times \mathbf{H} = 0.$$

When the current flows in the y direction in a superconductor, then the magnetic field in the z direction is

$$H_z = H(0)\, e^{-z/\lambda_L},$$

which shows that the magnetic field falls off exponentially inside a superconductor. The London penetration depth λL(0) at T = 0 K in terms of the Fermi velocity υF and the electron density of states N(0) is

$$\lambda_L(0) = \left[ \frac{3c^2}{8\pi N(0)\,\upsilon_F^2\, e^2} \right]^{1/2}.$$
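To get a feel for the magnitudes involved, the penetration depth formula and the exponential field decay can be evaluated directly. The sketch below assumes Gaussian (CGS) units, the free-electron mass for m, and a hypothetical superconducting electron density of 10^22 cm^−3; none of these numbers come from the text.

```python
import math

# Physical constants in Gaussian (CGS) units.
M_ELECTRON = 9.1093837e-28  # free-electron mass, g (assumed effective mass)
C_LIGHT = 2.99792458e10     # velocity of light, cm/s
E_CHARGE = 4.80320471e-10   # electronic charge, esu


def london_depth_cm(n_s_per_cm3, mass_g=M_ELECTRON):
    """London penetration depth [m c^2 / (4 pi n_s e^2)]^(1/2), in cm."""
    return math.sqrt(mass_g * C_LIGHT**2 / (4.0 * math.pi * n_s_per_cm3 * E_CHARGE**2))


def field_inside(h_surface, z_cm, lambda_cm):
    """Field at depth z below the surface, H(0) * exp(-z / lambda_L)."""
    return h_surface * math.exp(-z_cm / lambda_cm)


# Hypothetical density of superconducting electrons: 1e22 per cm^3.
lam = london_depth_cm(1.0e22)
print(f"lambda_L ~ {lam * 1e7:.0f} nm")
for depth_in_lambda in (0, 1, 3, 5):
    h = field_inside(100.0, depth_in_lambda * lam, lam)
    print(f"z = {depth_in_lambda} lambda_L: H = {h:.2f} G (surface field 100 G)")
```

With these assumed inputs the field is screened within a few tens of nanometers, so after about five penetration depths less than 1% of the surface field remains.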

VI. COHERENCE LENGTH AND ENERGY GAP

Coherence length is a measure of the correlation distance of the superconducting electrons and is denoted ξ0. The coherence length in terms of the Fermi velocity υF, Boltzmann constant KB, and superconducting transition temperature Tc is

$$\xi_0 = \frac{h\upsilon_F}{K_B T_c},$$

where h is Planck's constant.

One of the important features of superconductivity is the existence of a gap in the low-energy excitations, which is denoted ε. In most superconductors, an external energy E must be supplied to create an electron-hole pair close to the Fermi surface. This energy E is

$$E \geq 2\varepsilon.$$

The coherence length ξ0 and the energy gap are related by the equation

$$\xi_0 = \frac{h\upsilon_F}{\pi\varepsilon}.$$

Bardeen, Cooper, and Schrieffer (BCS), in 1957, related this energy gap to the formation of Cooper pairs. The formation of an energy gap in a superconductor is depicted in Fig. 15 for free electrons. According to the BCS theory,

FIGURE 15 Formation of an energy gap in the superconducting state.

electrons that have energies close to the Fermi energy form Cooper pairs easily. The paired states have a lower energy than the unpaired electrons that form them. The electron density of states n(E)-versus-energy E curve of a normal metal as shown in Fig. 16 changes to the curve in Fig. 17 when the normal metal becomes a superconductor. From the BCS theory, the relationship among the energy gap at 0 K, ε(0), the Boltzmann constant KB, and the superconducting transition temperature Tc is

$$\varepsilon(0) = 1.76\, K_B T_c.$$

The variation of the energy gap with temperature is shown in Fig. 18. The BCS theory is discussed later.
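The BCS relation between the zero-temperature gap and Tc is easy to evaluate. The following sketch applies ε(0) = 1.76 KB Tc to three transition temperatures taken from Table I; the function name is hypothetical, and the results are estimates from this relation only, not measured gaps.

```python
K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def bcs_gap_ev(tc_kelvin):
    """Zero-temperature gap parameter 1.76 * K_B * Tc, in eV."""
    return 1.76 * K_B_EV_PER_K * tc_kelvin


# Transition temperatures (K) as listed in Table I.
for element, tc in [("Al", 1.19), ("Pb", 7.2), ("Nb", 9.2)]:
    gap_mev = 1e3 * bcs_gap_ev(tc)
    print(f"{element}: Tc = {tc} K, estimated gap ~ {gap_mev:.2f} meV")
```

The gaps come out at the millielectronvolt scale or below, which is why far-infrared and microwave techniques are suited to probing them.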

The energy gap of a superconductor can be measured experimentally as follows. The absorption coefficients of longitudinal ultrasonic waves in the normal state αn and superconducting state αs are related by

$$\frac{\alpha_s}{\alpha_n} = \frac{2}{1 + \exp(\varepsilon/K_B T)}.$$

FIGURE 16 Electron density of states n(E) as a function of energy E for a normal metal.

FIGURE 17 Electron density of states n(E) as a function of energy E for an ideal superconductor.

The experimental determination of αn and αs at a particular temperature T enables one to determine the value of ε.
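Extracting ε from an attenuation measurement amounts to inverting the ultrasonic relation αs/αn = 2/(1 + exp(ε/KB T)), which gives ε = KB T ln(2αn/αs − 1). The sketch below shows this inversion with hypothetical function names and an illustrative gap and temperature that are not taken from the text.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def attenuation_ratio(gap_ev, temperature_k):
    """alpha_s / alpha_n = 2 / (1 + exp(gap / (K_B * T)))."""
    return 2.0 / (1.0 + math.exp(gap_ev / (K_B_EV_PER_K * temperature_k)))


def gap_from_attenuation(ratio, temperature_k):
    """Invert the relation above: gap = K_B * T * ln(2 / ratio - 1), in eV."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("alpha_s / alpha_n must lie in (0, 1]")
    return K_B_EV_PER_K * temperature_k * math.log(2.0 / ratio - 1.0)


# Round trip with an illustrative gap of 1.0 meV at T = 4.2 K.
ratio = attenuation_ratio(1.0e-3, 4.2)
recovered = gap_from_attenuation(ratio, 4.2)
print(f"alpha_s/alpha_n = {ratio:.4f}, recovered gap = {recovered * 1e3:.3f} meV")
```

A ratio of exactly 1 corresponds to zero gap, that is, the normal state.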

Absorption of electromagnetic waves in the far-infrared (λ ≈ 1 mm) region occurs for photons of energy hν = 2ε. Determination of the frequency of absorption directly gives the energy gap, as shown in Fig. 19. The specific heat is proportional to exp(−ε/KBT), and the value of ε can also be obtained from specific heat measurements.
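The absorption condition hν = 2ε fixes the frequency and wavelength at which absorption sets in. As a rough check that this falls in the far-infrared, the sketch below uses a hypothetical gap of 1.09 meV, roughly the value 1.76 KB Tc for Tc = 7.2 K; the function name is an assumption made for illustration.

```python
H_PLANCK_EV_S = 4.135667696e-15  # Planck's constant in eV*s
C_LIGHT_M_PER_S = 2.99792458e8   # velocity of light in m/s


def absorption_edge(gap_ev):
    """Return (frequency in Hz, wavelength in m) for photons with h*nu = 2*gap."""
    frequency_hz = 2.0 * gap_ev / H_PLANCK_EV_S
    wavelength_m = C_LIGHT_M_PER_S / frequency_hz
    return frequency_hz, wavelength_m


# Hypothetical gap of 1.09 meV.
freq, wavelength = absorption_edge(1.09e-3)
print(f"onset frequency ~ {freq / 1e9:.0f} GHz, wavelength ~ {wavelength * 1e3:.2f} mm")
```

For this assumed gap the onset wavelength is a fraction of a millimeter, consistent with the λ ≈ 1 mm region mentioned in the text.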

Tunneling experiments also give the value of ε directly. An experimental arrangement for the determination of ε is shown in Fig. 20. A superconductor with an energy gap ε is depicted as A and is separated from a normal conductor C through a thin insulating layer B. The shaded areas represent the occupied states. The Fermi level is at the center of the energy gap in the case of the superconductor. When a potential difference is applied across the insulating layer, electrons tunnel through the barrier B from C to A. This potential difference that causes the onset of the tunneling current is a direct measure of the energy gap ε. When both of the materials across the insulating layer are superconductors, the energy gaps of these two superconductors can be measured simultaneously from the potential difference-versus-current curve. A typical arrangement for these kinds of measurements is shown in Figs. 21 and 22.

FIGURE 18 Energy gap 2ε as a function of temperature T for a superconductor.

FIGURE 19 Absorption of electromagnetic waves as a function of frequency in a superconductor.

VII. THERMODYNAMICS OF SUPERCONDUCTIVITY

From basic thermodynamic considerations, we derive relations among the critical magnetic field of a

FIGURE 20 Tunneling of electrons through a thin insulating layer, B, between two superconductors, A and C.

FIGURE 21 Tunneling of electrons through a thin insulating layer of Al2O3 between the two superconductors lead (Pb) and aluminum (Al).

superconductor and the specific heats in the normal and superconducting states as well as the critical magnetic field and superconducting transition temperature Tc. Let us consider a material in the normal state with a negligible magnetization. Its Gibbs energy function Gn in the normal state is

$$G_n = U - TS + PV,$$

where U is the internal energy, T the temperature, S the entropy, P the pressure, and V the volume. For a superconductor in the presence of an external magnetic field, the magnetization is not negligible and the magnetic induction B is

$$B = H + 4\pi I,$$

where H is the applied magnetic field and I is the intensity of magnetization. In the case of a sharp superconducting transition for a long thin rod parallel to the field, B = 0 and

$$I = -H/4\pi.$$

The Gibbs function Gs(H) in the superconducting state per unit volume is

$$G_s(H) = U - TS + PV - \int_0^H I\, dH = G_s + \int_0^H \frac{H\, dH}{4\pi} = G_s + \frac{H^2}{8\pi},$$

where Gs on the right-hand side denotes the zero-field value.

FIGURE 22 Current I-versus-potential difference PD plot of a tunnel junction consisting of two superconductors with energy gaps ε1 and ε2.

Assuming a negligible volume change at the transition,

$$G_n = G_s(H_c),$$

where Hc is the critical magnetic field, and

$$G_n - G_s = H_c^2/8\pi.$$

Because

$$G = U + PV - TS \tag{1}$$

and

$$dG = dU + P\,dV + V\,dP - T\,dS - S\,dT,$$

using the first law of thermodynamics,

$$dQ = dU + P\,dV = T\,dS,$$

one obtains

$$dG = V\,dP - S\,dT,$$

which gives

$$S = -(\partial G/\partial T)_P. \tag{2}$$

Combining Eqs. (1) and (2) with the expression for Gn − Gs above,

$$S_n - S_s = -\frac{H_c}{4\pi}\,\frac{\partial H_c}{\partial T}. \tag{3}$$

The difference of the normal-state and superconducting-state entropies is expressed in terms of the critical field Hc and its slope ∂Hc/∂T. The specific heat per unit volume is

$$C = dQ/dT = T\,dS/dT,$$

so that

$$C_n - C_s = T\,\partial(S_n - S_s)/\partial T,$$

and Eq. (3) reduces to

$$C_s - C_n = \frac{T H_c}{4\pi}\,\frac{\partial^2 H_c}{\partial T^2} + \frac{T}{4\pi}\left(\frac{\partial H_c}{\partial T}\right)^2.$$

At the transition T = Tc and Hc = 0,

$$C_s - C_n = \frac{T_c}{4\pi}\left(\frac{\partial H_c}{\partial T}\right)^2, \tag{4}$$

where Cs and Cn are the specific heats in the superconducting and normal states. The specific heat Cs follows the relationship

$$C_s = BT^3,$$

where B is a constant, whereas Cn is given as

$$C_n = AT^3 + \gamma T.$$

Because

$$dQ = dU + P\,dV = T\,dS,$$

$$S = \int dQ/T = \int C\,dT/T.$$

Therefore,

$$S_s = \int BT^2\,dT = BT^3/3$$

and

$$S_n = \int (AT^2 + \gamma)\,dT = AT^3/3 + \gamma T,$$

and the difference is

$$S_n - S_s = \tfrac{1}{3}(A - B)T^3 + \gamma T.$$

At the transition temperature Tc (in zero field)

$$S_n = S_s,$$

thus

$$\tfrac{1}{3}(B - A)T_c^2 = \gamma$$

and

$$S_n - S_s = \gamma\left(T - T^3/T_c^2\right). \tag{5}$$

From Eqs. (3) and (5) it follows that

$$-\frac{H_c}{4\pi}\,\frac{\partial H_c}{\partial T} = \gamma\left(T - T^3/T_c^2\right)$$

or

$$\frac{\partial}{\partial T}\left(H_c^2\right) = 8\pi\gamma\left(T^3/T_c^2 - T\right).$$

Integrating, with Hc = H0 at T = 0 K,

$$H_c^2 = 8\pi\gamma\left(\frac{T^4}{4T_c^2} - \frac{T^2}{2}\right) + H_0^2. \tag{6}$$

When T = Tc, Hc^2 = 0; therefore

$$8\pi\gamma\left(T_c^2/2 - T_c^2/4\right) = H_0^2$$

or

$$\gamma = H_0^2/2\pi T_c^2. \tag{7}$$

Combining Eqs. (6) and (7),

$$H_c = H_0\left(1 - (T/T_c)^2\right).$$

This equation relates the critical magnetic field Hc of a superconductor with the critical temperature Tc and has a parabolic form. This conforms to the experimental observation shown in Fig. 3 for a type I superconductor, for which the relationship between Hc and T was found to be

$$H_c \cong H_0\left(1 - (T/T_c)^2\right).$$
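The chain from Eq. (6) and Eq. (7) to the parabolic law can be verified numerically. The sketch below uses hypothetical function names and illustrative inputs (H0 = 803 G, Tc = 7.2 K, chosen here as lead-like values and not taken from the text); it checks that Eq. (6) with γ from Eq. (7) reproduces the square of the parabolic Hc(T), and evaluates the specific heat jump of Eq. (4).

```python
import math


def gamma_from_h0(h0, tc):
    """Eq. (7): gamma = H0^2 / (2 pi Tc^2); erg cm^-3 K^-2 when H0 is in gauss."""
    return h0**2 / (2.0 * math.pi * tc**2)


def hc_squared_eq6(t, h0, tc):
    """Eq. (6): Hc^2 = 8 pi gamma (T^4 / (4 Tc^2) - T^2 / 2) + H0^2."""
    g = gamma_from_h0(h0, tc)
    return 8.0 * math.pi * g * (t**4 / (4.0 * tc**2) - t**2 / 2.0) + h0**2


def hc_parabolic(t, h0, tc):
    """Parabolic law Hc = H0 (1 - (T/Tc)^2)."""
    return h0 * (1.0 - (t / tc) ** 2)


def specific_heat_jump(h0, tc):
    """Eq. (4) at T = Tc, using the slope dHc/dT = -2 H0 / Tc of the parabolic law."""
    slope = -2.0 * h0 / tc
    return tc / (4.0 * math.pi) * slope**2


H0, TC = 803.0, 7.2  # illustrative values only
for t in (0.0, 2.0, 4.0, 6.0, 7.2):
    print(f"T = {t:3.1f} K: Hc = {hc_parabolic(t, H0, TC):7.1f} G, "
          f"Hc^2 from Eq. (6) = {hc_squared_eq6(t, H0, TC):10.1f} G^2")
print(f"gamma = {gamma_from_h0(H0, TC):.0f} erg cm^-3 K^-2")
print(f"Cs - Cn at Tc = {specific_heat_jump(H0, TC):.0f} erg cm^-3 K^-1")
```

Within this model the jump at Tc equals 2γTc, which follows from inserting the parabolic slope into Eq. (4) and using Eq. (7).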

VIII. MAGNETIC SUPERCONDUCTORS

Ferromagnetism and superconductivity have been considered to be mutually exclusive phenomena. It was assumed that the large internal magnetic field present in a ferromagnetic material would not allow it to become a superconductor. This is true, and so far none of the magnetic elements (for example, chromium, manganese, iron, cobalt, and nickel) have exhibited superconductivity. A search was made to find a material that exhibits superconductivity and ferromagnetism at different temperatures. Among the rare-earth elements, lanthanum is superconducting at 6 K. The other rare-earth elements are either paramagnetic or ferromagnetic, with magnetic moments that are due to 4f electrons. Matthias and co-workers dissolved gadolinium metal in lanthanum and measured the superconducting transition temperatures as a function of dissolved gadolinium. Figure 23 shows a plot of the superconducting transition temperature as a function of gadolinium dissolved

FIGURE 23 Superconducting transition temperature and ferromagnetic Curie point as a function of gadolinium (Gd) in La–Gd alloys. [From Matthias, B. T., and Suhl, H. (1960). Phys. Rev. Lett. 4, 51.]

FIGURE 24 Transitions to superconducting and ferromagnetic states in ErRh4B4. [From Fertig, W. A., Johnston, D. C., Maple, M. B., and Matthias, B. T. (1977). Phys. Rev. Lett. 38, 987.]

in lanthanum. The depression of Tc is a linear function of the gadolinium content up to approximately 1% gadolinium. More than 2.5% gadolinium in lanthanum makes it a ferromagnetic material.

These data suggest that an exchange interaction over conduction electrons leading to ferromagnetism is easy to bring about in an element that is itself a superconductor. This points to the possibility of a magnetic superconductor in which the phenomena of superconductivity and ferromagnetism overlap. A material of composition ErRh4B4 has been discovered that becomes superconducting at 8.7 K and shows ferromagnetic ordering at 0.93 K. The measurements of resistance and magnetic susceptibility as a function of temperature for ErRh4B4 are shown in Fig. 24. The anomalies at the temperatures of 0.93 and 8.7 K in the resistance and magnetic susceptibility curves correspond to the ferromagnetic ordering and superconducting transition temperatures.

Another Chevrel phase compound of composition HoMo6S8 exhibits superconducting and ferromagnetic transitions at 2.15 and 0.6 K, as shown in Fig. 25. The discovery of the coexistence of ferromagnetism and superconductivity in these ternary rare-earth molybdenum chalcogenides and rare-earth rhodium borides has opened a new field of investigation on the interactions responsible for ferromagnetism and superconductivity.

IX. TUNNELING AND THE JOSEPHSON EFFECT

In 1962 Josephson predicted theoretically that if two superconductors were separated by a thin (∼10-Å) insulating film, then the superconducting electron pairs would

FIGURE 25 Transitions to the superconducting and ferromagnetic states in HoMo6S8. [From Ishikawa, M., and Fischer, O. (1977). Solid State Commun. 23, 37.]

tunnel through this junction. The tunneling current would flow without any voltage across the junction between the two superconductors. When the maximum dc current is exceeded, a dc voltage develops across the junction. This voltage is hν/2e, where e is the charge on the electron, h is Planck's constant, and ν is the frequency of the photon radiated by the electron pair while tunneling across the junction. The maximum zero-voltage current J across the junction is

$$J = J_0 \sin\left(\delta_0 + \frac{2e}{c\hbar}\int \mathbf{A}\cdot d\mathbf{s}\right),$$

where δ0 is a constant, ħ is Planck's constant divided by 2π, and A is the vector potential. This shows that the current is a periodic function of the flux passing through the junction at right angles to the current and that the period is equal to the quantum of flux hc/2e. The Josephson prediction was experimentally confirmed by Anderson and Rowell, who showed that a zero-voltage current flows through a thin insulating layer between the two superconductors. The maximum value of this current oscillates with the external magnetic field, as shown in Fig. 26.
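Two features of the Josephson relations lend themselves to a quick numerical illustration. The sketch below assumes the standard relation hν = 2eV between junction voltage and radiated frequency (SI units), and rewrites the phase term as 2πΦ/Φ0, with Φ the flux through the junction and Φ0 = hc/2e the flux quantum. The function names and the example values are hypothetical.

```python
import math

E_CHARGE_C = 1.602176634e-19   # electronic charge, C
H_PLANCK_J_S = 6.62607015e-34  # Planck's constant, J*s


def josephson_frequency_hz(voltage_v):
    """Frequency radiated at junction voltage V, assuming h * nu = 2 e V."""
    return 2.0 * E_CHARGE_C * voltage_v / H_PLANCK_J_S


def junction_current(j0, delta0, flux, flux_quantum):
    """J = J0 sin(delta0 + 2 pi flux / flux_quantum); flux in the same units as flux_quantum."""
    return j0 * math.sin(delta0 + 2.0 * math.pi * flux / flux_quantum)


f = josephson_frequency_hz(1.0e-6)  # 1 microvolt across the junction
print(f"frequency at 1 microvolt ~ {f / 1e6:.1f} MHz")

# Periodicity check: adding one flux quantum leaves the current unchanged.
j_a = junction_current(1.0, 0.3, 0.25, 1.0)
j_b = junction_current(1.0, 0.3, 1.25, 1.0)
print(f"J at 0.25 flux quanta = {j_a:.6f}, at 1.25 flux quanta = {j_b:.6f}")
```

The large frequency-to-voltage ratio is what makes Josephson junctions useful as detectors and sources in the microwave range.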

X. THEORY OF SUPERCONDUCTIVITY

In 1950 Ginzburg and Landau proposed a model for superconductors in which an order process in a superconductor is described in terms of an order parameter ψ, where ψ represents the fraction of conduction electrons in the superconducting momentum state. This model contained expressions for the momentum and kinetic energy of superelectrons and described the magnetic behavior of superconductors very well, but a basic interaction mechanism was still lacking.

FIGURE 26 Tunneling current as a function of applied magnetic field H on a tunnel junction consisting of two superconductors. [From Rowell, J. M. (1963). Phys. Rev. Lett. 11, 200.]

In 1957 Bardeen et al. proposed a theory of superconductivity in which they expressed the superconducting transition temperature in terms of an interaction between the electrons and the lattice vibrations of a solid. The quanta of lattice vibrations in a solid are called phonons. According to this theory, when the temperature of a solid is lowered, an interaction between the electrons and the phonons causes an attractive force between the conduction electron pairs called Cooper pairs. These Cooper pairs are paired states with equal and opposite momentum at zero supercurrent. When a current is applied to a superconductor, all the electron pairs have the same momentum directed parallel to the electric field. Due to this coherent motion, the pairs do not collide with the lattice and there is no electrical resistance. The expression for the superconducting transition temperature Tc is

$$K_B T_c = 1.14\,\hbar\omega_c \exp\left(-\frac{1}{N(0)V}\right). \tag{8}$$

This equation is valid for N(0)V ≪ 1. Here N(0) is the electron density of states, V the net attractive potential between the electrons, ħ Planck's constant divided by 2π, and ωc the principal phonon frequency. The temperature Tc is extremely sensitive to small changes in V. This theory successfully explains most of the physical property changes associated with the superconducting transition. It is rather difficult to calculate the superconducting transition temperature itself using this theory. It should be mentioned that in all critical phenomena, the critical temperatures are most difficult to calculate. For example, it is not easy to calculate the freezing or boiling point of water.
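The sensitivity of Tc to the coupling in Eq. (8) can be seen with a short calculation. The sketch below makes two assumptions that are not in the text: the phonon energy ħωc is expressed as a characteristic temperature θ = ħωc/KB, and the values θ = 300 K and N(0)V = 0.25 or 0.30 are purely illustrative.

```python
import math


def bcs_tc(theta_kelvin, coupling):
    """Eq. (8) with h-bar * omega_c written as K_B * theta: Tc = 1.14 theta exp(-1/(N(0)V)).

    `coupling` is the dimensionless product N(0)V; the formula is intended
    for weak coupling, N(0)V much smaller than 1.
    """
    if coupling <= 0.0:
        raise ValueError("N(0)V must be positive")
    return 1.14 * theta_kelvin * math.exp(-1.0 / coupling)


# Hypothetical inputs: theta = 300 K, N(0)V = 0.25 and 0.30.
tc_a = bcs_tc(300.0, 0.25)
tc_b = bcs_tc(300.0, 0.30)
print(f"N(0)V = 0.25: Tc ~ {tc_a:.1f} K")
print(f"N(0)V = 0.30: Tc ~ {tc_b:.1f} K")
print(f"a 20% increase in N(0)V raises Tc by a factor of {tc_b / tc_a:.2f}")
```

Because the coupling sits in the exponent, a modest uncertainty in V translates into a large uncertainty in the calculated Tc, which is one reason first-principles predictions of Tc are difficult.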

It has been observed experimentally that the superconducting transition temperature of an element varies with the isotope mass. For example, for the isotopes of mercury, Tc varies between 4.185 and 4.146 K, whereas the average atomic mass M varies between 199.5 and 203.4. In Eq. (8) ωc is proportional to 1/√M, where M is the atomic mass, and V is independent of M in the BCS equation.

Therefore Tc should be proportional to 1/√M, and this dependence has been observed in the case of several elements such as tin, mercury, and indium. The term N(0)V occurring in the BCS theory can be further expressed in terms of two parameters: the electron–phonon interaction parameter, λ; and µ*, which describes the normalized Coulomb repulsion of electrons. This modification of the BCS theory was suggested by McMillan for the strong coupling superconductors, λ ≫ µ*, where the original BCS theory is not valid. The modified expression for Tc is given by the expression

$$T_c = \frac{\Theta_D}{1.45}\exp\left[-\frac{1.04(1+\lambda)}{\lambda - \mu^*(1 + 0.62\lambda)}\right], \tag{9}$$

where ΘD is the Debye temperature. The electron–phonon interaction parameter λ is

$$\lambda = \eta/M\omega_c^2,$$

where η is a constant for a given structure class. Maximization of Tc in Eq. (9) with respect to ωc gives

$$T_c(\max) = (\eta/2M)^{1/2}\exp(-3/2). \tag{10}$$

Substituting suitable parameters into Eq. (10), a maximum Tc value of 35 K is calculated. It must be mentioned that, at present, the highest measured Tc is 23 K, for Nb3Ge.
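Both the isotope scaling and the McMillan expression, Eq. (9), can be evaluated in a few lines. In the sketch below the mercury numbers are the ones quoted in the text, while the McMillan inputs (ΘD = 275 K, λ = 0.82, µ* = 0.13) are illustrative values assumed for the example; the function names are hypothetical.

```python
import math


def isotope_scaled_tc(tc_ref, mass_ref, mass_new):
    """Scale Tc with isotope mass assuming Tc proportional to 1/sqrt(M)."""
    return tc_ref * math.sqrt(mass_ref / mass_new)


def mcmillan_tc(theta_d, lam, mu_star):
    """Eq. (9): Tc = (Theta_D / 1.45) exp[-1.04 (1 + lam) / (lam - mu* (1 + 0.62 lam))]."""
    denominator = lam - mu_star * (1.0 + 0.62 * lam)
    if denominator <= 0.0:
        raise ValueError("lam must exceed mu_star * (1 + 0.62 * lam)")
    return theta_d / 1.45 * math.exp(-1.04 * (1.0 + lam) / denominator)


# Mercury: Tc = 4.185 K at M = 199.5, scaled to M = 203.4 (values from the text).
tc_heavy = isotope_scaled_tc(4.185, 199.5, 203.4)
print(f"predicted Tc at M = 203.4: {tc_heavy:.3f} K (observed 4.146 K)")

# Illustrative McMillan inputs (assumptions, not from the text).
tc_mcmillan = mcmillan_tc(275.0, 0.82, 0.13)
print(f"McMillan Tc ~ {tc_mcmillan:.1f} K")
```

The 1/√M scaling reproduces the quoted mercury shift to within a few millikelvin, which is the kind of agreement that supported the phonon mechanism.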

XI. APPLICATIONS OF SUPERCONDUCTIVITY

Since its discovery, superconductivity has found many applications in technology. Because the electrical resistance in a superconductor is almost zero, large and homogeneous fields can be generated simply by winding coils of wires made from superconducting materials with high critical transition temperatures and critical magnetic fields. In the last decade much effort has gone into the development of these superconducting materials. Magnetic fields below and above 10 T can be produced using superconducting wires made from Nb–Ti and Nb–Sn alloys. These superconducting magnets have found a broad range of

applications where normal iron magnets are inadequate. These superconducting magnets can be used as exciter magnets for homopolar generators or rotors in large alternators, and a large gain in efficiency and power density is obtained.

Fusion reactors will employ superconducting magnets to confine plasma in which deuterium and tritium will be fused to produce energy. Soon the six D-shaped superconducting coils of dimensions 2.5 × 3.5 m manufactured in the United States and in European countries will be tested to produce a magnetic field of 8 T, which will be used to confine deuterium–tritium plasma to produce fusion energy. Superconducting magnets have found use in particle beam accelerators for high-energy particle physics research. Another application of superconducting magnets is in nuclear magnetic resonance tomography, which requires a homogeneous magnetic field; superconducting magnets are ideal for this purpose.

Another application of superconductors is as magnetic sensors. As mentioned earlier, a Josephson junction is an extremely nonlinear detector which, when connected to a loop of superconducting wire, forms a superconducting quantum interference device (SQUID). These SQUIDs are extremely sensitive to small changes in magnetic fields. Based on the Josephson junction, high-frequency electromagnetic radiation detectors for frequencies in the microwave range have been developed. Josephson junction technology also finds applications in digital signal and data processing owing to its high speed and low power dissipation compared to semiconductor technology. The Josephson junction can replace semiconductor technology where high speed, ultrahigh performance, reliability, lower power, and compactness are required. Other applications of superconductors include lossless transport of electrical energy and generation of magnetic fields for levitation and propulsion for high-speed ground transportation.

XII. RECENT DEVELOPMENTS: HIGH-TRANSITION TEMPERATURE SUPERCONDUCTIVITY

Until April 1986, the maximum superconducting transition temperature, measured in Nb3Ge, was ∼23 K. This limited superconducting transition temperature allowed large- and small-scale applications of superconductors only with the use of liquid helium. Decades of experimental and theoretical research work showed that the phenomenon of superconductivity could be explained by the attraction of electrons caused by electron–phonon interaction (BCS theory). It was suggested that, based on this mechanism, a superconducting transition temperature of ∼35 K could be achieved. These conclusions were based on research on about 24,000 superconducting inorganic phases. In 1986, J. G. Bednorz and K. A. Müller published a paper in Zeitschrift für Physik on the possibility of a superconducting transition temperature as high as 30 K in a mixture of lanthanum and barium–copper oxide (La2−xBax–CuOx) (x ∼ 0.15) of tetragonal K2NiF4 structure. This discovery broke all previous records and received world attention, and the two authors received the 1987 Nobel Prize.

In a short time, superconducting oxides with transition temperatures in the ranges 30–40 K, 90–100 K, and above 100 K were discovered. At present, the highest achievable superconducting transition temperature under normal conditions is about 133 K. The superconducting oxides of ∼90 K superconducting transition temperature are rare-earth barium–copper oxides of orthorhombic structure. The oxygen content in these oxides plays a major role in the superconductivity. When the oxygen content is reduced, the oxides transform to a tetragonal structure and become semiconducting. Superconducting transition temperatures above 100 K are observed in thallium-, bismuth-, strontium-, calcium-, and copper-based oxides.

All these materials are ceramics and brittle, not ductile like metals or alloys, and the electronic properties are highly anisotropic. The critical current density is high in one direction and low in the perpendicular direction. The epitaxial thin films of some of these oxides show critical current densities of 10^6 A/cm^2 at liquid nitrogen temperature. The critical current density of these materials in the polycrystalline state is very low and not suitable for technical applications. The coherence length in these ceramic superconductors is quite small and is comparable to the lattice constants. These materials show rather strong electron–electron interactions, for example, as reported by Steiner et al. (1988). Therefore there is increasing evidence that the electron pairing in the superconducting state is of a purely electronic nature, as suggested by Anderson (1987), and not caused by electron–phonon interaction.

The mechanical properties of these ceramic superconductors as well as their superconducting properties may be improved by the addition of silver metal as reported by Khan et al. At present, a worldwide effort is ongoing to improve the mechanical properties and to increase the critical current densities of these materials for large-scale applications. Once the mechanical properties of ceramic superconductors are improved and the critical current density is increased to a practical value, it is expected that these superconducting materials will revolutionize various technologies by working at liquid nitrogen, rather than liquid helium, temperatures.


SEE ALSO THE FOLLOWING ARTICLES

CRYOGENIC PROCESS ENGINEERING • CRYOGENICS • FERROMAGNETISM • RARE EARTH ELEMENTS AND MATERIALS • SUPERCONDUCTING CABLES • SUPERCONDUCTING DEVICES • SUPERCONDUCTIVITY MECHANISMS • SUPERCONDUCTORS, HIGH TEMPERATURE • THERMOELECTRICITY

BIBLIOGRAPHY

Anderson, P. W. (1987). Science 235, 1196.
Bardeen, J., Cooper, L. N., and Schrieffer, J. R. (1957). Phys. Rev. 108, 1175.
Barone, A., and Paterno, G. (1982). “Physics and Applications of the Josephson Effect,” Wiley, New York.
Bednorz, J. G., and Müller, K. A. (1986). Z. Phys. B64, 189.
Buckel, W. (1972). “Supraleitung,” Physik Verlag GmbH, Weinheim, Germany.
Khan, H. R. (1984). Gold Bull. 17(3), 94.
Khan, H. R. (1998). J. Superconduct 11, 1.
Khan, H. R., and Loebich, O. (1995). Physica C 254, 15.
Khan, H. R., and Raub, C. J. (1985). Annu. Rev. Mater. Sci. 15, 21.
Kittel, C. (ed.) (1976). “Introduction to Solid State Physics,” 5th ed. Wiley, New York.
Newhouse, V. L. (1964). “Applied Superconductivity,” Wiley, New York.
Putlin, S. N., and Antipov, E. V. (1993). Nature 362, 226.
Roberts, B. W. (1976). J. Phys. Chem. Data 5(3), 581–821.
Saint-James, D., Sarma, G., and Thomas, E. J. (1969). “Type II Superconductivity,” Pergamon, Oxford.
Steiner, P., et al. (1988). Z. Phys. B69, 449.


Encyclopedia of Physical Science and Technology EN016A-750 July 31, 2001 15:31

Superconductivity Mechanisms

Jozef Spalek
Jagiellonian University and Purdue University

I. Introduction
II. The Bardeen–Cooper–Schrieffer (BCS) Theory: A Brief Summary
III. Normal and Magnetic States of Correlated Electrons
IV. Novel Mechanisms of Electron Pairing
V. Conclusions

GLOSSARY

Almost-localized Fermi liquid A metallic system which, under a relatively small change of an external parameter such as temperature, pressure, or composition, undergoes a transition to the Mott insulating state. In such a metal electrons have a large effective mass. At low temperatures the system may order antiferromagnetically or undergo a transition to the superconducting state. Both nonstoichiometric oxides (such as V2O3−y) and heavy-fermion systems (e.g., UPt3) are regarded as almost-localized Fermi liquids.

Bardeen–Cooper–Schrieffer (BCS) theory Theory describing properties of superconductors in terms of the concept of pairing of electrons with opposite spins and momenta. The pairing of electrons is mediated by a dynamic positive-ion lattice deformation, which produces a resultant attractive interaction overcoming their mutual coulomb repulsion. At a critical temperature the electron system undergoes a phase transition to a condensed state of pairs which is characterized by a zero dc electrical resistance and a strong diamagnetism (Meissner–Ochsenfeld effect). The condensed state is destroyed by the application of a magnetic field (the critical fields Hc and Hc2 for superconductors of the first and second kinds, respectively).

Correlated electrons Electrons with their kinetic (or band) energy comparable to or lower than the magnitude U of electron–electron repulsion. This situation is described by the condition U ≳ W, where W is the width of a starting (bare) energy band. Strictly speaking, we distinguish between the limits of almost-localized Fermi liquids, for which U ≲ W, and the limits of strongly correlated electrons (Tomonaga–Luttinger or spin liquids), for which U ≫ W. The term “correlated electrons” means that the motion of a single electron is correlated with that of others in the system (for example, its effective mass depends on the two-particle correlation function).

Exchange interaction Part of the coulomb interaction between electrons which depends on the resultant spin state of their partially filled d or f shells. If the spin–singlet configuration is favored in the ground state, then the interaction is called antiferromagnetic. The exchange interaction provides a mechanism of magnetic ordering in Mott insulators; it may also correlate electrons into singlet or triplet pairs in the metallic state, particularly when the pair-exchange coupling J of an electron pair is comparable to the kinetic energy of each of its constituents, as is the case for strongly correlated electrons. The superexchange (or kinetic exchange) is induced by a strong electron correlation.

Fermi liquid Term describing the state of interacting electrons in a metal. Equilibrium properties of such systems are modeled by a gas of electrons with renormalized characteristics such as the effective mass (they are called quasiparticles). The properties at low temperatures are determined mainly by electrons near the Fermi surface. The electron–electron interactions lead to specific contributions to the transport properties of such a system producing, e.g., sound-wave and plasmon excitations.

High-temperature superconductors Oxide materials of the type La2−xSrxCuO4 or YBa2Cu3O7−x, which have a layer structure in which the principal role is played by electrons confined to the CuO2 planes. The term “high-temperature superconductors” (HTS) was coined to distinguish these and other oxide superconductors with a critical temperature Tc ≳ 20 K from “classical” superconductors, which comprise metals and intermetallic compounds such as Nb3Ti with Tc ≲ 23 K. At present, this class of materials (HTS) is characterized by the quasi-two-dimensional structure of the normal metallic state above Tc and strong deviations from either the normal Fermi-liquid or the BCS superconducting type of behavior in the corresponding temperature regimes T > Tc and T < Tc, respectively.

Hubbard subband Term describing each of the two parts of an energy band in a solid which splits when the electron–electron repulsion energy is comparable to (or larger than) their kinetic (band) energy. The Hubbard splitting of the original band induced by the interaction explains in a natural way the existence of the Mott insulating state in the case of an odd number of electrons per atom (that is, when the atomic shells would normally form an only half-filled band; cf., e.g., CoO).

Mott insulator An insulator containing atoms with partially filled 3d or 4f shells. These systems order magnetically (usually antiferromagnetically) when the temperature is lowered. Thus, they differ from ordinary (Bloch–Wilson) or band insulators, which are weakly diamagnetic and are characterized by filled atomic shells, separated from empty states by a gap. In the antiferromagnetic phase of the Mott insulators each electron with its (frozen) spin oriented up is surrounded by electrons with their spin down, and vice versa. The parent stoichiometric materials for high-temperature superconductors (e.g., La2CuO4 and YBa2Cu3O6) are antiferromagnetic Mott insulators with Néel temperatures TN = 250 and 415 K, respectively.

Real-space pairing Source of attraction or superconducting correlations that is not induced by lattice deformation (phonons). Such pairing may be provided by the density fluctuations within the interacting electron subsystem (e.g., by spin fluctuations or other excitations). By real-space pairing we mean the pairing of electron spins in correlated metals caused by exchange interactions (e.g., kinetic exchange) among electrons in coordinate space. The essence of the real-space pairing, not resolved as yet, is contained in the question, Can a strong short-range part of the coulomb repulsion (of range a0) lead to an attraction (an effective binding) at intermediate distances (2–10 a0), where strong singlet–spin correlations prevail?

Strongly correlated electrons Electrons describing the metallic state of high-temperature superconductors, some heavy-fermion systems (non-Fermi liquids), and, particularly, systems of low dimensionality, d = 1 and 2. In these systems, the concept of a Fermi liquid is inapplicable, and for d = 1, at least, the charge and spin degrees of freedom lead to separate quasiparticle representations, holons and spinons, respectively. The quantum liquid describing strongly correlated electrons composes a new quantum macrostate.

I. INTRODUCTION

Superconductivity remains among the most spectacular manifestations of a macroscopic quantum state of electrons in a metal or plasma. Experimentally, one observes below a characteristic temperature Tc a transition to a phase with nonmeasurable dc resistance (or with a persistent current), a perfect diamagnetism of bulk samples in a weak magnetic field, and quantum tunneling between superconductors separated by an insulating layer of mesoscopic (∼1-nm) thickness. In the theoretical domain, one studies the quantum-mechanical (nonclassical) mechanisms of pairing of the microscopic particles (fermions) at a macroscopic scale. Here, we summarize briefly our present understanding of the Bardeen–Cooper–Schrieffer (BCS) theory of “classical” superconductors (see Section II) and we review the current theoretical approaches to new superconductors: the heavy-fermion materials and the high-Tc magnetic oxides. The latter subject is discussed in Section IV, after we summarize normal-state properties of correlated electrons in Section III.


A brief characterization of the recent studies of superconductivity is in order. From the time of the first discovery (1911) of superconductivity in mercury (at temperature Tc ≃ 4.2 K) by Kamerlingh Onnes until 1986, studies were limited to low temperatures, T < 25 K. During the next 5 years, six classes of new superconducting compounds with critical temperatures Tc = 30 K (for Ba1−xKxBiO3), 40 K (for La2−xSrxCuO4), 90 K (for YBa2Cu3O7−δ), 110 K (for Bi2Sr2CaCu2O8), 125 K (for Tl2Ca2Ba2Cu3O10−y), and 135 K (for HgBa2Ca2Cu3O8) were discovered and/or thoroughly studied in a number of laboratories. In recent years the idea of pairing has also been applied to new systems such as Fermi-condensed dilute gases and quark–gluon plasma in high-energy physics. Apart from the discovery of spin–triplet pairing in liquid 3He, evidence for it in Sr2RuO4 also opens new possibilities for pairing studies.

The starting point in both classical and new superconducting materials is the electronic structure that determines the metallic properties in the normal phase (that is, above Tc). In this respect, the classical superconductors are well described by band theory and, in some cases, by starting from the Fermi-liquid concept. In contrast, the new materials are characterized as those whose electrons are close to localization, that is, those close to the metal–insulator transition of the Mott–Hubbard type. The latter transition may be induced by a relatively small change in compound composition (cf. the behavior of La2−xSrxCuO4 or YBa2Cu3O7−x as a function of x). It is quite interesting to note that oxides such as YBa2Cu3O7−x may be synthesized in either insulating (x ≳ 0.65) or metallic states. Additionally, antiferromagnetic ordering of the 3d electrons is observed close to the insulator–metal transition; the magnetic insulating state transforms into a superconducting state when 0 ≤ x ≲ 0.65. Therefore, an account of our understanding of the almost-localized metallic state in a normal or magnetic (that is, nonsuperconducting) phase is highly desirable and is summarized in Section III. The antiferromagnetic insulating, normal metallic, and superconducting states must all be treated on the same footing for a proper characterization of high-Tc oxides. In this manner, the studies of those systems must incorporate the description of different quantum phase transitions. One can say that the theory of strongly correlated electrons and of the superconductivity in those systems poses one of the most challenging problems for physics of the 21st century.

Details of the electronic structure in high-Tc oxides are also important for two additional reasons. First, as discussed later, in these superconductors the coherence length is quite small, that is, comparable to the lattice constant. Hence, the details of the wave function on the atomic scale become crucial. Second, a whole class of models (discussed in Section IV) relies on the electron pairing induced by short-range electron–electron interactions. These interactions are strong and also present in the normal phase. This is the reason one must develop a coherent theoretical picture of the correlated metallic state that undergoes a transformation either to the Mott insulating or to the superconducting state. Such a theory does not yet exist.

In this chapter, the properties of correlated electrons in normal, insulating, magnetic, and superconducting phases are reviewed and related to the parametrized models, starting from either Hubbard or Anderson-lattice Hamiltonians. These are the models that describe the properties of correlated metallic systems in terms of a few parameters, such as the band width W of starting (uncorrelated, bare) electrons, the magnitude U of short-range (intraatomic) coulomb interactions, etc. Such models provide an overall understanding of both the nature of correlated metallic and insulating ground states and the underlying thermodynamic properties of these systems. However, the guidance of detailed band structure calculations is often needed in choosing appropriate values for the microscopic parameters, as well as to understand the specific features of the compounds.

II. THE BARDEEN–COOPER–SCHRIEFFER (BCS) THEORY: A BRIEF SUMMARY

The BCS theory [1–10] relies on three features of metallic solids: (1) the electron–lattice interaction; (2) the formation of an electron-pair bound state (the so-called Cooper pair state) due to the coupling of the electrons to the lattice; and (3) the instability of the normal metallic state with respect to the formation of a macroscopic condensed state of all pairs (k↑, −k↓) with antiparallel spins in momentum (k) space. The condensed state exhibits the principal properties of superconductors, such as a perfect diamagnetism, zero dc resistance, etc. We first discuss these three features briefly and then summarize some consequences of the BCS theory. The BCS theory not only deals with one of the possible (phonon-mediated) mechanisms for superconductivity, but also provides proper language for the description of such a condensed state in general terms, independent of the particular pairing mechanism. One should also remark at the beginning that such a condensed state of pairs cannot be regarded as a Bose condensed state if the size of the bound-state wave function ξ (the coherence length) is much larger than the interparticle distance a = (V/N)^{1/3}; this happens for the “classic” superconductors.


A. From Electron–Phonon Coupling to the Effective Attractive Interaction between Electrons: Virtual Exchange of Phonons

The electron–lattice interaction can be described by introducing phonons as quasiparticles representing vibrational modes of the lattice. In this picture, an electron moving in a solid and scattering off a lattice vibration absorbs or emits a phonon with energy ħωq and quasi-momentum ħq. If during such processes the energy of the incoming electron (with energy εk and momentum ħk) and the scattered electron (with energy εk′) is conserved, then a real scattering process has taken place. Such events lead to the nonzero resistivity of metals at temperature T > 0. For these processes

$$\varepsilon_{\mathbf{k}'} - \varepsilon_{\mathbf{k}} = \pm\hbar\omega_{\mathbf{q}}, \qquad (1)$$

where − corresponds to the emission and + to the absorption of the phonon. However, in the quantum-mechanical description of scattering processes, there also exist virtual processes that do not conserve energy. Such events involve the emission and subsequent reabsorption of a phonon in a time interval Δt such that the uncertainty principle ΔE · Δt ≥ ħ is not violated. The uncertainty of particle energies ΔE is related to the magnitude of the electron–phonon interaction. In effect, this leads to the following effective electron–electron interaction energy involving a pair (k, k′) of electrons:

$$V_{\mathbf{k}\mathbf{k}'\mathbf{q}} = |W_{\mathbf{q}}|^2 \frac{\hbar\omega_{\mathbf{q}}}{(\varepsilon_{\mathbf{k}'} - \varepsilon_{\mathbf{k}})^2 - (\hbar\omega_{\mathbf{q}})^2}, \qquad (2)$$

where (k′ − k) = q, and Wq is the electron–phonon matrix element characterizing the process of single emission or absorption of the phonon by the electron subsystem. In many-electron systems, one represents Eq. (2) by an effective electron–electron interaction, which can be written

$$H' = \sum_{\mathbf{k}\mathbf{k}'\mathbf{q}} V_{\mathbf{k}\mathbf{k}'\mathbf{q}}\, c^{+}_{\mathbf{k}+\mathbf{q}\sigma} c^{+}_{\mathbf{k}'-\mathbf{q}\sigma'} c_{\mathbf{k}'\sigma'} c_{\mathbf{k}\sigma}. \qquad (3)$$

FIGURE 1 (a) Scattering diagram of electrons with wave vectors k → k + q, accompanied by emission of the phonon of wave vector −q. (b) Virtual emission and subsequent reabsorption of the phonon by electrons. The two processes drawn combine into the contribution [Eq. (2)] leading to the effective electron–electron attraction.

This is a phonon-mediated contribution to the interaction between electrons. More precisely, in this expression ckσ symbolizes a destruction or annihilation of an electron in the initial single-particle state |kσ⟩, whereas c+k′σ′ is the creation of an electron in the state |k′σ′⟩ after the scattering process has taken place. The processes represented in Eq. (3) of destruction of the electron pair in the states |kσ⟩ and |k′σ′⟩ and their subsequent reestablishment in the final states |k + qσ⟩ and |k′ − qσ′⟩ are customarily represented by a diagram of the type in Fig. 1b. It symbolizes the phonon exchange between the two electrons moving through the crystal. The virtual processes are composed of two parts: one describing phonon emission and the subsequent reabsorption process and one describing the reverse process.

One should note that if in Eq. (2) |εk′ − εk| < ħωq, then Vkk′q < 0, that is, the interaction is attractive. This happens, for example, on the Fermi surface, where εk = εk′ = µ. The sign of the interaction changes rapidly once we depart from the Fermi surface, since the electronic energies present are much higher than those of phonons. Hence, provided that the magnitude of the attraction exceeds the magnitude of the coulomb repulsion between the electrons in a given medium, there is a net attraction between the electrons. Such a net attractive interaction results in a stable superconducting state, as we shall see next.
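The sign change of Eq. (2) can be checked numerically. The following is a minimal sketch; the function name and the numerical values (in eV) are illustrative assumptions, not parameters taken from the text.

```python
def effective_interaction(eps_k, eps_kp, hbar_omega_q, w_q):
    """Eq. (2): V = |W_q|^2 * hw_q / ((eps_k' - eps_k)^2 - hw_q^2)."""
    return abs(w_q) ** 2 * hbar_omega_q / ((eps_kp - eps_k) ** 2 - hbar_omega_q ** 2)

# Assumed, purely illustrative values in eV (energies measured from mu).
hw = 0.02   # phonon energy
W = 0.05    # electron-phonon matrix element

v_near = effective_interaction(0.0, 0.005, hw, W)   # |eps_k' - eps_k| < hw
v_far = effective_interaction(0.0, 0.5, hw, W)      # |eps_k' - eps_k| > hw
v_fermi = effective_interaction(0.0, 0.0, hw, W)    # both electrons on the Fermi surface

print(v_near < 0, v_far > 0, v_fermi)
```

With both electrons on the Fermi surface the expression reduces to −|Wq|²/ħωq, the most attractive value away from the resonance at |εk′ − εk| = ħωq.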

B. Instability of the Electron Gas State in the Case of Attractive Interaction between Electrons: Cooper Pairs

Following Fröhlich’s discovery [11] that the electron–electron attraction can be mediated by phonons (cf. the previous discussion), the next step was taken by Cooper [12], who asked what happens when two electrons are added to an electron gas at T = 0. Because of the Pauli exclusion principle, they must occupy the states above the Fermi level, as shown in Fig. 2. Cooper showed that if the attractive potential in Eq. (2) is approximated by a negative nonzero constant (−V) in the energy interval 2ħωp around the Fermi level εF (cf. Fig. 2), then such a potential introduces a binding between these two electrons with a binding energy

$$\Delta = -2\hbar\omega_p \left[\exp\left(\frac{2}{\rho V}\right) - 1\right]^{-1} \simeq -2\hbar\omega_p \exp\left(-\frac{2}{\rho V}\right), \qquad (4)$$

relative to the energy 2εF of those two particles placed at the Fermi energy. In this expression, ħωp represents the average phonon energy (related to the Debye temperature θD through ħωp = kBθD), and ρ is the density of free-particle states at the Fermi energy εF for the metal under consideration.

FIGURE 2 Schematic representation of the conduction band filled with electrons up to the Fermi level εF. The density of states is ρ(ε) (per one spin direction). The two electrons added to the system attract each other with the potential Vkk′ = −V if placed within the energy interval ħωp counting from εF. The attraction leads to a binding energy Δ below εF for the pair configuration (k↑, −k↓).
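To see how good the weak-coupling form in Eq. (4) is, one can compare the two expressions directly. This is a sketch only; the function names and the values ρV = 0.3 and ħωp = 0.02 eV are assumptions chosen for illustration.

```python
import math

def cooper_binding_exact(hbar_omega_p, rho_v):
    """Left-hand form of Eq. (4): -2 hw_p / (exp(2 / rho V) - 1)."""
    return -2.0 * hbar_omega_p / (math.exp(2.0 / rho_v) - 1.0)

def cooper_binding_weak(hbar_omega_p, rho_v):
    """Weak-coupling form of Eq. (4): -2 hw_p * exp(-2 / rho V)."""
    return -2.0 * hbar_omega_p * math.exp(-2.0 / rho_v)

hw_p = 0.02    # assumed average phonon energy, eV
rho_v = 0.3    # assumed dimensionless coupling

exact = cooper_binding_exact(hw_p, rho_v)
weak = cooper_binding_weak(hw_p, rho_v)
print(exact, weak, exact / weak)
```

The ratio of the two forms is 1/[1 − exp(−2/ρV)], so the approximation improves rapidly as ρV decreases.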

A few important features of the bound state represented by Eq. (4) should be mentioned. First, the binding energy Δ is largest for the state of the pair at rest, that is, with the total pair momentum k1 + k2 = 0. Thus, Δ represents the binding energy of the pair (k, −k). Second, the spin of the pair is compensated, that is, a singlet state is assumed. Finally, the bound state has a lower energy than a pair of free particles placed at the Fermi level. Hence, the electron gas state is unstable with respect to such pair formation. A system of such pairs may condense into a superfluid state. However, the situation is not so simple since the size of the pair is of the order

$$\xi = \langle r^2 \rangle^{1/2} \approx \frac{2\hbar V_F}{\Delta} \approx \frac{\hbar V_F}{k_B T_c} \sim 10^{4}\ \mathrm{\AA}, \qquad (5)$$

where VF is the Fermi velocity for electrons. The quantity ξ thus exceeds by far the average classical distance between the electrons, which is comparable to the interatomic distance a ∼ 1–2 Å. In other words, the wave functions of the different pairs overlap very strongly, forming a condensed and coherent state of pairs in the superconducting phase. The properties of this condensed phase are discussed next. The new length scale ξ appearing in the system when electrons are bound into Cooper pairs is called the coherence length.
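The order of magnitude quoted in Eq. (5) can be reproduced with a short estimate. The sketch below assumes a Fermi velocity of 10^6 m/s, Tc = 10 K, and a = 2 Å; these inputs are typical illustrative values, not numbers given in the text.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K

def coherence_length(v_fermi, t_c):
    """Estimate xi ~ hbar * V_F / (k_B * T_c) from Eq. (5), in metres."""
    return HBAR * v_fermi / (K_B * t_c)

v_f = 1.0e6         # assumed Fermi velocity, m/s
t_c = 10.0          # assumed critical temperature, K
a_angstrom = 2.0    # assumed interatomic distance, angstrom

xi_angstrom = coherence_length(v_f, t_c) * 1.0e10
print(xi_angstrom, xi_angstrom / a_angstrom)
```

Under these assumptions ξ comes out at several thousand ångströms, that is, thousands of interatomic distances, which is why the pairs overlap so strongly.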

C. Properties of the Superconducting State: The Pairing Theory

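The BCS theory [1] provides a method of calculating the ground state, thermodynamic, and electromagnetic properties of a superconductor treated as a condensed state of electron pairs with opposite momenta and spins. The starting microscopic Hamiltonian is

$$H = \sum_{\mathbf{k}\sigma} \varepsilon_{\mathbf{k}} n_{\mathbf{k}\sigma} + \sum_{\mathbf{k}\mathbf{k}'} V_{\mathbf{k}\mathbf{k}'}\, c^{+}_{\mathbf{k}\uparrow} c^{+}_{-\mathbf{k}\downarrow} c_{-\mathbf{k}'\downarrow} c_{\mathbf{k}'\uparrow}. \qquad (6)$$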

The first term describes the single-particle (band) energy, εk being the energy per particle and nkσ = c+kσ ckσ the number of particles in the state |kσ⟩. The second term describes the pairing part [Eq. (3)] for the system of pairs that scatters from the state (k′, −k′) into the state (k, −k). This term describes the dominant contribution of all processes contained in Eq. (3) (cf. Ref. 10).

To obtain eigenenergies of the Hamiltonian [Eq. (6)], one can use either the variational method due to Schrieffer [2], the transformation method developed by Bogoliubov and Valatin [13], or the two-component method due to Nambu [13]. To obtain quasiparticle states in the superconducting phase, one has to combine an electron in the state |k↑⟩ with one in the time-reversed state |−k↓⟩. More precisely, one defines new quasiparticle operators λ+k0 and λ+k1, which are expressed by the operators c+ and c in the following manner:

$$\lambda^{+}_{\mathbf{k}0} = u_{\mathbf{k}} c^{+}_{\mathbf{k}\uparrow} - v_{\mathbf{k}} c_{-\mathbf{k}\downarrow},$$

and

$$\lambda^{+}_{\mathbf{k}1} = v_{\mathbf{k}} c^{+}_{\mathbf{k}\uparrow} + u_{\mathbf{k}} c_{-\mathbf{k}\downarrow}.$$


The coefficients of the transformation fulfill the condition uk² + vk² = 1. One should note that the transformation does not conserve the particle number, so one has to add the term (−µN) to the Hamiltonian (6), where µ ≡ εF is the chemical potential in the superconducting state.

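The single-particle excitations in the superconducting phase are specified by

$$E_{\mathbf{k}} = \left[(\varepsilon_{\mathbf{k}} - \mu)^2 + |\Delta_{\mathbf{k}}|^2\right]^{1/2}, \qquad (7)$$

where µ is the chemical potential of the system and |Δk| is the so-called superconducting gap determined from the self-consistent equation

$$\Delta_{\mathbf{k}} = -\sum_{\mathbf{k}'} V_{\mathbf{k}\mathbf{k}'} \frac{\Delta_{\mathbf{k}'}}{2E_{\mathbf{k}'}} \tanh\left(\frac{\beta E_{\mathbf{k}'}}{2}\right), \qquad (8)$$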

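with β ≡ 1/(kBT). One should note that if Vkk′ is approximated by a negative constant, then Δk = Δ; Eq. (8) then yields as a solution either Δ ≡ 0 or Δ ≠ 0, obeying the equation

$$1 = \frac{V}{N} {\sum_{\mathbf{k}}}' \frac{1}{2E_{\mathbf{k}}} \tanh\left(\frac{\beta E_{\mathbf{k}}}{2}\right), \qquad (9)$$

where now

$$E_{\mathbf{k}} = \left[(\varepsilon_{\mathbf{k}} - \mu)^2 + \Delta^2\right]^{1/2}. \qquad (10)$$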

The primed summation in Eq. (9) is restricted to the regime of k states where V ≠ 0. Equations (9) and (10) constitute the simplest BCS solution for an isotropic (k-independent) gap. One sees that Ek is always nonvanishing and reaches a minimum Ek = Δ for electrons placed on the Fermi level, where εk = µ. Thus, the meaning of the gap becomes obvious: it is the gap for the single-electron excitations from the superconducting (condensed) phase to a free-particle state. The presence of a gap Δ > kBTc in the spectrum of single-particle excitations suppresses the scattering of electrons with acoustic phonons. The thermally excited electrons across the gap do not yield nonzero resistivity because their contribution is short-circuited by the presence of the pair condensate that carries a current with no resistance. The same holds true even for the superconducting systems for which the gap vanishes along some lines or at some points in k space.
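The statement that Ek has its minimum Δ at εk = µ follows directly from Eq. (10), as the short sketch below illustrates. The chemical potential, gap, and energy grid are assumed illustrative values.

```python
import math

def quasiparticle_energy(eps_k, mu, gap):
    """Eq. (10): E_k = sqrt((eps_k - mu)^2 + gap^2)."""
    return math.hypot(eps_k - mu, gap)

mu = 5.0        # assumed chemical potential, eV
gap = 1.0e-3    # assumed isotropic gap, eV

# Band energies on a small grid centred on mu (grid spacing 0.1 meV).
band = [mu + 1.0e-4 * i for i in range(-50, 51)]
energies = [quasiparticle_energy(e, mu, gap) for e in band]
e_min = min(energies)
print(e_min, band[energies.index(e_min)])
```

No excitation on the grid costs less than the gap, and the minimum is reached at the grid point εk = µ.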

One should emphasize that all thermodynamic properties are associated with thermal excitations; the energies that are specified by Eq. (7) contain |Δk| or Δ as a parameter to be determined self-consistently from Eq. (8) or (9), respectively. Next, we provide a brief summary of the results that may be obtained within the BCS theory.

D. Summary of the Properties: The Homogeneous State

The solution of Eq. (9) provides the following properties.

1. At T = 0, Eq. (9) reads

$$1 = \frac{V}{2N} {\sum_{\mathbf{k}}}' E_{\mathbf{k}}^{-1}. \qquad (11)$$

The value of Δ0 ≡ Δ(T = 0) for ρV ≪ 1 is given by

$$\Delta_0 = \frac{\hbar\omega_p}{\sinh(1/\rho V)} \approx 2\hbar\omega_p \exp\left(-\frac{1}{\rho V}\right), \qquad (12)$$

where ħωp ≈ kBθD (footnote 1) and ρ is the density of states at the Fermi energy. One notes a striking similarity between Eq. (12) and Eq. (4), particularly for ρV ≪ 1 (this condition represents the so-called weak-coupling limit). The absence of the factor 2 in the exponent of Eq. (12) provides an enhancement of the gap in the condensed state due to the presence of other electrons.

2. We can choose the origin of energy at µ. Then Eq. (9) can be transformed into an integral form:

$$1 = V \int_0^{\hbar\omega_p} \frac{\rho(\varepsilon)\, d\varepsilon}{(\varepsilon^2 + \Delta^2)^{1/2}} \tanh\left(\frac{\beta}{2}\sqrt{\varepsilon^2 + \Delta^2}\right). \qquad (13)$$

Since ħωp ≪ µ, we may take ρ(ε) ≈ ρ(εF) ≡ ρ within the range of integration. This allows for an analytic evaluation of the critical temperature for which Δ = 0:

$$T_c = 1.13\,\theta_D \exp\left(-\frac{1}{\rho V}\right). \qquad (14)$$

In all these calculations, it is implicitly assumed that ρV ≪ 1. Because of the presence of the exponential factor in Eq. (14), the critical temperature Tc is much lower than the Debye temperature characterizing the average energy of acoustic phonons. This is the principal theoretical reason that Tc is so low in the superconductors discovered in the period 1911–1986. The exponential dependence of Tc on the electronic parameter ρV also explains why the parameters pertaining to the electronic structure, which are of the order of 1 eV or more, respond to phase transitions on an energy scale that is three orders of magnitude smaller (kBTc ∼ 1 meV). Effects with such a nonanalytic dependence of transition temperature on the coupling constant cannot be obtained in any order of perturbation theory starting with the normal state as an initial state. A similar type of effect is obtained in the studies of the Kondo effect (cf. Section IV).
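Equation (14) can be checked against a direct numerical solution of Eq. (13) at Δ = 0 with constant ρ. The following is a rough sketch, not the author's procedure: it uses a midpoint rule and bisection, works in units with kB = 1 and ħωp = θD, and assumes the illustrative values ρV = 0.3 and θD = 300 K. The bisection bracket [θD/1000, θD] is an assumption that holds for this coupling but not for arbitrary ρV.

```python
import math

def gap_integral(t, theta_d, n=4000):
    """Midpoint-rule estimate of int_0^theta_d tanh(e / 2t) / e de (k_B = 1)."""
    h = theta_d / n
    total = 0.0
    for i in range(n):
        e = (i + 0.5) * h
        total += math.tanh(e / (2.0 * t)) / e
    return total * h

def critical_temperature(rho_v, theta_d, iterations=40):
    """Solve rho_v * gap_integral(T) = 1 for T by bisection (Eq. (13) at Delta = 0)."""
    t_lo, t_hi = 1.0e-3 * theta_d, theta_d   # assumed bracket, see lead-in
    for _ in range(iterations):
        t_mid = 0.5 * (t_lo + t_hi)
        if rho_v * gap_integral(t_mid, theta_d) > 1.0:
            t_lo = t_mid   # integral still too large: Tc lies at higher T
        else:
            t_hi = t_mid
    return 0.5 * (t_lo + t_hi)

rho_v = 0.3       # assumed coupling
theta_d = 300.0   # assumed Debye temperature, K

tc_numeric = critical_temperature(rho_v, theta_d)
tc_formula = 1.13 * theta_d * math.exp(-1.0 / rho_v)
print(tc_numeric, tc_formula)
```

For weak coupling the two values should agree closely; the small residual difference reflects the rounding of the prefactor 1.13 and the finite upper limit of the integral.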

3. Combining Eqs. (14) and (12) one obtains the universal ratio

$$\frac{2\Delta_0}{k_B T_c} = 3.53, \tag{15}$$

¹ In actual practice, one assumes that $\hbar\omega_p \approx 0.75\,k_B\theta_D$ (cf. Meservey and Schwartz in Ref. 9).

Page 341: Encyclopedia of Physical Science and Technology - Condensed Matter

P1: GQT/GUE P2: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN016A-750 July 31, 2001 15:31

Superconductivity Mechanisms 257

which is frequently used as a test for the applicability of the BCS model. However, this value can also be obtained in the strong-coupling limit [15] for a particular strength of electron–phonon coupling.
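The weak-coupling relations (12), (14), and (15) can be checked numerically. The following is a minimal sketch, assuming $\hbar\omega_p = k_B\theta_D$ exactly; the Debye temperature and the coupling $\rho V$ are hypothetical illustration values, not data for any particular material.

```python
import math

def bcs_gap_zero_temp(theta_d, rho_v):
    """Zero-temperature gap Delta_0 / k_B in kelvin, Eq. (12).

    Assumes hbar * omega_p = k_B * theta_D (an idealization; see footnote 1).
    """
    return theta_d / math.sinh(1.0 / rho_v)

def bcs_tc(theta_d, rho_v):
    """Critical temperature in kelvin, Eq. (14)."""
    return 1.13 * theta_d * math.exp(-1.0 / rho_v)

theta_d = 300.0  # hypothetical Debye temperature (K)
rho_v = 0.25     # hypothetical weak coupling, rho * V

gap_kelvin = bcs_gap_zero_temp(theta_d, rho_v)
tc = bcs_tc(theta_d, rho_v)
ratio = 2.0 * gap_kelvin / tc  # compare with Eq. (15)

print(f"Delta_0/k_B = {gap_kelvin:.2f} K, Tc = {tc:.2f} K, ratio = {ratio:.3f}")
```

With these inputs $T_c$ comes out about fifty times smaller than $\theta_D$, and the ratio stays close to the value quoted in Eq. (15) as long as $\rho V \ll 1$.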

4. By regarding the energies $E_k$ as representing electron excitations across the gap, one can write the expression for the entropy of a superconductor in the standard form:

$$S = -2k_B \sum_k \left[ f_k \ln f_k + (1 - f_k)\ln(1 - f_k) \right], \tag{16}$$

where $f_k \equiv f(E_k) = [1 + \exp(\beta E_k)]^{-1}$ is the Fermi–Dirac distribution function. Hence, the free energy of the superconducting state is

$$F_S = 2\sum_k E_k f_k - TS. \tag{17}$$

One should note that the thermodynamic properties are determined fully only if the chemical potential $\mu = \mu(T)$ and the temperature dependence of the superconducting gap $\Delta_k = \Delta_k(T)$ are explicitly determined, since only then is the spectrum of single-particle excitations (characterized by the energies $\{E_k\}$) uniquely determined. The quantity $\Delta(T)$ is determined from Eq. (13). The chemical potential is determined from the conservation of the number $N_e$ of particles, that is, from the condition $\sum_k f_k = N_e$. The temperature dependence of the gap in the isotropic case is shown schematically in Fig. 3.
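The temperature dependence of the gap can be obtained by solving Eq. (13) numerically. The sketch below assumes a constant density of states, so that the only parameter is the coupling $\rho V$; energies and temperatures are measured in units of $\hbar\omega_p$ with $k_B = 1$. The midpoint rule, the bisection routine, and the value $\rho V = 0.3$ are illustrative choices, not part of the original treatment.

```python
import math

def gap_integral(delta, temp, n_steps=2000):
    """Midpoint-rule estimate of int_0^1 tanh(E / 2T) / E d(eps), E = sqrt(eps^2 + delta^2)."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        eps = (i + 0.5) * h
        e_k = math.sqrt(eps * eps + delta * delta)
        total += math.tanh(e_k / (2.0 * temp)) / e_k
    return total * h

def bisect(func, lo, hi, tol=1e-8):
    """Plain bisection; assumes func changes sign between lo and hi."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def gap(temp, coupling):
    """Delta(T) solving 1 = coupling * integral, Eq. (13); zero above Tc."""
    def residual(delta):
        return coupling * gap_integral(delta, temp) - 1.0
    if residual(1e-9) <= 0.0:
        return 0.0  # no nontrivial solution: normal state
    return bisect(residual, 1e-9, 1.0)

def critical_temperature(coupling):
    """Temperature at which the Delta -> 0 limit of Eq. (13) is satisfied."""
    def residual(temp):
        return coupling * gap_integral(0.0, temp) - 1.0
    return bisect(residual, 1e-4, 1.0)

coupling = 0.3  # hypothetical value of rho * V
tc = critical_temperature(coupling)
delta_0 = gap(0.05 * tc, coupling)  # low temperature, close to the T = 0 gap
print(f"Tc = {tc:.4f}, Delta_0 = {delta_0:.4f}, 2*Delta_0/Tc = {2 * delta_0 / tc:.2f}")
```

Evaluating `gap` on a grid of temperatures between 0 and `tc` traces out a curve of the kind sketched in Fig. 3.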

5. By calculating the difference of the free energies $F_S - F_N$ in the superconducting ($F_S$) and normal ($F_N$) phases and equating this difference with the magnetic free energy $H_c^2 V/8\pi$ ($V$ is the volume of the system), one can obtain an approximate relation of the form

$$\frac{H_c(T)}{H_c(0)} \approx 1 - \left(\frac{T}{T_c}\right)^2. \tag{18}$$

For the applied field $H > H_c$, superconductivity is destroyed because in the thermodynamic critical field $H_c$ the spin-singlet bound state is destroyed by the thermal fluctuations. The pair binding energy is then effectively overcome by the magnetic energy, so that the pairs break up into single particles. Strictly speaking, this type of behavior characterizes the so-called superconductors of the first kind.

FIGURE 3 Schematic representation of the temperature dependence of the superconducting gap for the isotropic case. $T_c$ is the critical temperature for the transition, and $\Delta_0 \equiv \Delta(T = 0)$.

6. By calculating the specific heat from the standard thermodynamic analysis [$C_S = -T(\partial^2 F_S/\partial T^2)_V$], one obtains at $T = T_c$ a discontinuity of the form

$$\frac{C_S - C_N}{C_N} = 1.43, \tag{19}$$

where $C_N$ is the specific heat at $T_c$ for the material in its normal state. At low temperatures, the specific heat decreases exponentially:

$$C_S \sim \exp\left(-\frac{\Delta_0}{k_B T}\right), \tag{20}$$

for the special case of an isotropic gap. However, if the gap is anisotropic [$\Delta = \Delta_k(T)$] and has lines of zeros (along which $\Delta_k = 0$), then the low-temperature dependence of $C_S$ does not follow Eq. (20) but rather a power law $T^n$, with $n$ depending on the details of the gap anisotropy.

The specific heat grows with $T$ because the number of thermally broken pairs increases with rising temperature; eventually, at $T = T_c$ ($k_B T_c \sim \Delta_0$), all bound pairs dissociate thermally, at which point $C_S$ reaches a maximum. If the temperature is raised further (above $T_c$), the specific heat drops rapidly to its normal-state value since no pairs are left to absorb the energy. This type of behavior is observed in superconductors with an isotropic gap (cf., e.g., Hg and Sn). One should note that this interpretation of the thermal properties is based on the single-particle excitation spectrum [Eq. (10)]; we have disregarded any fluctuation phenomena near $T_c$, as well as collective excitations of the condensed system. It can be shown that the large coherence length $\xi \sim 10^3$–$10^4$ Å encountered in classic superconductors [8] is related to the absence of critical behavior near $T_c$. This is not the case in high-$T_c$ superconductors (discussed in Section IV); hence, the new materials open up the possibility of studies of critical phenomena in superconducting systems.

7. The spin part of the static magnetic susceptibility vanishes as $T \to 0$. This is a direct consequence of the binding of electrons in the condensed state into singlet pairs. Therefore, the Meissner effect (the magnetic flux expulsion from the bulk of the sample) at $T = 0$ is present because the orbital part of the susceptibility is diamagnetic (roughly, it represents an electron-pair analogue of the Landau diamagnetism of single electrons in a normal electron gas). The expulsion of the magnetic flux from the bulk is measured in terms of the so-called London penetration depth $\lambda = \lambda(T)$, which characterizes the decay of the magnetic induction inside the sample. It decays according to

$$B(z) = H_a \exp(-z/\lambda),$$

where the $z$ direction is perpendicular to the sample surface and the applied magnetic field $H_a$ is parallel to it. The temperature dependence of the penetration depth is given by

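$$\frac{\lambda(T)}{\lambda(0)} = \left[\frac{\Delta(T)\tanh(\Delta/2k_B T)}{\Delta_0}\right]^{-1/2} \approx \left[1 - \left(\frac{T}{T_c}\right)^4\right]^{-1/2}. \tag{21}$$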

This result has been derived under the assumption that the coherence length² $\xi_0 \simeq \hbar v_F/\Delta_0$ is much larger than $\lambda$. One should note that for a bulk sample of dimension $d \gg \lambda$ the induction $B \equiv 0$ almost everywhere. This condition determines the magnetic susceptibility $\chi$ of a superconductor regarded as an ideal diamagnet; in cgs units, $\chi \equiv M/H = -1/(4\pi)$.

8. The ratio of the two characteristic lengths, $\kappa \equiv \lambda/\xi$, determines the type of superconducting behavior in a magnetic field. From the dependence $\xi \sim \Delta^{-1}$, we infer that as $T \to T_c$, $\xi \sim (T_c - T)^{-1/2}$. The same type of dependence for $\lambda(T)$ can be inferred from Eq. (21) when $T \to T_c$. Within the phenomenological theory of Ginzburg and Landau (which can be derived from the BCS theory as shown by Gorkov [8]), one can show that if $\kappa < 1/\sqrt{2}$, then the material is a superconductor of the first kind; if $\kappa > 1/\sqrt{2}$, then the material is of the second kind. The value of $\kappa$ is directly related to the penetration depth $\lambda(T)$. The thermodynamic critical magnetic field [Eq. (18)] has the form

$$H_c(T) = \frac{\Phi_0\,\kappa}{2\pi\sqrt{2}\,\lambda^2(T)} \tag{22a}$$

or, equivalently,

$$H_c(T) = \frac{\Phi_0}{2\pi\sqrt{2}\,\xi(T)\lambda(T)}, \tag{22b}$$

where $\Phi_0 = hc/2e$ is the magnetic-flux quantum. This value of the field terminates superconductivity of the first kind. For superconductors of the second kind, the corresponding field is given by

$$H_{c2} = \kappa\sqrt{2}\,H_c = \frac{\Phi_0}{2\pi\xi(T)^2}. \tag{22c}$$

² The coherence length in a superconductor can be estimated by using the uncertainty relation $\Delta p \cdot \xi_0 = \hbar$, where $\Delta p$ is a change of the electron momentum (at $\varepsilon = \varepsilon_F$) due to the attractive interaction, which can be estimated from the corresponding change of the particle kinetic energy $\Delta E = v_F \Delta p$. Taking $\Delta E \simeq \Delta_0$, we obtain the desired estimate of $\xi_0$.

For fields $H_{c1} < H_a < H_{c2}$ [with $H_{c1} \equiv H_c(0)\ln\kappa/(\sqrt{2}\kappa)$], the superconducting phase is inhomogeneous, composed of a lattice of vortices, each in the form of a tube containing one flux quantum, penetrating the sample. All of the newly discovered high-$T_c$ superconductors are of the second kind, with very small values of $H_{c1}$ and very large values of $H_{c2}$. This means that the value of the coherence length $\xi$ is very small in those systems.
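The classification by $\kappa$ and the field scales quoted in items 7 and 8 can be collected in a few helper functions. This is a minimal sketch: the function names and the sample values of $\kappa$ are hypothetical, fields are measured in units of the thermodynamic critical field $H_c$, and the expression for $H_{c1}$ is the one quoted in the text, which is meaningful only for $\kappa \gg 1$.

```python
import math

def superconductor_kind(kappa):
    """Ginzburg-Landau criterion: first kind below 1/sqrt(2), second kind above."""
    return "first kind" if kappa < 1.0 / math.sqrt(2.0) else "second kind"

def upper_critical_field(kappa, h_c=1.0):
    """H_c2 = sqrt(2) * kappa * H_c, Eq. (22c)."""
    return math.sqrt(2.0) * kappa * h_c

def lower_critical_field(kappa, h_c=1.0):
    """H_c1 = H_c * ln(kappa) / (sqrt(2) * kappa), as quoted in the text for large kappa."""
    return h_c * math.log(kappa) / (math.sqrt(2.0) * kappa)

def penetration_depth_ratio(t_over_tc):
    """lambda(T) / lambda(0) from the approximate form in Eq. (21)."""
    return (1.0 - t_over_tc ** 4) ** -0.5

for kappa in (0.05, 50.0):  # hypothetical values of kappa
    print(kappa, superconductor_kind(kappa))

print("H_c1/H_c =", lower_critical_field(50.0), " H_c2/H_c =", upper_critical_field(50.0))
print("lambda(0.9 Tc)/lambda(0) =", penetration_depth_ratio(0.9))
```

For large $\kappa$ the two fields separate widely, $H_{c1} \ll H_c \ll H_{c2}$, which is the situation described above for the high-$T_c$ materials.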

9. The sound absorption coefficient $\alpha_S$ in the superconducting phase is related to that in the normal phase $\alpha_N$ by

$$\frac{\alpha_S}{\alpha_N} = \frac{2}{1 + \exp(\Delta/k_B T)}.$$

This is a very simple result; hence, experimental results for $\alpha_S/\alpha_N$ are used to determine the temperature dependence of the gap $\Delta$.
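Inverting this relation gives $\Delta(T) = k_B T \ln(2\alpha_N/\alpha_S - 1)$. A minimal sketch of the inversion follows; the function names are hypothetical, and the numbers are synthetic rather than measured values.

```python
import math

def attenuation_ratio(gap, temperature, k_b=1.0):
    """alpha_S / alpha_N = 2 / (1 + exp(Delta / k_B T))."""
    return 2.0 / (1.0 + math.exp(gap / (k_b * temperature)))

def gap_from_attenuation(ratio, temperature, k_b=1.0):
    """Invert the relation above for Delta; valid for 0 < ratio <= 1."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("alpha_S/alpha_N must lie in (0, 1]")
    return k_b * temperature * math.log(2.0 / ratio - 1.0)

# Round trip on synthetic numbers (units with k_B = 1).
ratio = attenuation_ratio(gap=3.0, temperature=2.0)
print(ratio, gap_from_attenuation(ratio, temperature=2.0))
```

A ratio of exactly 1 corresponds to $\Delta = 0$, that is, to the normal state at $T \ge T_c$.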

A complete discussion of superconducting states within the BCS theory is provided in Refs. 1–10.

E. Strong-Coupling Effects: The Eliashberg Approach

The BCS theory provides a complete though approximate theory of both thermal and dynamic properties of superconductors in the weak-coupling limit $\rho V \ll 1$. The electron–electron interactions deriving from the electron–lattice interaction are treated in the lowest order, and the electron–electron correlations are decoupled in a mean-field-type approximation. Generalizations of the BCS treatment concentrate on two main problems: (1) inclusion of the repulsive Coulomb interaction between the electrons [14] and (2) extension of the BCS theory to the situation with arbitrarily large electron–phonon coupling [15]. This is done by generalizing the treatment of normal metals, with electron–lattice interactions incorporated in a systematic fashion [16]. Both of these factors have been included in the Eliashberg approach to superconductivity [15].

The repulsive Coulomb interaction reduces the effective attractive interaction between the electrons, so that, instead of Eq. (14), one obtains in the BCS approximation

$$T_c = 1.14\,\theta_D \exp\left(-\frac{1}{\lambda - \mu^*}\right), \tag{23}$$

where $\lambda = \rho V$ is the effective electron–phonon coupling and $\mu^*$ is the so-called Coulomb pseudopotential [14] multiplied by $\rho$.

The Eliashberg correction to the BCS theory must be evaluated numerically. The numerical solution of the Eliashberg equation representing higher-order corrections to the BCS theory may be represented by [17]

$$T_c = \frac{\theta_D}{1.45}\exp\left[-\frac{1.04(1 + \lambda)}{\lambda - \mu^*(1 + 0.62\lambda)}\right]. \tag{24}$$

FIGURE 4 Numerical solution of $T_c$ versus the electron–phonon coupling constant $\lambda$ for the Coulomb pseudopotential $\mu^* = 0.1$. The other parameters are taken as for the superconducting element niobium. Note that the Eliashberg theory gives a much slower increase in $T_c$ than does the BCS theory.

Figure 4 illustrates the difference in the values of $T_c$ obtained by the BCS vs the Eliashberg theory [18]. We see that the repulsive Coulomb interaction and the higher-order electron–phonon effects combine to reduce the superconducting transition temperature drastically. This and other results [19] have led to the conclusion that the value of $T_c$ determined within the phonon-mediated mechanism has an upper limit of the order of 30 K.
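The comparison described above can be reproduced directly from Eqs. (23) and (24). The sketch below uses $\mu^* = 0.1$ as in Fig. 4; the Debye temperature of 275 K is an assumed, roughly niobium-like illustration value, and the function names are hypothetical.

```python
import math

def tc_bcs(theta_d, lam, mu_star):
    """BCS estimate with Coulomb pseudopotential, Eq. (23)."""
    return 1.14 * theta_d * math.exp(-1.0 / (lam - mu_star))

def tc_eliashberg_fit(theta_d, lam, mu_star):
    """Closed-form representation of the Eliashberg solution, Eq. (24)."""
    denom = lam - mu_star * (1.0 + 0.62 * lam)
    if denom <= 0.0:
        return 0.0  # the formula yields no superconductivity here
    return theta_d / 1.45 * math.exp(-1.04 * (1.0 + lam) / denom)

theta_d = 275.0  # assumed Debye temperature (K), for illustration only
mu_star = 0.1    # value used in Fig. 4

for lam in (0.4, 0.6, 0.8, 1.0, 1.5):
    print(f"lambda = {lam:.1f}:  BCS {tc_bcs(theta_d, lam, mu_star):7.2f} K,"
          f"  Eq. (24) {tc_eliashberg_fit(theta_d, lam, mu_star):6.2f} K")
```

Over this range of $\lambda$, Eq. (24) gives values several times lower than Eq. (23), in line with the slower growth of $T_c$ noted in the caption of Fig. 4.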

One should mention a very important feature of the phonon-mediated electron pairing. Namely, the transition temperature is proportional to the Debye temperature $\theta_D$. Hence, $T_c$ given by expression (23) depends on the mass $M$ of the atoms composing the lattice. In the simplest situation we expect that $T_c \sim M^{-1/2}$. A dependence of $T_c$ on the mass $M$ was demonstrated experimentally [20] by studying the isotope influence on $T_c$. These observations provided a crucial argument in favor of the lattice involvement in the formation of the superconducting state. If the Coulomb repulsion between electrons is taken into account, then the relation is $T_c \sim M^{-\alpha}$ with [17]

$$\alpha = \frac{1}{2}\left\{1 - \frac{1.04\,(1 + \lambda)(1 + 0.62\lambda)\,\mu^{*2}}{[\lambda - \mu^*(1 + 0.62\lambda)]^2}\right\}.$$

In the strong-coupling limit ($\lambda \geq 1$) the exponent $\alpha$ is largely reduced from its initial value $1/2$. Therefore, if the value of $\alpha$ is small, one may interpret this fact as evidence either for strong electron–phonon coupling or for the need of a new nonphonon mechanism to explain the superconductivity.

F. Where Do We Go from Here?

The BCS theory is a microscopic theory providing a description of thermodynamic and electrodynamic properties as a function of two parameters: $T/T_c$ and $H_a/H_c$, where $T_c$ contains the effective attraction strength $|V|$. Such a simple approach is not possible for high-temperature superconductors, as one can see from already existing books and review articles [cf. Refs. 21a–k]. In the next two sections we summarize briefly the principal features of strongly correlated systems. This discussion provides us with new phenomena and some new terms describing them. This overview by no means contains a full discussion of papers published during the last 12 years. Rather, we sketch different paths of approaching the problems encountered in dealing with strongly correlated fermions.

III. NORMAL AND MAGNETIC STATES OF CORRELATED ELECTRONS

A. Narrow-Band Systems

The modern theory of metals derives from the concept of a free electron gas, which obeys the Pauli exclusion principle. The principal influences of the lattice periodic potential on the individual electron states are to renormalize their mass and to change the topology of the Fermi surface. Landau [22] was the first to recognize the applicability of the electron-gas concept to the realistic situation where the repulsive Coulomb interaction between particles is not small compared to the kinetic energy of electrons near the Fermi surface. He incorporated the interaction between electrons into a further (many-body) renormalization of the effective mass and investigated the physical properties, such as specific heat, magnetic susceptibility, sound propagation, and thermal and electric conductivities, in terms of quasiparticle contributions.

An important next development was contributed by Mott [23], who pointed out that if the Coulomb interaction between the electrons is sufficiently strong (that is, comparable to the band energy of the quasiparticles), then electrons in a solid would have to localize on the atoms, e.g., with one valence electron per atom. This qualitative change of the nature of single-electron states from those for a gas to those for atoms is called the metal–insulator or the Mott transition. An empty (unoccupied) state in the Mott insulator (that is, a site with no electron available) will act as a mobile hole. In these circumstances, the transport of charge takes place via the correlated hopping of electrons through such hole states. In the Mott insulator limit, those hole states play a crucial role in establishing the superconductivity of oxides, as discussed in Section IV.

The paramagnetic or magnetically ordered states of electrons comprising the Mott insulator distinguish this class of materials from ordinary band (Bloch–Wilson) insulators or intrinsic semiconductors; the latter are characterized at $T = 0$ by a filled valence band and an empty conduction band, separated by a gap. The electrons in the filled valence band are spin-paired into $|k\uparrow, k\downarrow\rangle$ singlets; hence Bloch–Wilson insulators are diamagnetic.

The basic question now arises whether one can treat Mott insulators and metals within a single microscopic description of electron states by generalizing the band theory of electron states so as to describe Mott insulators within the same microscopic model. The first step in this direction was proposed by Hubbard [25], who showed by the use of a relatively simple model that as the interaction strength (characterized by the magnitude $U$ of the intraatomic Coulomb repulsion) increases and becomes comparable to the band energy per particle (characterized by the bare bandwidth $W$), the original band of single-particle states splits into two halves. Thus, the Mott insulator may be modeled by a lattice of hydrogenic-like atoms with one electron per atom, placed in the lowest 1s state. The distinction between the normal metallic and the Mott insulating states is shown schematically in Figs. 5a and b, where the metal (a) is depicted as an assembly of electrons represented by the set of plane waves characterized by the wave vector $k$ and spin quantum number $s = \sigma/2$, where $\sigma \equiv \pm 1$.

The transformation to the Mott localized state may take place only if the number of electrons in the metallic phase is equal to the number of parent atoms, that is, when the starting band of free electron-like states is half-filled. The collection of such unpaired spin moments will lead to the paramagnetic Curie–Weiss behavior at high temperatures. As the temperature is lowered, the system undergoes a magnetic phase transition; in the case of the Mott insulators, the experimentally observed transition is almost always to antiferromagnetism, as shown in Fig. 5b, where each electron with its spin moment up is surrounded by electrons on nearest-neighboring sites with spins in the opposite direction (down). Such a spin configuration reflects a two-sublattice (Néel) antiferromagnetic state. The actual magnetic structure of Cu²⁺ ions in La₂CuO₄, taken from Ref. 24, is shown in Fig. 6. The expectation value of the spin is reduced by 40% from the value $s_z = 1/2$.

If the number of electrons in the band is smaller than the number of available atomic sites, then electron localization cannot be complete because empty atomic sites are available for hopping electrons. However, for the half-filled band case, as the ratio $U/W$ increases, half of the total number of single-particle states in the starting band is gradually pushed above the Fermi level $\varepsilon_F$. An increase in the ratio $U/W$ may be achieved by lengthening the interatomic distance, thus reducing $W$, which is directly related to the wave-function overlap for the two states located on the nearest-neighboring sites. The splitting of the original band into two Hubbard subbands eliminates the paired (spin-singlet) occupations of the same energy state $\varepsilon$. Effectively, this pattern reflects the situation of electrons being separated from each other as far as possible; however, the correspondence between the Hubbard split-band situation, shown schematically in Fig. 7, and the electron disposition in the spin lattice in real space (cf. Fig. 5b) is by no means obvious and requires a more detailed treatment that relates these two descriptions of the Mott insulator. This problem is dealt with in the following section.

FIGURE 5 (a) Schematic representation of a normal metal as a lattice of ions and the plane waves, with wave vector $k$ representing free electron states. (b) Model of the Mott insulator as a lattice of atoms with electrons localized on them. Note that the ground-state configuration is usually antiferromagnetic (with the spins antiparallel to each other).

FIGURE 6 The magnetic structure of La₂CuO₄. The neighboring Cu²⁺ ions in the planes have their spins (each representing the 3d⁹ configuration) antiparallel to each other. The antiferromagnetic structure is three-dimensional. (From Endoh et al. [24].)

1. The Hubbard Model

In discussing narrow-band systems, one usually starts from the model Hamiltonian due to Hubbard [25], which appears to be complicated but can really be interpreted in simple terms, namely,

$$H = \sum_{k\sigma} \varepsilon_k n_{k\sigma} + U \sum_i n_{i\uparrow} n_{i\downarrow}, \tag{25}$$

where $\varepsilon_k$ is the single-particle (band) energy per electron with the wave vector $k$, $U$ is the magnitude of intraatomic Coulomb repulsion between the two electrons located on the same atomic site $i$, $n_{k\sigma}$ is the number of electrons in the single-particle state $|k\sigma\rangle$, and $n_{i\sigma}$ is the corresponding quantity for the atomic state $|i\sigma\rangle$. This simple Hamiltonian describes the localization versus delocalization aspect of electron states since the first term provides the gain in energy ($\varepsilon_k < 0$) for electrons in the band state $|k\sigma\rangle$, whereas the second accounts for an energy loss ($U > 0$) connected with the motion of electrons throughout the system that is hindered by encounters with other electrons on the same atomic site. The competitive aspects of the two terms are expressed explicitly if the first term in Eq. (25) is transformed by the so-called Fourier transformation to the site $\{|i\sigma\rangle\}$ representation. Then Eq. (25) may be rewritten

$$H = \sum_{ij\sigma} t_{ij}\, a^{+}_{i\sigma} a_{j\sigma} + U \sum_i n_{i\uparrow} n_{i\downarrow}, \tag{26}$$

FIGURE 7 The Hubbard splitting of the states in a single, half-filled band for the strength of the intraatomic Coulomb interaction $U > U_c$. The state with a filled lower Hubbard subband for $U > U_c$ is identified with that of the Mott insulator. [From Ref. 25.]

where

$$t_{ij} = \frac{1}{N} \sum_k \varepsilon_k \exp[i k \cdot (R_j - R_i)], \tag{27}$$

is the Fourier transform of the band energy $\varepsilon_k$ and $a^{+}_{i\sigma}$ ($a_{i\sigma}$) is the creation (annihilation) operator of electrons in the atomic (Wannier) state centered on the site $R_i$. The first term in Eq. (26) represents the motion of an electron through the system by a series of hops $j \to i$, which are described in terms of destruction of the particle at site $j$ and its subsequent recreation on the neighboring site $i$. The width of the corresponding band in this representation is given by

$$W = 2\sum_{j(i)} |t_{ij}| \approx 2z|t|, \tag{28}$$

where $z$ is the number of nearest neighbors (n.n.), and $t$ is the value of $t_{ij}$ for the n.n. pair $\langle ij\rangle$. Thus, the Hamiltonian [Eq. (25)] is parameterized through the bandwidth $W$ and the magnitude $U$. In actual calculations, it is the ratio $U/W$ that determines the localized versus collective behavior of the electrons in the solid.
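Equations (27) and (28) can be illustrated on the simplest example. The sketch below assumes a one-dimensional ring of $N$ sites with unit lattice constant and a hypothetical nearest-neighbor band $\varepsilon_k = 2t\cos k$; neither the geometry nor the value of $t$ comes from the text.

```python
import cmath
import math

def hopping_amplitudes(band_energy, n_sites):
    """t(R) = (1/N) * sum_k eps_k * exp(i k R), Eq. (27), for R = 0 .. N-1 on a ring."""
    ks = [2.0 * math.pi * m / n_sites for m in range(n_sites)]
    amplitudes = []
    for r in range(n_sites):
        total = sum(band_energy(k) * cmath.exp(1j * k * r) for k in ks)
        amplitudes.append((total / n_sites).real)  # imaginary part vanishes for eps(k) = eps(-k)
    return amplitudes

t = -1.0      # hypothetical nearest-neighbor hopping (sets the energy unit)
n_sites = 16  # hypothetical ring size
z = 2         # number of nearest neighbors in one dimension

def band(k):
    """Assumed tight-binding band of the ring."""
    return 2.0 * t * math.cos(k)

amps = hopping_amplitudes(band, n_sites)
bandwidth_from_hops = 2 * z * abs(amps[1])  # Eq. (28)

energies = [band(2.0 * math.pi * m / n_sites) for m in range(n_sites)]
bandwidth_from_band = max(energies) - min(energies)

print("t(R=1) =", round(amps[1], 12), " t(R=2) =", round(amps[2], 12))
print("W from Eq. (28):", bandwidth_from_hops, " W from the band:", bandwidth_from_band)
```

Only the amplitudes connecting nearest neighbors survive, and the estimate $W = 2z|t|$ coincides with the full width of the band in this case.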

2. Hubbard Subbands and Hole States

The normal-metal case is represented in Eq. (25) by the limit $W/U \gg 1$; the first (band) term then dominates. On the other hand, the complementary limit $W/U \ll 1$ corresponds to the limit of well-separated atoms, since the excitation energy of creating double occupancy on a given atom (with the energy penalty $\varepsilon \sim U$) far exceeds the band energy of individual particles. The transition from the metallic to the atomic type of behavior takes place when $W \sim U$; this is also the crossover point where the single band in Fig. 7 splits in two. The actual dependence of the density of states for interacting particles is shown in Fig. 8 (taken from Ref. 26). These curves were drawn for the Lorentzian shape of the density of states (DOS), that is, for a starting band with a characteristic width $\Delta$:

$$\rho_0(\varepsilon) = \frac{\Delta}{\pi}\,\frac{1}{(\varepsilon - t_0)^2 + \Delta^2}, \tag{28a}$$

where $t_0$ determines the position of the center of the band (usually chosen as $t_0 = 0$). Detailed calculations [28] show that with a growing magnitude of interaction ($U/\Delta$), the DOS [Eq. (28a)] splits into two parts described by the density of states:

$$\rho(\varepsilon) = \frac{\Delta}{\pi}\left[\frac{1 - (n/2)}{(\varepsilon - t_0)^2 + \Delta^2} + \frac{n/2}{(\varepsilon - t_0 - U)^2 + \Delta^2}\right]. \tag{28b}$$

FIGURE 8 The Hubbard splitting of the states for different band fillings, $n$ = 0.3, 0.6, and 0.9, and for different $U/W$ ratios, 0.5, 2, and 10, respectively. The x axis is the particle energy value; the y axis is the density of states value. The arrow indicates the position of the Fermi energy, whereas the dashed line represents the inverse lifetime of the quasiparticle state in the pseudogap. [From Ref. 26.]

The first term describes the original DOS [Eq. (28a)], with the weighting factor $(1 - (n/2))$, whereas the second represents the upper subband (on the energy scale), with the weighting factor $(n/2)$ and shifted by an amount $U$. These two terms and the corresponding two parts of the DOS in Fig. 8 describe the Hubbard subbands. The dashed line in Fig. 8 represents the inverse lifetime of single-electron states placed in the pseudogap, while the arrows point to the position of the Fermi energy in each case. For $n = 1$, the Fermi level falls in a pseudogap, where the lifetime of those quasiparticle states is very short. This is reminiscent of the behavior encountered in an ordinary semiconductor, where the states in the band gap are those with a complex wave vector $k$. The lifetime may qualitatively simulate the spread produced by atomic disorder, which gives the Lorentzian form of the bare density of states.
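The two-Lorentzian form of Eq. (28b) is easy to evaluate. The sketch below tabulates the spectral weight carried by each subband and compares the DOS in the middle of the pseudogap with its value at the lower peak; the parameter values are hypothetical, and energies are in units of the Lorentzian width.

```python
import math

def lorentzian(eps, center, width):
    """Normalized Lorentzian, Eq. (28a)."""
    return (width / math.pi) / ((eps - center) ** 2 + width ** 2)

def hubbard_dos(eps, n, u, width, t0=0.0):
    """Split density of states, Eq. (28b)."""
    lower = (1.0 - n / 2.0) * lorentzian(eps, t0, width)
    upper = (n / 2.0) * lorentzian(eps, t0 + u, width)
    return lower + upper

def subband_weights(n, u, width, lo, hi, t0=0.0):
    """Analytic integrals of the two terms of Eq. (28b) over [lo, hi]."""
    def cumulative(x, center):
        return math.atan((x - center) / width) / math.pi
    lower = (1.0 - n / 2.0) * (cumulative(hi, t0) - cumulative(lo, t0))
    upper = (n / 2.0) * (cumulative(hi, t0 + u) - cumulative(lo, t0 + u))
    return lower, upper

n, u, width = 1.0, 10.0, 1.0  # hypothetical half-filled band with U / width = 10
lower_w, upper_w = subband_weights(n, u, width, -1.0e6, 1.0e6)
print("subband weights:", lower_w, upper_w)
print("DOS at lower peak:", hubbard_dos(0.0, n, u, width))
print("DOS at mid-gap:   ", hubbard_dos(u / 2.0, n, u, width))
```

For $n = 1$ the two subbands carry equal weight, and for $U$ much larger than the width the DOS at mid-gap is strongly suppressed but not zero, which is why the term pseudogap is used.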

FIGURE 9 The position of the Fermi level $\varepsilon_F$ as a function of the band filling $n$ for different values of interaction (from the bottom to the top curve), $U/\Delta$ = 0, 0.5, 2, and 10. For $U/\Delta = 10$, the Fermi level jumps between the subbands when $n \approx 1$. [From Ref. 26.]

To display the Mott insulator as a two-band system in which the Hubbard subbands assume a role similar to that of the valence and conduction bands in an ordinary semiconductor, we have plotted in Fig. 9 the position of the Fermi level as a function of the number of electrons $n$ per atom in the system. As $n$ moves past unity, a jump in $\varepsilon_F$ occurs for $U/\Delta \gg 1$. This is exactly what happens in the ordinary semiconductor when the electrons are added to the conduction band. This feature shows once more that the states near the upper edge of the lower Hubbard subband (that is, the states near $\varepsilon_F$ for $n$ close to but less than unity) can be regarded as hole states. We will see that those states are the ones with a high effective mass.

It should be emphasized that the Hubbard subband structure is characteristic of magnetic insulators and cannot be obtained with a standard band theoretical approach to the electron states in solids. The $N$ states in the lower Hubbard subband are almost singly occupied; this is directly related to the picture of unpaired spins in Fig. 5b and is one of the reasons for calling the electron states for such interacting systems correlated electronic states.

The other reason (discussed in detail later) arises because the proper description of electronic states near the localization threshold (the Mott transition) requires that one incorporate two-particle correlations into the quasiparticle states. The Hubbard split-band picture is only the first step in the proper description of the electron states. Those additional correlations will lead to a very heavy mass of quasiparticles near the Mott transition; the heavy mass indicates a strong reduction of the bare bandwidth $W$ as the localization threshold is approached from the metallic side.

3. Localized versus Itinerant Electrons: Metal–Insulator Transitions

The Hubbard split-band picture of unpaired electronic states in a narrow band, shown in Figs. 7 and 8, provides a rationale for the existence of a paramagnetic insulating ground state of the interacting electron system. The corresponding experimentally observed metal–insulator transitions (MITs) at a finite temperature are very spectacular, as demonstrated in Fig. 10, where the resistivity (on a logarithmic scale) is plotted as a function of the inverse temperature for a canonical system (V₁₋ₓCrₓ)₂O₃ (the data are from Ref. 27). The number of transitions (one, two, or three) depends on the Cr content. Note the presence of an intervening metallic state between the antiferromagnetic insulating (AFI) and the paramagnetic insulating (PI) states, as well as the reentrant metallic behavior at high temperatures for $0.005 \lesssim x \lesssim 0.0178$. To rationalize these data, we discuss the physical implications of a model of interacting narrow-band electrons for $U \sim W$ starting from the Hamiltonian [Eq. (25)]. We summarize here the main features of the detailed discussion presented in Refs. 28–30, which cover the ground-state and thermodynamic properties.

FIGURE 10 Experimental measurements [27] pertaining to the variation of resistivity $\rho$ on a logarithmic scale with inverse temperature $1000/T$ for the (V₁₋ₓCrₓ)₂O₃ system. The atomic content of Cr₂O₃ in V₂O₃ for each curve is specified.

In the absence of interactions ($U = 0$), the band energy per particle is $\varepsilon = -(W/2)\,n(1 - n/2)$, where $0 \le n \le 2$ is the degree of band filling; for $n = 1$, this reduces to $\varepsilon = -W/4$. When the interactions are present, the band narrows; this is because of a restriction on the electron motion caused by their repulsion, as described earlier. One way of handling this restriction is to adjoin to the bare bandwidth a multiplying factor $\Phi$. This leads to a renormalized DOS for quasiparticles, as illustrated in Fig. 11.

FIGURE 11 Schematic representation of the bare ($\rho_0$) and quasiparticle ($\rho$) densities of states. The band narrowing factor $\Phi$ for interacting electrons (b) is specified. The degeneracy temperature $T_D$ for the interacting electrons and that corresponding to noninteracting electrons ($T_D^*$) are also indicated. The situation drawn corresponds to the half-filled case ($n = 1$), for which the Fermi energy can be chosen as $\varepsilon_F = 0$.

The factor $\Phi$ is a function of the particle–particle correlation function $\eta \equiv \langle n_{i\uparrow} n_{i\downarrow}\rangle$, the expectation value for the double occupancy of a representative lattice site. The quantity $\eta$ is calculated for $T = 0$ self-consistently by minimizing the total energy $E_G$ (per site), composed of the band energy $E_B = \Phi\varepsilon$ and the Coulomb repulsion energy $U\eta$, where the parameter is specified by $\Phi = 8\eta(1 - 2\eta)$ [28, 29]. These two energies represent the expectation values of the two terms in Eq. (25) for the case of a half-filled band. The optimal values of the quantities are given by

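$$\eta_0 = \frac{1}{4}\left(1 - \frac{U}{U_c}\right), \tag{29a}$$

$$\Phi_0 = 1 - \left(\frac{U}{U_c}\right)^2, \tag{29b}$$

and

$$E_G = \left(1 - \frac{U}{U_c}\right)^2 \varepsilon, \tag{29c}$$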

with $U_c = 8|\varepsilon| = 2W$. Thus, as $U$ increases, $\eta_0$ decreases from $1/4$ to 0. At the critical value $U = U_c$, $E_B = 0$ and there are no double occupancies for the same lattice site; this signals the crossover by the system from the itinerant (band) to the localized (atomic-like) state. The point $U = U_c$ corresponds to a true phase transition at $T = 0$; the last statement can be proved by calculating the static magnetic susceptibility, which is [29]

$$\chi = \chi_0 \Big/ \left\{\Phi_0\left[1 - U\rho\,\frac{1 + I/2}{(1 + I)^2}\right]\right\}, \tag{30}$$

where $I \equiv U/U_c$, $\rho$ is the density of bare band states at $\varepsilon = \varepsilon_F$, and $\chi_0$ is the magnetic susceptibility of band electrons with energy $\varepsilon_k$ at $U = 0$. As $\Phi_0 \to 0$ (that is, $U \to U_c$), the susceptibility diverges. The localized electrons are represented in this picture by noninteracting magnetic moments for which the susceptibility is given by the Curie law $\chi = C/T \to \infty$ as $T \to 0$. Thus, the MIT is a true phase transition; $\eta_0$ may be regarded as an order parameter, and the point $U = U_c$ as a critical point.

We concentrate now on a more detailed description of the metallic phase, which permits a generalization of the previous results to the case $T > 0$. First, as has been said, the increase in magnitude of interaction $U$ reduces the band energy according to $E_B = -W\Phi_0/4$. Eventually, $E_B$ becomes comparable to the interaction part $U\eta$; they exactly compensate each other at $U = U_c$. The resultant electronic configuration (localized versus itinerant) is then determined at $T > 0$ by the very low entropy and the exchange interaction contributions. The entropy of the metallic phase in the low-temperature regime may be estimated by using the linear specific heat expression for electrons in a band narrowed by correlations, namely, $C_v \equiv \gamma T = (\gamma_0/\Phi_0)T$, where $\gamma_0 = 2\pi^2 k_B^2 \rho/3$ is the linear specific heat coefficient (per one atom) for uncorrelated electrons (that is, $U = 0$). Hence, the entropy $S = \gamma T = C_v$. Combining this relation with the resultant energy at $T = 0$, given by Eq. (29c), one can write an explicit expression for the free energy of the metallic phase [30]:

$$\frac{F}{N} = \left(1 - \frac{U}{U_c}\right)^2 \varepsilon - \frac{1}{2}\,\frac{\gamma_0}{\Phi_0}\,T^2. \tag{31}$$

This is the free energy per one atomic site. On the other hand, if the exchange interaction between the localized moments is neglected, then each site in the paramagnetic state is randomly occupied by an electron with its spin either up or down. The free energy $F_I$ for such an insulating system of $N$ moments is provided by the entropy term for randomly oriented spins, that is,

$$\frac{F_I}{N} = -k_B T \ln 2. \tag{32}$$
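Equating Eqs. (31) and (32) gives a quadratic equation in $T$, which has either no real root (the metal is stable at all temperatures) or two roots bounding a window in which the insulating local-moment state has the lower free energy. The sketch below solves it at half filling with $k_B = 1$ and energies in units of $W$; the choice $\rho = 1/W$ for the bare density of states is an assumption made only to fix the numbers, and the quadratic form of Eq. (31) is itself a low-temperature approximation.

```python
import math

def metal_insulator_crossings(u_over_uc, bandwidth=1.0, rho=None, k_b=1.0):
    """Temperatures at which F (Eq. 31) equals F_I (Eq. 32), for 0 <= U/Uc < 1.

    Returns [] when the metallic free energy stays below the insulating one,
    otherwise the lower and upper crossing temperatures.
    """
    if rho is None:
        rho = 1.0 / bandwidth  # ASSUMPTION: flat-band estimate of the bare DOS
    eps = -bandwidth / 4.0                       # band energy per particle at n = 1
    phi0 = 1.0 - u_over_uc ** 2                  # Eq. (29b)
    gamma0 = 2.0 * math.pi ** 2 * k_b ** 2 * rho / 3.0
    a = gamma0 / (2.0 * phi0)                    # coefficient of T^2
    b = k_b * math.log(2.0)                      # coefficient of T
    c = (1.0 - u_over_uc) ** 2 * abs(eps)        # |E_G| from Eq. (29c)
    disc = b * b - 4.0 * a * c                   # a*T^2 - b*T + c = 0
    if disc < 0.0:
        return []
    root = math.sqrt(disc)
    return [(b - root) / (2.0 * a), (b + root) / (2.0 * a)]

for ratio in (0.5, 0.8, 0.9, 0.95):  # hypothetical values of U/Uc
    print(ratio, metal_insulator_crossings(ratio))
```

Between the two roots the linear term of Eq. (32) wins, so the sequence metal, insulator, metal is obtained on heating; outside that window, or when no real root exists, the metallic phase has the lower free energy.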

FIGURE 12 (a) Plots of the free energies for the paramagnetic Mott insulator (the straight line starting from the origin) and the correlated metal (the parabolas). The parabolic curves' points of crossing at L and J correspond to a discontinuous metal–insulator transition, while those crossing at K and M correspond to the reverse. (b) Schematic representation of the phase diagram between paramagnetic metallic (PM and PM′) and paramagnetic insulating (PI) phases. The points of crossing from (a) are also shown. The vertical arrow represents a sequence of the transitions shown in Fig. 10 for 0.005 ≲ x ≲ 0.018 and in the paramagnetic phase.

Now, a system in thermodynamic equilibrium assumes the state of lowest F. The condition for the transition from the metallic to the local-moment phase is specified by F = FI. The phase transition determined by this condition can be seen explicitly when we note that the free energy varies with T either parabolically [Eq. (31)] or linearly [Eq. (32)], depending on whether the system is in a paramagnetic metallic (PM) or a paramagnetic insulating (PI) phase. As illustrated in Fig. 12, several of those curves intersect at one or two points depending on the value of U/W. These intersection points determine the stability limits of the PM and PI phases. The lowest curve for the PM phase lies below the straight line for the PI state; there is no transition, that is, the metallic Fermi liquid state with the effective mass enhancement m∗/m0 ≃ 2.5 is stable at all temperatures. As U/W increases, the parabolas lie higher on the free energy (F/WN) scale and the possibilities for transitions open up. The higher two curves illustrate the case in which the intersections with the straight line occur at J and K and at L and M, respectively; at low and high temperatures the parabola lies below the straight line for FI/WN, so that the metallic phase is stable in those T regions. At intermediate temperatures, the PI phase is stable. The loci of the intersections move farther apart on the kBT/W scale as U/W is increased, as shown in Fig. 12b, where the phase boundaries are drawn; this part of the figure represents the temperature of the transitions (the intersection points in Fig. 12a) versus the relative magnitude of interaction U/W. We see that the PM phase is stable at low temperatures and that reentrant metallic behavior is encountered at high T. The explicit form of the curve in Fig. 12b is obtained from the coexistence condition F = FI, which leads to the following expression for the transition temperatures [29]:

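$$\frac{k_B T_\pm}{W} = \frac{3}{2\pi^2}\left[1 - \left(\frac{U}{U_c}\right)^2\right]\left\{\ln 2 \pm \left[(\ln 2)^2 - \frac{\pi^2}{3}\,\frac{1 - (U/U_c)}{1 + (U/U_c)}\right]^{1/2}\right\}. \tag{33}$$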

The root T− represents the low-temperature part, that is, that for kBT/W ≤ 0.049. The T+ part is the one above the point where both curves meet; this takes place at the lower critical value of U = Uℓc such that

$$\frac{U_{\ell c}}{U_c} = 1 - \frac{3}{\sqrt{2}}\,\frac{1}{(\rho|\varepsilon|)^{1/2}} \approx 0.75. \tag{34}$$

Below the value of U = Uℓc, the correlated Fermi liquid is stable at all temperatures. Ultimately, for 1.58 ≤ U/W ≤ 2.0, only one intersection (at low T) of the curves remains. This means that in this regime of U/W the reentrant metallic behavior is achieved gradually as the temperature increases. The above-described transitions are observed when changing the magnitude of interaction (U/W ratio). In the case of high-temperature superconductors we observe the transition from a Mott insulator to a superconductor as a function of doping (carrier concentration). This case is discussed next.
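The coexistence condition F = FI can be checked numerically. The following is a minimal sketch, not the authors' calculation: it assumes a rectangular bare band (ε = −W/4, ρ = 1/W), takes Φ0 = 1 − (U/Uc)² as implied by the prefactor of Eq. (33), and measures energies in units of W with kB = 1. The function names are illustrative only, and the sketch covers just the low-temperature forms of Eqs. (31) and (32).

```python
import math

LN2 = math.log(2.0)

def f_metal(u, t):
    """Metallic free energy per site in units of W, following Eq. (31).

    Assumptions: eps = -W/4, rho = 1/W, Phi0 = 1 - u**2, k_B = 1.
    u = U/U_c, t = k_B*T/W.
    """
    phi0 = 1.0 - u * u
    return -(1.0 - u) ** 2 / 4.0 - (math.pi ** 2 / 3.0) * t * t / phi0

def f_insulator(t):
    """Free energy per site of independent spin-1/2 moments, Eq. (32), in units of W."""
    return -t * LN2

def transition_temperatures(u):
    """Roots (t_minus, t_plus) of f_metal = f_insulator; None if the curves never cross."""
    disc = LN2 ** 2 - (math.pi ** 2 / 3.0) * (1.0 - u) / (1.0 + u)
    if disc < 0.0:
        return None
    prefactor = 3.0 / (2.0 * math.pi ** 2) * (1.0 - u * u)
    root = math.sqrt(disc)
    return prefactor * (LN2 - root), prefactor * (LN2 + root)

def lower_critical_ratio():
    """U/U_c at which the two roots merge (the square root vanishes)."""
    r = 3.0 * LN2 ** 2 / math.pi ** 2   # value of (1 - u)/(1 + u) at the merging point
    return (1.0 - r) / (1.0 + r)
```

Under these assumptions the two roots merge near U/Uc ≈ 0.745, in line with the value ≈ 0.75 quoted in the text; below that ratio `transition_temperatures` returns `None`, that is, the metallic curve stays below the straight line at all temperatures.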

Note added in August 2000. In recent years, the Mott localization in the limit of infinite dimension has been discussed extensively (e.g., Gebhardt, Ref. 30). A central peak is located between the Hubbard subbands, which carries the main part of the quasiparticle weight. There are two problems with the application of this solution to concrete systems. First, the upper critical dimensionality is not known for Mott systems. Second, the disappearance of the central peak at the localization threshold is being debated.

4. Strongly Correlated Electrons: Kinetic Exchange Interaction and Magnetic Phases in Three-Dimensional Space

In the limit W ≪ U, the ground state of the interacting electron system will be metallic only if the number of electrons Ne in the system differs from the number N of atomic sites. Only then can charge transport take place via the hole states in the lower Hubbard subband (for Ne < N), that is, when the transport of electrons can be represented via hopping from site to site, avoiding the doubly occupied configurations on the same site. This restriction on the motion of individual electrons is described above in terms of the band narrowing factor Φ, which, in the normal phase, is now of the order [28, 29] Φ = (1 − n)/(1 − n/2). This shows that the effective quasiparticle bandwidth W∗ ≡ WΦ is nonzero only if the number of holes δ ≡ 1 − n > 0.

For W ≪ U, there is one class of dynamic processes that is important in determining the magnetic interactions between strongly correlated itinerant electrons, namely, the virtual hopping processes, with the formation of a doubly occupied site configuration in the intermediate state. Such processes are depicted in Fig. 13, where one electron hops onto the site occupied by an electron with opposite spin and then hops back to the original site. During such processes, the electrons can exchange positions (which leads to the spin reversal of the pair with respect to the original configuration) or the same electron can hop back and forth. The corresponding effective Hamiltonian, including the virtual-hopping processes in first nontrivial order, has the form

$$H = \sum_{k\sigma} \Phi_\sigma \varepsilon_k n_{k\sigma} + \sum_{ij} \frac{2t_{ij}^2}{U}\left(\mathbf{S}_i \cdot \mathbf{S}_j - \frac{1}{4}\, n_i n_j\right), \tag{35}$$

where in general the band narrowing factor Φσ = (1 − n)/(1 − nσ), nσ = ⟨niσ⟩ is the average number of particles per site with the spin quantum number σ, and ni ≡ ni↑ + ni↓ is the operator of the number of particles on a given site i. Note that in the paramagnetic state nσ = n−σ = n/2, and Φσ reduces to Φ = (1 − n)/(1 − n/2), the value for the normal state.

FIGURE 13 Virtual hopping processes between singly and doubly occupied atomic sites that lead to an antiferromagnetic exchange interaction between the neighboring sites. This interaction is responsible for the antiferromagnetism in most of the Mott insulators.

One should note that the effective Hamiltonian [Eq. (35)] represents approximately the original Hubbard Hamiltonian for W ≪ U (for more precise treatment, see Ref. 31 and Section IV.A). When n → 1, Φ → 0, and Eq. (35) reduces to the Heisenberg Hamiltonian with antiferromagnetic interaction, which is the reason why most Mott insulators order antiferromagnetically. In the limit of a half-filled band, we also find that the effective bandwidth W∗ ≡ WΦ = 0, thus proving that the electrons in that case are localized on atoms. The nature of the wave function for these quasi-atomic states has not yet been satisfactorily analyzed, though some evidence given later shows that they should be treated as soliton states.
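The reduction to an antiferromagnetic Heisenberg coupling can be illustrated on the smallest possible system. The sketch below compares the exchange term of Eq. (35) for two singly occupied sites with the standard textbook result for the exact singlet energy of the two-site Hubbard model at half filling, E = [U − (U² + 16t²)^{1/2}]/2. That closed form is not taken from this article, and the assumption that the sum over i, j counts the pair twice, as (1, 2) and (2, 1), is ours.

```python
import math

def exact_singlet_energy(t, U):
    """Lowest singlet energy of the two-site Hubbard model at half filling (textbook result)."""
    return 0.5 * (U - math.sqrt(U * U + 16.0 * t * t))

def exchange_energy(t, U, total_spin):
    """Exchange term of Eq. (35) for two singly occupied sites (n_i = n_j = 1).

    Assumption: the double sum counts the pair twice, as (1, 2) and (2, 1).
    total_spin is 0 for the singlet and 1 for the triplet.
    """
    s_dot_s = 0.5 * total_spin * (total_spin + 1) - 0.75   # S1.S2 for two spins 1/2
    return 2.0 * (2.0 * t * t / U) * (s_dot_s - 0.25)
```

For t = 1 and U = 20 the exchange term gives −4t²/U = −0.2 for the singlet, within about 1% of the exact value, while the triplet energy is zero in both descriptions; the singlet (antiparallel) configuration is therefore favored.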

For n < 1, the renormalized band (the first term) and the exchange parts in Eq. (35) do not commute with each other. This means that, for such a narrow-band system of electrons, the spin dynamics influences the nature of the itinerant quasiparticle states of energies Φεk. What is even more striking is that, as n → 1, the two terms in Eq. (35) may contribute equally to the total energy. The critical concentration of electrons nc for which these two terms are comparable is

$$n_c \simeq 1 - \frac{1}{2z}\,\frac{W}{U},$$

with 1 − nc ∼ 0.02–0.05.

FIGURE 14 Commonly accepted magnetic phase diagram for strongly correlated electrons on the n–(W/U) plane.

In Fig. 14, we have plotted schematically the commonly accepted phase diagram for three-dimensional systems describing the possible magnetic phases on the plane n–(U/W). Close to the case of one electron per atom, the antiferromagnetic (AF) phase is stable for arbitrary strength of interaction. At intermediate filling, the ferromagnetic (F) phase may be stable. On the low-interaction side (W/U > 1), the ferromagnetic phase terminates at points where the Stoner criterion is met, that is, when ρ0(εF)U = 1, where ρ0(εF) is the value of the bare density of states (per spin) at the Fermi level εF.

Peculiar features appear in the corner where n ≈ 1 and W/U ≪ 1, that is, where the number of holes is small, so that the exchange interaction contribution to the total system energy is either larger than or comparable to the band energy part Φε. In such a situation, a mixed ferromagnetic–antiferromagnetic phase is possible [32]. When the number of holes is very small, each hole may form a magnetic polaron with a ferromagnetic cloud accompanying it: the hole is self-trapped within the cloud of ferromagnetic polarization it created. We consider those objects next.

B. Magnetic Polarons

1. The Classical Approach

It has been proved by Nagaoka [32] that in the limit W/U → 0 the ground state of the Mott insulator with one hole involves ferromagnetic ordering of spins. This is because in this limit the antiferromagnetic exchange term in Eq. (35) vanishes and the band energy is lowest when Φσ=↑ = 1 and Φσ=↓ = 1 − n. We can thus choose an equilibrium state with n↑ = n, n↓ = 0, that is, a state with all spins pointing up.

Mott, and Héritier and Lederer [23], have pointed out that if W/U is small but finite, a hole may locally create a ferromagnetic polarization of the spins in a sphere of radius R, surrounded by a reservoir of antiferromagnetically ordered spins. The situation is shown schematically in Fig. 15. The energy of such a hole accompanied by a cloud of saturated polarization can be estimated roughly as

$$E(R) = -\frac{W}{2} + \pi^2 |t| \left(\frac{a}{R}\right)^2 + \frac{4\pi}{3}\left(\frac{R}{a}\right)^3 \frac{z t^2}{U}, \tag{36}$$

where a is the lattice constant and t is the hopping integral tij between the z nearest neighbors. In this expression, the first term is the band energy of a free hole in a completely ferromagnetic medium, the second represents the kinetic energy loss due to the hole confinement, and the third involves the antiferromagnetic exchange energy penalty paid by polarizing the spins ferromagnetically within a volume 4πR³/3. Minimizing this equation with respect to R, we obtain the optimal number of spins contained in the cloud,

$$N = \frac{4\pi}{3}\left(\frac{\pi U}{W}\right)^{3/5}, \tag{37a}$$

and the polaron energy,

$$E_0 = -\frac{W}{2}\left[1 - \frac{5\pi^2}{3z}\left(\frac{W}{\pi U}\right)^{2/5}\right]. \tag{37b}$$

FIGURE 15 Representation of the magnetic polaron state, that is, one hole in the antiferromagnetic Mott insulator. This hole produces ferromagnetic polarization around itself and may become self-trapped.

Equation (36) holds for a three-dimensional system; for a planar system, the factor (4/3)π(R/a)³ in the last term should be replaced by the area π(R/a)². One then obtains the corresponding optimal values,

$$N = \pi\left(\frac{2\pi U}{W}\right)^{1/2} \tag{38a}$$

and

$$E_0 = -\frac{W}{2}\left[1 - \frac{2\pi^2}{z}\left(\frac{W}{2\pi U}\right)^{1/2}\right]. \tag{38b}$$

These size estimates will be needed later when discussing the hole states at the threshold for the transition from antiferromagnetism to superconductivity in high-Tc oxide materials. One should note that U/W must be appreciably larger than unity to satisfy the requirement N ≫ 1. In other words, the condition R ≫ a must be met, so that the spin subsystem (and the hole dynamics) may be treated in the continuous-medium approximation, the condition under which Eq. (36) can be derived.
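The minimization of Eq. (36) and of its planar analogue can be sketched as follows. This is an illustration only: it assumes the bare bandwidth W = 2z|t| (the relation under which the stationary radius of Eq. (36) reproduces Eqs. (37a) and (38a)), and the function names and the reduced radius r = R/a are our own notation.

```python
import math

def polaron_energy(r, t, U, z, dim=3):
    """E(R) of Eq. (36) with r = R/a; dim=2 replaces the volume by the area pi*r**2."""
    W = 2.0 * z * abs(t)                       # assumed bare bandwidth
    size = 4.0 * math.pi / 3.0 * r ** 3 if dim == 3 else math.pi * r ** 2
    return -W / 2.0 + math.pi ** 2 * abs(t) / r ** 2 + size * z * t * t / U

def optimal_radius(t, U, z, dim=3):
    """Stationary point of polaron_energy with respect to r."""
    W = 2.0 * z * abs(t)
    if dim == 3:
        return (math.pi * U / W) ** (1.0 / 5.0)
    return (2.0 * math.pi * U / W) ** (1.0 / 4.0)

def spins_in_cloud(t, U, z, dim=3):
    """Number of polarized spins N at the optimal radius."""
    r = optimal_radius(t, U, z, dim)
    return 4.0 * math.pi / 3.0 * r ** 3 if dim == 3 else math.pi * r ** 2
```

For example, `spins_in_cloud(1.0, 50.0, 6)` evaluates the three-dimensional estimate for a simple cubic lattice (z = 6) with U/W ≈ 4; because N grows only as a fractional power of U/W, the requirement N ≫ 1 is met only for large U/W.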

2. The Quantum Approach: Two Dimensions

The motion of a single hole in the Mott insulator is much more subtle than the formation of the polaron discussed above. Namely, if we consider n holes in the lower Hubbard subband, then the probability of electron hopping around is ≈n(1 − n), so effectively, the bandwidth of such itinerant states is Weff = zt(1 − n). For small n, we have W ≤ J, where J is the magnitude of the kinetic exchange. In the limit of a single hole the dynamics is determined by the magnitude of exchange interactions J, since Weff → 0. In effect, we have a hole moving slowly in the background of antiferromagnetically ordered spins. This picture seems to be a good representation of the hole motion in highly insulating magnetic oxides such as NiO and CoO. In high-temperature superconductors, by contrast, individual polaronic states must overlap appreciably (for nc ∼ 0.95) when the magnetic insulator → metal transition takes place. Therefore, some sort of homogeneous state must be formed in the metallic phase. This is particularly so since high-temperature superconductors evolve from a charge-transfer insulator, for which the gap for 2p → 3d^{n+1} (O2− → Cu1+) transitions is smaller than the Hubbard gap Δ ≃ U − W. In effect, the hole states are hybridized 3d–2p states (in proportions 2:1), not pure 3d states due to copper ions.

As a result, a few alternative pictures of the fermionic liquid of strongly correlated electrons in the normal (metallic) phase arise, ranging from the phenomenological pictures of the marginal Fermi liquid (Varma et al. [33a]) and the nearly antiferromagnetic Fermi liquid (NAFL) [33b, 33c] to a mean-field picture of strongly correlated electrons coupled to a gauge field (cf. Lee and Nagaosa [33d]). A separate class of models comprises cluster calculations including realistic structures of CuO2 planes (cf. Ref. 33e). Another class consists of bosonic models with preformed pairs of bound bipolarons [33f]. The latter class of models requires that the bipolaron radius is $R < a_0/x^{1/2}$, where x is the hole concentration and a0 is the Cu–Cu interatomic distance. This, in turn, requires a rather strong attractive interaction, which most probably can be furnished only by the combined effect of magnetic and rather strong electron–phonon interactions. Finally, there exists a substantial number of papers on numerical diagonalization for small clusters [33g]. A separate class of models are those involving the stripe structure [33h].

The lack of a microscopic theory of the normal-state properties translates into arbitrariness in selecting the pairing potential, as we shall see in the next section. The linear resistivity in the full temperature range at optimal doping [34a], the spin-gap existence [34b] in underdoped systems, and the anomalous (non-Drude) form of the optical conductivity all speak in favor of the non-Fermi-liquid (absence of quasiparticles) behavior of correlated electrons [34c] in two spatial dimensions. The role of disorder has not been explained properly either.

3. The Spin Liquid

The difference between an electron liquid of strongly correlated electrons (represented, for example, by the holes in the lowest Hubbard subband) and a Fermi liquid can be shown clearly in the limit of relatively high temperatures W∗ ≪ kBT ≪ U, where the quasiparticle band states with energies (Φεk) are populated equally, independent of their energy. Namely, if Ne electrons are placed into N available states of almost the same energy, then the number of configurations for a phase with excluded double occupancies of each state is [34]

$$\frac{2^{N_e}\, N!}{N_e!\,(N - N_e)!}. \tag{39a}$$

The first factor is the number of spin configurations for the singly occupied sites, while the second specifies the configurational entropy, that is, the number of ways to distribute Ne spinless particles among N states. This leads to molar entropy in the form

$$S_L = R[n \ln 2 - n \ln n - (1 - n)\ln(1 - n)], \tag{39b}$$

where n = Ne/N is the degree of subband filling and R is the gas constant. The above reduces to SL = R ln 2 for n = 1, that is, to the entropy of N spins (1/2) on the lattice. In contrast, in a Fermi liquid that obeys the Fermi–Dirac distribution, double occupancies are not excluded, as illustrated in Fig. 16a. The corresponding number of configurations is then

$$\binom{N}{N_e/2}^2 = \left[\frac{N!}{(N_e/2)!\,(N - N_e/2)!}\right]^2, \tag{40a}$$

with the corresponding molar entropy,

$$S_F = R[2\ln 2 - n \ln n - (2 - n)\ln(2 - n)]. \tag{40b}$$

FIGURE 16 Schematic representations of the difference in the k-space occupation for ordinary fermions (a) and strongly correlated electrons (b). The spin subbands with σ = ↑ and ↓ are drawn. Note that the holes drawn in (b) do not appear; they are shown only to indicate the single occupancy of each single-particle state. The position of the Fermi level is different for the same number of electrons in the two situations. [From Ref. 34.]


Hence, for n = 1, SF = 2SL = 2R ln 2. One should emphasize that only the value for SL reproduces correctly the entropy of N localized paramagnetic spins (the electronic part of the entropy for magnetically disordered states of the Mott insulator). Hence, in accord with intuitive reasoning, the Fermi–Dirac distribution, which allows for double state occupancy, cannot be applied to a strongly correlated electron liquid, which we call a spin liquid. The state of such a liquid reduces to that of the spin system on the lattice if N = Ne (for the Fermi-liquid case, the ground state is then a metal with a half-filled band).
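The two counting arguments can be compared numerically. The sketch below evaluates the logarithm of the configuration counts of Eqs. (39a) and (40a) for a large but finite lattice and compares it with the Stirling-limit entropies of Eqs. (39b) and (40b), both in units of R. The helper names are illustrative, and the Fermi-liquid count is taken as the single-spin binomial coefficient squared.

```python
import math

def log_configs_spin_liquid(N, Ne):
    """ln of Eq. (39a): 2**Ne * N! / (Ne! (N - Ne)!)."""
    return (Ne * math.log(2.0) + math.lgamma(N + 1)
            - math.lgamma(Ne + 1) - math.lgamma(N - Ne + 1))

def log_configs_fermi_liquid(N, Ne):
    """ln of Eq. (40a): binomial count for one spin direction, squared."""
    half = Ne / 2.0
    return 2.0 * (math.lgamma(N + 1) - math.lgamma(half + 1)
                  - math.lgamma(N - half + 1))

def xlogx(x):
    """x ln x with the limiting value 0 at x = 0."""
    return x * math.log(x) if x > 0.0 else 0.0

def s_spin_liquid(n):
    """Eq. (39b) in units of R."""
    return n * math.log(2.0) - xlogx(n) - xlogx(1.0 - n)

def s_fermi_liquid(n):
    """Eq. (40b) in units of R."""
    return 2.0 * math.log(2.0) - xlogx(n) - xlogx(2.0 - n)
```

At n = 1 the two functions return ln 2 and 2 ln 2, respectively, which is the factor of two discussed above; for N of order 10⁶ the finite-lattice counts divided by N agree with the Stirling forms to better than 10⁻⁴.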

One should now ask how these results may be generalized to handle the regime of low temperatures and of an arbitrary number of holes. One observes that in Fig. 8 the band states for U ≫ W are split for any arbitrary degree of band filling [cf. also Eq. (28b)]. Therefore, in enumerating the distribution of particles in the lower Hubbard subband, one must exclude double occupancies of the same energy (ε) state. Since the quasiparticle energy is labeled by the wave vector k, one can equivalently exclude the double occupancies of a given state |k⟩. Under this assumption, the statistical distribution is given by [34]

$$n_{k\sigma} = (1 - n_{k-\sigma})\,\frac{1}{1 + \exp[\beta(E_{k\sigma} - \mu)]}, \tag{41a}$$

where β = 1/(kBT), nkσ is the average occupancy of the state |kσ⟩, and µ is the chemical potential that is determined from the conservation of the total number of particles

$$N_e = \sum_{k\sigma} n_{k\sigma}. \tag{41b}$$

The corresponding molar entropy is now given by

$$S_L = -\frac{R}{N}\sum_k \left[(1 - n_k)\ln(1 - n_k) + n_{k\uparrow}\ln n_{k\uparrow} + n_{k\downarrow}\ln n_{k\downarrow}\right], \tag{41c}$$

with nk = nk↑ + nk↓.

One should note that the distribution function [Eq. (41a)] differs from the ordinary Fermi–Dirac formula by the factor (1 − nk−σ), which expresses the conditional probability that there should exist no second particle in the state k(−σ) if the state kσ is to be occupied by an electron, as shown in Fig. 16b. If Ekσ ≡ Ek (that is, when the particle energy does not depend on its spin direction), Eq. (41a) reduces to

$$n_k = \frac{1}{1 + \frac{1}{2}\exp[\beta(E_k - \mu)]}. \tag{41d}$$

This is the same type of formula that applies to the occupation number of simple donors, if the index k is dropped and ε represents the position of the donor level with respect to the bottom edge of the conduction band.

FIGURE 17 Comparison of the Fermi–Dirac and Boltzmann distributions for nkσ with that for strongly correlated electrons (the spin-liquid phase); the total occupancy nk = nk↑ + nk↓ is taken in the latter case.

At T = 0, each state is singly occupied. This is the principal feature by which the present formula differs from the Fermi–Dirac distribution at T = 0, as illustrated in Fig. 17. The distribution [Eq. (41a)] leads to a doubling of the volume enclosed by the Fermi surface in the spin-liquid state compared to the Fermi-liquid state. At low temperatures, application of the distributions [Eq. (41a) or (41d)] yields Fermi-liquid-like properties: a linear T dependence of the specific heat (of large magnitude if n → 1) and of the entropy. At high temperatures, the new distribution leads to entropy of the form of Eq. (39b) and local-moment behavior in the form of the Curie–Weiss law for susceptibility. Hence, the properties of the spin liquid governed by the distribution [Eq. (41a) or (41d)] interpolate between those of a metal and those of local moments. Such behavior is observed in many correlated systems, for example, in heavy fermions.
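The step from Eq. (41a) to Eq. (41d) can be verified directly. In the sketch below, the spin-independent case nk↑ = nk↓ is solved from Eq. (41a) by fixed-point iteration and compared with the closed form of Eq. (41d); the variable x = β(Ek − µ) and the function names are our own shorthand.

```python
import math

def fermi(x):
    """Ordinary Fermi-Dirac occupancy per spin; x = beta * (E_k - mu)."""
    return 1.0 / (1.0 + math.exp(x))

def n_spin_liquid(x):
    """Total occupancy n_k of Eq. (41d)."""
    return 1.0 / (1.0 + 0.5 * math.exp(x))

def n_spin_liquid_iterated(x, steps=200):
    """Solve Eq. (41a) with n_k-up = n_k-down by iterating n <- (1 - n) * f(x)."""
    f = fermi(x)
    n_sigma = 0.25                      # arbitrary starting guess
    for _ in range(steps):
        n_sigma = (1.0 - n_sigma) * f
    return 2.0 * n_sigma
```

Deep below the chemical potential (x → −∞) `n_spin_liquid` tends to 1, whereas the Fermi–Dirac total `2 * fermi(x)` tends to 2, which is the single versus double occupancy contrasted in Fig. 17.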

One should note that the entropy expression [Eq. (41c)] can be rewritten for the paramagnetic state in the following form:

$$S_L = nR\ln 2 - \frac{R}{N}\sum_k \left[n_k \ln n_k + (1 - n_k)\ln(1 - n_k)\right]. \tag{42}$$

The first part represents the entropy of spin moments; the second, the entropy of spinless fermions. An alternative decomposition has been put forward [35] in which the dynamics of correlated electrons is decomposed into that of neutral fermions called spinons and the charged bosons called holons. Within this picture, the onset of superconductivity is considered as a combined effect of Bose condensation of the holons with the simultaneous formation of a coherent paired state by the fermion counterpart [36–38]. This problem is discussed in more detail in Section IV.A.

The above treatment of the spin liquid deals only with its statistical properties in the U → ∞ limit. The problem now arises as to what happens when the spin part, of the form of the second term in Eq. (35), is explicitly included. The problem of the resultant quantum ground state of holes in a Mott insulator is a matter of intensive debate [36–38]. The state called the resonating valence-bond (RVB) state has been invoked [36] specifically to deal with this problem; this state is shown schematically in Fig. 18 for the case without holes (a) and with one hole (b). The connecting lines represent bonds, across which the two electrons form spin-singlet pairs. The resonating nature of bonds is connected with the idea that the RVB ground state is a coherent superposition of all such paired configurations. The dynamic nature of this spin dimerization is connected with the terms $(S_i^+ S_j^- + S_i^- S_j^+)$ in the exchange part of the Hamiltonian [Eq. (35)]. There is the possibility that the RVB state [which, for obvious reasons, differs from the ordinary (Néel) antiferromagnet] is a ground state for the CuO2 planes in high-Tc oxides, such as La2−xSrxCuO4, where the long-range magnetic order is destroyed for x ≈ 0.02–0.03. We return to this problem in Section IV when discussing the boundary line between antiferromagnetism and superconductivity for high-Tc oxides.

FIGURE 18 Schematic representation of singlet-spin pairing forming the RVB state. All paired configurations should be taken to calculate the actual ground state. (a) The RVB state for the Mott insulator; (b) that with one hole. The latter case will contain an unpaired spin, as indicated.

C. Hybridized Systems

Most of the strongly correlated systems are encountered in oxides and in several classes of organic and inorganic compounds. In oxides the 3d orbitals of cations such as Cu2+ and Ni2+ hybridize with the 2p orbitals of oxygen, particularly if the atomic 3d states are energetically close to the 2p states. The properties of correlated and hybridized states can be properly discussed in terms of the Anderson lattice model Hamiltonian, which is of the form

$$H = \varepsilon_f \sum_{i\sigma} N_{i\sigma} + \sum_{k\sigma} \varepsilon_k n_{k\sigma} + U \sum_i N_{i\uparrow} N_{i\downarrow} + \frac{1}{\sqrt{N}} \sum_{ik\sigma} \left(V_k e^{i\mathbf{k}\cdot\mathbf{R}_i}\, a^{+}_{i\sigma} c_{k\sigma} + \text{H.C.}\right). \tag{43}$$

In this Hamiltonian, the first term describes the energy of atomic electrons positioned at εf, the second represents the energy of band electrons, the third represents the intraatomic Coulomb repulsion between two electrons of opposite spins, and the last describes the mixing of atomic with band electrons due to the energetic coincidence (degeneracy) of those two sets of states (H.C. refers to the Hermitian conjugate part of the hybridization part). In heavy fermions, the atomic states are 4f states, whereas they are 3d states of Cu2+ ions in high-Tc systems; the band states are 5d–6s and 2p states, respectively. Note that $N_{i\sigma} = a^{+}_{i\sigma} a_{i\sigma}$ and $n_{k\sigma} = c^{+}_{k\sigma} c_{k\sigma}$ are the numbers of particles on given atomic (i) or k states, respectively. In this Hamiltonian, the following parameters appear: the atomic-level position εf, the width W of starting band states with energies {εk}, the magnitude U of the Coulomb repulsion for two electrons located on the same atomic site, and the degree of hybridization (mixing), Vk, characterized by its magnitude V.

Two completely different situations should be distinguished from the outset: (1) U > W > |εf| ≫ |V|, and (2) U > W > |V| ≳ |εf|. Case 1 applies when the starting (bare) atomic level is placed deeply below the Fermi level and the atomic states admix weakly to the band states. In case 2, the hybridization is large and is responsible for strong mixing of the two starting sets of states. The band structure corresponding to the hybridized band states in the absence of electron–electron interactions (that is, U = 0) is depicted in Fig. 19. We observe a small gap in the hybridized band structure; it occurs around the bare atomic level position εf and separates two hybridized bands. Those two bands, which have the energies

$$E_{k\pm} = \frac{\varepsilon_k + \varepsilon_f}{2} \pm \left[\left(\frac{\varepsilon_k - \varepsilon_f}{2}\right)^2 + |V_k|^2\right]^{1/2},$$

correspond to the bonding and antibonding types of states in molecular systems. The structure of the hybridized bands is demonstrated explicitly in Fig. 20 by the DOS for each band. One sees that strongly peaked structures occur in the regions near the gap. If the Fermi level falls within these peaks, a strong enhancement of the effective mass should take place solely because of these peculiarities of the band structure. In some situations only a pseudogap caused by the hybridization is formed, as shown in Fig. 21. This is so if the hybridization matrix element V depends on the wave vector k and if, along some directions in reciprocal space, Vk = 0.

FIGURE 19 Schematic representation of the hybridized bands with energies Ek±, which are formed by mixing the band states (with energy εk) and atomic states (located at ε = εf). The original band has width W, much wider than the peaked structure, of width W∗.
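The hybridized band energies at U = 0 follow from a two-level problem at each k. The sketch below assumes the usual two-level form E± = (εk + εf)/2 ± {[(εk − εf)/2]² + |Vk|²}^{1/2}, that is, the eigenvalues of the 2 × 2 matrix with εk and εf on the diagonal and Vk off the diagonal; the function name is illustrative.

```python
import math

def hybridized_bands(eps_k, eps_f, v_k):
    """Return (E_minus, E_plus) at one k point for U = 0 (assumed two-level form)."""
    mean = 0.5 * (eps_k + eps_f)
    half_diff = 0.5 * (eps_k - eps_f)
    split = math.sqrt(half_diff ** 2 + abs(v_k) ** 2)
    return mean - split, mean + split
```

Where the bare band crosses the atomic level (εk = εf) the two branches are separated by 2|Vk|, so the gap closes along any direction in which Vk = 0; this is the pseudogap situation described above.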

The inclusion of the interaction term in Eq. (43) renders the treatment of the Anderson lattice Hamiltonian much more complicated; up to now this problem has not been solved rigorously. A large variety of approximate treatments has been proposed and reviewed recently [39–42], in all of which the principal task was to provide a satisfactory description of heavy-fermion materials [43]. In effect, the limiting case of almost localized strongly correlated electrons was studied, which, among others, provides a quasiparticle electronic structure similar to that shown in Fig. 20, with a very strong enhancement of the DOS near the Fermi surface. This leads to very heavy quasiparticles, which, in some systems, may undergo transitions either to antiferromagnetic or to superconducting states. In this respect, heavy-fermion materials are analogous to high-Tc systems, though with much lower transition temperatures.

FIGURE 20 Density ρ(ε) of hybridized states versus particle energy ε. Note that the hybridization gap Δh may be very small compared to the total width of the band states. The position of the Fermi level εF corresponds to the filled lower band.

FIGURE 21 Same as Fig. 20 but with the pseudogap among the hybridized bands.

D. The Electronic States of Superconducting Oxides

The high-Tc superconducting oxides, such as La2−xSrxCuO4 (the so-called 214 compounds) and YBa2Cu3O7−δ (the so-called 123 compounds), have one common structural unit: the quasi-two-dimensional structure that is approximated by CuO2 planes, one of which is shown schematically in Fig. 22. We discuss mainly the role of these planes since it is widely accepted that the electronic properties of these subsystems are the main factor determining the observed superconductivity, antiferromagnetism, and localization effects in those materials. In stoichiometric La2CuO4 or YBa2Cu3O7, the formal valence of Cu is 2+, that is, it corresponds to a one-hole (3d^9) electron configuration. In a strictly cubic structure, with Cu2+ surrounded by O2− ions in an octahedral arrangement, the highest band is doubly degenerate and of eg symmetry, that is, composed of $d_{x^2-y^2}$ and $d_{3z^2-r^2}$ orbitals. However, in high-Tc materials, the octahedra are largely elongated in the direction perpendicular to the CuO2 planes, so that the bands are further split; it is commonly assumed that the antibonding orbital $d_{x^2-y^2}$ is higher in energy and hence half-filled. These d states hybridize with the oxygen 2px and 2py orbitals of σ type, as shown schematically in Fig. 23; both the bonding and the antibonding configurations are shown; the latter corresponds to the signs of the two p orbitals shown in parentheses.

FIGURE 22 Schematic representation of the CuO2 planes in superconducting oxides in the tetragonal phase. The Cu–O distance is ≈1.9 Å for La2CuO4.

A simple description of the electronic states for the planar CuO2 system is obtained by introducing a single band representing Cu d electrons in the tight-binding approximation. For the square configuration of the Cu atoms (which reflects the tetragonal structure of La2CuO4), such a dispersion of band energies has the form

$$\varepsilon_k = 2t(\cos k_x a + \cos k_y a), \tag{44}$$

where t is the so-called hopping or Bloch integral ⟨i|V|j⟩ between the nearest neighboring ions i and j, and a is the Cu–Cu distance. For La2CuO4 and YBa2Cu3O6.5, this band is half-filled, with the Fermi surface for bare (noninteracting) electrons determined from the condition εk = µ = 0. As shown in Fig. 24, this leads to a square in reciprocal space connecting the points (π/a)(±1, 0) with the points (π/a)(0, ±1). The oxygen electrons in the 2p states are regarded as playing only a passive role of a transmitter of the individual d electrons from one $d_{x^2-y^2}$ state to its neighbor (note that the O2− valence state has completely filled p shells). If the number of electrons in that band is decreased (for example, by substituting Sr for La in 214 compounds), then the Fermi surface shrinks and gradually transforms into a circle, as shown in Fig. 24 [44]. Within such a model, La2CuO4 should be metallic. However, at T < TN ≃ 240 K, this compound orders antiferromagnetically [24], and the ground state is then insulating. The fact that this system remains insulating above the Néel temperature TN means that the stoichiometric La2CuO4 and YBa2Cu3O6.5 are Mott insulators, not Slater split-band antiferromagnets; for the latter, the split-band structure for T < TN should coalesce into one band as T → TN. The presence of the paramagnetic insulating state for both La2CuO4 and YBa2Cu3O6+δ supports the view that those oxides should be regarded as narrow-band systems characterized by strong electron–electron interactions (U > Uc), as originally proposed by Anderson [45]. An antiferromagnetic ground state is then expected since the kinetic exchange interaction between the strongly correlated electrons takes place [31].

FIGURE 23 The configuration of the $3d_{x^2-y^2}$ and pσ orbitals for bonding configurations. The reverse signs for the two p orbitals (that is, those in parentheses) represent the hybridized configuration for the antibonding state.

FIGURE 24 The shape of the two-dimensional Fermi surface for band energy of the form of Eq. (44). The values specified represent µ/t as a parameter. The square shape corresponds to µ = 0 or, equivalently, to n = 1. [From Ref. 44.]
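The square Fermi surface at half filling can be checked numerically from Eq. (44). The sketch below evaluates the dispersion along the line joining (π/a)(1, 0) and (π/a)(0, 1) and estimates the band filling for a given µ by counting occupied states on a k grid. The grid size, the choice t > 0, and filling the band from its bottom at T = 0 are assumptions of the sketch, not statements from the text.

```python
import math

def band_energy(kx, ky, t=1.0, a=1.0):
    """Tight-binding dispersion of Eq. (44)."""
    return 2.0 * t * (math.cos(kx * a) + math.cos(ky * a))

def filling(mu, t=1.0, a=1.0, grid=200):
    """Electrons per site (both spins) with all states eps_k < mu occupied at T = 0.

    Assumptions: t > 0 and a uniform grid of midpoints over the first Brillouin zone.
    """
    occupied = 0
    step = 2.0 * math.pi / (a * grid)
    for i in range(grid):
        kx = -math.pi / a + (i + 0.5) * step
        for j in range(grid):
            ky = -math.pi / a + (j + 0.5) * step
            if band_energy(kx, ky, t, a) < mu:
                occupied += 1
    return 2.0 * occupied / (grid * grid)
```

With µ = 0 the count gives n ≈ 1 (one electron per site), and lowering µ reduces the filling, which corresponds to the shrinking Fermi surface obtained on hole doping.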

A principal problem appears when holes occur in the Mott insulator, that is, when we consider a real situation, La2−xSrxCuO4 or YBa2Cu3O6.5+x. We have already seen (at the end of Section III.A) that for small x the kinetic energy of the holes and the exchange energy of electrons may become comparable, or the latter may become even larger than the former. In such situations, the motion of the holes will be influenced by the setting-in of almost-instantaneous spin–spin correlations. This means that such metallic states (if formed) cannot be regarded as the Fermi liquid with slowly evolving spin fluctuations; instead, the resonance between various spin configurations must be built into the electron wave function characterizing its itinerant state. The decomposition of the resonating spin configurations into spin pair-singlet configurations constitutes an important characteristic of the RVB theory of the normal state [36, 45]. Some experimental evidence for the quantum spin-liquid state above the Néel temperature has been provided by neutron quasi-elastic scattering [46]; these results were subsequently interpreted [47].

The interpretation of the metallic state in terms of a single, narrow band requires the presence of both 3d9 (Cu2+) and 3d8 (Cu3+) states. Most of the X-ray spectroscopic studies [48] conclude that the satellite peak corresponding to a 3d8 configuration is actually absent. Therefore, to explain both the insulating properties of La2CuO4 and the metallic properties of La2−xSrxCuO4 for x ≳ 0.04–0.05, one introduces hybridized 2p–3d states for the holes introduced by the doping. The proper model of such states is then the Anderson lattice type of model [Eq. (43)]. Band-structure calculations by Mattheiss [49] for La2−xSrxCuO4, shown in Fig. 25, justify a reasonable description within a simple two-dimensional tight-binding model with only the Cu $3d_{x^2-y^2}$ and pσ orbitals on oxygens taken into account. Namely, the structures denoted A and B in Fig. 25 correspond, respectively, to antibonding and bonding hybridized bands, with respective band energies

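$$E_{k\pm} = \frac{\varepsilon_p + \varepsilon_d}{2} \pm \left[\left(\frac{\varepsilon_p - \varepsilon_d}{2}\right)^2 + 4V^2\left(\sin^2\frac{k_x a}{2} + \sin^2\frac{k_y a}{2}\right)\right]^{1/2}, \tag{45}$$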

where εp and εd are atomic level positions for the 2p and 3d states, respectively. Detailed calculations [49] lead to a nonzero bandwidth of the 2p band because of the p–p overlap; then εp → εp + εk, with

FIGURE 25 Energy bands for La2CuO4 calculated within a local-density approximation for an assumed body-centered tetragonal crystal structure. A portion of the x–y plane in the extended Brillouin zone scheme is shown in the inset. Portions B and A correspond to the bonding and antibonding parts of the hybridized band discussed in the text. [From Ref. 49.]

$$\varepsilon_k = 2t_p\left[\cos\left(k_x a/\sqrt{2}\right) + \cos\left(k_y a/\sqrt{2}\right)\right].$$
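The two hybridized branches of Eq. (45) can be evaluated directly. The following is a minimal sketch, not the band-structure calculation of Ref. 49: it reads Eq. (45) with the mean level (εp + εd)/2 and a ± sign in front of the square root, uses the parameter magnitudes quoted in the text (|εp − εd| = 3.6 eV, |V| = 1.4 eV), and places εd at zero, which is an arbitrary choice made here. The function name is hypothetical.

```python
import numpy as np

def hybridized_bands(kx, ky, eps_p, eps_d, V, a=1.0):
    """Bonding (lower) and antibonding (upper) energies in the form of Eq. (45)."""
    mean = 0.5 * (eps_p + eps_d)
    half_gap = 0.5 * (eps_p - eps_d)
    mixing = 4.0 * V**2 * (np.sin(0.5 * kx * a)**2 + np.sin(0.5 * ky * a)**2)
    root = np.sqrt(half_gap**2 + mixing)
    return mean - root, mean + root

# Illustrative values in eV; putting eps_d at zero is an assumption of this sketch.
eps_d, eps_p, V = 0.0, -3.6, 1.4
lower_G, upper_G = hybridized_bands(0.0, 0.0, eps_p, eps_d, V)        # zone center
lower_M, upper_M = hybridized_bands(np.pi, np.pi, eps_p, eps_d, V)    # zone corner
print(lower_G, upper_G)   # the bare levels: hybridization vanishes at k = 0
print(lower_M, upper_M)   # the branches repel most strongly at the zone corner
```

At the zone center the mixing term vanishes and the two branches coincide with the atomic levels; away from it the bonding branch is pushed down and the antibonding branch up.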

The band-structure calculations should be regarded as providing input parameters for the parametrized models which include electron correlations more accurately. On the basis of various estimates [50] of those parameters, one can assume that they fall in the range

|εp − εd| ≃ 3.6 eV, |V| ≈ 1.3–1.5 eV, |t| ≃ 0.5 eV, |tp| ≃ 0.6 eV, and U ≃ 8–10 eV.

From these estimates of the parameters, one sees that |V| ∼ |εp − εd|. Hence, one may not be able to use the perturbation expansion in V/(εp − εd) of the Anderson lattice Hamiltonian. Such a perturbation expansion was used [51] when transforming the hybridized model represented by the p–d Hamiltonian into an effective narrow-band model, which is represented by the effective Hamiltonian for the electrons in the CuO2 plane,

$$H = t\sum_{i,j=\mathrm{n.n.},\,\sigma} \tilde a^{+}_{i\sigma}\tilde a_{j\sigma} + t'\sum_{i,j=\mathrm{n.n.n.},\,\sigma} \tilde a^{+}_{i\sigma}\tilde a_{j\sigma} + J\sum_{i,j=\mathrm{n.n.}}\left(\mathbf{S}_i\cdot\mathbf{S}_j - \tfrac{1}{4}n_i n_j\right),$$

where t and t′ are the hopping integrals between the nearest (n.n.) and the next-nearest (n.n.n.) neighbors, J is the value of the exchange integral for the kinetic superexchange, and


the tilded operators are projected onto the subspace of singly occupied lattice sites. The three parameters take the values t ≃ −0.5 eV, t′ ≃ 0.05–0.1 eV, and J ≃ 0.13 eV.

On the basis of the facts that the present-day band calculations do not provide paramagnetic insulating states for stoichiometric materials, such as La2CuO4, and that the antiferromagnetic ground state is difficult to achieve within the local-density approximation [49], we conclude that an approach based on the parametrized models discussed in the preceding two sections should be treated in detail. The microscopic parameters obtained from the band-structure calculations should be treated as input parameters in those models. A review of properties obtained within the parametrized models and relevant to high-Tc systems is given in Section IV.

IV. NOVEL MECHANISMS OF ELECTRON PAIRING

The binding of two fermions into either a bosonic or a more complicated bound state is a prerequisite for the condensation of microscopic particles into a coherent (superfluid) macroscopic state. This condensation may take the form of Bose–Einstein condensation if the interaction energy between the pairs is much lower than the binding energy of a single pair (also, the pairs must be well separated spatially). Such a Bose-condensed state of charged particles may exhibit the principal properties of the superconducting state such as the Meissner–Ochsenfeld effect [52]. In the BCS theory (discussed in Section II), pair condensation occurs under a completely different condition, namely, when the states of different pairs overlap strongly so that the motion of one widely separated pair takes place in the mean field of almost all other pairs.

The pairing of particles in the BCS theory is described in momentum (reciprocal) space, where it is assumed that the quasiparticle states with a well-defined Fermi surface are formed first; the pairing involves electrons from the opposite points on the Fermi surface (k, −k) and generates either a simple spin-singlet state (as in the classic superconductors) or a higher angular-momentum state, e.g., L = S = 1 (as in superfluid 3He [53]). Because of the small coherence length (ξ ∼ 10 Å), the new superconductors offer an opportunity for exploring the possibility of pairing in real (coordinate) space. Moreover, since the carrier concentration determined from the Hall-effect measurements [54] for high-Tc oxides is at least one order of magnitude lower than that for ordinary metals, it is tempting to describe the onset of the superconducting state as a Bose condensation of preexisting pairs. In fact, the situation is not that simple. For example, in La2−xSrxCuO4, with x = 0.04, the average distance between holes in the normal phase is ≈5a, a magnitude comparable to ξ. These circumstances, combined with antiferromagnetism and localization effects, render the new superconducting materials unique in the sense that their description requires a unification of theoretical approaches to phenomena previously regarded as disparate.

The accumulated evidence for rather strong electron–electron interaction in high-Tc oxides [36, 48] and in heavy-fermion systems [39–43] makes it unlikely that electron pairing in these materials is caused by extremely strong electron–phonon interaction. Furthermore, the electron–phonon interaction does not allow for a connection (or, strictly speaking, competition) between the observed superconductivity and antiferromagnetism [24]. This is one of the reasons for an intensive search for a purely electronic mechanism of pairing. We now discuss some of the mechanisms that have been proposed. The main emphasis so far has been placed on an exchange-mediated pairing for strongly correlated electrons [45], since for such systems, the pairing, antiferromagnetism, and MITs to the Mott localized phase are derived from a single theoretical scheme. The latter two phases have been discussed in Section III; here, we concentrate on the spin-singlet pairing among strongly correlated electrons. Later we discuss charge-transfer- and phonon-mediated pairings. Finally, we classify the types of correlated states and metallic states in solids. This classification provides a concise way of characterizing specific properties of these systems by which the almost-localized systems differ from ordinary metals.

A. Exchange Interactions and the Real-Space Pairing

1. Narrow-Band Systems

In Section III.A we provided an approximate Hamiltonian [Eq. (34)], which includes the antiferromagnetic exchange interactions between the correlated electrons in the limit U/W ≫ 1. The precise form of this Hamiltonian to second order in W/U is [31, 55]

$$H = {\sum_{ij\sigma}}' t_{ij}\, b^{+}_{i\sigma} b_{j\sigma} + {\sum_{ij}}' \frac{2t_{ij}^2}{U}\left(\mathbf{S}_i\cdot\mathbf{S}_j - \tfrac{1}{4}\nu_i\nu_j\right) + (\text{three-site terms}), \tag{46}$$

where the primed summation means that i ≠ j. In this Hamiltonian, doubly occupied site configurations |i↑↓〉 are excluded. This exclusion is reflected by the presence of creation ($b^{+}_{i\sigma}$) and annihilation ($b_{i\sigma}$) operators for electrons in the state |iσ〉, which are defined as


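$$b^{+}_{i\sigma} \equiv a^{+}_{i\sigma}(1 - n_{i-\sigma}) \quad\text{and}\quad b_{i\sigma} \equiv a_{i\sigma}(1 - n_{i-\sigma}), \tag{47}$$

so that

$$\nu_{i\sigma} = b^{+}_{i\sigma} b_{i\sigma} \quad\text{and}\quad \nu_i = \sum_{\sigma}\nu_{i\sigma}. \tag{48}$$

The spin operator is defined as

$$\mathbf{S}_i \equiv \left(S^{+}_i, S^{-}_i, S^{z}_i\right) \equiv \left[a^{+}_{i\uparrow}a_{i\downarrow},\; a^{+}_{i\downarrow}a_{i\uparrow},\; (n_{i\uparrow} - n_{i\downarrow})/2\right].$$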

Note that the same representation of the operator $\mathbf{S}_i$ can be written in terms of the projected operators $b^{+}_{i\sigma}$ and $b_{i\sigma}$. The factor $(1 - n_{i-\sigma})$ in Eq. (47) imposes explicitly the restriction that the creation or the annihilation of electrons in the state |iσ〉 can take place only if there is no second electron already on the same site. Thus, $\nu_i = \sum_{\sigma} n_{i\sigma}(1 - n_{i-\sigma})$ enumerates only the singly occupied sites (νi = 0 or 1). In other words, the N states corresponding to the doubly occupied site configurations have been projected out. Thus, Eq. (46) describes the dynamics of strongly correlated electrons for a number of electrons Ne ≤ N. Also, in performing the summations in Eq. (46), one usually considers only the pairs 〈ij〉 of nearest neighbors; in this approximation the parameters $J_{ij} = J$ and $t_{ij} = t$ can be chosen as constants.
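The action of the projection factor in Eq. (47) can be made concrete on a single site. The sketch below builds the four-dimensional Fock space of one site with a Jordan-Wigner matrix representation (the helper `fermion_ops` and all variable names are hypothetical, introduced only for this illustration) and checks that ν counts the doubly occupied state as zero and that the projected operators do not satisfy the ordinary fermion anticommutation relation.

```python
import numpy as np

def fermion_ops(n_modes):
    """Jordan-Wigner matrices for the annihilation operators of n_modes fermion modes."""
    a = np.array([[0.0, 1.0], [0.0, 0.0]])   # maps occupied |1> to empty |0>
    z = np.diag([1.0, -1.0])                 # parity factor (-1)^n
    one = np.eye(2)
    ops = []
    for m in range(n_modes):
        op = np.eye(1)
        for factor in [z] * m + [a] + [one] * (n_modes - m - 1):
            op = np.kron(op, factor)
        ops.append(op)
    return ops

a_up, a_dn = fermion_ops(2)                  # one site, modes ordered (up, down)
I4 = np.eye(4)
n_up, n_dn = a_up.T @ a_up, a_dn.T @ a_dn    # real matrices: transpose = adjoint

b_up = a_up @ (I4 - n_dn)                    # projected operators in the form of Eq. (47)
b_dn = a_dn @ (I4 - n_up)
nu = b_up.T @ b_up + b_dn.T @ b_dn           # nu_i in the form of Eq. (48)

# Basis order |n_up n_dn> = |00>, |01>, |10>, |11>
print(np.diag(nu))                             # the doubly occupied state counts as 0
anticommutator = b_up @ b_up.T + b_up.T @ b_up
print(np.allclose(anticommutator, I4 - n_dn))  # {b, b+} = 1 - n_dn, not the identity
```

The second check illustrates why the projected operators are awkward to work with: their anticommutator depends on the occupation of the opposite spin state.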

The first term in Eq. (46) describes the single-particle hopping of electrons from the singly occupied to the empty atomic sites; the second describes the exchange interaction induced by virtual hopping between site i and site j, while the three-site part describes the motion of an electron with spin σ from the singly occupied site located at i to the next-nearest-neighboring empty site k via the occupied configuration (with an electron of opposite spin) located at site j. The various contributions to Eq. (46) are represented graphically in Fig. 26.

If one introduces a new pair of creation and annihilation operators in coordinate space by

$$b^{+}_{ij} = \frac{1}{\sqrt{2}}\left(b^{+}_{i\uparrow}b^{+}_{j\downarrow} - b^{+}_{i\downarrow}b^{+}_{j\uparrow}\right) \tag{49a}$$

FIGURE 26 Various hopping processes in narrow-band systems in a partial band-filling case: (a) virtual hopping processes leading to a kinetic exchange interaction; (b) single-particle hopping representing the band energy of correlated electrons; (c) contribution to the pair hopping; this process gives the pairing contribution in Eq. (50) with k ≠ i.

and

$$b_{ij} = \frac{1}{\sqrt{2}}\left(b_{i\downarrow}b_{j\uparrow} - b_{i\uparrow}b_{j\downarrow}\right), \tag{49b}$$

then the Hamiltonian [Eq. (46)] with inclusion of the three-site part can be written in the following very suggestive closed form [55]:

$$H = {\sum_{ij\sigma}}' t_{ij}\, b^{+}_{i\sigma} b_{j\sigma} - \sum_{ijk}\frac{2t_{ij}t_{jk}}{U}\, b^{+}_{ij} b_{kj}. \tag{50}$$

The first term represents, as before, the dynamics of single electrons moving between the empty sites regarded as holes; the second term combines the last two terms in Eq. (46) and expresses the dynamics of the singlet pairs [cf. Eqs. (49a) and (49b)]. The division in Eq. (50) into single-particle and pair parts is in analogy to the BCS Hamiltonian; however, here, the operators are expressed in coordinate space. The term with i = k in the pairing part enumerates the spin-singlet pairs of neighboring spins; the terms with i ≠ k represent pair hopping of such singlet pair bonds. Thus, in the language of operators [Eqs. (49)], one adds the bond dynamics to that of single electrons. Moreover, the forms of Eqs. (46) and (50) are completely equivalent; hence, the pairing effect and the antiferromagnetism should be directly linked within this formalism (they are two different expressions of the same part of H).
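The link between the exchange term and the singlet-pair term can be verified on two sites. The sketch below is an illustration under stated assumptions, not the derivation of Ref. 55: it uses a Jordan-Wigner matrix representation (the helper `fermion_ops` and the other names are hypothetical) and checks that the pair-number operator built from Eq. (49a) equals minus the bracket appearing in the exchange term of Eq. (46), that is, the projector onto the two-site singlet.

```python
import numpy as np

def fermion_ops(n_modes):
    """Jordan-Wigner matrices for the annihilation operators of n_modes fermion modes."""
    a = np.array([[0.0, 1.0], [0.0, 0.0]])   # maps occupied |1> to empty |0>
    z = np.diag([1.0, -1.0])                 # parity factor (-1)^n
    one = np.eye(2)
    ops = []
    for m in range(n_modes):
        op = np.eye(1)
        for factor in [z] * m + [a] + [one] * (n_modes - m - 1):
            op = np.kron(op, factor)
        ops.append(op)
    return ops

c = fermion_ops(4)                           # modes ordered (1 up, 1 down, 2 up, 2 down)
I16 = np.eye(16)
a = {(1, "up"): c[0], (1, "dn"): c[1], (2, "up"): c[2], (2, "dn"): c[3]}
n = {key: op.T @ op for key, op in a.items()}

def projected(site, spin):
    """b_{i,sigma} = a_{i,sigma} (1 - n_{i,-sigma}), as in Eq. (47)."""
    other = "dn" if spin == "up" else "up"
    return a[(site, spin)] @ (I16 - n[(site, other)])

def nu(site):
    return sum(projected(site, s).T @ projected(site, s) for s in ("up", "dn"))

def spin_ops(site):
    s_plus = a[(site, "up")].T @ a[(site, "dn")]
    s_z = 0.5 * (n[(site, "up")] - n[(site, "dn")])
    return s_plus, s_plus.T, s_z

sp1, sm1, sz1 = spin_ops(1)
sp2, sm2, sz2 = spin_ops(2)
s1_dot_s2 = sz1 @ sz2 + 0.5 * (sp1 @ sm2 + sm1 @ sp2)

# Singlet pair creation operator in the form of Eq. (49a); its transpose is the adjoint.
b12_dag = (projected(1, "up").T @ projected(2, "dn").T
           - projected(1, "dn").T @ projected(2, "up").T) / np.sqrt(2.0)
pair_number = b12_dag @ b12_dag.T

exchange_bracket = s1_dot_s2 - 0.25 * nu(1) @ nu(2)
print(np.allclose(pair_number, -exchange_bracket))   # singlet projector identity
```

For a single bond the exchange energy is therefore minus the exchange integral times the number of singlet pairs on that bond, which is the sense in which the two forms of the Hamiltonian express the same physics.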

It is difficult to diagonalize the Hamiltonian [Eq. (50)] to obtain the eigenvalues of the system. Part of the problem arises from the fact that the single-particle operators $b_{i\sigma}$ and $b^{+}_{j\sigma}$ do not obey the fermion anticommutation relations and that the pair operators $b^{+}_{ij}$ and $b_{ij}$ do not obey boson commutation relations. Additionally, the two terms in Eq. (50) do not commute, so that the itinerant characteristics of the electrons and the pair-binding effects combine and produce a paired metallic phase, particularly if the two terms are of comparable magnitude. We have seen in Section III that if the number of holes δ ≡ 1 − n < δc ∼ 0.02, then the pairing (or exchange) part dominates and antiferromagnetism sets in. Detailed calculations [32] lead to the boundary line between the antiferromagnetic and the ferromagnetic phase, as shown in Fig. 27. The energy of the completely saturated ferromagnetic phase (CF) indicated does not depend on the value of the exchange integral $J_{ij} \equiv 2t_{ij}^2/U$.

We now discuss the superconducting phase for which the pairing part in Eq. (50) plays a crucial role. To make the problem tractable at this point, one replaces the operators [Eqs. (49)] by fermion operators [45, 56], that is,

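$$b^{+}_{i\sigma} \to a^{+}_{i\sigma}, \qquad b_{i\sigma} \to a_{i\sigma}, \tag{51a}$$

and introduces the replacement

$$b^{+}_{ij} \to b^{+}_{ij} = \frac{1}{\sqrt{2}}\left(a^{+}_{i\uparrow}a^{+}_{j\downarrow} - a^{+}_{i\downarrow}a^{+}_{j\uparrow}\right) \tag{51b}$$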


FIGURE 27 Phase boundary between the mixed ferromagnetic (CF)–antiferromagnetic (AF) phase and the pure ferromagnetic phase for simple cubic (z = 6) and b.c.c. (z = 8) structures. A similar type of phase boundary can be obtained for other structures. [From Ref. 63.]

and

$$b_{ij} \to b_{ij} = \frac{1}{\sqrt{2}}\left(a_{i\downarrow}a_{j\uparrow} - a_{i\uparrow}a_{j\downarrow}\right). \tag{51c}$$

Simultaneously, one renormalizes the parameters $t_{ij}$ and $J_{ij}$ in such a manner that they contain the restrictions on particle dynamics due to the projection of doubly occupied site configurations in the expression for the ground-state energy. Within the Gutzwiller-ansatz approximation [29], Eqs. (49) reduce the starting Hamiltonian to the form

$$H = \delta\sum_{ij\sigma} t_{ij}\, a^{+}_{i\sigma} a_{j\sigma} - \sum_{ijk}\frac{2t_{ij}t_{jk}}{U}\, b^{+}_{ij} b_{kj}, \tag{52}$$

where δ = 1 − n. This Hamiltonian has been solved within the mean-field approximation equivalent to the BCS approximation [56, 57] and with neglect of the pairing terms with k ≠ i. This leads to the following self-consistent equation for Δk ≠ 0:

$$\frac{J}{N}\sum_{k}\frac{\gamma_k^2}{E_k}\tanh\left(\frac{\beta E_k}{2}\right) = 1, \tag{53}$$

with J = 2t²/U, $E_k = [(\varepsilon_k - \mu)^2 + |\Delta_k|^2]^{1/2}$, and $\gamma_k = \cos(k_x a) + \cos(k_y a)$ for a planar configuration of the lattice. This equation must be supplemented with the equation for the chemical potential in the superconducting phase of the form

$$\frac{1}{N}\sum_{k}\left(1 - \frac{\varepsilon_k}{E_k}\right)\tanh\left(\frac{\beta E_k}{2}\right) = n. \tag{54}$$

In solving Eq. (53), solutions of the following type have been considered:

1. extended s-wave [56, 58],

$$\Delta^{(s)}_k = \Delta[\cos(k_x a) + \cos(k_y a)]; \tag{55a}$$

2. d-wave [57, 58],

$$\Delta^{(d)}_k = \Delta[\cos(k_x a) - \cos(k_y a)]; \tag{55b}$$

3. mixed s and d phases [59],

$$\Delta^{(sd)}_k = s\,\Delta^{(s)}_k + d\,\Delta^{(d)}_k. \tag{55c}$$

The mixed phase was found to be the most stable close to the half-filled band case. For the half-filled band case, the ground-state energies for s- and d-wave states are the same.
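The momentum dependence of the two pure gap functions in Eqs. (55a) and (55b) can be inspected numerically. The sketch below (function names and grid size are choices made here) locates the nodal lines and compares the zone averages of |Δk| for the two symmetries; the averages coincide because a shift of ky by π/a maps one form factor onto the other. This is at least consistent with the degeneracy quoted for the half-filled band, if one assumes that the band term drops out of Ek there, which is an assumption of this sketch and not a statement taken from the cited references.

```python
import numpy as np

def gap_s(kx, ky, delta0=1.0, a=1.0):
    """Extended s-wave form factor, as in Eq. (55a)."""
    return delta0 * (np.cos(kx * a) + np.cos(ky * a))

def gap_d(kx, ky, delta0=1.0, a=1.0):
    """d-wave form factor, as in Eq. (55b)."""
    return delta0 * (np.cos(kx * a) - np.cos(ky * a))

k = 2.0 * np.pi * np.arange(64) / 64
kx, ky = np.meshgrid(k, k)

nodes_d = gap_d(k, k)               # along the zone diagonal kx = ky
nodes_s = gap_s(np.pi - k, k)       # along the line kx + ky = pi/a
mean_abs_s = np.mean(np.abs(gap_s(kx, ky)))
mean_abs_d = np.mean(np.abs(gap_d(kx, ky)))

print(np.allclose(nodes_d, 0.0), np.allclose(nodes_s, 0.0))
print(mean_abs_s, mean_abs_d)       # equal zone averages of |gap|
```

Note that the extended s-wave gap vanishes on the whole square Fermi line of the half-filled band, whereas the d-wave gap vanishes only along the zone diagonals.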

The type of solution obtained within the mean-field approximation (cf. Section II for details) is illustrated in Fig. 28, where the temperature dependence of the specific heat is shown for different numbers of holes δ, for |t|/U = 0.1, and with the inclusion of the nearest-neighbor repulsive Coulomb interaction V. A discontinuity of C(T) at T = Tc takes place for each δ. For comparison, the dotted lines represent the specific heat for the normal phase.

There is a major problem with the standard mean-field solutions discussed in Refs. 55–59, namely, they yield a nonzero (in fact, maximal or almost-maximal) value of the superconducting transition temperature Tc for the half-filled band case, which corresponds to the Mott insulating state. This is a spurious result; it appears because, by performing the transformation Eqs. (51), the double-site occupancies reappear for n < 1. To remove some of the unphysical features of the mean-field solution, a new formalism has been proposed [58–60] in which auxiliary (slave) bosons are introduced. In this formalism, some

FIGURE 28 Temperature dependence of the specific heat C(T) within the mean-field approach to the exchange-mediated pairing in a narrow band. The dotted line represents C(T) for the normal phase, while the discontinuity occurs at the transition to the superconducting phase. [From Ref. 58.]


of the properties of the projected operators [Eqs. (47) and (49)] are already preserved in the mean-field-type approximation involving boson and fermion fields on the same footing. The transition temperature Tc now vanishes, as it should, in the limit n = 1. According to Ref. 61, the slave bosons represent holes in the Mott insulators and are regarded as charged, while the fermions are neutral. These entities are called holons and spinons, respectively.
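The spurious nonzero Tc of the simple mean-field treatment at half filling can be reproduced from the linearized form of Eq. (53). The following is a rough sketch under explicit assumptions: the gap is set to zero in Ek, the band energy is taken as δ times −2tγk (mimicking the factor δ in the first term of Eq. (52)), µ is fixed at zero rather than determined from Eq. (54), energies are in eV with kB = 1, and J = 0.13 eV is the value quoted at the end of Section III. Function names are hypothetical.

```python
import numpy as np

def gap_equation_lhs(T, J, delta, t=0.5, mu=0.0, n_grid=64):
    """Left-hand side of Eq. (53) with the gap set to zero (linearized at T_c)."""
    k = 2.0 * np.pi * np.arange(n_grid) / n_grid
    kx, ky = np.meshgrid(k, k)
    gamma = np.cos(kx) + np.cos(ky)
    # Assumption: band energy narrowed by the factor delta, as in Eq. (52).
    xi = np.abs(delta * (-2.0 * t) * gamma - mu)
    ratio = np.full_like(xi, 1.0 / (2.0 * T))      # limit of tanh(x/2T)/x for x -> 0
    finite = xi > 1e-12
    ratio[finite] = np.tanh(xi[finite] / (2.0 * T)) / xi[finite]
    return J * np.mean(gamma**2 * ratio)

def mean_field_tc(J, delta, t_low=1e-4, t_high=10.0):
    """Bisection for the temperature at which the linearized left-hand side equals 1."""
    for _ in range(100):
        mid = 0.5 * (t_low + t_high)
        if gap_equation_lhs(mid, J, delta) > 1.0:
            t_low = mid        # pairing term still exceeds 1: T_c lies higher
        else:
            t_high = mid
    return 0.5 * (t_low + t_high)

J = 0.13                                   # eV
print(mean_field_tc(J, delta=0.0))         # approaches J/2 at half filling
```

With δ = 0 the band term vanishes, the sum reduces to J〈γk²〉/(2T) with 〈γk²〉 = 1, and the sketch returns a transition temperature of J/2 rather than zero, which is the unphysical feature the slave-boson formalism is designed to remove.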

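The holon–spinon language is introduced formally by noting that the projected operators [Eq. (47)] are represented as

$$b^{+}_{i\sigma} \equiv b_i f^{+}_{i\sigma} \quad\text{and}\quad b_{i\sigma} = b^{+}_i f_{i\sigma}, \tag{56}$$

where $b_i$ and $b^{+}_i$ are boson annihilation and creation operators located at the atomic site i, while $f^{+}_{i\sigma} \equiv a^{+}_{i\sigma}$ and $f_{i\sigma} \equiv a_{i\sigma}$ are the commonly used fermion operators. Substituting Eq. (56) into Eq. (46), one obtains

$$H = {\sum_{ij\sigma}}' t_{ij}\, b_i b^{+}_j f^{+}_{i\sigma} f_{j\sigma} - {\sum_{ij}}' \frac{2t_{ij}^2}{U}\, b^{+}_{ij} b_{ij} + \sum_{i}\lambda_i\left(b^{+}_i b_i + \sum_{\sigma} f^{+}_{i\sigma} f_{i\sigma} - 1\right) - \mu\sum_{i\sigma} n_{i\sigma} + (\text{three-site terms}). \tag{57}$$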

The first two terms represent, respectively, the coupled holon–spinon hopping and the binding of spinons into singlet pairs. The third term expresses the fact that the total number of holons and spinons is equal to unity on each site; the Lagrange multiplier λi thus formally enforces the removal of double occupancies. The fourth term represents the conservation of the number of electrons. Now, in a further approximation, one decouples fermions from bosons and then solves the two parts self-consistently. The mean-field treatment discussed earlier corresponds to the approximation in which λi is taken as the same at each site (λi → λ) and in which one introduces the replacement $\langle b_i b^{+}_i\rangle = \langle b_i\rangle\langle b^{+}_i\rangle = |\langle b\rangle|^2 = 1 - n$. The superconducting solution is described in terms of two correlation functions: $\Delta_B \equiv \langle b^{+}_i b^{+}_j\rangle \approx \langle b^{+}\rangle^2$, characterizing the Bose condensation of holons, and $\Delta_F \equiv \langle f_{i\uparrow} f_{j\downarrow}\rangle$, characterizing the gap in the spectrum of fermion excitations (the site indices i and j denote a pair 〈ij〉 of nearest neighbors). The nonzero ΔB occurs only below a temperature TB, which we call the Bose condensation temperature, whereas the nonzero ΔF appears only below T = TRVB, characterizing the mean-field solution within the RVB theory [56]. The superconducting phase is characterized by nonzero values of both ΔB and ΔF simultaneously. This is because in the mean-field approximation $\langle b_{i\uparrow} b_{j\downarrow}\rangle = \Delta_B\Delta_F$. Hence, the lower of the two temperatures (TB and TRVB) determines the superconducting transition temperature. In Fig. 29, taken from Ref. 63, we have plotted these two temperatures as a function of δ = 1 − n. One should note that to have TB ≠ 0, a small nonzero overlap tz = 0.1t was taken in the direction perpendicular to the square planar configuration of the atoms. We see that Tc → 0 as δ → 0, as should be the case.

FIGURE 29 (a) Critical temperature Tc (the thick line) versus δ = 1 − n. The temperatures TRVB and TB are those characterizing the onset of coherency for spinons and the Bose condensation of holons. Note that Tc is determined by the lower of the two temperatures. [From Ref. 62.] (b) Schematic theoretical phase diagram obtained within the gauge theory.

Note added in August 2000. In the last 10 years the slave-boson approach evolved into the gauge-theory approach to doped Mott insulators [33d]. This approach leads to the phase diagram shown schematically in Fig. 29b. The details of this phase diagram go beyond the scope of this article. The other factors are the detailed role of the van Hove singularity in the two-dimensional density of states for a square lattice [67a, b] in increasing even the BCS value of the critical temperature, as well as the determined d-wave symmetry reflecting the strong on-site Coulomb repulsion, which produces a node in the spatial dependence of the gap Δ(r) [67c, d]. Finally,


the role of the interlayer Josephson tunneling has been stressed [67e, f], although there is some discussion [67g] about the magnitude of the condensation energy due to the formation of a truly three-dimensional paired state from a two-dimensional normal metal. This means that a simple type of Lawrence–Doniach–Ginzburg–Landau approach [67h] and other models based on the interplanar Josephson tunneling [67e, f] may not be sufficient.

2. Hybridized Systems

The electron states near the Fermi surface in high-Tc oxides such as La2CuO4 involve hybridization of electrons of atomiclike $3d_{x^2-y^2}$ states of copper with 2pσ states of oxygen (cf. Fig. 23 and Section III.D). These electron states can be described by the Anderson lattice Hamiltonian of the type of Eq. (43), with a width of the bare p band W ≈ 4 eV, the position of the 3d9 level at εf ≡ εd − εp ∼ 1 eV, U ≤ 10 eV, and hybridization magnitude |V| ≃ 1.5 eV [68]. The hybridization is intersite in nature, that is, it involves the 2p and 3d orbitals located on different sites. Therefore, the effective hybridization energy is Vz ≃ 6 eV, where z = 4 is the number of nearest-neighboring O atoms in the plane for a given Cu atom. We see that Vz > εf; hence, the 3d and 2p states mix strongly, that is, the d electrons can be promoted to 2p-hole states, and vice versa. Additionally, 2p electrons can be promoted to form the 3d10 configurations of the excited states. If Vz ≫ εf, but |V|z ≪ εf + U, the above two promotion–mixing events are low- and high-energy processes, respectively. The situation is shown schematically in Fig. 30, where the parameter U is assumed to be much larger than |εf|, W, or |V|z. We consider this limiting situation first [68].

The high-energy processes take place only as virtual events, that is, with electron hopping from the p state to the highly excited 3d state and back. Such virtual p–d–p processes are shown schematically in Fig. 31, where site m labels the 2pσ state of the oxygen anion O2− centered at Rm and site i labels the $3d_{x^2-y^2}$ state of the Cu2+ ion centered at Ri. Then the effective Hamiltonian can be rewritten in the real-space language and for large U reads [68]

$$H = \sum_{k\sigma}\varepsilon_k n_{k\sigma} + \varepsilon_f\sum_{i\sigma} b^{+}_{i\sigma} b_{i\sigma} + \sum_{im\sigma}\left(V_{im}\, b^{+}_{i\sigma} c_{m\sigma} + V^{*}_{im}\, c^{+}_{m\sigma} b_{i\sigma}\right) - \sum_{imn}\frac{2V^{*}_{mi}V_{in}}{U + \varepsilon_f}\, B^{+}_{im} B_{in}. \tag{58}$$

FIGURE 30 Division of the charge-transfer (p–d) processes into low- and high-energy parts. The processes labeled II give rise to Kondo and superexchange interactions when treated perturbationally to second and fourth order, respectively.

The first term describes the band energy of itinerant (2pσ) electrons, while the third represents the residual mixing, since, as in the case of narrow-band electrons, the operators $b^{+}_{i\sigma}$ and $b_{i\sigma}$ are projected operators [Eq. (47)] for the starting 3d states. The last term represents the so-called hybrid interorbital pairing with the pairing operators

$$B^{+}_{im} = \frac{1}{\sqrt{2}}\left(b^{+}_{i\uparrow}c^{+}_{m\downarrow} - b^{+}_{i\downarrow}c^{+}_{m\uparrow}\right) \tag{59a}$$

and

$$B_{im} = \frac{1}{\sqrt{2}}\left(b_{i\uparrow}c_{m\downarrow} - b_{i\downarrow}c_{m\uparrow}\right). \tag{59b}$$

FIGURE 31 Schematic representation of the hopping processes induced by high-energy mixing processes. The hoppings labeled 2 and 2′ are alternative processes.


The meaning of the effective Hamiltonian [Eq. (58)] is as follows. The first three terms provide eigenvalues representing the hybridized quasiparticle states with the structure discussed in Section III.C. The last term provides a singlet pairing for those hybridized states. It expresses (for m = n) the Kondo interaction between the p and the 3d electrons of the form

$$\sum_{im}\frac{2|V_{im}|^2}{U + \varepsilon_f}\left(\mathbf{S}_i\cdot\mathbf{s}_m - \tfrac{1}{4}\nu_i n_m\right).$$

It is antiferromagnetic in nature, with the exchange integral

$$J_{im} \equiv \frac{2|V_{im}|^2}{U + \varepsilon_f} \sim 0.5\ \mathrm{eV};$$

hence, the pairing results in a spin-singlet state. It must be underlined that Eq. (58) represents hybridized correlated states in the so-called fluctuating-valence regime in which U ≫ |Vim| ≳ εf. This is the reason why we cannot completely transform out the hybridization. Also, the occupancy nf of the atomic level is a noninteger because the strong hybridization induces a redistribution of the particles among starting atomic and band states.
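The quoted magnitude of the Kondo exchange integral follows from the parameter values given at the beginning of this subsection. A minimal arithmetic check, with |V| = 1.5 eV and εf = 1 eV from the text and U taken at the two ends of the 8–10 eV range quoted in Section III (the function name is hypothetical):

```python
def kondo_exchange(V, U, eps_f):
    """J_im = 2|V|^2 / (U + eps_f), all energies in eV."""
    return 2.0 * abs(V) ** 2 / (U + eps_f)

for U in (8.0, 10.0):
    print(U, kondo_exchange(V=1.5, U=U, eps_f=1.0))
```

Both ends of the range give values between roughly 0.4 and 0.5 eV, consistent with the estimate Jim ∼ 0.5 eV.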

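When both U and |εf| are much larger than |Vim|, one can transform out the hybridization completely and obtain, instead of Eq. (58), the following effective Hamiltonian:

$$H = \sum_{k\sigma}\varepsilon_k n_{k\sigma} + \varepsilon_f\sum_{i\sigma} b^{+}_{i\sigma} b_{i\sigma} + \sum_{ijm\sigma}\frac{V^{*}_{mi}V_{mj}}{\varepsilon_f}\, b^{+}_{i\sigma} b_{j\sigma}(1 - n_{m\sigma}) - \sum_{imn}\frac{2V^{*}_{mi}V_{in}U}{\varepsilon_f(\varepsilon_f + U)}\, B^{+}_{im} B_{in}. \tag{60}$$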

We now have a two-band system: the 3d electrons acquire a bandwidth $W^{*} \sim (V^2/\varepsilon_f)(1 - \langle n_{m\sigma}\rangle)$. The spin-singlet pairing is again of the interband type. The part with m = n in the last term is equivalent to the Kondo interaction derived a long time ago for magnetic impurities [69]. Here, the lattice version of this Hamiltonian provides both pairing and itinerancy to the bare atomic electrons.

Note that the hybrid pairing introduced in this section expresses both the Kondo interaction (the two-site part) and pair hopping. It is therefore suitable for a discussion of the superconductivity and Kondo-lattice effects in heavy-fermion systems. The pairing part supplements the current discussions of the Anderson lattice Hamiltonian in the U → ∞ limit [40–42]. One may state that the Kondo interaction-mediated pairing introduced above represents the strong-coupling version of spin fluctuation-mediated pairing for almost-localized systems introduced previously [70].

An approach using the slave-boson language for hybridized systems has also been formulated [71] and contains a principal feature of the effective Hamiltonian

FIGURE 32 Superconducting transition temperature Tc versus hole concentration xh. Squares, experimental data for La2−xSrxCuO4; circles and diamonds, data for YBa2Cu3O7−y. [From Ref. 71.]

[Eq. (58)]; the solution in the mean-field approximation has also been discussed. Figure 32 illustrates the dependence of the superconducting transition temperature Tc on the hole concentration xh; this is compared with experimental data [72]. The dependence of Tc over the full concentration range of holes is shown in Fig. 33. The superconductivity appears for La2−xSrxCuO4 only for 0.04 ≲ xh < 0.34. The full phase diagram comprising localization and antiferromagnetism (LM phase) and superconductivity (SC) is provided in Fig. 33.

3. An Overview

Two alternative models and mechanisms of exchange-mediated pairing have been discussed so far: the narrow-band model, with d–d kinetic exchange-mediated pairing, and the hybridized model, with d–p Kondo interaction-mediated pairing. The hybridized model should be regarded as a basis of narrow-band behavior in real oxides and in heavy-fermion systems since the direct d–d (or f–f) overlap of the neighboring atomic wave functions is extremely small. Next, we give a brief overview of the

FIGURE 33 Superconducting transition temperature Tc versus hole concentration for La2−xSrxCuO4 over the full range. LM, the regime of local moments (insulating phase). [From Ref. 72.]


narrow-band properties of the correlated electrons starting from the hybridized (Anderson lattice) model.

First, we discuss the quasiparticle states in the U → ∞ limit. The simplest approximation is to reintroduce ordinary fermion operators $a^{+}_{i\sigma}$ and $a_{i\sigma}$ in Eq. (58) and readjust the hybridization accordingly [73]. In effect, one obtains the hybridized bands of the form of Eq. (45), that is,

$$E_{k\pm} = \frac{\varepsilon_f + \varepsilon_k}{2} \pm \left[\left(\frac{\varepsilon_f - \varepsilon_k}{2}\right)^2 + 4|\tilde V_k|^2\right]^{1/2}, \tag{61}$$

where $\tilde V_k \equiv q^{1/2} V_k$, and $q \equiv (1 - n_f)/(1 - n_f/2)$ for 0 ≤ nf ≤ 1, while $V_k$ is the space Fourier transform of $V_{im}$. For the case of the CuO2 layers [74],

$$|\tilde V_k|^2 = qV^2\left[\sin^2\left(\frac{k_x a}{2}\right) + \sin^2\left(\frac{k_y a}{2}\right)\right]. \tag{62}$$

If the Fermi level falls into the lower hybridization band and nf = 1 − δ, with δ ≪ 1, then it can be shown that the quasiparticles describing the hybridized states are of mainly quasi-atomic character. In other words, the effective Hamiltonian [Eq. (58)] is approximately of the narrow-band form [Eq. (52)]. The pairing takes place between heavy quasiparticles. This limiting situation describes qualitatively the situation in heavy fermions with Kondo interaction mediating the pairing. In contrast, if the Fermi level falls close to the top of the upper hybridization band (as is the case for high-Tc superconductors, since the p band is almost full and the 3d level is almost half-filled), then the pairing is due mainly to the band electrons (2p holes in the case of high-Tc oxides). These results are obtained by constructing explicitly the eigenstates corresponding to the eigenvalues [Eq. (61)] and taking the limits corresponding to heavy fermions (nf → 1) and high-Tc systems (n = nd + np ≈ 3, which also corresponds to the situation of one hole in the system).
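The band-narrowing factor q and its effect on the hybridized bands can be illustrated numerically. The sketch below reads Eq. (61) with the mean level (εf + εk)/2 and a ± sign in front of the square root; the function names and the sample energies are assumptions made for illustration only.

```python
import numpy as np

def q_factor(n_f):
    """Hybridization renormalization q = (1 - n_f) / (1 - n_f / 2) for 0 <= n_f <= 1."""
    return (1.0 - n_f) / (1.0 - 0.5 * n_f)

def renormalized_bands(eps_k, eps_f, v_k_sq, n_f):
    """Lower and upper branches in the form of Eq. (61), with |V~_k|^2 = q |V_k|^2."""
    mean = 0.5 * (eps_f + eps_k)
    root = np.sqrt((0.5 * (eps_f - eps_k)) ** 2 + 4.0 * q_factor(n_f) * v_k_sq)
    return mean - root, mean + root

for n_f in (0.0, 0.5, 0.9, 1.0):
    lower, upper = renormalized_bands(eps_k=-2.0, eps_f=1.0, v_k_sq=1.5**2, n_f=n_f)
    print(n_f, q_factor(n_f), lower, upper)
```

As nf approaches 1 the factor q goes to zero, the effective hybridization is switched off, and the two branches return to the unmixed levels εk and εf; this is the almost-localized limit associated in the text with heavy quasiparticles.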

a. Mott–Hubbard insulators, charge-transfer insulators, and mixed-valent systems. The next problem concerns the Mott localization in systems with hybridized d–p states. The systems such as NiO, CoO, and MnO regarded as classic Mott insulators are, strictly speaking, hybridized 3d–2p systems. However, these cases are, to a good approximation, ionic systems in the sense that the electronic configuration in, for example, NiO, is Ni2+O2−. Then, the valence 2p band is completely full and plays only a passive role in effective d–d charge-transfer processes [75], since a 2p → 3d transfer is followed by a 3d → 2p transfer from the neighboring 3d shell of Ni2+. In effect, the antiferromagnetic exchange interaction in Eq. (46) expresses formally the superexchange interaction that has been known for a long time [75, 76]. In this approach, the kinetic exchange interaction between d electrons (induced by virtual d–d transitions; cf. Section III.A) is expressed as a fourth-order effect in the hybridization V since the virtual d–d transition involves a sequence of d–p and p–d transitions in the fourth order.

The possible macroscopic states of hybridized systems modeled by the periodic Anderson Hamiltonian [Eq. (43)] are classified schematically in Fig. 34. The parameter W/U characterizes the degree of correlation of quasi-atomic electrons that may acquire a nonzero bandwidth due mainly to hybridization; the parameter V/εf characterizes the degree of mixing of the states involved. If the d (or f) atomic level lies deep below the top of the valence band (V/εf ≪ 1), then we have either Mott–Hubbard (M–H) or charge-transfer (C–T) insulators; for the former the band gap Δ is due to $d^n \to d^{n+1}$ excitations (that is, Δ ∼ U − W), whereas for the latter it is due to $d^n p^2 \to d^{n+1} p^1$ charge-transfer transitions. The atomic 3d (or 4f) electrons are unpaired in both the C–T and the M–H states. If V/εf ≳ 1 and W/U ≪ 1, then we enter mixed-valent (M–V) and (close to the border with M–H) heavy-fermion regimes. On the other hand, if W/U ≳ 1, then irrespective of the value of V/εf, we encounter the correlated-metal regime that we call an almost-localized Fermi liquid (AL–FL). Both heavy-fermion and high-Tc systems are close to the line separating M–H and M–V regimes. Such a classification scheme for transition-metal oxides has been proposed in Ref. [77].

The classification shown schematically in Fig. 34 provides only a distinction between insulating and metallic states. A complete magnetic phase diagram for the high-Tc system La2−xSrxCuO4 is shown schematically in Fig. 35 (taken from Ref. 79a). Stoichiometric or doped La2CuO4,

FIGURE 34 Schematic representation of the regimes of stability of the charge-transfer (C–T) and Mott–Hubbard (M–H) insulating states, as well as of the mixed-valent (M–V) and almost-localized Fermi-liquid (AL–FL) metallic states.


FIGURE 35 Schematic phase diagram on the plane T–x for La2−xSrxCuO4. Antiferromagnetic (AF), spin-glass (SG), superconducting (SC), insulating (I), and metallic phases are drawn, as well as the boundary between the orthorhombic (O) and the tetragonal (T) crystallographic phases. [From Ref. 80.]

with x ≲ 0.02, exhibits antiferromagnetism (AF). In the regime 0.02 ≲ x ≲ 0.04, the inhomogeneous spin-glass (SG) magnetic insulating phase sets in, while for x ≳ 0.04 a transition from the insulating (I) to the metallic (M) phase takes place and the system is superconducting until a transition from an orthorhombic (O) to a tetragonal (T) crystallographic structure occurs. A similar phase diagram was established for YBa2Cu3O6+x [77b]. These phase diagrams combine all the features we have discussed separately so far. The main features of this phase diagram are explained next.

b. Magnetic interactions hybrid, polarons, and pairing. To address the phase diagram shown in Fig. 35 within the hybridized p–d model, we note first that antiferromagnetism is stable only close to the half-filling of the d–band (cf. Fig. 27 and the discussion in Section III.A). In the case of the hybridized model, one has to calculate explicitly the contributions to the d–p and d–d interactions. Within the perturbation expansion for the Anderson lattice model, but with only the high–energy mixing processes (cf. Fig. 30) treated in this manner [68, 70], we obtain the magnetic part of the effective Hamiltonian to fourth order as

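$$H_m \simeq J_{pd}\sum_{im}\left(\mathbf{S}_i\cdot\mathbf{s}_m - \tfrac{1}{4}\,n_i n_m\right) + J_{dd}\sum_{\langle ij\rangle}\left(\mathbf{S}_i\cdot\mathbf{S}_j - \tfrac{1}{4}\,N_i N_j\right) + J_{pp}\sum_{\langle mm'\rangle}\left(\mathbf{s}_m\cdot\mathbf{s}_{m'} - \tfrac{1}{4}\,n_m n_{m'}\right), \tag{63}$$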

where the first term represents the p–d Kondo–type interaction, with the exchange integral

$$J_{pd} \approx \frac{2|V|^2}{U+\varepsilon_f}\left[1 - \frac{|V|^2}{U+\varepsilon_f}\,(n_d + n_p + 1)\right]. \tag{64}$$

The second term expresses the d–d (kinetic exchange) interaction, with $J_{dd} = |V|^4/(U+\varepsilon_f)^3$, and the last term represents the interaction between p holes, with $J_{pp} = |V|^4 n_d/(U+\varepsilon_f)^3 \approx J_{dd}$. The antiferromagnetic p–d and d–d interactions are not compatible; in the hole language, the p hole polarizes its surroundings ferromagnetically, as shown in Fig. 16 (note that the hole may be located on any O1− ion, so its position within the volume of radius R is not fixed). A simple estimate [79] of the canting angle θ between the neighboring 3d spins $\mathbf{S}_i$ and $\mathbf{S}_j$ caused by the hole polarization gives

$$\cos\frac{\theta}{2} \approx \frac{J_{pd} - 2J_{dd}}{2J_{dd}}. \tag{65}$$

Taking Jpd ≈ 0.5 eV and Jdd ≃ 50 K, we obtain the average canting angle θ through the relation $\cos(\theta/2) \approx 25\,x_p$. The energy Ec of the system with a single hole canting the surrounding spins is

$$E_c = -\frac{1}{2}\,\frac{(J_{pd} - 2J_{dd})^2}{J_{dd}}\,z - J_{dd}\,z. \tag{66}$$

This energy is lower than the energy (−Jdd z) of the antiferromagnetic (Néel) state of antialigned d spins due to Cu2+ copper ions. Next, we estimate the radius R of the hole polaron with aligned spins, as depicted in Fig. 36. Applying the same type of reasoning as in Section III.A, we obtain the expression for the energy Ep of a single polaron:

$$E_p = \frac{E_0}{(R/a)^2} - \frac{1}{2}\,\frac{(J_{pd} - 2J_{dd})^2}{J_{dd}}\,z\,x_p^2, \tag{67}$$

where now $x_p = (a/R)^2$ is the probability of finding a p hole on a given oxygen atomic site within the radius R. Minimization with respect to R for the two–dimensional case leads to

$$\frac{R}{a} = \left[\frac{(J_{pd} - 2J_{dd})^2\,z}{J_{dd}E_0}\right]^{1/2} \approx J_{pd}\sqrt{\frac{z}{J_{dd}E_0}} \approx 4. \tag{68}$$
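The step from Eq. (67) to Eq. (68) can be checked numerically. The sketch below assumes that the hole probability in Eq. (67) is $(a/R)^2$, writes the energy as a function of that single variable, locates the point where its derivative vanishes, and compares the result with the first equality of Eq. (68). The function names and the numerical values (arbitrary energy units) are illustrative assumptions, not values from the text.

```python
import math


def polaron_energy(x, e0, coupling, z):
    """Eq. (67) rewritten with x = (a/R)**2: E_p = E0*x - 0.5*coupling*z*x**2.

    `coupling` stands for (J_pd - 2*J_dd)**2 / J_dd.
    """
    return e0 * x - 0.5 * coupling * z * x * x


def stationary_x(e0, coupling, z, lo=1e-6, hi=1.0, h=1e-7, iterations=100):
    """Locate dE_p/dx = 0 by bisection on a central-difference slope.

    Assumes the slope changes sign between `lo` and `hi`.
    """
    def slope(x):
        return (polaron_energy(x + h, e0, coupling, z)
                - polaron_energy(x - h, e0, coupling, z)) / (2.0 * h)

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if slope(lo) * slope(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def radius_closed_form(e0, coupling, z):
    """First equality of Eq. (68): R/a = sqrt(coupling*z/E0)."""
    return math.sqrt(coupling * z / e0)


if __name__ == "__main__":
    # Illustrative numbers in arbitrary energy units (assumptions, not source values).
    e0, coupling, z = 1.0, 4.0, 4
    x_star = stationary_x(e0, coupling, z)
    print("numerical R/a  :", x_star ** -0.5)
    print("closed-form R/a:", radius_closed_form(e0, coupling, z))
```

With these placeholder numbers the two routes give the same R/a, which is all the sketch is meant to show; realistic values of E0, z, Jpd and Jdd would have to come from the model itself.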

A metal–insulator transition (MIT) takes place when the neighboring polarons overlap, that is, when $R\,x_{pc}^{-1/2} = 1$; this yields the critical hole concentration xc ≈ 0.07. One can also estimate this critical concentration by equating the band energy of holes, which is $-(W/2)\,x_p(1 - x_p)$, with the magnetic energy gain per hole due to aligning the neighboring d spins, $(-J_{pd}^2/2J_{dd})\,z\,x_p^2$. This leads again to xc ≈ 0.068, in rough agreement with the observed value xc ≃ 0.04 ÷ 0.05. For x > xc, the ground state of the system is metallic, and the pairing described in Sections IV.A.1 and IV.A.2 can take place. Within the exchange–mediated mechanism, all interactions in Eq. (63) are antiferromagnetic. Hence, in general, one has p–d pairing characterized by the operators of Eqs. (59), d–d pairing characterized by the operators


FIGURE 36 Schematic representation of a 2p-hole polaron in a planar CuO2 structure. Cu2+ ions are indicated by arrows, while O2− ions are indicated by open circles. The hole creates a canted spin configuration with resultant ferromagnetic polarization and autolocalizes in it. This is the reason the high-Tc oxides remain insulating when the concentration of the holes does not exceed xc ∼ 0.04 ÷ 0.05.

of Eqs. (49), and p–p pairing [80] characterized by the operators

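$$p^{+}_{mm'} \equiv \frac{1}{\sqrt{2}}\left(c^{+}_{m\uparrow}c^{+}_{m'\downarrow} - c^{+}_{m\downarrow}c^{+}_{m'\uparrow}\right) \tag{69a}$$

and

$$p_{mm'} \equiv \frac{1}{\sqrt{2}}\left(c_{m\downarrow}c_{m'\uparrow} - c_{m\uparrow}c_{m'\downarrow}\right). \tag{69b}$$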
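As a consistency check on Eqs. (69a) and (69b), the sketch below represents fermion operators on a four-orbital Fock space (two spin orbitals on each of the sites m and m′) and verifies that the pair creation operator produces a normalized two-site singlet from the vacuum, and that the operator of Eq. (69b) maps that state back to the vacuum, as its Hermitian conjugate should. The orbital ordering, the bitmask representation, and all function names are assumptions made for this illustration.

```python
from math import sqrt

# Hypothetical ordering of the four spin orbitals on neighboring sites m and m'.
M_UP, M_DN, MP_UP, MP_DN = 0, 1, 2, 3
VACUUM = {0: 1.0}  # Fock-state bitmask -> amplitude


def _sign(state, j):
    """Fermionic sign: (-1)**(number of occupied orbitals with index below j)."""
    return -1.0 if bin(state & ((1 << j) - 1)).count("1") % 2 else 1.0


def create(j, vec):
    """Apply the creation operator for orbital j to a state vector."""
    out = {}
    for state, amp in vec.items():
        if not state & (1 << j):
            new = state | (1 << j)
            out[new] = out.get(new, 0.0) + _sign(state, j) * amp
    return out


def annihilate(j, vec):
    """Apply the annihilation operator for orbital j to a state vector."""
    out = {}
    for state, amp in vec.items():
        if state & (1 << j):
            new = state & ~(1 << j)
            out[new] = out.get(new, 0.0) + _sign(state, j) * amp
    return out


def combine(u, v, cu, cv):
    """Return cu*u + cv*v, dropping numerically vanishing amplitudes."""
    out = {}
    for vec, coeff in ((u, cu), (v, cv)):
        for state, amp in vec.items():
            out[state] = out.get(state, 0.0) + coeff * amp
    return {s: a for s, a in out.items() if abs(a) > 1e-12}


def pair_create(vec):
    """Eq. (69a); in each product the rightmost operator acts first."""
    t1 = create(M_UP, create(MP_DN, vec))
    t2 = create(M_DN, create(MP_UP, vec))
    return combine(t1, t2, 1.0 / sqrt(2.0), -1.0 / sqrt(2.0))


def pair_annihilate(vec):
    """Eq. (69b); in each product the rightmost operator acts first."""
    t1 = annihilate(M_DN, annihilate(MP_UP, vec))
    t2 = annihilate(M_UP, annihilate(MP_DN, vec))
    return combine(t1, t2, 1.0 / sqrt(2.0), -1.0 / sqrt(2.0))


def norm_squared(vec):
    return sum(amp * amp for amp in vec.values())


if __name__ == "__main__":
    singlet = pair_create(VACUUM)
    print("singlet norm^2:", norm_squared(singlet))
    print("p p+ |0>      :", pair_annihilate(singlet))
```

The sign convention (counting occupied orbitals below the one acted on) is one standard choice; any other consistent ordering would give the same two checks.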

All three types of pairing may contribute to the superconducting ground state. However, the d–p interaction is much stronger; hence, the d–p hybrid type of pairing is the dominant one in the limit U > W > |V| ≳ εf. As stated, this type of pairing may appear effectively as a d–d or p–p type of pairing in the hybridized basis, depending on whether the Fermi level lies close to the top of the lower or upper hybridized bands, respectively. For the sake of completeness, we write down the full effective Hamiltonian with all the pairings specified, namely,

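$$H = \sum_{k\sigma}\varepsilon_k n_{k\sigma} + \varepsilon_f\sum_{i\sigma} b^{+}_{i\sigma}b_{i\sigma} + \sum_{im\sigma}\left(V_{im}\,b^{+}_{i\sigma}c_{m\sigma} + V^{*}_{im}\,c^{+}_{m\sigma}b_{i\sigma}\right) + J_{pd}\sum_{\langle im\rangle} B^{+}_{im}B_{in} + J_{dd}\sum_{\langle ijk\rangle} b^{+}_{ij}b_{kj} + J_{pp}\sum_{\langle mm'\rangle} p^{+}_{mm'}p_{mm'}. \tag{70}$$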

In deriving this result, one does not assume that |V| ≪ εf; therefore, Eq. (70) is applicable to the situation with fluctuating valence. Next, by introducing a slave-boson representation [Eq. (56)], we obtain the most general Hamiltonian for the treatment of pairing in correlated systems [79]. We should be able to witness decisive progress in the near future concerning the relative roles of hybrid p–d, d–d, and p–p pairings in high-Tc systems using the slave-boson or gauge-field approaches to Eq. (70). Also, Eq. (70) should serve as a basis for the discussion of antiferromagnetism and superconductivity in heavy–fermion systems; in that situation, the role of itinerant 2p states is played by hybridized 5d–6s conduction bands, while the role of 3d electrons is played by the 4f electrons of Ce or by the 5f electrons in uranium compounds.

c. Coexistence of antiferromagnetism and superconductivity. In the analysis above, we have treated antiferromagnetism (AF) and superconductivity (SC) separately. Detailed calculations [43, 62, 81] within the mean–field theory discussed point to the possibility of the coexistence of AF and SC phases. It is possible to visualize this coexistence by considering a narrow-band model with the two-dimensional (almost-square) Fermi surface shown in Fig. 37. Namely, the band energy of electrons located on the Fermi surface has the property $\varepsilon_{\mathbf{k}+\mathbf{Q}} = -\varepsilon_{\mathbf{k}}$, where Q ≡ (π/a, π/a) = 2kF. This is the so-called nesting condition; any system with this property is unstable with respect to the formation of the spin-density-wave (SDW) state with the wave vector Q. One should note (cf. Fig. 37) that Q connects two single-particle states on opposite sides of the Fermi surface, since −kF + Q = kF. Furthermore, both SDW and SC states couple electrons with opposite spins. This is why two-sublattice AF and SC states are compatible only for n ≈ 1, i.e., for the half-filled band. There is no

FIGURE 37 Two-dimensional Fermi surface for a half-filled band. The opposite points of the surface are related by the wave vector Q = (π/a)(1, 1).


clear experimental evidence that these two phases coexist in a high-Tc system, though there is some evidence from muon–spin rotation that it is so [82]. A clear detection of such coexistence would demonstrate directly the importance of exchange interactions in a superconducting phase. Namely, within exchange-mediated superconductivity, one can show [55] that close to the half-filled narrow-band case, TN/Tc ∼ 6 ÷ 8 (this is a mean-field-approximation estimate). The analysis of AF–SC coexistence conditions within the Anderson lattice Hamiltonian has not yet been performed satisfactorily, even though those two phases coexist in heavy-fermion compounds such as UPt3 and URu2Si2.
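The nesting condition quoted above can be made concrete with a simple dispersion. The sketch below assumes a nearest-neighbor tight-binding band on a square lattice, $\varepsilon_{\mathbf{k}} = -2t(\cos k_x a + \cos k_y a)$, which is one common narrow-band model with a square Fermi surface at half filling; the text does not specify the dispersion, so this choice, the hopping `t`, and the function names are assumptions.

```python
import math
import random


def band_energy(kx, ky, t=1.0, a=1.0):
    """Assumed nearest-neighbor tight-binding band on a square lattice."""
    return -2.0 * t * (math.cos(kx * a) + math.cos(ky * a))


def max_nesting_violation(samples=1000, t=1.0, a=1.0, seed=0):
    """Largest |eps(k+Q) + eps(k)| over random k, with Q = (pi/a, pi/a)."""
    rng = random.Random(seed)
    q = math.pi / a
    worst = 0.0
    for _ in range(samples):
        kx = rng.uniform(-q, q)
        ky = rng.uniform(-q, q)
        residual = band_energy(kx + q, ky + q, t, a) + band_energy(kx, ky, t, a)
        worst = max(worst, abs(residual))
    return worst


if __name__ == "__main__":
    a = 1.0
    # k_F = (pi/2a, pi/2a) lies on the half-filled Fermi surface (eps = 0),
    # and Q = 2 k_F maps -k_F onto k_F.
    kf = math.pi / (2.0 * a)
    print("eps(k_F)             :", band_energy(kf, kf, a=a))
    print("max |eps(k+Q)+eps(k)|:", max_nesting_violation(a=a))
```

For this dispersion the relation holds for every k, not only on the Fermi surface, because shifting each component by π/a flips the sign of both cosines.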

B. Phonons and Bipolarons

After the discoveries of superconductivity in the 40 and 90 K ranges [83], the obvious question was whether the phonon–mediated mechanism of pairing, so successful in the past, can explain superconductivity with such a high value of Tc. It was realized from the outset that one should include specific properties of these compounds, such as the quasi-planar (CuO2) structure with a logarithmic (Van Hove) singularity in the density of states ρ(ε) at the middle point of the two–dimensional band [83–85], the polar nature of the CuO bonds rendering applicable the tight–binding representation of the electronic states [49, 84], and strong electron–lattice coupling [85–87], leading to the local formation of small bipolarons (that is, two–electron pairs) [89] that may undergo Bose condensation when the metallic state is reached [in a more refined version a mixed-fermion model is used (cf. Ranninger, Ref. 90)].

There is no clear evidence for the phonon–mediated mechanism of pairing in classic high-Tc superconductors, since the isotope effect in both La systems [90] and Y systems [91] is quite small. However, the recently discovered superconductors Ba1−xKxBiO3 [92] exhibit a large isotope effect [93] and superconductivity with 20 K ≤ Tc ≲ 30 K in the concentration range 0.25 ≲ x ≲ 0.4. Also, the proximity of superconductivity and the charge-density-wave (CDW) state is observed [94].

The last property, as well as the observed diamagnetism in the insulating phase x ≲ 0.25, strongly suggests [95] that small trapped polarons are formed before the electron subsystem condenses into a superconducting phase. Condensation takes place when the percolation threshold for the insulator–metal transition is reached³ (at x ∼ 0.2).

³ The actual percolation threshold for the onset of the metallic phase is xc/2 ∼ 0.12, since the bipolarons reside on every alternate Bi lattice site. Also, the holes introduced by K doping must be present in a Bi–O hybridized band to render the bipolarons mobile for x > xc.

Three specific features of Ba1−xKxBiO3 compounds should be noted. First, the diamagnetic nature of the parent compound BaBiO3 distinguishes these systems from the parent compounds La2CuO4 and YBa2Cu3O7, which are both antiferromagnetic. Second, the Ba1−xKxBiO3 systems are copper-free and have a truly three–dimensional cubic structure in the SC phase [92]. Third, their main superconducting properties are in accordance with the predictions of the standard BCS theory [96].

The theory of the Ba1−xKxBiO3 compound must incorporate three additional obvious facts. First, the pairing process 2Bi4+ → Bi3+ + Bi5+ is possible when the electron–lattice coupling leads to an attraction overcoming the e–e repulsion in the Bi3+ state relative to the Bi5+ state [89]. It involves a relaxation of the O2− octahedra, that is, an optical, almost dispersionless, breathing mode. This can provide a local (on-site) attractive interaction between 6s electrons of the type $\lambda\,n_{i\uparrow}n_{i\downarrow}$, which leads to a scalar (k–independent) pairing potential $V_{kk'} = \lambda$, which, in turn, provides a justification for the observed properties reflecting an isotropic shape of the gap ($\Delta_k \equiv \Delta$), as in the standard BCS theory (cf. Section II).

Second, from the fact that the parent compound BaBiO3 is an insulator, we conclude that either the magnitude V of the Coulomb repulsion between the electrons on nearest-neighboring Bi atoms exceeds the width W* of the bipolaron band [96] or the small bipolarons are self-trapped in the potential created by interaction with nearest-neighboring oxygens. The onset of the metallic phase at concentrations near the percolation threshold xc ∼ 0.1 for nearest-neighbor (n.n.) interaction means that both effects may be important. In either case, the CDW state will set in, so the entropy of the bipolaron lattice vanishes at T = 0 (at least, for x = 0). The CDW phase plays the same role here as does AF ordering in La2CuO4 and YBa2Cu3O6. The properties of Ba1−xKxBiO3 are instead similar to those of the Ba1−xPbxBiO3 compounds discovered over a decade earlier [98].

Third, the fact that the onset of the superconductivity coincides with the transition from the CDW insulator to an SC metal speaks in favor of preexisting electron pairs already present in the insulating phase. However, the bipolaron concentration is large, and hence the interpretation of the superconducting transition as Bose condensation of bipolarons may be inapplicable even when the coherence length is small. The overall theoretical situation is nonetheless much clearer for Ba1−xKxBiO3 compounds than for either the La2−xSrxCuO4 or the YBa2Cu3O7−δ series, since the experimental evidence accumulated so far indicates that (optical?) phonon–mediated pairing takes place [99].

The Ba1−xKxBiO3 compounds seem to be natural candidates for a bipolaronic mechanism of electron pairing


[100]. This is because the diamagnetic (and charge–ordered) parent system Ba2Bi3+Bi5+O6 can be regarded as an ordered lattice of locally bound two–electron pairs (bipolarons) located on alternate Bi3+ sites; these pair states are stabilized by a strong relaxation of the surrounding oxygen anions. Effectively, the two electrons are attracted to each other. The effect of potassium doping is to make these pairs mobile by diminishing the number of bipolarons per Bi site from the value 1/2 [101]. In essence, the lattice distortion is responsible for the bipolaron formation in the same manner as in the case of the Cooper pairs; the difference is due to the circumstance that the bipolarons are locally bound complexes in direct space that undergo a Bose condensation from an incoherent state of preexisting and moving pairs. The temperature of such condensation is $T_c \sim x^{2/3}$, where x is the dopant (K) concentration [102]. A key feature of the bipolaron theory of superconductivity is that the Bose-condensed state develops from the CDW insulating state, not from the SDW (antiferromagnetic) state; the latter situation takes place for the cuprates. Further studies are necessary to calculate the physical properties of a bipolaron superconductor and, in particular, the differences from an ordinary (phonon–mediated) superconductor.
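The 2/3 power in the condensation temperature is the scaling of the textbook ideal three-dimensional Bose gas, $T_c = (2\pi\hbar^2/mk_B)\,[n/\zeta(3/2)]^{2/3}$, if the density n of mobile pairs is taken proportional to x. The sketch below only illustrates that scaling. The pair mass and the density per unit of x are hypothetical placeholders, and interactions and lattice effects, which matter for real bipolarons, are ignored.

```python
import math

HBAR = 1.054571817e-34            # reduced Planck constant, J s
K_B = 1.380649e-23                # Boltzmann constant, J/K
ELECTRON_MASS = 9.1093837015e-31  # kg
ZETA_3_2 = 2.6123753487           # Riemann zeta function at 3/2


def bose_condensation_temperature(density, mass):
    """Ideal 3D Bose gas: T_c = (2*pi*hbar^2 / (m*k_B)) * (n / zeta(3/2))**(2/3)."""
    prefactor = 2.0 * math.pi * HBAR ** 2 / (mass * K_B)
    return prefactor * (density / ZETA_3_2) ** (2.0 / 3.0)


if __name__ == "__main__":
    # Hypothetical placeholders: an effective pair mass and a density scale
    # such that the mobile-pair density is n = x * DENSITY_PER_X.
    pair_mass = 20.0 * ELECTRON_MASS
    DENSITY_PER_X = 1.0e27  # pairs per cubic meter at x = 1 (assumption)

    reference = bose_condensation_temperature(0.25 * DENSITY_PER_X, pair_mass)
    for x in (0.25, 0.30, 0.35, 0.40):
        tc = bose_condensation_temperature(x * DENSITY_PER_X, pair_mass)
        print(f"x = {x:.2f}  Tc/Tc(x=0.25) = {tc / reference:.3f}")
```

Only the ratios are meaningful here; the absolute temperature depends entirely on the assumed mass and density.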

C. Charge Excitations

In 1964 Little [103] introduced the idea that virtual electron–hole (exciton) excitations may lead to a pairing with a high value of Tc. This idea has been reformulated recently in the context of high-Tc superconductivity by considering the role of charge-transfer (p → d and d → p) fluctuations [104], as well as of intraatomic (Cu d → d) excitations [105]. The charge–transfer fluctuations involve both Cu2+–O− and Cu3+–O2− low–energy configurations and Cu+–O− states. The former two configurations are particularly important if the energy difference |εp − εd| is comparable to the magnitude |V| of the 2p–3d hybridization. This is the limit we have considered within the hybridized model in Section IV.A, probably extended to include the 3d–2p Coulomb interaction directly. The method of approach is therefore similar to that in Section IV.A in the limit of strongly correlated electrons. In the limit of weakly interacting electrons (that is, for U ≪ W), the perturbation expansion in powers of U provides an effective pairing potential in an explicit form. The processes then leading to the pairing are virtual excitations involving charge and antiferromagnetic spin fluctuations [106]. At the moment, it is difficult to see clearly the difference between exchange-mediated and charge-transfer-mediated types of pairing for strongly correlated hybridized systems.

V. CONCLUSIONS

In this article we have concentrated mainly on reviewing the properties of correlated electrons in the normal, antiferromagnetic, and superconducting phases of copper-containing systems, in which the last two phases are caused by antiferromagnetic exchange interactions. Two theoretical models have been discussed in detail: the Hubbard model of correlated narrow-band (3d) electrons and the Anderson lattice model of correlated and hybridized electrons, involving 2p and 3d states in the case of high-Tc oxides. The latter model is regarded as more general and applicable to both high-Tc and heavy–fermion systems; in some limiting situations discussed previously, hybridized bands exhibit a narrow–band behavior.

The principal novel feature of the metallic phase involving either 3d (in high-Tc oxides) or 4f (in heavy–fermion systems) electrons is that for the half–filled band configuration the itinerant electron states transform into a set of localized states constituting the Mott insulator. The difference between the Fermi liquid (FL) and the liquid of correlated electrons [the statistical spin liquid (SL)] is illustrated in Fig. 38, where the high–temperature value of the entropy has been plotted for these two phases as a function of the number n of electrons per atom [the statistical distribution Eq. (41d) was used to calculate the entropy S(n) for the latter phase]. Only the spin–liquid case correctly reproduces the entropy of localized moments when the Mott insulator limit is reached for n → 1. This limiting value of the entropy per mole, S = R ln 2 for n → 1, represents one of the necessary conditions to be fulfilled by any theory claiming to describe properly the situation near the Mott insulator limit. Additionally, those systems are characterized by pseudo-particles with a very heavy

FIGURE 38 The high-temperature limiting value for the entropy (in units of the gas constant R) as a function of n for the Fermi liquid (FL) and the spin liquid (SL). Note the difference in the values by a factor of two in the limit n → 1.


FIGURE 39 Schematic representation of the difference between a normal metal and a correlated metal. Only the latter state may lead to Mott localization, as well as to heavy-fermion and spin-liquid metallic phases.

effective mass $m^* \sim \delta^{-1}$ or $W^* \sim \delta$. For δ ≪ 1, the band energy becomes comparable to the kinetic exchange characterized by $J = W^2/(Uz)$. Itinerant systems for which J ≳ W* are called quantum spin–liquid systems. The Mott insulator, the spin–liquid, and the heavy–fermion states are the primary phases of correlated electrons different from the normal–metal state. This difference is sketched out in Fig. 39, where the arrows point both to common features for normal and correlated metals and to those specific to the correlated systems.
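The factor of two between the Fermi-liquid and spin-liquid entropies at n → 1 can be reproduced by simple state counting. The sketch below assumes the high-temperature limit in which every allowed local configuration is weighted only by the filling n: for the Fermi liquid each of the two spin orbitals per site is occupied independently with probability n/2, whereas for the spin liquid double occupancy is excluded, so a site is empty, spin-up, or spin-down. This counting is an assumption standing in for the distribution of Eq. (41d), which is not reproduced here; the function names are illustrative.

```python
import math


def _xlogx(p):
    """p*ln(p), with the limiting value 0 at p = 0."""
    return 0.0 if p <= 0.0 else p * math.log(p)


def entropy_fermi_liquid(n):
    """S/R per site for 0 <= n <= 2: two independent spin orbitals, each filled with probability n/2."""
    p = n / 2.0
    return -2.0 * (_xlogx(p) + _xlogx(1.0 - p))


def entropy_spin_liquid(n):
    """S/R per site for 0 <= n <= 1: empty (1-n), spin up (n/2), spin down (n/2); no double occupancy."""
    return -(_xlogx(1.0 - n) + 2.0 * _xlogx(n / 2.0))


if __name__ == "__main__":
    for n in (0.25, 0.5, 0.75, 1.0):
        print(f"n = {n:.2f}  FL: {entropy_fermi_liquid(n):.4f}  SL: {entropy_spin_liquid(n):.4f}")
    print("ln 2 =", math.log(2.0))
```

At n = 1 the first expression gives 2 ln 2 and the second ln 2, which is the entropy of one localized spin-1/2 moment per site.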

The correlated systems that interested us here may also be called almost-localized systems. As discussed in Section II, there are two classes of such systems, separated roughly by the Mott–Hubbard boundary U = Uc ∼ W: those for which the Coulomb interaction U < Uc are regarded as Fermi liquids and have been treated extensively in Refs. 29 and 30, while those for which U > Uc are the spin liquids. This qualitative division is sketched in Fig. 40, where the various thermodynamic phases have been specified for each class (cf. also Fig. 14 for all magnetic phases). The complementary regimes are those with U/W ≪ 1 and U/W ≫ 1. Most of the metallic systems can be located between these two limiting situations. It remains to be proven more precisely that the Mott–Hubbard boundary separating, for n = 1, the Fermi liquid from the Mott insulator extends to the part of the diagram with n ≠ 1,

FIGURE 40 Qualitative distinction between the Fermi-liquid and the spin-liquid states. The Mott boundary U = Uc roughly separates the two limiting phases.

where the Fermi liquid transforms with increasing interaction into a non-Fermi liquid. This is a fundamental problem, related, in the case of strongly interacting systems, to the question of the validity of the Luttinger theorem⁴ and to the problem of the existence of local magnetic moments in the itinerant–electron picture, that is, to the problem of the validity of the Bloch theorem for a correlated metal. Also, the question of applicability of the Fermi–liquid concept in the limit U/W ≫ 1 is connected with that concerning the properly defined existence of fermion quasiparticles,⁵ interacting only weakly among themselves. One should emphasize that the standard mean-field treatment of superconductivity presented in Section IV reduces the whole problem to the single-particle approach with a self-consistent field $\sim\Delta_k$. It is not yet completely clear what types of collective excitations (antiferromagnetic spin fluctuations? stripes?) are needed to make the theory complete. The introduction of holons as bosons and spinons as fermions [35] seems to be just one possibility; a more natural treatment seems to be one of holons as spinless fermions and of spinons as boson operators that reflect magnonlike properties of local moments.

Early studies of high-Tc oxides revealed that some of their characteristics are close to those provided by the BCS theory. Namely, the value of $2\Delta_0/k_B T_c \simeq 4 \div 6$ is

⁴ The Luttinger theorem states that, as long as the metallic state is stable, the volume encircled by the Fermi surface remains independent of the strength of the electron–electron interaction. This theorem is not valid when the Mott transition takes place, as the Fermi surface then disappears. The volume also doubles when the metal is described by the statistical spin liquid discussed in Section II (cf. Figs. 16a and b).

⁵ The holons and spinons cannot be regarded as quasiparticles, since the Green function describing them has branch cuts rather than poles.


indicated [107], the temperature dependence of the London penetration depth is close to $[1 - (T/T_c)^4]^{-1/2}$ over a wide temperature range [108], and the electron pairing is in the spin-singlet state [109]. Additionally, the shape of the Fermi surface for YBa2Cu3O7, as determined by the positron annihilation technique [110], agrees with the predictions of the band-structure calculations for an even (n ≈ 4) number of electrons. These results do not necessarily eliminate the principal features obtained from the theory of strong electron correlations. We think that before discarding the theory based on electron correlations, we must show clearly that the stoichiometric La2CuO4 or YBa2Cu3O6 compounds are not insulating in the paramagnetic phase; actually, they seem to be paramagnetic insulators with well–defined magnetic moments (that is, with the Curie–Weiss law for the magnetic susceptibility obeyed), which strongly supports the view that they are Mott insulators. In this respect, the situation in heavy–fermion systems is rather clear, since the recent theoretical results [39–42] based on the theory of strongly correlated and hybridized states provide a reasonable rationalization of most of the properties of their normal state. The mechanism of pairing in superconducting heavy–fermion systems has not yet been determined fully; but in view of the circumstance that some of the superconductors (for example, UPt3) are antiferromagnetic and exhibit pronounced spin fluctuations in the normal state, the spin–fluctuation mechanism in the version outlined in Section IV.A.2 is a strong candidate [111]. In the coming years one should be able to see a clarification of these problems.
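The two BCS-like characteristics quoted above are easy to evaluate. The sketch below computes the penetration-depth ratio λ(T)/λ(0) from the quoted temperature dependence and converts a gap ratio $2\Delta_0/k_B T_c$ into a zero-temperature gap in meV. The choice Tc = 90 K is an illustrative assumption; the value 3.53 is the standard weak-coupling BCS ratio, included for comparison with the range 4 to 6 quoted in the text.

```python
K_B_MEV_PER_K = 0.08617333262  # Boltzmann constant in meV/K


def penetration_depth_ratio(temperature, tc):
    """lambda(T)/lambda(0) = [1 - (T/Tc)^4]^(-1/2), defined for 0 <= T < Tc."""
    if not 0.0 <= temperature < tc:
        raise ValueError("temperature must satisfy 0 <= T < Tc")
    return (1.0 - (temperature / tc) ** 4) ** -0.5


def gap_mev(tc, ratio):
    """Zero-temperature gap Delta_0 in meV from 2*Delta_0/(k_B*Tc) = ratio."""
    return 0.5 * ratio * K_B_MEV_PER_K * tc


if __name__ == "__main__":
    tc = 90.0  # K, illustrative assumption
    for t in (0.0, 30.0, 60.0, 85.0):
        print(f"T = {t:5.1f} K  lambda(T)/lambda(0) = {penetration_depth_ratio(t, tc):.3f}")
    for ratio in (3.53, 4.0, 6.0):
        print(f"2*Delta0/(kB*Tc) = {ratio:4.2f}  ->  Delta0 = {gap_mev(tc, ratio):.1f} meV")
```

The penetration depth stays close to its zero-temperature value over most of the range and diverges only near Tc, which is why the power-law form fits "over a wide temperature range".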

Let us end with a methodological remark concerning the analogy between the studies of magnetism and superconductivity. In 1928, Heisenberg introduced the exchange interaction $J_{ij}\,\mathbf{S}_i\cdot\mathbf{S}_j$ between the magnetic atoms with spins $\{\mathbf{S}_i\}$. The ferromagnetic state was understood in terms of a molecular field $H_i \sim \langle \mathbf{S}_i\rangle$, which was related to the direct exchange integral $J_{ij}$. Later, various other exchange interactions were introduced, such as superexchange, double exchange, the RKKY interaction, the Bloembergen–Rowland interaction, Hund's rule exchange, and kinetic exchange, to explain magnetism in specific systems, such as oxides, rare-earth metals, and transition metals. However, all these new theories provided a description in terms of a single order parameter, the magnetization $\langle \mathbf{S}_i\rangle$; the particular feature of the electron states in each case (localized states, itinerant states, or a mixture of the two) is contained only in the way of defining this order parameter or the exchange integral. By analogy, the BCS theory provided a concept of a superconducting order parameter ($\Delta_k$), which is universal for all theories of singlet superconductivity. New mechanisms of pairing should provide a novel interpretation of the coupling constant $V_{kk'}$ as well as supplying some details concerning the specific features of the system under consideration: the gap anisotropy, the role of hybridization, etc. It remains to be seen if some qualitative differences arise if superconductivity should occur as a result of Bose condensation of preexisting pairs. This question is particularly important in the case when the coherence length is small, as in high-Tc systems.

In the coming years, one should see detailed calculations within the exchange mechanism and comparisons with experiment concerning the complete phase diagram, as well as the thermodynamic and electromagnetic properties of the new superconductors La2−xSrxCuO4 and YBa2Cu3O7−δ. It would not be surprising if the final answer for these systems came from a detailed analysis of the model outlined in Section IV.A. The systems Ba1−xKxBiO3 will probably be described satisfactorily within the standard phonon–mediated mechanism. On the other hand, it is too early to say anything definite about Bi and Tl compounds with Tc > 100 K, though the suggested influence of the CuO2 planes on the electronic structure near εF seems to indicate a nontrivial role of the exchange interactions also in those systems when coupled with interlayer pair tunneling. One of the missing links between the properties of the last two classes of compounds and those of La2−xSrxCuO4 is the conspicuous lack of evidence for antiferromagnetism in the Bi2Sr2CaCu2O8 and the Tl2Ca2Ba2Cu3O10−y compounds.

Note added in August 2000. This article was originally written almost 12 years ago. During those years a tremendous number of papers has been published, but the questions concerning either the pairing mechanism or the non-Fermi-liquid behavior have not been clearly resolved, either for high-Tc or for heavy-fermion superconductors. Nonetheless, a number of experimental results have been obtained in a clear form for high-Tc systems. Let us mention two additional results. First, the role of the hopping between the next-nearest neighbors is important for obtaining the open Fermi surface (cf. Fig. 41). This Fermi surface is obtained from photoemission and encompasses all electrons [112], i.e., not only the hole states in the doped Mott insulator. Thus, the principal question is how to reconcile the strong-correlation nature of the electrons in the CuO2 plane, as reviewed above, with the Luttinger theorem, which seems to be obeyed in optimally doped and overdoped systems, as concluded from the photoemission data. Does this mean that the photoemission experiment samples states physically different from those involved in thermally induced transport properties? The Fermi-liquid features seem clearly to break down in underdoped systems [113], where a pseudogap related to the superconducting gap is also observed [114].


FIGURE 41 Schematic representation of a two-dimensional Fermi surface for a nonzero amplitude of second-nearest-neighbor hopping.

ACKNOWLEDGMENTS

I would like to thank Leszek Spalek for technical help. This work was supported by KBN (Poland) Project No. 2PO3B O92 18.

SEE ALSO THE FOLLOWING ARTICLES

ELECTRONS IN SOLIDS • SUPERCONDUCTING CABLES • SUPERCONDUCTING DEVICES • SUPERCONDUCTIVITY • SUPERCONDUCTORS, HIGH TEMPERATURE

BIBLIOGRAPHY

1. Bardeen, J., Cooper, L. N., and Schrieffer, J. R. (1957). Phys. Rev. 106, 162. Bardeen, J., Cooper, L. N., and Schrieffer, J. R. (1957). Phys. Rev. 108, 1175.

2. Schrieffer, J. R. (1964). “Theory of Superconductivity,” W. A. Benjamin, Reading, PA.

3. De Gennes, P. G. (1966). “Superconductivity of Metals and Alloys,” W. A. Benjamin, Reading, PA.

4. Tinkham, M. (1996). “Introduction to Superconductivity,” 2nd ed., McGraw–Hill, New York.

5. Rickayzen, G. (1965). “Theory of Superconductivity,” Wiley, New York.

6. Blatt, J. M. (1964). “Theory of Superconductivity,” Academic Press, New York.

7. Barone, A., and Paterno, G. (1982). “Physics and Applications of the Josephson Effect,” John Wiley, New York.

8. Landau, L. D., and Lifshitz, E. M. (1950). “Statistical Physics,” 2nd ed., Part 2, Chap. 5, Pergamon, Oxford. Abrikosov, A. A., Gorkov, L. P., and Dzyaloshinski, I. E. (1963). “Methods of Quantum Field Theory in Statistical Physics,” Chap. 7, Dover, New York.

9. Parks, R. D. (ed.) (1969). “Superconductivity” (2 vols.), Dekker, New York.

10. Kuper, C. G. (1968). “An Introduction to the Theory of Superconductivity,” Clarendon Press, Oxford, New York. Rose-Innes, A. C., and Rhoderick, E. H. (1969). “Introduction to Superconductivity,” Pergamon Press, Oxford.

11. Frohlich, H. (1952). Proc. Roy. Soc. A 215, 291.

12. Cooper, L. N. (1956). Phys. Rev. 104, 1189.

13. Bogoliubov, N. N. (1958). Nuovo Cimento 7, 794. Valatin, J. G. (1958). Nuovo Cimento 7, 843. Nambu, Y. (1960). Phys. Rev. 117, 648.

14. Morel, P., and Anderson, P. W. (1962). Phys. Rev. 125, 1263. See also Scalapino, D. J., in Ref. 9, Chap. 10.

15. Eliashberg, G. M. (1966). Zh. Eksp. Teor. Fiz. 38, 966 [Sov. Phys. JETP 11, 696 (1960)]. For review see Allen, P. B., and Mitrovic, B. (1982). In “Solid State Physics” (H. Ehrenreich, F. Seitz, and D. Turnbull, eds.), pp. 2–92, Academic Press, New York.

16. Migdal, A. B. (1958). Zh. Eksp. Teor. Fiz. 34, 1438 [Sov. Phys. JETP 7, 996 (1958)].

17. McMillan, W. L. (1968). Phys. Rev. 167, 331.

18. Khan, F. S., and Allen, P. B. (1980). Solid State Commun. 36, 481.

19. Cohen, M. L., and Anderson, P. W. (1972). AIP Conf. Proc. No. 4 (D. H. Douglass, ed.), p. 17, AIP, New York.

20. Maxwell, E. (1950). Phys. Rev. 78, 477. Reynolds, C. A., Serin, B., Wright, W. H., and Nesbitt, L. B. (1950). Phys. Rev. 78, 487.

21. Some of the monographs and reviews which appeared during the last decade are as follows. (a) Ginsberg, D. H. (ed.) (1989–1995). “Physical Properties of High Temperature Superconductors,” Vols. 1–5, World Scientific, Singapore. (b) Battlogg, B., et al. (1996). “Proceedings of the 10th Anniversary Workshop on Physics, Materials and Applications,” World Scientific, Singapore. (c) Tsuneto, T. (1998). “Superconductivity and Superfluidity,” Cambridge University Press, Cambridge. (d) Cyrot, M., and Pavuna, D. (1992). “Introduction to Superconductivity and High-Tc Materials,” World Scientific, Singapore. (e) Carbotte, J. P. (1990). Rev. Mod. Phys. 62, p. 1027ff. (f) Waldram, J. R. (1996). “Superconductivity of Metals and Cuprates,” IOP, Bristol, Philadelphia. (g) Narlikar, A. (ed.) (1990–). “Studies in High Temperature Superconductors,” Vols. 1–16, Nova Science, New York, Budapest. (h) Anderson, P. W. (1997). “The Theory of Superconductivity in High-Tc Cuprates,” Princeton University Press, Princeton, NJ. (i) Bisarsh, A. (ed.) (1999). “Superconductivity: An Annotated Bibliography with Abstracts,” Nova Science, New York. (j) Poole, C. P. (1999). “Handbook of Superconductivity,” Academic Press, San Diego, CA. (k) Plakida, N. M. (1995). “High Temperature Superconductivity,” Springer Verlag, Berlin.

22. For review see, e.g., Baym, G., and Pethick, C. (1978). In “The Physics of Liquid and Solid Helium” (K. H. Bennemann and J. B. Ketterson, eds.), Chap. 3, Wiley, New York; Pines, D., and Nozieres, P. (1996). “The Theory of Quantum Liquids,” W. A. Benjamin, New York.

23. Mott, N. F. (1974). “The Metal–Insulator Transitions,” Taylor and Francis, London. Heritier, M., and Lederer, P. (1977). J. Phys. (Paris) 38, L209.

24. Vaknin, D., et al. (1987). Phys. Rev. Lett. 58, 2802. Endoh, Y., et al. (1987). Phys. Rev. B 37, 7443.

25. Hubbard, J. (1964). Proc. R. Soc. London A 281, 401.

26. Acquarone, M., Ray, D. K., and Spałek, J. (1982). J. Phys. C 15,

959.27. Kuwamoto, H., Honig, J. M., and Appel, J. (1980). Phys. Rev. B

22, 2626.

Page 372: Encyclopedia of Physical Science and Technology - Condensed Matter

P1: GQT/GUE P2: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN016A-750 July 31, 2001 15:31

288 Superconductivity Mechanisms

28. Spal✭ek, J., Oles, A. M., and Honig, J. M. (1980). Phys. Rev. B 28,6802.

29. Brinkman, W. F., and Rice, T. M. (1970). Phys. Rev. B 2, 4302.30. Spal✭ek, J., Datta, A., and Honig, J. M. (1987). Phys, Rev. Lett. 59,

728. Spal✭ek, J., Kokowski, M., and Honig, J. M. (1989). Phys. Rev.B 39, 4175. For d = ∞ solution see, e.g., Gebhard, F. (1997). “TheMott Metal–Insulator Transition,” Springer Verlag, Berlin.

31. For the Mott insulators, the kinetic exchange interaction was intro-duced by in Anderson, P. W. (1959). Phys. Rev. 115, 2. This resultwas extended to the case of a strongly correlated metal by Chao,K. A., Spal✭ek, J., and Oles, A. M. (1977). J. Phys. C 10, L271.Cf. also Spal✭ek, J., and Oles, A. M. (1976). Jagiellonian Universitypreprint SSPJU–6/76, Oct. Zaanen, J., and Sawatzky, G. A. (1990).J. Solid State Chem. 88, 8ff.

32. Vischer, P. B. (1974). Phys. Rev. 10, 943. Spal✭ek, J., Oles, A. M.,and Chao, K. A. (1981). Phys. Stat. Sol. (b) 108, 329. Nagaoka, Y.(1966). Phys. Rev. 147, 392.

33. (a) Varma, C. M., et al. (1989). Phys. Rev. Lett. 63, 1996;(1989). Int. J. Mod. Phys. B 3, 2083–2118; Littlewood, P. B., andVarma, C. M. (1991). J. Appl. Phys. 69, 4979. (b) For review seePines, D. (1998). In “The Gap Symmetry and Fluctuations in High-TC Superconductors” (J. Bok et al., eds.), Plenum Press, New York,NATO ASI Series, Ser. B, Vol. 371, and references therein. (c)Moriya, T., and Ueda, K. (2000). Adv. Phys. 49, 555–606, and ref-erences therein. (d) Lee, P. A., and Nagaosa, N. (1992). Phys. Rev. B46, 5621; for review see Nagaosa, N. (1999). “Quantum Field The-ory in Strongly Correlated Electronic Systems,” Chap. 5, SpringerVerlag, Berlin; Lee, P. A. (1996). J. Low Temp. Phys. 105, 581. (e)van den Brink, J., et al. (1995). Phys. Rev. Lett. 75, 4658; (1996).Phys. Rev. Lett. 76, 2826. (f ) Bal✭a, J., et al. (1995). Phys. Rev. B52, 4597. (g) Dagotto, E. (1994). Rev. Mod. Phys. 66, 763, andreferences therein; Dagotto, E., and Rice, T. M. (1996). Science271, 618; Dagotto, E. (1998). J. Phys. Chem. Solids 59, 1699. (h)Zaanen, J. (1998). J. Phys. Chem. Solids 59, 1769; Castellani, C.,et al., ibid., p. 1694; Kivelson, S. A., ibid. p. 1705, and referencestherein.

34. (a) Takagi, H., et al. (1992). Phys. Rev. Lett. 69, 2975. (b) Rossat-Mignot, J., et al. (1991). Physica C 185–189, 86; Bourges, P., et al.(1997). Phys. Rev. B 56(11) 439. (c) Anderson, P. W. (1997). “TheTheory of Superconductivity in the High-TC Cuprates,” Prince-ton University Press, Princeton, NJ; Byczuk, K., Spalek, J., andWojcik, W. (1998). Acta Phys. Polonica B 29, 3871; Varma, C. M.(1997). Phys. Rev. B 55(14), 554.

35. Spal✭ek, J., and Wojcik, W. (1988). Phys. Rev. B 37, 1532.36. Kivelson, S. A., Rokhsar, D. S., and Sethna, J. P. (1987). Phys. Rev.

B 35, 8865. Anderson, P. W., and Zou, Z. (1988). Phys. Rev. Lett.60, 132.

37. Anderson, P. W. (1988). Cargese 1988, Lecture Notes. Also (1987).In “Proceedings of the International School ‘Enrico Fermi’—1987:Frontiers and Borderlines in Many—Particle Physics,” North-Holland, Amsterdam.

38. For a recent review, see Fukuyama, H., Hasegawa, Y., and Suzu-mura, Y. (1988). Physica C 153–155.

39. Anderson, P. W. (1988). Physica C 153–155.40. Lee, P. A., Rice, T. M., Serene, J. W., Sham, L. J., and Wilkins, J. W.

(1986). Comments Condens. Matter. Phys. 12, 99.41. Rice, T. M. (1987). “In Proceedings of the International School

of Physics ‘Enrico Fermi’—1987: Frontiers and Borderlines inMany—Particle Physics,” North-Holland, Amsterdam.

42. Newns, D. M., and Read, N. (1987). Adv. Phys. 36, 799.43. Fulde, P., Keller, J., and Zwicknagl, G. (1988). In “Solid State

Physics” (H. Ehrenreich and D. Turnbull, eds.), Vol. 41, AcademicPress, San Diego, CA.

44. For a review of experimental properties of heavy fermions, seeStewart, G. R. (1984). Rev. Mod. Phys. 56, 755. Fisk, Z., et al.(1986). Nature (London) 320, 124. Steglich, F. (1955). In “The-ory of Heavy Fermions and Valence Fluctuations” (T. Kasuya andT. Saso, eds.), p. 23ff, Springer-Verlag, Berlin, New York.

45. Wrobel, P., and Jacak, L. (1988). Mod. Phys. Lett. B 2, 511.46. Anderson, P. W. (1987). Science 235, 1196.47. Shirane, G., et al. (1988). Phys. Rev. Lett. 59, 1613.48. Chakravarty, S., et al. (1988). Phys. Rev. Lett. 60, 1057.49. Nucker, M., et al. (1987). Z. Phys. B 67, 9. Fujimori, A., et al.

(1987). Phys. Rev. B 35, 8814. Steiner, P., et al. (1988). Z. Phys. B69, 449.

50. Mattheis, L. F. (1987). Phys. Rev. Lett. 58, 1028. Yu, J., Free-man, A. J., and Xu, J.-H. (1987). Phys. Rev. Lett. 58, 1035.Szpunar, B., and Smith, V. H., Jr. (1988). Phys. Rev. B 37, 2338.For review see Hass, K. C. (1989). In “Solid State Physics” (H.Ehrenreich and D. Turnbull, eds.), Vol. 42, Academic Press, SanDiego, CA.

51. Fulde, P. (1988). Physica 153–155, 1769. Hybertsen, M. S., et al.(1994). Phys. Rev. B 41, 11068.

52. Zhang, F. C., and Rice, T. M. (1988). Phys. Rev. B 37, 3759.53. Schafroth, M. R. (1955). Phys. Rev. 100, 463.54. For a review see Leggett, A. (1975). Rev. Mod. Phys.55. Shafer, M. W., Penney, T., and Olson, B. L. (1987). Phys. Rev. B

36, 4047.56. Spal✭ek, J, (1988). Phys. Rev. B 37, 533. Acquarone, M. (1988).

Solid State Commun. 66, 937.57. Baskaran, G., Zou, Z., and Anderson, P. W. (1987). Solid State

Commun. 63, 973.58. Cyrot, M. (1987). Solid State Commun. 62, 821.59. Ruckenstein, A. E., Hirschfeld, P. J., and Appel, J. (1987). Phys.

Rev. B 36, 857.60. Kotliar, G. (1988). Phys. Rev. B 37, 3664.61. Isawa, Y., Maekawa, S., and Ebisawa, H. (1987). Physica 148B,

391.62. Zou, Z., Anderson, P. W. (1988). Phys. Rev. B 37, 627.63. Inui, M., Doniach, S., Hirschfeld, P. J., and Ruckenstein, A. E.

(1988). Phys. Rev. B 37, 2320.64. Suzumura, Y., Hasegawa, Y., and Fukuyama, H. (1988). J. Phys.

Soc. Jpn. 57, 2768.65. Kotliar, G., and Liu, J. (1988). Phys. Rev. B 38, 5142.66. Nagaosa, N. (1966). In “Proceedings of the 10th Anniversary HTS

Workshop on Physics, Materials, and Applications” (B. Batlogget al., eds.), pp. 505ff, World Scientific, Singapore, and referencestherein.

67. (a) Labbe, J., and Bok, J. (1987). Europhys. Lett. 3, 1225;Newns, D. M., et al. (1992). Comments Cond. Mat. Phys. 15, 273.(b) Bouvier, J., and Bok, J. (1998). In “The Gap Symmetry andFluctuations in High-TC Superconductors” (J. Bok et al., ed.), pp.37–54, Plenum Press, New York. (c) Van Harlingen, D. J. (1995).Rev. Mod. Phys. 67, 515. (d) Annett, J. F., Goldenfeld, N., andLegett, A. J. (1996). In “Physical Properties in High-TemperatureSuperconductors,” Vol. 5 (D. M. Ginsberg, ed.), pp. 375–461, WorldScientific, Singapore. (e) Chakravarty, S., et al. (1993). Science 261,337. (f ) Byczuk, K., and Spal✭ek, J. (1996). Phys. Rev. B 53, R518.(g) Leggett, A. J. (1998). J. Phys. Chem. Solids 59, 1729; Tsvetkow,A. A., et al. (1998). Nature 395, 360. (h) Lawrence, W. E., and Do-niach, S. (1971). In “Proc. 12th Int. Conf. Low Temp. Phys.” (E.Kanda, ed.), Keigaku, Tokyo; see also Tinkham, M., in Ref. 4.

68. Mila, F. (1988). Phys. Rev. B 38, 11358, and references therein.69. Spal✭ek, J. (1988). Phys. Rev. B 38, 208. Spal✭ek, J. (1988). J. Solid

State Chem. 76, 224.70. Schrieffer, J. R., and Wolff, P. A. (1966). Phys. Rev. 149, 491.

Page 373: Encyclopedia of Physical Science and Technology - Condensed Matter

P1: GQT/GUE P2: GLQ Final Pages

Encyclopedia of Physical Science and Technology EN016A-750 July 31, 2001 15:31

Superconductivity Mechanisms 289

71. Anderson, P. W. (1984). Phys. Rev. B 30, 1549. Miyake, K.,Schmitt–Rink, S., and Varma, C. M. (1986). Phys. Rev. B 34, 6554.Scalapino, D. J., Loh, E., and Hirsch, J. E., (1986). Phys. Rev. B34, 8190. Norman, M. R. (1987). Phys. Rev. Lett. 59, 232. Nor-man, M. R. (1988). Phys. Rev. B 37, 4987.

72. Newns, D. M. (1987). Phys. Rev. B 36, 5595. Newns, D. M. (1988).Phys. Scripta T 23, 113.

73. Torrance, J. B., et al. (1988). Phys. Rev. Lett. 61, 1127.74. Rice, T. M., and Ueda, K. (1985). Phys. Rev. Lett. 55, 995.

Rice, T. M., and Ueda, K. (1986). Phys. Rev. B 34, 6420.75. Miyake, K., Matsuura, T., Sano, K., and Nagaoka, Y. (1988). J.

Phys. Soc. Jpn. 57, 722.76. Anderson, P. W. (1959). Phys. Rev. 115, 2. Also (1963). In “Solid

State Physics” (F. Seitz and D. Turnbull, eds.), Vol. 14, pp. 99–213,Academic Press, New York. Cf. also Zaanen, J., and Sawatzky,G. A. (1990). J. Solid State Chem. 88.

77. For a review see, e.g., Vonsovskii, S. V. (1974). “Magnetism” Wiley,New York.

78. Zaanen, J., Sawatzky, G. A., and Allen, J. W. (1985). Phys. Rev.Lett. 55, 418; (1986). J. Magn. Magn. Mat. 54–57, 607.

79. (a) Aharony, A., et al. (1988). Phys. Rev. Lett. 60, 1330. (b)Tranquada, J. M., et al. (1988). Phys. Rev. Lett. 60, 156.

80. Spal✭ek, J., and Honig, J. M. (1990). In “Studies of High-Temperature Superconductors” (A. Narlikar, ed.), Vol. 4, NovaScience, New York.

81. The p–p pairing was discussed first by Emery, V. J. (1987). Phys.Rev. Lett. 58, 2794, and also by Emery, V. J., and Reiter, G. (1988).Phys. Rev. B 38, 4547.

82. The coexistence of SDW and SC states has also been discussedwithin BCS theory: Baltensperger, W., and Strassler, S. (1963).Phys. Kondens. Mater. 1, 20; Nass, M. J., et al. (1981). Phys. Rev.Lett. 46, 614. Overhauser, A. W., and Daemen, L. (1988). Phys.Rev. Lett. 61, 1885. The corresponding problem for exchange–mediated superconductivity has been outlined by Parmenter, R. H.(1987). Phys. Rev. Lett. 59, 923.

83. Weidinger, A., et al. (1989). Phys. Rev. Lett. 62, 102. Brewer, J. H.,et al. (1988). Phys. Rev. Lett. 60, 1073.

84. Bednorz, J. G., and Muller, K. A. (1986). Z. Phys. B 64, 189.Chu, C. W., et al. (1987). Phys. Rev. Lett. 58, 405. Uchida, S.,et al. (1987). Jpn. J. Appl. Phys. 26, L1. Wu, M. K., et al. (1987).Phys. Rev. Lett. 58, 908.

85. abbe, J., and Bok, J. (1987). Europhys. Lett. 3, 1225.86. Prelovsek, P., Rice, T. M., and Zhang, F. C. (1987). J. Phys. C 20,

L229.87. Jorgensen, J. D., et al. (1988). Phys. Rev. Lett. 58, 1024.88. Weber, W. (1987). Phys. Rev. Lett. 58, 1371. Barisic, S., Batistic, J.,

and Friedel, J. (1987). Europhys. Lett. 3, 1231.89. For review of phonon- and bipolaron-mediated superconductivity

see, e.g., Oguri, A. (1988). J. Phys. Soc. Jpn. 57, 2133; de Jongh,L. J. (1988). In “Proc. 1st Int. Symp. Superconduct.,” Nagoya,Springer—Verlag, New York.

90. Alexandrov, A., and Ranninger, J. (1981). Phys. Rev. B 24, 1164.Alexandrov, A., Ranninger, J., and Robaszkiewicz, S. (1986). Phys.Rev. B 33, 4526. For review see Micnas, R., Ranninger, J., and

Robaszkiewicz, S. (1990). Rev. Mod. Phys. 62, 113; Ranninger, J.(1998). J. Phys. Chem. Solids 59, 1759, and references therein.

91. Batlogg, B., et al. (1987). Phys. Rev. Lett. 59, 912. Faltens, T. A.,et al. (1987). Phys. Rev. Lett. 59, 915.

92. Leary, K. J., et al. (1987). Phys. Rev. Lett. 59, 1236.93. Mattheis, L. F., Gyorgy, E. M., and Johnson, D. W., Jr. (1988).

Phys. Rev. B 37, 3745. Cava, R. J., et al. (1988). Nature 332, 814.Hinks, D. G., et al. (1988). Nature 333, 6176.

94. Hinks, D. G., et al. (1988). Nature 335, 419.95. Pei, S., et al., preprint.96. Rice, T. M. (1988). Nature 332, 780.97. Dabrowski, B., personal communication.98. Varma, C. M. (1988). Phys. Rev. Lett. 61, 2713.99. Sleight, A. W., Gillson, J. J., and Bierstedt, P. E. (1975). Solid

State Commun. 17, 27.100. For critical estimates of isotope shifts of the Tc value, see

Allen, P. B. (1988). Nature 335, 396.101. Chakraverty, B. K. (1979). J. Phys. Lett. 40, L99, Alexandroy, A. S.,

and Ranninger, J. (1981). Phys. Rev. B 23, 1796. Alexandrov, A. S.,Ranninger, J., and Robaszkiewicz, S. (1986). Phys. Rev. B 33,4526.

102. Rice, T. M. (1988). Nature 332, 780.103. Prelovsek, P., Rice, T. M., and Zhang, F. C. (1987). J. Phys. C 20,

L229.104. Little, W. A. (1964). Phys. Rev. 134A, 1416. Ginzburg, V. L.

(1970). Sov. Phys. Uspekhi 13, 335.105. Varma, C. M., Schmitt-Rink, S., Abrahams, E. (1987). In

“Proceedings of the Conference on Novel Mechanisms of Super-conductivity” (S. A. Wolff and V. Z. Kresin, eds.), p. 355, PlenumPress, New York; (1987). Solid State Commun. 62, 681.

106. Weber, W. (1988). Z. Phys. B 70, 323.107. Scalpino, D. J., Loh, E., Jr., and Hirsch, J. E. (1987). Phys. Rev. B

35, 6694. Schrieffer, J. R., Wen, X. G., and Zhang, S. C. (1988).Phys. Rev. Lett. 60, 944. White, S. R., and Scalpino, D. J. (1997),Phys. Rev. B 55, 6504.

108. For a critical review see Little, W. A. (1988). Science 242, 1390.109. Fiory, A. T., Hebard, A. F., Mankiewich, P. M., and Howard, R. E.

(1988). Phys. Rev. Lett. 61, 1419.110. Niemeyer, J., Dietrich, M. R., and Politis, C. (1987). Z. Phys. B

67, 155.111. Smedskjaer, L. C., Liu, J. Z., Benedek, R., Legnini, D. G.,

Lam, D. J., Stahulak, M. D., and Bansil, A. (1988). Physica C156, 269.

112. Cf. also Mathur, N. D., et al. (1998). Nature 394, 39; Fisk, Z., andPines, D. ibid., p. 22.

113. For review see Shen, Z.-X., and Dessau, D. S. (1995). Phys. Rep.253, 1ff; Ding, H., et al. (1995). Phys. Rev. Lett. 74, 2784; (1996).Phys. Rev. B 54, R9878; Campuzano, J.-C., et al. (1998). In “TheGap Symmetry and Fluctuations in High-TC Superconductors”(J. Bok et al., eds.), Plenum Press, New York.

114. Fujimori, A., et al. (1998). J. Phys. Chem. Solids 59, 1892, andreferences therein.

115. Ding, H., et al. (1998). J. Phys. Chem. Solids 59, 1888, andrefrences therein; Norman, M. R., et al., ibid. p. 1902.


Encyclopedia of Physical Science and Technology EN016A-751 July 31, 2001 15:35

Superconductors, High Temperature

John B. Goodenough
University of Texas

I. Introduction
II. Superconductivity
III. Metallic Oxides
IV. High-Tc Superconductors

GLOSSARY

Bohr magneton µB Magnetic moment of an electron.

Brillouin zone Volume in k space containing two valence electrons per atomic valence orbital per atom of a primitive unit cell of crystal lattice.

Correlation energy Electrostatic electron-electron interactions not accounted for in Hartree-Fock one-electron band theory of an itinerant electron.

Debye temperature ΘD Characteristic temperature proportional to maximum vibrational frequency of atoms of a solid (kΘD = ħωmax).

Isotope Same element with different nuclear masses.

k-space Momentum (or reciprocal-lattice) space in which electron momenta and energies can be plotted.

Magnetic flux Lines of magnetic-field strength defining field direction; their density defines field strength.

Phonon Quantum of lattice vibrational energy ħω.

Quasi-particle Electron of a one-electron energy band renormalized by electron-electron and/or electron-lattice interactions.

Wave function ψ Quantum-mechanical descriptor of an electron; |ψ(r)|² is the probability of finding an electron at position r.

I. INTRODUCTION

In 1908, the Dutch physicist Heike Kamerlingh Onnes succeeded in liquefying helium. This accomplishment made possible the exploration of the low-temperature properties of matter, and in 1911 he reported a phase transition in metallic mercury from a normal state to a superconductive state below a critical temperature Tc. What Kamerlingh Onnes observed was an abrupt change in the direct-current (dc) resistance of mercury at Tc = 4.15 K; the normal state exhibited an electrical resistance Rn with attendant joule heating I²Rn on passing a current I, whereas the superconductive state was a “perfect” conductor with no measurable resistance (Rs = 0). Moreover, in the absence of a magnetic field, Tc is independent of the shape or the size of the sample; superconductivity is an intrinsic property of the material.

Since 1911, many materials have been surveyed to determine whether they are superconductors and, if so, the value of their Tc. Although there is still little understanding to guide the search for, or design of, high-Tc superconductors, extensive investigations of the elements, of alloys, of compounds, and of polymers had, by 1985, resulted in several empirical guidelines.

1. Only metals are superconductors.

2. Superconductivity is associated with a dynamic electron–phonon coupling.

3. The highest values of Tc are associated with partially filled d bands, and Tc varies sensitively with the electron–atom ratio for the partially filled band.

4. Tc is suppressed where the conduction electrons exhibit magnetic order at low temperatures.

5. Tc is suppressed where the electron–phonon coupling becomes static, inducing a phase transformation to a nonmetallic state.

FIGURE 1 Maximum known Tc versus date of discovery.

From 1911 to 1986, the critical temperature Tc remained below 25 K, increasing by less than 0.3 K per year (see Fig. 1). Moreover, the existing theory, applicable to nearly all known superconductors, predicted a ceiling for Tc in the neighborhood of 30 K. Nevertheless, alternative mechanisms for enhancing Tc had been suggested, and a few experimentalists persisted in the hope of finding a material that exhibited such an enhancement.

Bednorz and Müller, of IBM Zurich, were two who persisted, and in 1986 they reported the existence of superconductivity above 30 K in a multiphase oxide containing Ba, La, and Cu. Although their discovery was initially overlooked within their own corporation, it was immediately pursued by Kitazawa and Uchida of the University of Tokyo, who identified the superconductor phase as La2−xBaxCuO4−y having a well-known intergrowth structure. Announcement of this identification in December 1986 at conferences in Boston, Massachusetts, and Bangalore, India, triggered an excited effort to reproduce, extend, and “explain” this breakthrough. Within weeks, substitution of Sr for Ba had increased the Tc to 40 K and attempts to substitute Y for La had resulted in a polyphase mixture containing a new superconductor with a Tc ≈ 90 K. First announced in the New York Times by Chu of the University of Houston, but found independently at the same time by workers in Tokyo, Peking, and Bangalore, the latter discovery electrified the entire solid-state community. A Tc higher than 77 K, the boiling point of nitrogen, introduced an entirely new technical dimension, and conventional theory clearly could not be stretched to include this new finding without some radical modification. A race to articulate this theoretical modification, to establish it, and to use it to find new high-Tc superconductors had begun. Simultaneously, the problem of processing these new materials for technological exploitation began to be addressed in more than 1000 laboratories around the world. After 12 years of intensive effort by many groups, there is no consensus yet even on the character of the normal state out of which the superconductive pairs condense, and processing the brittle ceramic materials into flexible wires or tapes that can remain superconducting in high magnetic fields remains a technical challenge. This article can be only a personal commentary on this activity.

II. SUPERCONDUCTIVITY

A. Phenomenology

1. Nomenclature

A superconductor is any material that undergoes a transition from the normal state to the superconductive state below a critical temperature Tc. It is superconducting when it is carrying a resistance-free (Rs = 0) current (i.e., a supercurrent) in the superconductive state.

2. The Normal State

Superconductors are metallic in the normal state. Each conduction electron of a metal is said to be itinerant because it belongs equally to all like atoms at energetically equivalent lattice positions in a crystal; each may also belong, to a lesser extent, to other atoms in the crystal. Because their position in real space is poorly defined, itinerant electrons are characterized by their momentum vector k, where the momentum p transforms to ħk (ħ = h/2π, where h is Planck’s constant) in the absence of a magnetic field. Where the like-atom interatomic interactions are much stronger than the intraatomic electron–electron interactions, each itinerant electron may be described as a single particle moving in the average electrostatic potential created by the atomic nuclei and all the other electrons; the electrons therefore occupy one-electron states, each having an energy εk and, in the absence of a magnetic field, a twofold spin degeneracy. Moreover, the one-electron energies for an N-atom array are grouped into energy bands containing 2N/n states per atomic orbital, where n ≥ 1 is an integer that depends on the translational symmetry of the crystal. The density of one-electron states N(ε) per infinitesimal energy interval dε is a fundamental parameter; so also is the effective mass m* entering the relationship εk − E0 = ħ²k²/2m*, where E0 is a band-edge reference energy. The Pauli exclusion principle allows one electron per state, so at T = 0 K the electron states are successively occupied from the bottom of an energy band until all the electrons are accounted for.
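The two quantitative statements in this paragraph, the parabolic relationship εk − E0 = ħ²k²/2m* and the filling of spin-degenerate states from the bottom of the band at T = 0 K, can be illustrated with a minimal Python sketch. The function names, the four-level toy band, and the choice of the free-electron mass for m* are assumptions made here for illustration only; they are not taken from the article.

```python
# Physical constants (CODATA 2018 values, SI units).
HBAR = 1.054571817e-34      # reduced Planck constant, J*s
M_E = 9.1093837015e-31      # free-electron mass, kg (stand-in for m*)
EV = 1.602176634e-19        # joules per electron volt


def band_energy(k, m_eff=M_E, e0=0.0):
    """Return eps_k = E0 + hbar^2 k^2 / (2 m*) in joules for wave number k in 1/m."""
    return e0 + (HBAR * k) ** 2 / (2.0 * m_eff)


def fill_states(energies, n_electrons, spin_degeneracy=2):
    """Fill levels from the bottom of the band, as at T = 0 K.

    Each level holds `spin_degeneracy` electrons (one electron per spin state).
    Returns (occupations, highest_occupied_energy).
    """
    if n_electrons > spin_degeneracy * len(energies):
        raise ValueError("more electrons than available states")
    occupations = [0] * len(energies)
    highest = None
    remaining = n_electrons
    # Visit the levels in order of increasing energy.
    for index in sorted(range(len(energies)), key=lambda i: energies[i]):
        if remaining <= 0:
            break
        occupations[index] = min(spin_degeneracy, remaining)
        remaining -= occupations[index]
        highest = energies[index]
    return occupations, highest


if __name__ == "__main__":
    # Hypothetical four-level band (arbitrary energy units) holding four
    # electrons: only the lower half of the band is occupied.
    print(fill_states([0.0, 1.0, 2.0, 3.0], 4))
    # Energy in eV of a state with k = 1e10 1/m, assuming m* equals the
    # free-electron mass.
    print(band_energy(1.0e10) / EV)
```

In this toy picture the highest occupied energy plays the role of the band filling level; a real band calculation would replace the hand-written list of levels with the εk values of the crystal.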

FIGURE 2 Energy versus (a) the density N(ε) of one-electron states for a semiconductor and a metal and (b) the Fermi–Dirac distribution function f(E).

FIGURE 3 Temperature dependence of (a) the resistance and (b) the thermopower of a superconductor.

What distinguishes a metal from a semiconductor such as silicon is that, in a metal, an occupied band of itinerant-electron states is only partially filled (Fig. 2). In this case there is an abrupt change in the electron population at a surface in k space; this surface is called the Fermi surface, and the energy of the Fermi surface is called the Fermi energy EF. At finite temperatures, electrons are thermally excited from occupied to unoccupied states in the band; this process gives rise to a Fermi–Dirac statistical distribution that, for kBT ≪ EF (kB is Boltzmann’s constant), leaves intact the concept of a Fermi surface (Fig. 2b).
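To see why the Fermi surface survives when kBT ≪ EF, it helps to evaluate the Fermi–Dirac function numerically. The sketch below uses the standard form f(E) = 1/(exp[(E − EF)/kBT] + 1); the function name, the 5 eV Fermi energy, and the value 0.5 returned exactly at E = EF for T = 0 are illustrative assumptions rather than anything specified in the article.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K (CODATA 2018)


def fermi_dirac(energy_ev, fermi_energy_ev, temperature_k):
    """Occupation probability f(E) of a one-electron state."""
    if temperature_k < 0:
        raise ValueError("temperature must be non-negative")
    if temperature_k == 0:
        # Step function at T = 0 K; the value at E = EF is a convention.
        if energy_ev < fermi_energy_ev:
            return 1.0
        if energy_ev > fermi_energy_ev:
            return 0.0
        return 0.5
    x = (energy_ev - fermi_energy_ev) / (K_B_EV * temperature_k)
    # Guard against floating-point overflow in exp() far from EF.
    if x > 700.0:
        return 0.0
    if x < -700.0:
        return 1.0
    return 1.0 / (math.exp(x) + 1.0)


if __name__ == "__main__":
    e_fermi = 5.0  # hypothetical Fermi energy in eV
    for energy in (4.9, 4.975, 5.0, 5.025, 5.1):
        print(energy, fermi_dirac(energy, e_fermi, 300.0))
```

At 300 K, kBT is about 0.026 eV, so with a Fermi energy of several eV the occupation changes from nearly 1 to nearly 0 within a narrow window around EF, and the step that defines the Fermi surface is only slightly blurred.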

As the ratio of interatomic/intraatomic interaction energies decreases toward unity, the band of one-electron energies narrows, increasing N(EF) for a given EF; single electrons become transformed into quasiparticles that cannot be described by the average potential of all other electrons because correlations between the electrons on nearest like neighbors reflect the intraatomic electron–electron interactions and/or because electron–phonon interactions “dress” the electrons in local crystallographic distortions. Although these interactions reduce the discontinuity in the electron population at EF, an identifiable Fermi surface remains as long as the specimen is metallic.

Partially occupied, narrow energy bands of quasiparticle states may lose their Fermi surface at EF by inducing a diffusionless phase transition at low temperatures that splits the band into bands of occupied and empty states separated by a finite energy gap Eg. Three types of transitions cause such a splitting: (1) an atomic clustering that changes the translational symmetry of the crystallographic structure, (2) a magnetic ordering that changes the translational symmetry and/or the degree of electron localization at atomic positions, and (3) the onset of superconductivity caused by a pairing of one-particle states having energies near EF into an ordered condensate of two-particle states. Since the first two processes compete with the onset of superconductivity, any realization of a high Tc must involve a mechanism that suppresses the stabilization of atomic clustering and of magnetic ordering of the conducting electrons.

3. The Superconductive State

The critical temperature Tc marks the boundary between two distinguishable thermodynamic states of the material, each with its own set of properties. The superconductive state is distinguished from the normal state by its electric, magnetic, thermodynamic, and tunneling properties.

a. Electric. The dc resistance R of a superconductor wire drops abruptly at Tc, from Rn > 0 in the normal state to Rs = 0 in the superconductive state (Fig. 3a).

In the normal state, the potential difference V between the ends of a wire of length l and resistance Rn is, by Ohm’s law,

$$V = I R_n \tag{1}$$

if the wire carries a current I. By definition, a constant electric field E = V/l then exists in the wire. In the superconductive state, on the other hand, Rs = 0 makes V = E = 0. There is no constant electric field in, or potential difference across, a superconducting wire. Consequently all the thermoelectric effects present in the normal state vanish abruptly at Tc. For example, in the normal state an applied temperature gradient ∇T gives rise to an electric field E in the conductor; the thermoelectric power, defined as E/∇T, vanishes with E below Tc (Fig. 3b).

The resistance Rs of the superconductive state is strictly zero only for direct currents of a constant value. If the current changes with time, as in an alternating-current (ac) application, then Rs is not zero. Nevertheless at temperatures T ≪ Tc, Rs remains much less than the resistance Rn of the normal state for frequencies ν < Eg/h, where Eg is the energy gap (see Section II.C) at the Fermi energy of the superconductive state. The ratio Rs/Rn increases from a small value to nearly 1 in a finite frequency interval Δν (Fig. 4). The width Δν broadens and its midpoint shifts to a lower frequency as T increases to Tc.
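The condition ν < Eg/h sets the frequency scale below which the ac resistance stays small. As a rough numerical illustration, the following sketch converts an energy gap into the corresponding threshold frequency. The function name and the 1 meV gap are hypothetical values chosen only to show the order of magnitude; the article does not quote a gap at this point.

```python
H_PLANCK = 6.62607015e-34   # Planck constant, J*s (exact SI value)
EV = 1.602176634e-19        # joules per electron volt (exact SI value)


def gap_frequency_hz(gap_ev):
    """Return the frequency nu = Eg / h (in Hz) for an energy gap given in eV."""
    if gap_ev < 0:
        raise ValueError("the energy gap must be non-negative")
    return gap_ev * EV / H_PLANCK


if __name__ == "__main__":
    gap = 1.0e-3  # hypothetical gap of 1 meV
    print(f"Eg = {gap * 1e3:.1f} meV -> Eg/h = {gap_frequency_hz(gap) / 1e9:.0f} GHz")
```

A gap of the order of 1 meV therefore corresponds to a threshold in the range of hundreds of gigahertz, which is why the superconductive state looks nearly lossless throughout the radio-frequency range at T ≪ Tc.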

FIGURE 4 Ratio of superconductive-to-normal resistance versus frequency for different values of T < Tc.

b. Magnetic. The magnetization M of a substance is defined as its magnetic moment per unit volume. The magnetic susceptibility per unit volume is defined as

$$\chi = \mu_0 M / B_a, \tag{2}$$

where Ba/µ0 is the intensity of an applied magnetic field.

Substances with a negative magnetic susceptibility are called diamagnetic; those with a positive susceptibility are called paramagnetic. Diamagnetism reflects changes in electron motion that oppose the applied magnetic field; paramagnetism reflects an increase in the populations of electron spins (or of localized atomic moments) oriented parallel to the applied magnetic field. The inner, closed-shell atomic cores retain spin-paired electrons; they always give a small, temperature-independent diamagnetic contribution χcore < 0 to the susceptibility. However, in superconductors the dominant contribution is made by the conduction electrons.

In the normal state of a superconductor, an applied magnetic field Ba defines the orientations of the one-electron spin states and stabilizes the parallel spin states relative to the antiparallel spin states of the conduction band by an energy 2µBBa/µ0, where µB is the magnetic moment imparted by a single-electron spin (the Bohr magneton) (Fig. 5). The resulting change in electron population of parallel (α) versus antiparallel (β) spin states creates, to lowest order, a paramagnetic magnetization

$$M \approx \mu_B^2 N(E_F) \cdot B_a/\mu_0, \tag{3}$$

where N(EF) is the density of one-electron states at EF. First derived by Pauli, this contribution to the total magnetization is called the Pauli spin magnetization. Equation (3) applies to broad energy bands with EF ≫ kBT.

FIGURE 5 Shifting of α-spin and β-spin energies in an applied magnetic field strength Ba/µ0.

Changes in the motions of the conduction electrons introduce a diamagnetic contribution. Landau has shown that where Eq. (3) applies, this contribution to M is minus one-third the Pauli spin magnetization, so the total conduction-electron contribution to the normal-state susceptibility is paramagnetic and temperature independent:

$$\chi_{\mathrm{cond}} = \mu_0 M / B_a = (2/3)\,\mu_B^2 N(E_F) > 0. \tag{4}$$

It generally dominates the total temperature-independent susceptibility,

$$\chi = \chi_{\mathrm{cond}} + \chi_{\mathrm{core}}. \tag{5}$$

If the energy bands are narrow, it is necessary to introduce into χcond a temperature-dependent enhancement factor.
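The factor 2/3 in Eq. (4) follows from adding the Landau diamagnetic term to the Pauli spin magnetization of Eq. (3). A short worked version, written with the article's symbols and conventions, is given below; the labels M_Pauli and M_Landau are introduced here only to keep the two contributions apart.

```latex
\begin{align*}
M_{\mathrm{Pauli}}  &\approx \mu_B^2 N(E_F)\, B_a/\mu_0
    && \text{spin contribution, Eq. (3)} \\
M_{\mathrm{Landau}} &= -\tfrac{1}{3}\, M_{\mathrm{Pauli}}
    && \text{orbital (diamagnetic) contribution} \\
M &= M_{\mathrm{Pauli}} + M_{\mathrm{Landau}}
   \approx \tfrac{2}{3}\, \mu_B^2 N(E_F)\, B_a/\mu_0 \\
\chi_{\mathrm{cond}} &= \mu_0 M / B_a
   \approx \tfrac{2}{3}\, \mu_B^2 N(E_F) > 0
    && \text{Eq. (4)}
\end{align*}
```

Because neither N(EF) nor µB depends on temperature in the broad-band limit EF ≫ kBT, the resulting susceptibility is temperature independent, as stated above.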

In the superconductive state the situation is quite different. Meissner and Ochsenfeld found that, if a superconductor is cooled in a magnetic field to below Tc, the magnetic flux that penetrates the material in the normal state is pushed out of it in the superconductive state, as illustrated in Fig. 6. This phenomenon is called the Meissner effect.

The extent to which the internal magnetic flux is expelled by the Meissner effect depends not only on the temperature and the magnitude of the applied magnetic field Ba, but also on the sample shape and its orientation with respect to Ba. A long, thin cylinder (or wire) oriented with its long axis parallel to Ba has a negligible demagnetizing field within it, and the internal magnetic field is

$$B = B_a + \mu_0 M, \tag{6}$$

where M is the induced magnetization. Complete expulsion of B would make B = 0 inside the superconductor, to give perfect diamagnetism, with

$$M = -B_a/\mu_0 \quad \text{and} \quad \chi = -1. \tag{7}$$

The magnetization curve for such a situation at T < Tc is illustrated in Fig. 7a. It is found to apply quantitatively to pure specimens for applied fields less than a critical field strength Hc(T):

$$B_a/\mu_0 \le H_c. \tag{8}$$
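Equations (6) to (8) together describe the idealized magnetization curve of a long cylinder parallel to the field. The sketch below is one way to write that curve down, under stated assumptions: the function names are invented for illustration, the critical field of 4 × 10⁴ A/m is a hypothetical value, and the small normal-state susceptibility above Hc is neglected (M is set to zero there).

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability, T*m/A (classical value)


def ideal_magnetization(b_applied_t, h_c):
    """Magnetization M (A/m) of an ideal long cylinder parallel to the field.

    b_applied_t: applied field Ba in tesla; h_c: critical field strength in A/m.
    """
    h_applied = b_applied_t / MU_0
    if h_applied <= h_c:
        # Complete flux expulsion, Eq. (7): M = -Ba/mu0.
        return -h_applied
    # Above Hc the sample is normal; its weak susceptibility is neglected.
    return 0.0


def internal_field(b_applied_t, h_c):
    """Internal field B = Ba + mu0 * M in tesla, Eq. (6)."""
    return b_applied_t + MU_0 * ideal_magnetization(b_applied_t, h_c)


if __name__ == "__main__":
    h_c = 4.0e4  # hypothetical critical field strength, A/m
    for b_a in (0.01, 0.03, 0.05, 0.06, 0.10):
        print(b_a, ideal_magnetization(b_a, h_c), internal_field(b_a, h_c))
```

Below the critical field the printed internal field is zero to within rounding error, and above it the internal field equals the applied field, which is the abrupt behavior sketched in Fig. 7a.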

FIGURE 6 (a) The magnetic field in a superconductive cylinder and sphere induced by a uniform applied field Ba. (b) The total induced plus applied field.

Two types of superconductors can be distinguished. Type I, originally termed soft superconductors, exhibit an abrupt loss of the Meissner effect at Hc. Type II superconductors exhibit two critical field strengths (Fig. 7b): Hc1, beyond which the Meissner effect is less than complete; and Hc2, beyond which it disappears completely.

In the absence of a magnetic field, the transition at Tc is always sharp; the material is, essentially, wholly superconductive at temperatures T < Tc and wholly normal at T > Tc. At temperatures T < Tc, the change from the superconductive to the normal state at the critical field strength Hc(T) is not sharp in type II superconductors, and for most geometries it is not sharp even in type I superconductors. Penetration of magnetic flux occurs between Hc1 and Hc2 in Fig. 7b; in this range of applied field, parts of the specimen are in the normal state and parts are in the superconductive state. It is possible to distinguish type I from type II superconductors by the way in which the normal-state regions penetrate the superconductive state with increasing Ba/µ0.

FIGURE 7 Magnetization versus applied magnetic field Ba for a bulk superconductive cylinder with its axis parallel to Ba for (a) type I and (b) type II superconductivity.

The distinction between the two types of superconductors is illustrated in Fig. 8 for the case of a Ba applied perpendicular to a plane slab of a superconductor. If the superconductor is type I, the normal regions enter as relatively thick, parallel laminae; and if both normal and superconductive states coexist, the superconductor is said to be in an intermediate state. If the superconductor is type II, the normal regions enter as numerous, extremely thin tubular filaments separated by small distances (≤10⁻⁵ mm), and for Hc1 < H < Hc2 the superconductor is said to be in a vortex state (or in a mixed state). The field Hc2 in Fig. 7b depends on the mobility of the vortices. Large values of Hc2 are obtained by introducing crystalline imperfections that pin the vortices; flux pinning introduces hysteresis in Fig. 7b between the curve obtained by increasing and that obtained by decreasing Ba. A hard superconductor is a type II superconductor exhibiting a large magnetic hysteresis due to vortex pinning.

FIGURE 8 (a) Intermediate state of a type I superconductor. (b) Vortex state of a type II superconductor.

c. Thermodynamic: Order parameter. In the absence of an applied magnetic field, the transition from the superconductive to the normal state is second order: there is no discontinuity at Tc in either entropy (no latent heat) or volume (no thermal hysteresis), but there is a sharp discontinuity ΔC in the heat capacity C (Fig. 9).

A decrease in entropy on going from the normal to the superconductive state shows that the superconductive state is more ordered and can be described by an order parameter that varies smoothly with temperature from unity at 0 K to 0 at T = Tc. A natural choice for the order parameter in classical physics is ns/n0, the local density of superconductive electrons normalized to its value at 0 K. However, superconductivity is a quantum, not a classical, phenomenon, and a more profound choice is the corresponding quantum physics wave function

ψ(r) = |ψ(r)| exp[iφ(r)], (9)

where φ(r) is a phase factor and ns ≡ |ψ|².

FIGURE 9 Temperature dependence of the specific heat capacity for (a) a BCS superconductor and (b) a typical antiferromagnet.

i. Persistent currents and flux quantization. The most basic implication of the existence of a phase factor in ψ(r) is the quantization of magnetic flux in a superconducting ring. Consider first the macroscopic ring in Fig. 10. The application perpendicular to the ring of a uniform magnetic field of flux density Ba that varies with time t creates a voltage that induces a current I(t) to circulate in the ring. According to Lenz's law

$$-A_r \frac{dB_a}{dt} = R\,I(t) + L\,\frac{dI(t)}{dt}, \tag{10}$$

where Ar is the area enclosed by the ring, R the resistance of the ring, and L the inductance of the ring. If there is no applied magnetic field (Ba = 0), then the solution of Eq. (10) is

I(t) = I(0) exp(−Rt/L), (11)

which shows that any initial current circulating in the ring decays exponentially to zero in the normal state. However, in the superconductive state an R = Rs = 0 makes I(t) = I(0), and the initial current I(0) continues to circulate around the ring without any change in its magnitude. Such currents are known as persistent currents; and any current I circulating around the ring produces a magnetic flux threading the ring equal to LI. In the presence of Ba, the total flux Φ threading the ring is Φ = ArBa + LI, and Eq. (10) reduces to

dΦ/dt = −IR. (12)

In the superconductive state, R = 0 gives

Φ = constant = ArBs, (13)

where Bs is the ring magnetic field.
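The contrast between Eq. (11) in the normal state and in the superconductive state can be made concrete with a short numerical sketch. The ring inductance, resistance, and elapsed times below are hypothetical values chosen only for illustration; they do not come from the text.

```python
import math

def ring_current(i0: float, resistance: float, inductance: float, t: float) -> float:
    """Current I(t) = I(0) exp(-R t / L) of Eq. (11), with no applied field."""
    if inductance <= 0:
        raise ValueError("inductance must be positive")
    return i0 * math.exp(-resistance * t / inductance)

# Hypothetical ring: L = 1 microhenry, initial current 1 A.
L_RING = 1e-6  # henry (illustrative value)

# Normal state with R = 1 milliohm: after one time constant L/R = 1 ms
# the current has dropped by a factor of e.
normal = ring_current(1.0, 1e-3, L_RING, 1e-3)

# Superconductive state with R = 0: the current is unchanged after
# roughly one year (3.15e7 s).
persistent = ring_current(1.0, 0.0, L_RING, 3.15e7)

print(normal, persistent)
```

With R = 0 the exponent vanishes for every t, which is the statement I(t) = I(0) in the text.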

FIGURE 10 Production of persistent current in a superconductive ring: (a) Ba applied normal to a ring at T > Tc; (b) ring cooled to T < Tc; (c) Ba removed, leaving persistent current.

A persistent current and its associated magnetic field are established in a macroscopic ring by introducing a current I(t) into the ring before it is cooled to below Tc; the external circuit is switched off only after the ring is in its superconductive state. In a type II superconductor below Tc, the penetration of flux in a Ba/µ0 > Hc1 is accomplished by the movement of a normal-state filament into the superconductive state; the normal-state region contains flux, and its movement into the superconductor creates a microscopic persistent current within the surrounding superconductive state that traps the flux within the normal-state core of the vortex. The amount of flux within a microscopic vortex is quantized because of the phase factor φ in Eq. (9).

For a superconducting ring, the single-valuedness of ψ requires that φ(r) return to itself modulo 2π on going once around the circuit; that is, if the orbit of a superconductive electron is quantized to a path length that is an integral number of electron wavelengths, then the electron neither gains nor loses energy. For a superconductive particle in an orbit of radius r, the condition for quantization, and hence the existence of a supercurrent, is

p · 2πr = Nh, (14)

where h is Planck's constant, N is an integer, and the canonical momentum in a local magnetic field B = ∇ × A is

p = ħ(k1 + k2 + ··· + kn) + neA (15)

for a superconductive particle consisting of n electrons. If the superconducting particle consists of a pair of electrons having opposite momentum vectors k and −k, then n = 2 and p = 2eA. Moreover, the flux enclosed by a vortex is Φ = 2πrA, so that Eq. (14) reduces to

Φ = Nh/2e. (16)

FIGURE 11 Quantum steps in flux of persistent current versus applied field for a type II superconductor.

This result is of outstanding importance. It means that if the superconductive state consists of paired electrons, then in a closed superconducting circuit the flux is quantized in units of

Φ0 = h/2e = 2.07 × 10⁻¹⁵ Wb. (17)
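As a quick check of Eq. (17), the flux quantum can be evaluated from the SI values of h and e. The function names are illustrative; the `n_electrons` argument is included only to show how the measured unit of flux discriminates between single-electron and paired carriers.

```python
# Exact SI values fixed by the 2019 redefinition of the SI units.
PLANCK_H = 6.62607015e-34            # J s
ELEMENTARY_CHARGE = 1.602176634e-19  # C

def flux_quantum(n_electrons: int = 2) -> float:
    """Flux unit h/(n e) in webers for a carrier made of n electrons."""
    if n_electrons < 1:
        raise ValueError("n_electrons must be at least 1")
    return PLANCK_H / (n_electrons * ELEMENTARY_CHARGE)

def allowed_flux(n_quanta: int) -> float:
    """Allowed flux values Phi = N h / 2e of Eq. (16)."""
    return n_quanta * flux_quantum(2)

print(f"paired carriers: {flux_quantum(2):.3e} Wb")  # 2.068e-15 Wb
print(f"single carriers: {flux_quantum(1):.3e} Wb")  # 4.136e-15 Wb
```

The factor of two between the two printed values is what allows flux-quantization experiments to identify the carriers as electron pairs.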

The existence of flux quantization and the magnitude of Φ0 have been confirmed experimentally (Fig. 11); these experiments demonstrate not only the quantum character of the ordering, but also that ordering in the superconductive state consists of the formation of pairs of electrons having opposite momentum vectors k and −k. Moreover, the perfect diamagnetism associated with this order indicates pairing of s = 1/2 and s = −1/2 spins of a superconductive electron pair. The critical field strength Hc is that required to decouple the spin pairing of a superconductive particle.

A localized atomic moment interacts with the conduction-band electrons via spin–spin “exchange” to produce a local magnetic field; if the local field strength exceeds Hc, superconductivity is suppressed. Ferromagnetic ordering of localized moments generally suppresses superconductivity, but antiferromagnetic ordering may not be incompatible with superconductivity.

ii. Energy gap. The electronic heat capacity in the superconductive state Ces, normalized to its value γTc in the normal state at T = Tc, is commonly found to vary exponentially as −1/T at temperatures T ≪ Tc:

$$C_{es}/\gamma T_c = a \exp[-bT_c/T] = a \exp[-\Delta_0/k_B T], \tag{18}$$

which is suggestive of an excitation of electrons across an energy gap Eg = 2Δ0. An energy gap Eg has been measured independently by spectroscopic techniques; it compares favorably at the lowest T with the calorimetric data (i.e., b ≈ Δ0/kBTc). Moreover, Eg = 2Δ(T) is found to decrease with increasing temperature, from 2Δ0 at 0 K to 0 at T = Tc (Fig. 12) with a Brillouin-function dependence, which makes Δ(T)/Δ0 a measure of the order parameter in a mean-field description of the transition. In addition, the presence of an energy gap at the Fermi energy EF shows that the superconductive electron pairs have been formed by condensing out single-electron states from the vicinity of EF.

FIGURE 12 Normalized energy gap Eg(T)/Eg(0) versus normalized temperature T/Tc.

iii. Isotope effect. The Tc for mercury, and most other elemental superconductors, varies smoothly with the average atomic mass M as the isotope mix is varied:

M^α Tc = constant. (19)

This correlation of Tc and M is known as the isotope effect. This early observation shows that for conventional superconductors, electron–phonon interactions play an important role in the binding of superconductive pairs of electrons. In the simplest theory, only the electronic states within an energy kBΘD of EF, where ΘD is the Debye temperature, can be coupled by electron–phonon interactions. This simplest theory limits the magnitude of the energy gap to a specific multiple of kBTc and predicts, for an elemental superconductor, an α = 1/2 in Eq. (19) [see Eqs. (42) and (43)]. Although an α = 1/2 has been observed for mercury, there is nothing sacred about this value even for the elements; for example, an α = 0 for Zr and Ru does not signal the absence of a phonon mechanism in these two superconductors.

An electron–lattice mechanism for binding a superconductive pair leads to an upper limit for Tc of about 30–40 K; a higher Tc requires either another type of superconductive pair, the bipolaron, which is stabilized in the limit of strong electron–lattice coupling, or an electronic enhancement of the electron–lattice mechanism. An electronic enhancement would replace ΘD with Θe ≈ ħωe/kB, where ħωe ≪ EF is the energy of the electronic excitations that enhance the pairing potential energy.

iv. Many-body condensate. Significantly, the usual superconductor transition is much sharper than other second-order transitions. A second-order magnetic-ordering transition, for example, exhibits a substantial temperature range of short-range order above the critical temperature for long-range order. In this case, each atomic moment interacts strongly with only a few near neighbors, so thermodynamic fluctuations not treated by a mean-field theory play an important role. In conventional superconductors, only small vestiges of superconductivity remain above Tc, and any resistivity remaining in the superconductive state is infinitesimally small. This observation indicates that each electron pair in the superconductive state is strongly coupled to all the other pairs in a many-body condensate. To break the binding of a given electron to the condensate costs a minimum energy Δ0. This many-body aspect of the superconductive condensate makes it difficult to depict in real space the nature of the electron ordering that is occurring. An inability to picture the condensate in real space has hindered formulation of a chemical guide for the search for new high-Tc materials.

v. Temperature dependence of Hc. In the presence of a magnetic field Ba, the transition at Tc becomes first order. In a type I superconductor, the increase in free energy at Ba/µ0 = Hc is, from Eqs. (7) and (8),

$$\Delta G = \int_0^{H_c} \mu_0 M\,dH = \tfrac{1}{2}\mu_0 H_c^2, \tag{20}$$

and the latent heat at the transition becomes

$$Q = T\,\Delta S = T\,\frac{d(\Delta G)}{dT} = \mu_0 T H_c \frac{dH_c}{dT}, \tag{21}$$

which vanishes at T = Tc where Hc = 0. It is found empirically that at temperatures below Tc, the entropy difference is described by

$$\Delta S = \gamma T\left[1 - (T/T_c)^2\right]. \tag{22}$$

Equating Eqs. (21) and (22) and integrating with respect to the boundary conditions Hc = 0 at T = Tc and Hc = H0 at T = 0 K gives the relation

$$H_c = H_0\left[1 - (T/T_c)^2\right] \tag{23}$$

for the transition between the superconductive and the normal state in the presence of an applied field (Fig. 13). The extent of the intermediate-state region depends on the shape of the sample and its orientation with respect to Ba.
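The parabolic law of Eq. (23) and the free-energy difference of Eq. (20) are easy to tabulate. The sketch below assumes a hypothetical type I material with Tc = 4 K and H0 = 30 kA/m; these numbers are placeholders for illustration, not data from the text.

```python
import math

MU_0 = 4.0e-7 * math.pi  # vacuum permeability in T m / A (classical value)

def critical_field(t: float, tc: float, h0: float) -> float:
    """Hc(T) = H0 [1 - (T/Tc)^2] of Eq. (23); zero at and above Tc."""
    if tc <= 0 or t < 0:
        raise ValueError("require tc > 0 and t >= 0")
    if t >= tc:
        return 0.0
    return h0 * (1.0 - (t / tc) ** 2)

def free_energy_difference(hc: float) -> float:
    """Magnitude (1/2) mu0 Hc^2 of Eq. (20), in J/m^3 for Hc in A/m."""
    return 0.5 * MU_0 * hc ** 2

# Hypothetical material parameters (assumed for illustration only).
TC, H0 = 4.0, 30e3
for temperature in (0.0, 2.0, 4.0):
    hc = critical_field(temperature, TC, H0)
    print(temperature, hc, free_energy_difference(hc))
```

At T = Tc/2 the critical field has fallen to three-quarters of H0, while the free-energy difference, which scales as Hc², has fallen to nine-sixteenths of its value at 0 K.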

d. Tunneling. If two metals are separated by an insulator, the insulator acts as a barrier to electron flow from one metal to the other. However, the conduction-electron wave functions extend beyond the metal surface, decaying exponentially in magnitude with the distance from the surface. If the insulating layer is thin enough (less than 10 to 20 Å), a significant amplitude extends through the insulating layer into the other metal. If an empty electronic state of equal energy is also available in the other metal, then there is a finite probability that an electron impinging on the barrier will pass through the insulating layer. This phenomenon is called tunneling.

FIGURE 13 Variation with temperature of (a) Hc of type I and (b) Hc2 of type II superconductors.

If both metals are superconductors, two types of particles may tunnel: single quasiparticles and paired superconductive particles. Tunneling of single quasiparticles has been used to measure the energy gap in the superconductive state; tunneling of superconductive particles, called Josephson tunneling, exhibits unusual quantum effects that have been exploited in a variety of quantum devices.

In 1962, Josephson proposed that a tunnel junction between two superconductors, each in its superconductive state, should exhibit a zero-voltage supercurrent in the direction x perpendicular to the junction,

Ix = I0x sin γ, (24)

due to the tunneling of superconductive electron pairs. Both the phase differences φ2 − φ1 of the wave function on either side of the insulating layer and the canonical momentum of Eq. (15) in the presence of a magnetic flux enter into

$$\gamma = (\phi_2 - \phi_1) - \frac{2\pi}{\Phi_0} \int_1^2 A_x\,dx. \tag{25}$$

A maximum dc flows in the absence of any electric or magnetic field. This is the dc Josephson effect.

Josephson further predicted that if a voltage difference V is applied across the junction, the parameter γ becomes time dependent,

γ(t) = γ(0) − (4πeVt/h), (26)

which means that the current oscillates with a frequency

ν = 2eV/h. (27)

This is the ac Josephson effect.

These predictions have been verified experimentally and shown to apply to any sufficiently thin “weak link” in a superconducting circuit. A weak link can be any planar defect at which Tc is sharply reduced from its value in the bulk superconductor. Such weak links appear to limit the supercurrents in the new high-Tc superconductors.
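Equations (24) and (27) translate directly into a few lines of code. The sketch below evaluates the dc supercurrent for a given phase parameter γ and the ac oscillation frequency for a given junction voltage; the 1 µV bias is an arbitrary illustrative choice.

```python
import math

PLANCK_H = 6.62607015e-34            # J s (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # C (exact SI value)

def junction_current(i0: float, gamma: float) -> float:
    """dc Josephson supercurrent Ix = I0x sin(gamma), Eq. (24)."""
    return i0 * math.sin(gamma)

def josephson_frequency(voltage: float) -> float:
    """ac Josephson frequency nu = 2 e V / h in hertz, Eq. (27)."""
    return 2.0 * ELEMENTARY_CHARGE * abs(voltage) / PLANCK_H

# The supercurrent reaches its maximum I0x when gamma = pi/2.
print(junction_current(1.0, math.pi / 2))

# A bias of one microvolt corresponds to a frequency near 484 MHz.
print(f"{josephson_frequency(1e-6) / 1e6:.1f} MHz")
```

Because the conversion factor 2e/h contains only fundamental constants, a frequency measurement across a biased junction yields the ratio of the constants with high precision.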

B. Applications

The technical applications of superconductivity have exploited all its basic properties. However, an extensive commercial potential has been made possible only by the discovery of type II superconductors and Josephson tunneling.

1. High Magnetic Field, High Direct Current

The discovery of zero dc resistance, which makes possible macroscopic persistent currents, immediately raised the hope of building a solenoid magnet of superconductive wire capable of generating an intense magnetic field at manageable power levels. Although no energy is expended by a static magnetic field, the energy required to create and sustain an intense magnetic field with a normal conductor is prohibitive.

Attempts to exploit this concept encountered the intrinsic limitation imposed by Hc. A cylindrical wire of radius rw carrying a current I has, at its surface, a magnetic field strength produced by the current

Hsurf = I/2πrw. (28)

A supercurrent may increase until Hsurf = Hc; for any current higher than the critical current,

Ic = 2πrwHc, (29)

the surface of the wire is transformed to the normal state. In type I superconductors, critical field strengths H0 ≤ 80 kA/m (1 kOe) are not sufficient to replace an iron-core magnet. However, a type II hard superconductor is capable of remaining superconductive to high magnetic fields Hc2; and the generation of high magnetic fields with type II superconductors is now used in a wide range of applications.
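To give a feeling for the magnitudes in Eqs. (28) and (29), the following sketch computes the surface field and the critical current of a wire. The wire radius of 0.5 mm is an assumed value; the field of 80 kA/m is the upper bound quoted above for type I materials.

```python
import math

def surface_field(current: float, wire_radius: float) -> float:
    """Field strength at the wire surface, Hsurf = I / (2 pi rw), Eq. (28)."""
    if wire_radius <= 0:
        raise ValueError("wire_radius must be positive")
    return current / (2.0 * math.pi * wire_radius)

def critical_current(wire_radius: float, hc: float) -> float:
    """Largest supercurrent Ic = 2 pi rw Hc, Eq. (29)."""
    return 2.0 * math.pi * wire_radius * hc

# Assumed wire radius of 0.5 mm with Hc = 80 kA/m.
ic = critical_current(0.5e-3, 80e3)
print(f"Ic = {ic:.1f} A")
```

The critical current grows only linearly with the wire radius, which is why a low Hc cannot be compensated simply by using thicker type I wire.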

2. Alternating-Current Devices

A low Rs/Rn ratio requires ac operation at T ≪ Tc and ν ≪ Eg/h. Type I superconductors may retain an Rs ≈ 0 up to 100 MHz. This property has enabled the realization of very high-frequency linear electron accelerators with magnifications up to 10¹⁰; they can operate continuously with only a fraction of the power requirements of conventional accelerators.

Larger energy gaps Eg in type II superconductors permit low-loss ac transmission over superconductive strip lines to even higher frequencies. On the other hand, attempts to use superconductors in ac power devices remain restricted to specialty applications such as space vehicles where, with type II superconductors, high current densities in high fields permit significant reductions in weight and size.

3. Levitation

The Meissner effect is demonstrated in the classroom by levitation of a bar magnet over a superconductive bowl. The experiment begins with a bar magnet resting on the bottom of a shallow bowl of superconductor in its normal state. The bowl is then cooled to below Tc; expulsion of a magnetic field from the bowl creates an “image” magnet that exerts a repulsive force on the real magnet, causing it to rise until this force is balanced by the weight of the magnet.

A most spectacular application of this principle is the “levitated train,” which requires high magnetic fields and, therefore, a hard, type II superconductor.

4. Bolometer

A bolometer detects electromagnetic radiation by an absorption of radiation that increases its temperature. The temperature increase ΔT is related to the energy E absorbed per unit mass via the specific heat capacity Cv:

ΔT = E/Cv. (30)

At low temperatures, a low Cv enhances ΔT for a given E. A type I superconductive detector is designed to operate in the intermediate state where a small ΔT gives rise to a large resistance change ΔR as illustrated in Fig. 14. The ability to amplify ΔR electronically makes the superconductive bolometer an extremely sensitive radiation detector; it is particularly important in the far-infrared region of the spectrum, where most other types of radiation detectors are inoperative.

FIGURE 14 Temperature variation of resistance for a wire carrying a current I1 < I2 < I3.

5. Josephson Tunneling

Practical application of the dc Josephson effect has been realized in very sensitive galvanometers and magnetometers. The SQUID (superconducting quantum interference device) magnetometer, for example, is used for measuring small magnetic fields, with extensive use in geological surveying. A laboratory SQUID was the first practical demonstration of the high-Tc superconductor oxides.

The ac Josephson effect has been used in precision determinations of the value of h/e.

Applications in the computer field promise higher-density, lower-power components; however, their realization in practice requires an exquisite control of materials processing that is particularly demanding for the new high-Tc superconductor oxides.

C. Theory

1. History

Once the basic phenomena of zero resistance and the Meissner effect had been established, the experimental strategies responsible for our understanding of superconductivity were guided by theory. The theory began with purely phenomenological equations; these equations introduced fundamental length parameters as well as the order parameter, and their application permitted Abrikosov to explain the distinction between type I and type II superconductivity.

The quantum theory of Bardeen, Cooper, and Schrieffer (BCS) introduced numerical values for three universal ratios relating Tc to Δ0, H0, and the jump ΔC(Tc) in the specific heat at Tc. Refinement of the BCS theory by Eliashberg has expressed these ratios in terms of two parameters, and with this refinement the agreement between theory and experiment for the three universal ratios is truly remarkable over a wide range of conventional superconductors.

The phenomenon of high-Tc superconductivity cannot be accounted for with the Eliashberg theory if the theory is restricted to binding of the superconductive electron pairs by a dynamic coupling to phonons. This breakdown of existing theory has split theorists into two camps: one camp would extend the BCS or Eliashberg theory by introducing into the pair-binding potential energy an electronic mechanism together with the phonon mechanism; the other camp would break from the “weak-coupling” theory to construct a “strong-coupling” theory in which electron pairs form as a disordered array of “bipolarons” at temperatures T > Tc, ordering into the superconductive state occurring only below Tc.

2. London Equation

In order to account for zero resistance and the Meissner effect in the superconductive state, the London brothers postulated that the local current density in the superconductive state is proportional to the vector potential A, where B = ∇ × A:

$$\mathbf{j}_s = -\frac{1}{\mu_0 \lambda_L^2}\,\mathbf{A}. \tag{31}$$

Applying the Maxwell equation ∇ × B = µ0j, applicable under static conditions, to Eq. (31) gives, on taking the curl of both sides of Maxwell's equation,

$$\nabla^2 \mathbf{B} = \mathbf{B}/\lambda_L^2 \tag{32}$$

for a superconductive state. This equation accounts for the Meissner effect because it does not allow a solution uniform in space unless B = 0. Moreover, Maxwell's equation shows that j = 0 wherever B = 0.

On the other hand, Eq. (32) does allow a solution for B that is nonuniform in space. If a field Ba is applied parallel to an external surface, as illustrated in Fig. 15, then Eq. (32) gives the solution

B(x) = B(0) exp(−x/λL), (33)

where x is the vertical distance into the superconductor from the surface and B(0) is the value of Ba at x = 0. Thus λL measures the depth of penetration of the magnetic field; it is known as the London penetration depth. The current flowing in the superconductor responsible for expelling B is confined to a thin surface layer.

A measure of the magnitude of λL can be obtained from the canonical momentum p = mv + eA. If the average superconductive-particle momentum is zero in the ground state, then the average velocity is ⟨vs⟩ = −eA/m. If the number density of electrons participating in the rigid ground state is ns, then the local superconductive current density becomes

$$\mathbf{j}_s = n_s e \langle \mathbf{v}_s \rangle = -n_s e^2 \mathbf{A}/m, \tag{34}$$

and comparison of Eq. (34) with Eq. (31) gives

$$\lambda_L = \left(m^*/\mu_0 n_s e^2\right)^{1/2}. \tag{35}$$

However, careful measurements of λL(T) near T = 0 K indicate that λL(0) is larger than the prediction of Eq. (35), which suggests a reduced ns and hence a rigidity of the condensate only over a finite volume defined by a characteristic length ξ0.

FIGURE 15 Penetration of an applied magnetic field into a semi-infinite superconductor. The penetration depth λL is the distance at which B decays to Ba/e.
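A rough estimate of λL from Eq. (35), together with the field profile of Eq. (33), can be sketched as follows. The carrier density ns = 10²⁸ m⁻³ and the choice m* equal to the free-electron mass are assumptions made only to obtain a representative order of magnitude.

```python
import math

MU_0 = 4.0e-7 * math.pi              # T m / A (classical value)
ELEMENTARY_CHARGE = 1.602176634e-19  # C
ELECTRON_MASS = 9.1093837015e-31     # kg

def london_depth(n_s: float, m_eff: float = ELECTRON_MASS) -> float:
    """London penetration depth (m* / mu0 ns e^2)^(1/2) of Eq. (35), in metres."""
    if n_s <= 0:
        raise ValueError("n_s must be positive")
    return math.sqrt(m_eff / (MU_0 * n_s * ELEMENTARY_CHARGE ** 2))

def field_at_depth(b_surface: float, x: float, lam: float) -> float:
    """Field B(x) = B(0) exp(-x / lambda_L) of Eq. (33)."""
    return b_surface * math.exp(-x / lam)

# Assumed superconducting-electron density of 1e28 per cubic metre.
lam = london_depth(1e28)
print(f"lambda_L = {lam * 1e9:.1f} nm")

# One penetration depth below the surface the field has fallen to B(0)/e.
print(field_at_depth(1.0, lam, lam))
```

Because λL scales as ns^(−1/2), a measured depth larger than this estimate corresponds to a smaller effective ns, which is the inference drawn in the text.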

3. Coherence Length

The concept of a characteristic dimension ξ0 was introduced by Pippard to formulate a nonlocal generalization of the London equation. He estimated this length from the Heisenberg uncertainty principle: only electrons having energies within ∼kBTc of EF can play a major role in a phenomenon that sets in at Tc; these electrons have a momentum range Δp ≈ kBTc/vF, where vF = ħkF/m* is the velocity of an electron with Fermi energy EF. From the uncertainty principle, Δx ≥ ħ/Δp ≈ ħvF/kBTc defines a characteristic length

ξ0 = aħvF/kBTc, (36)

where a is a numerical constant of order unity. The length ξ0 plays a role analogous to the mean free path l in the nonlocal electrodynamics of normal metals; and in the presence of scattering, the characteristic length is called the coherence length, where

(1/ξ) = (1/ξ0) + (1/l). (37)
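Equations (36) and (37) can be combined in a short sketch. The prefactor a is left as a parameter because the text specifies only that it is of order unity; the default a = 1 is an assumption, as are any Fermi velocity, Tc, and mean free path passed in.

```python
import math

HBAR = 1.054571817e-34  # J s
K_B = 1.380649e-23      # J / K

def pippard_length(v_fermi: float, tc: float, a: float = 1.0) -> float:
    """Characteristic length xi0 = a hbar vF / (kB Tc) of Eq. (36), in metres."""
    if tc <= 0:
        raise ValueError("tc must be positive")
    return a * HBAR * v_fermi / (K_B * tc)

def coherence_length(xi0: float, mean_free_path: float) -> float:
    """Coherence length from 1/xi = 1/xi0 + 1/l, Eq. (37)."""
    return 1.0 / (1.0 / xi0 + 1.0 / mean_free_path)

# Clean limit: l much larger than xi0 leaves xi close to xi0.
print(coherence_length(100e-9, math.inf))
# Dirty limit: l much smaller than xi0 gives xi close to l.
print(coherence_length(100e-9, 1e-9))
```

The reciprocal sum means the shorter of the two lengths always dominates, so alloying, which shortens l, reduces ξ.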

In fact, Ginzburg and Landau were the first to introduce the idea of a characteristic length. In 1950, they introduced the order parameter ψ(r) in Eq. (9) defined by ns = |ψ|² to obtain an equation for the supercurrent density js in terms of ψ(r). (This theory was later shown to be a limiting form of the microscopic BCS theory first presented in 1957.) With this formalism they were able to treat two features that were beyond the scope of the London formalism: (1) nonlinear effects in fields strong enough to change ns and (2) the spatial variation of ns.

FIGURE 16 Interface between superconductive and normal domains in the intermediate state; h(x) is the local magnetic-field strength.

A major triumph of the formalism was its description of the intermediate state of a type I superconductor in which superconductive and normal domains coexist in an applied field strength Ba/µ0 ≈ Hc (Fig. 16). The interface between the two domains is characterized by two lengths: the penetration depth λL(T), over which the local magnetic field is varying, and the coherence length

ξGL(T) = ħ/|2m*α(T)|^(1/2), (38)

over which ψ(r) can vary without an undue energy increase. In pure superconductors at T ≪ Tc, the Ginzburg–Landau coherence length approaches the Pippard coherence length [i.e., ξGL(T) ≈ ξ0], but ξGL(T) diverges as (Tc − T)^(−1/2) near Tc since α vanishes as (Tc − T). Since λL(T) also diverges as (Tc − T)^(−1/2), the ratio λL/ξGL is nearly independent of temperature. This ratio is therefore taken as the Ginzburg–Landau parameter,

κ = λL/ξGL. (39)

In type I superconductors, a κ ≪ 1 results in a positive interface energy, which stabilizes a macroscopic domain pattern.

Abrikosov investigated what would happen if the Ginzburg–Landau parameter is greater than unity. He found that for κ > 1/√2, the energy of the interface between normal and superconductive domains becomes negative and the superconductor is type II. With a negative interface energy, field strengths Ba/µ0 ≥ Hc1 perpendicular to a superconducting slab cause flux to penetrate within cylindrical, normal-state domains; persistent supercurrents surrounding the normal-state regions form vortices. The vortex concentration increases with Ba/µ0 ≥ Hc1 until Hc2 = √2κHc; above Hc2 the vortices are merged into a single normal-state phase.
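The Abrikosov criterion and the relation Hc2 = √2κHc lend themselves to a compact sketch. The function names and the sample lengths are illustrative assumptions; the only physics used is Eq. (39) and the threshold κ = 1/√2 stated above.

```python
import math

KAPPA_THRESHOLD = 1.0 / math.sqrt(2.0)

def gl_parameter(lambda_l: float, xi_gl: float) -> float:
    """Ginzburg-Landau parameter kappa = lambda_L / xi_GL, Eq. (39)."""
    if xi_gl <= 0:
        raise ValueError("xi_gl must be positive")
    return lambda_l / xi_gl

def superconductor_type(kappa: float) -> str:
    """Return 'II' when kappa > 1/sqrt(2) (negative interface energy), else 'I'."""
    return "II" if kappa > KAPPA_THRESHOLD else "I"

def upper_critical_field(kappa: float, hc: float) -> float:
    """Upper critical field Hc2 = sqrt(2) kappa Hc."""
    return math.sqrt(2.0) * kappa * hc

# Hypothetical lengths: a long coherence length gives type I behaviour,
# a short one gives type II behaviour.
print(superconductor_type(gl_parameter(50e-9, 1000e-9)))  # I
print(superconductor_type(gl_parameter(200e-9, 5e-9)))    # II
```

At the threshold κ = 1/√2 the expression for Hc2 reduces to Hc, so the two critical fields coincide exactly where the type I and type II regimes meet.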

4. BCS Theory

In 1956, Cooper showed that so long as there exists an attractive interaction between pairs of electrons, the Fermi sea of electrons is unstable against the formation of at least one bound pair formed from states with k > kF. Moreover, he argued that the two-electron wave function for a superconductive pair is a singlet (paired spins), spherical state containing a weighted sum over k > kF of product wave functions with momentum k, −k for each product and that the maximum contribution comes from states with k ≈ kF. The two-electron binding energy relative to 2EF was shown to be (ħωc = hνc)

Ebind ≈ 2ħωc exp[−2/VN(EF)] (40)

in the weak-coupling limit VN(EF) ≪ 1. In this derivation, Cooper made the approximation that the coupling energy V is a constant for all values of k out to a cutoff energy ħωc away from EF. Since Ebind is of the order of kBTc, an argument similar to that preceding Eq. (36) suggests that the size of the Cooper-pair state is approximately ξ0, which is much larger than the interparticle distance. Thus the Cooper pairs are strongly overlapping, which is why they form a rigid condensate.

The interaction between electrons of a pair always contains an electrostatic repulsive energy Up between the two electrons; a high dielectric constant introduces an electronic screening that reduces Up, but it is always repulsive. The problem is to identify an attractive mechanism.

In 1950 Fröhlich suggested that electron–lattice interactions were responsible for the attractive potential, and this idea was confirmed experimentally with the discovery of the isotope effect. The physical idea in the BCS treatment of this suggestion is that the first electron polarizes the crystal by attracting the positive atomic cores; the polarization in turn attracts a second electron provided that it arrives in the polarized region of the crystal before the lattice has had time to relax to its initial state. This time constraint limits the size of a Cooper pair to a characteristic length of order ξ0. Moreover, the cutoff energy ħωc in Eq. (40) is, for this mechanism, the Debye energy ħωD = kBΘD, which characterizes the cutoff of the phonon spectrum. If the attractive energy VC exceeds the electrostatic repulsive energy in magnitude, then the net BCS potential

VBCS = VC − Up (41)

is attractive.

The BCS theory involves a calculation of the ground state of the system in the presence of a net attractive potential VBCS. Condensation of Cooper pairs changes the state of the Fermi sea (the collection of one-particle states), and at some point the binding energy for an additional pair has gone to zero. The simplest form of the theory contains one adjustable parameter, VBCS; all other parameters entering the theory are independently measurable. Its principal deductions are the following:

$$T_c \approx 1.13\,\Theta_D \exp[-1/V_{BCS} N(E_F)], \tag{42}$$

where ΘD ∝ M^(−1/2) is the Debye temperature, and

$$2\Delta_0/k_B T_c = 3.53, \tag{43}$$

$$\Delta C(T_c)/\gamma T_c = 1.43, \tag{44}$$

$$\gamma T_c^2/H_0^2 = 0.168, \tag{45}$$

where Δ0 and H0 are the gap parameter (Eg = 2Δ) and the critical field strength Hc at T = 0 K, ΔC(Tc) is the discontinuity in the specific heat at T = Tc, and γ is the Sommerfeld constant of the electron gas in the normal state (i.e., γTc is the electronic specific heat of the normal state at T = Tc). It is found experimentally that α in Eq. (19) is not universally 1/2 and that Eqs. (43) to (45) do not hold quantitatively in many high- and intermediate-Tc materials, which indicates a need to extend the simplest BCS theory.
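The BCS deductions of Eqs. (42) to (44) can be evaluated numerically as a sanity check on orders of magnitude. In the sketch below the Debye temperature of 300 K and the dimensionless coupling VBCS N(EF) = 0.25 are hypothetical inputs, chosen only because they fall in the weak-coupling regime.

```python
import math

K_B = 1.380649e-23  # J / K

def bcs_tc(theta_d: float, coupling: float) -> float:
    """Tc = 1.13 Theta_D exp(-1 / [V_BCS N(E_F)]) of Eq. (42), in kelvin."""
    if coupling <= 0:
        raise ValueError("coupling V_BCS N(E_F) must be positive")
    return 1.13 * theta_d * math.exp(-1.0 / coupling)

def bcs_gap(tc: float) -> float:
    """Zero-temperature gap parameter Delta_0 in joules, from 2 Delta_0 = 3.53 kB Tc."""
    return 0.5 * 3.53 * K_B * tc

def bcs_heat_capacity_jump(gamma: float, tc: float) -> float:
    """Specific-heat discontinuity Delta C(Tc) = 1.43 gamma Tc, Eq. (44)."""
    return 1.43 * gamma * tc

# Hypothetical inputs: Theta_D = 300 K and V_BCS N(E_F) = 0.25.
tc = bcs_tc(300.0, 0.25)
print(f"Tc = {tc:.2f} K")
print(f"Delta_0 = {bcs_gap(tc) / 1.602176634e-22:.3f} meV")
```

The exponential dependence on the coupling is the reason small changes in VBCS N(EF) produce large changes in Tc, while the dependence on ΘD, and hence on isotopic mass, is only linear.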

5. Beyond BCS

The limitation of the BCS theory is that it is a one-parameter theory in which VBCS is assumed to be constant in an energy region about EF of width ±ħωD; and there is no prescription available for calculating VBCS from microscopic theory. An important extension of BCS theory has been given by Eliashberg.

Whereas BCS theory simply postulates an attractive potential VBCS, Eliashberg theory treats properly the microscopic electron–phonon interactions responsible for the pairing potential in conventional superconductors. The Eliashberg theory contains two parameters. One is the pseudopotential µ* for the Coulomb electron–electron repulsions; it is adjusted to give the correct value of Tc for a given electron–phonon interaction. The other is the electron–phonon spectral density α²F(ω), where F(ω) is the number of phonon modes (lattice vibrations) having an energy between ħω and ħ(ω + dω); α²F(ω) is a phonon frequency distribution weighted by the strength of the electron–phonon interaction for that mode. It is possible to measure α²F(ω) accurately with superconductive-state–insulator–normal-state tunneling experiments.

The Eliashberg equations determine not only the tunneling characteristic of a tunnel diode, but also all of the thermodynamics of a particular superconductive material provided that α²F(ω) and μ* are known. In this theory, the BCS universal ratios [Eqs. (43)–(45)] have become transformed to

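$$2\Delta_0 / k_B T_c = 3.53\left[1 + 12.5\,m^2 \ln\frac{1}{2m}\right], \tag{46}$$

$$\Delta C(T_c) / \gamma T_c = 1.43\left[1 + 53\,m^2 \ln\frac{1}{3m}\right], \tag{47}$$

$$\gamma T_c^2 / H_0^2 = 0.168\left[1 - 12.2\,m^2 \ln\frac{1}{3m}\right], \tag{48}$$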

where m = kBTc/ħωln contains a parameter ωln that represents a weighted measure of the significant phonon frequencies appearing in α²F(ω). The agreement between theory and experiment for elemental and alloy superconductors, as given by Carbotte, is displayed in Fig. 17 for conventional weak to strong coupling regimes m ≤ 0.25.
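The strong-coupling corrections of Eqs. (46) to (48) can be tabulated with a few lines of Python. This is a minimal sketch: the function name and the sample values of m are assumptions chosen for illustration, and the formulas are simply those quoted above.

```python
import math


def strong_coupling_ratios(m):
    """Evaluate Eqs. (46)-(48) for m = k_B * Tc / (hbar * omega_ln).

    Returns (2*Delta_0/k_B*Tc, Delta C(Tc)/gamma*Tc, gamma*Tc^2/H_0^2).
    """
    if m <= 0:
        raise ValueError("m must be positive")
    m2 = m * m
    gap_ratio = 3.53 * (1.0 + 12.5 * m2 * math.log(1.0 / (2.0 * m)))
    jump_ratio = 1.43 * (1.0 + 53.0 * m2 * math.log(1.0 / (3.0 * m)))
    field_ratio = 0.168 * (1.0 - 12.2 * m2 * math.log(1.0 / (3.0 * m)))
    return gap_ratio, jump_ratio, field_ratio


if __name__ == "__main__":
    for m in (0.01, 0.05, 0.10, 0.20):  # hypothetical sample points
        gap, jump, field = strong_coupling_ratios(m)
        print(f"m={m:.2f}: gap={gap:.3f}, jump={jump:.3f}, field={field:.4f}")
```

As m tends to zero the bracketed corrections vanish, so the three ratios return to the BCS values 3.53, 1.43, and 0.168.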

Within the Eliashberg theory,

$$2\Delta_0 / k_B T_c < 9 \tag{49}$$

would reach its maximum value only in the unrealistic situation that the entire spectral weight occurs at an optimum vibrational energy ħωE = 0.75 meV. The critical temperature

$$T_c = C(\mu^*)\,A_p / k \tag{50}$$

increases with the strength of the effective electron–phonon interaction

$$A_p = \int_0^\infty \alpha^2 F(\omega)\, d(\hbar\omega), \tag{51}$$

where C(μ*) decreases smoothly with increasing μ*. Tc increases with Ap until a lattice instability freezes out a static distortion of the structure. Thus the theory of superconductivity itself does not put an upper limit on Tc; however, the conditions for a high Tc appear to be the same as those for the stabilization of competitive mechanisms.
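The area Ap of Eq. (51) is an ordinary integral over a measured or modeled spectral density. The sketch below estimates it with the trapezoidal rule; the function name and the triangular test spectrum are hypothetical and serve only to show the bookkeeping, not any real material.

```python
def spectral_area(energies_mev, alpha2f):
    """Trapezoidal estimate of A_p = integral of alpha^2 F d(hbar*omega), Eq. (51).

    energies_mev: increasing phonon energies hbar*omega in meV
    alpha2f:      dimensionless alpha^2 F sampled at those energies
    Returns A_p in meV.
    """
    if len(energies_mev) != len(alpha2f) or len(energies_mev) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    area = 0.0
    for i in range(1, len(energies_mev)):
        step = energies_mev[i] - energies_mev[i - 1]
        area += 0.5 * step * (alpha2f[i] + alpha2f[i - 1])
    return area


if __name__ == "__main__":
    # Hypothetical triangular spectrum peaking at 10 meV (illustration only).
    print(spectral_area([0.0, 10.0, 20.0], [0.0, 1.0, 0.0]))
```

Converting Ap into Tc through Eq. (50) would additionally require the function C(μ*), which the text does not specify, so it is left out here.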

The electronic density of states N(EF) at the Fermi energy plays an important role, as in the BCS theory, since

$$A_p = N(E_F)\,\langle g^2 \rangle, \tag{52}$$

where ⟨g²⟩ involves a double Fermi-surface average of the square of the electron–atomic core interaction. In the absence of any physical intuition as to how to enhance ⟨g²⟩, efforts to increase Tc have traditionally concentrated on increasing N(EF), but this strategy is frustrated by the appearance of spontaneous magnetism or atomic-clustering lattice instabilities as competing processes. Significantly, the high-Tc copper oxides have a relatively small N(EF), which implies that a large ⟨g²⟩ is enhancing the ratio m into a very strong-coupling regime. This observation requires, in turn, either an electronic enhancement of the electron–phonon interaction or an entirely novel mechanism for the formation of electron pairs.


FIGURE 17 The ratios (a) 2Δ0/kBTc, (b) ΔC(Tc)/γTc, and (c) γTc²/H0² versus m = kBTc/ħωln. [After Carbotte, J. P. (1987). Sci. Prog. Oxf. 71, 327.]


III. METALLIC OXIDES

The new high-Tc superconductors are oxides. Since only metals are superconductors, this fact may appear anomalous, as most of the common oxides are either insulators or low-mobility semiconductors. However, many oxides are metallic, and a few of these have extensive commercial application. For example, the metallic cathode of the lead–acid battery is PbO2; and Bi2Ru2O7 has a temperature-independent resistivity in the vicinity of room temperature, which makes it an important resistor material in the electronics industry. Discussion of the high-Tc superconductors rightfully begins, therefore, with a review of the conditions that must be satisfied if an oxide is to be metallic.

A. Electron Energies in a Typical Insulator

Figure 18 shows schematically the construction of an energy diagram for the insulator MgO. The O²⁻ ion is not stable in free space; a negative electron affinity places the O⁻/O²⁻ redox energy above the lowest energy Evac for a free electron in a vacuum. An energy E1 is required to remove the outer Mg:3s¹ electron from a free Mg⁺ ion to a free O⁻ ion to create free Mg²⁺ and O²⁻ ions. This cost in energy is more than compensated by the electrostatic energy EM gained by ordering the Mg²⁺ and O²⁻ ions into a crystal structure; the Madelung energy EM is calculated for a lattice of point charges. The crystalline electric field raises the Mg²⁺/Mg⁺ level and lowers the O⁻/O²⁻ level; crossing of these two energies ensures stabilization of the crystalline phase with a charge transfer from magnesium to oxygen.

In the real MgO crystal, transfer of an integral electronic charge does not occur; a quantum-mechanical covalent component in the Mg–O bond transfers a fraction of the O²⁻-ion electronic charge back onto the Mg²⁺ ion. However, the reduction in EM caused by this lowering of the effective ionic charges is compensated by the quantum-mechanical covalent-mixing repulsion between the two ionic energy levels. Therefore the point-charge model gives a good first approximation to the binding energy of the solid. The covalent component to the bonding introduces an O-2p character into the Mg-3s states and a Mg-3s character into the O-2p states, but without changing the number of electron states at each energy level. Even where this “mixing” is large, it is customary to identify the energy levels by their ionic component only (i.e., as O²⁻:2p⁶ and Mg²⁺:3s⁰ levels); the wave functions describing the mixed Mg-3s and O-2p states are referred to as crystal-field orbitals so as to distinguish them from the atomic orbitals of a point-charge model.

FIGURE 18 Electron energies for MgO: (a) free ions; (b) point-charge model; (c) band model; (d) density of states.

The crystal-field energies reflect the point-group symmetry of the near-neighbor Mg–O interactions. The final step is to introduce the like-atom interatomic interactions, which broaden the energy levels of the crystal-field orbitals into energy bands of one-electron states. Whereas the crystal-field orbitals are localized to discrete atomic sites, the one-electron states are itinerant with a well-defined momentum p = ħk in the absence of a magnetic field Ba = ∇ × A. Each band contains 2N/n one-electron states per atomic orbital (the factor 2 reflects the twofold spin degeneracy of an orbital) for an array of N like atoms containing n atoms per primitive unit cell. Thus the band states reflect the translational space-group symmetry of the crystal.

In MgO there is one magnesium and one oxygen atom per primitive unit cell, so that O²⁻:2p⁶ and Mg²⁺:3s⁰ levels are each broadened into single bands with a bandwidth much broader than the small spin-orbit splitting of the threefold degeneracy of the oxygen 2p crystal-field orbitals. Therefore the highest occupied band of one-electron states is represented as an orbitally threefold-degenerate O:2p⁶ band, which is full; the lowest unoccupied band is identified as the Mg²⁺:3s⁰ band.

In tight-binding theory, the width of a band of one-electron states is

$$W \cong 2zb, \tag{53}$$

where z is the number of like nearest neighbors on energetically equivalent lattice sites and

$$b \equiv (\psi_i, H'\psi_j) \cong \varepsilon_{ij}\,(\psi_i, \psi_j) \tag{54}$$


is a measure of the strength of the interatomic interactions between nearest-neighbor like atoms at positions Ri and Rj in the lattice. The perturbation H′ of the potential at Rj by the presence of a like atom at Ri factors out as a one-electron energy εij, which increases with the overlap integral (ψi, ψj) for crystal-field wave functions ψi and ψj at Ri and Rj. Therefore, the overlap integral becomes the guiding qualitative indicator of the strength of the interatomic-interaction parameter b. (Chemists call this parameter a resonance integral; physicists, an electron-energy transfer integral.)
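The estimate W ≅ 2zb of Eq. (53) can be checked numerically for the simplest textbook case. The sketch below assumes a single s-like orbital per site on a chain, square, or simple cubic lattice with nearest-neighbor transfer only, for which the dispersion is −2b Σ cos(k_i a); this cosine band is a standard model assumption introduced here for illustration and is not taken from the text.

```python
import itertools
import math


def dispersion(k, b):
    """Nearest-neighbor tight-binding energy on a (hyper)cubic lattice.

    k is a tuple of wave-vector components in units of 1/a (lattice constant a = 1).
    """
    return -2.0 * b * sum(math.cos(ki) for ki in k)


def sampled_bandwidth(dim, b, points=21):
    """Band maximum minus band minimum on a k grid spanning [-pi, pi]^dim."""
    grid = [-math.pi + 2.0 * math.pi * i / (points - 1) for i in range(points)]
    energies = [dispersion(k, b) for k in itertools.product(grid, repeat=dim)]
    return max(energies) - min(energies)


def bandwidth_estimate(z, b):
    """Eq. (53): W ~ 2 z b for z like nearest neighbors."""
    return 2.0 * z * b


if __name__ == "__main__":
    b = 0.5  # hypothetical transfer integral in eV
    for dim in (1, 2, 3):
        z = 2 * dim  # nearest neighbors on a chain, square, or cubic lattice
        print(dim, sampled_bandwidth(dim, b), bandwidth_estimate(z, b))
```

For these lattices the sampled width and 2zb coincide, which makes the role of the coordination number z explicit: a larger number of like neighbors broadens the band for the same b.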

Neglected in the band description of crystalline orbitals is the intraatomic electron–electron electrostatic energy U associated with the creation of polar states (e.g., O⁻ and O³⁻ in the O²⁻:2p⁶ band); the tight-binding theory admixes polar and nonpolar states equally in a first-order perturbation theory. So long as the condition

$$W \gg U \tag{55}$$

is valid, the assumption of U ≈ 0 is a useful approximation. Outer s and p electrons participating in near-neighbor chemical bonding satisfy Eq. (55).

The right-hand side in Fig. 18 indicates the density N(ε) of one-electron states versus the energy E for the equilibrium lattice constant. Since the O²⁻:2p⁶ band is filled and the Mg:3s band is empty, the Fermi energy lies near the middle of a large energy gap Eg = Ec − Ev between the two bands, which makes MgO an insulator. The highest occupied band O²⁻:2p⁶ is called the valence band; the lowest unoccupied band Mg²⁺:3s⁰ is called the conduction band.

Attempts to render MgO conducting by doping with aliovalent impurities, as in semiconductor technology, are frustrated by the energetic inaccessibility of both Ec and Ev; it is energetically favorable for the crystal to incorporate a native defect that charge compensates for the dopant so as to retain EF near the middle of the energy gap Eg. The cost of introducing a native defect is less in an ionic crystal than in a covalent solid, which is why oxides with large band gaps Eg tend to be good insulators. It follows that the first requirement for metallic conduction in an oxide is the introduction of energetically accessible electron energies.

B. Problems with 5s and 6s Electrons

FIGURE 19 Electron energies for (a) SnO2−x and (b) PbO2. Vo is an oxygen vacancy trapping two electrons from the Sn:5s band.

Heavy group B metals such as Sn and Pb have a relatively large separation of 5s from 5p or 6s from 6p states; it is therefore chemically straightforward to stabilize Sn²⁺:5s² and Pb²⁺:6s² configurations in oxides, which demonstrates that with these cations the outer 5s or 6s states have become energetically accessible. Therefore, electrons can be introduced into the 5s band of SnO2 and the 6s band of PbO2. In SnO2−x (Fig. 19a) the 5s conduction band is over 3 eV above Ev; but an oxygen deficiency introduces oxygen vacancies that trap out from the conduction band Sn-5s electrons in shallow, two-electron Sn-5s donor states. In PbO2 (Fig. 19b) the Pb-6s conduction band appears to overlap the O-2p valence band (Ev > Ec), thus eliminating Eg. On the other hand, introduction of additional conduction electrons, as is done by hydrogen insertion into PbO2 on battery discharge, renders the system unstable with respect to a disproportionation reaction represented by

$$2\,\mathrm{Pb}^{3+} \rightarrow \mathrm{Pb}^{2+} + \mathrm{Pb}^{4+} \tag{56}$$

Lattice instabilities associated with trapping out the conduction electrons as pairs at specific Pb²⁺ sites plague efforts to increase the conduction-electron density to any significant concentration. In SnO2−x they are already trapped as pairs at oxygen vacancies.

C. Problems with Valence-Band Holes

Alternatively it is possible to gain access to the O²⁻:2p⁶ valence band with strongly electropositive cations such as the alkali-metal ions A⁺ and the larger alkaline-earth ions Sr²⁺ and Ba²⁺. However, in this case holes introduced into the valence band become trapped as pairs in the homopolar O–O bonds of peroxide ions (O2)²⁻. Only where the covalent component of the M–O bond is strong and there is some overlap of the conduction and valence bands, as in PbO2, are the valence-band holes not trapped out by O–O dimerization.

D. Transition-Metal Oxides

Transition-metal cations may have dⁿ or fⁿ configurations with energies lying within Eg that offer the possibility of


obtaining partially filled d or f bands. However, the wave functions of partially filled d or f shells have smaller radial extensions, which reduces the interatomic-interaction parameter b of Eq. (54) and increases the intraatomic-interaction parameter U, so Eq. (55) may no longer be satisfied.

1. Problems with Outer 4f Electrons

The 4f electrons of a rare earth ion are tightly bound to their atomic nucleus and they are screened from near neighbors by closed 5s²5p⁶ shells, so the condition

$$W \ll U \tag{57}$$

prevails in the rare earth oxides. Under these conditions the 4f orbitals remain as crystal-field orbitals; they are not transformed into itinerant-electron states. Therefore 4fⁿ configurations remain localized; they impart localized atomic moments to the ions that are essentially identical to the localized atomic moments they impart to the free ions or atoms. In this situation, successive redox potentials are separated by a large energy U > Eg, which restricts the accessible 4fⁿ configurations on a given rare earth cation to at most two, and two only if one happens to lie within Eg.

If a 4fⁿ/4fⁿ⁺¹ redox couple does lie within Eg, as illustrated in Fig. 20a, then it is possible to obtain mixed-valent compounds in which EF intersects the redox energy. In this case, metallic conduction could be expected were b > ħωR, where ωR is the frequency of a breathing-mode lattice vibration. However, b is so small that the near neighbors have time to relax about a mobile electron, thereby trapping it in a local potential well. These lattice relaxations stabilize the occupied states at the expense of unoccupied states, as does the molecular reorganization in a liquid for a given redox couple. The mobile electrons of the mixed-valent state thus become “dressed” in a local lattice deformation, which introduces an activation energy into their mobility. These “dressed” electrons are called small polarons; they move diffusively, so k is no longer a good quantum number. The rare earth mixed-valent systems are not metallic.

FIGURE 20 Electron energies for (a) EuO and (b) Gd2O3. EF moves into the 4f⁷ energy level in Eu1−δO containing a Eu³⁺/Eu²⁺ mixed valence.

On the other hand, where a broad conduction band overlaps a 4fⁿ energy level and EF intersects the 4fⁿ energy level to give a 4fⁿ⁺¹/4fⁿ mixed valence (in this case denoted an “intermediate valence”), hybridization of the 4f wave functions with the conduction-band wave functions may lead to “heavy-fermion” metallic behavior. Superconductivity has been observed in some heavy-fermion compounds (not oxides), but in all of them Tc is low.

2. Outer d Electrons

a. General considerations. The 4f electrons of a rare earth ion in an oxide are only weakly perturbed from their free-ion behavior; the outer s and p electrons are so strongly perturbed that they are transformed into itinerant electrons. The perturbations of the d wave functions in a transition-metal oxide are of intermediate strength. In this case also it is convenient to consider first the perturbations imposed by the nearest-neighbor metal–oxygen (M–O) interactions; these give rise to crystal-field orbitals containing the quantum-mechanical covalent mixing between overlapping cation and oxygen orbitals. Whether the crystal-field orbitals remain localized or are transformed into itinerant-electron band states depends on the relative strengths of the crystal-field intraatomic interactions U and the bandwidth W due to interatomic interactions between crystal-field wave functions on neighboring metal atoms M.

Covalent mixing between cation d and oxygen 2s and 2p orbitals in a transition-metal oxide has two important consequences. First, the fivefold orbital degeneracy of the free ion is at least partially removed by a crystal-field splitting of the energies of the crystal-field orbitals. Second, mixing of oxygen wave functions with the d wave functions spreads the crystal-field orbitals of d wave-function symmetry out over the oxygen atoms, which both reduces the intraatomic energy U and increases the bandwidth W; it also allows M–O–M as well as M–M interactions.

For half-filled crystal-field orbitals, the addition of one more electron to a cation costs an intraatomic energy

$$U = U' + \Delta_{ex}, \tag{58}$$

where U′ is the energy required to add an electron to an empty orbital and Δex is the additional electrostatic energy required to add it to a half-filled orbital. The term


FIGURE 21 (a) Evolution of electron energies versus resonance integral b for interactions between nearest-neighbor like atoms on energetically equivalent sites. (b) Corresponding phase diagram for a single-valent system with a large separated-atom U. (Half-filled band illustrated.)

Δex enters U wherever an electron is added to a half-filled manifold. Figure 21 shows the evolution of U with increasing interatomic-interaction parameter b, where W ≈ 2zb in the tight-binding approximation with W ≫ U. As b increases, screening of the electrons of a given manifold by electrons on neighboring atoms causes U to decrease rapidly with b in the vicinity where W ≈ U causes a transition from semiconducting to metallic behavior. In the domain W < U, the crystal-field orbitals remain sufficiently localized to impart an atomic magnetic moment, whereas in the domain W > U the compound not only is metallic, but also has no spontaneous atomic magnetic moment. Clearly a necessary criterion for metallic conductivity in a single-valent transition-metal compound is the condition

$$W > U. \tag{59}$$

If the initial U at small b is relatively small, as may occur where

$$U = U', \tag{60}$$

then the bands may be so narrow at W ≈ U that the occupied states become split from the unoccupied states (Fig. 22) by a displacement transition that changes the translational symmetry of the structure. Such displacement transitions exhibit atomic clustering: like-atom clustering where M–M or O–O interactions are important, M–O clustering in a disproportionation reaction where M–O–M interactions are dominant. Cooperative transitions are to static charge-density wave (CDW) states that compete with the superconductive state. This type of transition is not restricted to a single-valent situation; a longer wavelength CDW may be stabilized where like cations are present on energetically equivalent sites with a mixed valence.

For the case of a mixed valence on energetically equivalent lattice sites, a narrow band may also be split by small-polaron formation as discussed for the mixed-valent rare earth ions. Elimination of small polarons, a necessary criterion for metallic conductivity in mixed-valent systems, requires a bandwidth

$$W > \hbar\omega_R, \tag{61}$$

where ωR is the frequency of the optical-mode vibration that traps the charge carrier in a local lattice deformation.
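The two necessary criteria, Eq. (59) for single-valent and Eq. (61) for mixed-valent compounds, amount to simple energy comparisons. The following Python sketch encodes them; the function names and the numerical inputs are hypothetical, and passing a test means only that metallic conduction is not excluded, since both conditions are necessary rather than sufficient.

```python
def passes_single_valent_criterion(w_ev, u_ev):
    """Eq. (59): W > U is necessary for metallic conduction
    in a single-valent compound."""
    return w_ev > u_ev


def passes_mixed_valent_criterion(w_ev, hbar_omega_r_ev):
    """Eq. (61): W > hbar * omega_R is necessary to avoid
    small-polaron trapping in a mixed-valent compound."""
    return w_ev > hbar_omega_r_ev


def tight_binding_width(z, b_ev):
    """Eq. (53): W ~ 2 z b."""
    return 2.0 * z * b_ev


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise both branches.
    w = tight_binding_width(z=6, b_ev=0.05)  # about 0.6 eV
    print(passes_single_valent_criterion(w, u_ev=4.0))              # narrow band vs. large U
    print(passes_mixed_valent_criterion(w, hbar_omega_r_ev=0.06))   # band vs. phonon energy
```

The example shows why a mixed valence is the easier route to metallic behavior in a narrow-band oxide: a typical optical-phonon energy is far smaller than a typical intraatomic U, so the same W can satisfy Eq. (61) while failing Eq. (59).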

These several considerations are best illustrated by some specific examples.

b. MnO, a single-valent compound with W ≪ U. Figure 23 illustrates the construction of an energy diagram for the antiferromagnetic insulator MnO. Comparison with Fig. 18 shows that it is similar to the construction

FIGURE 22 Same as Fig. 21 for a small separated atom U .


FIGURE 23 Electron energies for MnO: (a) free ions; (b) point charge; (c) crystal.

for MgO except for the appearance of a filled crystal-field d⁵ configuration within the large energy gap between the empty Mn:4s⁰ and the filled O:2p⁶ bands. Because EF lies above the top of the O:2p⁶ band, the formal valence Mn(II) unambiguously assigns five d electrons per Mn.
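The d-electron count implied by a formal valence follows from simple bookkeeping: the number of outer (4s plus 3d) electrons of the neutral atom minus the formal valence. The sketch below applies this to first-row transition metals; the lookup table and function name are illustrative assumptions, and the rule presumes, as in the text, that EF lies above the top of the O:2p⁶ band so that all remaining outer electrons are d electrons.

```python
# Outer (4s + 3d) electron count of the neutral first-row transition-metal atoms.
OUTER_ELECTRONS = {
    "Sc": 3, "Ti": 4, "V": 5, "Cr": 6, "Mn": 7,
    "Fe": 8, "Co": 9, "Ni": 10, "Cu": 11,
}


def d_electron_count(element, formal_valence):
    """Number of crystal-field d electrons for a cation of given formal valence."""
    count = OUTER_ELECTRONS[element] - formal_valence
    if not 0 <= count <= 10:
        raise ValueError("formal valence is outside the d-shell range")
    return count


if __name__ == "__main__":
    for element, valence in [("Mn", 2), ("Mn", 3), ("Mn", 4), ("Ti", 3), ("Ti", 4)]:
        print(f"{element}({valence}): d^{d_electron_count(element, valence)}")
```

Applied to the manganese and titanium valence states discussed in this section, the rule gives d⁵ for Mn(II), d⁴ for Mn(III), d³ for Mn(IV), d¹ for Ti(III), and d⁰ for Ti(IV).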

The Mn atom sits in an octahedral interstice of oxygen atoms in the rock-salt structure of MnO; in this configuration the fivefold d-orbital degeneracy is split in two by a cubic crystalline field. The wave functions for the two crystal-field manifolds have the form

$$\psi_e = N_\sigma\,[\,f_e - \lambda_s \phi_s - \lambda_\sigma \phi_\sigma\,], \tag{62}$$

$$\psi_t = N_\pi\,[\,f_t - \lambda_\pi \phi_\pi\,], \tag{63}$$

where, from Fig. 24, the twofold-degenerate e orbitals of symmetry x² − y² and [(z² − x²) + (z² − y²)]/√2 σ-bond with 2pσ and 2s orbitals at neighboring oxygen atoms but are orthogonal to the O-2pπ orbitals; the threefold-degenerate t₂ orbitals of symmetry xy, yz, zx π-bond with the O-2pπ orbitals but are orthogonal to the O-2pσ and O-2s orbitals. The admixture wave functions φs, φσ, and φπ are, respectively, linear combinations of nearest-neighbor O-2s, O-2pσ, and O-2pπ orbitals having the same symmetries as the atomic fe or ft orbitals with which they mix. The covalent-mixing parameters are defined as

$$\lambda_\sigma \equiv \left|b^{ca}_\sigma\right| / \Delta E \quad\text{and}\quad \lambda_\pi \equiv \left|b^{ca}_\pi\right| / \Delta E, \tag{64}$$

FIGURE 24 Illustration of cation d and anion p orbitals in the (001) plane of the rock-salt structure.

where ΔE = (Ed − Ep) is the energy required to transfer an electron from an O-2p orbital to an empty d orbital at the point-charge Mn(I):3d⁶ energy. Because the overlap integrals entering

$$b^{ca} \equiv (\psi_{cat}, H'\psi_{anion}) \approx \varepsilon\,(\psi_{cat}, \psi_{anion}) \tag{65}$$

are larger for the σ-bonding orbitals, a λσ > λπ raises the energy of the antibonding crystal-field e orbitals relative to that of the t₂ orbitals. The crystal-field splitting

$$10Dq \equiv \Delta_c = \Delta_m + \tfrac{1}{2}\left(\lambda_\sigma^2 - \lambda_\pi^2\right)(E_d - E_p) + \tfrac{1}{2}\lambda_s^2\,(E_d - E_s) \tag{66}$$

between e and t₂ crystal-field energies contains only a relatively small electrostatic term Δm.
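To see how the covalent-mixing parameters of Eq. (64) feed into the crystal-field splitting of Eq. (66), the following Python sketch evaluates both expressions. All numerical inputs are hypothetical placeholders chosen only to make the arithmetic visible; they are not measured values for MnO or any other compound.

```python
def mixing_parameter(b_ca_ev, delta_e_ev):
    """Eq. (64): lambda = |b_ca| / Delta E, with Delta E = E_d - E_p."""
    if delta_e_ev <= 0:
        raise ValueError("Delta E must be positive")
    return abs(b_ca_ev) / delta_e_ev


def crystal_field_splitting(delta_m, lam_sigma, lam_pi, lam_s, e_d, e_p, e_s):
    """Eq. (66): Delta_c = Delta_m + (1/2)(lam_sigma^2 - lam_pi^2)(E_d - E_p)
    + (1/2) lam_s^2 (E_d - E_s), all energies in eV."""
    covalent_p = 0.5 * (lam_sigma ** 2 - lam_pi ** 2) * (e_d - e_p)
    covalent_s = 0.5 * lam_s ** 2 * (e_d - e_s)
    return delta_m + covalent_p + covalent_s


if __name__ == "__main__":
    # Hypothetical energies (eV), for illustration only.
    e_d, e_p, e_s = 0.0, -4.0, -20.0
    lam_sigma = mixing_parameter(1.6, e_d - e_p)  # 0.4
    lam_pi = mixing_parameter(0.8, e_d - e_p)     # 0.2
    print(crystal_field_splitting(0.1, lam_sigma, lam_pi, 0.1, e_d, e_p, e_s))
```

With these placeholder numbers the covalent terms (0.24 eV and 0.10 eV) exceed the electrostatic term Δm = 0.10 eV, which mirrors the statement that Δm is the smaller contribution.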

The energy difference between the Mn(I):3d⁶ and the Mn(II):3d⁵ manifolds is

$$U = U_t + \Delta_{ex}, \tag{67}$$

where Ut is the energy required to add an electron to an empty t₂ orbital. In MnO, a Δex ≈ 3 eV makes both U and ΔE large, and a large ΔE makes λσ and λπ relatively small. Therefore a W ≪ U and a Δc < Δex stabilize a localized t₂³e² configuration at a Mn(II) ion, and a direct exchange interaction between spins in orthogonal orbitals couples the spins parallel, in accordance with Hund’s highest multiplicity rule for free ions, to give a localized Mn(II)-ion magnetic moment of 5μB.


In the presence of a W ≪ U, the interactions between half-filled orbitals on like atoms are treated in second-order perturbation theory:

$$\varepsilon \approx |t|^2 / U, \tag{68}$$

where the spin-dependent resonance integral is

$$t = b \sin(\theta/2) \tag{69}$$

because the rotation of a spin through an angle θ transforms as

$$\begin{aligned} \alpha &= \alpha' \cos(\theta/2) + \beta' \sin(\theta/2), \\ \beta &= -\alpha' \sin(\theta/2) + \beta' \cos(\theta/2), \end{aligned} \tag{70}$$

and the Pauli exclusion principle allows transfer only of an antiparallel spin to an orbital that is already half-filled. Substitution of Eq. (69) into Eq. (68) gives a spin–spin contribution to the interatomic interaction of the form

$$H_{ex} = -2J_{ij}\,(\mathbf{s}_i \cdot \mathbf{s}_j), \qquad J_{ij} = 2b^2/U, \tag{71}$$

for s = 1/2. This superexchange interaction has the form of the Heisenberg spin–spin interatomic-exchange interaction. It is responsible for long-range antiferromagnetic order below a Néel temperature TN.
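The angular dependence in Eqs. (68) and (69) can be made concrete with a short Python sketch. The function names and the sample values of b and U are assumptions for illustration; the expression for Jij reproduces only the magnitude quoted in Eq. (71) and does not address its sign convention.

```python
import math


def spin_dependent_transfer(b, theta):
    """Eq. (69): t = b * sin(theta / 2), theta being the angle between the two spins."""
    return b * math.sin(theta / 2.0)


def second_order_energy(b, u, theta):
    """Eq. (68): epsilon ~ |t|^2 / U for half-filled orbitals with W << U."""
    t = spin_dependent_transfer(b, theta)
    return abs(t) ** 2 / u


def exchange_parameter(b, u):
    """Magnitude of J_ij as quoted in Eq. (71): 2 b^2 / U (for s = 1/2)."""
    return 2.0 * b * b / u


if __name__ == "__main__":
    b, u = 0.1, 5.0  # hypothetical values in eV
    for degrees in (0, 90, 180):
        theta = math.radians(degrees)
        print(degrees, second_order_energy(b, u, theta))
    print("J_ij magnitude:", exchange_parameter(b, u))
```

For parallel spins (θ = 0) the transfer integral vanishes, as the Pauli principle requires, while for antiparallel spins (θ = 180°) it reaches its full value b; this is why the virtual electron transfer favors antiferromagnetic alignment.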

In MnO, 180° Mn–O–Mn interactions compete with Mn–Mn interactions. The magnetic order and the exchange striction below TN = 118 K demonstrate that Mn–O–Mn interactions are dominant in this compound. Placement of EF in a large energy gap between the Mn:4s⁰ band and the Mn:t₂³e² level makes MnO an antiferromagnetic insulator.

c. Li[Mn2]O4, a mixed-valent compound with W < ħωR. Because the Mn(II):t₂³e² energy lies above the O:2p⁶ bands, it is possible to oxidize Mn(II) to Mn(III) by the removal of a single e electron per Mn atom. The intraatomic electrostatic energy separating the Mn(II):t₂³e² and Mn(III):t₂³e¹ manifolds does not contain either Δex or Δc; it is

$$U = U_e, \tag{72}$$

which is small enough to retain the Mn(III) level above the top of the O:2p⁶ band. Consequently it is also possible to remove the remaining e electron at a Mn(III) ion to oxidize it to Mn(IV):t₂³e⁰. The Mn(IV):t₂³e⁰ level, on the other hand, lies well below the top of the O:2p⁶ band because it is separated from the Mn(III):t₂³e¹ level by the relatively large energy

$$U = U_e + \Delta_c. \tag{73}$$

Therefore an octahedral-site Mn(V) valence is not stabilized in oxides.

FIGURE 25 Electron energies for two spinels: (a) Li[Mn2]O4 and (b) Li[Ti2]O4.

The cubic spinel Li[Mn2]O4 illustrates a mixed Mn(III) + Mn(IV) valence configuration on energetically equivalent octahedral sites. In this compound, the Fermi energy intersects the Mn(III):t₂³e¹ energy level (Fig. 25a), and electron transport can occur via the reaction

$$t_2^3 e^1 + t_2^3 e^0 = t_2^3 e^0 + t_2^3 e^1. \tag{74}$$

However, the bandwidth of the t₂³e¹ level is so narrow that the time τh ≈ ħ/W for an electron to hop to a neighboring site is long compared to the period ωR⁻¹ of the optical-mode lattice vibration that traps it as a small polaron. Therefore, the Mn:d⁴ level is split by a polaron energy εp into occupied Mn(III):t₂³e¹ and empty Mn(IV):t₂³e⁰ states in a manner analogous to the splitting of a Mn⁴⁺/Mn³⁺ redox couple in a liquid electrolyte. Small-polaron formation introduces an activation energy into the charge-carrier mobility, so Li[Mn2]O4 is a magnetic semiconductor; it is not a superconductor.
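The competition between the hopping time τh ≈ ħ/W and the vibrational period ωR⁻¹ can be expressed as a comparison of two time scales. The Python sketch below does this; the function names and the sample bandwidths and phonon energy are hypothetical and are not data for Li[Mn2]O4.

```python
HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV*s


def hop_time_s(bandwidth_ev):
    """tau_h ~ hbar / W: time for an electron to hop to a neighboring site."""
    return HBAR_EV_S / bandwidth_ev


def vibration_period_s(phonon_energy_ev):
    """omega_R^-1 = hbar / (hbar * omega_R) for an optical mode of that energy."""
    return HBAR_EV_S / phonon_energy_ev


def forms_small_polaron(bandwidth_ev, phonon_energy_ev):
    """True when the lattice relaxes faster than the electron hops (tau_h > omega_R^-1)."""
    return hop_time_s(bandwidth_ev) > vibration_period_s(phonon_energy_ev)


if __name__ == "__main__":
    phonon = 0.06  # hypothetical optical-phonon energy in eV
    for w in (0.02, 0.5):  # hypothetical narrow and broad bandwidths in eV
        print(f"W = {w} eV: tau_h = {hop_time_s(w):.2e} s, "
              f"small polaron: {forms_small_polaron(w, phonon)}")
```

The comparison τh > ωR⁻¹ is algebraically the same statement as W < ħωR, so the time-scale picture and the energy criterion of Eq. (61) describe one and the same condition.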

d. Li[Ti2]O4, a mixed-valent compound with W > ħωR. In contrast to the manganese oxides, which generally have localized dⁿ configurations, the titanium oxides generally have itinerant d electrons. For example, the cubic spinel Li[Ti2]O4 is a superconductor with Tc = 13.7 K. In this compound, the Ti–Ti interactions are strong enough to make τh < ωR⁻¹ (i.e., W > ħωR) for the electron-transfer reaction

$$\mathrm{Ti(III)}{:}\,t_2^1 e^0 + \mathrm{Ti(IV)}{:}\,t_2^0 e^0 = \mathrm{Ti(IV)}{:}\,t_2^0 e^0 + \mathrm{Ti(III)}{:}\,t_2^1 e^0, \tag{75}$$

so band theory becomes applicable (Fig. 25b).


e. Ti4O7 and the bipolaron. The mixed-valent compound Ti4O7 is obtained by removing oxygen from TiO2. The oxygen vacancies order into shear planes so as to leave a close-packed tetragonal oxide-ion array as in TiO2, but with the Ti atoms arranged to give TiO2 slabs connected every four Ti atoms along the TiO2 c axis by a shear plane across which Ti atoms share common octahedral-site faces rather than octahedral-site edges. A strong electrostatic repulsion between cations across a shear plane displaces the shear-plane Ti atoms away from each other. The possibility of a ferroelectric-type displacement of a Ti(IV) ion in an octahedral site stabilizes the shear-plane structure and locates the Ti-3d electrons within the slabs.

At room temperature, both the Ti–Ti and the Ti–O–Ti interactions within a rutile slab satisfy the condition W > ħωR, and Ti4O7 is metallic. However, a first-order semiconductor–metal transition occurs at a Tt ≈ 150 K due to Ti–Ti dimerization within the slabs; the electrons are trapped out as pairs within specific Ti–Ti homopolar bonds. Such a transition would be typical of cation clustering except that in the mixed-valent compound Ti4O7 the homopolar bonds are mobile in the temperature range 130 < T < 150 K; they become stationary only below 130 K. These mobile homopolar bonds represent strongly coupled, localized electron pairs that, like small polarons, move diffusively in the crystal. Such a mobile electron pair is called a bipolaron. Formation of spin-paired bipolarons causes a sharp drop in the paramagnetic susceptibility. But the compound becomes a semiconductor, not a superconductor, and there is no Meissner effect.

f. TiO, a single-valent compound with W > U. The titanium atom may be stabilized in single-valent oxides as Ti(II) in TiO, Ti(III) in Ti2O3, and Ti(IV) in TiO2. This is possible, even though TiO2 is an insulator with empty d orbitals some 3 eV above the top of the O:2p⁶ band, because the energy separating the Ti(III):t₂¹e⁰ and Ti(II):t₂²e⁰ levels is a relatively small U = Ut. Moreover, the absence of antibonding e electrons allows the Ti–O bond to be short relative to the radial extension of the t₂ crystal-field orbitals, so the overlap of t₂ orbitals on neighboring Ti atoms sharing a common octahedral-site edge is large enough to ensure a W > Ut for Ti–Ti interactions and the overlap of Ti-t₂ and O-2pπ orbitals is large enough to make W ≈ Ut for 180° Ti(III)–O–Ti(III) interactions.

The corundum structure of Ti2O3 contains, on a hexagonal basis, c-axis pairs of cations sharing a common octahedral-site face. Although Ti2O3 is a metal above room temperature, it becomes a semiconductor at lower temperatures because the d electrons become trapped in homopolar Ti–Ti bonds within the c-axis pairs.

TiO, on the other hand, is metallic and a superconductor (Tc = 1.5 K), even though it contains about 15% cation and anion vacancies in its rock-salt structure that become ordered at lower temperatures. Here, also, the intraatomic energy U = Ut contains no additional term Δc or Δex, and some hybridization of titanium 4s and 3d orbitals increases W.

g. Oxides with only M–O–M interactions. The perovskite and pyrochlore structures provide systems having 135 to 180° M–O–M interactions and no M–M interactions. A survey of the oxides with these structures shows that the condition W > U may be fulfilled in single-valent oxides and the condition W > ħωR may be found in mixed-valent oxides with these structures.

The relevant bandwidth W ≅ 2zb arising from M–O–M interactions is proportional to either λπ² or λσ². A large covalent-mixing parameter λπ or λσ requires, according to Eq. (64), a small ΔE = (Ed − Ep) and/or a large bca. The energy ΔE decreases with increasing formal charge on the cation and, for a given charge, on going to the right in any long period, provided that Δc or Δex is not introduced into U on adding another d electron to compensate for the increased nuclear charge. However, bca also decreases on going to heavier atoms, but it increases on going from 3d to 4d to 5d orbitals. The overlap integrals are also sensitive to the M–O–M angle and to the character of the countercation A in the AMO3 perovskites and the A2M2O7 pyrochlores.

Perovskite and pyrochlore oxides containing partially filled 5d orbitals are generally metallic, whether stoichiometric as in ReO3 or mixed-valent as in the NaxWO3 bronzes; but single-valent metallic conductivity, as in ReO3, does not guarantee that the compounds are superconductors.

Perovskite and pyrochlore oxides containing partially filled 4d orbitals are intermediate in character; the 4d electrons are itinerant, but a W ≈ U may result in spontaneous magnetism.

The perovskite and pyrochlore oxides containing partially filled 3d orbitals commonly contain localized 3dⁿ configurations; however, several of these oxides have a W ≥ U and are either metallic or stabilize itinerant-electron antiferromagnetic order. In the perovskites, the homogeneous electronic picture in Fig. 21 does not hold where W ≈ U; a static CDW and/or spin-density wave (SDW) may be stabilized as indicated in Fig. 22. Recently, dynamic displacive phase segregations have been identified in the metallic systems Sr1−xCaxVO3, La1−xNdxCuO3, and Ln1−xCaxMnO3 and the LnNiO3 family (Ln = lanthanide). These dynamic segregations occur because there is a first-order change in the equilibrium M–O bond length on going from localized 3d electrons on the M atoms (W < U) to itinerant 3d electrons (W > U); the equilibrium M–O bond is longer for


localized than for itinerant 3d electrons. In the LnNiO3

family, the onset of a static CDW/SDW below a criticaltemperature Tt has been shown to be an order–disordertransition of localized-electron fluctuations in an itinerant-electron host; it is not due to either Fermi-surface nestingin a new Brillouin zone created by a change in latticesymmetry or to the homogeneous Mott–Hubbard transi-tion illustrated in Fig. 21. In the Ln1−x Cax MnO3 system,all the Mn atoms carry a localized t3 configuration withspin S = 3/2, and a single electron per Mn(III) occupiesa narrow (W ≈ U ) σ ∗ band of e-orbital parentage. In thiscase, the transition from localized e to itinerant σ ∗ elec-trons is approached from the localized-electron side, andordering of the twofold-degenerate e orbitals promotese-electron localization and charge ordering. In the ab-sence of charge and orbital ordering, the system becomesferromagnetic below a Curie temperature Tc, and a dy-namic phase segregation in compositions with W ≈ U re-sults in a “colossal magnetoresistance” (CMR) at tem-peratures T ≥ Tc; a dynamically segregated, ferromag-netic, metallic phase of higher Curie temperature growsin an applied magnetic field at the expense of the hostphase until it reaches a percolation threshold. This evi-dence of vibronic (hybridization of electronic and vibra-tional state) phenomena in oxides with perovskite-relatedstructures is of great significance for our understand-ing of the high-temperature superconductivity in copperoxides.

h. Summary. A review of the known properties of transition-metal oxides reveals the following generalizations that apply to compounds having an EF above the top of the O:2p6 bands.

1. Formal valences provide a count of the number of crystal-field d electrons per transition-metal ion. Any ambiguity in the distribution of d electrons among different transition-metal ions or between crystallographically inequivalent lattice sites can generally be resolved.

2. Single-valent oxides require a W > U to be metallic, and W is larger for 5d than for 4d or 3d electrons. Itinerant 3d and 4d electrons are found only where U contains neither a Δc nor a Δex.

3. Mixed-valent oxides require a W > ħωR to be metallic. Metallic mixed-valent oxides are more commonly superconductors.

4. Single-valent oxides having W ≈ U are not superconducting; electron correlations that introduce an enhancement of the magnetic susceptibility, even if they do not induce magnetic order at low temperatures, compete with superconductivity. Moreover, where a small U permits a large N(EF) compatible with W > U, static charge-density waves compete with superconductivity. In fact, CDWs may also compete in mixed-valent oxides.

5. At the crossover from localized to itinerant electronic behavior (W ≈ U or W ≈ ħωR), a first-order change in the equilibrium M–O bond length can give rise to vibronic phenomena in both single-valent and mixed-valent transition-metal arrays.
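The single-valent criterion in item 2 amounts to comparing two energy scales. The following is a minimal sketch of that comparison, not a model of any real compound: the function name and labels are hypothetical, the sharp inequality stands in for what the text describes as a crossover region near W ≈ U, and the gap expression Eg = U − W is the one quoted in the caption of Fig. 26.

```python
def single_valent_state(W: float, U: float) -> tuple[str, float]:
    """Return (label, gap) for bandwidth W and on-site energy U (same units).

    Hypothetical helper: W > U is labeled itinerant with no gap; otherwise
    the electrons are labeled localized with a gap Eg = U - W.
    """
    if W > U:
        return ("itinerant", 0.0)
    return ("localized", U - W)


# Illustrative numbers only (arbitrary energy units).
print(single_valent_state(3.0, 2.0))
print(single_valent_state(1.5, 2.0))
```

Near W ≈ U the text stresses that neither label is adequate, so a sharp classifier of this kind should be read only as a bookkeeping aid.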

i. Peculiarity of copper oxides. Copper oxides are unusual in two respects. First, octahedral-site Cu(II):t6e3 contains a single e hole in the 3d shell, which makes it orbitally degenerate and therefore a strong Jahn–Teller ion; consequently, Cu(II) ions normally occupy octahedral sites that are deformed to tetragonal (c/a > 1) symmetry by Jahn–Teller orbital ordering. However, in the absence of a cooperativity that stabilizes long-range orbital ordering, the electrons may couple locally to E-mode vibrations, forming vibronic states in a dynamic Jahn–Teller coupling. Second, the Cu(II):3d9 energy level lies below the top of the O2−:2p6 valence band in an ionic model; the introduction of covalent bonding creates states of e-orbital symmetry at the top of the O2−:2p6 bands that have a large O-2pσ component (see Fig. 26). Locally this O-2pσ component increases dramatically on oxidation of Cu(II) to Cu(III). The change in hybridization represents a polarization of the oxygen atoms that decreases the equilibrium Cu–O bond length, but the change in polarization is fast relative to the motion of the oxygen nucleus. Therefore, a dynamic vibronic phenomenon may reflect coupling to the polarization cloud of the oxygen atoms rather than to significant oxygen-atom displacements. Nevertheless, hybridization with a polarization wave on the oxygen-atom array would significantly increase the effective mass m* of an itinerant electron.

FIGURE 26 Schematic energy density of one-electron states for La2CuO4 with bandwidth x2–y2. (a) Wσ > U and (b) Wσ < U, Eg = U − Wσ.

The copper-oxide superconductors all contain CuO2 sheets in which apical Cu–O bonds perpendicular to a sheet are significantly longer than the in-plane Cu–O bonds. This structural feature signals full occupancy of the (3z2–r2) orbitals of an e-orbital pair. The parent compounds of the superconductive systems contain all Cu(II) in the CuO2 sheets, which leaves the in-plane (x2–y2) orbitals half-filled with a

$$ U = U_e + \Delta_{\mathrm{ex}}. \qquad (76) $$

A W < U results in localized (x2–y2) electrons that interact with one another on nearest neighbors by superexchange to give antiferromagnetic order within a semiconductive CuO2 sheet. On oxidation of the CuO2 sheets, the system undergoes a crossover from localized to itinerant electronic behavior, and a thermodynamically distinguishable p-type superconductive phase is found at crossover with a hole concentration x per Cu atom of the CuO2 sheets in the range 0.14 ≤ x ≤ 0.22. Superconductivity has also been observed on reduction of the CuO2 sheets, but n-type superconductivity is more difficult to stabilize and has been studied much less.

IV. HIGH-Tc SUPERCONDUCTORS

A. System BaPb1−xBixO3

1. Structure

The high-Tc superconductors are oxides having structures related to the cubic perovskite. The ideal cubic perovskite has the composition ABX3, where A is a large cation, B is a smaller cation, and X is an anion. As illustrated in Fig. 27, the BX3 array consists of a framework of corner-shared octahedra, and the large A cation occupies the center of each “cage” of the framework. For the A cation to fit easily into the cage, the A–X and B–X bond lengths must satisfy the following relation among the “ionic radii” RA, RB, and RX:

$$ t \equiv \frac{R_A + R_X}{\sqrt{2}\,(R_B + R_X)}, \qquad (77) $$

where t is called the Goldschmidt tolerance factor; ideal matching of the two bond lengths corresponds to t = 1. Since the A–X and B–X bond lengths can, at best, be optimized simultaneously only at a single temperature for a fixed pressure, it is common in AMO3 perovskites to find that optimization of the A–O interactions induces a distortion of the cubic MO3 cage; these distortions (to orthorhombic, rhombohedral, or tetragonal symmetry) are accomplished by a cooperative rotation of the MO6/2 octahedra that somewhat reduces the M–O–M bond angles from 180°.

FIGURE 27 Two views of the ABX3 cubic perovskite structure.
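Equation (77) is straightforward to evaluate once a set of ionic radii has been chosen. The sketch below is illustrative only: the function name is hypothetical, and the radii are approximate tabulated values assumed here for SrTiO3 (they are not taken from this article); a different radius table or coordination number would shift t slightly.

```python
import math


def tolerance_factor(r_a: float, r_b: float, r_x: float) -> float:
    """Goldschmidt tolerance factor t = (R_A + R_X) / (sqrt(2) * (R_B + R_X))."""
    return (r_a + r_x) / (math.sqrt(2.0) * (r_b + r_x))


# Assumed, approximate ionic radii in angstroms (illustrative values):
# Sr2+ in twelvefold, Ti4+ in sixfold, and O2- in sixfold coordination.
R_SR, R_TI, R_O = 1.44, 0.605, 1.40

t_srtio3 = tolerance_factor(R_SR, R_TI, R_O)
print(f"t(SrTiO3) = {t_srtio3:.3f}")  # close to 1, i.e. nearly matched bond lengths

# A smaller A cation (hypothetical radius) lowers t below 1.
t_small_a = tolerance_factor(1.20, R_TI, R_O)
print(f"t(smaller A cation) = {t_small_a:.3f}")
```

A value of t below unity is the situation that the text associates with cooperative rotations of the octahedra.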

The “cubic” perovskite structure can sustain a wide range of compositional variations. Partial substitution of any of the ions is possible: A1−xA′xMO3, AM1−xM′xO3, and AMO3−xFx are known, for example. Removal of A cations and anions is also possible: the cubic bronzes NaxWO3 contain an A-cation deficiency, and the ReO3 structure consists of just the cubic MO3 array. Small concentrations of oxygen deficiency may be disordered, as in the superconductor SrTiO3−x; but a large electrostatic repulsion between oxygen vacancies tends to introduce short-range order, at least, and commonly a long-range order that defines a new structural type.

2. Superconductive versus CDW State

Pure BaPbO3 is a pseudocubic, metallic perovskite that is distorted to orthorhombic symmetry by a cooperative rotation of the PbO6/2 octahedra; all the Pb(IV) ions are in energetically equivalent octahedral sites. Oxygen-deficient BaPbO3−y is an n-type metal and a conventional superconductor.

Pure BaBiO3, on the other hand, is monoclinic, with two distinguishable bismuth octahedra obtained by a cooperative shifting of the oxygen atoms away from one near-neighbor bismuth toward the other so as to make the Bi–O bonds short and long in alternate octahedra. Such a “breathing-mode” oxygen displacement is indicative of a disproportionation reaction

2Bi(IV) → Bi(III) + Bi(V) (78)

in which the energy gained by stronger covalent mixing in a Bi(V)O6 more than compensates for the electrostatic energy U required to transfer an electron from one Bi(IV) to the other. Physicists refer to such a spontaneous disproportionation as a “negative U” reaction.

FIGURE 28 Temperature variation of the resistivity for various values of x in the system BaPb1−xBixO3. [After Thanh, T. D., Koma, A., and Tanaka, S. (1980). Appl. Phys. 22, 205.]

The system BaPb1−xBixO3 is pseudocubic in the compositional range 0.05 ≤ x ≲ 0.3, and it is an unconventional superconductor with a Tc that increases with x to about 13 K at the limiting composition of the pseudocubic single-phase field (Fig. 28).

Superconductivity was first discovered in the BaPb1−xBixO3 system by Sleight of DuPont. Although this perovskite system reaches a maximum Tc of only about 13 K, it is considered unconventional because such a Tc requires a large VBCS N(EF) product [Eq. (42)], and a small measured N(EF) then requires an exceptionally large pairing potential VBCS. Therefore, the system has been examined for clues to the strong coupling mechanism operative in the higher-Tc copper oxides (Fig. 1).
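The argument that a given Tc fixes the product VBCS N(EF) can be made concrete with a short numerical sketch. It assumes that Eq. (42), which is not reproduced in this section, has the standard weak-coupling BCS form Tc = 1.14 ΘD exp[−1/(N(EF) V)]; the Debye temperature used below is a hypothetical value chosen only for illustration, and the function names are likewise hypothetical.

```python
import math


def bcs_tc(theta_d: float, coupling: float) -> float:
    """Weak-coupling BCS estimate: Tc = 1.14 * theta_D * exp(-1 / (N(EF) * V))."""
    return 1.14 * theta_d * math.exp(-1.0 / coupling)


def required_coupling(tc: float, theta_d: float) -> float:
    """Invert the expression above for the dimensionless product N(EF) * V."""
    return 1.0 / math.log(1.14 * theta_d / tc)


THETA_D = 200.0  # K, hypothetical Debye temperature (assumption, not from the text)

nv = required_coupling(13.0, THETA_D)
print(f"N(EF)*V needed for Tc = 13 K: {nv:.2f}")

# Because only the product is fixed, halving N(EF) at constant Tc
# doubles the pairing potential V that is required.
```

Under these assumptions the product is set by Tc and ΘD alone, so a small measured N(EF) translates directly into a large VBCS.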

Superconductivity also appears on suppression of the static CDW of BaBiO3 by substitution of more than 12% K+ for Ba2+ in Ba1−xKxBiO3; a maximum Tc = 32 K is found near x = 0.4, where the system becomes cubic. For x > 0.47, the system behaves as a normal metal without any superconductivity. Substitution of 18O for 16O gave a conventional isotope shift, α = 0.4 to 0.5, which indicates that the BCS phonon-mediated pairing mechanism is operative in these systems. On the other hand, Kumar, Hall, and Goodrich have shown that the transition at Tc is fourth-order, not second-order, in the Ehrenfest classification.
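The size of the shift implied by such an isotope exponent can be estimated quickly. The sketch below assumes the usual convention Tc ∝ M^(−α), with M the oxygen isotope mass; that convention is an assumption here, since the defining equation is not given in this section, and the function name is hypothetical.

```python
def tc_ratio(alpha: float, m_light: float = 16.0, m_heavy: float = 18.0) -> float:
    """Tc(heavy isotope) / Tc(light isotope) for Tc proportional to M**(-alpha)."""
    return (m_heavy / m_light) ** (-alpha)


for alpha in (0.4, 0.5):
    percent_change = 100.0 * (tc_ratio(alpha) - 1.0)
    print(f"alpha = {alpha}: Tc changes by {percent_change:.1f}% on 16O -> 18O")
```

With α in the quoted range, the heavier isotope lowers Tc by several percent, which is the magnitude expected for phonon-mediated pairing.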

FIGURE 29 Structure of La2CuO4: (a) tetragonal and (b) orthorhombic cooperative CuO6/2 rotations.

B. Copper-Oxide Superconductors

1. Structure

Where the tolerance factor in Eq. (77) approaches unity, epitaxial (001) interfaces between an AX rock-salt layer and an ABX3 perovskite layer are lattice matched. Nature recognizes this fact by stabilizing intergrowth structures (AX)(ABX3)n in which perovskite layers alternate with rock-salt layers along an [001] axis. The La2CuO4 structure in Fig. 29, for example, is tetragonal at high temperatures, with LaO rock-salt layers alternating with LaCuO3 perovskite layers on traversing the c-axis. Lattice matching requires a 45° rotation of the [100] axis of a rock-salt layer relative to that of a perovskite layer. As in the perovskite structure itself, the Goldschmidt tolerance factor of Eq. (77) is a measure of the mismatch of the equilibrium A–X and B–X bond lengths. Since the A–X and B–X bonds have different thermal expansions and compressibilities, matching (t = 1) of the bond lengths can be perfect only at a specific temperature for a given pressure, and the value of t calculated from tabulated ionic radii corresponds to room temperature at ambient pressure. On cooling, t decreases, and a t < 1 is compensated by a cooperative rotation of the CuO6/1.5 octahedra about a tetragonal [110] axis (Fig. 29b), to lower the symmetry to orthorhombic. These rotations buckle the Cu–O–Cu bonds from 180° to (180° − φ), so the CuO2 planes become CuO2 sheets.

In the case of La2CuO4, the crystallographic c/a ratio is anomalously high because the Cu(II) ions distort their octahedra to tetragonal (c/a > 1) symmetry with their long apical Cu–O bond along the c-axis. This c-axis ordering of the filled (3z2–r2) orbitals has an important structural consequence: the apical oxygen atoms are not strongly bound to the Cu atoms and may be removed from the oxygen coordination at a Cu(II). Consequently, the Cu(II) ion may be found in six-, five-, or fourfold oxygen coordination, but the strong square-coplanar bonding within a CuO2 sheet is always maintained.

FIGURE 30 (a) T′-tetragonal structure of Nd2CuO4. (b) T*-tetragonal structure of Nd2−x−yCeySrxCuO4.

Replacement of La with a smaller trivalent rare-earth ion illustrates well the weak bonding of the apical oxygen. Substitution of a smaller A cation lowers t, and the structure accommodates the bond-length mismatch by displacing the apical oxygen to tetrahedral interstices of an (001) La bilayer to form a fluorite Ln–O2–Ln layer (Ln = Pr to Gd) as illustrated in Fig. 30a. This structure is labeled T′-tetragonal to distinguish it from the T-tetragonal phase of high-temperature La2CuO4. An important consequence for the chemistry of these phases is that the displacement of the apical oxygen in the T′ phase places the CuO2 planes under tension, whereas the CuO2 sheets of La2CuO4 are under compression as a result of the bond-length mismatch. A tensile stress is relieved by adding antibonding (x2–y2) electrons to the CuO2 planes; a compressive stress is relieved by removing antibonding (x2–y2) electrons from the CuO2 sheets. As a result, the T′ phase can only be doped n-type to give n-type superconductivity, whereas La2CuO4 can only be doped p-type. In fact, care must be exercised in the preparation of La2CuO4, as it may accept interstitial oxygen in the tetrahedral sites of the rock-salt bilayers to give La2CuO4+δ; this composition phase segregates below room temperature to give filamentary p-type superconductivity in the oxygen-rich phase. In the p-type system La2−xSrxCuO4, the larger Sr2+ ion relieves the compressive stress, and oxygen stoichiometry is more easily achieved as x increases.

In the T*-tetragonal structure of Nd2−x−yCeySrxCuO4 (Fig. 30b), the larger Sr2+ ions order into alternate A-cation bilayers; the Sr2+ ions stabilize rock-salt bilayers, whereas the alternate bilayers have the fluorite structure. As a result, the Cu(II) are fivefold coordinated. Whether the Cu(II) are six-, five-, or fourfold coordinated, superconductivity requires preservation of the translational symmetry within a CuO2 plane or sheet and, therefore, the same nearest-neighbor oxygen coordination for every copper atom within a plane or sheet.

The variable oxygen coordination at a Cu(II) also makes it possible to remove the apical oxygen atoms from a perovskite multilayer if the A-site cations of the perovskite block are stable in eightfold oxygen coordination. In fact, all the copper-oxide superconductors that would contain perovskite multilayers contain eightfold coordination of the A′ cations (A′ = Ca, Y, or a trivalent lanthanide) to form an A′m−1(CuO2)m layer, with integral m ≥ 2; these superconductive layers alternate with AO–Φ–AO layers (A = La, Sr, Ba) that have a rock-salt AO interface. The intralayer composition Φ has a variable oxygen content and may be quite varied, as illustrated in Figs. 31 to 35.

The nonsuperconductive layer may act as a charge reservoir for the holes in the p-type superconductive layers. This situation is found, for example, in the YBa2Cu3O6+x system, where the x oxygen atoms (0 ≤ x < 1) in the BaO–CuOx–BaO layers order into Cu–O–Cu chains for x > 0.4; the fully formed chains are more conductive than the superconductive CuO2 sheets and they become superconductive with the CuO2 sheets. Moreover, displacement of the apical oxygen regulates the distribution of holes between the chains and the sheets. On the other hand, the superconductive (Tc = 16 K) RuSr2GdCu2O8 = (CuO2GdCuO2)(SrORuO2SrO) is also ferromagnetic, with a Curie temperature of 133 K. As in the ferromagnetic perovskite SrRuO3, the Ru atoms carry a magnetic moment µRu ≈ 1 µB. Clearly the nonsuperconductive, ferromagnetic layer in this compound must be electronically isolated from the superconductive layer, but the internal magnetic field lowers Tc.

FIGURE 31 Structures of (a) tetragonal YBa2Cu3O6 and (b) orthorhombic, ideal YBa2Cu3O7.

FIGURE 32 Tetragonal subcell of Bi2Sr3−xCaxCu2O8+y showing the CuO2 layers. Metal atoms are shaded and only Cu–O bonds are indicated. Oxygen atom positions for the Bi layers are idealized. [After Subramanian, M. A., et al. (1988). Science 239, 1015.]

FIGURE 33 Structure of YBa2Cu4O8 showing double chains in the BaO–Cu2O2–BaO layer.

2. System La2−xSrxCuO4

a. Phase identification. The intergrowth structure of La2CuO4 (Fig. 29) is the simplest that exhibits p-type superconductivity. Holes may be introduced into the (x2–y2) band by creating A-site vacancies or interstitial oxygen; more useful is the substitution of an alkaline-earth ion A2+ for La3+ in oxygen-stoichiometric La2−xAxCuO4. The La2−xAxCuO4 system is of particular interest for two reasons: (1) the number x of holes per formula unit is unambiguously introduced into the (x2–y2) band of the CuO2 sheets; and (2) the solid-solution range 0 ≤ x ≤ 0.3 spans the entire range of superconductive compositions, as shown in the phase diagram in Fig. 36.
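The statement that x counts the holes per formula unit follows from formal charge balance. A minimal sketch of that bookkeeping is given below; it assumes purely ionic formal charges (La3+, A2+, O2−) and exact oxygen stoichiometry, and the function names are hypothetical.

```python
def cu_valence_la214(x: float) -> float:
    """Formal Cu valence in La(2-x)A(x)CuO4 from charge neutrality.

    Assumes La3+, an alkaline-earth A2+, O2-, and full oxygen stoichiometry.
    """
    cation_charge = 3.0 * (2.0 - x) + 2.0 * x  # La3+ and A2+ contributions
    anion_charge = 2.0 * 4.0                   # four O2- per formula unit
    return anion_charge - cation_charge        # charge carried by the one Cu


def holes_per_cu(x: float) -> float:
    """Holes per Cu measured from the all-Cu(II) parent compound."""
    return cu_valence_la214(x) - 2.0


for x in (0.0, 0.15, 0.30):
    print(f"x = {x:.2f}: Cu valence = {cu_valence_la214(x):.2f}, "
          f"holes per Cu = {holes_per_cu(x):.2f}")
```

Because there is one Cu per formula unit and every Cu lies in a CuO2 sheet, the hole count per Cu equals x in this formal picture.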

Crystallographically, there are two distinguishable phases in Fig. 36, a high-temperature tetragonal (HTT) and a low-temperature orthorhombic (LTO) phase, resulting from cooperative rotations of the CuO6/1.5 octahedra. In the La2−xBaxCuO4 system, cooperative rotations about the [100] and [010] axes in alternate CuO2 sheets produce a low-temperature tetragonal (LTT) phase below about 60 K in the range 0.12 ≤ x ≤ 0.15. In Fig. 36, the LTO–HTT transition temperature Tt is seen to drop with increasing x, crossing Tc near x = 0.22.

The transport data distinguish three electronic phases below room temperature: (1) an antiferromagnetic phase in the range 0 ≤ x ≤ 0.02, (2) a superconductive phase in the range 0.1 < x < 0.22, and (3) an n-type metallic phase for 0.26 < x ≤ 0.30. At x ≈ 0.125, there is a weak suppression of Tc vs x; in the La2−xBaxCuO4 system, superconductivity is completely suppressed in the range 0.12 ≤ x ≤ 0.13 by the stabilization of a static CDW. Tranquada et al. have shown that the LTT phase of La2−xBaxCuO4 stabilizes, at x ≈ 0.125, a static CDW having the form of alternating hole-rich and antiferromagnetic stripes running parallel to the tetragonal [100] and [010] axes, respectively, in alternate CuO2 sheets. Under hydrostatic pressure, superconductivity is restored to the compositions where it was suppressed by the static CDW. From X-ray absorption fine structure (XAFS), Bianconi et al. previously found evidence suggesting mobile stripes in superconductive samples, and an open question is whether the stripes are mobile in the superconductive phase or whether a related vibronic coupling characterizes the charge carriers.
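The three compositional ranges quoted from the transport data can be collected into a small lookup. This is only a restatement of the boundaries given in the text; the function name and the label used for compositions that fall between the quoted ranges are hypothetical, and real boundaries are not this sharp.

```python
def electronic_phase(x: float) -> str:
    """Map the Sr content x of La(2-x)Sr(x)CuO4 to the phase quoted in the text."""
    if not 0.0 <= x <= 0.30:
        raise ValueError("x lies outside the solid-solution range 0 <= x <= 0.3")
    if x <= 0.02:
        return "antiferromagnetic"
    if 0.1 < x < 0.22:
        return "superconductive"
    if x > 0.26:
        return "n-type metallic"
    # Compositions between the quoted ranges are not assigned a single phase.
    return "intermediate (not assigned to one of the three phases)"


for x in (0.0, 0.05, 0.15, 0.24, 0.28):
    print(f"x = {x:.2f}: {electronic_phase(x)}")
```

The unassigned intervals correspond to the underdoped and overdoped crossover regions discussed in the following subsections.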

FIGURE 34 Structures of (a) Tl2Ba2CuO6, (b) Tl2Ba2CaCu2O8, and (c) Tl2Ba2Ca2Cu3O10.

b. Underdoped. The “underdoped” region 0 < x < 0.1 supports polaronic conduction, but the thermoelectric power indicates that the nonadiabatic polarons are not small polarons centered on one Cu atom but embrace about five Cu centers. A pseudo Jahn–Teller deformation of the in-plane square-coplanar coordination at a Cu(III) would be resisted by an elastic energy, but this energy would be reduced by cooperative deformations over several Cu centers. Calculation has shown that the gain in elastic energy would result in a polaron that embraced five to seven Cu centers. However, some other vibronic mechanism may be responsible for preventing the polaron collapse to a single Cu center. Within the polarons, antiferromagnetic order is suppressed, which indicates that the hole occupies a molecular orbital that includes all the Cu centers of the polaron. In this respect, it represents a mobile metallic phase in the antiferromagnetic matrix. As the volume of this second electronic phase increases, it breaks up the long-range antiferromagnetic order of the parent phase, which causes TN to decrease precipitously with x from 340 K at x = 0. However, localized spins in regions of short-range order persist into the superconducting compositions; they give rise to a maximum in the paramagnetic susceptibility at a Tmax that decreases with increasing x.

FIGURE 35 Structure of HgBa2Ca2Cu3O8+δ. The O4 and O5 are the δ interstitial oxygen.

The appearance of a superconductive Tc that increases with x for compositions 0.05 ≤ x ≤ 0.10 indicates that the polarons condense at lower temperatures into superconductive filaments. The transition temperatures TF and Tρ in Fig. 36 mark anomalies in the temperature dependence of the transport properties; others have noted anomalies in NMR and thermodynamic measurements at similar temperatures T* (not shown in Fig. 36) that indicate the opening of a pseudogap, i.e., the lowering of the density of states, at εF. These anomalies appear to reflect interactions between polarons and their ordering into hole-rich stripes. Hunt et al. have used 63Cu nuclear quadrupole resonance (NQR) to reveal the presence of slowly fluctuating, quasistatic charge (i.e., hole-rich) stripes in the range 1/16 ≤ x ≤ 1/8; the stripes become increasingly ordered on lowering the temperature, but the ordering temperature decreases with increasing x, vanishing at x = 1/8. The polaron ordering apparently occurs within a parent phase that decreases in volume with increasing x; it is replaced by a single superconductive phase.

FIGURE 36 Phase diagram of La2−xSrxCuO4.

c. Optimally doped. The superconductive compositions 0.14 ≤ x ≤ 0.20 exhibit a nearly temperature-independent thermoelectric power α(T) above Tl; there is no dramatic change in the evolution of α(T) with x above Tl. However, below Tl there is an abrupt change in the character of α(T) between x = 0.10 and x = 0.15. On cooling below Tl in the range 0.15 ≤ x ≤ 0.22, α(T) increases relatively steeply to a maximum value at about 140 K, too high a temperature to be due to phonon drag. Zhou and Goodenough have shown that this unusual feature is present in all the single-phase copper-oxide superconductors, and only where there are superconductive CuO2 sheets. This feature reflects the appearance of itinerant quasiparticles of momentum ħk that have an unusual dispersion εk(k) of their one-particle energies. Mihailovic et al. have used femtosecond time-domain spectroscopy to demonstrate a change from a polaronic to an itinerant character of the mobile holes on passing from the underdoped to the optimally doped compositions in the YBa2Cu3O6+x system, and the Fermi surface of the itinerant quasiparticles in optimally doped CuO2 sheets has been mapped with photoemission spectroscopy (PES). Most significant, angle-resolved PES as a function of temperature by Norman et al. and Dessau et al. has revealed a massive transfer of spectral weight on cooling from the (π, π) to the (π, 0) directions within a CuO2 sheet. These data indicate a progressive stabilization of the itinerant quasiparticles with k vectors along the Cu–O–Cu bonds of a CuO2 sheet relative to those directed along a tetragonal [110] direction. The εk(k) dispersion becomes extremely flat at the Fermi energy εF in the direction of the Cu–O–Cu bonds, indicating that the quasiparticles of the dominant population at εF have an unusually heavy mass m*.

The origin of the heavy mass m* has not been resolved. Tc is expected to reach a maximum at the crossover from Cooper pairing to Bose condensation of bipolarons. Alexandrov and Mott have explored the bipolaron option most thoroughly. Markiewicz has argued extensively for trapping of the Fermi energy in a van Hove singularity. Although there is considerable evidence that the cuprates are close to the Bose–Einstein condensation regime, the PES data show the charge carriers are itinerant in the superconductive phase. Goodenough and Zhou have suggested that the large m* is due to an unusual electron–lattice (or electron–polarization) interaction that gives rise to vibronic itinerant quasiparticles. The transfer of spectral weight in the angle-resolved PES spectra is consistent with the latter view, as are the data on the pressure dependence of Tc. A vibrational or polarization wave that is hybridized with a traveling-electron wave would be sensitive to changes in the bending angle φ of a (180° − φ) Cu–O–Cu bond. Hydrostatic pressure P decreases φ, and the Tc of the LTO phase increases with P, whereas a dTc/dP = 0 is found for the tetragonal (φ = 0°) phase. Moreover, epitaxial La1.85Sr0.15CuO4 films on SrTiO3 have their CuO2 sheets under tension and the Tc is lowered; those on LaSrAlO4 have their CuO2 sheets under compression and the Tc is raised. The compressive stress built into the films on LaSrAlO4 allows an added hydrostatic pressure achievable in a Cu–Be pressure cell to access at low temperature the tetragonal phase of the optimally doped La1.85Sr0.15CuO4; Tc increased with P to 47 K, where it became P-independent on going from the orthorhombic to the tetragonal phase.

Below Tc, NMR Knight shift and other measurements have established that superconductive particles consist of two spin-paired electrons as in a conventional superconductor. However, a short coherence length ξ0 ≈ 15 Å means that the coulomb repulsion between paired electrons is much stronger in the copper oxides. In a conventional superconductor, weak coulomb interactions result in pair wave functions with s-wave symmetry; the superconductive energy gap 2Δ0 is finite over the entire Fermi surface. The pair wave functions in the CuO2 sheets of the copper oxides have (x2–y2) d-wave symmetry; the energy gap 2Δ0 has nodes along the tetragonal [110] and [1̄10] axes. This symmetry reduces the coulomb repulsion between the paired electrons.
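The contrast between an s-wave gap that is finite everywhere and an (x2–y2) d-wave gap with nodes on the zone diagonals can be illustrated numerically. The sketch below assumes the common square-lattice form Δ(k) = Δ0 (cos kx − cos ky)/2, with the lattice constant set to 1 and kx, ky taken along the Cu–O–Cu bond directions; the text specifies only the symmetry and the node directions, so this particular functional form and the function names are assumptions.

```python
import math


def d_wave_gap(kx: float, ky: float, delta0: float = 1.0) -> float:
    """(x2-y2) d-wave gap on a square lattice: delta0 * (cos kx - cos ky) / 2."""
    return 0.5 * delta0 * (math.cos(kx) - math.cos(ky))


def s_wave_gap(kx: float, ky: float, delta0: float = 1.0) -> float:
    """Isotropic s-wave gap: the same value at every wave vector."""
    return delta0


# Along a Cu-O-Cu bond direction the d-wave gap has its full magnitude.
print(abs(d_wave_gap(math.pi, 0.0)))

# On both diagonals (kx = ky and kx = -ky) the d-wave gap vanishes.
print(d_wave_gap(1.0, 1.0), d_wave_gap(1.0, -1.0))

# The gap changes sign under a 90 degree rotation, as d-wave symmetry requires.
print(d_wave_gap(math.pi, 0.0), d_wave_gap(0.0, math.pi))
```

The sign change between the two bond directions is the feature that the text associates with a reduced coulomb repulsion between the paired electrons.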

d. Overdoped. The overdoped compositions x > 0.25 are not superconductors, and a change from p-type to n-type conduction signals a transfer of spectral weight from the lower and upper Hubbard bands of the x = 0 parent compound to Fermi-liquid states in the gap (U–W) of the parent. Nevertheless, the metallic resistivity remains high with an anomalous temperature dependence, which indicates that the transition from vibronic to Fermi-liquid states may not be complete by x = 0.3.

The decrease in Tc with increasing x in the range 0.22 < x < 0.26 is not smooth; it is characterized by a series of steps typical of phase segregation. Since Tt crosses Tc near this compositional range, these steps may reflect segregation of orthorhombic and tetragonal phases.

3. System YBa2Cu3O6+x

The possibility of practical superconductive devices operating at the boiling point of liquid nitrogen (77 K) captured the imagination of the technical community on the discovery of a superconductive critical temperature of 90 K in YBa2Cu3O6.95. Although other superconductors with a higher Tc, a greater chemical stability, and cleavage planes that simplify fabrication into tapes and wires have since been discovered, the YBa2Cu3O6+x, 0 ≤ x < 1, system continues to be of technical importance because, so far at least, films of YBa2Cu3O6.95 have been able to sustain the highest critical currents.

The structure, shown in Fig. 31, contains CuO2–Y–CuO2 layers and BaO–CuOx–BaO layers. The oxygen atoms of the BaO buckled planes are c-axis apical oxygen atoms of the Cu in the CuO2–Y–CuO2 layers; these Cu all have fivefold oxygen coordination. The Cu of the BaO–CuOx–BaO layers bridge the apical oxygen atoms with 180° O–Cu–O bonds oriented parallel to the c-axis. The x interstitial oxygen atoms in the BaO–CuOx–BaO layers are mobile above 300°C, and their equilibrium concentration depends on the temperature and atmosphere. An air anneal at 400°C is used to obtain the optimally doped YBa2Cu3O6.95 composition. Since the apical oxygen atoms participate in the interstitial oxygen diffusion, it is important to ensure that the thermal history does not leave apical oxygen vacancies, which perturb the periodic potential of the superconductive CuO2–Y–CuO2 layers and suppress Tc. It is interesting that the Y3+ ion may be substituted by any trivalent rare-earth ion (with the exception of Pr) without influencing Tc significantly. Only in the case of Pr is there an important interaction between the lanthanide 4f orbitals and the (x2–y2) band of the CuO2 sheets.

YBa2Cu3O6 is an antiferromagnetic insulator with Cu(II) in the CuO2–Y–CuO2 layers and Cu(I) in the BaO–Cu–BaO layers; antiferromagnetic order between Cu(II) ions sets in at a TN > 500 K. The initial interstitial oxygen atoms enter the BaO–Cu–BaO layers randomly and oxidize the neighboring Cu(I) to Cu(II). However, a threefold-coordinated Cu(II) attracts a second interstitial oxygen atom to form square-coplanar coordination, which initiates the formation of a chain segment. The twofold-coordinated Cu remain Cu(I), so the formation of chain segments initiates oxidation of the CuO2–Y–CuO2 layers. For x < 0.3, the chain segments remain disordered, so the crystallographic symmetry is tetragonal; but TN drops precipitously with increasing oxidation of the CuO2–Y–CuO2 layers in the interval 0.1 < x < 0.25 (see Fig. 37). For x ≥ 0.4, the crystallographic symmetry is orthorhombic as a result of an alignment of the chains along the orthorhombic b-axis. The orthorhombic compositions are superconductors with a Tc that increases with x in two steps, a Tc ≈ 60 K plateau appearing in the interval 0.6 < x < 0.8. At x = 0.5, the chains order, alternating with Cu(I) b-axis rows as illustrated in Fig. 38. If the fully formed chains contained all Cu(II), the number of holes per Cu atom in the CuO2–Y–CuO2 layers would be 0.15 at x = 0.5, close to the optimal doping. However, the fully formed chains are oxidized beyond Cu(II), which reduces the number of holes per Cu atom in the CuO2–Y–CuO2 layers and makes the chains metallic conductors. As first pointed out by Cava, the chains act as charge reservoirs for the superconductive CuO2–Y–CuO2 layers. In YBa2Cu3O6.95, the CuO2–Y–CuO2 layers are optimally doped with about 0.18 hole/Cu atom and the chains in the BaO–CuO0.95–BaO layers are also superconductive.

FIGURE 37 Phase diagram for the system YBa2Cu3O6+x.

FIGURE 38 Ideal oxygen ordering of chains in YBa2Cu3O6.5.
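The charge-reservoir role of the chains can be checked with formal charge balance. The sketch below assumes purely ionic formal charges (Y3+, Ba2+, O2−) and treats the hole count in the sheets as an input taken from the text; the function names are hypothetical, and the result is an average formal valence for the single Cu of the BaO–CuOx–BaO layer, not a measured quantity.

```python
def total_cu_charge(x: float) -> float:
    """Sum of formal Cu charges per YBa2Cu3O(6+x) formula unit."""
    oxygen = 2.0 * (6.0 + x)      # (6 + x) O2- anions
    other_cations = 3.0 + 2.0 * 2.0  # one Y3+ and two Ba2+
    return oxygen - other_cations


def chain_cu_valence(x: float, plane_holes_per_cu: float) -> float:
    """Average formal valence of the one Cu in the BaO-CuOx-BaO layer.

    The two sheet Cu are taken as Cu(II) plus plane_holes_per_cu each.
    """
    plane_charge = 2.0 * (2.0 + plane_holes_per_cu)
    return total_cu_charge(x) - plane_charge


# Parent compound: two Cu(II) in the sheets leave Cu(I) in the BaO-Cu-BaO layer.
print(chain_cu_valence(0.0, 0.0))

# Optimally doped composition, with the sheet hole count quoted in the text.
print(chain_cu_valence(0.95, 0.18))
```

For x = 0.95 and 0.18 hole per sheet Cu, the chain Cu comes out with a formal valence of about 2.5, consistent with the statement that the fully formed chains are oxidized beyond Cu(II).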

Orthorhombic symmetry and superconductive chains are not determinants of the superconductivity of the CuO2–Y–CuO2 layers. By doping an equal amount of Ca for Y and La for Ba, the total hole concentration is kept constant. By 40% doping of Ca and La, the La in the BaO–CuOx–BaO layers breaks up the chains into randomly oriented chain segments, changing the symmetry to tetragonal and suppressing superconductivity in the chain segments. As a result, the mobility of H2O or CO2 species in the nonsuperconductive layers is reduced, which suppresses chemical degradation at room temperature on exposure to the atmosphere, but the superconductive transition temperature is reduced only from 90 to 78 K.

Figure 39 shows the phase diagram of applied magnetic field H vs temperature T for clean YBa2Cu3O6.95 with H parallel to the c-axis. The copper-oxide superconductors are all strongly Type II, and Hc1(T) marks the transition between the Meissner phase and the vortex state. The vortex solid consists of an array of stationary vortices. The ability to grow crystals with a good surface quality has allowed imaging of the vortex lattice with scanning tunneling microscopy (STM). The vortex lattice does not show long-range order into the hexagonal close-packed structure; it represents a glassy state rather than a regular lattice. However, locally the flux lines are arranged in an oblique lattice, with approximately equal primitive lattice vectors forming an angle between them of 77 ± 5°. Moreover, the shape of the vortex cores is elliptical, not circular, with the long axis along an in-layer orthorhombic axis. These features reflect the anisotropy within an a–b plane that is induced by orientation of the chains of the BaO–CuOx–BaO layers along the orthorhombic b-axis.

FIGURE 39 Upper and lower critical magnetic fields Hc2(T) and Hc1(T) and critical field for vortex melting Hm(T) for YBa2Cu3O6.95.

The strength of the pinning of the vortex solid in copper-oxide superconductors depends on the coupling between layers. With weak coupling, the vortex lattice of one layer may be displaced relative to that of an adjacent layer, thereby bending the flux trajectory through the vortex cores. The interlayer vortex coupling is relatively strong in YBa2Cu3O6.95. Nevertheless, the vortex solid melts at an Hm(T) < Hc2(T); the melting transition is weakly first-order. The vortices of the vortex liquid are mobile; moving vortices dissipate energy and introduce a finite resistance. Associated with Hm(T) is an irreversibility line Hirr(T) < Hm(T). Both Hirr(T) and Hm(T) decrease sharply with loss of oxygen from the BaO–CuOx–BaO layers.

Pinning of the vortex solid is an extrinsic phenomenon, and considerable effort has been given to finding ways to increase the pinning. In films, surface roughness and the deposition of a protective (Y1−xCax)(Ba2−xLax)Cu3O7−δ overlayer increase Hm(T) and therefore the critical current.

The brittleness of the ceramics and the two-dimensional superconductive layers makes fabrication of flexible tapes or wires a formidable challenge. Alignment of the layers from grain to grain is a critical requirement that is most easily achieved by deposition of films on a flexible metallic tape.

4. Layers with Three CuO2 Sheets

The highest values of Tc have been obtained with structures containing layers with three CuO2 sheets. Figure 35 shows the structure of HgBa2Ca2Cu3O8+δ, which has a Tc = 135 K at ambient pressure and a Tc = 164 K under a quasi-hydrostatic pressure of 30 GPa. The lattice oxygen atoms of the BaO–HgOδ–BaO nonsuperconductive layers supply the apical oxygen atoms of the two outer sheets of the CuO2–Ca–CuO2–Ca–CuO2 superconductive layers; the Cu atoms of the inner CuO2 plane have square-coplanar oxygen coordination. The Cu–O–Cu bond angles within the outer CuO2 sheets approach the optimal 180°. The value of Tc at ambient pressure decreases from 135 K to 94 K as the number of O4 interstitial oxygen atoms in the HgOδ planes decreases from 0.18 to 0.10 per Hg atom. The O4 oxygen atoms oxidize the superconductive layers; the O5 interstitial oxygen atoms of the HgOδ planes do not. This result implies near-optimal doping with 0.36 hole per formula unit in the superconductive layers. These holes would be distributed predominantly in the outer sheets with Cu in fivefold oxygen coordination, which would give a maximum hole concentration of 0.18/Cu atom in these sheets. Optimal doping is thus seen to correspond well with that in other p-type copper-oxide superconductors. Why pressure increases the Tc in HgBa2Ca2Cu3O8+δ is not known, but it is reasonable to assume that a redistribution of holes within the superconductive layers increases the coupling between the outer CuO2 sheets.
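The hole-counting argument for HgBa2Ca2Cu3O8+δ can be checked with a few lines of arithmetic. The sketch below is illustrative only; it assumes that each O4 interstitial oxygen contributes two holes (which reproduces the quoted 0.36 hole per formula unit at 0.18 O4 per Hg) and that all holes reside in the two outer CuO2 sheets, each holding one Cu atom per formula unit. The function name and its parameters are hypothetical.

```python
import math


def holes_per_outer_cu(o4_per_hg, holes_per_oxygen=2, outer_sheets=2):
    """Upper bound on holes per Cu atom in the outer CuO2 sheets.

    Assumptions (not from a measurement): every O4 interstitial oxygen
    creates `holes_per_oxygen` holes, and all of them sit in the outer
    sheets, each of which has one Cu atom per formula unit.
    """
    holes_per_formula_unit = holes_per_oxygen * o4_per_hg
    return holes_per_formula_unit / outer_sheets


# O4 contents quoted in the text for Tc = 135 K and Tc = 94 K
near_optimal = holes_per_outer_cu(0.18)
underdoped = holes_per_outer_cu(0.10)

print(f"0.18 O4 per Hg: {near_optimal:.2f} hole per outer-sheet Cu")
print(f"0.10 O4 per Hg: {underdoped:.2f} hole per outer-sheet Cu")
```

Under these assumptions the maximum outer-sheet hole concentration equals the O4 content per Hg atom, which is why 0.18 O4 per Hg corresponds to 0.18 hole/Cu atom.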

5. Electron Superconductors

The equilibrium Cu–O bond length for Cu(II) ions in square-coplanar oxygen coordination is about 1.93 Å. To dope n-type a CuO2 plane without apical oxygen atoms by reducing Cu(II) to Cu(I), it has been necessary to place the CuO2 planes under tension so as to make the Cu–O bond of the compound longer than 1.93 Å. This feature is illustrated by the two copper-oxide structures that exhibit n-type superconductivity.

FIGURE 40 Charge-transfer gap vs Cu–O bond length for Ln2CuO4 oxides.

a. T′-Ln2−xCexCuO4. The parent T′ phases Ln2CuO4 have been prepared at atmospheric pressure for Ln = Pr, ..., Gd; they have the structure of Fig. 30a, which has isolated CuO2 planes having no apical oxygen. However, care must be taken to order the oxygen in the fluorite layers, as fivefold oxygen coordination at a few Cu atoms would perturb the periodic potential. Sensitivity to perturbations of the periodic potential is another indicator that the charge carriers are itinerant in both the p-type and the n-type superconductors.

The parent compounds contain only Cu(II) and are antiferromagnetic insulators such as the parent La2CuO4 compound with the T/O structure. Figure 40 shows, for room temperature, the magnitude of the energy gap Eg = U − W vs the Cu–O bond length, which remains longer than the equilibrium bond length over the entire series of n-type superconductors. In each T′ system Ln2−xMxCuO4, M = Ce or Th, n-type superconductivity is found only in a narrow compositional range 0.10 ≤ x ≤ 0.18, and at larger x a nonsuperconductive metallic state persists to lowest temperatures. Thus the n-type superconductors, like the p-type superconductors, appear as a distinguishable thermodynamic phase at a crossover from an antiferromagnetic-insulator to a metallic phase (Fig. 41). However, there are also significant differences between the p-type and n-type superconductors. For example, the charge carriers in the underdoped T′ systems are conventional small polarons and TN decreases only slowly with increasing x in a manner typical of a simple dilution with nonmagnetic Cu(I) ions. Moreover, the transition from antiferromagnetic semiconductor to superconductor appears to be a conventional first-order phase change occurring at a critical charge-carrier concentration xc. With decreasing Cu–O bond length, there is a systematic increase in xc and a decrease in the Ce solubility limit xl that results in a narrowing of the superconductive phase field until, in Gd2CuO4, it disappears altogether. The appearance of n-type superconductivity is restricted not only to compounds with Cu in fourfold, square-coplanar coordination, but also to those where the Cu–O bonds of the CuO2 planes are stretched sufficiently beyond 1.93 Å to give an xc < xl.

FIGURE 41 Variations of Tc with x for (a) LaNd1−xCexCuO4 and (b) Nd2−xCexCuO4. Shaded areas refer to two-phase regions.

b. Infinite layers. The infinite-layer structure in Fig. 42 was first stabilized in the compound Ca0.86Sr0.14CuO2; it has an equilibrium Cu–O bond length of 1.93 Å at room temperature. Synthesis at atmospheric pressure allows little variation in the Ca/Sr ratio, and attempts to dope the compound either p-type or n-type were unsuccessful. On the other hand, SrCuO2 can be prepared under high pressure; it has a Cu–O bond length stretched to 1.965 Å, which satisfies the criterion for n-type doping. Therefore, Sr1−xLnxCuO2 (Ln = La, Pr, Nd) were prepared under high pressure; they proved to be n-type superconductors with a Tc ≈ 30 K.

6. Mechanism: An Open Question

The pairing mechanism in the copper-oxide superconductors remains an open question; a consensus on the character of the charge carriers in the normal state has yet to be reached. Most theorists have investigated the role of spin–spin exchange interactions without consideration of electron coupling to the lattice or to the oxygen polarization. Since the Cu(III) are diamagnetic, these efforts have been able to justify the separation of holes into charge stripes, but the charge separation can be achieved by other forces. To date, a convincing description of the high-Tc phenomenon has yet to emerge. Nevertheless, experiment has shown that spin fluctuations persist into the superconductive phase, and inelastic neutron scattering has revealed a commensurate (π, π) resonance peak in the spectrum of the antiferromagnetic susceptibility χ(q, ω) that has a half-width in momentum space that varies linearly with Tc. These measurements define a characteristic velocity that is lower than a typical electron velocity at the Fermi energy and an order of magnitude smaller than a spin-wave velocity. These spin fluctuations could be associated either with mobile stripes or with slowly moving electron-density fluctuations as in a vibronic state. The data do not reveal the driving force for the formation of these density fluctuations.

FIGURE 42 Comparison of the (a) atmospheric-pressure and (b) high-pressure forms of SrCuO2.

ACKNOWLEDGMENT

Support of this work by the R. A. Welch Foundation, Houston, Texas, is gratefully acknowledged.

SEE ALSO THE FOLLOWING ARTICLES

BONDING AND STRUCTURE IN SOLIDS • ELECTRONS IN SOLIDS • SUPERCONDUCTING DEVICES • SUPERCONDUCTIVITY



Encyclopedia of Physical Science and Technology EN016B-774 July 31, 2001 17:48

Thermoelectricity
Timothy P. Hogan
Michigan State University

I. Introduction and Basic Thermoelectric Effects
II. Thermodynamic Relationships
III. Thermodynamics of an Irreversible Process
IV. Statistical Relationships
V. Applications
VI. Summary

GLOSSARY

Boltzmann equation An equation based on the Fermi distribution equation under nonequilibrium conditions. The Boltzmann equation describes the rate of change of the distribution function due to forces, concentration gradients, and carrier scattering.

Fermi distribution A function describing the probability of occupancy of a given energy state for a system of particles based on the Pauli exclusion principle.

Fermi level The energy level which exhibits a 50% probability of being occupied.

Joule heating Heating due to I²R losses.

Onsager relations A set of simultaneous equations that describe the macroscopic interactions between “forces” and “flows” within a thermoelectric system.

Peltier effect Absorption or evolution of thermal energy at a junction between dissimilar materials through which current flows.

Seebeck effect Open-circuit voltage generated by a circuit consisting of at least two dissimilar conductors when a temperature gradient exists within the circuit between the measuring and the reference junctions.

Thermocouple A pair of dissimilar conductors joined at one set of ends to form a measuring junction.

Thermoelectric cooler A heat pump designed from thermoelectric materials typically configured in an array as a series of thermocouples with the junction exposed.

Thermopower This is defined here as the absolute Seebeck coefficient and corresponds to the rate of change of the thermoelectric voltage with respect to the temperature of a single conductor with a temperature gradient between the ends.

Thomson effect The absorption or evolution of thermal energy from a single homogeneous conductor through which electric current flows in the presence of a temperature gradient along the conductor.

THE FIELD of thermoelectricity involves the study of characteristics resulting from electrical phenomena occurring in conjunction with a flow of heat. It includes flows of electrical current and thermal current and the interactions between them.


I. INTRODUCTION AND BASIC THERMOELECTRIC EFFECTS

In 1822, Seebeck reported on “the magnetic polarization of metals and ores produced by a temperature difference” (Joffe, 1957). By placing two conductors in the configuration shown in Fig. 1, Seebeck observed a deflection of the magnetic needle in his measurement apparatus (Gray, 1960). The deflection was dependent on the temperature difference between junctions and the materials used for the conductors. Shortly after this, Oersted discovered the interaction between an electric current and a magnetic needle. Many scientists subsequently researched the relationship between electric currents and magnetic fields including Ampere, Biot, Savart, Laplace, and others. It was then suggested that the observation by Seebeck was not caused by a magnetic polarization, but due to a thermoelectric current flowing in the closed-loop circuit.

Seebeck did not accept this explanation, and in an attempt to refute it, he reported measurements on a number of solid and liquid metals, alloys, minerals, and semiconductors. The magnetic polarization hypothesis was incorrect as can be seen in the open-circuit configuration of his experiment.

Experimentally, a voltage ($\Delta V$) at the open-circuit terminals is measured when a temperature gradient exists between junctions such that

$$
\Delta V = \int_{T_1}^{T_2} S_{AB}\, dT, \tag{1}
$$

where $S_{AB}$ is the Seebeck coefficient for the two conductors, which is defined as being positive when a positive voltage is measured for $T_1 < T_2$. The voltage is measured across terminals maintained at a constant temperature $T_0$. For this voltage to appear in the open-circuit configuration (Fig. 2), there must exist a current which flows in the closed-circuit configuration. Furthermore, in the open-circuit configuration, Seebeck would no longer observe a deflection of the magnetic needle, a result that would not be expected if a magnetic polarization were responsible.
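Equation (1) can be evaluated numerically once $S_{AB}(T)$ is known. The following is a minimal sketch using the trapezoidal rule; the linear form of $S_{AB}(T)$ and its coefficients are hypothetical values chosen only for illustration, not data for any real couple.

```python
import math


def seebeck_voltage(s_ab, t1, t2, steps=1000):
    """Approximate Delta V = integral from T1 to T2 of S_AB(T) dT (Eq. 1)
    with the composite trapezoidal rule."""
    h = (t2 - t1) / steps
    total = 0.5 * (s_ab(t1) + s_ab(t2))
    for k in range(1, steps):
        total += s_ab(t1 + k * h)
    return total * h


def s_ab_linear(temperature):
    """Hypothetical couple: S_AB = a + b*T, in V/K (illustrative only)."""
    a = 10e-6    # V/K
    b = 0.02e-6  # V/K^2
    return a + b * temperature


delta_v = seebeck_voltage(s_ab_linear, 300.0, 400.0)
print(f"Delta V = {delta_v * 1e3:.3f} mV")
```

With a positive $S_{AB}$ and $T_1 < T_2$ the voltage is positive, in line with the sign convention stated above; swapping the two temperatures reverses the sign.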

The diligence of his measurements was verified by the confirmation of his values years later by Justi and Meisner as well as by Telkes, who showed, 125 years after Seebeck’s measurements, that the best couple for energy conversion was formed using ZnSb and PbS, which were two materials examined by Seebeck.

FIGURE 1 Closed-circuit Seebeck effect.

FIGURE 2 The open-circuit Seebeck effect.

Twelve years after Seebeck’s discovery, a scientist and watchmaker named Jean Peltier reported a temperature anomaly at the junction of two dissimilar materials as a current was passed through the junction. It was unclear what caused this anomaly, and while Peltier attempted to explain it on the basis of the conductivities and/or hardness of the two materials, Lenz removed all doubt in 1838 with one simple experiment. By placing a droplet of water in a dimple at the junction between rods of bismuth and antimony, Lenz was able to freeze the water and subsequently melt the ice by changing the direction of current through the junction. In a way, Lenz had made the first thermoelectric cooler. The rate of heat ($Q$) absorbed or liberated from the junction was later found to be proportional to the current, or

$$
Q = \Pi \cdot I, \tag{2}
$$

where the proportionality constant ($\Pi$) was named the Peltier coefficient.

Near this time, the field of electromagnetics was being formed and captured much attention in the scientific community. Therefore, another 16 years passed before Thomson (later called Lord Kelvin) reasoned that if the current through the two junctions in Fig. 1 produced only Peltier heating, then the Peltier voltage must equal the Seebeck voltage and both must be linearly proportional to the temperature. Since this was not observed experimentally, he reasoned that there must be a third reversible process occurring. This third process is the evolution or absorption of heat whenever current is passed through a single homogeneous conductor along which a temperature gradient exists, or in equation form,

$$
Q = \tau I \frac{dT}{dx}, \tag{3}
$$

where $Q$ is the rate of heat absorbed or liberated along the conductor, $\tau$ is the Thomson coefficient, $I$ is the current through the conductor, and $dT/dx$ is the temperature gradient maintained along the length of the conductor.


Thomson then applied the first and second laws of thermodynamics to the Seebeck, Peltier, and Thomson effects to find the Kelvin relationships

$$
\Pi = S_{AB} T, \tag{4}
$$

$$
\frac{dS_{AB}}{dT} = \frac{\tau_A - \tau_B}{T}, \tag{5}
$$

where the subscripts $A$ and $B$ correspond to the two materials in Fig. 2. The second Kelvin relation suggests that the Seebeck coefficient for two materials forming a junction can be represented as the difference between quantities based on the properties of the individual materials making up the junction. Integration of the second Kelvin relation gives

$$
\int dS_{AB} = \int \frac{\tau_A - \tau_B}{T}\, dT = \int \frac{\tau_A}{T}\, dT - \int \frac{\tau_B}{T}\, dT \tag{6}
$$

or

$$
S_{AB} = \int \frac{\tau_A}{T}\, dT - \int \frac{\tau_B}{T}\, dT. \tag{7}
$$

Defining the first term on the right-hand side as the “absolute” Seebeck coefficient of material A and the second term as the “absolute” Seebeck coefficient of material B, we find that the Seebeck coefficient for a junction is equal to the difference in “absolute” Seebeck coefficients of the individual materials making the junction. This is a very significant result, as measurements of the individual materials can be used to predict how junctions formed from various combinations of materials will behave, thus removing the need to measure every possible combination of materials. The “absolute” Seebeck coefficient or thermoelectric power of a material, hereafter referred to simply as the thermopower of the material, can be found for material A if the thermopower of material B is known or if the thermopower of material B is zero. A material in the superconducting state has a thermopower of zero, and once a material is calibrated against a superconductor, it can then be used as a reference material to measure more materials. This has been done for several pure materials such as lead, gold, and silver (Roberts, 1977; Wendling et al., 1993).
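The practical use of absolute thermopowers can be sketched in a few lines: the junction coefficient follows from a difference, and the first Kelvin relation, Eq. (4), then gives the Peltier coefficient and, through Eq. (2), the Peltier heat rate. The thermopower values, temperature, and current below are hypothetical numbers for illustration, and the function names are not taken from any library.

```python
import math


def junction_seebeck(s_a, s_b):
    """Junction Seebeck coefficient as the difference of the absolute
    thermopowers of materials A and B (V/K)."""
    return s_a - s_b


def peltier_coefficient(s_ab, temperature):
    """First Kelvin relation, Eq. (4): Pi = S_AB * T (volts)."""
    return s_ab * temperature


def peltier_heat_rate(s_ab, temperature, current):
    """Eq. (2): Q = Pi * I (watts)."""
    return peltier_coefficient(s_ab, temperature) * current


# Hypothetical absolute thermopowers (V/K), for illustration only
S_A = 5e-6
S_B = -3e-6
S_SUPERCONDUCTOR = 0.0  # zero thermopower in the superconducting state

s_ab = junction_seebeck(S_A, S_B)
pi_ab = peltier_coefficient(s_ab, 300.0)
q_rate = peltier_heat_rate(s_ab, 300.0, 2.0)

# Measuring A against a superconducting reference returns S_A itself
s_a_measured = junction_seebeck(S_A, S_SUPERCONDUCTOR)

print(f"S_AB = {s_ab * 1e6:.1f} uV/K, Pi = {pi_ab * 1e3:.2f} mV, Q = {q_rate * 1e3:.2f} mW")
```

The last calculation mirrors the calibration idea in the text: against a reference with zero thermopower, the measured junction coefficient is the absolute thermopower of the other material.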

Further understanding of the basic thermoelectric properties and the relationships between them can be found through comparisons of macroscopic and microscopic derivations. The Onsager relations formulate various flows (consisting of matter or energy) as functions of the forces that drive them, thus describing macroscopic observations of materials. Another useful technique for understanding these basic thermoelectric properties utilizes semiclassical statistical mechanics to describe the microscopic processes. Comparisons between the macroscopic and the microscopic analyses can be used in deriving many useful formulas for calculating thermoelectric properties of various materials. The following sections are dedicated to developing the macroscopic and microscopic analyses.

II. THERMODYNAMIC RELATIONSHIPS

As shown by the Seebeck effect, when a temperature gradient is placed over the length of a sample, carrier flow will be predominantly from the hot side to the cold side. This indicates that a temperature gradient, $\nabla T$, is a force that can cause a flow of carriers. It is well known that applying the force of an electric potential gradient, $\nabla V$, can also induce carrier flow. In 1931, Onsager developed a method of relating the flows of matter or energy within a system to the forces present. In this method the forces are assumed to be sufficiently small so that a linear relationship between the forces, $X_i$, and the corresponding flows, $J_i$, can be written.

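$$
\begin{aligned}
J_1 &= L_{11}X_1 + L_{12}X_2 + \cdots + L_{1n}X_n, \\
J_2 &= L_{21}X_1 + L_{22}X_2 + \cdots + L_{2n}X_n, \\
J_3 &= L_{31}X_1 + L_{32}X_2 + \cdots + L_{3n}X_n,
\end{aligned} \tag{8}
$$

or

$$
J_i = \sum_{m=1}^{n} L_{im}X_m \quad (i = 1, 2, 3, \ldots, n). \tag{9}
$$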
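Equation (9) is a matrix–vector product, which the following sketch makes explicit. The coefficient matrix and the forces are arbitrary illustrative numbers, not values for any material, and the helper name `flows` is hypothetical.

```python
def flows(coefficients, forces):
    """Evaluate J_i = sum over m of L_im * X_m (Eq. 9) for every flow i.

    `coefficients` is a list of rows [L_i1, ..., L_in];
    `forces` is the list [X_1, ..., X_n].
    """
    return [
        sum(l_im * x_m for l_im, x_m in zip(row, forces))
        for row in coefficients
    ]


# Illustrative two-force, two-flow system
L = [
    [2.0, 0.5],  # L_11, L_12
    [0.5, 1.0],  # L_21, L_22
]
X = [1.0, -2.0]  # X_1, X_2

J = flows(L, X)
print(J)
```

The off-diagonal entries are what couple one force to the other flow: here the second force changes the first flow through the `L_12` term even though `L_11` acts only on the first force.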

For carrier and heat flow as described above, the Onsager relationships can be written

$$
\begin{aligned}
\mathbf{J} &= L_{11}\nabla V + L_{12}\nabla T, \\
\mathbf{J}_Q &= L_{21}\nabla V + L_{22}\nabla T,
\end{aligned} \tag{10}
$$

where $\mathbf{J}$ is the current density (electric charge flow), and $\mathbf{J}_Q$ is the heat flux density (heat flow). Without a temperature gradient ($\nabla T = 0$), a heat flux of zero would be expected, contrary to what Eq. (10) would indicate, since the term $L_{21}\nabla V$ remains. It is, therefore, important to understand further the primary coefficients $L_{ii}$ and interaction coefficients $L_{ij}$ ($i \neq j$) linking these equations. To do so requires a consideration of the thermodynamics of an irreversible process (one in which the change in entropy is greater than zero, $\Delta S > 0$).

III. THERMODYNAMICS OF AN IRREVERSIBLE PROCESS

The general application of the Onsager relationship was derived by Harman and Honig (1967) and is summarized here. For a constant electric potential, $V$, throughout the sample,

$$
dQ = T\,dS = dU + P\,dV - \sum_i \mu_i\, dn_i, \tag{11}
$$

where $Q$ is the heat energy density, $S$ is the entropy density, $U$ is the internal energy density, $P$ is the pressure, $\mu_i$ is the chemical potential of the particle species, and $n_i$ is the particle density. The magnitude of the differential volume, $dV$, is zero since each quantity has been specified per unit volume. This can be combined with the total energy density, $E$, given by

$$
E = U + V\sum_i Z_i q n_i, \tag{12}
$$

where $q$ is the magnitude of the electronic charge ($1.602 \times 10^{-19}$ C), $V$ is an externally applied bias, and $Z_i$ is the number and sign of the charges on the $i$th particle species. For example, an electron would have charge $Z_e q$, where $Z_e = -1$. The time derivative of Eq. (12) gives

$$
\frac{\partial E}{\partial t} = \frac{\partial U}{\partial t} + \frac{\partial V}{\partial t}\sum_i Z_i q n_i + V\sum_i Z_i q \frac{\partial n_i}{\partial t}. \tag{13}
$$

From Eq. (11) with $dV = 0$,

$$
dU = T\,dS + \sum_i \mu_i\, dn_i. \tag{14}
$$

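Taking the time derivative of (14) gives

$$
\frac{\partial U}{\partial t} = T\frac{\partial S}{\partial t} + S\frac{\partial T}{\partial t} + \sum_i \mu_i \frac{\partial n_i}{\partial t} + \sum_i n_i \frac{\partial \mu_i}{\partial t}. \tag{15}
$$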

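Using (15) in (13) yields

$$
\begin{aligned}
\frac{\partial E}{\partial t} ={}& T\frac{\partial S}{\partial t} + S\frac{\partial T}{\partial t} + \sum_i \mu_i \frac{\partial n_i}{\partial t} + \sum_i n_i \frac{\partial \mu_i}{\partial t} \\
&+ \frac{\partial V}{\partial t}\sum_i Z_i q n_i + V\sum_i Z_i q \frac{\partial n_i}{\partial t}.
\end{aligned} \tag{16}
$$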

This can be simplified by considering the Gibbs–Duhem relation (Guggenheim, 1957),

$$
S\frac{\partial T}{\partial t} + \sum_i n_i \frac{\partial \mu_i}{\partial t} = 0 \tag{17}
$$

and

$$
\bar{\mu}_i = \mu_i + Z_i qV, \tag{18}
$$

where $\bar{\mu}_i$ is the electrochemical potential, $\mu_i$ is the chemical potential, and $Z_i qV$ is the electrostatic potential energy. The relationship among the chemical potential, $\mu$, the electrochemical potential, $\bar{\mu}$, and the temperature for electrons is shown in Fig. 3, where the right side of the sample is at a potential of $-V_1$ relative to the left.

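Equation (16) then reduces to

$$
\frac{\partial E}{\partial t} = T\frac{\partial S}{\partial t} + \sum_i \bar{\mu}_i \frac{\partial n_i}{\partial t} + \frac{\partial V}{\partial t}\sum_i Z_i q n_i. \tag{19}
$$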

FIGURE 3 The density of states for a metal at a temperature T1 > 0 K on the left and at a lower temperature, T2 < T1, on the right.

The rate of change in the particle density, $n_i$, is governed by the equation of continuity,

$$
\frac{\partial n_i}{\partial t} = \left(\frac{\partial n_i}{\partial t}\right)_s - \nabla \cdot \mathbf{J}_i, \tag{20}
$$

which states that the total rate of change in $n_i$ is equal to the local particle generation rate, or source rate, minus the transport of the $i$th species across the boundary of the differential volume (or local system) of interest. The first term on the right-hand side of the equation is the source term and represents the particle generation (or capture) rate through chemical reactions, for example. The last term is found using Gauss’s theorem,

$$
\iint \mathbf{J}_i \cdot \mathbf{n}\, dA = \iiint \nabla \cdot \mathbf{J}_i\, dV, \tag{21}
$$

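where $\mathbf{J}_i$ is the flux vector equal to the number of particles of type $i$ moving past a unit cross section per unit time in the direction of $\mathbf{J}_i$, and $\mathbf{n}$ represents a unit vector outward normal from an element of area $dA$ on the boundary surface. This represents the total outward flux of the $i$th particle species from the differential volume of interest. The particle species over which the summation in Eq. (19) is evaluated includes core species, $L$, which form the host lattice; neutral donors, $D$; ionized donors, $D^+$; neutral acceptors, $A$; ionized acceptors, $A^-$; electrons in the conduction band, $n$; and holes in the valence band, $p$. Therefore,

$$
\begin{aligned}
\frac{1}{T}\sum_i \bar{\mu}_i \frac{\partial n_i}{\partial t} = \frac{1}{T}\Bigg( & \bar{\mu}_L \frac{\partial n_L}{\partial t} + \bar{\mu}_D \frac{\partial n_D}{\partial t} + \bar{\mu}_{D^+} \frac{\partial n_{D^+}}{\partial t} + \bar{\mu}_A \frac{\partial n_A}{\partial t} + \bar{\mu}_{A^-} \frac{\partial n_{A^-}}{\partial t} \\
& + \bar{\mu}_n \frac{\partial n_n}{\partial t} + \bar{\mu}_p \frac{\partial n_p}{\partial t} \Bigg).
\end{aligned} \tag{22}
$$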

Equation (20) can now be used for each term on the right-hand side of (22). Some simplification can be readily made, however, when the species $L$, $D$, $D^+$, $A$, and $A^-$ are assumed to be immobile such that

$$
\mathbf{J}_L = \mathbf{J}_D = \mathbf{J}_{D^+} = \mathbf{J}_A = \mathbf{J}_{A^-} = 0. \tag{23}
$$

Furthermore, the lattice will not be affected by local transformations, and

$$
\left(\frac{\partial n_L}{\partial t}\right)_s = \frac{\partial n_L}{\partial t} = 0. \tag{24}
$$

Additional relationships can be found to simplify (22) further by identifying the different mechanisms for generation of electrons, $n$, or holes, $p$, as follows:

$$
D \Rightarrow D^+ + n, \tag{25}
$$

$$
A \Rightarrow A^- + p, \tag{26}
$$

$$
{} \Rightarrow n + p. \tag{27}
$$

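These reactions are reversible such that the time rate of change of ionized and unionized donors and acceptors must be considered in (22). Identifying the reactions in (25), (26), and (27) as I, II, and III, respectively, the following reaction velocities can be written

$$
\begin{aligned}
-\left(\frac{\partial n_D}{\partial t}\right)_s &= \left(\frac{\partial n_{D^+}}{\partial t}\right)_s = \left(\frac{\partial n_n}{\partial t}\right)_{\mathrm{I}} = \nu_{\mathrm{I}}, \\
-\left(\frac{\partial n_A}{\partial t}\right)_s &= \left(\frac{\partial n_{A^-}}{\partial t}\right)_s = \left(\frac{\partial n_p}{\partial t}\right)_{\mathrm{II}} = \nu_{\mathrm{II}}, \\
\left(\frac{\partial n_n}{\partial t}\right)_{\mathrm{III}} &= \left(\frac{\partial n_p}{\partial t}\right)_{\mathrm{III}} = \nu_{\mathrm{III}}.
\end{aligned} \tag{28}
$$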

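Therefore, (22) becomes

$$
\begin{aligned}
\frac{1}{T}\sum_i \bar{\mu}_i \frac{\partial n_i}{\partial t} = \frac{1}{T}\Bigg\{ & \bar{\mu}_D \left(\frac{\partial n_D}{\partial t}\right)_s + \bar{\mu}_{D^+} \left(\frac{\partial n_{D^+}}{\partial t}\right)_s + \bar{\mu}_A \left(\frac{\partial n_A}{\partial t}\right)_s + \bar{\mu}_{A^-} \left(\frac{\partial n_{A^-}}{\partial t}\right)_s \\
& + \bar{\mu}_n \left[\left(\frac{\partial n_n}{\partial t}\right)_{\mathrm{I}} + \left(\frac{\partial n_n}{\partial t}\right)_{\mathrm{III}} - \nabla \cdot \mathbf{J}_n\right] \\
& + \bar{\mu}_p \left[\left(\frac{\partial n_p}{\partial t}\right)_{\mathrm{II}} + \left(\frac{\partial n_p}{\partial t}\right)_{\mathrm{III}} - \nabla \cdot \mathbf{J}_p\right] \Bigg\}.
\end{aligned} \tag{29}
$$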

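From the relations in (28), Eq. (29) can be written in terms of the reaction velocities, $\nu_{\mathrm{I}}$, $\nu_{\mathrm{II}}$, and $\nu_{\mathrm{III}}$, as follows:

$$
\begin{aligned}
\frac{1}{T}\sum_i \bar{\mu}_i \frac{\partial n_i}{\partial t} = \frac{1}{T}\big\{ & (-\bar{\mu}_D + \bar{\mu}_{D^+} + \bar{\mu}_n)\nu_{\mathrm{I}} + (-\bar{\mu}_A + \bar{\mu}_{A^-} + \bar{\mu}_p)\nu_{\mathrm{II}} + (\bar{\mu}_n + \bar{\mu}_p)\nu_{\mathrm{III}} \\
& - \bar{\mu}_n \nabla \cdot \mathbf{J}_n - \bar{\mu}_p \nabla \cdot \mathbf{J}_p \big\}.
\end{aligned} \tag{30}
$$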

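This can be substituted into (19) to give

$$
\begin{aligned}
\frac{\partial E}{\partial t} ={}& T\frac{\partial S}{\partial t} + \big\{ A_{\mathrm{I}}\nu_{\mathrm{I}} + A_{\mathrm{II}}\nu_{\mathrm{II}} + A_{\mathrm{III}}\nu_{\mathrm{III}} - \bar{\mu}_n \nabla \cdot \mathbf{J}_n - \bar{\mu}_p \nabla \cdot \mathbf{J}_p \big\} \\
&+ \frac{\partial V}{\partial t}\sum_i Z_i q n_i,
\end{aligned} \tag{31}
$$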

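where the affinities $A_{\mathrm{I}}$, $A_{\mathrm{II}}$, and $A_{\mathrm{III}}$ are defined as

$$
A_{\mathrm{I}} \equiv -\bar{\mu}_D + \bar{\mu}_{D^+} + \bar{\mu}_n, \quad A_{\mathrm{II}} \equiv -\bar{\mu}_A + \bar{\mu}_{A^-} + \bar{\mu}_p, \quad A_{\mathrm{III}} \equiv \bar{\mu}_n + \bar{\mu}_p. \tag{32}
$$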

These general derivations can now be applied to more specific cases by solving for the energy flux term on the left-hand side of the equation using the appropriate approximations for the material under consideration.

A. Metals

In metals, the energy density term $\partial E/\partial t$ can be viewed as composed of four contributions.

• The rate at which an externally applied field delivers energy to the local system.

• Two terms arise from the rate of change in the electrostatic energy either due to a change in the charge concentration or due to a change in the potential, $V$.

• Electrons in the higher-energy states [the energies above $\mu(T_1)$ in Fig. 3] can transition to the available lower-energy states by giving up this excess energy to the lattice, resulting in a heat flux, $\mathbf{J}_Q$.

As an externally applied electric field accelerates charged carriers, they do not continue to increase in velocity as they would in free space, but attain some average drift velocity. Therefore, an internal force must exist to counterbalance the external force. This internal force is caused mainly by collisions of the carriers with the lattice, thus providing a mechanism of energy transfer from the applied electric field to the lattice. The first contribution is given by

$$
\left(\frac{\partial E}{\partial t}\right)_{\mathrm{I}} = \mathbf{E} \cdot (-n_n q \mathbf{v}_n + n_p q \mathbf{v}_p) = \mathbf{J} \cdot \mathbf{E} = -\mathbf{J} \cdot \nabla V = q(\mathbf{J}_n - \mathbf{J}_p) \cdot \nabla V, \tag{33}
$$

where $\mathbf{E} = -\nabla V$ is the electric field (distinct from the energy density $E$), $\mathbf{J}_n$ and $\mathbf{J}_p$ represent the particle flux densities, and $\mathbf{J}$ represents the current density such that

$$
\mathbf{J} = q(\mathbf{J}_p - \mathbf{J}_n). \tag{34}
$$
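The sign conventions in Eqs. (33) and (34) are easy to confuse, so a one-dimensional numerical check may help. The sketch below is illustrative only: the particle flux and potential gradient are hypothetical values, all quantities are treated as scalars along one axis, and the function names are not from any library.

```python
import math

Q_E = 1.602e-19  # magnitude of the electronic charge (C)


def current_density(j_p, j_n, q=Q_E):
    """Eq. (34) in one dimension: J = q (J_p - J_n), in A/m^2,
    where J_p and J_n are hole and electron particle fluxes (1/(m^2 s))."""
    return q * (j_p - j_n)


def field_power_density(j_p, j_n, grad_v, q=Q_E):
    """Eq. (33) in one dimension: rate at which the applied field
    delivers energy, -J * dV/dx, in W/m^3."""
    return -current_density(j_p, j_n, q) * grad_v


# Hypothetical metal: electrons only, drifting toward +x.
j_n = 1.0e24   # electron particle flux, 1/(m^2 s)
j_p = 0.0      # no holes
grad_v = 10.0  # dV/dx in V/m, so the field -dV/dx points toward -x

J = current_density(j_p, j_n)
power = field_power_density(j_p, j_n, grad_v)

print(f"J = {J:.3e} A/m^2, field power = {power:.3e} W/m^3")
```

Electrons moving toward +x carry conventional current toward -x, and the delivered power is positive because the electrons drift along the force exerted on them by the field.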

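The electrostatic energy density is given by $V\sum_i Z_i q n_i$. The time rate of change of the electrostatic energy density is

$$
\frac{\partial}{\partial t}\left(V\sum_i Z_i q n_i\right) = \frac{\partial V}{\partial t}\sum_i Z_i q n_i + V\sum_i Z_i q \frac{\partial n_i}{\partial t}. \tag{35}
$$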


FIGURE 4 The density of states at a finite temperature. Only the excitation energy can be transferred to the lattice.

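This gives the second and third contributions to the total rate of change in energy density,

$$
\left(\frac{\partial E}{\partial t}\right)_{\mathrm{II}} = \frac{\partial V}{\partial t}\sum_i Z_i q n_i, \tag{36}
$$

$$
\left(\frac{\partial E}{\partial t}\right)_{\mathrm{III}} = V\sum_i Z_i q \frac{\partial n_i}{\partial t} = -V(\nabla \cdot \mathbf{J}) = qV[\nabla \cdot (\mathbf{J}_n - \mathbf{J}_p)], \tag{37}
$$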

where Eq. (20) was used, with the assumption of no generative sources. The fourth contribution comes from the excitation energy depicted in Fig. 4, which gives rise to a heat flux, $\mathbf{J}_Q$. Relative to the bottom of the conduction band, the total heat flux density, $\mathbf{J}_u$, is

$$
\mathbf{J}_u = \mathbf{J}_Q - \frac{\mu}{q}\mathbf{J}, \tag{38}
$$

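thus giving the fourth contribution to energy flow through Fourier’s law of heat conduction,

$$
\left(\frac{\partial E}{\partial t}\right)_{\mathrm{IV}} = -\nabla \cdot \mathbf{J}_u = -\nabla \cdot \left(\mathbf{J}_Q - \frac{\mu}{q}\mathbf{J}\right). \tag{39}
$$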

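Summing contributions I through IV gives the total energy density rate of change as

$$\begin{aligned}
\frac{\partial E}{\partial t} &= -\mathbf{J} \cdot \nabla V - V(\nabla \cdot \mathbf{J}) + \frac{\partial V}{\partial t}\sum_i Z_i q n_i - \nabla \cdot \mathbf{J}_u \\
&= -\nabla \cdot (V\mathbf{J}) + \frac{\partial V}{\partial t}\sum_i Z_i q n_i - \nabla \cdot \mathbf{J}_u \\
&= \frac{\partial V}{\partial t}\sum_i Z_i q n_i - \nabla \cdot \mathbf{J}_E,
\end{aligned} \tag{40}$$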

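where

$$\mathbf{J}_E = \mathbf{J}_u + V\mathbf{J} \tag{41}$$

is the total energy flux density. Substituting (40) into (19) gives

$$\frac{\partial V}{\partial t}\sum_i Z_i q n_i - \nabla \cdot \mathbf{J}_E = T\frac{\partial S}{\partial t} + \sum_i \bar{\mu}_i \frac{\partial n_i}{\partial t} + \frac{\partial V}{\partial t}\sum_i Z_i q n_i, \tag{42}$$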

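or, after cancellation and using (20), assuming no generative sources,

$$-\nabla \cdot \mathbf{J}_E = T\frac{\partial S}{\partial t} + \frac{\bar{\mu}}{q}\nabla \cdot \mathbf{J} = T\frac{\partial S}{\partial t} + \bar{\mu}\,\nabla \cdot \mathbf{J}_q. \tag{43}$$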

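The rate of change of entropy can, therefore, be written

$$\begin{aligned}
\frac{\partial S}{\partial t} &= -\frac{\nabla \cdot \mathbf{J}_E}{T} - \frac{\bar{\mu}}{qT}\nabla \cdot \mathbf{J} \\
&= -\nabla \cdot \left(\frac{\mathbf{J}_E}{T}\right) - \nabla \cdot \left(\frac{\mathbf{J}\bar{\mu}}{qT}\right) + \mathbf{J}_E \cdot \nabla\left(\frac{1}{T}\right) + \mathbf{J} \cdot \nabla\left(\frac{\bar{\mu}}{qT}\right),
\end{aligned} \tag{44}$$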

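or using an entropy flux, $\mathbf{J}_s$, defined as

$$T\mathbf{J}_s = \mathbf{J}_E + \frac{\bar{\mu}}{q}\mathbf{J}, \tag{45}$$

gives

$$\frac{\partial S}{\partial t} = \frac{\partial S_0}{\partial t} + \frac{\partial S_s}{\partial t} = -\nabla \cdot \mathbf{J}_s + \mathbf{J}_E \cdot \nabla\left(\frac{1}{T}\right) + \mathbf{J} \cdot \nabla\left(\frac{\bar{\mu}}{qT}\right), \tag{46}$$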

where the total entropy is given by the sum of the equilibrium entropy plus additional entropy sources, or $S = S_0 + S_s$. The irreversible process for which $\Delta S = (S - S_0) = S_s > 0$ then consists of the last two terms in the above equation such that

$$\frac{\partial S_s}{\partial t} = \mathbf{J}_E \cdot \nabla\left(\frac{1}{T}\right) + \mathbf{J} \cdot \nabla\left(\frac{\bar{\mu}}{qT}\right). \tag{47}$$

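Using Eq. (45) to substitute for $\mathbf{J}_E$, (47) becomes

$$\frac{\partial S_s}{\partial t} = -\frac{\mathbf{J}_s}{T} \cdot \nabla T + \frac{\mathbf{J}}{qT} \cdot \nabla\bar{\mu}, \tag{48}$$

or using $\bar{\mu} = \mu + qV$ along with (41) and (45) to give

$$T\mathbf{J}_s = \mathbf{J}_Q, \tag{49}$$

then Eq. (48) becomes

$$\frac{\partial S_s}{\partial t} = -\frac{\mathbf{J}_Q}{T^2} \cdot \nabla T + \frac{\mathbf{J}}{qT} \cdot \nabla\bar{\mu}. \tag{50}$$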

These three equations, (47), (48), and (50), could each be written in the general form of

$$\frac{\partial S_s}{\partial t} = \sum_i \mathbf{J}_i \cdot \mathbf{X}_i. \tag{51}$$

This is a necessary condition for using the Onsager reciprocity relation that $L_{12} = L_{21}$ in Eq. (10). Three sets of Onsager relations can then be written by extracting the forces, $\mathbf{X}_i$, from Eqs. (47), (48), and (50):

$$\begin{aligned}
\mathbf{J} &= \frac{Z_{11}}{q}\nabla\left(\frac{\bar{\mu}}{T}\right) + Z_{12}\nabla\left(\frac{1}{T}\right), \\
\mathbf{J}_E &= \frac{Z_{21}}{q}\nabla\left(\frac{\bar{\mu}}{T}\right) + Z_{22}\nabla\left(\frac{1}{T}\right),
\end{aligned} \tag{52}$$

$$\begin{aligned}
\mathbf{J} &= \frac{B_{11}}{qT}\nabla\bar{\mu} - \frac{B_{12}}{T}\nabla T, \\
\mathbf{J}_s &= \frac{B_{21}}{qT}\nabla\bar{\mu} - \frac{B_{22}}{T}\nabla T,
\end{aligned} \tag{53}$$

$$\begin{aligned}
\mathbf{J} &= \frac{L_{11}}{qT}\nabla\bar{\mu} - \frac{L_{12}}{T^2}\nabla T, \\
\mathbf{J}_Q &= \frac{L_{21}}{qT}\nabla\bar{\mu} - \frac{L_{22}}{T^2}\nabla T,
\end{aligned} \tag{54}$$

thus relating the electrical current density, $\mathbf{J}$, to the energy flux density, $\mathbf{J}_E$, the entropy flux density, $\mathbf{J}_s$, and the heat flux density, $\mathbf{J}_Q$. Equations (52), (53), and (54) can now be used to identify various thermoelectric properties.

IV. STATISTICAL RELATIONSHIPS

Within crystalline materials, electron behavior can be described by the wave nature of electrons and Schrödinger's equation,

$$\nabla^2\Psi + \frac{2m}{\hbar^2}(E - V)\Psi = -\frac{\hbar}{j}\frac{\partial E}{\partial t}, \tag{55}$$

where $\Psi$ is the electron wave function, $E$ is the total energy, and $V$ is the potential energy of the electrons. The solution to this equation is

$$\Psi(\mathbf{r}, t) = \psi(\mathbf{r})e^{-j\omega t}, \tag{56}$$

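where $\psi$ is the time-independent solution to Schrödinger's equation. This solution forms a wave packet with a group velocity, $\mathbf{v}$, equal to the average velocity of the particle it describes, such that

$$\mathbf{v} = \nabla_k\omega = \frac{\partial\omega}{\partial\mathbf{k}} = \frac{1}{\hbar}\nabla_k E = \frac{1}{\hbar}\frac{\partial E}{\partial\mathbf{k}}, \tag{57}$$

where the use of Planck's relationship, $E = h\nu = \hbar\omega$, was made. Force times distance is equal to energy, or with a time derivative,

$$\mathbf{v} \cdot \mathbf{F} = \frac{\partial E}{\partial t} = \frac{1}{\hbar}\frac{\partial E}{\partial\mathbf{k}} \cdot \hbar\frac{\partial\mathbf{k}}{\partial t}, \tag{58}$$

giving

$$\mathbf{F} = \hbar\frac{\partial\mathbf{k}}{\partial t}. \tag{59}$$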

The electron wave function, $\Psi(\mathbf{r}, t)$, itself does not have physical meaning; however, the product $\Psi^*(\mathbf{r}, t)\Psi(\mathbf{r}, t)$ represents the probability of finding an electron at position $\mathbf{r}$ and time $t$. As a probability implies, there is a factor of uncertainty, which was quantified in 1927 by Heisenberg:

$$\Delta p_x \Delta x \ge h, \quad \Delta p_y \Delta y \ge h, \quad \Delta p_z \Delta z \ge h, \tag{60}$$

where $\Delta p_x$, $\Delta p_y$, and $\Delta p_z$ are the momentum uncertainties in the x, y, and z directions, respectively. The positional uncertainties in the three directions are given by $\Delta x$, $\Delta y$, and $\Delta z$. It is possible to utilize these uncertainties to define the smallest volume (in real space, or momentum space) that represents a discrete electronic state. Within a cube of material with dimensions $L \times L \times L$, the maximum positional uncertainty for a given electron would be $\Delta x = \Delta y = \Delta z = L$, since the electron must be located somewhere within the cube. This would correspond to the minimum $\Delta p_x$, $\Delta p_y$, and $\Delta p_z$ given by

$$\Delta p_{x,\min} = \frac{h}{\Delta x} = \frac{h}{L}, \quad \Delta p_{y,\min} = \frac{h}{\Delta y} = \frac{h}{L}, \quad \Delta p_{z,\min} = \frac{h}{\Delta z} = \frac{h}{L}. \tag{61}$$

Thus the product

$$\Delta p_{x,\min}\,\Delta p_{y,\min}\,\Delta p_{z,\min} = \frac{h^3}{L^3} \tag{62}$$

gives the minimum elemental volume in momentum space to represent two discrete electronic states (one for spin-up and one for spin-down). The number of states, $dg$, per unit volume in an element $dp_x\,dp_y\,dp_z$ of momentum space can be written

$$dg = \frac{1}{h^3}\,dp_x\,dp_y\,dp_z. \tag{63}$$

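Schrödinger's time-independent equation for a free electron ($V = 0$) is

$$\nabla^2\psi + \frac{2m}{\hbar^2}E\psi = 0, \tag{64}$$

which has the solution

$$\psi = Ae^{j\mathbf{k}\cdot\mathbf{r}}. \tag{65}$$

Substituting back into (64) gives

$$E = \frac{\hbar^2}{2m}k^2 = \frac{p^2}{2m}, \tag{66}$$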

where $p = \hbar k$ was used. Within a crystal, a similar formula can be found when the concept of effective mass, $m^*$, is utilized to account for internal forces on the electrons due to the ion cores at each lattice point. Electrons with energies below some value $E$ are then defined by a sphere in momentum space with radius $p = \sqrt{2m^*E}$. The number of electronic states within the material cube ($L \times L \times L$) is found by dividing the total momentum space volume by the volume per state, or

$$N = 2\left[\frac{\tfrac{4}{3}\pi r^3}{h^3/L^3}\right] = 2\left[\frac{\tfrac{4}{3}\pi\left(\sqrt{2m^*E}\right)^3}{h^3/L^3}\right] = \frac{8\pi(2m^*E)^{3/2}}{3h^3}L^3. \tag{67}$$

The factor of 2 is included to account for electrons of both spin-up and spin-down.

The density of states is defined as the number of states per unit energy per unit volume, or

$$g(E) = \frac{(dN/dE)}{L^3} = \frac{4\pi(2m^*)^{3/2}}{h^3}E^{1/2} = \frac{1}{2\pi^2}\left(\frac{2m^*}{\hbar^2}\right)^{3/2}E^{1/2}, \tag{68}$$

where $\hbar = h/2\pi$ is Planck's reduced constant. This equation describes the number of available states for electrons to go into, but it does not describe the way the electrons fill those available states.
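As a numerical illustration of Eq. (68), the following minimal sketch evaluates both forms of the free-electron density of states and confirms that they agree. The function names, the choice of the free-electron mass for $m^*$, and the 1 eV sample energy are assumptions made for the example, not values from the text.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
HBAR = H / (2 * math.pi)  # reduced Planck constant, J s
M_E = 9.1093837015e-31    # free-electron mass, kg (assumed effective mass)
EV = 1.602176634e-19      # joules per electron volt


def dos_h(energy_j, m_eff=M_E):
    """Eq. (68), first form: g(E) = 4*pi*(2m*)^(3/2) / h^3 * E^(1/2)."""
    return 4 * math.pi * (2 * m_eff) ** 1.5 / H ** 3 * math.sqrt(energy_j)


def dos_hbar(energy_j, m_eff=M_E):
    """Eq. (68), second form: g(E) = (1/(2*pi^2)) * (2m*/hbar^2)^(3/2) * E^(1/2)."""
    return (2 * m_eff / HBAR ** 2) ** 1.5 * math.sqrt(energy_j) / (2 * math.pi ** 2)


if __name__ == "__main__":
    e = 1.0 * EV  # illustrative energy, 1 eV above the band bottom
    print(f"g(1 eV), h form:    {dos_h(e) * EV:.3e} states/(eV m^3)")
    print(f"g(1 eV), hbar form: {dos_hbar(e) * EV:.3e} states/(eV m^3)")
```

Because $g(E) \propto E^{1/2}$, quadrupling the energy doubles the density of states, which is a quick sanity check on any implementation.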

A. The Fermi Distribution

To determine the number of electrons in a given band, it is necessary to find the probability of a given state being occupied by an electron and then integrate over all available states. A more realistic result for metals, which does not assume spherical constant energy surfaces in k-space, thus allowing for the electron energy to deviate from $E = \hbar^2k^2/2m^*$, would be found using the density of states from (63).

Within a crystalline material, charge carriers are known to follow the Pauli exclusion principle, which states that only one electron can occupy a given state. The probability that an electron occupies an energy state can be found by considering a simple statistical exercise. If a system is defined to have three allowed energy levels ($E_1$, $E_2$, and $E_3$), two electrons, and a total energy of 4 eV as shown in Fig. 5, with the three energy levels defined as $E_1 = 1$ eV, $E_2 = 2$ eV, and $E_3 = 3$ eV, it would be expected that 80% of the time, a distribution of one electron in energy level $E_1$, zero electrons in $E_2$, and one electron in $E_3$, or a distribution of (1, 0, 1), would occur. The entropy of a system is related to the most probable arrangement, $W_m$, of the particles through Boltzmann's definition,

$$S = k\ln W_m. \tag{69}$$
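The counting exercise above can be reproduced by brute-force enumeration. The sketch below assumes that each energy level offers two single-particle states (spin-up and spin-down), which is one reading of Fig. 5 that yields the quoted 80%; the variable and function names are illustrative.

```python
from collections import Counter
from itertools import combinations

LEVELS_EV = (1, 2, 3)     # E1, E2, E3 from the text
SPINS = ("up", "down")    # assumed: two spin states per level
N_ELECTRONS = 2
TOTAL_ENERGY_EV = 4

# One single-particle state per (level, spin) pair.
STATES = [(level, spin) for level in LEVELS_EV for spin in SPINS]


def count_arrangements():
    """Return {occupation (n1, n2, n3): number of ways W} at the fixed total energy."""
    ways = Counter()
    # combinations() never picks the same state twice, enforcing Pauli exclusion.
    for chosen in combinations(STATES, N_ELECTRONS):
        if sum(level for level, _ in chosen) == TOTAL_ENERGY_EV:
            occupation = tuple(
                sum(1 for level, _ in chosen if level == e) for e in LEVELS_EV
            )
            ways[occupation] += 1
    return ways


if __name__ == "__main__":
    ways = count_arrangements()
    total = sum(ways.values())
    for occupation, w in sorted(ways.items()):
        print(occupation, "W =", w, f"probability = {w / total:.0%}")
```

Under this assumption the distribution (1, 0, 1) can be realized in four ways and (0, 2, 0) in one way, so (1, 0, 1) is the most probable arrangement entering Eq. (69).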

When only electrons are considered, the entropy is related to the internal energy of the system, $U$, the total number of electrons, $N$, and the volume, $V$, of the system through Euler's equation

$$U = TS - PV + \mu N - qVN, \tag{70}$$

where the $V$ in the last term represents the internal electrostatic potential. For a simple system with just two available energy states (energy = 0 or energy = $E$), the ratio of the probability of finding the system with energy $E$ to that of finding it with energy 0 is

$$\frac{P(E)}{P(0)} = \frac{W(U_0 - E)}{W(U_0)} = \frac{e^{S(U_0 - E)/k}}{e^{S(U_0)/k}}. \tag{71}$$

FIGURE 5 The number of ways, W, two electrons can be distributed in three energy levels to obtain a total energy of 4 eV.

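Using the approximation $S(U_0 - E) \approx S(U_0) - E\left(\frac{\partial S}{\partial U_0}\right)$ and

$$\frac{\partial S}{\partial U} = \frac{1}{T} = \frac{\partial}{\partial U}\left(\frac{U + PV - \mu N + qVN}{T}\right) \tag{72}$$

simplifies (71) to

$$\frac{P(E)}{P(0)} = \frac{e^{(S(U_0)/k) - (E/kT)}}{e^{S(U_0)/k}} = e^{-E/kT}. \tag{73}$$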

To determine the probability that the system is in energy state $E$ and that the state is occupied by an electron, the influence of the total number of electrons, $N$, must also be taken into consideration. Then the ratio of the probability that the system is occupied by one electron at energy $E$ to the probability that the system is unoccupied with energy 0 is

$$\frac{P(1, E)}{P(0, 0)} = \frac{W[(U_0 - E), (N_0 - 1)]}{W[U_0, N_0]} = \frac{e^{S[(U_0 - E), (N_0 - 1)]/k}}{e^{S[U_0, N_0]/k}}. \tag{74}$$

Using $S[(U_0 - E), (N_0 - 1)] \approx S[U_0, N_0] - E(\partial S/\partial U_0) - (\partial S/\partial N_0)$ yields

$$\frac{P(1, E)}{P(0, 0)} = \frac{e^{(S[U_0, N_0]/k) - (E/kT) + ((\mu - qV)/kT)}}{e^{S[U_0, N_0]/k}} = e^{(E_F - E)/kT}, \tag{75}$$

where the Fermi level is defined as $E_F = \mu - qV$. Since $P(1, E) + P(0, 0) = 1$,

$$P(1, E) = f(E) = \frac{1}{1 + e^{(E - E_F)/kT}}. \tag{76}$$

This is the Fermi–Dirac distribution and represents the probability of occupancy of an energy state in equilibrium.
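A short numerical sketch can make the occupancy function concrete. It uses the form $f(E) = 1/(1 + e^{(E - E_F)/kT})$, which follows from the ratio in Eq. (75) combined with the normalization $P(1, E) + P(0, 0) = 1$. The function name and the sample energies are assumptions for illustration.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K


def fermi_dirac(energy_ev, fermi_ev, temperature_k):
    """Occupation probability f(E) = 1 / (1 + exp((E - E_F) / kT))."""
    x = (energy_ev - fermi_ev) / (K_B_EV * temperature_k)
    # Use a form that cannot overflow for large |x|.
    if x >= 0:
        ex = math.exp(-x)
        return ex / (1.0 + ex)
    return 1.0 / (1.0 + math.exp(x))


if __name__ == "__main__":
    e_f = 5.0  # illustrative Fermi level, eV
    for offset in (-0.10, -0.05, 0.0, 0.05, 0.10):
        f = fermi_dirac(e_f + offset, e_f, 300.0)
        print(f"E - E_F = {offset:+.2f} eV  ->  f = {f:.4f}")
```

The occupancy is exactly one-half at $E = E_F$, and the ratio $f/(1 - f)$ reproduces the exponential of Eq. (75).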


B. Carrier Concentrations

In a free-electron approximation, the total number of electrons in a given energy band can then be found by integrating the product of the density of states and the probability of occupancy of the state for spherical energy surfaces:

$$n = \int_{E_{\mathrm{bottom}}}^{E_{\mathrm{top}}} g(E)f(E)\,dE = \frac{(2mkT)^{3/2}}{2\pi^2\hbar^3}F_{1/2}(\eta), \tag{77}$$

where $\eta = E_F/kT$, and $F_\nu(\eta)$ is the Fermi–Dirac function,

$$F_\nu(\eta) = \int_0^\infty \frac{x^\nu}{1 + e^{x - \eta}}\,dx. \tag{78}$$

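Here, the bottom of the energy band was taken to be zero energy corresponding to $k = 0$, and the integration was allowed to extend to $\infty$ since the Fermi–Dirac distribution falls to zero at high energy levels. In the degenerate limit, when $E_F \gg kT$, a series expansion of (78) leads to the following approximations for metals:

$$\begin{aligned}
E_F &\approx E_{F0}\left[1 - \frac{\pi^2}{12}\left(\frac{kT}{E_{F0}}\right)^2 + \cdots\right], \\
n &\approx \frac{(2mkT)^{3/2}}{3\pi^2\hbar^3}\left(\frac{E_F}{kT}\right)^{3/2}\left[1 + \frac{\pi^2}{8}\left(\frac{kT}{E_F}\right)^2 + \cdots\right].
\end{aligned} \tag{79}$$

At $T = 0$ K,

$$E_F = E_{F0} = \frac{\pi^2\hbar^2}{2m}\left(\frac{3n}{\pi}\right)^{2/3}. \tag{80}$$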
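The degenerate-limit expansion and Eq. (80) can be checked numerically. The sketch below integrates the Fermi–Dirac function $F_{1/2}(\eta) = \int_0^\infty x^{1/2}/(1 + e^{x - \eta})\,dx$ with a simple trapezoidal rule, compares it with the leading terms $\tfrac{2}{3}\eta^{3/2}[1 + \pi^2/(8\eta^2)]$, and evaluates $E_{F0}$ for an assumed carrier density. The density $8.47 \times 10^{28}$ m$^{-3}$ (a commonly quoted value for copper), the free-electron mass, and all function names are assumptions for illustration.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J s
M_E = 9.1093837015e-31  # free-electron mass, kg
EV = 1.602176634e-19    # joules per electron volt


def fermi_integral_half(eta, steps=20000):
    """Numerically evaluate F_1/2(eta) using the substitution x = u^2."""
    u_max = math.sqrt(max(eta, 0.0) + 60.0)  # integrand is negligible beyond this
    h = u_max / steps
    total = 0.0
    for i in range(steps + 1):
        u = i * h
        value = 2.0 * u * u / (1.0 + math.exp(u * u - eta))
        total += value if 0 < i < steps else 0.5 * value  # trapezoid end weights
    return total * h


def degenerate_limit(eta):
    """Leading terms of the degenerate expansion of F_1/2 (eta >> 1)."""
    return (2.0 / 3.0) * eta ** 1.5 * (1.0 + math.pi ** 2 / (8.0 * eta ** 2))


def fermi_energy_0k(n_per_m3, mass=M_E):
    """Eq. (80): E_F0 = (pi^2 hbar^2 / 2m) * (3n/pi)^(2/3), in joules."""
    return math.pi ** 2 * HBAR ** 2 / (2.0 * mass) * (3.0 * n_per_m3 / math.pi) ** (2.0 / 3.0)


if __name__ == "__main__":
    eta = 20.0
    print("F_1/2 numeric   :", fermi_integral_half(eta))
    print("F_1/2 degenerate:", degenerate_limit(eta))
    n_cu = 8.47e28  # assumed electron density, m^-3
    print("E_F0 [eV]:", fermi_energy_0k(n_cu) / EV)
```

For strongly negative $\eta$ the same integral approaches the nondegenerate result $(\sqrt{\pi}/2)e^{\eta}$, which gives a second independent check on the numerical routine.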

For nonspherical energy surfaces, the number of electrons per unit volume within an element of momentum space, $dp_x\,dp_y\,dp_z$, is found using (63) and the relation $\mathbf{p} = \hbar\mathbf{k} = \frac{h}{2\pi}\mathbf{k}$,

$$dn = \frac{2}{h^3}f(\mathbf{p}, \mathbf{r})\,dp_x\,dp_y\,dp_z = \frac{1}{4\pi^3}f(\mathbf{k}, \mathbf{r})\,dk_x\,dk_y\,dk_z, \tag{81}$$

where the factor of 2 accounts for two electrons of opposite spin. The total electron density is then found by integration.

C. The Boltzmann Function

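If the material is disturbed from equilibrium, then the distribution will vary, in general, as a function of wavevector, $\mathbf{k}$, position, $\mathbf{r}$, and time, $t$, or $f(\mathbf{k}, \mathbf{r}, t)$. At a time $t + dt$, the probability that a state with wavevector $\mathbf{k} + d\mathbf{k}$ is occupied by an electron at position $\mathbf{r} + d\mathbf{r}$ can be found, using Eq. (59), to be

$$f(\mathbf{k} + d\mathbf{k}, \mathbf{r} + d\mathbf{r}, t + dt) = f\left(\mathbf{k} + \frac{1}{\hbar}\mathbf{F}_t\,dt,\ \mathbf{r} + \mathbf{v}\,dt,\ t + dt\right). \tag{82}$$

The total rate of change of the distribution function near $\mathbf{r}$ is then

$$\frac{df}{dt} = \frac{1}{\hbar}\mathbf{F}_t \cdot \nabla_k f + \mathbf{v} \cdot \nabla_r f + \frac{\partial f}{\partial t}, \tag{83}$$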

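which is Boltzmann's transport equation. The first term on the right-hand side of this equation accounts for contributions from forces, $\mathbf{F}_t$, including externally applied forces, $\mathbf{F}$, and collision forces, $\mathbf{F}_c$. The middle term adds the contributions from concentration gradients, and the last term is the local changes in the distribution function about the point $\mathbf{r}$. Equation (83) is equal to zero since the total number of states in the crystal is constant, thus

$$\begin{aligned}
\frac{\partial f}{\partial t} &= -\frac{1}{\hbar}\mathbf{F}_t \cdot \nabla_k f - \mathbf{v} \cdot \nabla_r f \\
&= -\frac{1}{\hbar}\mathbf{F}_c \cdot \nabla_k f - \frac{1}{\hbar}\mathbf{F} \cdot \nabla_k f - \mathbf{v} \cdot \nabla_r f \\
&= \left(\frac{\partial f}{\partial t}\right)_c - \frac{1}{\hbar}\mathbf{F} \cdot \nabla_k f - \mathbf{v} \cdot \nabla_r f.
\end{aligned} \tag{84}$$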

With external forces applied, the distribution function, $f$, will be disturbed from the equilibrium value, $f_0$. Upon the removal of those external forces, equilibrium will be reestablished through collisions, $(\partial f/\partial t)_c$. Calculation of this collision term is a formidable task dependent largely on the scattering mechanisms for the material investigated. For small disturbances, however, a relaxation-time approximation is often used which assumes that

$$\left(\frac{\partial f}{\partial t}\right)_c = -\frac{(f - f_0)}{\tau_k} = -\frac{f_1}{\tau_k}, \tag{85}$$

where $\tau_k$ is the momentum relaxation time. In steady state, $\partial f/\partial t = 0$ and Eq. (84) becomes

$$0 = -\frac{f_1}{\tau_k} - \frac{1}{\hbar}\mathbf{F} \cdot \nabla_k f - \mathbf{v} \cdot \nabla_r f
\quad\text{or}\quad
f_1 = -\frac{\tau_k}{\hbar}\mathbf{F} \cdot \nabla_k f - \tau_k\mathbf{v} \cdot \nabla_r f. \tag{86}$$

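The electric and heat current densities are given by

$$\begin{aligned}
\mathbf{J} &= -q\mathbf{v}n = -q\int \mathbf{v}\,dn = -\frac{q}{4\pi^3}\int \mathbf{v}f_1(\mathbf{k})\,d\mathbf{k}, \\
\mathbf{J}_Q &= \frac{1}{4\pi^3}\int \mathbf{v}(E - E_F)f_1(\mathbf{k})\,d\mathbf{k}.
\end{aligned} \tag{87}$$

Substituting Eq. (86) into Eqs. (87) starting with the electric current density, $\mathbf{J}$, gives

$$\mathbf{J} = -\frac{q}{4\pi^3}\int \left(-\frac{\tau_k}{\hbar}\mathbf{v}\mathbf{F} \cdot \nabla_k f - \tau_k\mathbf{v}\mathbf{v} \cdot \nabla_r f\right)d\mathbf{k}. \tag{88}$$

Assuming parabolic bands, the gradient of the distribution function in k space can be written

$$\nabla_k f = \frac{\partial f}{\partial E}\nabla_k E = \frac{\partial f}{\partial E}\hbar\mathbf{v}. \tag{89}$$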


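Also, the following can be shown by direct substitution of the Fermi–Dirac distribution (76):

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial E}T\left[E\frac{\partial}{\partial x}\left(\frac{1}{T}\right) - \frac{\partial}{\partial x}\left(\frac{E_F}{T}\right)\right], \tag{90}$$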

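with similar results in the y and z directions. Substituting these results into Eq. (88) gives

$$\mathbf{J} = -\frac{q}{4\pi^3}\int \left(-\frac{\tau_k}{\hbar}\mathbf{v}\mathbf{F}\frac{\partial f}{\partial E}\hbar\mathbf{v} - \mathbf{v}\mathbf{v}\tau_k\frac{\partial f}{\partial E}T\left[E\nabla_r\left(\frac{1}{T}\right) - \nabla_r\left(\frac{E_F}{T}\right)\right]\right)d\mathbf{k} \tag{91}$$

or

$$\mathbf{J} = \frac{q}{4\pi^3}\int \tau_k\mathbf{v}\mathbf{v}\mathbf{F}\frac{\partial f}{\partial E}\,d\mathbf{k} + \frac{q}{4\pi^3}\int \tau_k\mathbf{v}\mathbf{v}\frac{\partial f}{\partial E}T(E - E_F)\nabla_r\left(\frac{1}{T}\right)d\mathbf{k}. \tag{92}$$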

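For example, the applied force might include a contribution from an external electric field ($-q\boldsymbol{\mathcal{E}}$), plus a contribution caused by a temperature gradient (see Fig. 3), or in general as $\nabla\bar{\mu} = \nabla(\mu + qV) = \nabla\mu - q\boldsymbol{\mathcal{E}}$. Then (92) would be

$$\mathbf{J} = \frac{q}{4\pi^3}\int \tau_k\mathbf{v}\mathbf{v}\frac{\partial f}{\partial E}(\nabla\mu - q\boldsymbol{\mathcal{E}})\,d\mathbf{k} - \frac{q}{4\pi^3}\int \tau_k\mathbf{v}\mathbf{v}(E - E_F)\frac{\partial f}{\partial E}\frac{1}{T}\nabla_r T\,d\mathbf{k}, \tag{93}$$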

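where $\nabla_r(1/T) = -(1/T^2)\nabla_r T$ was used. The electrical current density can be simplified and put into a format similar to the Onsager relations as shown in (54) by using transport integrals defined as

$$K_n = -\frac{1}{4\pi^3}\int \tau_k\mathbf{v}\mathbf{v}(E - E_F)^n\frac{\partial f_0}{\partial E}\,d\mathbf{k}, \tag{94}$$

where it is assumed that the deviations from equilibrium are small, such that $\partial f/\partial E$ in Eq. (93) may be replaced with $\partial f_0/\partial E$. This leads to an electrical current density of

$$\mathbf{J} = -qK_0\nabla\bar{\mu} + \frac{q}{T}K_1\nabla T. \tag{95}$$

Similarly, the heat current density, $\mathbf{J}_Q$, follows the same derivation to arrive at

$$\mathbf{J}_Q = \frac{1}{4\pi^3}\int \tau_k\mathbf{v}\mathbf{v}\frac{\partial f}{\partial E}(\nabla\mu - q\boldsymbol{\mathcal{E}})(E - E_F)\,d\mathbf{k} - \frac{1}{4\pi^3}\int \tau_k\mathbf{v}\mathbf{v}(E - E_F)^2\frac{\partial f}{\partial E}\frac{1}{T}\nabla_r T\,d\mathbf{k} \tag{96}$$

or

$$\mathbf{J}_Q = -K_1\nabla\bar{\mu} + \frac{1}{T}K_2\nabla T. \tag{97}$$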

These expressions form the link between the macroscopic Onsager equations and the atomistic derivations from Boltzmann's equation. Comparison of Eqs. (95) and (97) with Eq. (54) shows the following relations:

$$\frac{L_{11}}{qT} = -qK_0, \quad -\frac{L_{12}}{T^2} = \frac{q}{T}K_1, \quad \frac{L_{21}}{qT} = -K_1, \quad -\frac{L_{22}}{T^2} = \frac{1}{T}K_2,$$

or

$$L_{11} = -q^2TK_0, \quad L_{12} = -qTK_1, \quad L_{21} = -qTK_1, \quad L_{22} = -TK_2. \tag{98}$$

This shows the Onsager reciprocity relation, in that $L_{12} = L_{21}$. The thermoelectric properties can now be determined through an evaluation of the transport integrals and the appropriate boundary conditions of isothermal ($\nabla T = 0$), isoelectric ($\nabla V = -\boldsymbol{\mathcal{E}} = 0$), static ($\mathbf{J} = 0$), or adiabatic ($\mathbf{J}_Q = 0$). For example, under isothermal conditions, where $\nabla T = 0$ and thus $\nabla\mu = 0$ (for a homogeneous metal),

$$\mathbf{J} = q^2K_0\boldsymbol{\mathcal{E}} = \sigma\boldsymbol{\mathcal{E}}, \tag{99}$$

and the electrical conductivity is

$$\sigma = q^2K_0. \tag{100}$$

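The electronic contribution to the thermal conductivity is defined for static conditions as $\mathbf{J}_E|_{\mathbf{J}=0} = -\kappa_e\nabla T$, or when $\mathbf{J}_E = \mathbf{J}_Q$ as in a one-band material, $\mathbf{J}_Q = \kappa_e\nabla T$, where (95) becomes

$$0 = -qK_0\nabla\bar{\mu} + \frac{q}{T}K_1\nabla T. \tag{101}$$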

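Solving for $\nabla\bar{\mu}$ and substituting into (97) gives

$$\mathbf{J}_Q = -K_1\left(\frac{1}{T}\frac{K_1}{K_0}\nabla T\right) + \frac{1}{T}K_2\nabla T = \frac{1}{T}\left(K_2 - \frac{K_1K_1}{K_0}\right)\nabla T \tag{102}$$

or

$$\kappa = \frac{1}{T}\left(K_2 - \frac{K_1K_1}{K_0}\right). \tag{103}$$

The absolute Seebeck coefficient, or thermopower, $S$, can also be found from the static condition, where the use of Eq. (101) gives

$$S = \frac{1}{q}\frac{\nabla\bar{\mu}}{\nabla T} = \frac{1}{qT}\frac{K_1}{K_0}. \tag{104}$$

The Peltier coefficient, $\Pi$, can be found by evaluating the heat current density, Eq. (97), for isothermal conditions:

$$\mathbf{J}_Q = qK_1\boldsymbol{\mathcal{E}}. \tag{105}$$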


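TABLE I Combined Results from Macroscopic and Atomistic Analysis

Thermoelectric property | Transport integral | Onsager coefficient
$\sigma = q^2K_0$ | $K_0 = \dfrac{\sigma}{q^2}$ | $L_{11} = -\sigma T$
$S = \dfrac{1}{qT}\dfrac{K_1}{K_0}$ | $K_1 = \dfrac{T\sigma}{q}S$ | $L_{12} = L_{21} = -T^2\sigma S$
$\kappa_e = \dfrac{1}{T}\left(K_2 - \dfrac{K_1K_1}{K_0}\right)$ | $K_2 = \kappa_eT + T^2\sigma S^2$ | $L_{22} = -T^2\kappa_e - T^3\sigma S^2$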

Substituting for the electric field, $\boldsymbol{\mathcal{E}}$, from Eq. (99) gives the direct relationship between heat current density and electric current density,

$$\mathbf{J}_Q = \frac{K_1}{qK_0}\mathbf{J} = \Pi\mathbf{J}, \tag{106}$$

where the proportionality constant is simply the Peltier coefficient, $\Pi$. Comparing the Peltier coefficient (106) with the thermopower (104) leads to Kelvin's second relation:

$$\Pi = TS. \tag{107}$$
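The mapping between the transport integrals and the measurable coefficients can be exercised with a small round-trip calculation. The sketch below builds $K_0$, $K_1$, and $K_2$ from assumed values of $\sigma$, $S$, and $\kappa_e$ using the Table I relations, then recovers the properties from Eqs. (100), (103), (104), and (106) and confirms Kelvin's second relation. The numerical inputs and all names are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass

Q = 1.602176634e-19  # magnitude of the electronic charge, C


@dataclass
class TransportIntegrals:
    k0: float
    k1: float
    k2: float


def integrals_from_properties(sigma, seebeck, kappa_e, temperature_k):
    """Table I, middle column: K0, K1, K2 from sigma, S, kappa_e."""
    k0 = sigma / Q ** 2
    k1 = temperature_k * sigma * seebeck / Q
    k2 = kappa_e * temperature_k + temperature_k ** 2 * sigma * seebeck ** 2
    return TransportIntegrals(k0, k1, k2)


def properties_from_integrals(k, temperature_k):
    """Eqs. (100), (104), (103), (106): sigma, S, kappa_e, Pi."""
    sigma = Q ** 2 * k.k0
    seebeck = k.k1 / (Q * temperature_k * k.k0)
    kappa_e = (k.k2 - k.k1 * k.k1 / k.k0) / temperature_k
    peltier = k.k1 / (Q * k.k0)
    return sigma, seebeck, kappa_e, peltier


if __name__ == "__main__":
    T = 300.0
    k = integrals_from_properties(sigma=1.0e5, seebeck=200e-6, kappa_e=0.5, temperature_k=T)
    sigma, s, kappa_e, pi = properties_from_integrals(k, T)
    print("sigma =", sigma, "S/m; S =", s, "V/K; kappa_e =", kappa_e, "W/(m K)")
    print("Pi =", pi, "V; T*S =", T * s, "V")
```

Keeping the integrals in one small container makes it straightforward to swap in values obtained from a band-structure calculation instead of the assumed inputs.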

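Results of the transport integrals are summarized in Table I. Thus Eqs. (54) can be rewritten in terms of the thermoelectric properties as

$$\begin{aligned}
\mathbf{J} &= -\frac{\sigma}{q}\nabla\bar{\mu} + \sigma S\nabla T, \\
\mathbf{J}_Q &= -\frac{T}{q}\sigma S\nabla\bar{\mu} + \left(\kappa_e + T\sigma S^2\right)\nabla T.
\end{aligned} \tag{108}$$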

Substitution of the transport integrals can be used to evaluate further the thermoelectric properties. Estimations can be made through a series expansion of the transport integrals using a Sommerfeld expansion,

$$K_n = -\int_0^\infty \phi_n(E)\frac{\partial f_0}{\partial E}\,dE = \phi_n(E_F) + \frac{\pi^2}{6}(kT)^2\frac{d^2}{dE_F^2}\phi_n(E_F) + \cdots. \tag{109}$$

For the electrical conductivity,

$$\sigma = q^2K_0 = -\frac{q^2}{4\pi^3}\int \tau_k\mathbf{v}\mathbf{v}\frac{\partial f_0}{\partial E}\,d\mathbf{k}. \tag{110}$$

In its simplest form for cubic symmetry, this reduces to

$$\sigma = \frac{nq^2\tau_k}{m^*}, \tag{111}$$

where $n$ is the electron density with energies near $E_F$, and $m^*$ is the effective mass of the electrons. Both the electron density near the Fermi level and the relaxation time are functions of energy, such that the electrical conductivity can be approximated as $\sigma = \mathrm{const} \cdot E^\xi$, where $\xi$ is some number.

A relationship between the electrical conductivity and the thermopower can be found by series expansion of $K_1$, which gives

$$K_1 = -\frac{\pi^2}{3q^2}(kT)^2\left.\frac{\partial\sigma}{\partial E}\right|_{E=E_F}, \tag{112}$$

along with $\sigma = q^2K_0$ and substituting into (104). This leads to the Mott–Jones equation (Barnard, 1972):

$$S_d = -\frac{\pi^2}{3}\frac{k^2T}{q}\left(\frac{\partial\ln\sigma}{\partial E}\right)_{E_F}. \tag{113}$$

A distinction of the diffusion thermopower, $S_d$, has been made here to separate it from a low-temperature effect that has not been considered above. The low-temperature effect typically appears as a peak in the measured thermopower (near 60 K for monovalent noble metals) and is the result of an increased electron–phonon interaction. When a temperature gradient exists across a crystal, heat will flow from the hot side to the cold side through lattice vibrations (phonons) and through electron flow. Various interactions among phonons, lattice defects, and electrons can be described by scattering times for each type of interaction. At high temperatures, phonon–phonon interactions are more frequent than electron–phonon interactions ($\tau_{p,p} < \tau_{p,e}$). At these high temperatures (above the Debye temperature, $T > \theta_D$), $\tau_{p,e}$ is approximately temperature independent, while $\tau_{p,p} \propto 1/T$. Under these conditions, the total thermopower is dominated by the diffusion thermopower as given in Eq. (113). At low temperatures ($T < \theta_D$), $\tau_{p,e} \propto 1/T$ and $\tau_{p,p} \propto e^{\theta_D/2T}$; therefore, as the temperature drops, $\tau_{p,p}$ increases more rapidly than $\tau_{p,e}$. When this occurs, $\tau_{p,p} > \tau_{p,e}$ and electron–phonon interactions will occur more frequently, causing electrons to be "dragged" along with the phonons. This gives rise to a larger gradient of carrier concentration across the sample and is additive to the diffusion thermopower such that $S = S_d + S_g$, where $S_g$ is the phonon-drag component of the thermopower described above. At still lower temperatures, phonon-impurity interactions can dominate, causing the magnitude of the thermopower to decrease toward zero. For the remainder of this chapter, the temperature is assumed to be much higher than the Debye temperature, such that $S \approx S_d$ and the diffusion subscript is dropped.

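When the electrical conductivity can be written $\sigma = \mathrm{const} \cdot E^\xi$, this can be used in the Mott–Jones equation to give

$$S = -\frac{\pi^2}{3}\frac{k^2T}{qE_F}\xi = -0.0245\,\frac{T}{E_F}\,\xi \quad \left(\frac{\mu\mathrm{V}}{\mathrm{K}}\right). \tag{114}$$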

1. Normal Metals

In monovalent noble metals (Cu, Ag, and Au), $\xi \approx -\tfrac{3}{2}$ has been measured, giving the positive quantity

$$S = 0.03675\,\frac{T}{E_F} \quad \left(\frac{\mu\mathrm{V}}{\mathrm{K}}\right). \tag{115}$$
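The numerical prefactor in the power-law form of the Mott–Jones equation follows directly from the fundamental constants. The sketch below evaluates $S = -(\pi^2/3)(k^2T/qE_F)\xi$ with $E_F$ supplied in electron volts; the Fermi energy of 7.0 eV (a typical textbook value for copper) and the function name are assumptions for illustration.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
Q = 1.602176634e-19  # magnitude of the electronic charge, C


def mott_jones_power_law(temperature_k, fermi_ev, xi):
    """Diffusion thermopower in V/K for sigma = const * E^xi."""
    fermi_j = fermi_ev * Q  # convert the Fermi energy to joules
    return -(math.pi ** 2 / 3.0) * K_B ** 2 * temperature_k / (Q * fermi_j) * xi


if __name__ == "__main__":
    # Prefactor of T * xi / E_F[eV], expressed in microvolts per kelvin.
    prefactor_uv = (math.pi ** 2 / 3.0) * (K_B / Q) ** 2 * 1e6
    print(f"prefactor = {prefactor_uv:.4f} uV/K per (K/eV)")
    s = mott_jones_power_law(300.0, 7.0, -1.5)  # assumed E_F = 7.0 eV
    print(f"S(300 K) = {s * 1e6:.2f} uV/K")
```

With CODATA constants the prefactor evaluates to about 0.0244, in line with the rounded 0.0245 quoted in the text, and the estimate for a noble metal at room temperature is of order 1 to 2 µV/K.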

It is instructive to compare the thermopower of noble metals to the electronic heat capacity, $C_{el}$, which is dependent on the density of states, $g(E_F)$, evaluated at the Fermi level. Substituting $\xi \approx -\tfrac{3}{2}$ into Eq. (114) gives

$$S = \frac{\pi^2}{2}\frac{k^2T}{qE_F} \tag{116}$$

and

$$C_{el} = \frac{\pi^2}{3}g(E_F)k^2T = \frac{\pi^2N}{2}\frac{k^2T}{E_F}, \tag{117}$$

where $N$ is the total number of carriers. Then it can be seen that the electronic heat capacity per carrier is simply the electronic charge times the thermopower,

$$\frac{C_{el}}{N} = qS. \tag{118}$$

2. Transition Elements

The electronic properties of transition metals are usually considered to have contributions from two bands that overlap at the Fermi level: the s-band, from the s levels of the individual atoms, and the d-band, consisting of five individual overlapping bands. The s-band is broad and typically approximated as free electron-like, while the d-band is narrow, with a high density of states and high effective mass, thus the s electrons carry most of the current. The relaxation time is, however, greatly affected by the high density of states of the d-band. This comes about through the inverse proportionality of the relaxation time to the probability of scattering from one wavevector, $\mathbf{k}$, to another, $\mathbf{k}'$. The occupancy and availability of each of these wavevectors are, in turn, proportional to the density of states at the Fermi level. This leads to the relationship of the inverse proportionality of the relaxation time to the density of states:

$$\frac{1}{\tau} \propto g(E)|_{E=E_F}. \tag{119}$$

Due to the relatively high density of states in the d-band, the relaxation time of the highly responsive s-band electrons is dominated by s–d transitions, or

$$\frac{1}{\tau_s} \approx \frac{1}{\tau_{s-d}} \propto g_d(E)|_{E=E_F}. \tag{120}$$

Neglecting the d-band contribution to the electrical conductivity and rewriting Eq. (110) in terms of the density of states gives

$$\sigma = \frac{2}{3}q^2v_s^2\tau_s g_s(E)|_{E=E_F} = \mathrm{const} \cdot v_s^2\left.\frac{g_s(E)}{g_d(E)}\right|_{E=E_F}. \tag{121}$$

Defining the bottom of the s-band as zero energy, and the partially filled d-band in terms of the holes in the band so it can be referenced to the top of the d-band, such that $E_0$ is the energy at the top of the d-band, and $g_d(E) = \mathrm{const} \cdot (E_0 - E_F)^{1/2}$, then approximating the s-band electrons as free electrons gives

$$\left.\frac{\partial\ln\sigma}{\partial E}\right|_{E=E_F} = \frac{3}{2E_F} + \frac{1}{2(E_0 - E_F)}. \tag{122}$$

Typically $E_F \gg (E_0 - E_F)$, such that approximating the above equation as the second term on the right-hand side and using this in the Mott–Jones equation (113) gives

$$S = -\frac{\pi^2}{6}\frac{k^2T}{q(E_0 - E_F)}. \tag{123}$$

Again, the electronic heat capacity can be compared to find

$$C_{el} = \frac{\pi^2}{6}\frac{Nk^2T}{(E_0 - E_F)}, \tag{124}$$

and the relationship between the magnitude of the electronic heat capacity and the thermopower remains

$$\frac{C_{el}}{N} = qS. \tag{125}$$

3. Semimetals

The pentavalent elements As, Sb, and Bi are semimetals with rhombohedral crystal structures. This leads to nonspherical Fermi surfaces and anisotropic scattering such that $\tau \propto k_x^s$ for a given crystallographic direction, where $s$ accounts for the anisotropy. Likewise, the density of states $g(E) \propto k_x^3$, and $k_x \propto (E_0 - E_F)^{1/2}$. Using the density of states and the relaxation time for the electrical conductivity in an equation similar to (121) gives

$$\sigma = \mathrm{const} \cdot (E_0 - E_F)^{(3+s)/2} \tag{126}$$

or, in (113),

$$S \cong -\frac{\pi^2}{6}\frac{k^2T}{q(E_0 - E_F)}(3 + s), \tag{127}$$

where $(3 + s) < 0$. There is an exception to (127) in bismuth, which shows the expected anisotropic thermopower, but an unexpected negative thermopower ($S_\perp \approx -50$ µV/K and $S_\parallel \approx -100$ µV/K at 273 K). For bismuth, a value of $\xi = (3 + s)/2$ should be used in (114) describing electron conduction, instead of using Eq. (127), which is for conduction by holes.

4. Alloys

Matthiessen's rule states that the total resistivity of an alloy formed by two metals can be found by

$$\rho = \frac{1}{\sigma} = \rho_i + \rho_j, \tag{128}$$

where $\rho_i$ is the resistivity of the pure solvent metal due to scattering of carriers by thermal vibrations, and $\rho_j$ represents scattering of carriers from impurities. This rule is often used for approximations but is not widely applicable since many cases exhibit anisotropic scattering of carriers, causing a large deviation from (128). Assuming the validity of Matthiessen's rule, (113) can be written

$$S = \frac{\pi^2}{3}\frac{k^2T}{q}\left(\frac{\partial\ln(\rho_i + \rho_j)}{\partial E}\right)_{E_F}, \tag{129}$$

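which can be written in terms of the difference between $S$ for the alloy and the thermopower of the pure solvent metal, $S_i$. Defining $\Delta S = S - S_i$ leads to

$$\frac{\Delta S}{S_i} = -\frac{1 - (x_j/x_i)}{1 + (\rho_i/\rho_j)}, \tag{130}$$

where

$$x_i = -\left(\frac{\partial\ln\rho_i}{\partial E}\right)_{E_F} \quad\text{and}\quad x_j = -\left(\frac{\partial\ln\rho_j}{\partial E}\right)_{E_F}. \tag{131}$$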

Using the Gorter–Nordheim relation for the impurity component of Matthiessen's rule, $\rho_j = CX(1 - X)$, where $C$ is the Nordheim coefficient and $X$ is the atomic fraction of the solute atoms in a solid solution, yields a more useful relationship:

$$S = S_j + \frac{\rho_i}{\rho}(S_i - S_j), \tag{132}$$

where $S_j$ is the thermopower for the impurity.

The third thermoelectric parameter listed in Table I is thermal conductivity. This can likewise be determined using (109) and substituting the transport integrals into (103). Series expansion of $K_2$ gives

$$K_2 = -\frac{\pi^2}{3}\frac{k^2T^2}{q^2}\sigma. \tag{133}$$

The thermal conductivity is given by (103), repeated here for convenience:

$$\kappa_e = \frac{1}{T}\left(K_2 - \frac{K_1K_1}{K_0}\right). \tag{103}$$

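In metals, $(\partial/\partial E)\sigma(E)|_{E=E_F} \approx (\sigma/E_F)$, thus $K_1 \approx -(\pi^2/3q^2)(kT)^2(\sigma/E_F)$, or

$$\frac{K_1K_1}{K_0} \approx \frac{\left[(\pi^2/3q^2)(kT)^2(\sigma/E_F)\right]^2}{\sigma/q^2} = \left[\frac{\pi^2}{3q^2}(kT)^2\sigma\right]\left[\frac{\pi^2}{3}\frac{(kT)^2}{E_F^2}\right], \tag{134}$$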

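giving

$$\kappa_e \approx \frac{\pi^2k^2T}{3q^2}\sigma\left(1 + \frac{\pi^2}{3}\frac{(kT)^2}{E_F^2}\right) \approx \frac{\pi^2k^2T}{3q^2}\sigma, \tag{135}$$

where the approximation $(\pi^2/3)((kT)^2/E_F^2) \ll 1$ was used, thus arriving at the Wiedemann–Franz law, or $\kappa_e/\sigma T = 2.443 \times 10^{-8}$ (W · Ω)/K².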
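The Lorenz number quoted in the Wiedemann–Franz law can be reproduced from the fundamental constants, and it gives a quick estimate of the electronic thermal conductivity from a measured electrical conductivity. In the sketch below, the conductivity of $5.96 \times 10^7$ S/m (a commonly quoted room-temperature value for copper) and the function names are assumptions for illustration.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
Q = 1.602176634e-19  # magnitude of the electronic charge, C


def lorenz_number():
    """L = kappa_e / (sigma * T) = (pi^2 / 3) * (k / q)^2, in W Ohm / K^2."""
    return math.pi ** 2 / 3.0 * (K_B / Q) ** 2


def electronic_thermal_conductivity(sigma_s_per_m, temperature_k):
    """Wiedemann-Franz estimate of kappa_e in W/(m K)."""
    return lorenz_number() * sigma_s_per_m * temperature_k


if __name__ == "__main__":
    print(f"Lorenz number = {lorenz_number():.4e} W Ohm / K^2")
    sigma_cu = 5.96e7  # assumed electrical conductivity, S/m
    kappa = electronic_thermal_conductivity(sigma_cu, 293.0)
    print(f"kappa_e estimate = {kappa:.0f} W/(m K)")
```

The estimate is an upper-level check only: it ignores the lattice contribution and any deviation of the Lorenz ratio from its degenerate-limit value.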

The total thermal conductivity, $\kappa$, must also include a lattice contribution, $\kappa_L$, such that

$$\kappa = \kappa_L + \kappa_e. \tag{136}$$

The lattice thermal conductivity for metals is generallymuch lower than the electronic contribution.

D. Semiconductors

The above analysis is applicable to normal metals, where it is assumed that the carriers are electrons and $\nabla\mu$ is a function of temperature only. Furthermore, the Onsager relations were developed using four contributions, (33), (36), (37), and (39), to the energy density rate of change; however, two additional contributions exist for semiconductors. These contributions account for transitions of electrons across the bandgap, or the rate of change in carrier concentrations in each band, and for positional gradients of the band edges (valence band and conduction band). The last contribution could arise from temperature gradients and/or compositional variations, for example. These additional contributions have the form

$$\left(\frac{\partial E}{\partial t}\right)_{\mathrm{V}} = -q\Phi_C(-\nabla \cdot \mathbf{J}_n) + q\Phi_V(-\nabla \cdot \mathbf{J}_p) \tag{137}$$

and

$$\left(\frac{\partial E}{\partial t}\right)_{\mathrm{VI}} = \mathbf{J}_n \cdot \nabla q\Phi_C - \mathbf{J}_p \cdot \nabla q\Phi_V, \tag{138}$$


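where $-q\Phi_C$ and $q\Phi_V$ represent the internal potential energies of the electrons and holes at the bottom of the conduction band and the top of the valence band, respectively. This leads to the Onsager relations for a two-band model, where, in a steady-state condition (Harman and Honig, 1967),

$$\begin{aligned}
\mathbf{J}_Q &= L_{11}\mathbf{X}_Q + L_{12}\mathbf{X}_n + L_{13}\mathbf{X}_p, \\
\mathbf{J}_- &= L_{21}\mathbf{X}_Q + L_{22}\mathbf{X}_n + L_{23}\mathbf{X}_p, \\
\mathbf{J}_+ &= L_{31}\mathbf{X}_Q + L_{32}\mathbf{X}_n + L_{33}\mathbf{X}_p,
\end{aligned} \tag{139}$$

where

$$\begin{aligned}
\mathbf{X}_Q &= -\frac{1}{T^2}\nabla T, \\
\mathbf{X}_n &= -\frac{1}{T}\nabla\phi_C + \nabla\frac{\mu_C}{qT}, \\
\mathbf{X}_p &= -\frac{1}{T}\nabla\phi_V - \nabla\frac{\mu_V}{qT}.
\end{aligned} \tag{140}$$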

Also, $\mu_C$ and $\mu_V$ represent the difference between the chemical potential energy and the internal potential energy of the carriers in the two bands. The total potential energy of the carriers in an applied field for a semiconductor must include the potential energy from the field as well as the internal potential energies, $-q\Phi_C$ and $q\Phi_V$, from the band edges. Contributions to the electrical current density come from electrons, $\mathbf{J}_- = -q\mathbf{J}_n$, and from holes, $\mathbf{J}_+ = q\mathbf{J}_p$, for the total current density given by $\mathbf{J} = \mathbf{J}_- + \mathbf{J}_+$. Applying the same procedure for this case as followed for metals above, with the additional consideration of the relative potential energies of the band edges using $\mu_V = -(E_F + E_V)$ and $\mu_C = E_F - E_C$, gives the following formulas for a two-band semiconductor:

$$\begin{aligned}
\sigma &= \sigma_n + \sigma_p, \\
S &= \frac{S_n\sigma_n + S_p\sigma_p}{\sigma_n + \sigma_p}, \\
\kappa &= \kappa_L + \kappa_n + \kappa_p + \frac{\sigma_n\sigma_p}{\sigma Tq^2}\left[\frac{K_1^n}{K_0^n} + \frac{K_1^p}{K_0^p} + (E_C - E_V)\right]^2.
\end{aligned} \tag{141}$$

Of course, as a semiconductor is doped n-type or p-type, the corresponding contributions, subscripted n or p, respectively, above will dominate. The last term in the thermal conductivity formula, when multiplied by $-\nabla T$, would relate to the transport of bandgap energy along the negative temperature gradient and is defined as an ambipolar transport mechanism.
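The conductivity-weighted average in Eq. (141) explains why mixed conduction degrades the thermopower: electron and hole contributions carry opposite signs and partially cancel. The following sketch evaluates the first two lines of Eq. (141) for hypothetical partial conductivities and partial Seebeck coefficients; none of the numbers come from the text.

```python
def two_band_conductivity(sigma_n, sigma_p):
    """Total electrical conductivity, sigma = sigma_n + sigma_p."""
    return sigma_n + sigma_p


def two_band_seebeck(sigma_n, s_n, sigma_p, s_p):
    """Conductivity-weighted thermopower, S = (S_n sigma_n + S_p sigma_p) / sigma."""
    return (s_n * sigma_n + s_p * sigma_p) / two_band_conductivity(sigma_n, sigma_p)


if __name__ == "__main__":
    s_n, s_p = -200e-6, 200e-6  # hypothetical partial thermopowers, V/K
    cases = {
        "strongly n-type": (1.0e5, 1.0e1),
        "intrinsic-like": (1.0e3, 1.0e3),
        "strongly p-type": (1.0e1, 1.0e5),
    }
    for label, (sigma_n, sigma_p) in cases.items():
        s = two_band_seebeck(sigma_n, s_n, sigma_p, s_p)
        print(f"{label:16s} S = {s * 1e6:+8.2f} uV/K")
```

In the heavily doped cases the total thermopower approaches the dominant carrier's value, while equal partial conductivities with equal and opposite partial thermopowers cancel completely.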

V. APPLICATIONS

A. Thermocouples

Thermocouples are the most common application of thermoelectric materials. Application of the Seebeck coefficient (1), along with the Thomson relation (7), allows one to determine the open-circuit potential for a circuit containing temperature gradients by integrating over temperature as one traverses through the circuit from one terminal of the open circuit to the other. For example, in the circuit shown in Fig. 6 the open-circuit voltage can be written

$$\begin{aligned}
\Delta V &= \int_{T_0}^{T_1} S_A\,dT + \int_{T_1}^{T_2} S_B\,dT + \int_{T_2}^{T_3} S_C\,dT + \int_{T_3}^{T_4} S_C\,dT + \int_{T_4}^{T_5} S_D\,dT + \int_{T_5}^{T_0} S_A\,dT \\
&= \int_{T_5}^{T_1} S_A\,dT + \int_{T_1}^{T_2} S_B\,dT + \int_{T_2}^{T_4} S_C\,dT + \int_{T_4}^{T_5} S_D\,dT.
\end{aligned} \tag{142}$$

When measuring this potential difference, care must be taken to include the contribution from the leads of the meter. This can be minimized by assuring that the thermocouple-circuit open terminals (in Fig. 6) are at a constant temperature $T_0$ and the terminals on the voltage meter are also at a constant temperature (not necessarily $T_0$).
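The segment-by-segment integration of Eq. (142) is easy to automate. The sketch below sums $\int S\,dT$ over a list of (material, start temperature, end temperature) segments using the trapezoidal rule; the Seebeck functions and junction temperatures are hypothetical placeholders, not data for real thermocouple alloys.

```python
def integrate_seebeck(seebeck, t_start, t_end, steps=1000):
    """Trapezoidal estimate of the integral of S(T) dT from t_start to t_end."""
    h = (t_end - t_start) / steps
    total = 0.5 * (seebeck(t_start) + seebeck(t_end))
    for i in range(1, steps):
        total += seebeck(t_start + i * h)
    return total * h


def open_circuit_voltage(segments):
    """Sum the contributions of each (S(T), T_start, T_end) segment, as in Eq. (142)."""
    return sum(integrate_seebeck(s, a, b) for s, a, b in segments)


if __name__ == "__main__":
    # Hypothetical, weakly temperature-dependent Seebeck coefficients in V/K.
    def s_a(t): return 2.0e-6 + 1.0e-9 * t
    def s_b(t): return 40.0e-6 + 2.0e-8 * t
    def s_c(t): return -10.0e-6
    def s_d(t): return 5.0e-6

    t1, t2, t4, t5 = 310.0, 400.0, 320.0, 300.0  # assumed junction temperatures, K
    # Reduced form of Eq. (142): T5 -> T1 in A, T1 -> T2 in B, T2 -> T4 in C, T4 -> T5 in D.
    segments = [(s_a, t5, t1), (s_b, t1, t2), (s_c, t2, t4), (s_d, t4, t5)]
    print(f"Delta V = {open_circuit_voltage(segments) * 1e6:.1f} uV")
```

Because the temperature path is closed, a loop made of a single material gives zero net voltage regardless of the temperature profile, which is a useful consistency check on the segment list.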

B. Generators and Coolers

Lenz first demonstrated a thermoelectric cooler by freezing water at the junction between two conductors formed by rods of bismuth and antimony; however, a more common configuration for a thermoelectric cooler is shown in Fig. 7. Here the cooling (or warming) junction is made more accessible for device cooling (or heating).

FIGURE 6 Thermocouple circuit.

FIGURE 7 Thermoelectric cooler.

Current is defined as positive in the direction of positive carrier flow (hole flow), or equivalently opposite to the direction of negative carrier flow (electron flow). The highest efficiency is therefore achieved by using one p-type leg and one n-type leg in the cooler (Fig. 8). In this situation, all carriers flow in the same physical direction (either top to bottom or bottom to top) in both legs. Since charge carriers also carry heat, as shown through the Onsager relations, heat will flow through the device in the direction of the carriers. Although the configuration shows a pn junction, these devices do not behave as diodes and electrical current is reversible. This is due to the fact that each of the legs is doped to degeneracy, or near-degeneracy, such that ohmic contacts with the metals are exhibited.

The goal in making a thermoelectric cooler is to maximize the coefficient of performance, $\phi$, of the device, defined as

$$\phi = \frac{Q_0}{W}, \tag{143}$$

where $Q_0$ is the rate of heat absorbed from the object being cooled and $W$ is the power required to drive the cooler. Assuming that the thermopowers of materials A and B in Fig. 7 do not vary significantly over the temperature range $T_0$ to $T_1$, the Thomson heat may be neglected, and

$$Q_0 = Q_\Pi - Q_T, \tag{144}$$

FIGURE 8 Thermoelectric cooler current flow.

where QΠ is the Peltier heat absorbed at the cold junction and QT is the thermal loss down the arms of the cooler. The Peltier heat absorbed is QΠ = Π · I, and the thermal losses down the arms consist of thermal conduction losses, K(T0 − T1), where K is the thermal conductance of the arms, and Joule heating losses, ½I²R. The factor of ½ on the Joule heating losses arises because half of this heat flows to the cold end and half flows to the warm end of the cooler. Substituting gives

$$Q_0 = \Pi I - \tfrac{1}{2} I^2 R - K (T_0 - T_1). \tag{145}$$

Maximizing Q0 with respect to current yields Π = I · R, or Imax = Π/R. Using the Kelvin relations,

$$I_{\max} = \frac{(S_A - S_B)\,T_1}{R}. \tag{146}$$

In steady state with no heat load, Q0 = 0, and the maximum temperature difference ΔTmax = (T0 − T1) is

$$\Delta T_{\max} = \frac{1}{2}\,\frac{(S_A - S_B)^2}{RK}\,T_1^2 = \frac{1}{2}\, Z\, T_1^2, \tag{147}$$

where Z is defined as the figure of merit for the cooler. Equation (147) clearly shows that the maximum temperature difference is increased by choosing materials with the largest difference in thermopower values. Therefore, the logical choice is to use one n-type and one p-type material, as mentioned previously.
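As a quick numerical check of Eqs. (145) to (147), the following minimal Python sketch evaluates Imax, Z, and ΔTmax and confirms that the net heat pumped vanishes at the maximum temperature difference. The device parameters (thermopower difference, resistance, thermal conductance) are hypothetical values chosen only for illustration; they do not come from the article.

```python
def max_cooling(seebeck_diff, resistance, conductance, t_cold):
    """Return (I_max, Z, dT_max) following Eqs. (146) and (147)."""
    i_max = seebeck_diff * t_cold / resistance            # Eq. (146)
    z = seebeck_diff ** 2 / (resistance * conductance)    # device figure of merit
    dt_max = 0.5 * z * t_cold ** 2                        # Eq. (147)
    return i_max, z, dt_max


def heat_pumped(current, seebeck_diff, resistance, conductance, t_hot, t_cold):
    """Net heat absorbed at the cold junction, Eq. (145), with Pi = S * T1."""
    peltier = seebeck_diff * t_cold
    return (peltier * current
            - 0.5 * current ** 2 * resistance
            - conductance * (t_hot - t_cold))


# Hypothetical couple: S_A - S_B in V/K, R in ohm, K in W/K, T1 in K.
S, R, K, T1 = 4.0e-4, 1.0e-2, 8.0e-3, 263.0

i_max, z, dt_max = max_cooling(S, R, K, T1)
q0_at_limit = heat_pumped(i_max, S, R, K, T1 + dt_max, T1)

print(f"I_max = {i_max:.2f} A, Z = {z:.1e} 1/K, dT_max = {dt_max:.1f} K")
print(f"Q0 at I_max and dT_max = {q0_at_limit:.2e} W")
```

At Imax the Peltier term is exactly twice the Joule term, so the remaining half balances the conduction loss K·ΔTmax, which is what the last line verifies.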

Continuing with the evaluation of the coefficient of performance for the cooler, the power absorbed by the device is simply the product of the current and voltage supplied to the cooler, or

$$W = IV = I\,\{IR + (S_A - S_B)(T_0 - T_1)\}, \tag{148}$$

where the voltage across the device includes the resistive and thermoelectric voltage drops. Dividing this into Q0 yields the coefficient of performance,

$$\varphi = \frac{\Pi I - \tfrac{1}{2} I^2 R - K (T_0 - T_1)}{I^2 R + (S_A - S_B)(T_0 - T_1)\, I}. \tag{149}$$


Taking the derivative with respect to current and setting it equal to zero gives

$$\frac{d\varphi}{dI} = 0 = \frac{(\Pi - IR)\,[S\Delta T\, I + I^2 R] - [S\Delta T + 2IR]\,[\Pi I - \tfrac{1}{2} I^2 R - K (T_0 - T_1)]}{[S\Delta T\, I + I^2 R]^2}, \tag{150}$$

where the substitutions S = (SA − SB) and ΔT = (T0 − T1) were used. After expansion and cancellation in the numerator,

$$\frac{d\varphi}{dI} = 0 = \frac{I^2\,[-R\Pi - \tfrac{1}{2} R\, S\Delta T] + I\,[2KR(T_0 - T_1)] + K S\, \Delta T^2}{[S\Delta T\, I + I^2 R]^2}. \tag{151}$$

Substituting Π = (SA − SB)T1 for the Peltier heat removed at the cold junction gives

$$0 = I^2 \left\{ -R (S_A - S_B) \left[ T_1 + \tfrac{1}{2} (T_0 - T_1) \right] \right\} + I\,[2KR(T_0 - T_1)] + K (S_A - S_B)(T_0 - T_1)^2. \tag{152}$$

Solving this quadratic equation yields the optimum current, at which the coefficient of performance is a maximum,

$$I_{\mathrm{opt}} = \frac{(S_A - S_B)(T_0 - T_1)}{R\left(\sqrt{1 + Z\bar{T}} - 1\right)}, \tag{153}$$

where T̄ is the average temperature ½(T0 + T1). Using this in Eq. (143) yields

$$\varphi_{\mathrm{opt}} = \frac{T_1}{T_0 - T_1}\;\frac{\sqrt{1 + Z\bar{T}} - (T_0/T_1)}{\sqrt{1 + Z\bar{T}} + 1}, \tag{154}$$

where the first factor represents the coefficient of performance for an ideal heat pump. This shows that both ϕ and ΔT are directly dependent on the figure of merit, Z. Thus maximizing the figure of merit for the individual materials,

$$Z = \frac{S^2}{\rho\kappa} = \frac{S^2 \sigma}{\kappa}, \tag{155}$$

maximizes the efficiency of the cooler. Desirable materials have large-magnitude thermopowers, S (one n-type and one p-type), low electrical resistivities, ρ (equivalently, high electrical conductivities, σ), and low thermal conductivities, κ. Since the figure of merit has units of K−1, the dimensionless quantity ZT is often reported.
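The optimization in Eqs. (149) to (154) can be checked numerically. The sketch below, in Python, evaluates the coefficient of performance of Eq. (149) as a function of current and compares its value at the current of Eq. (153) with the closed form of Eq. (154). The device parameters are hypothetical and serve only as an illustration; the function names are not from the article.

```python
import math


def cop(current, s, r, k, t_hot, t_cold):
    """Coefficient of performance, Eq. (149), with Pi = s * t_cold."""
    dt = t_hot - t_cold
    q0 = s * t_cold * current - 0.5 * current ** 2 * r - k * dt
    w = current ** 2 * r + s * dt * current
    return q0 / w


def optimum(s, r, k, t_hot, t_cold):
    """Return (I_opt, cop_opt) from Eqs. (153) and (154)."""
    dt = t_hot - t_cold
    z = s ** 2 / (r * k)
    m = math.sqrt(1.0 + z * 0.5 * (t_hot + t_cold))   # sqrt(1 + Z * T_avg)
    i_opt = s * dt / (r * (m - 1.0))
    cop_opt = (t_cold / dt) * (m - t_hot / t_cold) / (m + 1.0)
    return i_opt, cop_opt


# Hypothetical couple: S_A - S_B in V/K, R in ohm, K in W/K, temperatures in K.
S, R, K = 4.0e-4, 1.0e-2, 8.0e-3
T0, T1 = 303.0, 283.0

i_opt, cop_opt = optimum(S, R, K, T0, T1)
print(f"I_opt = {i_opt:.3f} A, closed-form COP = {cop_opt:.4f}")
print(f"Eq. (149) evaluated at I_opt = {cop(i_opt, S, R, K, T0, T1):.4f}")
```

Evaluating Eq. (149) slightly above or below Iopt gives a smaller value, as expected for a maximum.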

It should also be noted that the Peltier heat, QΠ = Π · I, is either absorbed or liberated depending on the current direction. Therefore, the same configuration can be used as either a thermoelectric cooler or a heater.

For comparison, and to illuminate the present challenge, the coefficient of performance for standard Freon-based refrigeration systems is 1.2 to 1.4 for a refrigerator operating at a cold temperature of 263 K while the outside (hot) temperature is 323 K. Freon-based cooling systems thus have coefficients of performance that would correspond to a thermoelectric device with ZT between 3 and 4. Also shown in Fig. 9 is the COP for the present value of ZT ∼ 1. The advantages of thermoelectric devices include size scalability without loss of efficiency, robustness, low maintenance, a relatively small electromagnetic signature, and the ability both to heat and to cool with a single device; they are also environmentally cleaner than conventional CFC-based coolers. Many thermoelectric companies presently exist, indicating an existing market in which any increase in ZT through a new material and/or configuration could have a direct impact; however, a significant increase in the market is anticipated if ZT increases to 2. This, therefore, represents the current goal in Fig. 9.
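The comparison with Freon-based systems can be reproduced approximately from Eq. (154). The Python sketch below tabulates the optimum coefficient of performance against ZT for the quoted temperatures of 263 K and 323 K; treating ZT as a single number evaluated at the mean temperature is an assumption made here for simplicity, and the function name is illustrative.

```python
import math


def cop_opt_from_zt(zt, t_hot, t_cold):
    """Optimum coefficient of performance, Eq. (154), for a given Z*T_avg."""
    m = math.sqrt(1.0 + zt)
    carnot_cop = t_cold / (t_hot - t_cold)   # ideal heat pump factor
    return carnot_cop * (m - t_hot / t_cold) / (m + 1.0)


T_HOT, T_COLD = 323.0, 263.0   # temperatures quoted in the text, in K

for zt in (1.0, 2.0, 3.0, 4.0):
    print(f"ZT = {zt:.0f}: COP = {cop_opt_from_zt(zt, T_HOT, T_COLD):.2f}")
```

Under these assumptions, ZT between 3 and 4 gives values of roughly 1.1 to 1.4, in line with the range quoted for Freon-based refrigeration, while ZT near 1 gives a value well below 1.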

These devices are heat pumps. It is also possible to remove the electrical power source and impose a temperature gradient across the thermoelectric device by placing one end of it in contact with an external heat source. With a load connected to the device instead of the electrical power source, it then functions as a thermoelectric generator. Thus, the application of an electrical potential gradient causes the generation of a temperature gradient (thermoelectric cooler), and the application of a temperature gradient causes the generation of electrical power (thermoelectric generator).

FIGURE 9 The figure of merit versus the coefficient of performance.


In the case of a generator, the efficiency, η, of the device is defined as the ratio of the power supplied to the load to the heat absorbed at the hot junction:

$$\eta = \frac{T_H - T_C}{T_H}\;\frac{\sqrt{1 + Z\bar{T}} - 1}{\sqrt{1 + Z\bar{T}} + (T_C/T_H)}. \tag{156}$$

This is again dependent on the figure of merit of the device. Through Thomson's relations we can split the figure of merit for the device into a figure of merit for each of the two legs. When each of these has been maximized individually, the total device figure of merit will also be maximized, assuming that one leg is n-type and one is p-type.
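To illustrate how Eq. (156) depends on the figure of merit, the following Python sketch evaluates the generator efficiency for a few values of ZT. The junction temperatures are hypothetical, and ZT is again treated as a single number at the mean temperature.

```python
import math


def generator_efficiency(zt, t_hot, t_cold):
    """Generator efficiency from Eq. (156) for a given Z*T_avg."""
    carnot = (t_hot - t_cold) / t_hot
    m = math.sqrt(1.0 + zt)
    return carnot * (m - 1.0) / (m + t_cold / t_hot)


T_H, T_C = 500.0, 300.0   # hypothetical hot and cold junction temperatures, K

for zt in (0.5, 1.0, 2.0, 4.0):
    eta = generator_efficiency(zt, T_H, T_C)
    print(f"ZT = {zt:.1f}: efficiency = {100 * eta:.1f}%")

print(f"Carnot limit = {100 * (T_H - T_C) / T_H:.1f}%")
```

The efficiency is zero for ZT = 0 and approaches the Carnot factor only as ZT becomes very large, which is why the figure of merit sets the practical limit.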

C. New Directions

Traditional materials used in thermoelectric devices are listed in Table II. Near-room-temperature devices have been designed largely for cooling applications, while higher-temperature materials have generally been used in electrical power generation. Research on thermoelectrics was highly active during the decade following 1954, with the United States showing great interest in high-temperature power generation applications, such as the Si–Ge-based generators used on the Voyager I and II spacecraft. Recently there has been a resurgence of interest in thermoelectrics, spurred on partly by predictions of the high ZT values possible in quantum-confined structures (Hicks and Dresselhaus, 1993). It was predicted that in such structures both the electrical conductivity and the thermopower could be increased simultaneously, owing to the sharpening of the density of states as confinement increases from 3D → 2D → 1D → 0D (Broido and Reinecke, 1995). The influence of such sharpening can be seen clearly in the Mott–Jones equation for the thermopower, Eq. (113). An indication of the effect of a rapidly varying density of states comes from mixed-valent compounds such as CePd3 and YbAl3, which have shown the largest power factors, σS², among all known materials. Unfortunately, the high thermal conductivity of these materials prevents them from having a correspondingly high figure of merit. An additional increase in ZT for quantum-confined materials comes from a decrease in the thermal conductivity due to confinement barrier scattering.

TABLE II The Most Widely Used TE Materials

Material   Zmax (K−1)     Useful range (K)   Tmax (K)
Bi2Te3     3 × 10−3       <500               300
PbTe       1.7 × 10−3     <900               650
Si–Ge      1 × 10−3       <1300              1100

TABLE III Desirable Material Properties for Thermoelectric Applications (Kanatzidis, 2001)

1. Many valley bands near the Fermi level, but located away from the Brillouin zone boundaries.
2. Large atomic number elements with large spin–orbit coupling.
3. Compositions with two or more elements such as ternaries and quaternaries.
4. Low average electronegativity differences between elements.
5. Large unit cells.
6. Energy gaps near 10 kT.

Another avenue for investigating thermoelectric materials has been coined the “phonon glass electron crystal” (PGEC) method (Slack, 1995), in which short phonon mean free paths and long electron mean free paths are simultaneously sought in a material. A suggested way for a material to exhibit PGEC behavior is to incorporate cages and/or tunnels in its crystal structure that are large enough to accommodate an atom. The caged atom provides strong phonon scattering by rattling within the cage. Electrons would not be significantly scattered by such “rattlers” since the main crystal structure would remain intact.

Additional guidance has been provided by identifying a B parameter defined as

$$B = \gamma\,\frac{1}{3\pi^2}\left(\frac{2kT}{h^2}\right)^{3/2}\sqrt{m_x m_y m_z}\;\frac{k^2}{q\,\kappa_L}\,\mu_x, \tag{157}$$

where γ is the degeneracy parameter (Hicks and Dresselhaus, 1993). This function should be maximized for optimal ZT. With a large number of valleys within a band, fewer carriers exist in each valley, which increases the contribution to the thermopower from that valley. At the same time, the total number of carriers can be maintained for a high electrical conductivity. High degeneracy parameters are generally found in highly symmetric crystal systems. Large effective masses, or large effective mass components along the axes perpendicular to the current flow, allow for a high electrical conductivity in the direction of interest while maintaining a high B parameter. Equation (157) also indicates that a high mobility, µx, in the transport direction and a low lattice thermal conductivity, κL, are desirable. It has recently been shown that semiconductors with bandgaps of approximately 10 kT best satisfy these criteria (Mahan, 1998). Six properties of thermoelectric materials that give the best results are listed in Table III.

VI. SUMMARY

Macroscopic and atomistic derivations of the thermoelectric properties of electrical conductivity, thermoelectric power, and thermal conductivity have been presented, with approximations for various material systems. The derivations outlined have considered external forces of electric fields and temperature gradients. Additional effects are realized when more forces such as magnetic fields are included. These include the Hall, magnetoresistance, Nernst, Ettingshausen, and Righi–Leduc effects. Magnetic fields also affect the operation of thermoelectric coolers, with significant enhancements of the efficiency possible under strong fields.

ACKNOWLEDGMENT

I wish to thank Sangeeta Lal, of Bihar University, for her helpful review of the manuscript.

SEE ALSO THE FOLLOWING ARTICLES

ELECTROMAGNETICS • ELECTRONS IN SOLIDS • SEMICONDUCTOR ALLOYS • SUPERCONDUCTIVITY • THERMODYNAMICS • THERMOMETRY

BIBLIOGRAPHY

Barnard, R. D. (1972). “Thermoelectricity in Metals and Alloys,” Halsted Press (Division of John Wiley & Sons), New York.

Broido, D. A., and Reinecke, T. L. (1995). “Thermoelectric figure of merit of quantum wire superlattices,” Appl. Phys. Lett. 67(1), 100–102.

Gray, P. E. (1960). “The Dynamic Behavior of Thermoelectric Devices,” Technology Press of the Massachusetts Institute of Technology/John Wiley & Sons, New York.

Guggenheim, E. A. (1957). “Thermodynamics,” 3rd ed., North-Holland, Amsterdam.

Harman, T. C., and Honig, J. M. (1967). “Thermoelectric and Thermomagnetic Effects and Applications,” McGraw–Hill, New York.

Hicks, L. D., and Dresselhaus, M. S. (1993). “Effect of quantum-well structures on the thermoelectric figure of merit,” Phys. Rev. B 47(19), 12 727–12 731.

Ioffe, A. F. (1957). “Semiconductor Thermoelements and Thermoelectric Cooling,” Infosearch, London.

Kanatzidis, M. G. (2001). The role of solid-state chemistry in the discovery of new thermoelectric materials. In “Solid State Physics” (H. Ehrenreich and F. Spaepen, eds.), Vol. 69, pp. 51–100, Academic Press, New York.

Mahan, G. D. (1998). Good thermoelectrics. In “Solid State Physics” (H. Ehrenreich and F. Spaepen, eds.), Vol. 51, pp. 82–157, Academic Press, New York.

Roberts, R. B. (1977). “The absolute scale of thermoelectricity,” Philos. Mag. 36(1), 91–107.

Slack, G. A. (1995). New materials and performance limits for thermoelectric cooling. In “CRC Handbook of Thermoelectrics” (D. M. Rowe, ed.), CRC Press, New York.

Wendling, N., Chaussy, J., and Mazuer, J. (1993). “Thin gold wires as reference for thermoelectric power measurements of small samples from 1.3 K to 350 K,” J. Appl. Phys. 73(6), 2878–2881.


Encyclopedia of Physical Science and Technology EN017D-789 August 3, 2001 16:30

Transmission Electron Microscopy

S. Amelinckx
J. Van Landuyt
University of Antwerp

I. Bragg’s Law
II. Reciprocal Lattice: Ewald Construction
III. Convergent-Beam Electron Diffraction
IV. Kinematical Diffraction Theory
V. Dynamical Theory
VI. Multiple-Beam Diffraction
VII. Diffraction Contrast Images
VIII. Column Approximation
IX. Moiré Fringes
X. Dislocation Contrast
XI. Weak-Beam Imaging
XII. Computer Simulation of Dislocation Images
XIII. Diffraction Contrast at Planar Interfaces
XIV. Image Formation in an Ideal Microscope
XV. Image Formation in a Real Microscope
XVI. Image Formation of a Weak-Phase Object
XVII. Optimum Defocus Images
XVIII. Lattice Images
XIX. Imaging Modes
XX. Scanning Electron Microscopes
XXI. High-Voltage Electron Microscopy
XXII. Analytical Electron Microscopy
XXIII. Specimen Preparation for Transmission Electron Microscopy
XXIV. Examples of Applications

GLOSSARY

Absorption contrast Image formation mechanism resulting from local differences in density or absorption of the material.

Anomalous absorption Orientation-dependent absorption of electrons described by a complex extinction distance.

Bormann effect Maximum in the rocking curve of the transmitted beam for a slightly positive excitation error. The phenomenon was first discovered in X-ray diffraction.

Column approximation Approximation made with the assumption that electrons propagate through the specimen along narrow columns. These columns can be considered as the “picture elements” of the diffraction contrast image.

Diffraction contrast Image formation mechanism in transmission electron microscopy resulting from local differences in orientation, thickness, or structure factor. The image consists of the intensity distribution in a single diffraction spot. If this spot is the transmitted beam, the image is called a bright-field image. If it is a diffracted beam, the resulting image is a dark-field image. A dark-field image in a reflection for which the excitation error s has a large value is called a weak-beam image.

Dynamical diffraction theory Diffraction theory that takes into account multiple scattering events.

Ewald construction Geometrical construction that allows the directions of scattered beams to be derived based on the concept of the reciprocal lattice.

Ewald sphere Sphere in reciprocal space with a radius of 1/λ used in the Ewald construction (λ is the electron wavelength).

Excitation error Distance in reciprocal space from a reciprocal lattice node to the Ewald sphere, describing the deviation from the exact Bragg condition.

High-resolution electron microscopy The image is formed by the interference between a large number of diffracted beams; with modern instruments, this allows the resolution of well-separated atom columns in crystal structures.

Kinematical diffraction theory Diffraction theory that takes into account only single scattering events. It is a reasonable approximation for neutron and X-ray scattering but not for electron scattering.

Pendellösung effect Periodic transfer of intensity from the incident beam to the scattered beam in a two-beam situation. At the exact Bragg condition, the depth period of this transfer is the extinction distance, which depends on the reflection.

Reciprocal lattice Lattice of which the base vectors A1, A2, and A3 are related to the base vectors a1, a2, and a3 of the direct lattice by the relation

$$\mathbf{A}_1 = \frac{\mathbf{a}_2 \times \mathbf{a}_3}{V}, \qquad \mathbf{A}_2 = \frac{\mathbf{a}_3 \times \mathbf{a}_1}{V}, \qquad \mathbf{A}_3 = \frac{\mathbf{a}_1 \times \mathbf{a}_2}{V}.$$

Rocking curve Functional dependence of the transmitted and scattered intensity on the excitation error (i.e., on the specimen orientation).

Scherzer focus Focusing condition leading to optimal image resolution in high-resolution transmission electron microscopy. The Scherzer focus does not coincide with the Gaussian focus considered in geometrical optics.

TRANSMISSION electron microscopy combined with electron diffraction has become an important tool in the study of the structural geometry of solids, which in many respects is complementary to other diffraction techniques such as neutron and X-ray diffraction. Electron microscopy has been especially successful in the study of crystal defects (e.g., dislocations, stacking faults, antiphase boundaries, domain boundaries, inversion boundaries, and discommensurations) as well as long-period superstructures resulting from the periodic arrangement of such defects (i.e., modulated structures, long-period antiphase boundary structures, and polytypes).

The existence of quasi-crystals, that is, objects without translation symmetry but with noncrystallographic rotation symmetry elements, such as 5-fold and 10-fold axes, was first demonstrated in 1984 by means of electron diffraction and electron microscopy.

Transmission electron microscopy finds applications in a wide variety of disciplines (e.g., solid-state physics, solid-state chemistry, physical metallurgy, mineralogy, and geology); for instance, the mineralogy of moon rocks was studied mainly by means of electron microscopy.

In recent years the application of high-resolution electron microscopy to problems of crystal chemistry has been expanding very rapidly, and recently it has, in particular, been applied with considerable success to the study of high-critical-temperature (Tc) superconductors. The study of the atomic structure of surfaces is another very recent application.

In 1986 the Nobel prize for physics was awarded to the inventors of the electron microscope. Although the invention dates back to 1937, it was not until 1958 that transmission electron microscopy was used in the study of solids. Rapid progress, following closely the progress in electron optics, has been made since then; during the 1980s atomic resolution became possible in many materials.

I. BRAGG’S LAW

The contrast of images produced in transmission electron microscopy of crystalline solids is due mainly to diffraction effects and, to a smaller extent, to absorption. In direct space the diffraction of electrons by a crystal lattice can be described in terms of Bragg’s law (Fig. 1c):

$$2 d_H \sin\theta_n = n\lambda, \tag{1}$$

where dH is the interplanar spacing of the lattice planes with Miller indices H (hkl), θn the Bragg angle, n an integer (i.e., the order of the reflection), and λ the de Broglie wavelength of the monokinetic electrons. This relation defines the angles θn (i.e., the orientations of the crystal for which the waves scattered by successive lattice planes are in phase and produce a peak in scattered intensity). The peaks are very sharp because many unit cells contribute to the interference phenomenon.


FIGURE 1 (a) Ewald construction representing the diffraction conditions in reciprocal space. (b) A small deviation from the exact diffraction conditions is present. (c) Bragg’s law representing diffraction in real space as “reflection” against a set of lattice planes. Constructive interference occurs if 2dH sin θn = nλ. [From Amelinckx, S. (1964). “The Direct Observation of Dislocation,” Solid State Physics, Suppl. 6, Academic Press, New York.]

The electron wavelength λ is related to the accelerating voltage V by the relation

$$\lambda = \frac{h}{\left[ 2 m_0 e V \left( 1 + \dfrac{eV}{2 m_0 c^2} \right) \right]^{1/2}}, \tag{2}$$

which takes into account relativistic effects, where h is the Planck constant, m0 the electron rest mass, e the electron charge, and c the velocity of light in vacuum. For a typical value V = 100 kV the wavelength is λ = 0.0037 nm. Since dH is of the order of 0.1 nm, the Bragg angles as deduced from Eq. (1) are very small, of the order of 10−2 to 10−3 rad.
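The numbers quoted above follow directly from Eqs. (1) and (2). The short Python sketch below evaluates the relativistic wavelength and the first-order Bragg angle; the constants are standard SI values, while the interplanar spacing of 0.2 nm is a hypothetical example and the function names are illustrative.

```python
import math

# Physical constants in SI units.
H = 6.62607015e-34       # Planck constant, J s
M0 = 9.1093837015e-31    # electron rest mass, kg
E = 1.602176634e-19      # elementary charge, C
C = 2.99792458e8         # speed of light in vacuum, m/s


def electron_wavelength(voltage):
    """Relativistic de Broglie wavelength in metres, Eq. (2)."""
    relativistic = 1.0 + E * voltage / (2.0 * M0 * C ** 2)
    return H / math.sqrt(2.0 * M0 * E * voltage * relativistic)


def bragg_angle(d_spacing, wavelength, order=1):
    """Bragg angle in radians from Eq. (1); lengths in the same units."""
    return math.asin(order * wavelength / (2.0 * d_spacing))


lam = electron_wavelength(100e3)        # 100 kV accelerating voltage
theta = bragg_angle(0.2e-9, lam)        # hypothetical d_H = 0.2 nm

print(f"wavelength  = {lam * 1e9:.5f} nm")
print(f"Bragg angle = {theta:.2e} rad")
```

For 100 kV this gives a wavelength of about 0.0037 nm and a Bragg angle of about 10−2 rad, consistent with the orders of magnitude stated in the text.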

II. RECIPROCAL LATTICE: EWALD CONSTRUCTION

An alternative description makes use of the notion of the reciprocal lattice. The basic vectors A1, A2, and A3 of the reciprocal lattice are given in terms of the basic vectors a1, a2, and a3 of the direct lattice by the expressions

$$\mathbf{A}_1 = \frac{\mathbf{a}_2 \times \mathbf{a}_3}{V}, \qquad \mathbf{A}_2 = \frac{\mathbf{a}_3 \times \mathbf{a}_1}{V}, \qquad \mathbf{A}_3 = \frac{\mathbf{a}_1 \times \mathbf{a}_2}{V},$$

where V = (a1 × a2) · a3 is the volume of the unit cell of the lattice.
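As an illustration of these definitions, the following self-contained Python sketch builds the reciprocal basis from a direct basis and checks the defining property A_i · a_j = δ_ij. The hexagonal-like lattice parameters (in nm) are hypothetical, and the helper names are chosen here for illustration.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))


def reciprocal_lattice(a1, a2, a3):
    """Return ((A1, A2, A3), V) for the direct basis (a1, a2, a3)."""
    volume = dot(cross(a1, a2), a3)
    big_a1 = tuple(c / volume for c in cross(a2, a3))
    big_a2 = tuple(c / volume for c in cross(a3, a1))
    big_a3 = tuple(c / volume for c in cross(a1, a2))
    return (big_a1, big_a2, big_a3), volume


# Hypothetical hexagonal-like cell, lengths in nm.
direct = ((0.30, 0.0, 0.0), (-0.15, 0.2598, 0.0), (0.0, 0.0, 0.50))
recip, cell_volume = reciprocal_lattice(*direct)

for i, big_a in enumerate(recip):
    row = [round(dot(big_a, a), 12) for a in direct]
    print(f"A{i + 1} . (a1, a2, a3) = {row}")
```

Each reciprocal vector is perpendicular to two of the direct vectors and has unit projection on the third, so the printed rows form the identity matrix.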

The geometry of the diffraction phenomenon is now described by the Ewald construction (Fig. 1a), which has the same physical content as Bragg’s law. If k0 = (1/λ)e0 (e0 is a unit vector) is the wave vector of the incident beam, and g is a reciprocal lattice vector such that d = 1/|g|, the diffraction condition is given by

$$\mathbf{k} = \mathbf{k}_0 + \mathbf{g}, \tag{3}$$

where k is the wave vector of the scattered wave, k = (1/λ)es, with es the unit vector along the scattered beam. This leads to the geometrical construction represented in Fig. 1a, where 0 is the origin of the reciprocal lattice. Diffraction maxima will occur in the direction of k each time a node point of the reciprocal lattice touches a sphere with radius 1/λ, the Ewald sphere, whose center C is situated at −k0 drawn from the origin 0.

In electron diffraction the wavelength λ ≈ 0.0037 nm is small compared to the unit cell edge (≈0.2 nm), and as a result, the radius of the Ewald sphere (1/λ) is large compared to the unit cell edges (|g| = 1/d) of the reciprocal lattice. The Ewald sphere can therefore be approximated by a plane normal to the wave vector k0 of the incident beam.

Since electrons are absorbed in solid matter, it is necessary to use very thin foil specimens (<200 nm). The diffraction conditions are then significantly relaxed since only a small number of unit cells along the normal to the foil can contribute to the diffraction phenomena. However, in directions parallel to the foil the peaks remain sharp. This relaxation of the diffraction conditions is represented in reciprocal space by letting the reciprocal lattice nodes become thin rods perpendicular to the foil plane, instead of points (Fig. 1b). It is now sufficient that the Ewald sphere intersects such a rod to produce a diffracted beam, of which the direction is obtained by joining the center C of the Ewald sphere with this intersection point. As a result, many beams can be excited simultaneously, especially in thin foils, and the diffraction pattern along a simple zone axis becomes a two-dimensional array of spots, which reflects the crystal symmetry and can, in fact, be considered a planar section of the reciprocal lattice (Fig. 2).

The relaxation of the diffraction conditions makes it possible to have a diffracted beam even though the relation in Eq. (1) is not exactly fulfilled. The intensity is then a function of the deviation from the exact Bragg condition. In reciprocal space this deviation can be described by the vector s leading from the reciprocal lattice node point to the intersection point of the rod with the Ewald sphere. Conventionally s will be positive when the reciprocal lattice node point is inside the Ewald sphere and negative when it is outside the sphere (Fig. 1b).


FIGURE 2 Electron diffraction pattern of a thin foil reflecting the crystal symmetry along the zone axis. The diffraction pattern is a planar section of the reciprocal lattice. (a) Reciprocal lattice construction and (b) corresponding observed diffraction pattern.

Apart from sharp diffraction spots produced by very thin foils, diffraction patterns consisting of lines (Kikuchi lines) are produced in thicker specimens (Fig. 3). These lines form along the intersection lines of a flat double cone with the photographic plate. They are, in fact, hyperbolas, which can be approximated by their asymptotes. Each double cone is formed by the geometrical locus of the diffracted beams due to one family of lattice planes, produced by electrons that have been randomly and inelastically scattered in the entrance part of the specimen. The semiapex angle of the cones is 90° − θ (i.e., they are very flat, since θ is small) (Fig. 4). The lines occur in pairs, one produced by each blade of the double cone. The one due to forward scattering is brighter than the background; the other one is darker. Their angular separation is 2θ. Since each cone pair is strictly bound to one set of lattice planes, rotating the specimen over an angle α causes a parallel displacement of the lines over the same angular range. The Kikuchi lines are of practical importance since they allow the determination of the magnitude and the sense of s from the relative positions of the sharp Bragg spot and the corresponding Kikuchi line (see Fig. 4).

FIGURE 3 The Kikuchi line pattern in a relatively thick foil. Note the pairs of bright and dark lines emphasized by the drawn lines.

FIGURE 4 (a) Formation of Kikuchi lines as the intersection lines of a double cone of Bragg reflected beams with the photographic plate. (b) Relation between the positions of Kikuchi lines and Bragg spots.

III. CONVERGENT-BEAM ELECTRON DIFFRACTION

All diffraction techniques described so far make use of a parallel incident beam of electrons. Recently it has been realized that more, and more accurate, information can be obtained from convergent-beam electron diffraction patterns. A convergent beam of electrons is focused on the specimen, and the crossover is situated in the specimen, which is flooded by electrons incident within a wide solid angle of orientations. Bragg scattering then takes place simultaneously with many sets of lattice planes, much as in the case of Kikuchi lines; however, whereas Kikuchi lines are formed by inelastically scattered electrons, convergent-beam patterns are due to elastic scattering.


FIGURE 5 Convergent-beam electron diffraction pattern; the pattern indicates threefold symmetry. [From Tanaka, M. (1978). JEOL News 16E, 13.]

Convergent-beam patterns yield information on the space group of the crystal, including the presence or absence of a center of symmetry, and allow structure amplitude measurements and study of the lattice potential. Furthermore, they allow precise lattice parameter measurements of very small areas.

Convergent-beam electron diffraction patterns are the electron analogue of Kossel lines in X-ray diffraction (Fig. 5).

IV. KINEMATICAL DIFFRACTION THEORY

According to the kinematical theory of diffraction, the incident radiation is scattered only once, and although the scattered beam is in a position to produce Bragg scattering with the same set of lattice planes, such processes are neglected. Moreover, it is assumed that the incident beam is not depleted, so each scatterer sees the same intensity of incident radiation. This approximation is acceptable in X-ray and neutron diffraction, where the probability of multiple scattering is relatively low. But it is only a poor approximation in electron diffraction, where the interaction between the atoms and the incident electrons is very strong. Nevertheless, this theory allows a qualitative understanding of a wide variety of phenomena, and moreover, the geometrical features of the diffraction patterns are, in general, correctly predicted. Also, other features of the diffraction phenomena are qualitatively well described provided s is not very small. However, the intensities of diffraction spots predicted by the kinematical theory are usually not in agreement with the observations, due to the absence of a simple relation between intensity and structure factor in electron scattering.

According to this theory the amplitude of the scattered beam is given by the expression

$$A_S = F\,\frac{\sin \pi s t}{\pi s}, \tag{4}$$

where s is the deviation parameter, t the foil thickness, and F the structure amplitude. It is obvious from this expression that the amplitude of the scattered beam depends on s in the oscillatory way represented in Fig. 6. This curve is called the rocking curve since it gives the scattered intensity as a function of the angle of incidence. This curve can be visualized directly in the electron microscope by making a dark-field image in a diffracted beam produced by a cylindrically bent crystal; a wide range of angles of incidence (i.e., of s values) is then present simultaneously. The geometrical loci of the points with equal inclination (i.e., equal s values) are lines parallel with the axis of the cylinder. The separation of the zeros is 1/t and the central peak is 2/t wide (Figs. 6 and 7a).

A rocking curve for the transmitted beam can be obtained by subtracting IS from unity, IT = 1 − IS, since absorption is neglected.

Similarly, by making a dark-field image of a wedge-shaped crystal, the intensity scattered in a fixed direction (i.e., in a given reflection) can be imaged as a function of thickness. The geometrical loci of the points of equal thickness (i.e., of equal intensity) are lines parallel with the wedge edge (Fig. 7b). The depth period is given by 1/s and becomes infinite for s → 0. However, for s = 0 the kinematical theory is no longer valid and the dynamical theory has to be used.
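The properties of the kinematical rocking curve stated above (zeros separated by 1/t, and a depth period of 1/s at fixed s) can be verified with a few lines of Python. The sketch below implements Eq. (4) and its square; the foil thickness and excitation errors are hypothetical values in nm and nm−1, and F is set to 1 for illustration.

```python
import math


def kinematical_amplitude(s, t, f=1.0):
    """Scattered amplitude of Eq. (4); the s -> 0 limit is f * t."""
    if s == 0.0:
        return f * t
    return f * math.sin(math.pi * s * t) / (math.pi * s)


def kinematical_intensity(s, t, f=1.0):
    """Scattered intensity, the square of the amplitude of Eq. (4)."""
    return kinematical_amplitude(s, t, f) ** 2


THICKNESS = 50.0   # hypothetical foil thickness, nm

# Rocking curve: intensity versus excitation error at fixed thickness.
for n in (0.0, 0.5, 1.0, 1.5, 2.0):
    s = n / THICKNESS
    print(f"s = {n:.1f}/t: I_S = {kinematical_intensity(s, THICKNESS):10.3f}")

# Thickness fringes: at fixed s the intensity repeats with depth period 1/s.
S_FIXED = 0.05     # hypothetical excitation error, 1/nm
print(kinematical_intensity(S_FIXED, 30.0),
      kinematical_intensity(S_FIXED, 30.0 + 1.0 / S_FIXED))
```

The unbounded growth of the central maximum with thickness (amplitude F·t at s = 0) is one way to see why the kinematical theory fails near the exact Bragg condition.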

FIGURE 6 Scattered intensity as a function of the excitation error (rocking curve) according to the kinematical theory, valid in very thin crystals only. [From Amelinckx, S. (1964). “The Direct Observation of Dislocation,” Solid State Physics, Suppl. 6, Academic Press, New York.]


FIGURE 7 Extinction contours: (a) equi-inclination extinction contours in a cylindrically bent graphite foil of constant thickness; (b) thickness extinction contours in a wedge-shaped crystal (bright-field image); (c) dark-field image corresponding to b. [From Amelinckx, S. (1964). “The Direct Observation of Dislocation,” Solid State Physics, Suppl. 6, Academic Press, New York.]

V. DYNAMICAL THEORY

We summarize the main results for the case of a perfect crystal and assume that only one beam, apart from the incident beam, is strongly excited (i.e., has an s value close to 0); this is called the two-beam case. For quantitative studies of defects, working conditions that closely approximate a two-beam case are practically required.

According to the two-beam dynamical theory of electron diffraction, in a perfect crystal there is a constant interplay between incident and diffracted beams. This can be described by a set of two coupled differential equations, one for each beam, which are similar to the equations describing the motion of two coupled pendulums (sometimes called Pendellösung). The simplest form of the equations is

$$\frac{dT}{dz} + \pi i s\, T = \frac{\pi i}{t_g}\, S, \qquad \frac{dS}{dz} - \pi i s\, S = \frac{\pi i}{t_g}\, T, \tag{5}$$

where T and S are the amplitudes of the transmitted and scattered beam, respectively; tg is a parameter, with the dimensions of a length, that depends on the material and on the reflection g and is inversely proportional to the strength of the considered reflection g (i.e., to its structure factor); and z is the distance in the foil measured from the entrance face.

So-called anomalous absorption (i.e., orientation-dependent absorption) can be taken into account phenomenologically by assuming the extinction distance to be complex, replacing $1/t_g$ with $1/t_g + i/\tau_g$, where $\tau_g$ is the absorption length belonging to the reflection g. This is a standard procedure also used in other branches of physics to account for absorption. The normal absorption further leads to an exponential damping factor depending only on thickness, and not on orientation. These equations further imply the column approximation, which is discussed later.

The solution of these equations leads to the rocking curves for the transmitted and scattered beams represented in Fig. 8. The oscillating character of Fig. 6 is strongly attenuated, mainly as a result of anomalous absorption. Moreover, the symmetry with respect to s of the curve for the transmitted beam is lost (A in Fig. 8b). The strong transmission for s > 0 relative to that for s < 0 is called the Borrmann effect, which also occurs in X-ray diffraction. For the scattered beam the symmetry with respect to s is still conserved (B in Fig. 8b).
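The asymmetry described above can be checked numerically. The following Python sketch evaluates the closed-form solution of Eq. (5) for a beam entering with T = 1, S = 0, with $1/t_g$ optionally replaced by $1/t_g + i/\tau_g$. The function name, the numerical values, and the form exp(−πz/τ_0) chosen for the normal-absorption damping are assumptions made for illustration only.

```python
import cmath
import math

def two_beam_amplitudes(s, z, t_g, tau_g=None, tau_0=None):
    """Closed-form solution of Eq. (5) with T(0) = 1, S(0) = 0.

    s     : excitation error (1/length)
    z     : depth below the entrance face
    t_g   : extinction distance
    tau_g : absorption length of reflection g (None = no anomalous absorption)
    tau_0 : normal absorption length (None = no damping); the damping
            factor exp(-pi*z/tau_0) is an assumed convention
    """
    q = 1.0 / t_g
    if tau_g is not None:
        q += 1j / tau_g                    # 1/t_g -> 1/t_g + i/tau_g
    sigma = cmath.sqrt(s * s + q * q)
    theta = math.pi * sigma * z
    damping = math.exp(-math.pi * z / tau_0) if tau_0 is not None else 1.0
    T = damping * (cmath.cos(theta) - 1j * (s / sigma) * cmath.sin(theta))
    S = damping * (1j * (q / sigma) * cmath.sin(theta))
    return T, S

# Assumed illustrative values in nm; they are not taken from the text.
t_g, tau_g, tau_0, thickness = 30.0, 300.0, 300.0, 150.0
for s in (+0.02, -0.02):
    T, S = two_beam_amplitudes(s, thickness, t_g, tau_g, tau_0)
    print(f"s = {s:+.2f}/nm  I_T = {abs(T)**2:.3f}  I_S = {abs(S)**2:.3f}")
```

With these assumed numbers the transmitted intensity is larger for s > 0 than for s < 0, while the scattered intensity is the same for both signs, in line with the behavior described for Fig. 8b.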

The depth period of the scattered and transmitted beams is now

$$t'_g = \frac{t_g}{\sqrt{1 + (s t_g)^2}}; \tag{6}$$

FIGURE 8 Scattered (B) and transmitted (A) intensity as a function of the excitation error s (rocking curve) according to the dynamical theory (a) without absorption and (b) taking into account anomalous absorption.


FIGURE 9 Intensity of scattered and transmitted beams as a function of specimen thickness (Pendellösung effect) according to the dynamical (a, b) and kinematical (c, d) theory. Periodicity is lost in c. [From Amelinckx, S. (1964). “The Direct Observation of Dislocation,” Solid State Physics, Suppl. 6, Academic Press, New York.]

in particular, for s = 0 it reduces to $t_g$. This gives a simple physical meaning to the extinction distance $t_g$. For s = 0, over a distance $t_g$ in the foil the incident-beam intensity is completely transferred into the scattered beam and back again; according to Eq. (5) the transfer is complete at depth $t_g/2$. This repeats periodically with period $t_g$ (Fig. 9). This effect is called the Pendellösung effect, and it is responsible for the thickness extinction contours in wedge crystals, as described previously. The dynamical theory (Figs. 9a and b) gives a better description of the observations than does the kinematical theory (Figs. 9c and d). In particular, for s = 0 the thickness extinction contours are periodic, whereas the kinematical theory predicts a loss of periodicity.
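The depth oscillation can be reproduced by integrating Eq. (5) directly. The sketch below uses a classical fourth-order Runge-Kutta scheme, without absorption, and compares the result with the depth period of Eq. (6). The function names, the step count, and the value $t_g$ = 40 nm are illustrative assumptions.

```python
import math

def integrate_two_beam(s, t_g, thickness, steps=2000):
    """Integrate Eq. (5) through the foil with classical RK4.

    Returns lists of depths, |T|^2 and |S|^2, starting from T = 1, S = 0."""
    def rhs(T, S):
        dT = -1j * math.pi * s * T + 1j * math.pi / t_g * S
        dS = +1j * math.pi * s * S + 1j * math.pi / t_g * T
        return dT, dS

    h = thickness / steps
    T, S = 1.0 + 0j, 0.0 + 0j
    depths, I_T, I_S = [0.0], [1.0], [0.0]
    for n in range(1, steps + 1):
        k1 = rhs(T, S)
        k2 = rhs(T + 0.5 * h * k1[0], S + 0.5 * h * k1[1])
        k3 = rhs(T + 0.5 * h * k2[0], S + 0.5 * h * k2[1])
        k4 = rhs(T + h * k3[0], S + h * k3[1])
        T += h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        S += h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        depths.append(n * h)
        I_T.append(abs(T) ** 2)
        I_S.append(abs(S) ** 2)
    return depths, I_T, I_S

def effective_extinction_distance(s, t_g):
    """Depth period t'_g of Eq. (6)."""
    return t_g / math.sqrt(1.0 + (s * t_g) ** 2)

t_g = 40.0  # nm, assumed value
depths, I_T, I_S = integrate_two_beam(0.0, t_g, t_g)
print("s = 0: I_S at t_g/2 =", round(I_S[1000], 6), " I_T at t_g =", round(I_T[2000], 6))
```

For s = 0 the scattered intensity reaches unity at half the extinction distance and returns to zero after one full period; for s ≠ 0 the same integration over one period $t'_g$ shows a shorter period and an incomplete transfer.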

VI. MULTIPLE-BEAM DIFFRACTION

Under most conditions, a large number of beams are excited simultaneously; the beams with the smallest s values are most strongly excited. The diffraction pattern is a two-dimensional section of the reciprocal lattice. The beams all interact with one another; there is a continuous transfer of electrons from one beam to all other beams. The simple system of Eq. (5) must now be generalized into a system of N coupled linear differential equations between N wave amplitudes,

dψ/dz = 2π iszSψ, (7)

which is derived from Schrödinger’s equation for a periodic potential taking the column approximation into account. Here ψ is a column vector representing the amplitudes of the transmitted and scattered beams, and S is a square matrix of which the diagonal elements are the excitation errors s for the different beams. The off-diagonal elements are associated with pairs of beams; they are related to transition probabilities from beam n to beam m and are of the form $1/t_{nm}$ (n ≠ m), where $t_{nm}$ is a length (i.e., an extinction distance). This set of equations reduces in the two-beam case to Eq. (5). The simple quasi-periodic character of the intensity as a function of depth [Eq. (4) or (6)] is lost. Analytical solutions are possible only in certain symmetrical situations and for a small number of beams. In general the equations have to be solved numerically, using a procedure whereby the crystal is dissected into slices parallel to the foil plane and the propagation of electrons is followed from slice to slice. Efficient computer programs have been developed to perform such calculations in either reciprocal or direct space.
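The slice-by-slice procedure can be sketched in a few lines of Python. The sketch assumes the N-beam system is written as dψ/dz = 2πi M ψ, with the excitation errors on the diagonal of M and off-diagonal elements $1/(2t_{nm})$; the factor 1/2 is a normalization chosen here so that the two-beam intensities agree with Eq. (5), and it is not taken from the text. All function names and numerical values are hypothetical, and one Runge-Kutta step per slice stands in for whatever propagator a production program would use.

```python
import math

def coupling_matrix(excitation_errors, extinction_distances):
    """Build M: diagonal = s_n, off-diagonal = 1/(2 t_nm) (assumed normalization).

    extinction_distances maps beam-index pairs (n, m) to t_nm."""
    n = len(excitation_errors)
    M = [[0.0] * n for _ in range(n)]
    for i, s in enumerate(excitation_errors):
        M[i][i] = s
    for (i, j), t in extinction_distances.items():
        M[i][j] = M[j][i] = 0.5 / t
    return M

def propagate(M, psi, thickness, n_slices=1000):
    """Follow the beam amplitudes psi from slice to slice through the foil."""
    h = thickness / n_slices

    def rhs(v):
        return [2j * math.pi * sum(a * x for a, x in zip(row, v)) for row in M]

    for _ in range(n_slices):
        k1 = rhs(psi)
        k2 = rhs([p + 0.5 * h * k for p, k in zip(psi, k1)])
        k3 = rhs([p + 0.5 * h * k for p, k in zip(psi, k2)])
        k4 = rhs([p + h * k for p, k in zip(psi, k3)])
        psi = [p + h / 6.0 * (a + 2 * b + 2 * c + d)
               for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]
    return psi

# Hypothetical three-beam case (beams 0, g, 2g); lengths in nm.
M3 = coupling_matrix([0.0, 0.0, 0.02], {(0, 1): 40.0, (1, 2): 40.0, (0, 2): 80.0})
psi = propagate(M3, [1 + 0j, 0j, 0j], thickness=100.0)
print([round(abs(a) ** 2, 4) for a in psi])
```

Because M is real and symmetric in this absorption-free sketch, the summed intensity of all beams stays equal to 1, which is a convenient check on the step size.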

VII. DIFFRACTION CONTRAST IMAGES

Diffraction contrast results from the spatial dependence of scattered or transmitted intensity along a foil. A single beam produces a diffraction spot to which different parts of the foil contribute differently; however, this is not apparent in an overfocused diffraction spot. A single beam can be selected by means of an aperture in the back focal plane of the objective lens and then magnified by the electron optical system. In this way a highly magnified map of the intensity distribution along the exit face of the foil is obtained in a given diffraction spot (i.e., along a given direction). This map is what we call an image.

Local variations of the orientation of the crystal lattice are revealed in this map as intensity differences according to the rocking curve. A bent perfect foil produces equi-inclination extinction contours or bent contours (see Section IV). Also, local changes in thickness give rise to local intensity variations (i.e., produce contrast). In particular, a wedge-shaped perfect crystal produces thickness extinction contours.

Local composition changes lead to local structure factor differences and, hence, to brightness differences in the image; this is called structure factor contrast.


FIGURE 10 Schematic beam path in an electron microscope illustrating different imaging modes: (a) diffraction contrast bright-field image, (b) diffraction contrast dark-field image, and (c) lattice image mode. [From Van Dyck, D. (1978). In “Diffraction and Imaging Techniques in Material Science,” p. 355, North-Holland, Amsterdam.]

Defects, such as dislocations, produce a strain field that causes local lattice curvature and, hence, produce images by strain contrast.

The relative displacement of two crystal parts along a planar interface introduces phase shifts between the electron beams diffracted by the two crystal parts. The interference between these electron beams produces quasi-periodic intensity variations in the scattered as well as in the transmitted beam. In a foil with constant thickness, the intensities of the emerging beams vary with the depth position of the interface in the foil in a quasi-periodic fashion, the quasi-period being the extinction distance $t_g$ (see Sections V and VI).

The selected beam can be either a transmitted or a scattered beam. In the first case the image is called a bright-field image (Fig. 10a) and in the second a dark-field image in the selected reflection (Fig. 10b).

VIII. COLUMN APPROXIMATION

The possibility of obtaining a relatively undistorted image that is in fact a highly magnified part of the diffraction spot is directly related to the small magnitude of the Bragg angles in electron diffraction. As a consequence, it is a good approximation to assume that electrons travel through the foil along narrow columns, the lateral dimensions of which are determined by the foil thickness t and by the Bragg angle θ; an upper limit of the lateral spread w is given by w = θt.

It is, therefore, justified to associate a local intensity with the exit point at the back surface of a column, which has sampled a columnar volume of the foil. In a sense the columns can be considered as picture elements (pixels) of the image, which diffract quasi-independently. This procedure is known as the column approximation, and its main importance is in computing and interpreting the images of defects.
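To get a feeling for the size of a column, the estimate w = θt can be evaluated for typical conditions. The sketch below is an order-of-magnitude illustration only: it takes the Bragg angle from Bragg's law with the relativistic electron wavelength, and the accelerating voltage, lattice spacing, and foil thickness are assumed values, not data from the text.

```python
import math

H = 6.62607015e-34       # Planck constant, J s
M_E = 9.1093837015e-31   # electron rest mass, kg
E_CH = 1.602176634e-19   # elementary charge, C
C = 2.99792458e8         # speed of light, m/s

def electron_wavelength(voltage):
    """Relativistic electron wavelength (m) for an accelerating voltage (V)."""
    ev = E_CH * voltage
    return H / math.sqrt(2.0 * M_E * ev * (1.0 + ev / (2.0 * M_E * C ** 2)))

def column_width(voltage, d_spacing, thickness):
    """Upper limit w = theta * t of the lateral spread (all lengths in m).

    The Bragg angle follows from sin(theta) = lambda / (2 d)."""
    theta = math.asin(electron_wavelength(voltage) / (2.0 * d_spacing))
    return theta * thickness

# Assumed conditions: 100 kV, d = 0.2 nm, foil thickness 100 nm.
w = column_width(100e3, 0.2e-9, 100e-9)
print(f"lambda = {electron_wavelength(100e3) * 1e12:.2f} pm, w = {w * 1e9:.2f} nm")
```

Under these assumptions the column is about 1 nm wide, far narrower than the foil is thick, which is why neighboring columns can be treated as diffracting quasi-independently.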

IX. MOIRÉ FRINGES

Superposed crystals that differ slightly in either orientation or lattice parameter (or both) produce moiré fringes. The interference between the beam diffracted first by crystal 1 and subsequently transmitted through crystal 2 and the beam transmitted first through crystal 1 and subsequently diffracted by crystal 2 produces dark-field moiré fringes. The bright-field moiré fringes result from the interference between the doubly transmitted beam and the doubly scattered beam.

Let the active diffraction vectors in crystals 1 and 2 be, respectively, $g_1$ and $g_2$, which are equal in magnitude but differ slightly in orientation. The moiré fringes are then perpendicular to $\Delta g = g_1 - g_2$ (i.e., roughly parallel to $g_1 \approx g_2$), and their spacing is $\Lambda = 1/|\Delta g|$; $\Delta g$ can be considered as the wave vector of the moiré fringes.

The formation of moiré fringes can be simulated by means of the optical analogues in Fig. 11, in which two identical parallel line systems are superposed with a small angular difference. It is clear that much wider coincidence fringes are generated, which are the rotation moiré fringes (Fig. 11b).

The parallel superposition of two line gratings with slightly different spacings, $d_1 = 1/|g_1|$ and $d_2 = 1/|g_2|$, also produces coincidence fringes perpendicular to $g_1 \approx g_2$ (i.e., to $\Delta g$) and with a spacing $\Lambda = d_1 d_2/|d_1 - d_2|$. The formation of parallel moiré fringes is demonstrated by means of the optical analogue in Fig. 11a.
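Both kinds of moiré spacing follow from the single relation Λ = 1/|Δg|. The Python sketch below evaluates it for a spacing-difference (parallel) case and a rotation case; the function names and the numerical spacings and angle are illustrative assumptions.

```python
import math

def g_vector(d, angle_deg=0.0):
    """Diffraction vector of length 1/d, rotated by angle_deg in the foil plane."""
    a = math.radians(angle_deg)
    return (math.cos(a) / d, math.sin(a) / d)

def moire_spacing(g1, g2):
    """Fringe spacing 1/|dg| for dg = g1 - g2 (two-dimensional vectors)."""
    return 1.0 / math.hypot(g1[0] - g2[0], g1[1] - g2[1])

# Parallel moire: assumed spacings d1 = 0.210 nm and d2 = 0.200 nm.
parallel = moire_spacing(g_vector(0.210), g_vector(0.200))

# Rotation moire: equal spacing d = 0.200 nm, assumed misorientation of 2 degrees.
rotation = moire_spacing(g_vector(0.200, 0.0), g_vector(0.200, 2.0))

print(f"parallel moire spacing: {parallel:.2f} nm")
print(f"rotation moire spacing: {rotation:.2f} nm")
```

In both assumed cases lattice spacings of 0.2 nm give fringes several nanometers apart, which illustrates the geometrical magnification that makes moiré patterns easy to resolve.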

Such fringes can be realized in the electron microscope using sandwiches of two thin metal layers grown epitaxially one onto the other. It is clear that if $\Delta g$ is


FIGURE 11 Optical analogues for the production of moiré fringes in the electron microscope: (a) spacing difference moiré fringes and (b) rotation moiré fringes.

small or if $d_1 \approx d_2$, the moiré spacing becomes large and can easily be resolved in the microscope.

If one of the two components of such a sandwich contains an edge dislocation perpendicular to the foil, the moiré pattern also exhibits a dislocation-like configuration (i.e., one or more supplementary half-fringes) (Fig. 12). The number of supplementary fringes is given by g · b, where b is the Burgers vector of the dislocation. In a rotation moiré pattern the supplementary half-plane in the moiré pattern is perpendicular to that in the actual dislocation. In a parallel moiré pattern both supplementary half-planes are parallel, and the sign of the moiré dislocation is the same as that of the crystal dislocation if the perfect crystal has a larger lattice parameter than the dislocated one. The sign of the moiré dislocation is the opposite of that of the crystal dislocation if the perfect foil has the smaller lattice parameter.

A stacking fault in one of the components of the sandwich causes a fractional shift of the moiré fringes, which is given by g · R (R is the displacement vector).

Moiré fringes can be considered as a means of producing a geometrical magnification of the crystal lattice and of the defects contained in it. They can be used to derive small differences in lattice spacing of superposed crystals (e.g., of a precipitate particle relative to the surrounding matrix).

FIGURE 12 Moiré images of dislocations in metal sandwiches. [Courtesy of P. Delavignette.]

X. DISLOCATION CONTRAST

The following intuitive reasoning explains the origin of contrast at dislocations and allows us to determine the image side. Let the foil contain an edge dislocation in E. In the perfect part of the foil the orientation is such that the transmitted and the scattered beams are comparable in intensity for reflection against the lattice planes shown in Fig. 13. The line width is a rough measure of the intensity of the beams. The s value of the active diffraction vector that points to the right is positive.

At the left side of the dislocation in $E_1$ the considered lattice planes are sloping in such a way that the Bragg condition is locally better satisfied than in the perfect part of the foil. As a result, locally more electrons will be scattered into the diffracted beam than in the perfect part, and a lack of electrons will be noted all along a line slightly to the left of the dislocation (i.e., a dark line will be observed in the bright-field image). In this approximation a bright line would be observed in the dark-field image. The opposite would apply to the crystal part slightly right of the dislocation, that is, in $E_2$, since the lattice planes are sloping in the opposite sense compared to the orientation in the perfect part. A similar reasoning is applicable to screw dislocations since lattice planes are also inclined in opposite senses left and right of the dislocation line.

The diffraction contrast at dislocations is thus one-sided for s ≠ 0 and not too small s values, that is, dislocations are observed as dark lines situated slightly on one side of the actual position of the dislocation core. The side of the dislocation on which the image line occurs, referred to here as the image side, is determined by the sign of the

FIGURE 13 Diffraction contrast at a dislocation in a thin foil (intuitive picture). The thickness of the lines is a measure of their intensity. [From Amelinckx, S. (1964). “The Direct Observation of Dislocation,” Solid State Physics, Suppl. 6, Academic Press, New York.]


FIGURE 14 Schematic representation of a dislocation image crossing an extinction contour; note the change of image side for s > 0 and s < 0. [From Amelinckx, S. (1964). “The Direct Observation of Dislocation,” Solid State Physics, Suppl. 6, Academic Press, New York.]

product p = (g · b)s. Accepting the FS/RH convention¹ for the definition of the Burgers vector b and the previous definition of s, and arranging a normal positive print (i.e., one viewed from below the specimen) so as to define a left and right side of the dislocation, then for a dislocation line direction defined from bottom to top on the print, the image will be on the right for p > 0 and on the left for p < 0. Similarly, for a closed dislocation loop where the sense of the dislocation line is defined by a clockwise circuit, the image will be inside the loop for p > 0 and outside for p < 0.

Since the actual position of the dislocation line cannot be observed, the image side can be determined only by producing successively a left and a right image. There are two means of achieving this: (1) by producing an image with s > 0 and another with s < 0 for the same g or (2) by producing images for g and −g, both with an s value of the same sign (Fig. 14).
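The sign rule for the image side and the two ways of flipping it can be written out as a small Python sketch. It simply encodes p = (g · b)s with the conventions stated above (FS/RH Burgers vector, line direction from bottom to top on a positive print); the function name and the example vectors are illustrative assumptions.

```python
def image_side(g, b, s):
    """Image side on a positive print from the sign of p = (g . b) s."""
    p = sum(gi * bi for gi, bi in zip(g, b)) * s
    if p > 0:
        return "right"
    if p < 0:
        return "left"
    return "undetermined"  # g . b = 0 or s = 0: the rule gives no side

g = (2, 0, 0)              # assumed operating reflection
b = (0.5, 0.5, 0.0)        # assumed Burgers vector, 1/2[110]
minus_g = tuple(-x for x in g)

print("g,  s > 0:", image_side(g, b, +0.01))
print("g,  s < 0:", image_side(g, b, -0.01))        # method (1): reverse s
print("-g, s > 0:", image_side(minus_g, b, +0.01))  # method (2): reverse g
```

Either reversal changes the sign of p once, so both methods move the image to the opposite side, whereas reversing g and s together would leave the image where it was.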

The same intuitive reasoning also demonstrates that Bragg reflection from the family of lattice planes that are left undeformed by the dislocation will not reveal the presence of the dislocation (Figs. 15a and b). Thus, the condition for extinction of the image is g · b = 0. This is an approximation, however, and the lattice planes parallel with the glide plane of an edge dislocation, for instance, satisfy the extinction criterion, but some contrast is observed, which is due to the slight deformation of such planes. In the case of an edge dislocation this slight displacement is perpendicular to the glide plane. This effect is clearly visible for pure edge prismatic loops observed with g in the plane of the loop. In the latter case the extinction will be complete only along those parts of the loop where the radial displacement is perpendicular to g (Fig. 16), that is, there will be a line of no contrast only perpendicular to g.
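In practice the extinction condition is applied by tabulating g · b for the available reflections. The following Python sketch does this for an assumed perfect dislocation with b = 1/2[1 -1 0] in a face-centered cubic crystal and shows that two independent extinguishing reflections fix the direction of b through their cross product. The helper names, the choice of b, and the list of reflections are hypothetical; as noted above, g · b = 0 is only an approximate criterion for edge dislocations.

```python
from fractions import Fraction

def dot(u, v):
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def is_invisible(g, b):
    """Extinction criterion g . b = 0."""
    return dot(g, b) == 0

b = (Fraction(1, 2), Fraction(-1, 2), Fraction(0))   # assumed Burgers vector
reflections = [(1, 1, 1), (1, 1, -1), (0, 0, 2), (2, 0, 0), (0, 2, 0), (2, -2, 0)]

for g in reflections:
    state = "out of contrast" if is_invisible(g, b) else "visible"
    print(f"g = {g}: g.b = {dot(g, b)}  ->  {state}")

# Two independent reflections with g.b = 0 give the direction of b.
print("direction of b:", cross((1, 1, 1), (0, 0, 2)))
```

The cross product (2, -2, 0) is parallel to the assumed [1 -1 0] direction; the magnitude and sign of b still have to be obtained from other observations.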

¹ FS/RH, or finish–start/right-hand, convention refers to the way in which a Burgers circuit is defined around a dislocation line.

From the one-sided nature of the contrast at dislocations it is possible to determine the sign of dislocations. Intuitive reasoning easily demonstrates that changing the sign of the dislocation (i.e., of the Burgers vector) changes the image side; the same is true for edge and screw dislocations. This result follows from the change in sign of p as b changes sign (Fig. 17).

The problem of determining whether a prismatic loop is due to the precipitation of vacancies or interstitials is equivalent to determining the sign of the dislocation bordering the loop. Several practical methods, based on determining the sign of (g · b)s, have been described. Such methods have been used extensively in the study of radiation damage, of quench defects in metals and alloys, and of loops due to nonstoichiometry.

For small s values, dislocations that are not parallel to the foil plane produce an image that changes periodically with the level in the foil, the period being the effective extinction distance. This is called oscillating or dotted

FIGURE 15 Extinction conditions for dislocations. (a) Edge dislocations: planes perpendicular to $g_1$ are deformed and planes perpendicular to $g_3$ remain flat. (b) Screw dislocations: planes perpendicular to $g_1$ acquire screw shapes and planes perpendicular to $g_2$ remain flat. [From Amelinckx, S. (1964). “The Direct Observation of Dislocation,” Solid State Physics, Suppl. 6, Academic Press, New York.]


FIGURE 16 Diffraction contrast of prismatic dislocation loops in the (0001) plane of zinc. The line of no contrast perpendicular to g is indicated. [Courtesy of A. Fourdeux.]

contrast (Fig. 18). In sufficiently thick foils where anomalous absorption is important, the oscillations in bright- and dark-field images are similar (i.e., in phase) at the top of the foil and complementary at the bottom of the foil (i.e., in antiphase).

Small dislocation loops or very small precipitates may produce black or bright dots as images. Dislocations seen end-on produce characteristic contrast effects that are due to a large extent to surface relaxation along their emergence points. In cases where the dislocation image is extinct (Figs. 19a and b), the emergence points may still produce contrast (Fig. 19c).

XI. WEAK-BEAM IMAGING

The image width of defects decreases with increasing s. This is a consequence of the fact that the effective extinction distance becomes very small for large s values, as follows from Eq. (6). As a result, the images of dislocations become much narrower. It is therefore possible to increase the resolution of defect images by using the so-called weak-beam method, which consists essentially of making a dark-field image in a weakly excited reflection. The exposure time is, of course, correspondingly longer. Under these conditions (i.e., large s values), the kinematical theory is a reasonable approximation.

Weak-beam images are used mainly to study the fine structure of dislocations, that is, to study the splitting of perfect dislocations into multiribbons of partials (Fig. 20). The separation of partials is determined, among other factors, by the magnitude of the stacking fault energy. Weak-beam images therefore offer a unique method for the quantitative determination of stacking fault energies.

For large s values the effective extinction distance becomes smaller. Under such conditions planar interfaces produce many fringes, which are sometimes useful, such as for the study of antiphase boundaries in ordered alloys. The extinction distances corresponding to the superstructure reflections, which are needed to image antiphase boundaries, are, in general, large, thus producing only a small number of fringes, unless large s values are used.
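The effect of a large excitation error on the number of fringes follows directly from Eq. (6). The short Python sketch below compares s = 0 with a weak-beam setting for a reflection with a long extinction distance; the function names and the numerical values (a 200 nm extinction distance in a 100 nm foil) are assumptions chosen only to illustrate the trend.

```python
import math

def effective_extinction_distance(s, t_g):
    """Depth period t'_g of Eq. (6)."""
    return t_g / math.sqrt(1.0 + (s * t_g) ** 2)

def fringe_count(s, t_g, thickness):
    """Number of depth periods t'_g that fit into the foil thickness."""
    return thickness / effective_extinction_distance(s, t_g)

t_g, thickness = 200.0, 100.0      # nm, assumed values
for s in (0.0, 0.01, 0.05):        # excitation errors in 1/nm
    print(f"s = {s:.2f}/nm: t'_g = {effective_extinction_distance(s, t_g):6.1f} nm, "
          f"fringes = {fringe_count(s, t_g, thickness):.2f}")
```

For $s t_g \gg 1$ the depth period approaches 1/s, so in this regime the fringe count is set by the excitation error rather than by the extinction distance of the reflection.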

XII. COMPUTER SIMULATION OF DISLOCATION IMAGES

Computer programs have been developed to simulate two-beam dislocation images. Identification of the

FIGURE 17 Image side and sign of dislocation. When the sign of the dislocation changes, the image side changes: (a, b) edge dislocations; (c, d) screw dislocations.


FIGURE 18 Oscillating contrast at dislocations that are inclined with respect to the foil plane and finally emerge in the surface. [Courtesy of P. Delavignette.]

characteristics of dislocations proceeds through comparison of observed and simulated images. The “strength” of dislocation contrast depends on n = g · b, which is a small integer for perfect dislocations (n = 1, 2, 3), and on the diffraction variables. For partial dislocations n adopts nonintegral values. For n = 2/3 the dislocation will usually still be visible, whereas for n = 1/3 visibility becomes questionable.
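The nonintegral values of n are easy to enumerate. As an illustration, the Python sketch below computes n = g · b for an assumed partial dislocation with b = 1/6[1 1 -2] in a face-centered cubic crystal and attaches the rule of thumb quoted above. The choice of b, the list of reflections, and the function names are hypothetical, and the labels are only the qualitative statements of the text, not a substitute for image simulation.

```python
from fractions import Fraction

def g_dot_b(g, b):
    return sum(Fraction(gi) * bi for gi, bi in zip(g, b))

def visibility(n):
    """Qualitative rule of thumb for the contrast strength n = g . b."""
    n = abs(n)
    if n == 0:
        return "out of contrast"
    if n == Fraction(1, 3):
        return "questionable"
    if n == Fraction(2, 3):
        return "usually visible"
    if n.denominator == 1:
        return "visible"
    return "not covered by the rule"

b_partial = (Fraction(1, 6), Fraction(1, 6), Fraction(-2, 6))  # assumed partial
for g in [(1, 1, 1), (2, 0, 0), (0, 0, 2), (2, 2, 0), (1, 1, -1)]:
    n = g_dot_b(g, b_partial)
    print(f"g = {g}: n = {n}  ->  {visibility(n)}")
```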

FIGURE 19 Contrast due to surface relaxation at the emergence points of dislocations in a platinum foil. The dislocations themselves are out of contrast in c; only the surface relaxation is visible.

FIGURE 20 Weak-beam images of ribbons of partial dislocations in the basal plane of RhSe2. [From Amelinckx, S., and Van Landuyt, J. (1976). In “Electron Microscopy in Mineralogy” (H. R. Wenk, ed.), pp. 68–112, Springer-Verlag, Berlin.]

It is possible to determine the magnitude of the Burgers vector from the knowledge of n, g, and the direction of b.

The images of close-neighboring parallel dislocations with the same Burgers vector may be quite different; this is due to the fact that the combined displacement field of the two dislocations produces the contrast. Therefore the image is not the superposition of the two images that would be produced by two isolated single dislocations. This “vicinity” effect is especially striking in layer structures, where it extends far along the layer planes as a result of the elastic anisotropy of such materials. In graphite, for instance, triple ribbons containing three partial dislocations with the same Burgers vector are frequently observed. Nevertheless, the three partials produce quite different images, and the image of these ribbons is, furthermore, strongly dependent on the sign of s.

XIII. DIFFRACTION CONTRAST AT PLANAR INTERFACES

Different classes of planar interfaces can be distinguished from characteristic image features under two-beam diffraction contrast conditions.

A. Translation Interfaces

The two parts of the crystal are related by a pure translation described by the constant displacement vector R. If R is not a lattice vector, such a defect is called a stacking fault (Fig. 21a). If the displacement vector R is a lattice vector, but not a superlattice vector (e.g., in an ordered alloy) (Fig. 21b), the interface is called an antiphase boundary,


FIGURE 21 Schematic representation of different planar defects: (a) stacking fault (SF) in elementary crystal, (b) antiphase boundary (APB) in an ordered alloy, (c) coherent twin boundary (TB), and (d) inversion boundary (IB). [From Amelinckx, S., and Van Landuyt, J. (1976). In “Electron Microscopy in Mineralogy” (H. R. Wenk, ed.), pp. 68–112, Springer-Verlag, Berlin.]

but strictly speaking only if R is one-half of a lattice vector. If this is not the case, the same terminology may be used, although out-of-phase boundary would be more correct.

The lattice planes with diffraction vector g in one part of the crystal are shifted with respect to those in the second part of the crystal over a fraction of the interplanar distance given by g · R. As a result, the electrons diffracted by the second part of the crystal undergo a phase shift α = 2πg · R with respect to those diffracted by the first part of the crystal. The phase of the transmitted beam is not affected by this shift. Interference between (1) the beam transmitted by the first and, again, by the second part of the foil, $T_1T_2$, and (2) the beam scattered by the first part and, again, scattered in the incident direction by the second part, $S_1S_2^-$, gives rise to periodic variations of the final transmitted intensity with the position of the interface within the foil (Fig. 22). The period is the extinction distance, corrected for deviations from the exact Bragg conditions, $t'_g$ [i.e., as given by Eq. (6)].

The final scattered beam results from the interference between (1) the beam transmitted through the first part and scattered by the second part, $T_1S_2$, and (2) the beam scattered by the first part and transmitted through the second, $S_1T_2^-$. This beam also varies periodically in intensity with the position of the interface in the foil, the period being the same as for the transmitted beam (Fig. 22).

If the interface is inclined with respect to the foil surfaces, one observes a set of fringes with a depth period equal to $t'_g$, in the bright-field image as well as in the dark-field image, and the projected period depends on the inclination of the interface. We now discuss, in some detail, the properties of the images of faults for which $\alpha = \pm\tfrac{2}{3}\pi$, which occur in close-packed structures.

In foils sufficiently thick for absorption effects to be important, the bright-field image is symmetrical with respect


FIGURE 22 (a) Schematic beam path occurring upon diffraction by a planar interface between crystal parts 1 and 2. T and S are the amplitudes of transmitted (T) and scattered (S) beams. (The minus sign in the superscript indicates that the sign of s should be changed in the corresponding expression.) (b) Reciprocal space construction illustrating the reversal of the sign of the excitation error s upon reversal of the sense of the diffraction vector g. (c) Geometry for an inclined interface; ($z_1 + z_2$) equals the specimen thickness.

to the projection of the foil center, whereas the dark-field image is asymmetrical. In the bright-field image the first fringe at the entrance face is bright if sin α > 0 and dark if sin α < 0; the same is true for the dark-field image. On the other hand, at the exit face the last fringes are opposite in nature in bright- and dark-field images. Table I summarizes these characteristics.

In wedge-shaped foils the fringes are parallel with the intersection lines of the interface and the nearest foil surface. As the foil thickens, additional fringes are formed in the center of the foil (Fig. 23b).

FIGURE 23 Fringe patterns at planar stacking faults with (a) α = 180° and (b) α = 120°.

Fringes associated with antiphase boundaries, for which α = π, have somewhat different properties. Bright- and dark-field images are now complementary. The central fringe is bright in the bright-field image and dark in the dark-field image. The fringes are parallel to the foil center rather than to the foil surface. As a result, in a wedge-shaped crystal, new fringes are created at the surfaces of the foil (Fig. 23a). The contrast within the domains on either side of a translation interface is always the same since the lattices are parallel.

It is clear that for those reflections g for which g · R is an integer, no fringes will be produced since the lattice planes with diffraction vector g, in both crystal parts, are then in register again. As a result, in face-centered crystals where R = 1/6[112] (or the equivalent, 1/3[111]), reflections for which h + k + l = 3n (i.e., the sum of the indices is a multiple of three) will not produce any fringes at stacking faults; for the other reflections $\alpha = \pm\tfrac{2}{3}\pi$. For antiphase boundaries in ordered alloys, α = ±π for superstructure reflections, and α = 0 (mod 2π) for basic reflections. The extinction distances associated with superstructure reflections are mostly large. As a result, antiphase boundaries are usually imaged by a smaller number of fringes than stacking faults in the same foil thickness.
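These selection rules can be verified with exact fractions. The Python sketch below reduces α = 2πg · R to the interval (−π, π] and prints it in units of π, for a stacking fault with R = 1/3[111] and for an antiphase boundary with an assumed displacement R = 1/2[110], a value chosen here for illustration that is not given in the text. The function name and the lists of reflections are likewise hypothetical.

```python
from fractions import Fraction

def fault_phase(g, R):
    """Phase shift alpha = 2*pi*(g . R), reduced to (-pi, pi], in units of pi."""
    g_dot_r = sum(Fraction(gi) * Ri for gi, Ri in zip(g, R))
    frac = g_dot_r % 1          # alpha is only defined modulo 2*pi
    if frac > Fraction(1, 2):
        frac -= 1
    return 2 * frac

R_fault = (Fraction(1, 3), Fraction(1, 3), Fraction(1, 3))
for g in [(1, 1, 1), (2, 0, 0), (1, 1, -1), (2, 2, 0), (3, 1, 1)]:
    print(f"stacking fault, g = {g}: alpha = {fault_phase(g, R_fault)} pi")

R_apb = (Fraction(1, 2), Fraction(1, 2), Fraction(0))   # assumed displacement
for g in [(1, 0, 0), (2, 0, 0)]:
    print(f"antiphase boundary, g = {g}: alpha = {fault_phase(g, R_apb)} pi")
```

Reflections with h + k + l = 3n give α = 0 and hence no fringes at the stacking fault, while all others give ±2π/3; for the assumed antiphase displacement the 100 reflection gives α = π and the 200 reflection gives α = 0.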

Sometimes weak fringes are visible along stacking faults and antiphase boundaries for reflections for which g · R is an integer. This is a consequence of the fact that the displacement vector is not a simple vector but differs from it by a small vector ε as a result of relaxation along the interfaces. It is possible to deduce ε from contrast effects.

B. Coherent Twin Interfaces

After a phase transition, single crystals are usually broken up into domains. The structures within the domains are related by symmetry operations lost during the transformation. Coherent twin boundaries (Fig. 21c) often result after a displacive transformation. The two parts of the crystal on either side of such a twin boundary are then related by a mirror operation or by a 180° rotation.

In many cases one can alternatively derive one part of the crystal from the other part by means of the displacement field shown in Fig. 21c, that is, the displacement vector has a constant direction and sense but increases in magnitude with distance away from the interface. If the displacement per atom plane is a small fraction of the interatomic distance, the difference $\Delta g$ of simultaneously active diffraction vectors $g_1$ and $g_2$ in the two crystal parts is a small vector (i.e., $\Delta g \ll g$), which is perpendicular to the interface. The two crystal parts diffract simultaneously under two-beam conditions, although with different excitation errors $s_1$ and $s_2$. Also, the extinction distances are, in general, different in the two crystal parts for simultaneously excited reflections. However, in diffraction contrast such interfaces are also imaged as fringe patterns of which the properties are different from those produced by translation interfaces.

The contrast is now determined by $\delta = s_1 t_{g_1} - s_2 t_{g_2}$. The nature, bright or dark, of the outer fringes is given in Table I. The depth period may now be different close to the front and back surfaces if the extinction distances in the two crystal parts are significantly different. If $t_{g_1} = t_{g_2}$, the fringe pattern is symmetrical in the dark-field image provided $|s_1| = |s_2|$. In general, the domains on either side of the boundary have different contrast. No fringes are produced if δ = 0 (i.e., for reflections for which $\Delta g = 0$).

C. Inversion Boundaries

Noncentrosymmetrical crystals often contain domains built on a common lattice, but of which the structures are related by an inversion operation (Fig. 21d). Such domains can be made to produce a different brightness in a dark-field image made in a multiple-beam situation along a zone that does not produce a center of symmetry in projection along the zone axis. In the bright-field image the domains always have the same brightness. The method is based on the violation of Friedel’s law in multiple-beam situations. The interfaces can also be imaged as fringe patterns of the same nature as those produced by translation interfaces.

D. Permutation Twins

In certain crystals the symmetry of the structure is lower than that of the lattice. The lattice for an orthorhombic structure may, for instance, be tetragonal (e.g., δ-NiMo), or a rhombohedral structure may be based on a hexagonal lattice (e.g., α-quartz). In such cases the crystal axes may be “permuted” in adjacent regions; the domains can then be revealed by structure factor contrast. Although the two domains produce reflections in the same directions, since they are built on the same lattice, the intensities of certain reflections may be significantly different for the two domains. The interfaces are again imaged as fringe patterns, the origin of the contrast being the difference in phase of simultaneously active reflections.

XIV. IMAGE FORMATION IN AN IDEAL MICROSCOPE

Let the incident electron beam be described by a plane wave of amplitude 1. Diffraction occurs in the object, and electrons emerge from the exit face. The object is characterized by a two-dimensional transmission function $q(x, y)$, which describes the amplitude and phase of the emerging beams at each point $(x, y)$ (Fig. 24). The back surface of the object can be considered as a planar assembly of point sources of spherical wavelets in the sense of Huygens. The interference between these wavelets generates the diffracted beams in the case of a crystalline specimen and produces a diffraction pattern in the back focal plane of the objective lens. This diffraction pattern can, to a good approximation, be described by Fraunhofer diffraction, because of the relative dimensions of the lenses and the electron wavelength and because of the paraxial nature of most of the diffracted beams. The latter is a consequence of the fact that in electron diffraction Bragg angles are very small, as already mentioned. Thus the diffraction amplitude is the Fourier transform of the function $q(x, y)$. In turn, the diffraction pattern in the back focal plane acts as a source of Huygens spherical wavelets, which interfere to produce an enlarged image of the transmission function. This image is, again, the Fourier transform of the diffraction pattern. We can, therefore, conclude that the ideal microscope acts as an analogue computer that performs a double Fourier transformation, apart from a linear magnification, and thus reproduces the object. Unfortunately, ideal microscopes are not available and the actual situation is somewhat more complicated.

FIGURE 24 Image formation in an ideal microscope. The diffraction pattern is the Fourier transform of the object and the image is the inverse Fourier transform of the diffraction pattern.
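The double Fourier transformation performed by the ideal microscope can be checked numerically. The following is a minimal sketch, assuming a hypothetical one-dimensional transmission function sampled at eight points and a plain discrete Fourier transform in place of the optical one; none of the names or values are taken from the text. Two successive forward transforms return the object inverted, $q(-x)$, which is the familiar image inversion; an inverse second transform would return $q(x)$ itself.

```python
import cmath

def dft(values):
    """Forward discrete Fourier transform (unnormalized, O(N^2))."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]

# Hypothetical 1D transmission function q(x): a weak, asymmetric phase object.
phases = [0.0, 0.3, 0.1, 0.0, 0.0, 0.2, 0.0, 0.0]
q = [cmath.exp(1j * p) for p in phases]

diffraction = dft(q)                              # amplitude in the back focal plane
image = [v / len(q) for v in dft(diffraction)]    # second transform, rescaled by 1/N

# Two forward transforms give q(-x): the object reappears, inverted.
inverted_object = [q[-k % len(q)] for k in range(len(q))]
max_error = max(abs(a - b) for a, b in zip(image, inverted_object))
print("largest deviation from the inverted object:", max_error)
```

The magnification is left out of the sketch; it only rescales the coordinate axis.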

XV. IMAGE FORMATION IN A REAL MICROSCOPE

Real microscopes are subject to a number of limitations that induce deviations from the ideal imaging conditions just described.

A. Spherical Aberration

In real magnetic lenses the paraxial approximation, which leads to point-to-point representation in the Gaussian focal plane (i.e., the focal plane considered in geometrical optics), breaks down. This is due to the fact that the value of sin β, which enters into the expression for the Lorentz force on a moving charge, can no longer be approximated by the angle β. The situation is analogous to that in ordinary optics, where sin β is approximated by β in Snell’s law for paraxial rays and higher-order terms, up to the third power in β, are required outside the paraxial region.

The radius of the disk of confusion in object space resulting from this lens aberration is then given by

$$\rho_s = C_s\beta^3, \tag{8}$$

where $C_s$ is the spherical aberration constant, which has a value between 1 and 10 mm. A typical high-resolution microscope operating at 200 kV has a value of $C_s = 1.2$ mm. As a result of spherical aberration, electron beams inclined at an angle β with the optical axis suffer a phase shift $\chi_s$ with respect to the central beam (β = 0), which is given by

$$\chi_s = 2\pi\Delta/\lambda,$$

where Δ is the path difference for a beam that does not travel along the axis. From Fig. 25a it can be concluded that $\Delta = \rho_s \sin\beta \approx \rho_s\beta$ and hence $d\Delta = \rho_s\,d\beta$ and $d\chi_s = 2\pi\rho_s\,d\beta/\lambda = 2\pi C_s\beta^3\,d\beta/\lambda$. After integration from 0 up to the angle β,

$$\chi_s = \tfrac{1}{2}\pi C_s\beta^4/\lambda. \tag{9}$$

FIGURE 25 Microscope aberrations. Phase shifts χ due to (a) spherical aberration and (b) defocusing. [From Amelinckx, S. (1986). In “Examining the Submicron World,” p. 71, Plenum, New York.]

B. Aperture

The microscope contains an objective aperture that eliminates beams enclosing an angle β with the optical axis larger than an angle $\beta_A$, in order to reduce the spherical aberration. This imposes a limit on the theoretically achievable resolution, called the Abbe limit. A geometrical point is imaged as a circle (the disk of confusion) with the radius

$$\rho_A = 0.61\lambda/\beta_A, \tag{10}$$

which means, in practice, that only points separated by at least this distance in the object can be observed as separate points in the final image.

C. Defocus

Most high-resolution images are automatically made under conditions where visual contrast is best. It turns out that in the exact Gaussian focal plane the contrast is smallest, at least for a phase object, that is, an object that changes only the phase of the incident beam. One therefore usually works under somewhat defocused conditions. Defocusing also causes a phase shift and a disk of confusion, which we can estimate with reference to Fig. 25b.

Defocusing the electron microscope by an amount ε, while leaving the plane of observation unchanged and with the object situated near the first focal plane, results in an apparent displacement ε of the object plane (Fig. 25b). One clearly has

$$\rho_D = \varepsilon \sin\beta \approx \varepsilon\beta,$$

and for the path difference Δ,

$$\Delta = \varepsilon/\cos\beta - \varepsilon \approx \tfrac{1}{2}\beta^2\varepsilon,$$

and hence,

$$\chi_D = 2\pi\Delta/\lambda = \pi\varepsilon\beta^2/\lambda. \tag{11}$$

Here ε > 0 means lens strengthening and ε < 0 means lens weakening.

D. Chromatic Aberration

As a result of high-voltage instabilities of the microscope, the incident electron beam exhibits a wavelength spread, since λ is related to the acceleration potential V by the nonrelativistic approximate relation [Eq. (2)]:

$$\lambda = h(2meV)^{-1/2}. \tag{12}$$
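As a numerical illustration of Eq. (12), the sketch below evaluates the nonrelativistic wavelength for a few accelerating voltages. The physical constants are the standard SI values; the function names are illustrative, and the relativistic expression is the standard textbook correction, included only for comparison and not part of Eq. (12).

```python
import math

H = 6.62607015e-34        # Planck constant, J s
M_E = 9.1093837015e-31    # electron rest mass, kg
Q_E = 1.602176634e-19     # elementary charge, C
C = 2.99792458e8          # speed of light, m/s

def wavelength_nonrelativistic(voltage):
    """Eq. (12): lambda = h (2 m e V)^(-1/2), in metres."""
    return H / math.sqrt(2.0 * M_E * Q_E * voltage)

def wavelength_relativistic(voltage):
    """Standard relativistic correction (an addition, not part of Eq. (12))."""
    energy = Q_E * voltage
    return H / math.sqrt(2.0 * M_E * energy * (1.0 + energy / (2.0 * M_E * C**2)))

for kilovolts in (100, 200, 1000):
    volts = kilovolts * 1e3
    print(kilovolts, "kV:",
          round(wavelength_nonrelativistic(volts) * 1e12, 3), "pm (Eq. 12),",
          round(wavelength_relativistic(volts) * 1e12, 3), "pm (relativistic)")
```

The gap between the two values widens with voltage, which is why Eq. (12) is described as approximate.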

Moreover, variations in the lens currents, $\Delta I/I$, also cause aberrations of the same nature.

A third origin of aberration is the inelastic scattering in the specimen, which is equivalent to a change in energy, $\Delta E/E$, of the electrons entering the lens system. The net effect of all these phenomena on the image formation is a spread $\Delta f$ in the focal distance $f$ of the objective lens. Since $f$ is proportional to $E I^{-2}$, and assuming that $\Delta E$ and $\Delta I$ are uncorrelated, $\Delta f$ is given by

$$\Delta f = C_c\left[(\Delta E/E)^2 + 4(\Delta I/I)^2\right]^{1/2}.$$

The corresponding disk of confusion in object space has a radius

$$\rho_c = \beta\,\Delta f.$$

The constant $C_c$ is called the chromatic aberration constant. Since instabilities of the high voltage and of the lens currents can be reduced to less than $10^{-6}$, $\Delta f$ takes a typical value of 10 nm, which corresponds with a $C_c$ value smaller than 10 mm.
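The focal spread and the chromatic disk of confusion are easy to evaluate. The sketch below assumes, for illustration only, relative instabilities of $10^{-6}$ in both energy and lens current, the $C_c = 3.9$ mm quoted elsewhere in this section, and a beam angle of $5 \times 10^{-3}$ rad; the function names are hypothetical.

```python
import math

def focal_spread(c_c, rel_energy_spread, rel_current_spread):
    """Delta f = C_c [ (dE/E)^2 + 4 (dI/I)^2 ]^(1/2)."""
    return c_c * math.sqrt(rel_energy_spread**2 + 4.0 * rel_current_spread**2)

def chromatic_disk_radius(beta, delta_f):
    """rho_c = beta * Delta f."""
    return beta * delta_f

c_c = 3.9e-3                 # m (illustrative choice)
delta_f = focal_spread(c_c, 1e-6, 1e-6)
rho_c = chromatic_disk_radius(5e-3, delta_f)
print("focal spread:", delta_f * 1e9, "nm")
print("chromatic disk radius:", rho_c * 1e9, "nm")
```

With these inputs the focal spread comes out at the nanometre scale, the same order as the typical value quoted above.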

E. Beam Divergence

Because of the finite dimensions of the electron source and the condenser lens aperture, the incident beam is somewhat divergent. Under the intense illumination conditions used in high-resolution imaging, the apex angle of the illumination cone may reach a value of the order of $10^{-3}$ rad.

The influence of incoherent beam divergence on the image can be described as being due to the superposition of independent images (i.e., intensities) corresponding to different incident directions within the divergence cone.

F. Ultimate Resolution

Apart from the lens imperfections discussed previously, which lead to image blurring and phase shifts, a number of other imperfections occur, but these are unimportant compared to those discussed. Furthermore, resolution also depends on mechanical stability (e.g., vibration and drift); these effects can be eliminated to a large extent by proper microscope design.

The ultimate resolution is limited mainly by three phenomena that have so far proved unavoidable: finite aperture, spherical aberration, and chromatic aberration.

The final disk of confusion has a radius given by

$$\rho = \left(\rho_A^2 + \rho_s^2 + \rho_c^2\right)^{1/2}.$$

Because of the difference in angular dependence of the different aberrations, it turns out that in present-day high-resolution electron microscopes chromatic aberration has only a relatively small influence. The limiting factor at small angles ($\beta < 5 \times 10^{-3}$ rad) is the aperture, whereas in the range $\beta > 5 \times 10^{-3}$ rad the limiting factor is the spherical aberration. (This is true for $E = 100$ kV, $C_s = 8.2$ mm, and $C_c = 3.9$ mm.)

The curves $\rho_A$ and $\rho_s$ versus β [i.e., Eqs. (10) and (8)] have slopes of opposite sign. There is therefore a minimum value for ρ, which occurs for $\partial\rho/\partial\beta = 0$, where $\rho = (\rho_A^2 + \rho_s^2)^{1/2}$. This minimum, which corresponds to the optimum compromise $\beta_0$ between spherical aberration and aperture effects, occurs for $\beta_0 = \left(0.61\lambda/\sqrt{3}\,C_s\right)^{1/4}$. The corresponding radius of the confusion disk is then $\rho_0 = 0.9\lambda^{3/4}C_s^{1/4}$. Representative values are $\beta_0 = 5 \times 10^{-3}$ rad and $\rho_0 \approx 0.5$ nm. This expression makes it clear that more can be gained in resolution by decreasing the wavelength (i.e., by increasing the accelerating voltage) than by decreasing $C_s$.
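The optimum aperture angle can be verified by minimizing the combined disk radius numerically. The sketch below compares a golden-section search with the closed-form expressions for $\beta_0$ and $\rho_0$. The wavelength of 3.7 pm (roughly 100 kV), the search interval, and all function names are illustrative assumptions; $C_s = 8.2$ mm is the value quoted above.

```python
import math

def rho_aperture(beta, wavelength):
    """Eq. (10): aperture (diffraction) disk radius."""
    return 0.61 * wavelength / beta

def rho_spherical(beta, c_s):
    """Eq. (8): spherical-aberration disk radius."""
    return c_s * beta**3

def rho_total(beta, wavelength, c_s):
    """Quadrature sum of the two disks (chromatic term neglected)."""
    return math.hypot(rho_aperture(beta, wavelength), rho_spherical(beta, c_s))

def optimum_closed_form(wavelength, c_s):
    beta0 = (0.61 * wavelength / (math.sqrt(3.0) * c_s)) ** 0.25
    rho0 = 0.9 * wavelength**0.75 * c_s**0.25
    return beta0, rho0

def optimum_numeric(wavelength, c_s, lo=1e-4, hi=5e-2, steps=200):
    """Golden-section search for the minimum of rho_total on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(steps):
        a = hi - ratio * (hi - lo)
        b = lo + ratio * (hi - lo)
        if rho_total(a, wavelength, c_s) < rho_total(b, wavelength, c_s):
            hi = b
        else:
            lo = a
    beta = 0.5 * (lo + hi)
    return beta, rho_total(beta, wavelength, c_s)

wavelength = 3.7e-12   # m, illustrative
c_s = 8.2e-3           # m
beta0, rho0 = optimum_closed_form(wavelength, c_s)
beta_num, rho_num = optimum_numeric(wavelength, c_s)
print("beta0:", beta0, "closed form;", beta_num, "numeric")
print("rho0 :", rho0, "closed form;", rho_num, "numeric")
```

The numerical minimum reproduces $\beta_0$ closely, while $\rho_0$ agrees only to within the rounding of the prefactor 0.9.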

FIGURE 26 Dependence of the phase shift χ on the angle β. [From Amelinckx, S. (1986). In “Examining the Submicron World,” p. 71, Plenum, New York.]

G. Phase Shift

We have pointed out that spherical aberration and defocus cause phase shifts of the nonaxial electron beams with respect to the axial beam. These phase shifts depend on β in the following manner:

$$\chi(\beta) = \left(\tfrac{1}{2}\pi C_s\beta^4 + \pi\varepsilon\beta^2\right)/\lambda. \tag{13}$$

Note that χ = 0 for the defocus value

$$\varepsilon = -\tfrac{1}{2}C_s\beta^2.$$

Since this value still depends on β, the condition can be approximately satisfied in only a limited range of β values. It is therefore necessary to take the phase shifts into account when performing image calculations. The general aspect of the curve is as presented in Fig. 26.

XVI. IMAGE FORMATION OF A WEAK-PHASE OBJECT

The amplitude distribution in the back focal plane of the objective lens is given by the Fourier transform (F) of the object function. In the case of a crystalline specimen the object function is the electron wave function at the exit face of the thin foil. The amplitude distribution in the diffraction pattern is the Fourier transform of the wave function. The final image amplitude is the Fourier transform of the diffraction amplitude. However, the electrons are now moving in a lens system and therefore undergo the phase shifts χ(β) discussed previously. Moreover, an aperture limits the number of beams transmitted through the system. This can be taken care of by introducing an aperture function in the plane of the diffraction pattern. This function is 1 over the surface of the aperture and 0 outside it.


The main features of the image formation can be illustrated in the simplest case, where the specimen can be treated as a weak-phase grating. Since the wavelength of the electron is different in vacuum and in the specimen, the passage of the electron beam through the specimen causes a phase shift that can be written $\chi(x, y) = \sigma\phi(x, y)$, where φ is the projected lattice potential along the propagation direction of the electrons, x and y are coordinates in the specimen plane, and $\sigma = \pi/\lambda E$ (λ is the wavelength of the electrons in vacuum, and E the accelerating potential). The object function is then

$$q(x, y) = e^{i\sigma\phi(x, y)} \approx 1 + i\sigma\phi(x, y). \tag{14}$$

The image amplitude $U(x, y)$ obtained by Fourier transformation followed by the inverse Fourier transformation then takes the form

$$U(x, y) = 1 + \sigma\phi(x, y)\sin\chi + i\sigma\phi(x, y)\cos\chi, \tag{15}$$

where χ is supposed to be a constant. In reality, χ depends on β (i.e., on x and y), but we shall show that the imaging conditions require that sin χ is at least approximately constant to obtain a directly interpretable image.

In the particular case where sin χ = −1 (and, thus, cos χ = 0), the intensity distribution (i.e., the image) becomes, to first order in σφ,

$$I = UU^{*} \quad\text{or}\quad I(x, y) = 1 - 2\sigma\phi(x, y), \tag{16}$$

which clearly has a direct relationship with the object represented by its projected potential $\phi(x, y)$. The image contrast, defined as $(I - I_0)/I_0 = -2\sigma\phi(x, y)$, turns out to be directly proportional to the projected potential.

If one could make sin χ = +1, one would obtain

$$I(x, y) = 1 + 2\sigma\phi(x, y). \tag{17}$$

The intensity is now larger for a larger projected potential. The situation is somewhat like positive and negative phase contrast. The lenses have introduced phase shifts of π/2, similar to what the quarter-wavelength ring does in optical phase contrast microscopy. The lens aberrations are exploited to produce phase contrast that would be absent in a perfect microscope.
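The role of sin χ in Eqs. (15)–(17) can be made concrete with a short calculation. The sketch below evaluates the intensity $UU^{*}$ from Eq. (15) for a hypothetical weak phase shift σφ = 0.05 rad; the function names are illustrative. It shows first-order contrast of opposite sign for sin χ = ∓1 and essentially no first-order contrast for χ = 0, the perfect-lens case.

```python
import math

def image_amplitude(sigma_phi, chi):
    """Eq. (15): U = 1 + s sin(chi) + i s cos(chi), with s = sigma * phi."""
    return complex(1.0 + sigma_phi * math.sin(chi), sigma_phi * math.cos(chi))

def intensity(sigma_phi, chi):
    """I = U U*."""
    u = image_amplitude(sigma_phi, chi)
    return (u * u.conjugate()).real

sigma_phi = 0.05   # hypothetical weak phase shift, in radians
cases = (
    ("sin chi = -1", -math.pi / 2.0),
    ("sin chi = +1", +math.pi / 2.0),
    ("chi = 0 (no lens phase shift)", 0.0),
)
for label, chi in cases:
    print(label, "->", intensity(sigma_phi, chi))
```

The small departures from 1 ∓ 2σφ are the second-order term (σφ)², which the weak-phase approximation neglects.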

XVII. OPTIMUM DEFOCUS IMAGES

If the image is to be a “faithful” representation of the projected potential, then $\sin\chi \approx \pm 1$, not just for a single beam but for as many of the diffracted beams contributing to the image as possible.

The value of sin χ will not vary rapidly in the vicinity of a stationary point of χ (i.e., a minimum or a maximum). From Eq. (13), χ adopts the stationary value $\chi = -\pi\varepsilon^2/2\lambda C_s$ for $\beta = (-\varepsilon/C_s)^{1/2}$ (Fig. 26). Since this value of χ is essentially negative, we cannot satisfy the requirement sin χ = +1 (i.e., χ = π/2) at the stationary point, but we can satisfy sin χ = −1 (i.e., χ = −π/2). It is sufficient to choose the defocus ε in such a way that $-\pi\varepsilon^2/2\lambda C_s = -\pi/2$ [i.e., $\varepsilon_s = -(\lambda C_s)^{1/2}$]. Since the sin χ function is stationary around χ = ±π/2, the sin χ-versus-β curve will present a flat part in the region of $\beta = (-\varepsilon/C_s)^{1/2}$ provided $\varepsilon = \varepsilon_s$. The defocus value $\varepsilon_s = -(\lambda C_s)^{1/2}$, which corresponds to the optimum imaging conditions of a phase grating, is called the Scherzer defocus, and the quantity $(\lambda C_s)^{1/2}$ is used as a unit of defocus. A more complete expression is $\varepsilon_s = -\left(\tfrac{4}{3}\lambda C_s\right)^{1/2}$.

The dependence of sin χ on the diffraction angle β is represented in Fig. 27 for a typical situation close to the Scherzer defocus. It is a rapidly oscillating function, and therefore it is not possible to fulfill the required condition for all β values. The curve in Fig. 27, which was drawn for the Scherzer defocus value of −210 nm in this particular case, does exhibit a region where sin χ is approximately −1, as required. Beams that are diffracted in this angular range give rise to an image that is a direct representation of the object. If not enough beams can be passed through this “window” in the sin χ curve, the image may be rudimentary in the sense that it can give true detail only up to some maximum spatial frequency. This limiting frequency corresponds roughly with the β angle for which sin χ first goes through zero at the Scherzer focus, that is, $\beta_{\max} = (-2\varepsilon/C_s)^{1/2}$; for $\chi(\beta_{\max}) = 0$ with $\varepsilon = \varepsilon_s$, this becomes $\beta_{\max} = 2(\lambda/3C_s)^{1/4}$, and the radius of the corresponding disk of confusion is $\rho = (\lambda/2)^{3/4}C_s^{1/4}$, to be compared with the value $\rho_0 = 0.9\lambda^{3/4}C_s^{1/4}$ obtained in Section XV.F. Figures 27a and b show the image transfer functions for two instruments; the advantage of a high voltage becomes quite apparent from the width of the window.
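The stationary-point argument for the Scherzer defocus can be followed numerically. The sketch below evaluates Eq. (13) with illustrative values (a wavelength of 3.7 pm and $C_s = 4.2$ mm, chosen for convenience and not meant to reproduce Fig. 27), using the simple form $\varepsilon_s = -(\lambda C_s)^{1/2}$. The function names are hypothetical.

```python
import math

def chi(beta, defocus, c_s, wavelength):
    """Eq. (13): chi = (pi/2 C_s beta^4 + pi eps beta^2) / lambda."""
    return (0.5 * math.pi * c_s * beta**4 + math.pi * defocus * beta**2) / wavelength

def scherzer_defocus(c_s, wavelength):
    """eps_s = -(lambda C_s)^(1/2)."""
    return -math.sqrt(wavelength * c_s)

def stationary_angle(defocus, c_s):
    """Angle at which d(chi)/d(beta) = 0 for negative defocus."""
    return math.sqrt(-defocus / c_s)

def first_zero_angle(defocus, c_s):
    """Angle at which chi first returns to zero."""
    return math.sqrt(-2.0 * defocus / c_s)

wavelength = 3.7e-12   # m, illustrative
c_s = 4.2e-3           # m, illustrative
eps_s = scherzer_defocus(c_s, wavelength)
beta_flat = stationary_angle(eps_s, c_s)
beta_zero = first_zero_angle(eps_s, c_s)

print("Scherzer defocus:", eps_s * 1e9, "nm")
print("sin chi at the stationary angle:", math.sin(chi(beta_flat, eps_s, c_s, wavelength)))
for fraction in (0.6, 0.8, 1.0, 1.2):
    beta = fraction * beta_flat
    print(fraction, "x beta_flat ->", round(math.sin(chi(beta, eps_s, c_s, wavelength)), 3))
```

The printed values of sin χ stay close to −1 over a band of angles around the stationary point, which is the “window” referred to in the text.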

FIGURE 27 Image transfer function sin χ(β) for the Scherzer defocus. Accelerating voltage: (a) 100 kV ($C_s$ = 4.2 mm, Δf = 720 Å) and (b) 1000 kV ($C_s$ = 1.4 mm, Δf = 860 Å). [From Amelinckx, S. (1986). In “Examining the Submicron World,” p. 71, Plenum, New York.]


It is worth noting that the image (i.e., the positive or the negative deviations from the background) is proportional to the projected potential $\phi(x, y)$.

XVIII. LATTICE IMAGES

In images made using the diffraction contrast mode, only one beam is used. A quite different type of image is obtained when more than one beam is admitted through the selector aperture. In the simplest situation, two beams, usually the transmitted beam and one scattered beam, are selected and made to interfere. The resulting interference pattern behind the exit surface of the crystal consists of straight sinusoidal fringes with a spacing equal to the interplanar spacing corresponding to the chosen reflection. Their formation is illustrated in Fig. 28, where the transmitted beam and a single scattered beam are represented, assuming the Bragg condition to be satisfied exactly. The successive planar wave fronts (spacing λ) of maximum elongation associated with the two beams overlap in the space behind the foil and produce maxima (and minima) in a set of parallel planes. Although the plane waves are propagating, these parallel planes of maximum (or minimum) amplitude form a stationary pattern that can be observed after magnification by the electron optical system. Elementary geometrical considerations show that the spacing between the planes in this pattern is equal to the spacing of the lattice planes for which the Bragg condition is satisfied. The direction of the fringes is perpendicular to the acting diffraction vector and thus parallel with the lattice planes; their intensity distribution is sinusoidal since they are two-beam interference fringes. The observed fringes represent admittedly rudimentary images of lattice planes, in terms of an intensity variation of the electron beam. They can also be considered to represent one Fourier component of the lattice potential.

FIGURE 28 Intuitive model for the formation of lattice fringes in (a) a perfect crystal and in (b) a crystal containing a stacking fault. [From Amelinckx, S. (1986). J. Electron Microsc. Tech. 3, 131.]

In a sense, the crystal foil acts in this simple case on the incident electron beam in very much the same way that certain optical devices act on a light wave to produce interference fringes by beam splitting (the Fresnel biprism or mirror experiment).

If the crystal contains a planar defect (such as a stacking fault, an antiphase boundary, or a discommensuration wall with displacement vector $\mathbf{R}$), an intersecting set of lattice fringes undergoes a lateral fractional shift along the intersection line, given by $\mathbf{g}\cdot\mathbf{R}$, where $\mathbf{g}$ is the acting diffraction vector. The model allows a simple understanding of this effect. Suppose that in the second part of the crystal the lattice planes are displaced over a distance y with respect to their position in the first part, so as to occupy the dotted position (Fig. 28). This shift will not affect the phase of the transmitted beam, but it will cause a phase shift of the scattered beam; the path difference f is given by $f = 2y\sin\theta$. This quantity also determines the relative phase shift of the wave fronts diffracted by the displaced part with respect to those diffracted by the undisplaced part. The stationary interference pattern formed by the transmitted beam T and the displaced scattered beam $S_d$ (dotted wave front) will now be displaced sideways over a distance δ (into the dotted position). From simple geometrical considerations one can conclude that $\delta/\Delta = (\mathbf{g}\cdot\mathbf{R})/d$, which is the relation used in practice to determine the displacement vector $\mathbf{R}$ of stacking faults and of antiphase boundaries from the fringe shifts along the trace of the interface.
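The fractional fringe shift $\mathbf{g}\cdot\mathbf{R}$ can be illustrated with a one-dimensional two-beam model. The sketch below assumes a transmitted beam of unit amplitude, one scattered beam of relative amplitude 0.5, and a fault that multiplies the scattered beam by the phase factor $\exp(-2\pi i\,\mathbf{g}\cdot\mathbf{R})$. This sign convention, the numerical values, and the names are all assumptions made for illustration.

```python
import math

def two_beam_intensity(x, g, amplitude=0.5, phase=0.0):
    """|1 + a exp(i(2 pi g x + phase))|^2 for the transmitted and one scattered beam."""
    return 1.0 + amplitude**2 + 2.0 * amplitude * math.cos(2.0 * math.pi * g * x + phase)

d = 0.2                    # nm, hypothetical interplanar spacing
g = 1.0 / d                # nm^-1, magnitude of the diffraction vector
r_along_g = 0.05           # nm, hypothetical component of R along g

g_dot_r = g * r_along_g                  # fractional fringe shift
fault_phase = -2.0 * math.pi * g_dot_r   # extra phase of the scattered beam (assumed sign)
fringe_shift = g_dot_r * d               # nm

print("g.R =", g_dot_r, "-> fringes shift by", fringe_shift, "nm of a", d, "nm period")
for x in (0.0, 0.03, 0.11):
    faulted = two_beam_intensity(x, g, phase=fault_phase)
    shifted = two_beam_intensity(x - fringe_shift, g)
    print(x, faulted, shifted)
```

In this convention the fringes in the faulted part coincide with the perfect-crystal fringes displaced by the fraction $\mathbf{g}\cdot\mathbf{R}$ of one period; reversing the sign convention reverses the direction of the shift but not its magnitude.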

Likewise, every experimental parameter or instrumental factor (e.g., defocusing and beam tilt) that influences the relative phase of T and S will also shift the fringe positions. It is therefore, in general, not possible to associate a fringe position with a plane in the crystal structure. The fringe spacing 1/g, however, does have structural significance: it is directly related to the interplanar spacing in the crystal.

If the foil is oriented exactly perpendicular to the incident beam, the reflections +g and −g are both excited to the same extent (the same s value). Lattice images can now be obtained by selecting the three beams −g, 0, and +g, which exhibit fringes with a spacing 1/g as well as 1/2g, with a different brightness (Fig. 29). The latter fringes (the second harmonic) arise as a result of the interference between the beams +g and −g. These fringes are remarkable in that the angle-dependent aberrations of the microscope do not disturb the image formation, since the interfering beams +g and −g enclose the same angle with the optical axis of the instrument. Fringes of this type are sometimes used by microscope manufacturers to demonstrate the capabilities of their instruments, but they must be evaluated with this last remark in mind.

FIGURE 29 (a) Two-dimensional interference fringes due to (0, g). (b) Three beam fringes, due to −g, 0, and +g (asymmetrical incidence). (c) Symmetrical incidence. [From Amelinckx, S. (1986). In “Examining the Submicron World,” p. 71, Plenum, New York.]
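The insensitivity of the second-harmonic fringes to angle-dependent aberrations can be checked with a simple three-beam model. In the sketch below both side beams carry the same amplitude and the same lens phase shift, as for symmetrical incidence; the amplitude 0.3, the phase values, and the function names are hypothetical. The cosine Fourier coefficients of the fringe profile are extracted by direct summation over one period.

```python
import cmath
import math

def three_beam_intensity(x, g, amplitude, lens_phase):
    """Beams -g, 0, +g; both side beams carry the same lens phase shift."""
    side = amplitude * cmath.exp(1j * lens_phase)
    theta = 2.0 * math.pi * g * x
    psi = 1.0 + side * cmath.exp(1j * theta) + side * cmath.exp(-1j * theta)
    return abs(psi) ** 2

def harmonic(order, g, amplitude, lens_phase, samples=64):
    """Cosine Fourier coefficient of the fringe profile over one period 1/g."""
    total = 0.0
    for k in range(samples):
        x = k / (samples * g)
        total += three_beam_intensity(x, g, amplitude, lens_phase) * math.cos(
            2.0 * math.pi * order * g * x)
    return 2.0 * total / samples

g = 5.0           # nm^-1, hypothetical
amplitude = 0.3   # hypothetical relative amplitude of the +g and -g beams
for lens_phase in (0.0, 0.7, 1.5):
    first = harmonic(1, g, amplitude, lens_phase)
    second = harmonic(2, g, amplitude, lens_phase)
    print("lens phase", lens_phase, "-> 1/g term:", round(first, 4),
          " 1/2g term:", round(second, 4))
```

The coefficient of the 1/g fringes varies as the cosine of the lens phase, whereas that of the 1/2g fringes stays fixed, which is why such fringes say little about the quality of the instrument.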

In cases where a number of reflections in a linear array (. . . −2g, −g, 0, +g, +2g . . .) are excited and selected, the fringes are still straight but their profile becomes more and more complicated, since higher harmonics now play a role.

If several beams are admitted through the selector aperture, they each contribute one (or more) Fourier components (i.e., planar sinusoidal image waves or intensity waves). The electron microscopic image can be considered as being due to the superposition of these different “image waves,” one corresponding to each diffraction vector.

The wave vectors of these waves are given by the vectors joining the diffraction spots that correspond to the admitted interfering beams, including the vectors joining scattered beams among each other, which interfere as a consequence of multiple scattering. An image wave has an amplitude proportional to the intensity of the diffraction spot that corresponds to the wave vector considered. In the simple case where only the central beam 0, one first-order reflection g, and one second-order reflection 2g in a linear arrangement are admitted, one obtains straight fringes with a periodic intensity distribution containing two Fourier terms, one with period 1/g and the second with period 1/2g, with corresponding vectors g = OA = AB and 2g = OB. The more beams used in the linear arrangement, the more Fourier components contribute, and the more detail can be imaged.

The generalization to two dimensions is obvious. For the simple diffraction pattern where, apart from the direct beam, only four scattered beams forming a square are admitted, one obtains the superposition of four sets of fringes with wave vectors $\mathbf{g}_1$, $\mathbf{g}_2$, $\mathbf{g}_3$, and $\mathbf{g}_4$ (Fig. 30). Moreover, the higher harmonics (i.e., $\mathbf{g}_5 = AC$ and $\mathbf{g}_6 = BD$) are also inevitably present, and, finally, $\mathbf{g}_7 = AB$, $\mathbf{g}_8 = BC$, $\mathbf{g}_9 = CD$, and $\mathbf{g}_{10} = DA$ are produced. The superposition of all these waves (i.e., their interference) is sufficient to produce a rudimentary image revealing the lattice without structural details on a subunit cell level, unless the structure is very simple (e.g., a face-centered-cubic element). In the particular case represented in Fig. 30 the number of beams is sufficient to represent the structure. As more beams are admitted, the structural detail revealed in the image becomes finer. There is, of course, a limit to the detail that can be represented, which is imposed by the width of the window through which the beams can be transmitted in the correct phase relationship, and this in turn is determined by the resolution of the microscope.
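The count of ten image waves follows from taking every pair of admitted spots. The sketch below enumerates the joining vectors for the direct beam O and four scattered beams A, B, C, and D placed at the corners of a square. The coordinates, in units of the first-order spacing, are a hypothetical labelling and may differ from the arrangement drawn in Fig. 30.

```python
from itertools import combinations

# Hypothetical spot positions in reciprocal space (units of |g1|).
beams = {"O": (0, 0), "A": (1, 0), "B": (0, 1), "C": (-1, 0), "D": (0, -1)}

def image_wave_vectors(spots):
    """Vectors joining every pair of admitted diffraction spots."""
    waves = {}
    for (name1, p1), (name2, p2) in combinations(spots.items(), 2):
        waves[name1 + name2] = (p2[0] - p1[0], p2[1] - p1[1])
    return waves

waves = image_wave_vectors(beams)
for pair, vector in waves.items():
    length_squared = vector[0] ** 2 + vector[1] ** 2
    print(pair, vector, "|g|^2 =", length_squared)
```

Four vectors join the direct beam to the scattered beams, two are of the second-harmonic type (AC and BD), and four join neighbouring scattered beams, giving the ten Fourier components listed above.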

This consideration implicitly assumed that all the waves considered interfere “in phase” (i.e., in the correct phase relationship) with the incident beam and among themselves.

Unfortunately, with increasing order of the Fourier components, the corresponding beams enclose angles of increasing magnitude with the optical axis. As long as we make use of only beams that pass through the “window” or “plateau” in the image transfer function (i.e., the sin χ-versus-β curve), the different components interfere with the correct phase relationship and hence produce a directly interpretable image for a properly chosen defocus value (the Scherzer defocus). For high-resolution studies it is important to have an instrument with a wide plateau in the image transfer function and to eliminate the beams outside it by an aperture. From Fig. 27 the advantages of using a high voltage become apparent in this respect.

FIGURE 30 Two-dimensional lattice image formation. (a) Reflections used in imaging along a centered square zone and (b) Fourier components giving rise to the image.


One can also use beams corresponding to β angles outside the window, but then the resulting image is no longer directly interpretable except in very special cases, such as in the aberration-free focus method. For simple crystals with a small unit cell (e.g., Si), a number of reflections outside the plateau can still be kept in the correct phase by using a small number of beams and by choosing the defocus and an appropriate $C_s$ value. An image is then obtained that represents the true structure with a resolution that may exceed the point resolution.

The interpretation usually proceeds by the trial-and-error method, which consists of comparing the calculated image for a given structural model to the observed image. In this calculation the phase shifts introduced by the lens system must be taken into account properly. Usually the calculation is made for different defocus values and for different specimen thicknesses, since these parameters affect the phases of the different Fourier components and, hence, the final image.

A number of computational methods are in use. Examples of the application of this method are reproduced in Fig. 31. The calculated images have been plotted on a cathode-ray screen, which simulates images of the same nature as the ones observed in the microscope.

XIX. IMAGING MODES

A. One-Dimensional Images

If only a one-dimensional representation of the structure is required because the structure is a long-period one-dimensional superstructure, one can use the following imaging techniques (Fig. 32a).

1. Mode 1

One can select two neighboring superstructure reflections belonging to the same basic spot in a row of spots passing through the origin (i.e., in a central row). One uses only one Fourier component, and consequently the image reveals only the long spacing.

2. Mode 2

If all superstructure reflections belonging to a central row are used, excluding the basic reflections in the row, one obtains the distribution of long spacings.

3. Modes 3 and 4

If one selects basic reflections as well as superlattice reflections in a central row (circle 3) or a noncentral row (circle 4), one also images the set of lattice planes of the basic structure that is parallel with the periodic interfaces that produce the superstructure. Any variability of the long spacing is now imaged in terms of the spacing of the basic lattice.

FIGURE 31 Computed and corresponding experimental structure image of Au4Mn. (a) Four computed images at optimum defocus but for different thicknesses. The bottom-left corner is occupied by a manganese column. The brightest dots are thus located at manganese columns. (b) Experimental image: Note that in the thick part only manganese columns produce bright dots, whereas in the thinnest part all atom columns produce bright dots. The inset shows a model of the tetragonal structure. [From Van Dyck, D., Van Tendeloo, G., and Amelinckx, S. (1982). Ultramicroscopy 10, 263–280.]

4. Mode 5

One can also use a sequence of superlattice reflections from a noncentral row. This is a useful mode if one wants to reveal polysynthetic subunit cell twinning.


FIGURE 32 Imaging modes: The figure represents the diffraction spots admitted through the selector aperture. (a) One-dimensional lattice fringes and (b) two-dimensional lattice fringes. [From Amelinckx, S. (1986). In “Examining the Submicron World,” p. 71, Plenum, New York.]

B. Two-Dimensional Images

For Modes 6–9 see Fig. 32b.

1. Mode 6

If only the superlattice needs to be imaged, it is sufficient to include only pairs of neighboring superlattice spots in two directions. This can be done either in the dark field (6a) or in the bright field (6b). In the latter case the image is formed by the direct beam and the first shell of superlattice reflections around it; the contrast is usually lower than for 6a.

2. Mode 7

All superlattice reflections are selected that are present within one mesh of the reciprocal lattice of the basic structure, excluding the direct beam and the basic reflections. This is clearly a multiple-beam dark-field image. The Fourier components contributing to the image are, in general, just sufficient to locate the positions of the minority atoms.

3. Mode 8

The direct beam is admitted and all other basic reflections are excluded, but as many shells of superlattice reflections as feasible are admitted. The number of Fourier components is now more than is required to image the columns of minority atoms; such columns will therefore be imaged as sharper dots than in mode 7.

4. Mode 9

All beams originating from the basic as well as from the superstructure reflections are selected, provided they do not correspond to spacings that are smaller than the instrumental resolving power of the microscope. For most current instruments this means that up to the first or, possibly, up to the second shell of reflections of a face-centered-cubic (FCC) matrix can usefully be included. The FCC matrix will now be prominently revealed in the thin parts of the specimen, whereas in the thicker parts the superstructure will be revealed.

Examples of high-resolution images are reproduced in Section XXIV.

C. High-Resolution Imaging Interpretation and Simulation

Image interpretation consists in relating an atomic structure model to an HRTEM image. A very thin foil acts as a two-dimensional phase grating; in such a foil the local image brightness is directly related to the local projected lattice potential and hence to the structure, as mentioned in Section XVI. Unfortunately, this is true only for extremely thin specimens in which multiple diffraction does not take place. In real specimens the image is the result of the dynamical interaction between the numerous diffracted beams, whose amplitudes also depend on the foil thickness. Moreover, the microscope introduces angle-dependent phase shifts between these beams, which depend on the focus and on instrumental parameters ($C_s$, $C_c$, beam divergence, etc.) as discussed above.

Image interpretation proceeds mostly by “trial and error,” in much the same way as structure determination by X-ray diffraction in its early days. The image of a structure is computer simulated, then compared with the observed image, and the model is refined until the correspondence between the observed and the simulated image is judged satisfactory. This similarity can be quantified by a “goodness-of-fit” criterion similar to that used in X-ray diffraction.
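One possible form of such a goodness-of-fit criterion is a residual of the kind used in X-ray crystallography. The sketch below is an assumption, not the criterion used by any particular program: it scales the simulated intensities to the observed ones by least squares and returns the normalized sum of absolute differences. The pixel values and model names are invented for illustration.

```python
def r_factor(observed, simulated):
    """Residual R = sum |I_obs - k I_sim| / sum |I_obs|, with least-squares scale k."""
    scale = sum(o * s for o, s in zip(observed, simulated)) / sum(s * s for s in simulated)
    mismatch = sum(abs(o - scale * s) for o, s in zip(observed, simulated))
    return mismatch / sum(abs(o) for o in observed)

def best_model(observed, candidates):
    """Name of the candidate image with the smallest residual."""
    return min(candidates, key=lambda name: r_factor(observed, candidates[name]))

# Invented pixel intensities along one image row.
observed = [1.00, 0.82, 1.05, 0.80, 1.02, 0.79]
candidates = {
    "model_a": [1.00, 0.80, 1.00, 0.80, 1.00, 0.80],   # correct periodicity
    "model_b": [1.00, 1.00, 0.80, 1.00, 1.00, 0.80],   # wrong periodicity
}
for name, image in candidates.items():
    print(name, "R =", round(r_factor(observed, image), 4))
print("preferred:", best_model(observed, candidates))
```

In practice the comparison would be repeated over a range of trial defocus values and thicknesses, as described above.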

The most frequently used simulation method is the “multislice” method. The specimen is dissected into slices, limited by planes parallel to the foil plane, that are thin enough for each slice to be considered a pure phase grating. The structure in each slice is represented by its projected potential; the slice then acts as a two-dimensional phase grating. The specimen now consists of a succession of parallel two-dimensional phase gratings, separated by layers of vacuum with a thickness equal to the slice thickness. Scattering of the electrons is followed slice after slice by computing the electron wave amplitude, taking into account alternately the phase shift due to the propagation between two gratings and the diffraction by the successive gratings. The latter operation amounts to a Fourier transformation, as shown above. In practice, the slices can be taken to be one unit cell thick. For each slice the exit waves of the previous slice act as the input waves. The electron waves emerging from the back surface of the specimen are subsequently assumed to suffer the angle-dependent phase shifts introduced by the microscope before interfering to form the final image. Suitable computer programs using the multislice algorithm are available commercially.
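The alternation of phase grating and vacuum propagation can be sketched in one dimension. The following is a toy model under stated assumptions, not a production multislice code: the projected phase σφ of one slice is an invented periodic profile, the propagation is applied in reciprocal space with a Fresnel factor $\exp(-i\pi\lambda\,\Delta z\,u^2)$ whose sign convention is an assumption, and a direct O(N²) discrete Fourier transform is used so that the example needs only the standard library. Lengths are in nanometres.

```python
import cmath
import math

def dft(values, sign):
    """Discrete Fourier transform; sign=-1 forward, sign=+1 inverse (unnormalized)."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]

def propagate(wave, cell, wavelength, thickness):
    """Free-space (Fresnel) propagation over one slice, applied in reciprocal space."""
    n = len(wave)
    spectrum = dft(wave, -1)
    for j in range(n):
        u = (j if j <= n // 2 else j - n) / cell      # spatial frequency, nm^-1
        spectrum[j] *= cmath.exp(-1j * math.pi * wavelength * thickness * u * u)
    return [v / n for v in dft(spectrum, +1)]

def multislice(projected_phase, n_slices, cell, wavelength, thickness):
    """projected_phase[k] is sigma*phi (radians) of one slice at sample k."""
    wave = [1.0 + 0.0j] * len(projected_phase)         # incident plane wave
    grating = [cmath.exp(1j * p) for p in projected_phase]
    for _ in range(n_slices):
        wave = [w * t for w, t in zip(wave, grating)]          # phase grating
        wave = propagate(wave, cell, wavelength, thickness)    # vacuum layer
    return wave

n = 16
cell = 0.4            # nm, hypothetical repeat distance
wavelength = 0.0037   # nm, roughly 100 kV
thickness = 0.4       # nm, one slice per unit cell
phase = [0.3 * math.exp(-((k - n / 2) / 2.0) ** 2) for k in range(n)]  # one atom column

exit_wave = multislice(phase, 10, cell, wavelength, thickness)
total_intensity = sum(abs(v) ** 2 for v in exit_wave)
print("total intensity:", total_intensity, "(incident:", float(n), ")")
print("intensity at the column:", abs(exit_wave[n // 2]) ** 2)
```

Because both steps only change phases, the total intensity is conserved, which is a convenient sanity check; absorption and the lens phase shifts of Eq. (13) would be applied as further factors.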

More recently developed methods proceed by the direct retrieval of the projected structure. The main problem with direct retrieval is similar to the phase problem in X-ray diffraction; it is a consequence of the fact that only intensities are recorded in the image (and in the diffraction pattern), not amplitudes, unless use is made of holographic methods.

A recent retrieval method relies on the use of a sequence of images taken at closely spaced foci (focus variation method). This allows us to eliminate the effects of the microscope optics and, thus, to obtain the corrected wave function at the exit face of the specimen. Using an analytical approach, based on the channeling model, then allows us to obtain the projected structure.

XX. SCANNING ELECTRON MICROSCOPES

So far we have discussed conventional transmission electron microscopy (CTEM), in which the electron probe is a parallel stationary beam incident along a fixed direction with respect to the specimen and in which the transmitted or scattered beam(s) produce the image on a recording medium (film, channel plate-CCD camera, etc.).

FIGURE 33 Schematic ray paths of electron beams in an electron microscope in two modes: conventional electron microscopy (CTEM) and scanning transmission electron microscopy (STEM).

In a second class of electron optical instruments (so-called scanning electron microscopes), a fine electron probe is scanned across the specimen and the signal of interest, produced locally by the probe, is selected, detected, amplified, and displayed by modulating the intensity of the electron beam of a TV monitor which is scanned synchronously with the probe. The signal can be observed either in backscattering, as in conventional scanning electron microscopy (CSEM or simply SEM), or in transmission (STEM). In this chapter we consider only the latter case and compare it to CTEM (Fig. 33).

In a scanning transmission electron microscope (STEM) the signal often consists of the transmitted (or scattered) electron beam; however, other signals (e.g., X rays) can also be detected, even in parallel with the electrons, provided that adequate detectors are available on the instrument.

In CTEM the achievable resolution is determined mainly by the quality of the imaging optics behind the specimen; in STEM the resolution limit is determined mainly by the probe size (which is usually of the order of 1 nm or less), i.e., by the probe forming optics ahead of the specimen.

In STEM the magnification is purely geometrical, i.e., it is given by the ratio of the linear size of the area on the monitor scanned by the electron beam to that of the corresponding specimen area scanned by the probe. Whereas CTEM instruments are conceptually closely related to the classic light microscope, STEM instruments are not unlike TVs; they are essentially “mapping” or “plotting” devices of the spatial variation of the various signals captured by adequate detectors. The analog signals can further be digitized and images can thus be electronically treated and, for instance, magnetically stored.

The relationship between STEM and CTEM operating modes is represented schematically in Fig. 33. The diagram should be read from right to left in the case of CTEM and from left to right in the case of STEM, i.e., the electrons are traveling in opposite directions in the two cases.

In the STEM the source in A is often a field emission gun, since this has a high brightness. The objective lens is a demagnifying lens system producing a slightly convergent electron beam that becomes the probe, which is scanned over the specimen by means of the deflector following a two-dimensional raster. Behind the objective lens a convergent-beam electron diffraction pattern is formed. Part of this pattern is selected, and the signal is collected, amplified, and displayed on a TV monitor. In this way bright- and dark-field images from elastically scattered electrons can be produced; moreover, a variety of other modes of operation is possible. Collecting many beams allows us to image atom columns in crystals as a dot pattern.

In the Z-contrast mode an annular detector, capturing the large-angle scattered beams only, allows us to obtain an image formed predominantly by incoherently scattered electrons. High-resolution images formed with large-angle incoherently scattered electrons have the important property that the bright dot image does not suffer contrast reversal with changing defocus, i.e., the columns are always imaged as bright dots, irrespective of the amount of defocus or foil thickness, whereas in CTEM the same atom column can be imaged either as a bright or as a dark dot, depending on the thickness and focus. This difference in behavior is a consequence of the difference in shape of the image contrast transfer function (CTF). Under incoherent imaging conditions the CTF decreases monotonically with the spatial frequency, whereas in the coherent case it depends on the spatial frequency in an oscillatory manner. The dot brightness, furthermore, increases with the average Z value of the atoms along the column. Chemically different columns are thus imaged as dots of different brightness.
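The contrast between an oscillating coherent transfer function and a monotonically decreasing incoherent one can be illustrated numerically. The sketch below rests on several assumptions that are not taken from the text: the coherent case is modeled by the phase-contrast transfer function sin χ(q) with χ(q) = πλΔf q² + ½πCsλ³q⁴ (one common sign convention), the incoherent case by the transfer function of an idealized aberration-free circular aperture with cutoff q0, and all numerical values are illustrative.

```python
import numpy as np

wavelength = 2.508e-3   # nm, roughly 200 kV (assumed)
cs = 1.0e6              # nm (1 mm), assumed spherical aberration constant
defocus = -1.2 * np.sqrt(cs * wavelength)  # nm, a Scherzer-like underfocus (assumed)

q = np.linspace(0.0, 8.0, 2001)  # spatial frequency in 1/nm

# Coherent imaging: the transfer function sin(chi) oscillates with q
chi = np.pi * wavelength * defocus * q**2 + 0.5 * np.pi * cs * wavelength**3 * q**4
coherent_ctf = np.sin(chi)

# Incoherent imaging, idealized: autocorrelation of a circular aperture
q0 = 4.0  # aperture cutoff in 1/nm (assumed)
s = np.clip(q / (2.0 * q0), 0.0, 1.0)
incoherent_otf = (2.0 / np.pi) * (np.arccos(s) - s * np.sqrt(1.0 - s**2))

# Count contrast reversals (sign changes) in the coherent case; skip q = 0
sign_changes = int(np.sum(np.diff(np.sign(coherent_ctf[1:])) != 0))
# Check that the incoherent transfer never increases with q
is_monotonic = bool(np.all(np.diff(incoherent_otf) <= 1e-12))

print("coherent sign changes:", sign_changes)
print("incoherent transfer monotonic:", is_monotonic)
```

Each sign change of the coherent function corresponds to a band of spatial frequencies transferred with reversed contrast, which is why the same column can appear bright or dark. A real STEM probe has aberrations, so the incoherent curve here is only a qualitative guide.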

Z-contrast high-resolution images can be interpreted intuitively; they are thus particularly well suited for imaging the geometry of defect configurations. Often an electron energy loss spectrometer (EELS) is fitted to the outgoing electron beam, which makes it possible to produce images using only electrons having suffered no energy loss or a characteristic energy loss, thus allowing chemical mapping.

XXI. HIGH-VOLTAGE ELECTRON MICROSCOPY

Electron microscopes with an accelerating voltage significantly higher than the conventional 100 kV have come into use over the last decade. They offer the possibility of greater penetrating power and the use of thicker specimens that are more representative of the bulk (Fig. 34). Due to the shorter wavelength of the electrons used, they allow a better instrumental resolution to be achieved. Moreover, the contrast transfer function of the lens system can be designed so that it produces roughly the same phase shift for a larger angular range of beams, and hence gives a more faithful representation of crystal structures, than with 100-kV microscopes. The use of high-resolution, high-voltage microscopy offers, at present, prospects for the direct study of crystal structures. However, the displacement and ionization damage produced by high-energy electrons constitute an intrinsic limitation that restricts the observation time as well as the resolution. Medium-voltage (∼300- to 400-kV) electron microscopes may turn out to be the best compromise for a number of applications (a 400-kV image is shown in Fig. 51).

FIGURE 34 Array of dislocations in stainless steel situated in a glide plane observed by means of high-voltage electron microscopy. [From Dupouy, G., Perrier, F., and Durieu, L. (1970). J. Microsc. 9, 575.]
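How much the wavelength shortens with accelerating voltage can be checked with the relativistic de Broglie relation, λ = h / sqrt(2 m0 e V (1 + eV / (2 m0 c²))). The following Python sketch evaluates it for the voltages mentioned in this section; the function name is a hypothetical choice, and the constants are the standard SI values.

```python
import math

H = 6.62607015e-34        # Planck constant, J s
M0 = 9.1093837015e-31     # electron rest mass, kg
E = 1.602176634e-19       # elementary charge, C
C = 299792458.0           # speed of light, m/s

def electron_wavelength_pm(voltage_kv):
    """Relativistic electron wavelength in picometers for a given voltage in kV."""
    v = voltage_kv * 1e3
    # Relativistically corrected momentum of the accelerated electron
    momentum = math.sqrt(2.0 * M0 * E * v * (1.0 + E * v / (2.0 * M0 * C**2)))
    return H / momentum * 1e12

for kv in (100, 300, 400, 1000):
    print(f"{kv:5d} kV: {electron_wavelength_pm(kv):.3f} pm")
```

The wavelength falls from about 3.7 pm at 100 kV to about 0.87 pm at 1000 kV, which is the basis of the improved instrumental resolution; the attainable resolution in practice also depends on the lens aberrations and on radiation damage.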

XXII. ANALYTICAL ELECTRON MICROSCOPY

Apart from elastic scattering, which is responsible for electron diffraction, inelastic scattering events also occur as electrons pass through the foil. Inelastic processes can be caused by (1) single-electron excitations, such as X-ray and Auger electron production, and (2) collective excitations, such as volume and surface plasma oscillations (plasmons) and phonons. Collective excitations can be revealed indirectly as characteristic energy losses of the incident electrons. These various types of electron–specimen interactions can be used to turn an electron microscope into a versatile in situ microanalytical instrument where atomic-scale imaging can be coupled with chemical spectroscopy. The most important interactions are summarized in Fig. 35. Because they originate in the atomic structure of the elements, the emerging signals carry chemical information on the irradiated area. The use of this information for chemical analysis on a microscale was first suggested by Castaing. Instrumental improvements and the advent of scanning techniques have resulted in applications whereby the various types of interactions can be usefully exploited for obtaining chemical information down to the subnanometer scale. The major physical processes used in electron microscopes to obtain chemical information are characteristic X rays, characteristic energy losses, and Auger electron production.

FIGURE 35 Various types of interactions of electrons with a specimen in an electron microscope, giving rise to signals (X rays, secondary electrons, elastic and inelastic scattered electrons, etc.) that can be used for analytical electron microscopy.

A. Energy Loss Spectroscopy and Imaging

Energy loss is characteristic of an element, and if it can be detected by energy analysis of the transmitted electrons, elemental analysis is possible. This is the case, however, only for sufficiently thin specimens, where other energy loss processes do not overwhelm the characteristic spectrum. Plasmons have energies in the range of 10–20 eV, whereas phonon energies are of the order of 10 meV. Individual quantized plasmons are thus much easier to detect than phonons. Since the positions of the plasmon loss peaks are characteristic of the materials, they can be used as analytical tools for aluminum and magnesium, for example.
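The quoted 10–20 eV range can be made plausible with a free-electron estimate of the bulk plasmon energy, ħω_p = ħ sqrt(n e² / (ε0 m)), where n is the valence electron density. This is only a rough sketch: the free-electron model, the function name, and the approximate handbook densities and valences used below are assumptions introduced for the illustration, and measured loss peaks differ somewhat from these estimates.

```python
import math

E_CHARGE = 1.602176634e-19   # C
EPS0 = 8.8541878128e-12      # F/m
M_E = 9.1093837015e-31       # kg
HBAR = 1.054571817e-34       # J s
N_A = 6.02214076e23          # 1/mol

def free_electron_plasmon_ev(density_g_cm3, molar_mass_g_mol, valence):
    """Free-electron bulk plasmon energy in eV."""
    # Valence electrons per cubic meter
    n = valence * density_g_cm3 * 1e6 / molar_mass_g_mol * N_A
    omega_p = math.sqrt(n * E_CHARGE**2 / (EPS0 * M_E))
    return HBAR * omega_p / E_CHARGE

# Approximate room-temperature densities (assumed values)
al_plasmon = free_electron_plasmon_ev(2.70, 26.98, 3)
mg_plasmon = free_electron_plasmon_ev(1.74, 24.305, 2)
print(f"Al: {al_plasmon:.1f} eV, Mg: {mg_plasmon:.1f} eV")
```

Both estimates fall inside the 10–20 eV window and are separated by several electron volts, which is why the plasmon peak position can serve to distinguish such materials.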

However, for chemical analysis, the absorption edges in EELS curves are more important; they reflect the absorption phenomena leading to X-ray production and exhibit a fine structure on the high-energy side. This fine structure is referred to as EXELFS (extended energy loss fine structure), which is the analogue of EXAFS (extended X-ray absorption fine structure). The steep rise on the low-energy side of the absorption edge is due to the excitation of inner-shell electrons and characterizes the element.

The fine structure is produced by the electron wave originating from an inner shell that is partially back-reflected by the surrounding atoms, which leads to a modulation of the excitation probability of inner-shell electrons. This fine structure can therefore provide information not only on the chemical nature of the absorbing atom, but also on the number of nearest neighbors and their distances. A typical spectrum of an electron beam analyzed for energy loss after interaction with a TiC specimen is shown in Fig. 36.

Nowadays dispersion by a wedge magnet is used to analyze the characteristic losses. For imaging applications of electron energy loss spectroscopy, in particular, the brightness of the primary electron source is of great importance since only a narrow energy band corresponding to the loss peak is filtered and used for the imaging. The slit-filtered characteristic loss beam then enters quadrupole lens configurations allowing HREM imaging, thus enabling subnanoscale imaging related to the presence of particular elements. A scheme of the experimental setup (Gatan imaging filter) is shown in Fig. 37.

FIGURE 36 Example of an energy loss spectrum. The intensity of the electrons is reproduced versus energy loss (in eV). Characteristic peaks for the elements C and Ti are clearly distinguished in the 300- to 500-eV energy loss range.

FIGURE 37 Imaging filter (GIF) as an attachment to an electron microscope. It produces energy-filtered electron images and diffraction patterns, also known as electron spectroscopic images and diffraction patterns. It also produces electron energy loss spectra.

B. X-Ray Microanalysis

The higher energy state of the ionized atom can be reduced by an electron of an outer shell, e.g., $L_{\mathrm{II}}$, falling into the K shell. This process results in the radiative emission of a photon with an energy equal to the difference between the two excited states:

$$E_{\mathrm{ph}} = E_K - E_L.$$

The probability of these characteristic X rays being emitted is called the X-ray fluorescence yield. It increases with atomic number and is larger for K-line emissions than for L-line emissions.
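As a worked instance of the relation between the photon energy and the two levels involved, the short Python sketch below estimates the copper K lines from approximate inner-shell binding energies. Copper is chosen here only as an illustration; the binding energies are approximate tabulated values quoted as assumptions, and the function name is hypothetical. With binding energies taken as positive numbers, the photon energy is the K binding energy minus the L binding energy.

```python
# Approximate electron binding energies of copper in eV (assumed tabulated values)
CU_BINDING_EV = {"K": 8979.0, "L2": 952.3, "L3": 932.7}

def emission_energy_ev(binding, inner, outer):
    """Photon energy released when an electron from `outer` fills a hole in `inner`."""
    return binding[inner] - binding[outer]

k_alpha1 = emission_energy_ev(CU_BINDING_EV, "K", "L3")
k_alpha2 = emission_energy_ev(CU_BINDING_EV, "K", "L2")
print(f"K-L3: {k_alpha1:.1f} eV, K-L2: {k_alpha2:.1f} eV")
```

Both results lie close to 8.0 keV, and the small splitting between them reflects the separation of the two L subshells. An energy-dispersive detector typically does not resolve such a splitting and records a single K line.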

For X-ray microanalysis energy-dispersive crystal detectors have nowadays replaced the curved crystal wavelength-dispersive systems based on Bragg diffraction. This technique employs a lithium-drifted silicon detector crystal, which has the advantage of covering a wide range of energies simultaneously. The convenience of display and spectrum processing and acceptable resolution have made this technique very popular. An extensive account of the instrumental aspects can be found in the Bibliography.

It is clear that X-ray microanalysis equipment mounted on a TEM constitutes a powerful tool enabling observation of the substructure of interest at high magnifications and in situ analysis of the elemental composition. This facility becomes even more powerful in a scanning transmission assembly (STEM) which has a very small electron probe size available, enabling analysis of particles as small as a few nanometers. Furthermore, use can be made of scanning optics to make images in the signal of, for example, one particular line, thus mapping the distribution of certain elements as in a micrograph, but on a much smaller scale.

C. Auger Electron Emission

In contrast with the radiative method of deexciting the ion by emission of characteristic X rays, the energy liberated by an electron falling into the inner shell can be used to expel an electron from one of the higher shells. These expelled electrons have characteristic energies for each element and are called Auger electrons. Their energies are rather low, so they are easily absorbed and their use is limited to surface analysis. Because of the necessity of an ultrahigh vacuum for reducing inelastic collisions, this technique is not commonly used in conjunction with electron microscopes and we do not elaborate on it further.

Both electron energy loss spectroscopy (EELS) and energy-dispersive X-ray spectroscopy (EDXS) enable (within their limits) convenient qualitative and quantitative elemental analysis. Whereas EELS is more specifically suited for light elements, EDXS is not; however, it can detect a wide range of elements, from beryllium to uranium or from sodium to uranium, depending on the kind of spectrometer used (wavelength dispersive or energy dispersive). The former, using curved crystal geometry, is more suitable for quantitative analysis. The detection limit with EELS for an element in a matrix amounts to approximately $10^{-19}$ g, which means about 100 to 1000 atoms. For EDXS a sensitivity of 1% can be expected for an element detectable in the presence of another one.
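The conversion of a detectable mass into a number of atoms is a one-line calculation, N = (m / M) N_A, and it depends on the atomic mass of the element. The Python sketch below carries it out for three elements chosen purely for illustration; the element list and the function name are assumptions, and the atomic masses are standard values.

```python
N_A = 6.02214076e23  # Avogadro constant, 1/mol

def atoms_in_mass(mass_g, molar_mass_g_mol):
    """Number of atoms contained in a given mass of a pure element."""
    return mass_g / molar_mass_g_mol * N_A

detection_limit_g = 1e-19
molar_masses = {"C": 12.011, "Ti": 47.867, "U": 238.03}  # g/mol
atom_counts = {el: atoms_in_mass(detection_limit_g, m) for el, m in molar_masses.items()}

for el, count in atom_counts.items():
    print(f"{el}: about {count:.0f} atoms")
```

The count ranges from a few hundred atoms for a heavy element to a few thousand for a light one, so the figure quoted in the text should be read as an order of magnitude rather than a sharp limit.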

Of the three techniques applicable in a TEM, X-ray microanalysis appears to be the most widely used. It is limited mainly on the low-Z element side and in the resolution of the imaging capabilities. Energy loss spectroscopy has advantages in both these aspects, and its potential to provide nanoscale information on the elemental beams is most promising. Surface Auger spectroscopy has only limited applications except in dedicated STEM instruments, where instrumental conditions such as UHV and a high gun brightness are available. The three techniques, having in common their origin in inner-shell excitations by the primary electron beam, appear to be complementary rather than competitive.

XXIII. SPECIMEN PREPARATION FOR TRANSMISSION ELECTRON MICROSCOPY

One of the major drawbacks of electron microscopy techniques is the necessity of thinning the sample down to extremely small thicknesses, depending on the accelerating voltage used and on the resolution [diffraction contrast (100 Å) or HREM (1 to 5 Å)] desired. The first dependence is due mainly to the higher penetration with increased voltage, and the second to the increased inelastic energy loss in thicker foils, giving rise to energy spread and chromatic aberration, with a consequent loss of resolution. For 100-kV microscopes a thickness of 1000 to 2000 Å is typically required for the diffraction contrast mode, whereas for HREM a thickness of 100 Å or less is required. For 1000-kV microscopes thicknesses of up to 1 to 5 µm of Si are acceptable for diffraction contrast observations.

These thickness requirements and the preparation of samples are undoubtedly destructive with respect to the bulk original material, which is indeed one of the disadvantages of TEM techniques. Fortunately appropriate techniques for specimen thinning have been developed for specific types of materials whereby the destructive aspects are kept under control and very valuable information on defect and structure characterization can be obtained.

A. Thinning Methods

The main procedures for thinning various types of materials are summarized below. Sawing, slicing, and grinding or cold work (milling) are usually required as preparatory thinning procedures to obtain a starting thickness of 100 µm, from which the final thinning method proceeds.

Metals and alloys are generally thinned by electropolishing. Trepanned disks are mounted in special inert holders to yield self-supporting disks that can be mounted directly in the holder of the microscope.

Ceramics and semiconducting Si materials and similar alloys are successfully thinned by ion milling. Ar ions (5 kV) bombard a rotating 3-mm disk under grazing incidence until perforation occurs, usually in the center of the disk. Chemical thinning is also used for these materials, but ion milling is more universal and reproducible.

For high-resolution observations a number of materials which exhibit a conchoidal fracture habit can be thinned for HREM by mechanical crushing. The fragments are dispersed on holey carbon grids, and after appropriate tilting in the goniometer stage of the object holder, the thin fragments provide acceptable HREM observation conditions.

Layered crystals can usually be sliced down to useful thicknesses by repeated cleavage, and for some materials thin films can be prepared directly by one of the various types of deposition techniques, e.g., evaporation and chemical vapor deposition, on a substrate which is subsequently removed by dissolution. Multilayered-device materials are often prepared by this type of deposition or by in-depth doping or chemical treatment. For this type of material it is often of major importance to know the succession of the layers, their thicknesses, and detailed information on the interfaces. Therefore specimen preparation in “cross section” is required. Figure 38 shows the successive steps in sample preparation in comparison to “plan-view” specimen preparation as can be performed with the above-mentioned thinning methods.

Nowadays highly sophisticated instrumentation for ion milling semiconductor device samples allows viewing and thinning, in the same instrument, of particular areas in a device to diagnose and/or characterize device or interface configurations in loco with micrometer accuracy. This technique consists in using a focused ion beam for thinning and for imaging and is called the FIB method.

FIGURE 38 Scheme illustrating the successive steps to prepare thin transmission electron microscope specimens in the plan-view mode (left) and in the cross-section mode (right).

XXIV. EXAMPLES OF APPLICATIONS

A. Dislocations

The first application of transmission electron microscopy (TEM) was the study, by means of diffraction contrast, of dislocation configurations in materials heat treated in different ways; TEM allowed the first moving dislocations to be observed in situ. In Fig. 34 we reproduce an array of dislocations confined to a well-defined glide plane in stainless steel. Since the dislocations in this material are dissociated in Shockley partial dislocations, their movement is restricted to a well-defined glide plane, as cross-glide is difficult. This micrograph was made in a high-voltage electron microscope, which allowed a rather thick foil to be observed. The thickness can be deduced from the number of oscillations in the dislocation image.

In Fig. 39 a network of dissociated dislocations in graphite is shown. The configuration is situated in the (0001) plane, which is a glide plane and a cleavage plane. The cleaved specimen is also limited by (0001) planes. In Fig. 39a the extended dislocation nodes, which contain a stacking fault, exhibit a dark contrast. Also, one set of triple-ribbon dislocations is visible. In Figs. 39b–d, the configuration is imaged under three two-beam conditions. In each of the three images one set of dislocations is out of contrast; the Burgers vector of the nonimaged dislocations is perpendicular to the active diffraction vector.

FIGURE 39 Hexagonal network of dissociated dislocations in the basal plane of graphite: (a) stacking fault contrast; (b–d) different line contrasts. [From Amelinckx, S., and Delavignette, P. (1960). J. Appl. Phys. 31, 2126.]

FIGURE 40 Dislocation loops in irradiated platinum. A contrast experiment allows identification of the loop as being due to vacancies. [Courtesy of E. Ruedl.]

Under irradiation by neutrons, point defects are formed that, with the appropriate heat treatment, agglomerate into small dislocation loops. Vacancies and interstitials may precipitate into separate disks, which, after collapsing, give rise to dislocation loops of two types. It is possible to distinguish these two types of dislocation loops by means of contrast experiments since the Burgers vectors of the bordering dislocations have opposite signs (Fig. 40).

On quenching, similar dislocation loops, resulting from the agglomeration of vacancies, are observed in a number of metals (Fig. 41). In metals with a low stacking fault energy, such as gold and cobalt, the vacancies introduced by quenching are found to aggregate in stacking fault tetrahedra, a defect first discovered by means of TEM. The edges of such a tetrahedron are formed by stair rod dislocations, whereas the faces are stacking faults (Fig. 42). High-resolution electron microscopy allows one to distinguish between tetrahedra due to vacancies and tetrahedra due to interstitials.


FIGURE 41 Dislocation loops introduced by quenching in aluminum. [From Hirsch, P. B., Silcox, J., Smallman, R. E., and Westmacott, K. H. (1958). Phil. Mag. 3, 897.]

Figure 43 shows an application in the field of microelectronics. The sequence of dislocations results from the stresses caused by the fabrication process of field-effect devices. From the geometry of this dislocation array, the magnitude of the stress can be estimated.

The unambiguous interpretation of dislocation images should be based on a comparison of computer-generated images, calculated using dynamical diffraction theory, with experimental images.

The atomic arrangement around the core of edge dislocations can be imaged by means of high-resolution electron microscopy when viewed along a direction parallel to the dislocation line. Figure 44 shows a 60° dislocation in silicon.

B. Planar Interfaces

As explained, stacking faults are imaged as areas of different shade or as fringe patterns. The first case arises when the fault plane is parallel to the foil plane (see, e.g., Fig. 39). Fringes are produced when the fault plane is inclined with respect to the foil plane. Figure 45 shows a bright- and a dark-field image of the same stacking fault, limited by Shockley partials in a face-centered cubic copper–aluminum alloy. From the nature of the outer fringes in the dark- and bright-field images, one can deduce whether the stacking fault is intrinsic (i.e., of the type ABCA–CABC) or extrinsic (i.e., of the type ABCABA–CABC). The displacement vector R of an interface, which relates the two crystal parts, can be determined by making images in different two-beam situations and applying the condition for the absence of contrast (so-called extinction), which holds when g · R is an integer. The vector R describes the displacement of the crystal part last met by the electrons with respect to the front part. The problem of determining the nature of the fault consists in determining the displacement vector R.

FIGURE 42 Stacking fault tetrahedra in gold introduced by quenching. [From Hirsch, P. B., Cotterill, R. M. J., and Jones, M. W. (1962). Proc. Int. Conf. Electron. Microsc., 5th, Philadelphia, 1, F-3.]

FIGURE 43 Procession of dislocations in a silicon substrate formed during the fabrication process of a field effect transistor. [From Vanhellemont, J., Amelinckx, S., and Claeys, C. (1987). J. Appl. Phys. 61, 2176.]

FIGURE 44 High-resolution image of a 60° dislocation in silicon as viewed along the dislocation line. The edge component of the Burgers vector is indicated. [Courtesy of H. Bender.]


FIGURE 45 Stacking fault fringes in a low-stacking fault energy alloy. (a) Bright-field image: the outer fringes are both bright. (b) Dark-field image: the outer fringes are opposite in nature. [Courtesy of A. Art.]

In face-centered-cubic crystals this displacement vector R can be written either as $\tfrac{1}{6}[112]$ or as $\tfrac{1}{3}[111]$, and the fault plane is called (111). In the particular case in Fig. 45 the fault is intrinsic.
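The extinction condition, g · R equal to an integer, lends itself to a quick check with exact rational arithmetic. The Python sketch below tests a handful of face-centered-cubic reflections against the displacement vector R = 1/3 [111]; the list of reflections and the function names are illustrative choices, not taken from the text.

```python
from fractions import Fraction

def g_dot_r(g, r):
    """Exact scalar product of a reflection g (integers) with a displacement R."""
    return sum(Fraction(gi) * ri for gi, ri in zip(g, r))

def fault_is_extinct(g, r):
    """The fault shows no fringe contrast when g.R is an integer."""
    return g_dot_r(g, r).denominator == 1

R = (Fraction(1, 3), Fraction(1, 3), Fraction(1, 3))  # R = 1/3 [111]
reflections = [(1, 1, 1), (2, 0, 0), (2, 2, 0), (2, -2, 0), (3, 1, 1), (1, 1, -1)]

results = {g: fault_is_extinct(g, R) for g in reflections}
for g, extinct in results.items():
    state = "out of contrast" if extinct else "fringes visible"
    print(g, "g.R =", g_dot_r(g, R), "->", state)
```

Finding two or more independent reflections for which the fault is out of contrast, together with reflections for which it is visible, narrows down the direction and magnitude of R, which is the experimental procedure described above.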

The high-resolution image in Fig. 46 shows a stacking fault in zinc sulfide; the crystal is viewed along the close-packed directions of the structure. The bright dots can be identified with atom columns. The stacking sequence of the layers can be read off from the image. Stacking faults are conservative interfaces, that is, their presence does not change the chemical composition of the crystal. This is no longer the case for crystallographic shear planes that occur in nonstoichiometric oxides and accommodate deviations from the ideal stoichiometry. Figure 47 shows shear planes in rutile; they accommodate a slight deficiency of oxygen with respect to TiO2. Similar shear planes occur in many nonstoichiometric oxides. Their contrast effects are similar to those of stacking faults, except that now the displacement vector is not necessarily a simple fraction of a lattice vector, but may be more complicated due to relaxation effects along the shear planes. As a result, the simple extinction criterion g · R = integer is no longer valid; weak residual fringes are observed for most g vectors.

FIGURE 46 High-resolution image of stacking faults in zinc sulfide as viewed along the close-packed rows of atoms. [From Amelinckx, S. (1986). In “Examining the Submicron World,” p. 71, Plenum, New York.]

FIGURE 47 Fringes associated with shear planes in nonstoichiometric rutile (TiO2). [From Amelinckx, S., and Van Landuyt, J. (1978). In “Diffraction and Imaging Techniques in Material Science,” p. 107, North-Holland, Amsterdam.]

Antiphase, or out-of-phase, boundaries in ordered alloys are planar interfaces with a displacement vector that is a lattice vector but not a superlattice vector. In diffraction contrast such interfaces produce contrast features that are similar to those of stacking faults, except that the depth period of the fringes is much larger than in the case of stacking faults. As a result usually only one fringe can be formed in the foil thickness. Antiphase boundaries in Cu3Pd are imaged in the diffraction contrast mode in Fig. 48.

Periodic antiphase boundaries in Au4Mn, giving rise to a one-dimensional long-period structure as imaged in the high-resolution mode, are reproduced in Fig. 49 next to a crystal area exhibiting the normal Au4Mn structure.

Periodic stacking faults lead to the formation of polytypes, that is, crystals with the same chemical composition but different stacking sequences of the same close-packed layers. Depending on the imaging mode used, different details are revealed. In Fig. 50a, which refers to the 15R polytype of SiC, only two successive spots along the central row of superstructure spots were used. The image contains only one-dimensional fringes revealing only the long period. If two successive basic spots as well as the superstructure spots situated between them are used, the image still exhibits one-dimensional fringes, now revealing the elementary atomic layers but modulated in contrast with the superperiod (Fig. 50b). If at least two neighboring rows of basic spots are made to interfere, the stacking mode of the layers can be imaged (Fig. 50c) and the structure is revealed.

FIGURE 48 Diffraction contrast image of antiphase boundaries in Cu3Pd. [Courtesy of D. Broddin.]

C. Crystal Structures

In present-day microscopes the resolution reveals structural details at the subunit cell level. This has obvious application in the study of “microphases” (i.e., phases present in small volumes only and which would be difficult to detect by means of other diffraction techniques).

Figure 51 shows a two-dimensional superstructure that was found in Au4−xMn next to the normal structure and to the one-dimensional long-period structure by means of electron microscopy (see also Fig. 49).

High-resolution electron microscopy has been applied extensively to the study of high-Tc superconductors of the perovskite type. Figure 52 shows a [100] zone image of GdBa2Cu3O7−δ. The superperiod containing three perovskite cubes can be clearly recognized; the relationship of the image with the structure is shown in the inset.

FIGURE 49 Periodic antiphase boundaries in Au4Mn revealed by high-resolution electron microscopy. The bright dots represent manganese columns. The basic Au4Mn structure is imaged in the left part. [From Van Tendeloo, G., and Amelinckx, S. (1981). Phys. Status Solidi A 65, 431.]

FIGURE 50 High-resolution image of SiC polytype using different imaging modes. (a) Two successive spots along the central row are used. (b) Two basic spots and the intermediate superstructure spots are selected. (c) Spots on neighboring rows of basic spots are used. [From Amelinckx, S. (1986). In “Examining the Submicron World,” p. 71, Plenum, New York.]

By using diffraction contrast in the structure factor contrast mode, an incommensurate modulated structure was discovered in quartz. It occurs in a narrow temperature range around 573°C and constitutes an intermediate phase between the α and the β phase. It consists of a regular arrangement of small α-type domains related by the Dauphiné twin law (Fig. 53). The domain size decreases with increasing temperature (i.e., the modulation wave vector increases with increasing temperature).

D. Single Defects

High-resolution images allow the study of atomic ar-rangement along planar interfaces. Figure 54 shows thefine structure of a nonconservative antiphase boundary in

Page 458: Encyclopedia of Physical Science and Technology - Condensed Matter

P1: GNH/GRD P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN017D-789 August 3, 2001 16:30

Transmission Electron Microscopy 85

FIGURE 51 Two-dimensional periodic antiphase boundary superstructures discovered in Au4Mn. (a) The bright dots represent manganese columns. In the bottom-left part the basic Au4Mn structure is visible. (b) Schematic representation. [From Van Tendeloo, G., and Amelinckx, S. (1981). Phys. Status Solidi A 65, 431.]

FIGURE 52 High-Tc superstructure GdBa2Cu3O7−δ imaged along the [100] zone. Note the correspondence with the structure model. [Courtesy of G. Van Tendeloo.]

FIGURE 53 Diffraction contrast images of the incommensurate phase in quartz (SiO2) as observed along the c axis. The domain size decreases with increasing temperature from top to bottom. [From Van Landuyt, J., Van Tendeloo, G., and Amelinckx, S. (1986). Phys. Rev. B 34, 2004.]

FIGURE 54 (a) Fine structure of a nonconservative antiphase boundary in Au4Mn. (b) The dissociation into the lower-energy configuration is illustrated schematically. [From Amelinckx, S. (1986). In “Examining the Submicron World,” p. 71, Plenum, New York.]


FIGURE 55 High-resolution image of a stacking fault tetrahedron due to vacancies in silicon. [From Coene, W., Bender, H., and Amelinckx, S. (1985). Phil. Mag. A 52, 369–381.]

Au4Mn. On changing its orientation by 90°, it dissociates into two components, decreasing the free energy.

Figure 55 shows the high-resolution image of a stacking fault tetrahedron in silicon, which was first ion implanted and subsequently annealed. The tetrahedron is due to vacancy agglomeration (see also Fig. 42).

E. In Situ Studies

The specimen chamber of an electron microscope can be transformed into a small laboratory in which specimens can be exposed to different environments and external parameters by the use of the appropriate specimen holders. Table II gives a survey of existing possibilities and some applications of the special specimen holders developed for these purposes.

In situ studies of electron radiation damage can be performed in a high-voltage electron microscope; the electrons create the damage and simultaneously form an image. In situ chemical analysis can be performed by X-ray microanalysis using the X rays excited by the electron beam or by analysis of characteristic electron energy loss peaks.

TABLE I Microscope Holders and Their Applications

Holder               Application(s)
Heating holder       Phase transitions
                     Chemical reactions
                     Order–disorder phenomena
                     Melting phenomena
                     Annealing phenomena, grain growth
Cooling holder       Phase transitions
                     Order–disorder phenomena
Environmental cell   Chemical reactions
                     Crystal growth
Straining holder     Plastic deformation, dislocation reactions (possibly combined with heating and cooling)

F. Surface Studies

Quite recently transmission electron microscopy was applied to the study of the atomic structure of surfaces. An excellent vacuum and clean surfaces are required for such studies. The reconstruction of surfaces has been demonstrated in this way, as well as the migration of steps on crystal surfaces during growth.

SEE ALSO THE FOLLOWING ARTICLES

AUGER ELECTRON SPECTROSCOPY • CRYSTALLOGRAPHY • INCOMMENSURATE CRYSTALS AND QUASICRYSTALS • MICROSCOPY • OPTICAL DIFFRACTION • POSITRON MICROSCOPY • SCANNING ELECTRON MICROSCOPY • SCANNING PROBE MICROSCOPY • X-RAY PHOTOELECTRON SPECTROSCOPY

BIBLIOGRAPHY

Amelinckx, S. (1964). “The Direct Observation of Dislocations,”Academic Press, New York.

Amelinckx, S., and Nabarro, F. R. N. (eds.) (1979). “Dislocations inCrystals,” North-Holland, Amsterdam.

Amelinckx, S., Gevers, R., and Van Landuyt, J. (eds.) (1978). North-Holland, Amsterdam.

Amelinckx, S., Van Dyck, D., Van Landuyt, J., and Van Tendeloo,G. (eds.) (1997a). “Handbook of Microscopy. Methods I,” VCH,Weinheim.

Amelinckx, S., Van Dyck, D., Van Landuyt, J., and Van Tendeloo,G. (eds.) (1997b). “Handbook of Microscopy. Methods II,” VCH,Weinheim.

Amelinckx, S., Van Dyck, D., Van Landuyt, J., and Van Tendeloo,G. (eds.) (1997c). “Handbook of Microscopy. Applications,” VCH,Weinheim.

Amelinckx, S., Van Dyck, D., Van Landuyt, J., and Van Tendeloo, G.(eds.) (1997d). “Electron Microscopy, Principles and Fundamentals,”VCH, Weinheim.

Bethge, H., and Heydenreich, J. (eds.) (1982). “Elektronenmikroskopiein der Festkorperphysik,” VEB Deutscher Verlag der Wissenschaften,Berlin.

Cowley, J. M. (1975). “Diffraction Physics,” North-Holland,Amsterdam.

Edington, J. W. (1977). “Monographs in Practical Electron Microscopyin Materials Science,” Mcmillan, New York.

Glauert, A. M. (ed.) (1981). “Practical Methods: Electron Microscopy,”North-Holland, Amsterdam.

Hawkes, P. W. (1972). “Electron Optics and Electron Microscopy,”Taylor & Francis, London.

Head, A. K., Humble, P., Clarebrough, L. M., Morton, A. J., andForwood, C. T. (1973). “Computed Electron Micrographs and De-fect Identification,” North-Holland, Amsterdam.

Page 460: Encyclopedia of Physical Science and Technology - Condensed Matter

P1: GNH/GRD P2: GNH Final Pages

Encyclopedia of Physical Science and Technology EN017D-789 August 3, 2001 16:30

Transmission Electron Microscopy 87

Jouffrey, B. (ed.) (1972). “Methodes et techniques nouvelles d’obser-vation en metallurgie physique,” SFME, Paris.

Murr, L. E. (1970). “Electron Optical Applications in Materials Sci-ence,” McGraw–Hill, New York.

Spence, J. C. H. (1981). “Experimental High-Resolution ElectronMicroscopy,” Oxford University Press (Clarendon), London and NewYork.

Wenk, H.-R. (ed.) (1976). “Electron Microscopy in Mineralogy,”Springer-Verlag, Berlin and New York.

Williams, D. B., and Carter, C. B. (1996a). “Transmission ElectronMicroscopy. I. Basics,” Plenum, New York.

Williams, D. B., and Carter, C. B. (1996b). “Transmission Elec-tron Microscopy. II. Diffraction from Crystals,” Plenum, NewYork.

Williams, D. B., and Carter, C. B. (1996c). “Transmission ElectronMicroscopy. III. Imaging,” Plenum, New York.

Williams, D. B., and Carter, C. B. (1996d). “Transmission ElectronMicroscopy. IV. Spectrometry,” Plenum, New York.

Page 461: Encyclopedia of Physical Science and Technology - Condensed Matter

P1: GRB Final pages Qu: 00, 00, 00, 00

Encyclopedia of Physical Science and Technology EN017B-831 August 2, 2001 19:37

X-Ray, Synchrotron Radiation, and Neutron Diffraction

P. Suortti
University of Helsinki and European Synchrotron Radiation Facility

I. Elastic Scattering of X-Rays and Neutrons
II. Basic Formulas for Diffraction
III. Kinematical Theory of Diffraction by Crystals
IV. Perfect Crystal: Dynamical Theory of Diffraction
V. Real Crystal and Extinction, Powder Diffraction
VI. Synchrotron Radiation Sources and Properties of Radiation
VII. X-Ray Optics for Synchrotron Radiation Beamlines
VIII. X-Ray Diffraction Methods and Applications of Synchrotron Radiation
IX. Neutron Sources and Neutron Optics
X. Neutron Diffraction, Methods, and Applications
XI. Future Developments

GLOSSARY

Atomic form factor Amplitude scattered by the electrons of an atom in units of the electron scattering length $r_e$ (for X-rays) or Bohr magneton $\mu_B$ (for neutrons).

Autocorrelation function Convolution of a function by itself; called Patterson function in the case of electron density of a crystal. The maxima correspond to interatomic distances.

Convolution theorem Fourier transform of a product of functions is the convolution of the Fourier transforms of these functions.

Dynamical diffraction theory Interaction between forward and diffracted beams is included in calculation of the diffracted intensity.

Emittance Area of the storage ring electron (positron) beam in position-angle phase space.

Ewald’s construction Reciprocal lattice points on the surface of the Ewald sphere fulfill the Laue equations. The radius of the sphere is $1/\lambda$, and its center is on the line defined by the incident beam passing through the origin of reciprocal space.

Extinction Loss of diffracted intensity due to coherent coupling of forward and diffracted beams (primary extinction), or due to extra absorption arising from diffraction (secondary extinction).

Interference function Function giving the structure of diffracted intensity in reciprocal space. Contains the small-angle scattering term and the Fourier transform of the Patterson function.

Kinematical diffraction Interaction between forward and diffracted beams is negligible.

Laue equations Diffraction conditions in reciprocal space.

Opening angle Angular width of the radiation cone from a single relativistic electron.

Phase problem Phase angles of the structure factors are not obtained from diffracted intensity.

Powder sample Large number of randomly oriented crystallites or grains in small particles.

Reciprocal lattice vectors Span the reciprocal unit cell, perpendicular to the planes defined by crystal unit cell vectors a, b, c. The volume of the reciprocal unit cell is $V_c^* = 1/V_c$.

Refractive index Complex quantity, where the real part is slightly less than unity for X-rays and neutrons, and the imaginary part corresponds to absorption.

Rietveld method Calculated diffraction pattern from a structural model and peak shape functions is fitted to the experimental powder diffraction pattern.

Spallation source Neutron source where neutrons are stripped from nuclei by a high-energy proton beam.

Unit cell Basic building block of a crystal. The unit cell vectors a, b, c span a parallelepiped of volume $V_c$.

Wigglers and undulators Periodic magnetic structures placed in the straight sections of storage rings for enhanced production of synchrotron radiation.

THE DISCOVERY of X-rays by W. C. Röntgen in 1895 came in the middle of great changes in physics. Much of the research in the last decades of the 19th century concentrated on the nature and propagation of radiation. New forms of radiation, radio waves, and cathode rays were produced, and the old argument of particles vs waves was rekindled. The prevailing ether theory was subjected to serious blows, and the need for reformulation of emission and absorption laws of radiation was becoming evident. X-rays propagated like visible light and were not affected by electric and magnetic fields, but they were not refracted, and they penetrated light materials, such as the soft tissues of the human body. This last property made X-rays instantly famous, and led also to medical applications. Röntgen himself suggested that X-rays might propagate as longitudinal vibrations of ether, and their true nature was revealed only gradually. The existence of short-wavelength electromagnetic radiation could be inferred from Maxwell’s theory, but the wavelength of X-rays is more than 1000 times smaller than the wavelength of visible light, and the properties of X-rays could not be extrapolated from earlier observations. Röntgen received the first Nobel Prize in Physics in 1901, and in the subsequent years several Nobel Prizes were awarded to scientists working in this or closely related fields.

The wave nature of X-rays was established in the beginning of the 20th century, and their wavelength could be estimated. In 1912 M. von Laue had the idea that X-rays would be diffracted by crystals, which were known to be three-dimensional regular arrays of atoms, resembling one- or two-dimensional optical diffraction gratings. The fascinating story of the first X-ray diffraction experiment, which took place in Munich, Germany, has been told by P. P. Ewald (1962). The Laue diffraction pattern was recorded on a photographic plate, and the intensity maxima could be correlated with the atomic structure of the crystal. Soon after, father and son W. H. Bragg and W. L. Bragg observed reflection of X-rays from cleavage surfaces of crystals, and they established the Bragg law, which gives the relation between the diffraction angle, X-ray wavelength, and spacing of atomic planes. The Braggs solved the first crystal structures, those of cubic ZnS and alkali halides. The results were quite revealing, because they showed that at least the crystals of inorganic compounds are not composed of units of molecules, but of atoms making up a three-dimensional framework. This new concept was difficult to accept for some chemists, who considered the molecules the basic units of compounds. However, structure determination by X-ray diffraction developed rapidly, and it formed the foundation for understanding the nature of solid matter. The importance of this work was recognized immediately; von Laue received the Nobel Prize in Physics in 1914 and the Braggs the following year.

The neutron was discovered by Chadwick in 1932. By that time, the wave/particle dualism of radiation was generally accepted, and diffraction of electrons by crystals had been demonstrated. It was suggested that neutrons would be diffracted as well, and the first experimental evidence was obtained in 1936. However, the neutron beams from radioactive sources were too weak for any quantitative neutron diffraction experiments, and it was only after 1945, with the advent of nuclear reactors, that neutron diffraction became an important tool in the study of condensed matter. The neutron wavelength is


λ = h/mv (1)

where h is the Planck constant, m the mass of the neutron, and v its velocity. The fast neutrons from the reactor fuel elements are slowed down by successive collisions in the moderator, where they come to thermal equilibrium at the reactor temperature T. The neutrons have a Maxwellian distribution of velocities, and the root-mean-square velocity is given by

$$\frac{m v^2}{2} = \frac{3 k_B T}{2} \tag{2}$$

where $k_B$ is the Boltzmann constant. It is a fortunate circumstance that the rms velocities of thermal neutrons correspond to wavelengths best suited for diffraction studies. For instance, T = 100 °C corresponds to an rms velocity of 3000 m/s, λ = 1.3 Å = 0.13 nm, or a kinetic energy of 48 meV. Recently, neutrons were also produced by the so-called spallation sources, where neutrons are stripped from the target nuclei by a high-energy proton beam, and then moderated to thermal velocities. It will be seen that neutron diffraction and X-ray diffraction are complementary methods in studies of condensed matter. However, it took a long time before the pioneering work by B. N. Brockhouse and C. G. Shull was recognized by the Nobel Prize in Physics in 1994.
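The numbers quoted for T = 100 °C can be checked directly from Eqs. (1) and (2). The following is a minimal sketch; the function name `thermal_neutron` and the rounded constants are choices made here for illustration, not part of the original text.

```python
import math

# Physical constants (SI units)
H = 6.62607015e-34      # Planck constant, J s
K_B = 1.380649e-23      # Boltzmann constant, J/K
M_N = 1.67492750e-27    # neutron mass, kg
EV = 1.602176634e-19    # joules per electron volt


def thermal_neutron(temp_kelvin):
    """Return (rms velocity in m/s, wavelength in angstrom, energy in meV)."""
    energy = 1.5 * K_B * temp_kelvin          # Eq. (2): m v^2 / 2 = 3 k_B T / 2
    v_rms = math.sqrt(2.0 * energy / M_N)
    wavelength = H / (M_N * v_rms)            # Eq. (1): lambda = h / (m v)
    return v_rms, wavelength * 1e10, energy / EV * 1e3


v_rms, lam_angstrom, e_mev = thermal_neutron(373.15)   # 100 degrees C
print(f"v_rms = {v_rms:.0f} m/s, lambda = {lam_angstrom:.2f} A, E = {e_mev:.1f} meV")
```

The result agrees with the rounded values given in the text: about 3000 m/s, 1.3 Å, and 48 meV.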

Production of synchrotron radiation (SR) in particle accelerators was predicted by several scientists in the 1940s, and it was first observed at a General Electric laboratory in 1947. SR wavelengths that are needed for diffraction studies are produced only in synchrotrons or storage rings where light particles (electrons or positrons) are accelerated to energies of several GeV. Such accelerators were built for particle physics research, and from the early 1960s SR was used for condensed matter research. The sources were optimized for collision experiments, not for production of SR, and many compromises had to be made. The first storage rings dedicated as SR sources were built in the 1970s, and the present-day facilities are called third-generation sources. Synchrotron radiation consists of X-rays, but the radiation is collimated to a narrow cone, its intensity is many orders of magnitude larger than that available from X-ray tubes, it is polarized, and it is emitted in short pulses. These properties have revolutionized the methods of X-ray scattering, diffraction included, and many new types of experiments have become possible. The wavelength distribution of SR is essentially continuous, and this has emphasized the complementary nature of SR and neutron scattering studies. Also, the mode of operation of an SR laboratory is very similar to that of a research reactor, and quite often the same scientists utilize both facilities.

I. ELASTIC SCATTERING OF X-RAYS AND NEUTRONS

A. X-Ray Scattering

Scattering of X-rays has been treated in many textbooks (James, 1962; Warren, 1969). There are several levels of sophistication, but it is important that a simple classical model of elastic scattering of an electromagnetic (EM) wave by an electron is fairly accurate. Descriptively, the EM wave makes the electron “dance” in the direction perpendicular to the propagation direction of the wave, and as an accelerated charge the electron radiates with the frequency of the incident wave. The scattering cross section is

$$\frac{d\sigma}{d\Omega} = r_e^2 K_{\mathrm{pol}} \tag{3}$$

where $\Omega$ is the solid angle, $r_e = e^2/mc^2 = 2.82 \times 10^{-13}$ cm is the classical electron radius, and $K_{\mathrm{pol}}$ is the polarization factor. For unpolarized radiation $K_{\mathrm{pol}} = (1 + \cos^2\phi)/2$, where $\phi$ is the scattering angle.
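As a quick consistency check on Eq. (3), the unpolarized cross section can be integrated over the full sphere; the result should equal the familiar Thomson value $8\pi r_e^2/3$. The sketch below assumes a simple midpoint rule, and the function names are illustrative only.

```python
import math

R_E_CM = 2.82e-13   # classical electron radius in cm, value quoted in the text


def k_pol_unpolarized(phi):
    """Polarization factor for unpolarized radiation, phi = scattering angle."""
    return 0.5 * (1.0 + math.cos(phi) ** 2)


def dsigma_domega(phi):
    """Eq. (3): differential cross section in cm^2 per steradian."""
    return R_E_CM ** 2 * k_pol_unpolarized(phi)


def total_cross_section(steps=20000):
    """Integrate Eq. (3) over the sphere, dOmega = 2 pi sin(phi) dphi (midpoint rule)."""
    dphi = math.pi / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * dphi
        total += dsigma_domega(phi) * 2.0 * math.pi * math.sin(phi) * dphi
    return total


sigma_total = total_cross_section()
sigma_exact = 8.0 * math.pi / 3.0 * R_E_CM ** 2
print(f"numerical: {sigma_total:.4e} cm^2, analytic 8*pi*r_e^2/3: {sigma_exact:.4e} cm^2")
```

Both values come out near $6.7 \times 10^{-25}$ cm², i.e., about 0.67 barn per electron.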

The scattering units are actually atoms rather than free electrons. The electrons are bound to the atoms more or less strongly, except the outer electrons in metals, and the classical equivalent of scattering from bound electrons is that of reradiation by electrons in forced harmonic motion. The scattering amplitude of one electron in units of the free electron amplitude is

$$f = \frac{\omega^2}{\omega^2 - \omega_s^2 - i\Gamma\omega} = f' + i f'' \tag{4}$$

where $\omega$ is the X-ray frequency, $\omega_s$ the natural frequency of the oscillating electron, and $\Gamma$ a damping factor. The scattering factor is complex, and it depends strongly on $\omega$ near the resonant frequency $\omega_s$. The refractive index n of X-rays can be calculated by considering the sum wave of the incident and scattered radiation,

$$n = 1 - N \frac{\lambda^2}{2\pi} r_e f = 1 - \delta - i\beta \tag{5}$$

Here N is the number of electrons per unit volume and $\lambda$ the X-ray wavelength. It is seen that the real part $1 - \delta < 1$ when $\omega > \omega_s$, corresponding to a change of phase of the transmitted wave. The numerical value of $\delta$ is small, typically $10^{-5}$. The imaginary part of n corresponds to absorption of the wave,

$$\mu_0 = \frac{2\omega\beta}{c} = 2 N \lambda r_e f'' \tag{6}$$

where $\mu_0$ is the linear absorption coefficient, and c the velocity of light. For an atom the scattering from electrons is calculated as the vector sum of amplitudes from the electron density distribution $\rho(\mathbf{r}) = |\psi|^2$, where $\psi$ is the wave function of the electrons (see Fig. 1). The amplitude or atomic form factor is


FIGURE 1 Elastic scattering by an atom. The scattering amplitude from a volume element is proportional to the electron density $\rho(\mathbf{r})$, and the phase angle is $(\mathbf{k} - \mathbf{k}_0) \cdot \mathbf{r} = \mathbf{K} \cdot \mathbf{r}$.

$$f(\mathbf{K}) = \int \rho(\mathbf{r}) \exp(i\mathbf{K} \cdot \mathbf{r})\, d^3r \tag{7}$$

where the scattering vector is the difference between the wavevectors of the scattered and incident waves, $\mathbf{K} = \mathbf{k} - \mathbf{k}_0$. It is seen from Fig. 1 that $K = 2 k_0 \sin\theta$, where $2\theta$ is the scattering angle. The distribution $\rho(\mathbf{r})$ is almost isotropic even when the atoms are bound in a solid, so that

$$f(K) = \int 4\pi r^2 \rho(r) \frac{\sin Kr}{Kr}\, dr \tag{8}$$

Because of normalization of $\psi$, $f(0) = Z$, i.e., the number of electrons of the atom.
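Eq. (8) is easy to evaluate numerically for a model density. The sketch below uses a hydrogen-like 1s density, $\rho(r) = e^{-2r/a}/(\pi a^3)$ with Z = 1, chosen here only because its form factor has the closed form $f(K) = [1 + (Ka/2)^2]^{-2}$; the density, the integration cutoff, and the function names are assumptions of this example, not taken from the article.

```python
import math

A0 = 0.529177   # Bohr radius in angstrom


def rho_1s(r, a=A0):
    """Hydrogen-like 1s electron density, normalized to one electron."""
    return math.exp(-2.0 * r / a) / (math.pi * a ** 3)


def form_factor(k, rho=rho_1s, r_max=30.0, steps=30000):
    """Eq. (8) by the midpoint rule: integral of 4 pi r^2 rho(r) sin(Kr)/(Kr) dr."""
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        kr = k * r
        sinc = 1.0 if kr < 1e-12 else math.sin(kr) / kr
        total += 4.0 * math.pi * r * r * rho(r) * sinc * dr
    return total


def form_factor_1s_exact(k, a=A0):
    """Closed-form result for the 1s density, used only as a cross-check."""
    return 1.0 / (1.0 + (k * a / 2.0) ** 2) ** 2


for k in (0.0, 1.0, 5.0):   # K in inverse angstrom
    print(k, form_factor(k), form_factor_1s_exact(k))
```

The value at K = 0 reproduces the normalization f(0) = Z = 1, and f falls off with increasing K on a scale set by the inverse size of the charge cloud.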

Several important features are seen already in this simple formulation of X-ray scattering. First, the scattering amplitude is complex, and it changes strongly near the resonant frequencies, which correspond to the K, L, ... absorption edges of the atom. This is utilized in the methods based on “anomalous dispersion,” which are becoming more and more important in X-ray scattering studies. Second, when $\mathrm{Re}(n) = 1 - \delta < 1$, total external reflection takes place at grazing incidence. This has made possible diffraction studies from atomic layers on surfaces, and it is the basis for many X-ray optical components. The most important result is that of Eq. (7), which shows that the scattering amplitude is the Fourier transform of the electron density $\rho(\mathbf{r})$. This is true for any distribution under certain conditions, and it is the basis of all X-ray diffraction studies.
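The first two features can be illustrated with a few lines of code. The sketch below evaluates the classical oscillator amplitude of Eq. (4) on either side of the resonance and estimates the critical angle for total external reflection. The relation $\theta_c \approx \sqrt{2\delta}$ follows from Snell's law at grazing incidence ($\cos\theta_c = 1 - \delta$) and is supplied here as an aid; the frequencies and damping values are arbitrary illustrative numbers in units of $\omega_s$.

```python
import math


def oscillator_f(omega, omega_s, gamma):
    """Eq. (4): f = omega^2 / (omega^2 - omega_s^2 - i*Gamma*omega)."""
    return omega ** 2 / complex(omega ** 2 - omega_s ** 2, -gamma * omega)


def critical_angle(delta):
    """Grazing critical angle in radians, from cos(theta_c) = 1 - delta."""
    return math.sqrt(2.0 * delta)


# Well above the resonance the electron scatters almost like a free electron.
f_high = oscillator_f(omega=10.0, omega_s=1.0, gamma=0.1)
# Just below the resonance the real part changes sign; f'' stays positive.
f_below = oscillator_f(omega=0.9, omega_s=1.0, gamma=0.05)

print("above resonance:", f_high)
print("below resonance:", f_below)
print("theta_c for delta = 1e-5:", critical_angle(1e-5), "rad")
```

With the typical value $\delta = 10^{-5}$ quoted above, the critical angle is a few milliradians, which is why grazing-incidence geometries are needed.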

The preceding formalism, which is based on classical concepts, is retained in the quantum-mechanical treatment of scattering. The real and imaginary parts of the atomic scattering amplitude are interpreted in terms of electron wave functions and transition probabilities, and they can be calculated with great precision. Also, the inelastic or Compton scattering of X-rays from electrons is included in the quantum-mechanical treatment. A thorough discussion on the correspondence of classical and quantum-mechanical formulations is given by James, Ch. IV (1962).

The electrical dipole (Thomson) scattering, which was described earlier, is the dominant process, but the EM wave interacts also with the magnetic moment of electrons, due either to the electron spin or the net orbital moment of atoms. The ratio of the spin-scattering amplitude to the Thomson scattering amplitude is

$$R \cong \frac{E}{mc^2} \sin\theta \tag{9}$$

where E is the energy of the X-ray photon. R is appreciable only at high photon energies, because the electron rest energy $mc^2 = 511$ keV, but near the absorption edges of the atom there is a resonant enhancement of the scattering amplitude. This has made magnetic scattering of X-rays a powerful tool for studies of magnetic structures. Previously these were studied by neutron diffraction only, but X-ray diffraction provides complementary information. Magnetic scattering of X-rays has been discussed by Brunel and de Bergevin (1991), and by Lovesey and Collins (1996).

B. Neutron Scattering

The scattering amplitude of the neutron is a more complicated quantity than the X-ray scattering amplitude. There are two main contributions: scattering of the neutron from the nucleus and scattering from the magnetic moments of electrons. These will be discussed separately in the following.

In its simplest form, nuclear scattering of neutrons is understood as formation of a compound nucleus and reemission of a neutron. The energy level structure of the unstable compound nucleus determines the cross sections of possible nuclear reactions, which may give rise to scattering or absorption of the incident neutrons. In the scale of the neutron wavelengths used in the diffraction studies the nuclei are point scatterers. Therefore, the scattering amplitude b is independent of the scattering angle, unlike in X-ray scattering [cf. Eq. (8)], and the increase of b with the nuclear charge is weak and not systematic. To make a distinction with the angle-dependent scattering factor f, b is usually called “scattering length.” In the same way as in X-ray scattering the excitation energies of the compound nucleus make the scattering length complex. In the case of a single resonance energy $E_r$,

$$b = \xi + \frac{C}{(E - E_r) + i\Gamma_r/2} \tag{10}$$

where E is the energy of the incident neutron, $\Gamma_r$ the width of the resonance, and C a constant. The first term corresponds to “potential” scattering, and it is equal to the nuclear radius, and the second term corresponds to


“resonance” scattering. Under certain conditions the resonance term is negative and large enough to make the real part of b negative.
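A short numerical sketch of Eq. (10) shows how the resonance term can drive the real part of b negative for neutron energies below $E_r$. All parameter values below ($\xi$, C, $E_r$, $\Gamma_r$) are hypothetical numbers chosen for illustration; they do not describe any particular nucleus.

```python
def scattering_length(energy, xi, c, e_res, width):
    """Eq. (10): b = xi + C / [(E - E_r) + i*Gamma_r/2]."""
    return xi + c / complex(energy - e_res, width / 2.0)


# Hypothetical parameters: lengths in fm, energies in eV.
XI = 5.0        # potential-scattering term (of the order of the nuclear radius)
C = 3.0         # resonance strength, fm * eV
E_RES = 0.2     # resonance energy
WIDTH = 0.1     # resonance width

b_thermal = scattering_length(0.025, XI, C, E_RES, WIDTH)   # thermal neutron, E < E_r
b_far = scattering_length(10.0, XI, C, E_RES, WIDTH)        # far above the resonance

print("thermal:", b_thermal)
print("far above resonance:", b_far)
```

For the thermal energy the real part is negative, while far from the resonance b approaches the potential-scattering value $\xi$.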

If the scattering nucleus has a spin I, the compound nucleus has a spin $I + \tfrac{1}{2}$ or $I - \tfrac{1}{2}$. These nuclei have different scattering lengths $b_+$ and $b_-$, and the total cross section is

$$\sigma = 4\pi \overline{b^2} = 4\pi \left( w_+ b_+^2 + w_- b_-^2 \right) \tag{11}$$

Here $w_+ = (I + 1)/(2I + 1)$ and $w_- = I/(2I + 1)$ are the weight factors of the two possible compound nuclear states. The cross section is split in two parts, those of coherent scattering, S, and incoherent scattering, s,

$$\sigma = S + s = 4\pi \left( w_+ b_+ + w_- b_- \right)^2 + 4\pi w_+ w_- \left( b_+ - b_- \right)^2 \tag{12}$$

Only the first term can produce interference, which is observed in the diffraction pattern.
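The split in Eq. (12) can be verified numerically against Eq. (11). The sketch below uses hydrogen ($I = 1/2$) as an example; the bound scattering lengths $b_+ \approx 10.82$ fm and $b_- \approx -47.42$ fm are approximate literature values inserted here as assumptions, and the function name is illustrative.

```python
import math


def spin_cross_sections(spin, b_plus, b_minus):
    """Return (coherent b, S, s, total) for nuclear spin I; lengths in fm, areas in fm^2."""
    w_plus = (spin + 1.0) / (2.0 * spin + 1.0)
    w_minus = spin / (2.0 * spin + 1.0)
    b_coh = w_plus * b_plus + w_minus * b_minus
    coherent = 4.0 * math.pi * b_coh ** 2                                    # first term of Eq. (12)
    incoherent = 4.0 * math.pi * w_plus * w_minus * (b_plus - b_minus) ** 2  # second term of Eq. (12)
    total = 4.0 * math.pi * (w_plus * b_plus ** 2 + w_minus * b_minus ** 2)  # Eq. (11)
    return b_coh, coherent, incoherent, total


FM2_PER_BARN = 100.0   # 1 barn = 100 fm^2

b_coh, s_coh, s_inc, s_tot = spin_cross_sections(0.5, 10.82, -47.42)
print(f"coherent b = {b_coh:.2f} fm")
print(f"S = {s_coh / FM2_PER_BARN:.2f} barn, s = {s_inc / FM2_PER_BARN:.1f} barn")
print(f"S + s = {(s_coh + s_inc) / FM2_PER_BARN:.1f} barn, Eq. (11) = {s_tot / FM2_PER_BARN:.1f} barn")
```

With these inputs the coherent scattering length is negative and the incoherent part dominates by a wide margin, which is the well-known reason why hydrogen-containing samples give a large incoherent background in neutron diffraction.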

There is even more diversity in b due to the existence of different isotopes. In the general case, each of these has its characteristic $b_+$ and $b_-$. The different isotopes of an element will be distributed at random among the atomic positions, and in the case of a crystal, scattering can be divided into scattering from the average structure (ordered scattering) and scattering from fluctuations of the structure (disorder scattering). This division will be discussed in the section where a general formulation for diffraction from a crystal is given. The different factors affecting the nuclear scattering length complicate the interpretation of neutron diffraction patterns, but on the other hand they allow a great variety of studies. Experimental determination of b is the basis of all quantitative scattering studies, and one of the methods used is determination of the critical angle of the total external reflection. In the same way as in the case of X-rays the refractive index for neutrons is a complex quantity,

$$n = 1 - N \frac{\lambda^2}{2\pi} \langle b \rangle \tag{13}$$

where $\langle b \rangle$ is the average value of the bound coherent scattering amplitude of N nuclei in unit volume. For most isotopes, $\langle b \rangle$ is positive, and $1 - n \cong 10^{-6}$ for thermal neutrons. The critical angle, $\theta_c \cong \sqrt{2(1 - n)}$, is therefore small, typically of the order of 1 mrad, which makes possible optical components and experiments similar to those with X-rays.

In addition to nuclear scattering there is scattering due to the interaction between the neutron magnetic moment and that of the atom. The bulk of knowledge of the magnetic structures of solids is based on neutron diffraction studies. There are two important groups of atoms with net magnetic moments, namely the first transition elements with incomplete 3d shells, and rare-earth elements with incomplete 4f shells. The scattering cross section is

$$\frac{d\sigma_{pm}}{d\Omega} = \frac{2}{3} S(S + 1) \gamma^2 r_e^2 f_m^2 \tag{14}$$

where S is the electron spin quantum number, $\gamma$ the neutron magnetic moment in units of nuclear magnetons, and $f_m$ the magnetic form factor. This result for spin magnetic moment can be extended to cases where there is orbital moment also. The form factor $f_m$ arises from outer electron shells, so that there is a strong dependence on the scattering angle (see Fig. 28).

The formalisms of X-ray and neutron scattering are very similar, which makes many results directly comparable. However, the interactions with the atoms are fundamentally different, which makes X-rays and neutrons complementary probes. This has become more and more evident with the use of synchrotron radiation, as will be seen in the chapters where applications are discussed.

II. BASIC FORMULAS FOR DIFFRACTION

Scattering from an atom can be generalized to concern any electron or nuclear density distribution ρ(r). It is typical that ρ(r) has several hierarchical levels. In a crystal the unit cell is repeated in three dimensions, whereas many organic materials are built of chainlike molecules, which form fibers, and these in turn form structures of bundles, etc. Any object has a finite size, and in the following the effect of size in diffraction is considered. The results are based on the use of the convolution theorem of Fourier transform F (Guinier, 1963). Functions g(r) and h(r) are defined in real space and their Fourier transforms G(K) and H(K) in wavevector or reciprocal space. Then

F[g(r)h(r)] = G(K) ∗ H (K) (15)

where * is the convolution operator.

Consider a statistically homogeneous object of volume V. When its shape is given by function τ(r), which is 1 inside the object and 0 outside, the electron (or nuclear) density is

ρ(r) = ρ∞(r)τ (r) (16)

i.e., the object is “cut” by τ(r) from the infinite, statistically homogeneous object with density ρ∞(r). The scattered amplitude is

A(K) = A∞(K) ∗ T(K) (17)

where T(K) is the transform of τ(r), and A∞(K) that of ρ∞(r).
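The convolution theorem behind Eqs. (15)–(17) can be checked on a small one-dimensional discrete example. The sketch below uses a plain discrete Fourier transform written out with the standard library; in the discrete case the theorem picks up a factor 1/N, and the periodic "density" g and the window τ are arbitrary test functions chosen for this illustration.

```python
import cmath
import math


def dft(x):
    """Discrete Fourier transform, X[k] = sum_j x[j] exp(-2 pi i j k / N)."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def circular_convolution(a, b):
    """(a * b)[k] = sum_m a[m] b[(k - m) mod N]."""
    n = len(a)
    return [sum(a[m] * b[(k - m) % n] for m in range(n)) for k in range(n)]


N = 16
# "Infinite" periodic density and a window function that cuts out a finite object.
g = [1.0 + 0.5 * math.cos(2.0 * math.pi * 3 * j / N) for j in range(N)]
tau = [1.0 if 4 <= j < 12 else 0.0 for j in range(N)]

# Left side: transform of the product g * tau (the finite object).
lhs = dft([gj * tj for gj, tj in zip(g, tau)])
# Right side: convolution of the two transforms, with the discrete 1/N factor.
rhs = [v / N for v in circular_convolution(dft(g), dft(tau))]

max_err = max(abs(l - r) for l, r in zip(lhs, rhs))
print("largest deviation between the two sides:", max_err)
```

The deviation is at the level of floating-point rounding, showing that multiplying by the shape function in real space is equivalent to smearing the transform of the infinite object with the transform of the shape function.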

The intensity is the amplitude multiplied by its complex conjugate, and it turns out to be the Fourier transform of the autocorrelation (Patterson) function P(r) of the density of scatterers,

$$P(\mathbf{r}) = \int \rho_\infty(\mathbf{u})\, \rho_\infty(\mathbf{u} + \mathbf{r})\, \tau(\mathbf{u})\, \tau(\mathbf{u} + \mathbf{r})\, d^3u \tag{18}$$


Here the product τ(u)τ(u + r) is unity when both u and u + r are inside the object, and otherwise it is zero. This defines the function V(r), which is the volume common to the object and its “ghost” at distance r,

$$V(\mathbf{r}) = \int \tau(\mathbf{u})\, \tau(\mathbf{u} + \mathbf{r})\, d^3u \tag{19}$$

The intensity can then be written as

$$I(\mathbf{K}) = \int P_\infty(\mathbf{r})\, V(\mathbf{r}) \exp[i\mathbf{K} \cdot \mathbf{r}]\, d^3r = F[P_\infty(\mathbf{r})] * |T(\mathbf{K})|^2 \tag{20}$$

Here $|T(\mathbf{K})|^2$ is the Fourier transform of V(r), and P∞(r) corresponds to the average over unit volume. It is instructive to divide P∞(r) in parts that correspond to diffraction from the average structure and from the fluctuations of the structure. The so-called interference function is

$$I(\mathbf{K}) = \frac{|T(\mathbf{K})|^2}{V V_c} + \left\{ 1 + \frac{1}{V V_c} \int V(\mathbf{r}) [P(\mathbf{r}) - 1] \exp(i\mathbf{K} \cdot \mathbf{r})\, d^3r \right\} \tag{21}$$

The first term is nonzero only at small values of K. It is proportional to the Fourier transform of V(r), so it depends on the size and shape of the object and not on its internal structure. This is the small-angle scattering term (SAXS or SANS), and it is described in detail in another article (Kratky and Laggner, 1987).

The term in brackets depends on the distribution of scattering objects of volume Vc (e.g., a group of atoms or nuclei) in homogeneous matter. If this distribution is statistically uniform, P(r) = 1, and the interference function is equal to unity outside the SAXS (or SANS) region in reciprocal space. The variations of I(K) about this average value show the fluctuations of the density of scatterers. This is the most complete result that a diffraction experiment can provide about the structure of a statistically homogeneous object. The effect of the finite size of the object is convolution of I(K) − 1 by the Fourier transform of V(r).

III. KINEMATICAL THEORY OF DIFFRACTION BY CRYSTALS

The electron density distribution ρ(r) of a group of atoms at the positions $\mathbf{r}_n$ may be taken as the sum of distributions of the individual atoms. In a solid the division of electron density is not obvious, and chemical bonding modifies the simple superposition of the atomic distributions. This is an important field of X-ray diffraction studies, but for the present purposes the effects of chemical bonding are ignored. The scattering amplitude is

$$G(\mathbf{K}) = \int \sum_n \rho_n(\mathbf{r}) \exp[i\mathbf{K} \cdot (\mathbf{r} + \mathbf{r}_n)]\, d^3r \cong \sum_n f_n(\mathbf{K}) \exp[i\mathbf{K} \cdot \mathbf{r}_n] \tag{22}$$

If there are M identical groups separated by a vector a, the scattering amplitude of this row can be calculated in the same way as for diffraction of light in an optical grating. The amplitude involves a factor (sin MK · a/2)/(sin K · a/2), which has maxima when K · a = 2πh, where h is an integer. The widths of the maxima are ≈1/Ma. When there are similar periods of vectors b and c, which are not coplanar with a, the Laue equations for diffraction conditions by a three-dimensional crystal are obtained,

K · a = 2πh; K · b = 2πk; K · c = 2πl (23)

Here the Miller indices (hkl) are integers (positive, negative, or zero). The geometrical meaning of the Laue equations is that they represent equidistant planes in K-space (reciprocal space), which intersect at reciprocal lattice points (hkl). These points (relps) define the values of K that satisfy the diffraction conditions.
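The behavior of the row factor (sin MK · a/2)/(sin K · a/2) can be explored numerically. The sketch below evaluates its magnitude as a function of the phase x = K · a; the function name and the choice M = 50 are illustrative assumptions.

```python
import math


def row_amplitude(phase, m):
    """|sin(M x / 2) / sin(x / 2)| for x = K.a; the limit at x = 2 pi h is M."""
    s = math.sin(phase / 2.0)
    if abs(s) < 1e-12:          # at a principal maximum use the limiting value
        return float(m)
    return abs(math.sin(m * phase / 2.0) / s)


M = 50
peak = row_amplitude(2.0 * math.pi, M)            # K.a = 2 pi h with h = 1
first_zero = row_amplitude(2.0 * math.pi / M, M)  # first zero next to h = 0
between = row_amplitude(math.pi, M)               # halfway between two maxima

print("peak height:", peak)
print("value at K.a = 2 pi / M:", first_zero)
print("value halfway between maxima:", between)
```

The principal maxima have height M and the first zero lies at K · a = 2π/M, so the peaks sharpen in inverse proportion to the length Ma of the row, in line with the width estimate given above.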

The vectors a, b, and c span the unit cell shown in Fig. 2, the basic building block of the crystal. The atomic positions are given by fractional coordinates, r_j = x_j a + y_j b + z_j c. The structure factor is the scattering amplitude G(K) at relp (hkl),

$$F(hkl) = \sum_j f_j(K) \exp[i\mathbf{K}_{hkl}\cdot\mathbf{r}_j] = V_c \iiint \sum_j \rho_j(xyz) \exp[2\pi i(hx_j + ky_j + lz_j)]\, dx\, dy\, dz \qquad (24)$$
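The discrete form of Eq. (24) is easy to evaluate for a small cell. The sketch below assumes a hypothetical face-centred cubic cell with four identical atoms and a constant scattering factor f_j = 1 (in reality f_j depends on K); the function and variable names are illustrative.

```python
import cmath

def structure_factor(hkl, atoms):
    """Sum of f_j * exp[2*pi*i*(h x_j + k y_j + l z_j)] over one unit cell."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * cmath.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

# Hypothetical fcc cell: (f_j, fractional coordinates), with f_j = 1 assumed
fcc = [(1.0, (0.0, 0.0, 0.0)), (1.0, (0.5, 0.5, 0.0)),
       (1.0, (0.5, 0.0, 0.5)), (1.0, (0.0, 0.5, 0.5))]

F_111 = structure_factor((1, 1, 1), fcc)   # unmixed indices: all atoms in phase
F_100 = structure_factor((1, 0, 0), fcc)   # mixed indices: contributions cancel
```

For this cell the reflections with mixed even and odd indices vanish, which illustrates how the atomic arrangement within the cell controls the intensities at the relps.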

FIGURE 2 Crystal unit cell.

The three-dimensional array of relps can be expressed using the reciprocal lattice vectors, K_hkl = 2π(ha∗ + kb∗ + lc∗). When these are inserted in the Laue equations, we obtain

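$$\mathbf{a}^* = \mathbf{b}\times\mathbf{c}/V_c, \quad \mathbf{b}^* = \mathbf{c}\times\mathbf{a}/V_c, \quad \mathbf{c}^* = \mathbf{a}\times\mathbf{b}/V_c \qquad (25)$$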

where Vc = a · b × c is the volume of the unit cell.

The significance of these results is the following. Diffraction maxima occur at discrete values of the scattering vector K = k − k0. It is easy to show that K_hkl is the normal of the reflecting planes, and the Bragg equation is obtained,

2d sin θ = λ or K = 2π/d (26)

where d is the spacing of the lattice planes. The scattering amplitude at the diffraction maximum is called the structure factor F(hkl), and it is in general a complex quantity. The structure factors are the Fourier coefficients of ρ(xyz), so that

$$\rho(xyz) = V_c^* \sum_{hkl} F(hkl) \exp[-i\mathbf{K}_{hkl}\cdot\mathbf{r}] \qquad (27)$$

The integral values (hkl) span the reciprocal lattice, and the volume of the reciprocal unit cell is V∗c = 1/Vc, which shows that the density of relps and the number of reflections increase in proportion to the volume Vc of the unit cell.
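Equations (25) and (26) can be exercised on a concrete cell. The sketch below assumes a hypothetical orthorhombic cell (edges 4, 5, and 6 Å) and a hypothetical wavelength of 1 Å; all names are illustrative. It builds a∗, b∗, c∗, confirms V∗c = 1/Vc, and derives a plane spacing and Bragg angle from K_hkl = 2π/d.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def reciprocal(a, b, c):
    """Reciprocal vectors of Eq. (25) and the cell volume Vc = a . (b x c)."""
    vc = dot(a, cross(b, c))
    scale = lambda w: tuple(x / vc for x in w)
    return scale(cross(b, c)), scale(cross(c, a)), scale(cross(a, b)), vc

# Hypothetical orthorhombic cell, edges in angstrom
a, b, c = (4.0, 0.0, 0.0), (0.0, 5.0, 0.0), (0.0, 0.0, 6.0)
a_s, b_s, c_s, vc = reciprocal(a, b, c)
v_star = dot(a_s, cross(b_s, c_s))       # volume of the reciprocal cell

def d_spacing(h, k, l):
    """Plane spacing from |h a* + k b* + l c*| = 1/d, i.e. K_hkl = 2*pi/d."""
    g = tuple(h * x + k * y + l * z for x, y, z in zip(a_s, b_s, c_s))
    return 1.0 / math.sqrt(dot(g, g))

wavelength = 1.0                          # angstrom, hypothetical
theta_110 = math.asin(wavelength / (2.0 * d_spacing(1, 1, 0)))  # Bragg angle, rad
```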

Ewald's construction illustrates the diffraction condition and reciprocal lattice (see Fig. 3). The vector −k0/2π drawn from the origin of the reciprocal lattice defines the center of Ewald's sphere. Any relp lying on the sphere fulfills the diffraction condition, and an intensity maximum is observed in the direction k. The diffraction pattern is recorded by rotating the crystal and the associated reciprocal lattice so that the relps intersect Ewald's sphere, and the detector is placed in the direction of the diffracted ray (it may be a stationary two-dimensional detector). In reality, the relps are not points but small domains due to the finite width of |T(K)|² and mosaicity of the crystal. Also, the Ewald sphere is a bit "fuzzy" because of variations in k0. Therefore, the relevant quantity is the integrated reflection, which is obtained when the relp traverses the Ewald sphere during the scan. The other possibility of recording the integrated reflections is to use radiation with a continuous wavelength distribution. The center of the Ewald sphere covers a range along k0/2π, and relps between the spheres of radii from 1/λmax to 1/λmin are intersected. This is the Laue method of X-ray and neutron diffraction. In either method there are several geometrical factors involved, but the main result is that the integrated reflection is proportional to the square of the structure factor, |F(hkl)|².

FIGURE 3 The Ewald sphere and scanning of reflections. The center of the sphere is P, and the origin of reciprocal space O. Scanning of reflections hk0, hk1, and hk2 is shown. For instance, the diffraction condition is fulfilled for reflection −1, −1, 1 (point B).

Determination of crystal structure involves two basic steps: calculation of the unit cell vectors from the diffraction pattern, i.e., inversion from a∗, b∗, c∗ to a, b, c, and calculation of the electron or nuclear density from the structure factors by Fourier inversion [Eq. (27)]. However, the observed quantity is the intensity of the reflection, so that the information of the phase of F(hkl) is lost. This is the famous phase problem of diffraction, and several ingenious schemes have been put forward for experimental and theoretical solutions. These are discussed in detail in another article (K. Ann Kerr, 1987). The basic assumption in the preceding discussion is that the scattering amplitudes of individual unit cells add up. This is the so-called kinematical approximation of diffraction, where interactions between the incident beam and diffracted beam are ignored. In many cases this is a good starting point for crystal structure determination, but usually corrections to the kinematical approximation are needed.

The crystal is not a static, perfectly periodic three-dimensional structure, but the atoms are displaced from their ideal positions by δ_j, and the structure factor becomes

$$F(\mathbf{K}) = \sum_j f_j(\mathbf{K}) \exp[i\mathbf{K}\cdot(\mathbf{r}_j + \boldsymbol{\delta}_j)] \qquad (28)$$

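The intensity from a crystal of volume V with N unit cells is, in electron units,

$$I(\mathbf{K}) = \sum_m^{N}\left[\sum_n^{N} F_n F^*_{n+m} \exp(i\mathbf{K}\cdot\mathbf{r}_m)\right] = (1/V)\sum_m V(\mathbf{r}_m)\, y_m(\mathbf{K}) \exp(i\mathbf{K}\cdot\mathbf{r}_m) \qquad (29)$$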

Here r_m is the separation of unit cells located at r_n and r_{n+m}. As before, the effect of the size of the crystal is described by V(r_m), which allows the summation to be extended to infinity. The use of K instead of K_hkl is due to the fact that in a distorted crystal there is scattered intensity between the relps. The average square of the structure factors is divided in two parts,

$$y_m(\mathbf{K}) = \langle F_n F^*_{n+m} \rangle = |\langle F_n(\mathbf{K}) \rangle|^2 + \Delta_m(\mathbf{K}) \qquad (30)$$

The first part gives diffracted intensity from the average structure, the second from fluctuations of the structure. These may be due to static disorder, and the atoms are always in thermal motion. The motions are correlated, so that the thermal diffuse scattering (TDS) is not uniform in reciprocal space. In fact, long-wavelength in-phase vibrations (acoustic phonons) make the TDS peak under the Bragg reflections, and their contribution must be subtracted. If the average scattering and diffuse scattering can be resolved in components, a very complete picture of atomic positions and their displacements is obtained.

The preceding separation of the average structure is valid for X-rays, which travel at the speed of light, and in diffraction the instantaneous structure is "seen" by the X-rays. In the course of the experiment the average of the configurations is seen. This is not necessarily the case with neutrons, which may be slower than the elastic waves (phonons) in the crystal. In such a case the TDS is modified, and for instance, the peak under the Bragg reflections is not observed. A detailed discussion is found in International Tables for Crystallography, Vol. C, Ch. 7.4 (1999).

IV. PERFECT CRYSTAL: DYNAMICAL THEORY OF DIFFRACTION

A complete description of diffraction from a crystal requires that the interaction between the incident and diffracted beams is taken into account. It may happen that there are several relps on the Ewald sphere at the same time, so that more than one diffracted beam is excited. In most diffraction measurements the goal is to record the intensities of reflections in the two-beam case, where for a given wavelength only one diffracted beam is excited at a time. However, multiple diffraction provides important information about the phases of the structure factors, and recently this has been used for an experimental solution of the phase problem of diffraction.

The interaction of the incident and diffracted beams can be treated on many levels of sophistication. All these treatments are called dynamical theories of diffraction. A good account of the approaches introduced by P. P. Ewald and M. von Laue is given by James (1962), but here we follow a simplified formulation given by Warren (1969), based on the work of C. G. Darwin. All the various forms of dynamical diffraction theory were formulated for X-ray diffraction, but the following discussion applies to neutron diffraction as well with appropriate changes of the meanings of the symbols.

The Darwin treatment begins with description of reflection from a layer of atoms. The resultant amplitudes of the reflected and transmitted beams are calculated using the formalism of Fresnel diffraction. The effects of electron binding, which lead to a complex scattering amplitude and absorption, are ignored, but these can be included without changing the essential results of the calculation. It is found that there is a π/2 phase shift in the reflected beam, and this shift plays a very important role in crystal diffraction, because a beam that has been reflected twice has suffered a phase shift of π and is out of phase with the incident beam. The transmitted beam is the combination of the incident beam and the forward reflected beam, and this introduces a small phase shift and makes the real part of the refractive index slightly smaller than unity, as already discussed in Section I.A.

There are two distinctly different cases in perfect crystal diffraction. The geometry where the reflected beam exits at the same surface as the incident beam enters is called the reflection or Bragg case, and the geometry where the beam exits at the opposite surface is called the transmission or Laue case. The following discussion concerns the Bragg case, but the Laue case will be discussed briefly.

In Darwin's theory, diffraction in a crystal is treated by considering the propagation of the reflected and transmitted beams in a structure of parallel equidistant planes of atoms. The resultant reflected beam is a coherent sum of the beams reflected by one atomic layer and transmitted through the layers above. The beam in the direction of the incident beam is a sum of the transmitted beam and the beams reflected back to the forward direction. The phase shift of π in two reflections decreases the amplitude of the transmitted beam. Calculation of the amplitudes leads to difference equations for the transmission and reflection coefficients of successive atomic layers. These are solved with the assumption that the changes in one layer are small, so that first-order approximations can be used. At the surface of the crystal the ratio of the reflected amplitude to the incident amplitude is

$$R_0/T_0 = ip/[i\varepsilon \pm (p^2 - \varepsilon^2)^{1/2}] \qquad (31)$$

where p = δ/sin θ cos θ and ε = θ − θ0. Here 1 − δ is the real part of the refractive index, and θ the observed Bragg angle. By multiplying R0/T0 by its complex conjugate and reorganizing the terms several interesting features are observed:

• When ε/p is between −1 and +1, the incident beam is totally reflected, i.e., I/I0 = 1.

• The center of the totally reflecting regime is shifted to θ0 because of refraction.

• The width of the regime is 2p = 2 r_e N λ² F(hkl) Kpol,d/(π sin 2θ).

• The integrated reflection is Ed = (8/3)p.
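The value Ed = (8/3)p can be verified by integrating |R0/T0|² from Eq. (31) numerically. The sketch below is an illustration under stated assumptions: the sign in Eq. (31) is chosen so that the reflectivity never exceeds unity, the value of p is hypothetical, and the tails are integrated with the substitution ε = p cosh u, which is a choice made here for numerical convenience rather than part of the original treatment.

```python
import math

def darwin_reflectivity(eps, p):
    """|R0/T0|^2 from Eq. (31), taking the root with modulus <= 1."""
    y = abs(eps) / p
    if y <= 1.0:
        return 1.0                        # range of total reflection
    # Outside the plateau: |p / (|eps| + sqrt(eps^2 - p^2))|^2
    return 1.0 / (y + math.sqrt(y * y - 1.0)) ** 2

def integrated_reflection(p, u_max=20.0, n=20000):
    """Integrate the reflectivity over eps with the trapezoidal rule."""
    h = u_max / n
    tail = 0.0
    for i in range(n + 1):
        u = i * h
        weight = 0.5 if i in (0, n) else 1.0
        eps = p * math.cosh(u)            # substitution eps = p*cosh(u)
        tail += weight * darwin_reflectivity(eps, p) * p * math.sinh(u)
    # Plateau of width 2p plus two symmetric tails
    return 2.0 * p + 2.0 * tail * h

p = 1.0e-5                                # hypothetical half-width in radians
E_d = integrated_reflection(p)
```

The plateau contributes 2p and the two tails together contribute (2/3)p, so the ratio E_d/p approaches 8/3.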

The polarization factor Kpol,d is 1 when the electric vector of the incident beam is perpendicular to the plane of diffraction (σ-polarization). In the plane of diffraction (π-polarization) the polarization factor is |cos 2θ|, so that for an unpolarized incident beam Kpol,d = (1 + |cos 2θ|)/2.

It is interesting to compare the preceding results with those in kinematical diffraction. In the same notation, the integrated reflection from a thick crystal in the symmetrical Bragg case is

$$E_k = \left(\frac{\pi^2 \sin 2\theta}{2\mu_0\lambda}\right)\left(\frac{p}{K_{\mathrm{pol},d}}\right)^2 K_{\mathrm{pol},k} = \frac{Q}{2\mu_0} \qquad (32)$$

where Q is the integrated kinematical reflectivity per unit path length of the beam. There is a fundamental difference between kinematical and dynamical diffraction. In the latter case the penetration of radiation is limited to a thin surface layer of the crystal because of the strong interaction between the incident and reflected beams, whereas in the case of kinematical diffraction the penetration depth is limited by absorption. The range of total reflection is very narrow, typically in the arcsec range, so that the integrated reflection is much smaller than in diffraction from an imperfect crystal under kinematical conditions. For instance, if the 111 reflection from a perfect silicon crystal is measured with 1 Å X-rays, the integrated reflection is 2.8 × 10⁻⁵, while the kinematical value is 42 × 10⁻⁵. The kinematical and dynamical values converge when the interaction between the incident and reflected beams is weak, or when the interaction volume is very small. These conditions are discussed in the next section. Absorption in a perfect crystal can be treated by introducing a complex index of refraction, or complex atomic scattering factors. The principal result is that the total reflection is no longer complete, and the reflectivity curve becomes asymmetric. An example is shown in Fig. 4 for the Bragg case.

FIGURE 4 Reflectivity curves for perfect crystal, when absorption is negligible (thick line), and with absorption (thin and broken lines). Approximate values for 200 and 400 reflections of NaCl with 1.54 Å radiation are used. [From Warren, B. E. (1969). X-Ray Diffraction, Addison-Wesley, Reading, MA.]

When the crystal thickness is not infinite in the scale of beam attenuation, the boundary conditions become important. Exact solutions in closed form exist only for a parallel-sided crystal slab. The reflectivity curves show oscillations, which are essentially different in Bragg and Laue cases. When the thickness of the crystal is increased the integrated reflection approaches an asymptotic value in the Bragg case, whereas in the Laue case it oscillates around the value that is one-half of the Bragg case asymptote. The reason is that the transmitted beam exchanges energy with the reflected beam, and in the Laue case the balance at the exit surface depends on the thickness of the crystal. In the Bragg case there is only one exiting beam; in the Laue case there are two, and their total power is that of the Bragg reflection, when the effects of absorption are negligible. The integrated reflections as functions of a normalized thickness A are shown in Fig. 5.

FIGURE 5 Integrated reflectivity of a nonabsorbing perfect crystal plate in symmetrical Bragg and Laue diffraction. The thickness D is given in units A = D/[Λ(γ0γh)^{1/2}]. The asymptotic values of π (Bragg case) and π/2 (Laue case) correspond to the Ewald solution of perfect crystal reflectivity.


V. REAL CRYSTAL AND EXTINCTION, POWDER DIFFRACTION

Most single crystals that are used in studies of crystal structure are closer to the kinematical than the dynamical limit of diffraction. It is seen from Fig. 5 that at small values of A the integrated reflection increases linearly with the thickness. This is the range of kinematical diffraction, where the reflected beam is weak and its power increases linearly with the diffracting volume. However, deviations from the conditions of kinematical diffraction are large enough to warrant correction. The effect is called extinction, and many different methods have been developed to correct for it.

Most extinction theories and correction methods are based on the concept of mosaic crystal. If the mosaic blocks are sufficiently small and their orientations vary appreciably, the blocks diffract independently, and the interaction between the incident and reflected beams inside the blocks is negligible. In such a case the crystal is called ideally imperfect, and the conditions of kinematical diffraction prevail. Although the concept of mosaic crystal is oversimplified, since imperfection in a crystal may be the result of dislocations and inhomogeneous strains, it has persisted. Darwin introduced two kinds of extinction, primary and secondary, and these concepts have been used since then.

Primary extinction is present when the integrated intensity from each of the mosaic blocks is less than predicted by kinematical theory. Before reaching an interior block the beam may have been diffracted by several other blocks, so that the beam is attenuated more than by ordinary absorption. This effect is called secondary extinction, and it becomes negligible only when the disorientation of the blocks becomes large, or the reflection is weak. Conceptually, primary extinction may be said to arise from coherent or amplitude coupling of beams, while secondary extinction is due to incoherent or intensity coupling of beams. With these concepts, explicit reference to the mosaic crystal model is avoided.

The effects of primary and secondary extinction on the integrated reflection are usually given by the expression

$$E_{\mathrm{obs}}(hkl) = y_p\, y_s\, E_{\mathrm{kin}}(hkl) \qquad (33)$$

where yp and ys are the primary and secondary extinction coefficients, respectively. These are average values, which refer to the integral over the reflectivity curve. It would be more appropriate to use extinction coefficients that vary along the reflectivity curve, so that no assumptions of the distributions of the sizes and orientations of the mosaic blocks would be needed.

The size D of a mosaic block is best measured in units of the extinction distance Λ,

$$t = D/\Lambda = D/[V_c/(r_e \lambda K_{\mathrm{pol},d} F(hkl))] = A(\gamma_0\gamma_h)^{1/2} \qquad (34)$$

Here γ0 and γh are the direction cosines of the incident and reflected beams, respectively. It is seen that t can be made small and the kinematical limit approached, if the X-ray wavelength or Kpol,d is reduced. Model calculations indicate that

$$y_p(t) \approx \exp[-Ct^2] \rightarrow 1 - Ct^2 \quad (\text{at small } t) \qquad (35)$$

where C is a constant, which depends on the shape of the mosaic block. This can be used for extrapolation to zero primary extinction by changing λ and/or Kpol,d (Suortti, 1982). Recently, synchrotron radiation has provided new possibilities also in this area. In particular, wavelengths in the 0.1 Å regime can be used. These are an order of magnitude smaller than the ones used in traditional crystallography, so that 1 − yp(t) can be reduced drastically.
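The size of this reduction follows directly from Eqs. (34) and (35). The sketch below is only an illustration: the shape constant C and the value of t at 1 Å are hypothetical, and t is taken as simply proportional to λ for a fixed reflection, which neglects any change of F(hkl) and Kpol,d with wavelength.

```python
import math

def primary_extinction(t, C):
    """Model coefficient y_p(t) ~ exp(-C t^2) of Eq. (35)."""
    return math.exp(-C * t * t)

C = 1.0                  # shape-dependent constant, hypothetical value
t_1A = 0.3               # block size in extinction distances at 1 angstrom, hypothetical
t_01A = 0.1 * t_1A       # assumed scaling t ~ lambda from Eq. (34)

loss_1A = 1.0 - primary_extinction(t_1A, C)
loss_01A = 1.0 - primary_extinction(t_01A, C)
ratio = loss_1A / loss_01A   # close to 100 when t is small, since 1 - y_p ~ C t^2
```

Because 1 − yp varies as t² at small t, a tenfold decrease in wavelength lowers the extinction correction by roughly two orders of magnitude under these assumptions.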

Secondary extinction may be substantial even when primary extinction is negligible. For instance, some crystals have a layer-like structure, where the thin layers are highly parallel, but diffract incoherently. The interaction between the incident (P0) and diffracted (Ph) beams is described by energy transfer equations (Zachariasen, 1945)

$$\partial P_0/\partial s_0 = -\mu_e P_0 + \sigma P_h, \qquad \partial P_h/\partial s_h = -\mu_e P_h + \sigma P_0 \qquad (36)$$

Here s0 and sh are the coordinates in the directions of the incident and diffracted beams, respectively, and µe = µ0 + σ the total attenuation coefficient, due to the linear absorption coefficient µ0 and the scattering coefficient σ(s0, sh, θ). If the transmitted and reflected beams are recorded as functions of θ, and the geometrical factors are known, the transfer equations can be solved, if σ depends weakly on (s0, sh). The extinction-corrected integrated reflection is ∫σ(θ) dθ. This method has been used in a few cases, but with highly collimated, monochromatic SR, experimental corrections for secondary extinction could be done in a routine way.

Usually, an angular distribution W(Δθ) of σ(θ) is assumed without measuring the actual reflectivity curve, and the width of W is a parameter of the model. For a symmetrical Bragg reflection from a thick sample,

$$E_{\mathrm{obs}} = Q/2(\mu_0 + gQ) \qquad (37)$$

where gQ is the mean secondary extinction coefficient. For a Gaussian distribution of W, g = 1/(2√π η), where η is the standard deviation. In crystal structure determination a very large number of reflections is recorded, and the individual reflectivity curves are not examined. Extinction corrections are included in structure refinement, where the model parameters are fitted by least-squares methods. The most widely used model for extinction corrections is that by Becker and Coppens (1974).
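Dividing Eq. (37) by the kinematical value Q/2µ0 of Eq. (32) gives a secondary extinction factor µ0/(µ0 + gQ). The sketch below evaluates this for a Gaussian mosaic distribution; the numerical values of Q, µ0, and η are hypothetical and chosen only to show the order of magnitude, and the function names are illustrative.

```python
import math

def gaussian_g(eta):
    """g = 1/(2 sqrt(pi) eta) for a Gaussian W with standard deviation eta (rad)."""
    return 1.0 / (2.0 * math.sqrt(math.pi) * eta)

def e_obs(Q, mu0, g):
    """Eq. (37): symmetrical Bragg reflection from a thick sample."""
    return Q / (2.0 * (mu0 + g * Q))

# Hypothetical values: Q and mu0 in 1/cm, eta in radians
Q, mu0, eta = 0.05, 100.0, 1.0e-4
g = gaussian_g(eta)
y_s = e_obs(Q, mu0, g) / (Q / (2.0 * mu0))   # ratio to the kinematical value
```

A narrow mosaic distribution (small η) makes g large, so even a moderate Q can reduce the observed intensity substantially.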

Diffraction from powder samples can be used to approach the kinematical limit. An ideal powder sample has a randomly oriented distribution of small crystallites. In terms of Ewald's construction, the reciprocal lattice takes all orientations with respect to k0, so that all reflections with K ≤ 2k0 take place simultaneously. The intensity of a given reflection is distributed uniformly on the cone inscribed by k about k0. The diffraction pattern or a part of it can be projected on a two-dimensional detector. The Bragg reflections form concentric rings, and the diffraction pattern is projected on K by azimuthal integration. Secondary extinction is the same in all reflections, corresponding to the average scattering contribution to total attenuation. Primary extinction is usually small because of the small grain size of the powder particles, but broadening and overlap of the reflections make it difficult to separate the background from the diffraction pattern. Again, SR has improved the quality of powder diffraction data by making instrumental broadening of reflections nearly negligible. Powder diffraction is now used for structure determination, which is important because sufficiently large single crystals of many interesting materials are not available. However, the principal use of powder diffraction is in materials science, in studies of grain size, strain, and texture in metals, alloys, and ceramics. These quantities are extracted from the shapes and shifts of the reflection profiles, and from nonuniform distribution of intensity on diffraction cones. Accounts of developments in powder diffraction are found in books edited by Young (1993), and Snyder, Fiala, and Bunge (1999).

VI. SYNCHROTRON RADIATION SOURCES AND PROPERTIES OF RADIATION

A. Sources

Synchrotron radiation (SR) is emitted when light charged particles (electrons or positrons) moving with relativistic velocity undergo radial acceleration. Small synchrotrons have developed into large (circumference about 1 km) storage rings, which are dedicated to production of synchrotron radiation. Periodic magnetic structures called wigglers and undulators are inserted in the straight sections of the storage rings to enhance the flux of the SR sources. The latest advance in the field is development of special storage rings and linear accelerators, where free electron lasers provide tunable coherent radiation of high brilliance. The sources and properties of SR have been discussed by Brefeld and Gurtler (1991).

Third-generation synchrotron radiation laboratories are actually accelerator complexes, which include the source of particles (electron gun), a preaccelerator (microtron or linac), a booster synchrotron, where the final particle energy is achieved, and the storage ring. The layout of the European Synchrotron Radiation Facility (ESRF) is shown in Fig. 6. The storage ring is actually a polygon, where the particles travel in a vacuum tube, and dipole magnets bend the particle trajectory at the corners to make a closed orbit. The particles have formed bunches in the preaccelerators, and when they are injected into the storage ring, their energy is typically several GeV; the length of the bunch is of the order of 1 cm, and its diameter is smaller than 0.1 mm. The fill of the storage ring can be varied from single bunch to about 1000 bunches, so that the time difference between bunches is from a few microseconds to nanoseconds, and the bunch length corresponds to a few tens of picoseconds. The velocity is very close to the velocity of light, because the electron or positron mass is about 10⁴ times the rest mass. When changing direction in the magnetic field the particles radiate along the tangent of the orbit, as will be seen in the following, and the loss of energy is compensated for in radio-frequency cavities. When scattering from residual gas molecules in the vacuum tube is small the lifetime of the particle beam is about 1 day.

The closed orbit of the particles can be described in phase space. The 1σ-contours of position and angle are ellipses in horizontal and vertical directions, and the area/π of each one is called the horizontal or vertical emittance, εx or εz, respectively. A steady state is achieved in a few milliseconds after injection, and εx and εz are characteristic constants of the storage ring. The shape of the emittance ellipse changes along the ring depending on the focusing magnets, but the area is constant. The ratio εz/εx is called coupling, and in modern storage rings it is a few percent. The horizontal emittance of the ESRF is 3 × 10⁻⁹ m rad, which corresponds to a typical beam size σx = 300 µm and divergence σ′x = 10 µrad. In the vertical direction, both values are reduced by a factor of 10.

B. Properties of SR

FIGURE 6 Layout of the European Synchrotron Radiation Facility (ESRF) in Grenoble, France. The circumference of the storage ring is 845 m, and presently 40 beamlines are in operation. [ESRF Highlights 1999; reproduced with permission.]

The radiation pattern of a nonrelativistic electron orbiting in a magnetic field has the well-known dipole radiation distribution I ∝ sin² ϕ, where ϕ is the angle between the observation direction and the direction of radial acceleration in the rest frame of the electron. For a relativistic electron, this radiation distribution has to be transformed by Lorentz transformation into the laboratory system. As a result, radiation is observed only within a narrow cone in the propagation direction of the electron (Fig. 7). The opening angle of this cone is approximately

$$\Delta\psi = 1/\gamma = E_0/E \qquad (38)$$

where E is the electron energy and E0 is the electron rest energy, 511 keV. The frequency spectrum observed by a stationary observer is understood by considering the situation where the narrow radiation cone sweeps past in a short time approximately equal to R/(γ³c), where R is the radius of curvature of the orbit and c the velocity of light. For the values relevant for the ESRF bending magnet, γ = 1.174 × 10⁴, R = 23 m, this single electron flash lasts 4.7 × 10⁻²⁰ s. This short pulse, Fourier transformed to frequency, contains a spectrum of harmonics up to 2.1 × 10¹⁹ Hz, which corresponds to 0.14 Å wavelength or 90 keV photon energy. In a real storage ring with many emitting electrons, a continuous spectrum covering the range from infrared to hard X-ray regime is observed.
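The single-electron flash estimate can be reproduced with a few lines of arithmetic. The sketch below uses the values of γ and R quoted for the ESRF bending magnet; the physical constants are rounded, and the function name is illustrative.

```python
C_LIGHT = 2.998e8        # velocity of light, m/s
H_EV_S = 4.136e-15       # Planck constant, eV s

def flash(gamma, radius_m):
    """Duration R/(gamma^3 c) of the flash and the corresponding cutoff values."""
    dt = radius_m / (gamma ** 3 * C_LIGHT)      # seconds
    nu = 1.0 / dt                               # highest frequency, Hz
    wavelength_A = C_LIGHT / nu * 1.0e10        # angstrom
    photon_keV = H_EV_S * nu / 1.0e3            # keV
    return dt, nu, wavelength_A, photon_keV

gamma = 1.174e4                                 # ESRF value quoted in the text
opening_angle = 1.0 / gamma                     # Eq. (38), radians
dt, nu, lam, e_kev = flash(gamma, 23.0)
```

With these inputs the flash lasts about 4.7 × 10⁻²⁰ s and the cutoff is near 0.14 Å, in line with the rounded figures given above.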

The spectrum of synchrotron radiation can be calculated precisely, and SR is actually used to calibrate instruments utilized in astrophysics. The spectral brilliance is usually given in units of photons per second per mm² source area per mrad² source divergence and per 0.1% bandwidth, I = I(x, z, θ, ψ, E, t). Integration over the source area yields the spectral brightness (intensity), and integration over all angles yields the spectral flux. Universal curves of spectral brightness and flux can be calculated for a bending magnet source, and those are illustrated in Fig. 8. The wavelength scale is given in units of the critical or characteristic wavelength λc, which is in practical units of the bending radius R, magnetic field B, and the electron energy E,

$$\lambda_c[\text{Å}] = 5.59\,R[\mathrm{m}]/E^3[\mathrm{GeV}^3] = 18.6/(B[\mathrm{T}]\,E^2[\mathrm{GeV}^2]) \qquad (39)$$

The concrete meaning of λc is that it divides the emitted power in equal halves. The universal curves can be used for calculating the brightness and flux of radiation from a given bending magnet.
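As a worked example of Eq. (39), the sketch below evaluates the critical wavelength for a 23 m bending radius. The electron energy of 6 GeV is an assumption inferred from γ = 1.174 × 10⁴ and the 511 keV rest energy; the conversion 12.398 keV Å between wavelength and photon energy is standard, and the function names are illustrative.

```python
def critical_wavelength_from_radius(radius_m, energy_gev):
    """First form of Eq. (39): lambda_c in angstrom."""
    return 5.59 * radius_m / energy_gev ** 3

def critical_wavelength_from_field(field_T, energy_gev):
    """Second form of Eq. (39): lambda_c in angstrom."""
    return 18.6 / (field_T * energy_gev ** 2)

energy_gev = 6.0          # assumed, from gamma * 511 keV
radius_m = 23.0
lam_c = critical_wavelength_from_radius(radius_m, energy_gev)
field_T = 18.6 / (lam_c * energy_gev ** 2)   # field implied by equating both forms
crit_keV = 12.398 / lam_c                    # critical photon energy
```

Under these assumptions the critical wavelength is about 0.6 Å, corresponding to a field of roughly 0.87 T and a critical energy near 21 keV.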

SR from a bending magnet is linearly horizontally polarized when observed in the orbit plane. Out of the plane, the polarization is elliptical and can be decomposed into its horizontal and vertical components. These are shown in Fig. 7, and it is seen that the more intense horizontal component is closely approximated by a Gaussian of standard deviation σψ, which is related to the full width at half maximum by FWHMψ = 2.35 σψ.

FIGURE 7 Cone of synchrotron radiation from a relativistic electron, and vertical intensity distributions of the parallel (electric vector in the orbit plane) and perpendicular components.

FIGURE 8 Spectral brightness and flux of SR from a bending magnet source in universal units.

The sources of radiation in modern storage rings are mostly wigglers and undulators. These are periodic magnetic structures, where the electrons travel oscillating about the center line. In the simplest case the trajectory is sinusoidal, and it can always be described by a few Fourier components. In most cases, the magnetic field is vertical, so that the electron trajectory lies in the horizontal plane. Wigglers and undulators are similar structures, and they consist usually of permanent magnet blocks above and below the vacuum chamber. The magnetic field can be changed by opening or closing the gap between the upper and lower jaws. The wiggler or undulator is characterized by a parameter giving the ratio of the maximum angular deflection δ of the electron beam to the opening angle of the radiation cone, 1/γ,

K = γ δ = 0.934 λ0 [cm]B0[T] (40)

where λ0 is the period length and B0 the peak magnetic field. When K ≫ 1 the radiation cone sweeps over a wide fan 2δ, typically a few mrad, and the device is called a wiggler. The intensities from different source points add up incoherently, and the brightness and flux are approximately equal to that from 2N bending magnets, if there are N magnet periods. When K is of the order of 1, wave fronts from different periods interfere coherently, producing sharp peaks in the emitted spectrum. When an electron moves through an undulator of period λu it undergoes transverse harmonic oscillation in its rest frame, and it emits at one frequency. In the laboratory frame, the wavelength of the radiation is

$$\lambda_1 = (\lambda_u/2\gamma^2)(1 + K^2/2 + \gamma^2\theta^2) \qquad (41)$$

where θ is the angle between the axis and direction of observation. When K increases the displacements become larger, and the electron oscillates also in the longitudinal direction with double frequency. The oscillations are no longer harmonic, and shorter wavelengths λi = λ1/i appear. The odd harmonics have their maximum intensity on axis, while the on-axis intensity of even harmonics is zero. The intensity of the central beam and that of the angle-integrated spectrum are shown in Fig. 9.
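Equations (40) and (41) combine into a short calculation of the harmonic wavelengths. In the sketch below the period (4 cm) and peak field (0.48 T) are hypothetical values picked to give K close to 1.8; γ is the ESRF value quoted earlier, and the function names are illustrative.

```python
def deflection_parameter(period_cm, peak_field_T):
    """Eq. (40): K = 0.934 * lambda_0[cm] * B_0[T]."""
    return 0.934 * period_cm * peak_field_T

def harmonic_wavelength_A(i, period_cm, K, gamma, theta=0.0):
    """Eq. (41) divided by the harmonic number i; result in angstrom."""
    lam1_cm = period_cm / (2.0 * gamma ** 2) * (1.0 + K ** 2 / 2.0 + (gamma * theta) ** 2)
    return lam1_cm * 1.0e8 / i

gamma = 1.174e4
period_cm, peak_field_T = 4.0, 0.48          # hypothetical undulator
K = deflection_parameter(period_cm, peak_field_T)
lam_1 = harmonic_wavelength_A(1, period_cm, K, gamma)
lam_3 = harmonic_wavelength_A(3, period_cm, K, gamma)
lam_1_off_axis = harmonic_wavelength_A(1, period_cm, K, gamma, theta=1.0 / gamma)
```

The centimetre-scale magnet period is contracted by the factor 1/(2γ²) to the ångström scale, and observation off axis shifts the harmonics to longer wavelengths.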

The central brightness of an undulator beam is

$$I_i(\theta = 0) = 1.744 \times 10^{11}\, N^2 E^2[\mathrm{GeV}^2]\, I[\mathrm{mA}]\, F_i(K) \qquad (42)$$

where Fi(K) is a function with a maximum value of about 0.5. For instance, with K = 1.8 and N = 150, an ESRF undulator delivers 1.4 × 10¹⁶ photons/s/mm²/0.1% BW in the third harmonic at 30 m from the source. The total power radiated by a wiggler or undulator is

$$P_T[\mathrm{kW}] = 0.633\, E^2[\mathrm{GeV}^2]\, B_0^2[\mathrm{T}^2]\, L[\mathrm{m}]\, I[\mathrm{A}] \qquad (43)$$

so that the foregoing undulator radiates 7.3 kW. The total wiggler power can be even higher, and therefore all beamline components exposed to the beam must be efficiently cooled. These numbers can be compared with high-power X-ray tubes, which may radiate 100 W to a solid angle of 2π, while the undulator radiates to a solid angle of 10⁻⁸ rad².

VII. X-RAY OPTICS FOR SYNCHROTRON RADIATION BEAMLINES

FIGURE 9 Spectral flux from an undulator. The numerical values correspond to the early operation parameters of the ESRF.

X-ray optics are based on certain reflecting and diffracting elements, which are used to select an energy band from the SR beam and to focus it on the sample. The transport of the SR beam from the source to the sample and from the sample to the detector can be best described by phase space analysis, where each optical element is a window in position–angle–wavelength space. The beamline should be seen as an integral system, where the optical components and the detector should be matched with the resolution of the sample and requirements of the experiment. In practice, beamlines are complicated structures, where compromises between optimal performance and technical feasibility must be made.

Two phenomena are utilized in X-ray optics: diffraction by a single crystal or by a synthetic multilayer structure, and total external reflection from a mirror. Focusing is achieved by curved crystals, multilayers, or mirrors.

A. Mirrors

It was already mentioned that the refractive index is a complex quantity, where the real part 1 − δ is slightly smaller than unity. Therefore, total reflection takes place at the interface of vacuum and solid mirror, and if absorption is small, there is a well-defined critical angle,

$$\theta_c\,[\mathrm{mrad}] = \sqrt{2\delta} = 2.3\,\lambda\,(\rho Z/A)^{1/2} \tag{44}$$

where λ is the X-ray wavelength in Å, ρ the mirror density in units of g/cm$^3$, Z the atomic number, and A the atomic mass. Absorption rounds off the sharp edge at the critical angle. Because of the small opening angle of SR, totally reflecting mirrors provide efficient solutions even though $\theta_c$ is typically only a few mrad. Ideal point-to-point focusing is achieved with an elliptical mirror, when the source is at one of the foci. In practice, mirrors are ground to cylindrical shape in the sagittal direction and bent in the meridional direction. In addition to focusing, an X-ray mirror acts as a low-pass filter, as the critical angle is inversely proportional to the X-ray energy. In many experiments, the high-energy harmonics from the monochromator cannot be allowed, and they are rejected by a mirror set to reflect the fundamental energy.
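The numerical form of Eq. (44), and the low-pass behavior that follows from it, can be sketched as below. The function names are illustrative, the material constants for Si are standard handbook values rather than numbers from the text, and the conversion constant hc ≈ 12.398 keV Å is assumed.

```python
import math

HC_KEV_ANGSTROM = 12.398  # h*c in keV*Angstrom


def critical_angle_mrad(wavelength_angstrom, density_g_cm3, Z, A):
    """Critical angle of total external reflection from Eq. (44)."""
    return 2.3 * wavelength_angstrom * math.sqrt(density_g_cm3 * Z / A)


def cutoff_energy_kev(grazing_angle_mrad, density_g_cm3, Z, A):
    """Photon energy above which a mirror at a fixed grazing angle stops reflecting."""
    wavelength = grazing_angle_mrad / (2.3 * math.sqrt(density_g_cm3 * Z / A))
    return HC_KEV_ANGSTROM / wavelength


# Uncoated Si mirror (density 2.33 g/cm^3, Z = 14, A = 28.09) at a wavelength of 1 Angstrom
theta_c = critical_angle_mrad(1.0, 2.33, 14, 28.09)
print(f"critical angle: {theta_c:.2f} mrad")
print(f"cutoff at 2.5 mrad: {cutoff_energy_kev(2.5, 2.33, 14, 28.09):.1f} keV")
```

Setting the grazing angle just below the critical angle for the fundamental energy places the higher harmonics above the cutoff, which is the harmonic rejection described above.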

Mirror technology has advanced enormously in recent years. The rms surface roughness of 1 m long mirrors is on the level of atomic size, and the figure errors are less than 1 µrad. The usual mirror material is Si or SiC, and the mirror can be coated by a thin metal layer to provide the desired critical angle. In order to benefit from the high quality of the mirrors, their shape must be maintained even under the high heat load of the SR beam. The mirror is water-cooled, and its shape is monitored by an optical sensor. The sensor has feedback to piezoelectric actuators, which correct for the effects of thermal deformations of the mirror.

B. Multilayers

Multilayers are synthetic periodic structures, where alternating layers of light and heavy elements are deposited on a substrate. Typically, the layer thickness for each crystal element is 5 to 10 atomic layers, so that the period in the direction of the surface normal is 20 to 50 Å. Because of the large period, the Bragg angles are small, as seen in Fig. 10. The figure also shows the regime of total reflection at small incident angles. The relative width of the Bragg reflection from a multilayer is 1/N in the energy scale, where N is the number of periods (typically between 100 and 1000), so that multilayers can be used as wide bandpass monochromators and focusing elements. In fact, the X-ray optical properties of multilayers are somewhere between those of mirrors and crystals, which makes them useful in many applications.

FIGURE 10 Reflectivity of a W/C synthetic multilayer structure for 8 keV X-rays. The regime of total external reflection is seen at small angles, and the first multilayer Bragg reflection at 2.2°. The higher order reflections are strongly suppressed.
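The small Bragg angles of multilayers follow directly from the Bragg law with the multilayer period in place of the lattice spacing. The sketch below is a rough estimate that neglects refraction; the period of 20.2 Å is an assumption chosen so that the first-order reflection at 8 keV falls near the 2.2° seen in Fig. 10, and the function names are illustrative.

```python
import math

HC_KEV_ANGSTROM = 12.398  # h*c in keV*Angstrom


def bragg_angle_deg(period_angstrom, energy_kev, order=1):
    """Bragg angle from n*lambda = 2*d*sin(theta), refraction neglected."""
    wavelength = HC_KEV_ANGSTROM / energy_kev
    return math.degrees(math.asin(order * wavelength / (2.0 * period_angstrom)))


def relative_bandwidth(n_periods):
    """Relative energy width 1/N of a multilayer reflection, as stated in the text."""
    return 1.0 / n_periods


# Assumed period of 20.2 Angstrom, 8 keV photons, N between 100 and 1000
print(f"first-order Bragg angle: {bragg_angle_deg(20.2, 8.0):.2f} deg")
for n in (100, 1000):
    print(f"N = {n}: bandwidth {relative_bandwidth(n):.0e}")
```

A bandwidth of $10^{-2}$ to $10^{-3}$ is far wider than that of a perfect crystal reflection, which is why multilayers serve as wide bandpass monochromators.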

C. Perfect Crystals

It was already seen that perfect crystals are totally reflecting in a narrow range about the Bragg angle, when absorption is small. Perfect crystals, Si in particular, are used as X-ray optical elements that separate a narrow energy band from the polychromatic SR beam. The width of the relative energy or wavelength band is constant for a given reflection,

$$\frac{\delta E}{E} = \frac{r_e (2d)^2 K_{\mathrm{pol},d}\, F(hkl)}{\pi V_c} \tag{45}$$

which is easily derived from Eq. (31). The width of the relative energy band varies from $10^{-4}$ for the low-order reflections to $10^{-7}$ or $10^{-8}$ for the high-order reflections used in back-reflection geometry. The angular width of the reflection is proportional to the wavelength, and it varies from a few arcseconds to the submicroradian range. The other contribution to the width of the energy band comes from the variations in the incident angle. From the Bragg law,

$$\frac{\Delta E}{E} = -\cot\theta\, \Delta\theta \tag{46}$$

The standard monochromator construction is the so-called nondispersive or antiparallel setting of two identical crystals, as shown in Fig. 11. It is evident that the wavelength band reflected by the first crystal is reflected by the second, and the propagation direction is conserved. There is an offset of the beam, and various geometrical and mechanical solutions have been introduced to keep the offset constant when the energy is changed by rotating the crystals. The first crystal is cooled, because the heat load of the SR beam causes distortions and large changes of intensity and energy. Perhaps the most advanced solution is to cool the crystal to about 100 K by liquid nitrogen, because at that temperature the thermal expansion coefficient of Si goes to zero, and the thermal conductivity has a maximum. The second crystal may be bent sagittally for horizontal focusing, and different constructions have been put forward to maintain the cylindrical shape under dynamical bending.
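The two contributions to the energy band of a crystal monochromator, the intrinsic width of Eq. (45) and the divergence term of Eq. (46), can be estimated with the minimal sketch below. The function names are illustrative. The Si(111) inputs (d ≈ 3.136 Å, unit cell volume ≈ 160.2 Å$^3$, structure factor of roughly 59 electron units, polarization factor 1) are approximate values assumed for the example and are not taken from the text.

```python
import math

R_E_ANGSTROM = 2.818e-5  # classical electron radius in Angstrom


def intrinsic_bandwidth(d_angstrom, cell_volume_A3, structure_factor, k_pol=1.0):
    """Relative energy width of a perfect-crystal reflection, Eq. (45)."""
    return (
        R_E_ANGSTROM * (2.0 * d_angstrom) ** 2 * k_pol * structure_factor
        / (math.pi * cell_volume_A3)
    )


def divergence_bandwidth(bragg_angle_rad, delta_theta_rad):
    """Magnitude of the relative energy spread from an angular spread, Eq. (46)."""
    return abs(delta_theta_rad / math.tan(bragg_angle_rad))


# Approximate Si(111) values; 10 microradian divergence at a Bragg angle of 0.2 rad
print(f"intrinsic:  {intrinsic_bandwidth(3.1356, 160.19, 59.0):.2e}")
print(f"divergence: {divergence_bandwidth(0.2, 1e-5):.2e}")
```

With these assumed inputs the intrinsic width comes out at the $10^{-4}$ level, in line with the value quoted above for low-order reflections, and a beam divergence of some ten microradians contributes a comparable amount.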

Bent perfect crystals are used also for meridional horizontal focusing, particularly at high SR energies, where the geometrical aberrations of standard constructions become large due to small Bragg angles. Such an arrangement is shown in Fig. 12. The beam is first monochromatized by a cylindrically bent Laue-type crystal, and then focused in the horizontal plane by a Bragg-type crystal. Vertical focusing is obtained by a multilayer, which is placed between the monochromator and sample. Focus sizes of a few micrometers have been achieved, which allows probing local strains in bulk samples, for instance.

D. Other X-Ray Optical Elements

All X-ray optical elements of SR beamlines are based either on refraction or diffraction. An interesting combination of the two phenomena is used in so-called Bragg–Fresnel lenses. Focusing by Fresnel zone plates is well-known in optics, and when the zone plates are made of perfect single crystals they can act as narrow-band monochromators at the same time. Photographs of linear and circular Bragg–Fresnel lenses are shown in Fig. 13. The special feature of the structure is that the X-rays reflected from the bottom of the grooves interfere constructively at the same place as those reflected from the elevated part of the lens. This is achieved by an appropriate depth of the groove, which is typically a few micrometers. Circular Bragg–Fresnel lenses form a point focus, but they can be used only near normal incidence of the beam, which limits the available X-ray energies. Linear lenses form a line focus, but they can be bent cylindrically to focus also in the other direction. Focus sizes in the micrometer range have been achieved. The limitation of the Bragg–Fresnel lenses is their small size, which limits the useful aperture of the SR beam.

FIGURE 11 Geometry of double-crystal monochromator in nondispersive (n, −n) (solid lines) and dispersive (n, n) (broken lines) settings.

FIGURE 12 Laue–Bragg monochromator. Horizontal focusing is achieved by bent crystals in the geometry where the source points and foci are on the Rowland circles. Vertical focusing is realized by a bent multilayer. [ESRF Highlights 1996/1997, reproduced with permission.]

The fact that the refractive index for X-rays is smaller than unity can be utilized for construction of X-ray lenses. For a long time, this was considered impossible, because typically δ is of the order of $10^{-5}$. However, a spherical cavity inside a solid acts as a condensing lens, and if a long row of holes is milled in a weakly absorbing material such as Be, a beam traveling along the row is focused. Focusing in the other direction is obtained by a crossed row of holes. Again, focus sizes in the micrometer range are achieved.

FIGURE 13 Linear and circular Bragg–Fresnel lenses fabricated on Si wafers. [ESRF Highlights 1994/1995; reproduced with permission.]

Still another method of microfocusing is that of using tapered capillaries. The SR beam enters the capillary at the wide end, and after successive total reflections from the inner walls of the capillary, exits from the other end. Submicron beam sizes have been achieved by this method. The methods described above have their advantages and disadvantages. With the exception of the bent crystal scheme, the useful aperture of the focusing element is small, typically less than 1 mm. There are losses due to scattering and absorption, and it always has to be remembered that the phase space density of photons cannot be increased. The price paid for high spatial resolution is the corresponding deterioration of the angular resolution, which limits the use of microfocusing in diffraction studies.

E. Beamline

It was already mentioned that the power of the SR beam from an undulator or wiggler is typically several kW. Therefore, the optical elements and beamline instruments must be placed in “hutches,” which provide radiation shielding. For instance, the spectrum of the radiation from the superconducting wavelength shifter at the ESRF extends beyond 1 MeV, and 60 mm of lead is needed for shielding. The beamline components are under vacuum to prevent contamination and production of ozone by the SR beam. An example of a beamline is shown in Fig. 14. The beam enters the first optics hutch through the shield wall. It is limited by fixed apertures, and usually the “soft” low-energy part of the spectrum is removed by absorbers. The beam position is monitored, and the size of the beam is limited by primary slits to match the optical elements downstream. An adaptive mirror focuses the beam to infinity and removes high energies, a narrow energy band is separated by a two-crystal monochromator, and the beam is focused on the sample by the second mirror. At high energies, where the critical angle of the mirrors becomes too small, the mirrors can be moved away, and horizontal focusing is achieved by the sagittally bent second crystal of the monochromator. All the components of the beamline are controlled remotely, and there are several feedback loops, which are used to maximize and stabilize the intensity of the exit beam.

Usually the sample is enclosed in a chamber, where it is under vacuum or high pressure, it can be heated or cooled, and electric or magnetic fields can be applied. In diffraction studies, usually a four-circle or even six-circle diffractometer is used. Two angular motions are needed to orient the crystal for fulfillment of the diffraction conditions, and two motions are needed for rotation of the crystal during the scan and for tracking the diffracted beam by the detector. There may be an analyzer crystal in front of the detector to remove parasitic scattering. However, two-dimensional detectors are used more and more, which eliminates the detector movement and increases the speed of data collection. On the other hand, most such detectors operate in the integrating mode without any energy resolution, which increases the background in the observed diffraction pattern. A modern diffractometer is shown in Fig. 15.


FIGURE 14 X-ray optical components of a wiggler beamline at the ESRF.

VIII. X-RAY DIFFRACTION METHODS AND APPLICATIONS OF SYNCHROTRON RADIATION

The conventional X-ray diffraction methods are covered in another article (K. Ann Kerr, 1987), and the present one concentrates on new developments, which originate mostly from SR laboratories. The methods are numerous, and only short descriptions of the main areas of research are possible.

A. Structure Determination, Protein Crystallography

Recent reviews of protein crystallography based on the use of SR have been given by Bartunik (1991) and Helliwell (1992). The first steps of structure determination (growing the crystal, and determining the unit cell and the orientation matrix) have stayed the same as before, but there are some important advances. Very small crystals can be used, since $10^{10}$ to $10^{11}$ photons/s can be focused on a sample of 10 µm diameter. Peak search and orientation of the crystal are greatly facilitated by area detectors, most notably by image intensifiers coupled to fast readout CCDs. In general, area detectors with associated software have made collection and analysis of diffraction data a routine step in structure determination. On-line data analysis runs automatically as diffraction patterns are collected and keeps up with the data acquisition, so that the experimenter knows the quality of the data, completeness, etc. A graphical user interface interacts with different processes (beamline control, data acquisition, and data analysis) running on different computers at different times.

FIGURE 15 Six-axis Kappa diffractometer at the Materials Science beamline of the ESRF. All axes are mechanically independent, and they allow free choice of the scattering plane. The detector arm can be equipped with a crystal analyzer or an area detector.

Diffraction experiments with monochromatic radiation use mostly the rotating crystal method to scan reciprocal space. A diffractometer, such as the one shown in Fig. 15, is used for crystal orientation and scanning. The data rates are enormous, because the area detector has $10^6$ to $10^7$ pixels, and the dynamic range is 14 bits. It is estimated that the rate of uncorrected data may be up to 150 GB/day. Evidently, mass storage and a structured database are needed to handle the data. The essential step is reduction of the data to integrated intensities of the reflections, indexing the reflections, and making corrections for absorption, beam polarization, geometrical factors, and probably for extinction. The resulting structure factors without information about their phase angle give the starting point for crystal structure analysis.
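The data volumes quoted above can be put in perspective with a little arithmetic. The sketch below assumes that each 14-bit pixel is stored uncompressed in two whole bytes and that 1 GB means $10^9$ bytes; both are assumptions made for the estimate, as are the function names.

```python
def frame_size_bytes(n_pixels, bits_per_pixel=14):
    """Size of one uncompressed frame, storing each pixel in whole bytes."""
    bytes_per_pixel = (bits_per_pixel + 7) // 8  # 14 bits -> 2 bytes
    return n_pixels * bytes_per_pixel


def frames_per_day(daily_volume_gb, n_pixels, bits_per_pixel=14):
    """Number of frames that add up to a given daily data volume (1 GB = 1e9 bytes)."""
    return daily_volume_gb * 1e9 / frame_size_bytes(n_pixels, bits_per_pixel)


for n_pixels in (10**6, 10**7):
    n_frames = frames_per_day(150, n_pixels)
    print(f"{n_pixels:.0e} pixels: {n_frames:.0f} frames/day, "
          f"one frame every {86400 / n_frames:.1f} s")
```

Under these assumptions, 150 GB/day corresponds to a sustained rate of one large frame roughly every ten seconds, or about one smaller frame per second.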

Synchrotron radiation provides new possibilities for experimental determination of the phase angles of the structure factors. The use of multiple diffraction was mentioned earlier, but the methods based on anomalous dispersion are more important at the moment. The atomic scattering factor changes strongly near the absorption edges [cf. Eq. (4)]. The principle of phase determination from structure factors where the scattering factor of one type of atom is changed is shown in Fig. 16. The changes Δf′ and iΔf″ can be calculated from theory, but the free-atom values may be modified in the crystal. The imaginary part, which is due to photoelectric absorption, can be determined experimentally by measuring the fluorescence signal as a function of energy, and the real part can be calculated from f″ by Kramers–Kronig relations. Unambiguous phase angles are obtained from measurements with two or three X-ray energies, where the changes in f′ and f″ are large. Most native proteins do not have heavy atoms with absorption edges in the range of energies suitable for diffraction studies, i.e., above 5 keV. Preparation of heavy-atom derivatives is a standard problem in protein crystallography, and there is much empirical knowledge. A quite general technique is to introduce Se during biosynthesis of proteins, and the K-edge of Se is at the convenient energy of 12.66 keV.
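The geometric idea behind phase determination with two X-ray energies can be illustrated with a toy calculation. This is a simplified sketch and not the procedure used in practice: it assumes that the magnitude of the non-anomalous part $F_P$ is known, that the contribution $F_A$ of the anomalous scatterer is known as a complex number at each energy, and that the measured magnitudes are free of error. All names and numbers are hypothetical.

```python
import cmath
import math


def candidate_phases(fp_mag, fa, f_total_mag):
    """Two phases of F_P for which |F_P + F_A| equals the measured magnitude."""
    cos_term = (f_total_mag**2 - fp_mag**2 - abs(fa) ** 2) / (2.0 * fp_mag * abs(fa))
    cos_term = max(-1.0, min(1.0, cos_term))  # guard against rounding errors
    delta = math.acos(cos_term)
    phi_a = cmath.phase(fa)
    return (phi_a + delta, phi_a - delta)


def angular_distance(a, b):
    """Smallest absolute difference between two angles."""
    return abs(cmath.phase(cmath.exp(1j * (a - b))))


def solve_phase(fp_mag, measurements):
    """measurements: two (F_A, |F_total|) pairs taken at different energies."""
    (fa1, m1), (fa2, m2) = measurements
    pairs = [(p, q)
             for p in candidate_phases(fp_mag, fa1, m1)
             for q in candidate_phases(fp_mag, fa2, m2)]
    p, q = min(pairs, key=lambda pq: angular_distance(*pq))
    return cmath.phase(cmath.exp(1j * p) + cmath.exp(1j * q))  # circular mean


# Synthetic test case: the true phase of F_P is 0.7 rad
f_p = cmath.rect(10.0, 0.7)
f_a = [2.0 + 1.0j, -1.0 + 2.5j]  # hypothetical F_A at two energies
data = [(fa, abs(f_p + fa)) for fa in f_a]
print(f"recovered phase: {solve_phase(10.0, data):.3f} rad")
```

Each measurement alone leaves a twofold ambiguity, because the two circles in the complex plane intersect at two points. A second energy, where $F_A$ has changed, removes the ambiguity because only one candidate phase is common to both measurements.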

FIGURE 16 Determination of the phase angle of the structure factor by using the multiple-wavelength anomalous dispersion (MAD) method. FA is the structure factor contribution of the anomalous scatterer, and FP that from the rest of the unit cell. The phase angle φ is solved from the intersection points of the circles of radius |F| (Harker diagram).

The use of continuous radiation and stationary crystal, i.e., the Laue method, has had a renaissance with the advent of synchrotron radiation. The intensity of SR as a function of energy can be calculated precisely, so that the integrated intensities of the reflections recorded with different energies can be converted to structure factors on a common scale. A diffraction pattern is shown in Fig. 17. It is remarkable that this pattern was taken in a single shot of 150 picoseconds using an image intensifier coupled to a CCD camera, i.e., the pattern is due to diffraction of X-rays from a single electron bunch of the storage ring. On the one hand, this means that it is possible to determine structures of proteins which disintegrate rapidly due to radiation damage. It is also possible to make pump-and-probe experiments, where a conformational change of a protein is triggered by a laser pulse, and by varying the time difference to the X-ray pulse the evolution of a change in structure can be followed. The changes of the structure factors are observed, and the changes of the electron density are calculated from Eq. (27). One recent example is the study of binding and release of CO at the heme site of myoglobin. The movements of CO have been studied extensively by infrared spectroscopy and molecular dynamics calculations, but time-resolved diffraction of SR provides a detailed picture of the conformational changes of the protein. The movement of the CO molecule to the docking site, where it stays ca. 350 ns before leaving the pocket, can be seen. Then the molecule diffuses about in the outer protein for a fraction of a millisecond and comes back to the Fe by random collisions.

Radiation damage is a limiting factor in protein crystallography. Most of the radiation falling on the crystal is not used, either as wavelengths that are not reflected in the Laue method, or during the time spent between the reflections in monochromatic diffraction. Radiation damage is not simply a linear function of dose; it also scales with the total duration of the experiment. Therefore, efficient data collection strategies are essential. It turns out that the damage caused by high-energy radiation is substantially less than damage at the energies normally used in protein crystallography.


FIGURE 17 Laue diffraction pattern from a protein crystal taken in a single shot of 150 ps from an ESRF undulator. The bandwidth is 7–11 keV, and the image was recorded by a CCD camera coupled to an image intensifier. [ESRF Highlights 1994/1995; reproduced with permission.]

B. Fiber Diffraction

It was already mentioned that many materials and tissues found in nature have hierarchical structures. This means that there are many different length scales, and the diffraction pattern extends from ultrasmall angles to the wide-angle regime. Typically, the structures are built of long, chainlike molecules, which form fibrils, and these in turn are packed in ordered bundles. The interatomic distances in the molecule are seen in the wide-angle regime of diffraction, and the effects of the size and shape of the bundles in the ultrasmall-angle regime. Between these extremes, ordering is seen as diffraction maxima. Collagen is a good example of these hierarchical structures. There are many variants of collagen, and study of collagen structures and functions is a wide field of research by itself. Diffraction studies with conventional X-ray sources were limited to well-ordered collagen structures, such as tendon. The high intensity and collimation of SR have made possible high-resolution studies in the whole angular range. An example is given in the following.

Human dermis under the epidermis is mostly composed of fibroblasts in a matrix of collagen with small amounts of proteoglycans and elastin. The molecular structure of collagen is a triple helix, which is about 3000 Å long. The molecules form twisted fibrils, where adjacent molecules are parallel and shifted by 650 Å. In the dermis, these fibrils are also parallel, forming bundles of 20 to 100 fibrils, and the bundles are more or less randomly oriented. The exact dimensions of these structures can be seen in the diffraction pattern. Only the small-angle and ultrasmall-angle part of the pattern is shown in Fig. 18. The diffraction maxima due to the 652 Å periodicity in the direction of the fiber axis are easily identified. One maximum arises from the 1150 Å spacing in the hexagonal packing of the fibrils in a bundle. In addition, there are oscillations that correspond to the side maxima of the Fourier transform of the cylindrical shape of the fibrils. The diameter of the fibrils is estimated to be about 1100 Å. The size of the bundles is too large to be evaluated reliably from the intensity at ultrasmall angles. This interpretation of the collagen structure in human dermis is summarized in Fig. 18.
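To see why such large spacings require small-angle instruments, the Bragg law can be used to estimate where the maxima fall. The sketch below is illustrative only: the wavelength of 1 Å is an assumed value, not one given in the text, and the function name is hypothetical.

```python
import math


def scattering_angle_mrad(spacing_angstrom, wavelength_angstrom, order=1):
    """Full scattering angle 2*theta in mrad from n*lambda = 2*d*sin(theta)."""
    sin_theta = order * wavelength_angstrom / (2.0 * spacing_angstrom)
    return 2.0 * math.asin(sin_theta) * 1e3


WAVELENGTH = 1.0  # Angstrom, assumed for illustration

# Orders of the 652 Angstrom axial periodicity of collagen
for n in range(1, 6):
    angle = scattering_angle_mrad(652.0, WAVELENGTH, n)
    print(f"axial order {n}: 2theta = {angle:.2f} mrad")

# Maximum from the 1150 Angstrom spacing of the hexagonal fibril packing
print(f"packing: 2theta = {scattering_angle_mrad(1150.0, WAVELENGTH):.2f} mrad")
```

At this wavelength all of these maxima lie within a few milliradians of the direct beam, which is why the high collimation of SR is needed to resolve them.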

C. Surface Structures

The present understanding of the atomic structure of surfaces is to a large extent based on studies with SR. Because of the high intensity, a sufficient signal can be obtained from an atomic layer, and total reflection at grazing incidence largely eliminates the signal from bulk. The standard geometry of surface diffraction studies is shown in Fig. 19. The SR beam is incident at a small angle α, and diffraction is observed as a function of the lateral angle 2θ along a line perpendicular to the surface.

Surface diffraction takes place in a few top atomic layers of the surface, so that propagation of X-rays on the surface must be studied more closely. The critical angle $\alpha_c$, which is given by Eq. (44), corresponds to a wave vector transfer $K_c = 2k \sin\alpha_c \approx 2k\alpha_c$, which is independent of wavelength. Snell’s law can be written at small angles as $\alpha^2 = \alpha'^2 + \alpha_c^2$. This implies that α′ must be imaginary for $\alpha < \alpha_c$, which physically means that the transmitted wave at grazing angle α′ is transformed into an evanescent wave propagating along the surface with an exponentially damped intensity profile. The penetration depth approaches $1/K_c$ in the limit $\alpha \ll \alpha_c$. The evanescent wave can be considered as an incident plane wave, which will be diffracted by the in-plane structures.
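The depth of the evanescent wave follows from the small-angle form of Snell's law given above. The sketch below neglects absorption, so that the wave is undamped above the critical angle; the function name and the example values (critical angle of 2.5 mrad, wavelength of 1 Å) are assumptions for illustration.

```python
import cmath
import math


def penetration_depth_angstrom(alpha_rad, alpha_c_rad, wavelength_angstrom):
    """1/e depth of the transmitted intensity, with absorption neglected."""
    k = 2.0 * math.pi / wavelength_angstrom
    # alpha'^2 = alpha^2 - alpha_c^2; alpha' is purely imaginary below alpha_c
    alpha_t = cmath.sqrt(alpha_rad**2 - alpha_c_rad**2)
    if alpha_t.imag == 0.0:
        return math.inf  # no damping at or above the critical angle in this model
    # The field decays as exp(-k*Im(alpha')*z), the intensity twice as fast
    return 1.0 / (2.0 * k * alpha_t.imag)


ALPHA_C = 2.5e-3  # rad, assumed critical angle
for ratio in (0.0, 0.5, 0.9):
    depth = penetration_depth_angstrom(ratio * ALPHA_C, ALPHA_C, 1.0)
    print(f"alpha/alpha_c = {ratio}: depth = {depth:.1f} Angstrom")
```

For $\alpha \to 0$ the depth tends to $1/(2k\alpha_c) = 1/K_c$, a few tens of Å with these inputs, so only the topmost atomic layers contribute to the diffracted signal.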

FIGURE 18 Small-angle diffraction pattern from human dermis. The maxima in the diffraction pattern arise from the axial periodicity of the collagen, the diameter of the fibrils and their hexagonal packing. [ESRF Newsletter No. 29, 1997; reproduced with permission.]

Surface structures can be classified as termination of the three-dimensional crystal at the surface, reconstruction of the topmost atomic layers of the surface, terraces of atomic layers on inclined surfaces, ordering of adsorbed mono- or submonolayers on a crystal surface, and structure of films on surfaces. The termination of the crystal gives rise to so-called truncation rods, as the two-dimensional surface structure is Fourier-transformed to a one-dimensional structure in reciprocal space. The relps of the bulk crystal lie on these rods, but the intensity is not zero between the relps. Formally, this is most easily seen by using the convolution theorem. The three-dimensional δ-function of the crystal lattice is multiplied by a step function. The reciprocal lattice is then convolved with the Fourier transform of the step function, which is of the form 1/(iK · c), if the lattice vectors a and b are in the surface plane.

The atomic layers on the surface reorganize themselves spontaneously, forming new bonds with other atoms on the surface and in the layers underneath. This produces two-dimensional structures, which do not have the same periodicity as the terminated crystal, and in reciprocal space fractional order rods are observed. Similar structures develop when atoms are adsorbed on a surface. An example is shown in Fig. 20. Oxygen atoms on the (110) surface of Cu create a rectangular unit cell, the so-called c6×2 structure. Studies of the surface structures are important from the technological point of view, but also for understanding the mechanisms of phase transitions, for instance. A review has been given by Robinson and Tweet (1992).
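The statement that the intensity does not vanish between the bulk reciprocal lattice points can be checked with a simple model of the truncation rod. The sketch below uses the standard geometric-series sum over the layers of an ideally terminated semi-infinite crystal, which is not spelled out in the text, and compares it with the 1/(iK · c) form near a Bragg point. The coordinate l is K · c/2π, and all function names are illustrative.

```python
import math


def rod_intensity(l):
    """|sum over n >= 0 of exp(-2*pi*i*l*n)|^2 = 1 / (4*sin^2(pi*l))."""
    return 1.0 / (4.0 * math.sin(math.pi * l) ** 2)


def step_function_intensity(l, l_bragg):
    """|1/(i*K.c)|^2 with K.c measured from the nearest Bragg point."""
    dq = 2.0 * math.pi * (l - l_bragg)
    return 1.0 / dq**2


# Relative intensity along the rod between the Bragg points l = 1 and l = 2
for l in (1.01, 1.1, 1.5, 1.9):
    print(f"l = {l}: rod = {rod_intensity(l):.3f}, "
          f"step approx = {step_function_intensity(l, round(l)):.3f}")
```

Midway between two Bragg points the intensity falls to a small but finite value, and close to a Bragg point the two expressions agree, which is the content of the convolution argument above.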

Solid films of organic molecules can be grown on a gas–water interface; these monomolecular layers are called Langmuir layers. In these films the hydrophilic part of the long molecule is embedded in the water subphase, whereas the hydrophobic part, often aliphatic chains, points towards the gas. Studies of the structures and ordering of Langmuir films are important for many reasons. The films are model systems for studies of two-dimensional ordering processes, including superstructure and domain formation. Intermolecular interactions can be changed within wide limits by varying the surface charge density. Langmuir layers are also important models in membrane biophysics for the study of lipid–lipid and lipid–protein interactions. Films can be “peeled” from the water surface to form mono- or multilayers, called Langmuir–Blodgett films, which have unique optical and electronic properties. Information on the lateral structure of the Langmuir films is based mostly on surface diffraction studies by SR. Because of the relatively large thickness of the film, the contribution of the substrate is rather small. A review of these studies has been given by Als-Nielsen and Mohwald (1991).

D. Microdiffraction

It was already mentioned that the intensity of SR at an ESRF crystallography beamline is $10^{10}$ to $10^{11}$ ph/s in a spot of 10 µm diameter. The beam size can be further reduced by capillary optics, and at the microfocus beamline of the ESRF the flux in a 2 µm$^2$ spot is $4 \times 10^{10}$ ph/s.


FIGURE 19 Geometry of surface diffraction. Specular reflection (top) probes the density profile across the surface layer. When the angle of incidence is less than the critical angle, the evanescent wave is diffracted by the in-plane structure of the surface layer (bottom).

The divergences of the beam increase in proportion, so that such a beam is not suitable for studies where high angular resolution is needed. Some length scales are summarized in Fig. 21. An example of fiber diffraction was given earlier, but there the sample was relatively large. A microfocused beam allows diffraction measurements on a single fiber, and recent examples include structural studies of spider silk and cellulose fibrils in a single wood cell wall (ESRF Highlights, 1999).

Perhaps the most important applications are those where microdiffraction is combined with the use of other microprobes. One such combination is simultaneous mapping of X-ray fluorescence and diffraction patterns. Fluorescence from the sample is recorded by a solid-state detector, such as Li-drifted Si, and the diffraction pattern is recorded by a two-dimensional CCD detector. The probing beam is of the order of 1 µm, and the sample is scanned in even smaller steps. In general, with the resolution of 1 µm, diffraction has become one of the imaging methods. In its most advanced form, tomographic imaging provides three-dimensional morphology and elemental composition of an object, and diffraction adds another dimension by revealing the structures on the molecular and atomic level.

FIGURE 20 Oxygen atoms (black) on the Cu(110) surface (white). There are two different positions for the O atoms, and their density is 2/3 of the Cu atoms in a nonreconstructed layer. The structure is called c6 × 2, and the unit cell is indicated by the rectangle. [From Robinson, I. K., and Tweet, D. J. (1992). Rep. Prog. Phys. 55, 599–651.]

FIGURE 21 Comparison of the spatial resolution of microdiffraction with the resolution of microscopes. Sizes of different objects are indicated in the bottom panel.

E. Powder Diffraction

One of the most important applications of X-ray diffraction in materials science is powder diffraction. This method, which is widely used in nondestructive testing of materials, has experienced a renaissance as a tool in structure research. The high intensity, collimation, and narrow energy band of a monochromatic SR beam have virtually eliminated the instrumental broadening of powder reflections. It is possible to study small samples in special environments, to record the full diffraction cones by two-dimensional detectors for analysis of the texture of the sample, or to insert a crystal analyzer between the sample and detector to eliminate all incoherent scattering. Many interesting materials are not available as sufficiently large crystals for single-crystal diffraction, and powder diffraction may be the only possibility in structural studies of materials with coexisting phases. The density of reflections becomes very large when the unit cell dimensions approach 15 or 20 Å. Several pattern decomposition and indexing programs are available, and direct methods can be used for solving the structure. In the final phase, the atomic positions and even thermal parameters can be refined by the Rietveld method, where the calculated diffraction pattern is fitted to the observed pattern by least-squares methods (see Section X.B).

F. Other Diffraction Studies by SR

The preceding areas of diffraction studies by SR represent only a part of a very broad field. Small-angle scattering (SAXS) is covered by another article; topography has expanded from studies of crystal defects to phase contrast imaging, where the coherence of the SR beam is utilized; structural changes due to magnetic ordering are seen by high-energy diffraction with submicroradian resolution. Magnetic structures themselves can be studied by SR diffraction, as a very large enhancement of the signal is seen near the absorption edges. A new field is diffraction where the energy of the incident beam is tuned across an absorption edge of the compound. This method provides, at least in principle, more complete information than EXAFS spectroscopy. Diffraction of coherent SR reveals correlations in the sample, which are averaged out when incoherent radiation is used. When coherent radiation is scattered from a disordered system, it gives rise to a random but reproducible “speckle” pattern, which is related to the exact spatial arrangement of the disorder. When this arrangement evolves with time, the speckle pattern changes, and observation of intensity fluctuations at a single point of the pattern is a direct measure of the dynamics. The short wavelength of X-rays provides atomic-scale resolution, so that critical fluctuations near an order–disorder transition can be studied, for instance. Still another important field is the study of structures under high pressure. In diamond anvil cells, small amounts of solids can be subjected to pressure that is close to that at the center of the earth. Understanding the structure of solids under such conditions is important in geophysics or even in astrophysics, and high-pressure studies also address the basic questions of interactions between atoms. On the methods side, the continuous and precisely known spectrum of SR makes quantitative diffraction measurements with energy-dispersive techniques possible. In this method, scattering is observed at a fixed angle, and the spectrum is analyzed by a solid-state detector, i.e., the diffraction pattern is recorded in the energy scale using the Bragg law [Eq. (26)].
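The energy-dispersive geometry described above can be sketched numerically. The example assumes that Eq. (26) is the Bragg law in the form λ = 2d sin θ and converts wavelength to photon energy with E = hc/λ. The function names, the cubic lattice constant of silicon used for illustration, and the chosen scattering angle are assumptions of this sketch, not values from the text.

```python
import math

HC_KEV_ANGSTROM = 12.398419843  # h*c in keV*Å

def bragg_energy_kev(d_spacing_angstrom, two_theta_deg, order=1):
    """Photon energy at which planes of spacing d reflect into a fixed angle 2θ."""
    theta = math.radians(two_theta_deg / 2.0)
    wavelength = 2.0 * d_spacing_angstrom * math.sin(theta) / order
    return HC_KEV_ANGSTROM / wavelength

def cubic_d_spacing(a_angstrom, h, k, l):
    """Interplanar spacing of the (hkl) planes in a cubic lattice."""
    return a_angstrom / math.sqrt(h * h + k * k + l * l)

if __name__ == "__main__":
    a_si = 5.431        # Å, silicon lattice constant (illustrative sample)
    two_theta = 10.0    # degrees, fixed detector angle (hypothetical)
    for hkl in [(1, 1, 1), (2, 2, 0), (3, 1, 1)]:
        d = cubic_d_spacing(a_si, *hkl)
        print(hkl, round(d, 4), round(bragg_energy_kev(d, two_theta), 2))
```

Each reflection appears as a peak at its own energy in the detector spectrum, so the whole pattern is collected without moving the sample or the detector.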

In summary, the use of synchrotron radiation has made possible many new fields of research where X-ray diffraction is used. Perhaps the greatest impact is due to the use of the coherence properties of SR in imaging and studies of correlations and, on the other hand, to the combination of various methods at submicron length scales.

IX. NEUTRON SOURCES AND NEUTRON OPTICS

A. Nuclear Reactors and Spallation Sources

The neutrons for scattering experiments are always produced in large facilities, because there is no equivalent to the small X-ray laboratory unit. The size and operation practices of the research reactors and spallation sources are similar to those of synchrotron radiation facilities. The source is surrounded by a large number of experimental stations, which extract different distributions of X-rays or neutrons and have different instruments for the experiments. The most common sources of neutrons are nuclear reactors, which are based on the continuous, self-sustained fission reaction. In the research reactors the power density is maximized in the volume that “leaks” the neutrons out. The fuel rods are made of highly enriched ²³⁵U. The energy spectrum of the fission neutrons is centered at about 1 MeV; most of the neutrons are moderated in the cooling liquid, D2O or H2O, and are absorbed in the fuel to propagate the reaction. As large a fraction as possible is allowed to leak out as fast neutrons into the surrounding moderator and to slow down to thermal equilibrium with this moderator. The mean energy of the Maxwellian distribution at 300 K is 38 meV, which corresponds to 1.8 Å wavelength [cf. Eqs. (1) and (2)].

FIGURE 22 Layout of the experimental facilities of the Institut Laue-Langevin (ILL) in Grenoble, France. The diffraction experiments are indicated by letter “D,” and these include single-crystal and powder diffractometers (e.g., D1) and small-angle scattering instruments (e.g., D11). In addition, there are instruments for inelastic scattering (“IN”) and other experiments. Also, the spectral range of neutrons (hot, thermal, cold) is indicated. [ILL Annual Report 1999; reproduced with permission.]

Neutron beams are extracted through holes that penetrate the moderator. The layout of the experimental stations around the High-Flux reactor at the Institut Laue-Langevin in Grenoble, France, is shown in Fig. 22. To shift the spectrum in energy, a cold source (liquid deuterium at 25 K) and a hot source (graphite at 2400 K) have been inserted in the D2O moderator. These extend the usable wavelength range so that it covers 20 Å to 0.2 Å.
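The connection between moderator temperature, neutron energy, and wavelength can be checked with a short calculation. The sketch assumes the de Broglie relation λ = h/√(2 m_n E) and uses CODATA values for the constants; the function names are illustrative. It reports both k_B T and the mean kinetic energy (3/2) k_B T of a Maxwellian gas. At 300 K these are about 26 and 39 meV, respectively, and a wavelength near 1.8 Å corresponds to E = k_B T.

```python
import math

PLANCK = 6.62607015e-34            # J s
NEUTRON_MASS = 1.67492749804e-27   # kg
ELECTRON_VOLT = 1.602176634e-19    # J
BOLTZMANN_EV = 8.617333262e-5      # eV/K

def thermal_energy_mev(temperature_k):
    """k_B * T in meV."""
    return BOLTZMANN_EV * temperature_k * 1.0e3

def wavelength_angstrom(energy_mev):
    """de Broglie wavelength (Å) of a neutron with kinetic energy in meV."""
    energy_j = energy_mev * 1.0e-3 * ELECTRON_VOLT
    return PLANCK / math.sqrt(2.0 * NEUTRON_MASS * energy_j) * 1.0e10

def energy_mev(wavelength_a):
    """Kinetic energy (meV) of a neutron with the given wavelength in Å."""
    momentum = PLANCK / (wavelength_a * 1.0e-10)
    return momentum ** 2 / (2.0 * NEUTRON_MASS) / ELECTRON_VOLT * 1.0e3

if __name__ == "__main__":
    # Moderator temperatures quoted in the text: cold, thermal, and hot sources.
    for t in (25.0, 300.0, 2400.0):
        kt = thermal_energy_mev(t)
        print(t, "K: kT =", round(kt, 2), "meV, 3kT/2 =", round(1.5 * kt, 2),
              "meV, wavelength at kT =", round(wavelength_angstrom(kt), 2), "Å")
    # Energies at the ends of the quoted wavelength range.
    for lam in (20.0, 0.2):
        print(lam, "Å ->", round(energy_mev(lam), 3), "meV")
```

The wavelengths at k_B T for the three moderators lie well inside the 0.2 to 20 Å range, since each Maxwellian distribution has broad tails on both sides of its characteristic wavelength.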

The second method of producing neutrons is with charged particles (usually protons) striking target nuclei. Short bursts (<1 µs) of about 1 GeV protons produce high-energy “evaporation” neutrons, which have energies extending close to that of the incident proton beam. The neutrons are moderated as rapidly as possible in hydrogenous materials such as polyethylene to provide a short pulse of “slow” (E_n < 10 eV) neutrons. The frequency of the pulses is between 10 and 100 Hz.

There are fundamental differences between experiments carried out at steady reactor sources and those made at pulsed sources. This is illustrated in Fig. 23. At a reactor, a crystal monochromator is used to separate a narrow energy band from the distribution of thermal neutrons, and this band is used for diffraction experiments much like the beam from an X-ray tube. The beam from a pulsed source, on the other hand, has a wide energy distribution, and the time-of-flight (TOF) method is used to make efficient use of it. The pulse length is 1 to 50 µs, depending on energy, and the separation of the pulses is 10 to 100 ms. The average intensity is low, but the intensity within each pulse is very high. With the TOF method, all neutrons scattered to a given angle are exploited. The diffraction pattern is analyzed as a function of the neutron wavelength, which is related to the flight time τ over distance L,

λ (Å) = 0.3956 τ (µs)/L (cm) (47)
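The numerical prefactor in this relation follows from λ = h/(m_n v) with v = L/τ, so that λ = (h/m_n) τ/L. A minimal sketch, assuming CODATA values for h and m_n, evaluates the prefactor in the units used here (about 0.3956 Å·cm/µs) and converts a flight time into a wavelength and, through the Bragg law λ = 2d sin θ, into a d-spacing at a fixed detector angle. The function names and the example flight path are illustrative assumptions.

```python
import math

PLANCK = 6.62607015e-34            # J s
NEUTRON_MASS = 1.67492749804e-27   # kg

def tof_prefactor():
    """h/m_n expressed in Å·cm/µs (m^2/s -> Å: 1e10, per µs: 1e-6, per cm: 1/1e-2)."""
    return PLANCK / NEUTRON_MASS * 1.0e10 * 1.0e-6 / 1.0e-2

def wavelength_from_tof(tau_us, path_cm):
    """Neutron wavelength (Å) from flight time (µs) over a flight path (cm)."""
    return tof_prefactor() * tau_us / path_cm

def d_spacing_from_tof(tau_us, path_cm, two_theta_deg):
    """d-spacing (Å) that reflects neutrons of this flight time into angle 2θ."""
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength_from_tof(tau_us, path_cm) / (2.0 * math.sin(theta))

if __name__ == "__main__":
    path_cm = 1000.0  # hypothetical 10 m flight path
    print("prefactor:", round(tof_prefactor(), 4))
    for tau_us in (1000.0, 5000.0, 20000.0):
        lam = wavelength_from_tof(tau_us, path_cm)
        d = d_spacing_from_tof(tau_us, path_cm, 90.0)
        print(tau_us, "µs ->", round(lam, 3), "Å, d =", round(d, 3), "Å at 2θ = 90°")
```

Because every arrival time maps to a different wavelength, a single detector at a fixed angle records a complete diffraction pattern from each pulse.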

Actually, the TOF method is rather similar to the energy-dispersive diffraction method with X-rays, the difference
