189192401-theory-of-defects-in-semiconductors

275
This book is dedicated to Manuel Cardona, who has done so much for the field of defects in semiconductors over the past decades, and convinced so many theorists to calculate beyond what they thought possible.

Upload: t-louise-blanchard-ms-industrial-engineering

Post on 18-Aug-2015

17 views

Category:

Documents


1 download

TRANSCRIPT

This book is dedicated to Manuel Cardona, who has done somuch for the field of defects in semiconductors over the past

decades, and convinced so many theorists to calculate beyondwhat they thought possible.

Preface

Semiconductor materials emerged after World War II and their impact onour lives has grown ever since. Semiconductor technology is, to a large ex-tent, the art of defect engineering. Today, defect control is often done at theatomic level. Theory has played a critical role in understanding, and thereforecontrolling, the properties of defects.

Conversely, the careful experimental studies of defects in Ge, Si, thenmany other semiconductor materials has a huge database of measured quan-tities which allowed theorists to test their methods and approximations.

Dramatic improvement in methodology, especially density functional the-ory, along with inexpensive and fast computers, has impedance matched theexperimentalist and theorist in ways unanticipated before the late eighties.As a result, the theory of defects in semiconductors has become quantitativein many respects. Today, more powerful theoretical approaches are still beingdeveloped. More importantly perhaps, the tools developed to study defectsin semiconductors are now being adapted to approach many new challengesassociated with nanoscience, a very long list that includes quantum dots,buckyballs and buckytubes, spintronics, interfaces, and many others.

Despite the importance of the field, there have been no modern attemptsto treat the computational science of the field in a coherent manner within asingle treatise. This is the aim of the present volume.

This book brings together several leaders in theoretical research on defectsin semiconductors. Although the treatment is tutorial, the level at which thevarious applications are discussed is today’s state-of-the-art in the field.

The book begins with a ‘big picture’ view from Manuel Cardona, andcontinues with a brief summary of the historical development of the subjectin Chapter 1. This includes an overview of today’s most commonly usedmethod to describe defects.

We have attempted to create a balanced and tutorial treatment of thebasic theory and methodology in Chapters 3-6. They including detailed dis-cussions of the approximations involved, the calculation of electrically-activelevels, and extensions of the theory to finite temperatures. Two emergingelectronic structure methodologies of special importance to the field are dis-cussed in Chapters 7 (Quantum Monte-Carlo) and 8 (the GW method). Thencome two chapters on molecular dynamics (MD). In chapter 9, a combina-

VIII Preface

tion of high-level and approximate MD is developed, with applications tothe dynamics of extended defect. Chapter 10 deals with semiempirical treat-ments of microstructures, including issues such as wafer bonding. (Chapters9 and 10). The book concludes with studies of defects and their role in thephoto-response of topologically disordered (amorphous) systems.

The intended audience for the book is graduate students as well as ad-vanced researchers in physics, chemistry, materials science, and engineering.We have sought to provide self-contained descriptions of the work, with de-tailed references available when needed. The book may be used as a text ina practical graduate course designed to prepare students for research workon defects in semiconductors or first-principles theory in materials sciencein general. The book also serves as a reference for the active theoretical re-searcher, or as a convenient guide for the experimentalist to keep tabs ontheir theorist colleagues.

It was a genuine pleasure to edit this volume. We are delighted with thecontributions provided in a timely fashion by so many busy and accomplishedpeople. We warmly thank all the contributors and hope to have the oppor-tunity to share some nice wine(s) with all of them soon. After all,

When Ptolemy, now long ago,Believed the Earth stood still,He never would have blundered soHad he but drunk his fill.He’d then have felt it circulateAnd would have learnt to say:The true way to investigateIs to drink a bottle a day.

(author unknown)published in Augustus de Morgans A Budget of Paradoxes, (1866).

Athens, Ohio, David A. DraboldLubbock, Texas, Stefan K. Estreicher

February 2006

Contents

ForewordsManuel Cardona . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 Early history and contents of the present volume . . . . . . . . . . . . . . . . 12 Bibliometric studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

1 Defect theory: an armchair historyDavid A. Drabold, Stefan K. Estreicher . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111.2 The evolution of theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131.3 A sketch of first-principles theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

1.3.1 Single particle methods: History . . . . . . . . . . . . . . . . . . . . . . . . . . 161.3.2 Direct approaches to the many-electron problem . . . . . . . . . . . . 181.3.3 Hartree and Hartree-Fock approximations . . . . . . . . . . . . . . . . . . 181.3.4 Density Functional Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

1.4 The Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

2 Supercell methods for defect calculationsRisto M. Nieminen . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 272.2 Density-functional theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292.3 Supercell and other methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302.4 Issues with the supercell method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322.5 The exchange-correlation functionals and the semiconducting gap . 342.6 Core and semicore electrons: pseudopotentials and beyond . . . . . . . . 382.7 Basis sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392.8 Time-dependent and finite-temperature simulations . . . . . . . . . . . . . . 412.9 Jahn-Teller distortions in semiconductor defects . . . . . . . . . . . . . . . . . 42

2.9.1 Vacancy in silicon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 422.9.2 Substitutional copper in silicon . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

2.10Vibrational modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 452.11Ionisation levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462.12The marker method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 482.13Brillouin-zone sampling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

X Contents

2.14Charged defects and electrostatic corrections . . . . . . . . . . . . . . . . . . . . 502.15Energy-level references and valence-band alignment . . . . . . . . . . . . . . 532.16Examples: the monovacancy and substitutional copper in silicon . . . 53

2.16.1Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 552.16.2Calculations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

2.17Summary and conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

3 Marker-method calculations for electrical levels usingGaussian-orbital basis-setsJ P Goss, M J Shaw, P R Briddon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633.2 Computational method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

3.2.1 Gaussian basis-set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 653.2.2 Choice of exponents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 683.2.3 Case study: bulk silicon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 693.2.4 Charge density expansions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

3.3 Electrical levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 733.3.1 Formation energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 743.3.2 Calculation of electrical levels using the Marker Method . . . . . 76

3.4 Application to defects in group-IV materials . . . . . . . . . . . . . . . . . . . . 773.4.1 Chalcogen-hydrogen donors in silicon . . . . . . . . . . . . . . . . . . . . . . 773.4.2 VO-centers in silicon and germanium . . . . . . . . . . . . . . . . . . . . . . 793.4.3 Shallow and deep levels in diamond . . . . . . . . . . . . . . . . . . . . . . . 80

3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

4 Dynamical Matrices and Free energiesStefan K. Estreicher, Mahdi Sanati . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 854.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 854.2 Dynamical matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 874.3 Local and pseudolocal modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 884.4 Vibrational lifetimes and decay channels . . . . . . . . . . . . . . . . . . . . . . . 894.5 Vibrational free energies and specific heats . . . . . . . . . . . . . . . . . . . . . 924.6 Theory of defects at finite temperatures . . . . . . . . . . . . . . . . . . . . . . . . 954.7 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

5 The calculation of free energies in semiconductors: defects,transitions and phase diagramsE. R. Hernandez, A. Antonelli, L. Colombo,, and P. Ordejon . . . . . . . . . 1035.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1035.2 The Calculation of Free energies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

5.2.1 Thermodynamic integration and adiabatic switching . . . . . . . . 1055.2.2 Reversible scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

Contents XI

5.2.3 Phase boundaries and phase diagrams . . . . . . . . . . . . . . . . . . . . . 1105.3 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

5.3.1 Thermal properties of defects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1135.3.2 Melting of Silicon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1165.3.3 Phase diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

5.4 Conclusions and outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

6 Quantum Monte Carlo techniques and defects insemiconductorsR.J. Needs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1276.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1276.2 Quantum Monte Carlo methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

6.2.1 The VMC method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1286.2.2 The DMC method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1296.2.3 Trial wave functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1316.2.4 Optimization of trial wave functions . . . . . . . . . . . . . . . . . . . . . . . 1326.2.5 QMC calculations within periodic boundary conditions . . . . . . 1336.2.6 Using pseudopotentials in QMC calculations . . . . . . . . . . . . . . . 134

6.3 DMC calculations for excited states . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1346.4 Sources of error in DMC calculations . . . . . . . . . . . . . . . . . . . . . . . . . . 1356.5 Applications of QMC to the cohesive energies of solids . . . . . . . . . . . 1366.6 Applications of QMC to defects in semiconductors . . . . . . . . . . . . . . . 136

6.6.1 Using structures from simpler methods . . . . . . . . . . . . . . . . . . . . 1366.6.2 Silicon Self-Interstitial Defects . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1376.6.3 Neutral vacancy in diamond . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1426.6.4 Schottky defects in magnesium oxide . . . . . . . . . . . . . . . . . . . . . . 144

6.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

7 Quasiparticle Calculations for Point Defects atSemiconductor SurfacesArno Schindlmayr, Matthias Scheffler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1497.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1497.2 Computational Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152

7.2.1 Density-Functional Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1527.2.2 Many-Body Perturbation Theory . . . . . . . . . . . . . . . . . . . . . . . . . 155

7.3 Electronic Structure of Defect-Free Surfaces . . . . . . . . . . . . . . . . . . . . 1607.4 Defect States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1637.5 Charge-Transition Levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1687.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

XII Contents

8 Multiscale modelling of defects in semiconductors: a novelmolecular dynamics schemeGabor Csanyi, Gianpietro Moras, James R. Kermode, Michael C.Payne, Alison Mainwood, Alessandro De Vita . . . . . . . . . . . . . . . . . . . . . . 1758.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1758.2 A hybrid view . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1768.3 Hybrid simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1798.4 The LOTF scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1828.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1858.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

9 Empirical molecular dynamics: Possibilities, requirements,and limitationsKurt Scheerschmidt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1959.1 Introduction: Why empirical molecular dynamics ? . . . . . . . . . . . . . . 1959.2 Empirical molecular dynamics: Basic concepts . . . . . . . . . . . . . . . . . . 198

9.2.1 Newtonian equations and numerical integration . . . . . . . . . . . . . 1989.2.2 Particle mechanics and non equilibrium systems . . . . . . . . . . . . 2009.2.3 Boundary conditions and system control . . . . . . . . . . . . . . . . . . . 2029.2.4 Many body empirical potentials and force fields . . . . . . . . . . . . . 2039.2.5 Determination of properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205

9.3 Extensions of the empirical molecular dynamics . . . . . . . . . . . . . . . . . 2079.3.1 Modified boundary conditions: Elastic embedding . . . . . . . . . . . 2079.3.2 Tight-binding based analytic bond-order potentials . . . . . . . . . . 209

9.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2129.4.1 Quantum dots: Relaxation, reordering, and stability . . . . . . . . . 2129.4.2 Bonded interfaces: tailoring electronic or mechanical

properties? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2159.5 Conclusions and outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219

10 Defects in Amorphous Semiconductors: AmorphousSiliconD. A. Drabold and T. A. Abtew . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22510.1Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22510.2Amorphous Semiconductors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22510.3Defects in Amorphous Semiconductors . . . . . . . . . . . . . . . . . . . . . . . . . 228

10.3.1Definition for defect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22810.3.2Long time dynamics and defect equilibria . . . . . . . . . . . . . . . . . . 23010.3.3Electronic Aspects of Amorphous Semiconductors . . . . . . . . . . . 23010.3.4Electron correlation energy: electron-electron effects . . . . . . . . . 232

10.4Modeling Amorphous Semiconductors . . . . . . . . . . . . . . . . . . . . . . . . . . 23310.4.1Forming Structural Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23310.4.2Interatomic Potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234

Contents XIII

10.4.3Lore of approximations in density functional calculations . . . . 23510.4.4The electron-lattice interaction . . . . . . . . . . . . . . . . . . . . . . . . . . . 236

10.5 Defects in Amorphous Silicon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244

11 Light-Induced Effects in Amorphous and Glassy SolidsS.I. Simdyankin, S.R. Elliott . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24711.1Photo-induced metastability in Amorphous Solids: an

Experimental Survey . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24711.1.1Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24711.1.2Photo-induced effects in chalcogenide glasses . . . . . . . . . . . . . . . 249

11.2Theoretical studies of photo-induced excitations in amorphousmaterials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25011.2.1Application of the Density-Functional-based Tight-Binding

method to the case of amorphous As2S3 . . . . . . . . . . . . . . . . . . . 251References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261

List of Contributors

T. A. AbtewDept. of Physics and Astronomy,Ohio University, Athens, OH 45701,[email protected]

A. AntonelliInstituto de Fısica Gleb Wataghin,Universidade Estadual de Campinas,Unicamp, 13083-970, Campinas, SaoPaulo, [email protected]

P.R. BriddonSchool of Natural Science, Universityof Newcastle, Newcastle upon Tyne,NE1 7RU, [email protected]

Manuel CardonaMPI-FKF, Heisenbergstr. 170569 Stuttgart, [email protected]

L. ColomboSLACS (INFM-CNR) and De-partment of Physics, University ofCagliari, Cittadella Universitaria,I-09042 Monserrato (Ca), [email protected]

Alessandro De VitaDepartment of Physics, King’sCollege London, Strand, London,United Kingdom, DEMOCRITOSNational Simulation Center andCENMAT-UTS, Trieste, Italyalessandro.de [email protected]

Gabor CsanyiCavendish Laboratory, University ofCambridge,Madingley Road, CB3 0HE, [email protected]

David A. DraboldDept. of Physics and Astronomy,Ohio University, Athens, OH 45701,[email protected]

S.R. ElliottDepartment of Chemistry, Universityof Cambridge,Lensfield Road, Cam-bridge CB2 1EW, [email protected]

Stefan K. EstreicherPhysics Department, Texas TechUniversity, Lubbock TX 79409-1051,[email protected]

J.P. GossSchool of Natural Science, Universityof Newcastle, Newcastle upon Tyne,NE1 7RU, [email protected]

XVI List of Contributors

E.R. HernandezInstitut de Ciencia de Materials deBarcelona (ICMAB–CSIC), Campusde Bellaterra, 08193 Barcelona,[email protected]

James R. KermodeCavendish Laboratory, University ofCambridge, Madingley Road, CB30HE, [email protected]

Alison MainwoodDepartment of Physics, King’sCollege London, Strand, London,[email protected]

Gianpietro MorasDepartment of Physics, King’sCollege London, Strand, London,[email protected]

R.J. NeedsTheory of Condensed Matter Group,Cavendish Laboratory,University of Cambridge, MadingleyRoad, Cambridge, CB3 0HE, [email protected]

Risto M. NieminenCOMP/Laboratory of Physics,Helsinki University of Technology,POB 1100, FI-02015 HUT, [email protected]

P. OrdejonInstitut de Ciencia de Materials deBarcelona (ICMAB–CSIC), Campusde Bellaterra, 08193 Barcelona,[email protected]

Michael C. PayneCavendishLaboratory, University of Cam-bridge, Madingley Road, CB3 0HE,[email protected]

Mahdi SanatiPhysics Department, Texas TechUniversity, Lubbock TX 79409-1051,[email protected]

Kurt ScheerschmidtMax-Planck Inst. for MicrostructurePhysics, Weinberg 2,D-06120 Halle, [email protected]

Matthias SchefflerFritz-Haber-Institut der Max-Planck-Gesellschaft, Faraday-weg 4–6, D-14195 Berlin-Dahlem,[email protected]

Arno SchindlmayrInstitut fur Festkorperforschung,Forschungszentrum Julich, D-52425 Julich, [email protected]

M.J. ShawSchool of Natural Science, Universityof Newcastle, Newcastle upon Tyne,NE1 7RU, [email protected]

S.I. SimdyankinDepartment of Chemistry, Universityof Cambridge,Lensfield Road, Cam-bridge CB2 1EW, [email protected]

Forewords

Manuel Cardona

Max-Planck-Institut fr Festkrperforschung, 70569 Stuttgart, Germany, EuropeanUnion [email protected]

Man sollte sich mit Halbleitern nicht beschftigen,das sind Dreckeffekte –

wer wei, ob sie richtig existieren.Wolfgang Pauli, 1931

1 Early history and contents of the present volume

This volume contains a comprehensive description of developments in thefield of Defects in Semiconductors which have taken place during the pasttwo decades. Although the field of defects in semiconductors is at least 60years old, it had to wait, in order to reach maturity, for the colossal increase incomputer power that has more recently taken place, following the predictionsof Moores law [1]. The ingenuity of computational theorists in developingalgorithms to reduce the intractable many-body problem of defect and hostto one that can be handled with existing and affordable computer power hasalso played a significant role: much of it is described in the present volume. Ascomputational power grew, the simplifying assumptions of these algorithms,some of them hard to justify, were reduced. The predictive accuracy of thenew calculations then took a great leap forward.

In the early days, the real space structure of the defect had to be postu-lated in order to get on with the theory and self-consistency of the electroniccalculations was beyond reach. During the past two decades emphasis hasbeen placed in calculating the real space structure of defect plus host andachieving self-consistency in the electronic calculations. The results of thesenew calculations have been a great help to experimentalists groping to in-terpret complicated data related to defects. I have added up the number ofreferences in the various chapters of the book corresponding to years before1990 and found that they amount only to 25% of the total number of refer-ences. Many of the remaining 75% of references are actually even more recent,having been published after the year 2000. Thus one can say that the contentsof the volume represent the State of the Art in the field. Whereas most of thechapters are concerned with defects in crystalline semiconductors, Chapters10 and 11 deal with defects in amorphous materials, in particular amorphoussilicon, a field about which much less information is available.

2 Manuel Cardona

The three aspects of the defect problem, real space structure, electronicstructure and vibrational properties are discussed in the various chaptersof the book, mainly from the theoretical point of view. Defects break thetranslational symmetry of a crystal, a property that already made possiblerather realistic calculations of the host materials half a century ago. Smallcrystals and clusters with a relatively small number of atoms (including im-purities and other defects), have become useful to circumvent, in theoreticalcalculations. the lack of translational symmetry in the presence of defectsor in amorphous materials. The main source of uncertainty in the state ofthe art calculations remains the small number of cluster atoms imposed bythe computational strictures. This number is often smaller than that corre-sponding to real world samples, including even nanostructures. Clusters witha number of atoms that can be accommodated by extant computers are thenrepeated periodically so as to obtain a crystal lattice, with a supercell anda mini-Brillouin zone. Although these lattices do not exactly correspond tophysical reality, they enable the use of k-space techniques and are instrumen-tal in keeping computer power to available and affordable levels. Anotherwidespread approach is to treat the cluster in real space after passivating thefictitious surface with hydrogen atoms or the like. When using these methodsit is a good practice to check convergence with respect to the cluster size byperforming similar calculations for at least two sets of clusters with numbersof atoms differing, say, by a factor of two.

The epigraph above, attributed to Wolfgang Pauli, translates as Oneshould not keep busy with semiconductors, they are dirt effects – Who knowswhether they really exist. The 24 authors of this book, like many tens of thou-sands of other physicists and engineers, have fortunately not heeded Pauli’sadvice (given in 1931, 14 years before he received the Nobel Prize). Had theydone it, not only the World would have missed a revolutionary and nowa-days ubiquitous technology, but basic physical science would have lost someof the most fruitful, beautiful and successful applications of Quantum Me-chanics. ‘Dreckeffekte’ is often imprecisely translated as effects of dirt i.e., aseffects of impurities. However, effects of structural defects would also fall intothe category of Dreckeffekte. In Paulis days applications of semiconductors,including variation of resistivity through doping leading to photocells and rec-tifiers, had been arrived at purely empirically, through some sort of trial anderror alchemy. I remember as a child using galena (PbS) detectors in crystalradio sets. I had lots of galena from various sources: some of it worked, somenot but nobody seemed to know why. Sixty years later, only a few monthsago, I was measuring PbS samples in order to characterize the number ofcarriers (of non-stoichiometric origin involving vacancies) and their type (nor p) so as to wrap up original research on this canonical material. [2] TodayGOOGLE lists 860000 entries under the heading ‘defects in semiconductors’.The Web of Science (WoS) lists 3736 mentions in the title and abstract ofsource articles. [3]

Forewords 3

The modern science of defects in semiconductors is closely tied to the in-vention of the transistor at Bell Laboratories in 1948 (by Bardeen, Shockleyand Brattain, [4] Physics Nobel laureates for 1956). Early developments tookplace mainly in the United States, in particular at Bell Laboratories, theLincoln Lab (MIT) and Purdue University. Karl Lark-Horovitz, an Austrianimmigrant, started at Purdue a program to investigate the growth and doping(n and p-type) of germanium and all sorts of electrical and optical propertiesof this element in crystalline form. [5] The initial motivation was the develop-ment of germanium detectors for Radar applications. During the years 1928till his untimely death in 1958 he built up the Physics Department at Purdueinto the foremost center of academic semiconductor research. Work similar tothat at Purdue for germanium was carried out at Bell Labs, also as a spin-offof the development of silicon rectifiers during World War II. At Bell, Scaffet al. [6] discovered that crystalline silicon could be made n- or p-type bydoping with atoms of the fifth (P, As, Sb) or the third (B, Al) column of theperiodic table, respectively. n-type dopants were called donors, p-type onesacceptors. Pearson and Bardeen performed a rather extensive investigationof the electrical properties of intrinsic’ and doped silicon. [7] These authorsproposed the simplest possible expression for estimating the binding energyof the so-called hydrogenic energy levels of those impurities: The ionizationenergy of the hydrogen atom (13.6 eV) had to be divided by the square of thestatic dielectric constant ε (ε = 12 for silicon) and multiplied by an effectivemass (typical values m∗ ∼ 0.1) which simulated the presence of a crystallinepotential. According to this Ansatz, all donors (acceptors) would have thesame binding energy, a fact which we now know is only approximately true(see Fig. 3.5 for diamond).

The simple hydrogenic Ansatz applies to semiconductors with isotropicextrema, so that a unique effective mass can be defined (e.g. n-type GaAs).It does not apply to electrons in either Ge or Si because the conduction bandextrema are strongly anisotropic. The hydrogen-like Schrodinger equationcan, however, be modified so as to include anisotropic masses, as appliy togermanium and silicon. [8] The maximum of the valence bands of most dia-mond and zincblende-like semiconductors occurs at or very close to k=0. Itis four-fould degenerate in the presence of spin-orbit interaction and six-foldif such interaction is neglected. [9] The simple Schrodinger equation of thehydrogen atom must be replaced by a set of four coupled equations (withspin-orbit coupling) with effective-mass parameters to be empirically deter-mined. [10] Extensive applications of Kohn’s prescriptions were performed byseveral Italian theorists. [11]

In the shallow (hydrogenic) level calculations based on effective massHamiltonians the calculated impurity eigenvalues are automatically referredto the corresponding band edges, thus obviating the need for using a marker,of the type discussed in Chapter 3 of this book. This marker was introducedin order to avoid errors inherent to the ‘first principles’ calculations, such as

4 Manuel Cardona

those related to the so called ‘gap problem’ found when using local densityfunctionals to represent many-body exchange and correlation. For a way topalliate this problem using the so-called GW approximation see Chapter 7where defects at surfaces are treated.

We have discussed so far the electronic levels of shallow substitutionalimpurities. In this volume a number of other defects, such as vacancies, in-terstitial impurities, clusters, etc., will be encountered. Energy levels relatedto structural defects were first discussed by Lark-Horovitz and coworkers. [12]These levels were produced by irradiation with either deuterons, alpha parti-cles or neutrons. After irradiation, the material became more p-type. It wasthus postulated that the defect levels introduced by the bombardment wereacceptors (vacancies?).

It was also discovered by Lark-Horovitz that neutron bombardment, fol-lowed by annealing in order to reduce structural damage, could be used tocreate electrically active impurities by nuclear transmutation. [13] The smallamount of the 30Si isotope (∼4%) present in natural Si converts, by neutroncapture, into radioactive 31Si, which decays through β-emission into stable31P, a donor. This technique is still commercially used nowadays for produc-ing very uniform doping concentrations.

Since Kohn-Luttinger perturbation theory predicts reasonably well theelectronic levels of shallow impurities (except for the so-called central cellcorrections [14]) this book covers mainly deep impurity levels which not onlyare difficult to calculate for a given real space structure but also require relax-ation of the unperturbed host crystal around the defect. Among these deeplevels, native defects such as vacancies and self-interstitials are profusely dis-cussed. Most of these levels are related to transition metal atoms, such asMn, Cu, and Au (I call Au and Cu transition metals for obvious reasons).The solubility of these transition metal impurities is usually rather low (lessthan 1015cm−3. Exceptions: Cd1−xMnxTe and related alloys). They can gointo the host lattice either as substitutional or as interstitial atoms,1 a pointthat can be clarified with EPR and also with ab initio total energy calcu-lations. These dopants were used in early applications in order to reducedthe residual conductivity due to shallow levels (because of the fact that tran-sition metal impurities have levels close to the middle of the gap) One caneven nowadays find in the market semi-insulating GaAs obtained by dopingwith chromium. I remember having obtained 1957 semi-insulating germa-nium and silicon (doped with either Mn or Au) with carrier concentrationslower than intrinsic (this makes a good exam question!). They were used formeasurements of the low frequency dielectric constants of these materials, in1 The reader who tries to do a literature search for interstitial gold may be sur-

prised by the existence of homonyms: Interstitial gold is important in the treat-ment of prostate cancer. It has, of course, nothing to do with our interstitialgold. See Lannon et al., British J. Urology 72, 782 (1993).

Forewords 5

particular vs. temperature and pressure [15] while I was working at Harvardon my PhD under W. Paul.

Rough estimates of the positions of deep levels of many impurity elementsin the gap of group IV and II-V semiconductors were obtained by Hjalmarsonet al. using Greens function methods. [9, 16] In the case of GaAs and relatedmaterials, two kinds of defect complexes, involving structural changes andmetastability have received a lot of attention because of technological impli-cations: the so-called EL2 and DX centers. Searching the Web of Science forEL2 one finds 1055 mentions in abstracts and titles of source articles. Like-wise 695 mentions are found for the DX centers. Chadi and coworkers haveobtained theoretical predictions for the structure of these centers and theirmetastability. [17, 18] Although these theoretical models explain a numberof observations related to these centers, there is not yet a general consensusconcerning their structures.

An aspect of the defect problem that has not been dealt with explicitlyin this volume is the errors introduced by using non-relativistic Schrodingerequations, in particular the neglect of mass-velocity corrections and spin-orbitinteraction (the latter, however, is explicitly included in the Kohn-LuttingerHamiltonian, either in its 4×4 or its 6×6 version). Discrepancies betweencalculated and measured gaps are attributed to the ‘gap problem’ inherentin the local density approximation (LDA). However, already for relativelyheavy atoms (Ge, GaAs) the mass-velocity correction decreases the s-likeconduction levels and, together with the LDA gap problem. converts thesemiconductor in a metal in the case of germanium. For GaAs it is statedseveral times in this volume that the LDA calculated gap is about half theexperimental one. This is for a non-relativistic Hamiltonian. Even a scalarrelativistic one reduces the gap even further, to about 0.2eV (experimentalgap: 1.52eV at 4K). [19] This indicates that the gap problem is more seriousthan previously thought on the basis of non-relativistic LDA calculations.

Another relativistic effect is the spin-orbit coupling. For moderately heavyatoms such as Ge, Ga and As the spin-orbit splitting at the top of the va-lence bands (∼0.3eV) is much larger than the binding energy of hydrogenicacceptors. Hence we can calculate the binding energies of the latter by solv-ing the decoupled 4×4 (J=3/2) and 2×2 (J=1/2) effective mass equations.This leads to two series of acceptor levels separated by a ‘spin orbit’ splittingbasically equal to that of the band edge states. In the case of silicon, how-ever, the spin-orbit splitting at k=0 (Δ = 0.044eV) is of the order of shallowimpurity binding energies. The impurity potential thus couples the J=3/2and J=1/2 bands and the apparent spin-orbit splitting of the correspondingimpurity series becomes smaller than that at the band edges. [20, 21] Thedifference between band edge spin-orbit splitting (Δ = 0.014eV) and that ofthe acceptor levels becomes even larger in diamond. Using a simple Greensfunctions technique and a Slater-Koster δ-function potential, the impuritylevel splittings have been calculated and found to be indeed much smaller

6 Manuel Cardona

than Δ = 0.014eV. This splitting depends strongly on the binding energy ofthe impurity.

Another aspect that has hardly been treated in the present volume (see,however, chapter 10 for amorphous silicon) is the temperature dependenceof the electronic energy levels which is induced by the electron-phonon in-teraction. Whenever this question appears in this volume, it is assumed thatwe are in the classical high temperature limit, in which the correspondingrenormalization of electronic gaps and states is proportional to temperature.At low temperatures, the electron-phonon interaction induces a zero-pointrenormalization of the electronic states which can be estimated from themeasured temperature dependence. It is also possible to determine the zero-point renormalization of gaps by measuring samples with different isotopiccompositions. The interested reader should consult the review by Cardonaand Thewalt. [22]

When an atom of the host lattice of a semiconductor has several stable iso-topes (e.g. diamond, Si, Ge, Ga, Zn samples grown with natural material lose,strictly speaking, their translational symmetry. In the past 15 years a largenumber of semiconductors have been grown using isotopically pure elements(which have become available in macroscopic and affordable quantities afterthe fall of the Iron Curtain). A different isotope added to an isotopically puresample can thus be considered as an impurity, probably the simplest kind ofdefect possible: Only the atomic mass of such an impurity differs from thatof the host, the electronic properties remain nearly the same.2 The main ef-fect of isotope mass substitution is found in the vibrational frequencies ofhost as well as local vibrational modes: such frequencies are inversely pro-portional to the square root of the vibrating mass (see Chapter 4). Althoughthis effect sounds rather trivial it often induces changes in phonon widths andin the zero-point anharmonic renormalizations (see Ref. 22) which in somecases can be rather drastic and unexpected. [23] The structural relaxationaround isotopic impurities is rather small. The main such effect correspondsto a increase of the lattice constant with increasing isotopic mass, about0.015% between 12C and 13C diamond. Its origin lies in the change in thezero point renormalization of the lattice constant: ab initio calculations areavailable. [24]

The third class of effects of the isotopic impurities refers to electronicstates and energy gaps and their renormalization on account of the electron-phonon interaction. The zero point renormalizations also vary like the inversesquare root of the relevant isotopic mass. By measuring a gap energy at lowtemperatures for samples with two different isotopic masses, one can extrapo-late to infinite mass and thus determined the unrenormalized value of the gap.Values around 60 meV have been found for Ge and Si. For diamond, however,2 Except for the electron-phonon renormalization of the electronic states and gaps

which is usually rather small. See Ref. 22.

Forewords 7

this renormalization seems to be much larger, [25] around 400 meV.3 Thislarge renormalization is a signature of strong electron-phonon interactionwhich seems to be responsible for the superconductivity recently observedin heavily boron doped (p-type) diamond (Tc higher than 10K). [26, 27] Abinitio calculations of the electronic and vibronic structure of heavily borondoped diamond have been performed and used for estimating the criticaltemperature Tc. [28]

2 Bibliometric studies

In the previous section I have already discussed the number of times certaintopics appear in titles, keywords and abstracts in source journals (about 6000publications chosen by the ISI among ∼100000, as those which contributesignifically to the progress of science). While titles go back to the presentstarting date of the source journal selection (the year 1900), abstracts andkeywords are only collected since 1990. In the Web of Science (WoS) one cancompletely eliminate the latter in order to avoid distortions but, for simplicity,I kept them in the qualitative survey presented here.

In this section, a more detailed bibliometric analysis will be performedusing the WoS which draws on the citation index as the primary data bank.In order to get a feeling for the standing of the various contributors to thisvolume, we could simply perform a citations count (it can be done relativelyeasily within the WoS using the cited reference mode. However, a more taletelling index has been recently suggested by J.E. Hirsch, [29] the so-calledh-index. This index is easily obtained for anyone with access to the WoSgoing back to the first publication of the authors under scrutiny (1974 forNieminen and Shaw). How far back your access to the WoS goes depends onhow much your institution is willing to pay to ISI- Thomson Scientific. Theh-index is obtained by using the general search mode of the WoS and orderingthe results of the search for a given individual according to the number ofcitations (there is a function key to order the authors contributions frommost cited to less cited). You then go down the list till the order number of apaper equals its number of citations (you may have to take one more or lesscitation if equality does not exist). The number so obtained is the h-index. Itrewards more continued, sustained well cited publications rather than only acouple with a colossal number (such as those that deserve the Nobel Prize).Watch out for possible homonyms although, on the average, they appearseldom. They can be purged by hand if the number of terms is not toohigh. I had problems with homonyms only for five out of the 24 (excludingmyself) contributors to this volume (Antonelli, Colombo, Hernndez, Sanatiand Shaw). I simply excluded them from the count.

3 Theorists: beware (and be aware) of this large renormalization when comparingyour fancy GW calculations of gaps with experimental data.

8 Manuel Cardona

The average h-factor of the remaining 19 authors is h=20. Hirsch men-tions in Ref. 29 that recently elected fellows of the American Physical Soci-ety have typically h∼15-20. Advancement of a physicist to full professor ata reputable US university corresponds to h=18. The high average h alreadyreveals the high standing of the authors of this book. In several cases, theauthors involve a senior partner (h=20, Chapters 3,4,5,7,8,10 and 11) anda junior colleague. I welcome this decision. It is a good procedure for intro-ducing junior researchers to the intricacies and ordeals involved in writinga review article of such extent. In this connection, I should mention thatthe h-index is roughly proportional to the scientific age (counted from thefirst publication or the date of the PhD thesis). The values of h given abovefor faculty and NAS membership are appropriate to physicists and chemists.Biomedical scientists often have, everything else being equal, twice as largeh-indexes, whereas engineers and mathematicians (especially the latter) havemuch lower ones.

After having discussed the average h-index of our contributors, I wouldlike to mention the range they cover without mentioning specific names.4 Theh-indexes of our contributors cover the range 6 ≤ h ≤ 64. Four very juniorauthors who have not yet had a chance of being cited have been omitted(one could have set h=0 in their case). Hirsch mentions in his seminal article[29] that election to the National Academy of Sciences of the US is usuallyassociated with h=45. We therefore must have some potential academiciansamong our contributors.

Because of the ease in the use of the h-algorithm just described and itsusefulness to evaluate the ‘impact’ of a scientists career, bibliometrists havebeen looking for other applications of the technique. Instead of people onecan apply it to journals (provided they are not too large in terms of pub-lished articles), institutions, countries, etc. One has to keep in mind that theresulting h-number always reverts to an analysis of the citations of individu-als which are attached to the investigated items (e.g. countries, institutions,etc.). One can also use the algorithm to survey the importance of keywords ortitle subjects. The present volume has 11 chapters and this gives it a certain(albeit small) statistical value to be use in such a survey. We thus attach toeach chapter title a couple of keywords and evaluate the corresponding h-index entering these under ‘topic’ in the general search mode of the WoS. Inthe table below we list these words, the number of items we find for each setof them and the corresponding h-index. There is considerable arbitrariness inthe procedure to choose the keywords but we must keep in mind that theseapplications are just exploratory and at their very beginning.

We display in Table 1 the keywords we have assigned to the eleven chap-ters, the number of terms citing them and the corresponding h-index which4 Mentioning the h-indexes of the authors, one by one, may be invidious. The

interested reader with access to the WoS can do it by following the prescriptiongiven above.

Forewords 9

weights them according to the number of times each citing term is cited. Onecan draw a number of conclusions from this table. Particularly interesting arethe low values of n and h for empirical molecular dynamics, which probablysignals the turn towards ab initio techniques. Amorphous semiconductors, in-cluding defect and the metastabilities induced by illumination plus possiblytheir applications to photovoltaics are responsible for the large values of nand h.

Table 1. Keywords assigned (somewhat arbitrarily) to each of the 11 chapters inthe book together with the corresponding number n of source articles citing themin abstract, keywords or title. Also, Hirsch number h which can be assigned to eachof the chapters according to the keywords.

Chapter Keyword (topic in WoS) n h

1 defects and semiconductors 3735 762 supercell calculations 165 273 Gaussian orbitals 190 274 dynamical matrix 231 265 free energy and defect 494 366 Quantum Monte Carlo 2551 717 point defect and surface 426 388 defect and molecular dynamics 2023 679 empirical molecular dynamics 23 710 defect and amorphous 4492 7711 light and amorphous 5747 87

References

1. G.E. Moore, Electronics 38, 114 (1965).2. R. Sherwin, R.J.H. Clark, R. Lauck, and M. Cardona, Solid State Commun.

134, 265(2005).3. A source article is one published in a Source Journal as defined by the ISI-

Thomson Scientific. There are about 6000 such journals, including all walks ofscienc.e

4. J. Bardeen and W.H. Brattain, Phys. Rev. 74, 230 (1948).5. K. Lark-Horovitz and V.A. Johnson, Phys. Rev. 69, 258 (1946).6. J.A. Scaff, H.C. Theuerer and E.E. Schumacher, J. Metals: Trans. Am. Inst.

Mining and Metallurgical Engineers 185, 383 (1949).7. G.L. Pearson and J. Bardeen, Phys. Rev. 75, 865 (1949).8. W. Kohn and J.M. Luttinger, Phys. Rev. 97, 1721 (1975).9. P.Y. Yu and M. Cardona, Fundamentals of Semiconductors 3rd ed. (Springer,

Berlin, 2005).10. W. Kohn and D. Schechter, Phys. Rev. 99, 1903 (1955).11. A. Baldereschi and N. Lipari, Phys. Rev. B 9, 1525 (1974).

10 Manuel Cardona

12. W.E. Johnson and K. Lark-Horovitz, Phys Rev. 76, 442 (1949); K. Lark-Horovitz, E. Bleuler, R. Davis, and D.Tendam, Phys. Rev. 73, 1256 (1948).

13. K. Lark-Horovitz, Nucleon-Bombarded Semiconductors, in Semiconducting Ma-terials (Butterworths, London, 1950) p. 47.

14. W. Kohn, Solid State Physics (Academic, New York, 1957, Vol 5) p. 255.15. M. Cardona, W. Paul and H. Brooks, J. Phys. Chem. Sol. 8, 204 (1959).16. H.P. Hjalmarson, P. Vogl, D.J. Wolford, and J.D. Dow, Phys. Rev. Lett. 44,

810 (1980).17. EL2 center: D.J. Chadi and K.J. Chang, Phys Rev. Lett. 60, 2187 (1988).18. DX center: S.B. Chang and D.J. Chadi, Phys. Rev. B 42, 7174 (1990).19. M. Cardona, N.E. Christensen and G. Fasol, Phys. Rev. B 38, 1806 (1988).20. N.O. Lipari, Sol. St. Commun. 25, 266 (1978).21. J. Serrano, A. Wysmolek, T. Ruf and M. Cardona Physica B 274, 640 (1999).22. M. Cardona and M.L.V. Thewalt, Rev. Mod. Phys. 77, 1173 (2005).23. J. Serrano, F.J. Manjon, A.H. Romero, F. Widulle, R. Lauck, and M. Cardona,

Phys. Rev. Lett. 90, 055510 (2003).24. P. Pavone and S. Baroni, Sol. St. Commun. 90, 295 (1994).25. M. Cardona, Science and Technology of Advanced Materials, in press. See also

Ref. 22.26. E.A. Ekimov, V.A. Sidorov, E.D. Bauer, N.N. Mel’nik, N.J. Curro, J.D. Thomp-

son and S.M. Stishov, Nature 428, 542 (2004).27. Y. Takano, M. Nagao, I. Sakaguchi, M. Tachiki, T. Hatano, K. Kobayashi, H.

Umezawa, and H. Kawarada, Appl. Phys. Letters 85, 2851 (2004).28. L. Boeri, J. Kortus and O.K. Andersen Phys. Rev. Lett. 93, 237002 (2004).29. J.E. Hirsch, Proc. Nat. Acad. Sc. (USA) 102, 16569 (2005).

1 Defect theory: an armchair history

David A. Drabold and Stefan K. Estreicher

1 Dept. of Physics and Astronomy, Ohio University, Athens, OH [email protected]

2 Physics Department, Texas Tech University, Lubbock TX [email protected]

1.1 Introduction

The voluntary or accidental manipulation of the properties of materials byincluding defects has been performed for thousands of years. The most an-cient example we can think of is well over 5,000 years old. It happened whensomeone realized that adding trace amounts of tin to copper lowers the melt-ing temperature, increases the viscosity of the melt, and results in a metalconsiderably harder than pure copper: bronze. This allowed the manufactureof a variety of tools, shields and weapons. Not long afterwards, the earlymetallurgists realized that sand mixed with a metal is relatively easy to meltand produces glass. The Ancient Egyptians discovered that glass beads ofvarious brilliant colors can be obtained by adding trace amounts of specifictransition metals, such as gold for red or cobalt for blue [1].

Defect engineering is not something new. However, materials whose me-chanical, electrical, optical, and magnetic properties are almost entirely con-trolled by defects are relatively new: semiconductors [2, 3]. Although, thefirst publication describing the rectifying behavior of a contact dates back to1874 [4], the systematic study of semiconductors begun only during WorldWar II. The first task was to grow high-quality Ge (then Si) crystals, that isremoving as many defects as possible. The second task was to manipulate theconductivity of the material by adding selected impurities which control thetype and concentration of charge carriers. This involved theory to understandas quantitatively as possible the physics involved. Thus, theory has played akey role since the very beginning of this field. These early developments havebeen the subject of several excellent reviews [5–8].

For a long time, theory has been trailing the experimental work. Ap-proximations at all levels were too drastic to allow quantitative predictions.Indeed, modeling a perfect solid is relatively easy since the system is periodic.High-level calculations can be done in the primitive unit cell. This periodicityis lost when a defect is present. The perturbation to the defect-free materialis often large, in particular when some of the energy eigenvalues of the defectare in the forbidden gap, far from band edges. However, in the past decade orso, theory has become quantitative in many respects. Today, theorists oftenpredict geometrical configurations, binding, formation, and various activationenergies, charge and spin densities, vibrational spectra, electrical properties,

12 David A. Drabold and Stefan K. Estreicher

and other observable quantities with sufficient accuracy to be useful to ex-perimentalists and sometimes device scientists.

Furthermore, the theoretical tools developed to study defects in semicon-ductors can be easily extended to other areas of materials theory, includ-ing many fields of nanoscience. It is the need to understand the propertiesof defects in semiconductors, in particular silicon, that has allowed theoryto develop as much as it did. One key reason for this was the availabil-ity of microscopic experimental data, ranging from electron paramagneticresonance (EPR) to vibrational spectroscopy, photoluminescence (PL), orelectrical data, all of which provided critical tests for theory at every step.

The word ‘defect’ means a native defect (vacancy, self-interstitial, an-tisite,...), an impurity (atom of a different kind than the host atoms), orany combination of those isolated defects: small clusters, aggregates, or evenlarger defect structures such as precipitates, interfaces, grain boundaries, sur-faces, etc. However, nanometer-size defects play many important roles andare the building blocks of larger defect structures. Therefore, understandingthe properties of defects begins at the atomic scale.

There are many examples of the beneficial or detrimental roles of de-fects. Oxygen and nitrogen pin dislocations in Si and allow wafers to undergoa range of processing steps without breaking. [10] Small oxygen precipitatesprovide internal gettering sites for transition metals, but some oxygen clustersare unwanted donors which must be annealed out. [11] Shallow dopants areoften implanted. They contribute electrons to the conduction band or holes tothe valence band. Native defects, such as vacancies or self-interstitials, pro-mote or prevent the diffusion of selected impurities, in particular dopants.Self-interstitial precipitates may release self-interstitials which in turn pro-mote the transient enhanced diffusion of dopants. [12] Transition metal impu-rities are often associated with electron-hole recombination centers. Hydro-gen, almost always present at various stage of device processing, passivatesthe electrical activity of dopants and of many deep-level defects, or formsextended defect structures known as platelets. [13] Mg-doped GaN must beannealed at rather high temperatures to break up the {Mg,H} complexeswhich prevent p-type doping. [14] Magnetic impurities such as Mn can ren-der a semiconductor ferromagnetic. The list goes on.

Much of the microscopic information about defects comes from electri-cal, optical, and/or magnetic experimental probes. The electrical data areoften obtained from capacitance techniques such as deep-level transient spec-troscopy (DLTS). The sensitivity of DLTS is very high and the presenceof defects in concentrations as low as 1011 cm−3 can be detected. However,even in conjunction with uniaxial stress experiments, these data provide littleor no elemental and structural information and, by themselves, are insuffi-cient to identify the defect responsible for electrical activity. local vibrationalmode (LVM) spectroscopy, Raman, and Fourier-transform infrared absorp-tion (FTIR), often give sharp lines characteristic of the Raman- or IR-active

1 Defect theory: an armchair history 13

LVMs of impurities lighter than the host atoms. When uniaxial stress, anneal-ing, and isotope substitution studies are performed, the experimental dataprovide a wealth of critical information about a defect. This information canbe correlated e.g. with DLTS annealing data. However, Raman and FTIRare not as sensitive as DLTS. In the case of Raman, over 1017 cm−1 defectscenters must be present in the surface layer exposed to the laser. In the caseof FTIR, some 1016 cm−3 defect centers are needed, although much highersensitivities have been obtained from multiple-internal reflection FTIR. [15]Photoluminescence is much more sensitive, sometimes down to 1011 cm−1,but the spectra can be more complicated to interpret. [6] Finally, magneticprobes such as EPR are wonderfully detailed and a lot of defect-specific datacan be extracted: identification of the element(s) involved in the defect andits immediate surrounding, symmetry, spin density maps, etc. However, thesensitivity of EPR is rather low, of the order of 1016 cm−3. Further, localizedgap levels in semiconductors often prefer to be empty or doubly occupied asmost defect centers in semiconductors are unstable in a spin 1

2state. The

sample must be illuminated in order to create an EPR-active version of thedefect under study. [17, 18]

This introductory chapter contains brief reviews of the evolution of theory[20, 31] since its early days and of the key ingredients of today’s state-of-the-art theory. It concludes with an overview of the content of this book.

1.2 The evolution of theory

The first device-related problem that required understanding was the creationof electrons or holes by dopants. These (mostly substitutional) impurities area small perturbation to the perfect crystal and are well described by EffectiveMass Theory (EMT) [21]. The Schrodinger equation for the nearly-free chargecarrier, trapped very close to a parabolic band edge, is written in hydrogenicform with an effective mass determined by the curvature of the band. Thecalculated binding energy of the charge carrier is that of a hydrogen atom butreduced by the square of the dielectric constant. As a result, the associatedwavefunction is substantially delocalized, with an effective Bohr radius some100 times larger than that of the free hydrogen atom.

EMT has been refined in a variety of ways [22] and provided a basicunderstanding of doping. However, it cannot be extended to defects thathave energy eigenvalues far from band edges. These so-called ‘deep-level’defects are not weak perturbations to the crystal and often involve substantialrelaxations and distortions. The first such defects to be studied were thebyproducts of radiation damage, a hot issue in the early days of the cold war.EPR data became available for the vacancy [17, 23] and the divacancy, [24]in silicon (the Si self-interstitial has never been detected). Transition metal(TM) impurities, which are common impurities and active recombinationcenters, have also been studied by EPR. [25]

14 David A. Drabold and Stefan K. Estreicher

In most charge states, the undistorted vacancy (Td symmetry) or di-vacancy (D3d symmetry) is an orbital triplet or doublet, respectively, andtherefore should undergo Jahn-Teller distortions. The EPR studies showedthat this is indeed the case. Although interstitial oxygen, the most commonimpurity in Czochralski-grown Si, was known to be at a puckered bond-centered site [26, 27], it was not realized how much energy is involved inrelaxations and distortions. It was believed that the chemistry of defects insemiconductors is well described in first order by assuming high symmetry,undistorted, lattice sites. Relaxations and distortions were believed to be asecond-order correction. The important issue then was to correctly predicttrends in the spin densities and electrical activities of specific defects centersin order to explain the EPR and electrical data (see e.g. Refs. [28, 29]). Thecritical importance of carefully optimizing the geometry around defects andthe magnitudes of the relaxation energies were not fully realized until the1980s [30]. The host atom displacements can be of several tenths of an A,and the chemical rebonding can lead to energy changes as large as severaleVs (undistorted vs. relaxed structures).

The first theoretical tool used to describe localized defects in semiconduc-tors involved Green’s functions [2, 31–33]. These calculations begin withthe Hamiltonian H0 of the perfect crystal. Its eigenvalues give the crystalsband structure and the eigenfunctions are Bloch or Wannier functions. Inprinciple, the defect-free host crystal is perfectly described. The localized de-fect is represented by a Hamiltonian H′ which includes the defect potentialV. The Green’s function is G(E) = 1/(E− H′). Therefore, the perturbed en-ergies E coincide with its poles. The new eigenvalues include the gap levels ofthe defect and the corresponding eigenfunctions are the defect wavefunctions.In principle, Green’s functions provide an ideal description of the defect inits crystalline environment. In practice, there are many difficulties associatedwith the Hamiltonian, the construction of perfect-crystal eigenfunctions thatcan be used as a basis set for the defect calculation, [34] and the constructionof the defect potential itself. This is especially true for those defects whichinduce large lattice relaxations and/or distortions.

The first successful Green’s functions calculations for semiconductors dateback to the late 1970s. [35–37] They were used to study charged defects,[38,38] calculate forces, [40–42] total energies, [43,44] and LVMs. [45,46] Thesecalculations also provided important clues about the role of native defects inimpurity diffusion. [47] However, while Green’s functions do provide a near-ideal description of the defect in a crystal, their implementation is difficultand not very intuitive. Clusters or supercells are much easier to use andprovide a physically and chemically appealing description of the defect andits immediate surroundings. Green’s functions have mostly been abandonedsince the mid 1980s, but a rebirth within the GW formalism [48] is now takingplace (see the chapter by Schindlmayr and Scheffler).

1 Defect theory: an armchair history 15

In order to describe the distortions around a vacancy, Friedel et al. [50]completely ignored the host crystal and limited their description to rigidlinear combinations of atomic orbitals (LCAO). Messmer and Watkins [51]expanded this approach to linear combinations of dangling-bond states. Thesesimple quantum-chemical descriptions provided a much-needed insight anda correct, albeit qualitative, explanation of the EPR data. Here, the defectwas assumed to be so localized that the entire crystal could be ignored in 0th

order.The natural extension of this work was to include a few host atoms around

the defect, thus defining a cluster. These types of calculations were per-formed in real space with basis sets consisting of localized functions such asGaussians or LCAOs. The dangling bonds on the surface atoms must be tiedup in some way, most often with H atoms. However, without the underlyingcrystal and its periodicity, the band structure is missing and the defect’s en-ergy eigenvalues cannot be placed within a gap. Further, the finite size of thecluster artificially confines the wavefunctions. This affects charged defects themost, as the charge tends to distribute itself on the surface of the cluster.However, the local covalent interactions are well described.

The Schrodinger equation for a cluster containing a defect can be solvedusing almost any electronic structure methods. The early work was empiri-cal or semiempirical, with heavily approximated quantum chemical methods.At first, extended Huckel theory [52, 53] then self-consistent semiempiricalHartree-Fock: CNDO, [54] MNDO, [55] MINDO. [56] Geometries could beoptimized, albeit often with symmetry assumptions. The methods sufferedfrom a variety of problems such as cluster size and surface effects, basis setlimitations, lack of electron correlation, and the use of adjustable parameters.Their values are normally fitted to atomic or molecular data, and transfer-ability is a big issue.

In order to bypass the surface problem, cyclic clusters have been designed,mostly in conjunction with semiempirical Hartree-Fock. Cyclic clusters can beviewed as clusters to which Born-von Karman periodic boundary conditionsare applied. [57, 58] These boundary conditions can be difficult to handle, inparticular when 3- and 4-center interactions are included. [59]

DeLeo and co-workers extensively used the scattering-Xα method in clus-ters [60,61] to study trends for interstitial TM impurities and hydrogen-alkalimetal complexes. The results provided qualitative insight into these issues.Ultimately, the method proved difficult to bring to self-consistency and therather arbitrarily defined muffin-tin spheres rendered it poorly suited to thecalculation of total energies vs. atomic positions.

The method of Partial Retention of Diatomic Differential Overlap [62,63](PRDDO) was first used for defects in diamond and silicon in the mid-1980’s.It is self-consistent, contains no semiempirical parameters, and allows geom-etry optimizations to be performed without symmetry assumptions. Conver-gence is very efficient and relatively large clusters (44 host atoms) could be

16 David A. Drabold and Stefan K. Estreicher

used. However, PRDDO is a minimal basis set technique and ignores electroncorrelation. Its earliest success was to demonstrate [30] the stability of bond-centered hydrogen in diamond and silicon. It was not expected at all that animpurity as light as H could indeed force a Si-Si bond to stretch by a over1A. Substantial progress in the theory of defects in semiconductors occurredin the mid 1980’s with the combination of periodic supercells to representthe host crystal, ab-initio type pseudopotentials [78–80] for the core regions,DF theory for the valence regions, and ab-initio molecular dynamics (MD)simulations [81, 82] for nuclear motion. This combination is now referred toas ‘first-principles’ in opposition to ‘semiempirical’. There are parameters inthe theory. They include the size of the supercell, k-point sampling, type andsize of the basis set, chosen by the user, as well as the parameters associatedwith the basis sets and pseudopotentials. However, these parameters and userinputs are not fitted to an experimental database. Instead, some are deter-mined self-consistently, other are calculated from first principles or obtainedfrom high-level atomic calculations. Note that the first supercell calculationswere done in the 1970’s in conjunction with approximate electronic structuremethods. [83–85] As much as 1.5 to 1.6A. PRDDO was used to study clustersize and surface effects [64] and many defects. [65] It provided good inputgeometries for single point ab-initio Hartree-Fock calculations. [66] However,it suffered from the problems associated with all Hartree-Fock techniques,such as unreasonably large gaps and inaccurate LVMs. A number of researchgroups have used Hartree-Fock and post-Hartree-Fock techniques [67–69] tostudy defects in clusters, but these efforts have now been mostly abandoned.

Density-functional (DF) theory [70, 71] with local basis sets [72] in largeclusters allowed more quantitative predictions. The DF-based AIMPRO code[73, 74] uses Gaussian basis sets, has been applied to many defect problems(this code handles periodic supercells as well). In addition to geometries andenergetics, rather accurate LVMs for light impurities can be predicted. [75,76]Large clusters have been used [77] to study the distortions around a vacancyor divacancy in Si. However, all clusters suffer from the surface problem andlack of periodicity.

1.3 A sketch of first-principles theory

The theoretical approach known as ‘first-principles’ has proven to be a rev-olutionary tool to predict quantitatively some key properties of defects. Anelementary exposition of the theory follows.

1.3.1 Single particle methods: History

After the Born-Oppenheimer approximation is made, so that electronic andionic degrees of freedom are separated, we face the time-independent many-electron problem [71]:

1 Defect theory: an armchair history 17

[−h2/2m∑

j

∇2j −

∑j,l

Zle2/|rj −Rl|+1/2

∑j �=j′

e2/|rj − rj′|−E]Ψ = 0 (1.1)

Here, rj are electron coordinates, Rl and Zl are positions and atomicnumbers of the nuclei and E is the energy. It is worth reflecting on the re-markable simplicity of this equation: it is exact (non-relativistically), and themeaning of each term is entirely transparent. In one of the celebrated legendsof science, P. A. M. Dirac is said to have implied that chemistry was just anapplication of Equation 1.1, though he also acknowledged that the equationwas intractable. It is true that the quantum mechanics of the many-electronproblem is beautifully and succinctly represented in Equation 1.1.

Kohn gives an interesting argument [71] stating that even in principle,Equation 1.1 is hopeless as a practical tool for calculation if the number ofelectrons exceeds of order 103. His argument is a development of a paper ofVan Vleck [86, 87] and points out that it appears to be fundamentally im-possible to obtain an approximate many-electron function Ψ with significantoverlap with the “true” many-body wavefunction for large systems. He fur-ther points out that the sheer dimensionality of the problem rapidly makesit unrepresentable on any conceivable computer. Thus, a credible case canbe made that for large systems it does not even make sense to estimate Ψdirectly. Kohn has named this the “exponential wall”. To some extent this isdisconcerting, because of the simplicity of the form of Equation 1.1, but itpoints to the need for new concepts if we are to make sense of solids – to saynothing of defects!

Empirical experience with solids also suggests that the unfathomable com-plexity of the many-body wavefunction is unnecessary. If all of the informa-tion contained in Ψ was really required for estimating the properties of solidsthat we care about, e.g. experimental observables, molecular physics wouldreach exhaustion with tiny molecules, and solid state physics would neverget off the ground at all. The fact is that many characteristics of materialsare independent of system size. For example, in a macroscopic sample, theelectronic density of states has the same form for a system with N atomsand an identically prepared one with 2N atoms. Yet Ψ is immeasurably morecomplex for the second system than the first. Thus, the additional complexityof the 2N system ψ must be completely irrelevant to our observable, in thiscase the density of states.

The saturation of complexity of the preceding paragraph is connected toKohn’s “principle of nearsightedness” [88], which states that in fact quantummechanics in the solid state is intrinsically local (how local depends sensitivelyon the system [89, 90]). The natural gauge of this locality is the decay ofthe density operator in the position representation: ρ(x,x′) = 〈x|ρ|x′〉: afunction of |x− x′|, decaying as a power law in metals and exponentially insystems with a gap. For systems with a gap, the exponential fall off enablesaccurate calculations of all local properties by undertaking a calculation in a

18 David A. Drabold and Stefan K. Estreicher

finite volume determined by the rate of decay. The decay in semiconductorsand insulators can be exploited to produce efficient order-N methods forcomputing total energies and forces, with computational cost scaling linearlywith the number of atoms or electrons [68].

1.3.2 Direct approaches to the many-electron problem

While this book emphasizes single-particle methods, there is one importantexception. Richard Needs shows in Chapter 6 that remarkable progress canbe made for the computation of expectation values of observables using Quan-tum Monte Carlo methods, with no essential approximations to Equation 1.1.This is a promising class of methods that offers the most accurate calcula-tions available today for complex systems. Several groups are advancing thesemethods, and even the stochastic calculation of forces is becoming possible.One can be certain that Quantum Monte Carlo will play an important role insystems needing the most accurate calculations available, and certainly thisis can be the case for the theory of defects.

1.3.3 Hartree and Hartree-Fock approximations

In 1928, Hartree [92] started with Equation 1.1 with a view to extractinguseful single-particle equations from it. He used the variational principlefor the ground state wavefunction adopting a simple product trial function:Ψ(x1,x2, ...,xn) = ψ1(x1)ψ2(x2)...ψn(xn). The product ansatz did not en-force Fermion antisymmetry requirements; this was built in with a Slater de-terminant ansatz as proposed by Fock in 1930 [93]: the Hartree-Fock method.

These methods map the many-body ground state problem onto a set ofchallenging single-particle equations. The structure of the Hartree equationsis a Schrodinger equation with a potential depending upon all the orbitals:

−h2/2m∇2ψl + Veffψl = εlψl (1.2)

where the effective potential Veff (r) =∑

j �=�

∫d3r′ψ∗

j (r′)ψ�(r′)/|r− r′|, andthe sum is over occupied states. This appealing equation prescribes an effec-tive Coulomb field for electron l arising from all of the other electrons. Sincecomputing the potential requires knowledge of the other wavefunctions, theequation must be solved self-consistently with a scheme of iteration. Whilethe equation is intuitive, it is highly approximate. Curiously, it will turn outthat the equations of density functional theory have the mathematical struc-ture of Equation 1.2, but with quite a different (and far more predictive) Veff

(see Equation 2.7 below), derived from a very different point of view.The Hartree-Fock approximation includes that part of the exchange en-

ergy implied by the exclusion principle and has well known analytic problemsat the Fermi surface in metals [68] and a tendency to exaggerate charge fluc-tuations at atomic sites in molecules or solids [94]. In molecular calculations,

1 Defect theory: an armchair history 19

the correlation energy (roughly speaking, what is missing from Hartree-Fock)can be estimated in perturbation theory. This is computationally expensive:the most popular fourth order perturbation theory (MP4) scales like N7

e ,where Ne is the number of electrons. Such methods are important for molec-ular systems, but challenging to apply to condensed systems.

1.3.4 Density Functional Theory

Thomas-Fermi Model

Not long after the dawn of quantum mechanics, Thomas and Fermi [96] sug-gested a key role for the electron density distribution as the determiner ofthe total energy of an inhomogeneous electron gas. This was the first seriousattempt to express the energy as a functional of the electron density ρ:

E[ρ] =∫d3rV (r)ρ(r)+e2/2

∫d3rd3r′ρ(r)ρ(r′)/|r−r′|+α/m

∫d3rρ5/3(r).

(1.3)Here, α is a numerical constant, m and e are the electron mass and chargerespectively, and V is an external potential. The first two terms are evidentlyobtained from classical electrostatics. Quantum mechanics appears only inthe third term, derived from the kinetic energy of a homogeneous electron gas.Already this equation is making a “local density approximation”, forming anestimate for the inhomogeneous electron kinetic energy from the result fromhomogeneous gas; an approximation that is expected to succeed if |∇ρ|/ρis sufficiently small. Variation of Equation 1.3 with respect to ρ with fixedelectron number leads to the coupled self-consistent Thomas Fermi equations(see e.g., the treatment by Fulde [94]). The coarse treatment of the kineticenergy greatly limits the predictive power of the method – for example, theshell structure of atoms is not predicted [97]. However, the Thomas-Fermimodel is conceptually a density functional theory.

Modern Density Functional Theory

For atomistic force calculations on solids, the overwhelming method of choiceis density functional theory, due to Kohn, Hohenberg and Sham [98]. Thischapter only sketches the concepts in broad outline, as a detailed treatmentfocused on defect calculations is available in this book (Niemenen chapter).For additional discussion we strongly recommend the books of Martin [68]and Fulde [94].

The following statements embody the foundation of zero-temperaturedensity functional theory:

(1) The ground state energy of a many electron system is a functional ofthe electron density ρ(x):

20 David A. Drabold and Stefan K. Estreicher

E[ρ] =∫d3xV (x)ρ(x) + F [ρ], (1.4)

where V is an external potential (due for example to interaction with ions,external fields, e.g., not with electrons), and F [ρ] is a universal functional ofthe density. The trouble is that F [ρ] is not exactly known, though there iscontinuing work to determine it. The practical utility of this result derivesfrom:

(2) The functional E[ρ] is minimized by the true ground state electrondensity.

It remains to estimate the functional F [ρ], which in conjunction with thevariational principle (2), enables real calculations. To estimate F [ρ], the usualprocedure is to note that we already know some of the major contributionsto F [ρ], and decompose the functional in the form:

F [ρ] = e2/2∫d3x d3x′ ρ(x) ρ(x′)/|x− x′| + Tni(ρ) + Exc(ρ) . (1.5)

Here, the integral is just the electrostatic (Hartree) interaction of theelectrons, Tni is the kinetic energy of a noninteracting electron gas ofdensity ρ, and Exc(ρ) is yet another unknown functional, the exchange-correlation functional, which includes non classical effects of the interact-ing electrons. Eq. 2.5 is difficult to evaluate directly in terms of ρ, becauseof the term Tni. Thus, one introduces single electron orbitals |χi〉, for whichTni =

∑i occ〈χi|−h2/2m∇2|χi〉, and ρ =

∑i occ |χi(x)|2 is the charge density

of the physically relevant interacting system. The value of this decomposi-tion is that Exc(ρ) is smooth and reasonably slowly varying, and therefore afunctional that we can successfully approximate: we have included the mostdifficult and rapidly varying parts of F in Tni and the Hartree integral, as canbe seen from essentially exact many-body calculations on the homogeneouselectron gas [99]. The Hartree and non-interacting kinetic energy terms areeasy to compute and if one makes the local density approximation, that is as-suming that the electron density islocally uniform). With information aboutthe homogeneous electron gas, the functional (Eq. 2.5) is fully specified.

With noninteracting orbitals |χi〉, (with ρ(x) = 2∑

i occ |〈x|χi〉|2), thenthe minimum principle plus the constraint that 〈χi|χj〉 = δij can be trans-lated into an eigenvalue problem for the |χi〉:

{−h2∇2/2m+ Veff [ρ(x)]}|χi〉 = εi|χi〉, (1.6)

where the effective density-dependent (in practical calculations, orbital-dependent) potential Veff is:

Veff [ρ(x)] = V (x) + e2∫d3x′ρ(x′)/|x− x′| + δεxc/δρ (1.7)

In this equation εxc is the parameterized exchange-correlation energy densityfrom the homogeneous electron gas. The quantities to be considered as physi-cal in local density functional calculations are: the total energy (electronic or

1 Defect theory: an armchair history 21

system), the ground state electronic charge density ρ(x), and related groundstate properties like the forces. In particular, it is tempting to interpret the|χi〉 and εi as genuine electronic eigenstates and energies, and indeed this canoften be useful. Such identifications are not rigorous [68]. It is instructive tonote that the starting point of density functional theory was to depart fromthe use of orbitals and formulate the electronic structure problem rigorouslyin terms of the electron density ρ; yet a practical implementation (whichenables an accurate estimate of the electronic kinetic energy) led us imme-diately back to orbitals! This illustrates why it would be very worthwhileto know F (ρ), or at least the kinetic energy functional since we would thenhave a theory with a structure close to Thomas-Fermi form and would there-fore be able to seek one function ρ rather than the cumbersome collection oforthonormal |χi〉.

The initiation of modern first-principles theory and its development intoa standard method with widespread application was due to Car and Par-rinello. [81] They developed a powerful method for simultaneously solvingthe electronic problem and evolving the positions of the ions. The method isusually applied to plane wave basis sets though the key ideas are indepen-dent of the choice of representation. One of the early application to defects insilicon was the diffusion of bond-centered hydrogen. [101] An alternative ab-initio approach to MD simulations, based on a tight-binding perspective, wasproposed by Sankey and co-workers. [82] Their basis sets consist of pseudo-atomic orbitals with s, p, d, . . . symmetry. The wavefunctions are truncatedbeyond some radius and renormalized. The early version of this code wasnot self-consistent and was restricted to minimum basis sets. A more recentversion [102] is self-consistent and can accommodate expanded and polarizedbasis sets. This is also the case for the highly flexible SIESTA code [10, 11](Spanish Initiative for the Electronic Structure with Thousands of Atoms).Although basis sets of local orbitals (typically, LCAOs) are highly intuitiveand allow population analysis and other chemical information to be calcu-lated, they are less complete than plane wave basis sets. The latter can easilybe checked for convergence. On the other hand, when an atom such as Si isgiven two sets of 3s and 3p plus a set of 3d orbitals, the basis set is sufficientto describe quite well virtually all the chemical interactions of this element,as the contribution of the n = 4 shell of Si is exceedingly small, except un-der extreme conditions that ground-state theory is not capable of handlinganyway. Many details of the implementation of density functional methodsis given in the contribution of Nieminen in Chapter 2.

The power of these methods is that they yield parameter free estimatesfor the structure of defects and even topologically disordered systems, provideaccurate estimates of total energies, formation energies, vibrational states, de-fect dynamics, and with suitable caveats information about electronic struc-ture, defect levels, localization, etc. There are many subtle aspects to their

22 David A. Drabold and Stefan K. Estreicher

applications to defect physics, partly because high accuracy is often requiredin such studies.

1.4 The Contributions

In Chapter 2, Nieminen carefully discusses the use of periodic boundary con-ditions in supercells calculations, detailing both strengths and weaknesses,and other basic features of these calculations. In Chapter 3, Goss and cowork-ers discuss the marker method to extract electronic energy levels, and includeseveral important applications. In Chapter 4, Estreicher and Sanati describethe calculation and remarkable utility of vibrational modes in systems con-taining defects, including novel analysis of finite-temperature properties ofdefects. Then, in Chapter 5, Hernandez and co-workers work out a propertheory of free energies and phase diagrams for semiconductors. Chapter 6describes the most rigorous attack of the book on the quantum many-bodyproblem, as Needs explores Quantum Monte Carlo methods and their promisefor defects in semiconductors. Schindlmayr and Scheffler describe the theoryand application of self-energy corrected density functional theory, the ‘GW’approximation in Chapter 7. Like the work of Needs, this technique has pre-dictive power beyond DFT. Csanyi and coworkers present a multiscale mod-eling approach in Chapter 8 with applications. Scheerschmidt offers a com-prehensive view of molecular dynamics in Chapter 9, focusing on empiricalmethods, though much of his chapter is applicable to first principles simula-tion as well. In Chapter 10, Drabold and Abtew discuss defects in amorphoussemiconductors with a special emphasis on hydrogenated amorphous silicon.Last but not least, in Chapter 11, Simdyankin and Elliott write on the theoryof light-induced effects in amorphous materials, an area of great basic andpractical interest, which nevertheless depends very much upon the defectspresent in the material.

References

1. J.L. Mass, M.T. Wypyski, and R.E. Stone, Ma. Res. Soc. Bull. 26, 38 (2001).2. A.M. Stoneham, Theory of Defects in Solids (Clarendon, Oxford, 1985).3. H.J. Queisser and E.E. Haller, Science 281, 945 (1998).4. F. Braun, Ann. Phys. Chem. (Leipzig) 153, 556 (1874).5. A.B. Fowler, Phys. Today, October 1993, p. 59.6. F. Seitz, Phys. Today, January 1995, p. 22.7. M. Riordan and I. Hoddeson, Phys. Today, December 1997, p. 42.8. I.M. Ross, Phys. Today, December 1997, p. 34.9. J.R. Chelikowsky, Interface 14, 24 (2005).

10. Oxygen in Silicon, ed. F. Shimura, Semic. and Semimetals 42 (Academic,Boston, 1994)

1 Defect theory: an armchair history 23

11. Early Stages of Oxygen Precipitation in Silicon, ed. R. Jones (Kluwer, Dor-drecht, 1996)

12. R. Scholtz, U. Gosele, J.Y.Y.Huh, and T.Y. Tan, Appl. Phys. Lett. 72, 200(1998).

13. S.K. Estreicher, Mat. Sci. Engr. R 14, 319 (1995)14. S.J. Pearton, in GaN and Related Materials, ed. S.J. Pearton (Gordon and

Breach, Amsterdam, 1997), p. 33315. F. Jiang, M. Stavola, A. Rohatgi, D. Kim, J. Holt, H. Atwater, and J. Kalejs,

Appl. Phys. Lett. 83, 931 (2003)16. G. Davies, Phys. Rep. 176, 83 (1989)17. G.D. Watkins, in Deep Centers in Semiconductors, ed. S.T. Pantelides (Gor-

don and Breach, New York, 1986), p. 14718. Y.V. Gorelkinskii and N.N. Nevinnyi, Physica B 170, 155 (1991) and Mat.

Sci. Engr. B 36, 133 (1996)19. S.T. Pantelides, in Deep Centers in Semiconductors, ed. S.T. Pantelides (Gor-

don and Breach, New York, 1986)20. S.K. Estreicher, Materials Today 6 (6), 26 (2003)21. W. Kohn, Sol. St. Phys. 5, 257 (1957)22. S.T. Pantelides, Rev. Mod. Phys. 50, 797 (1978)23. G.D. Watkins, in Radiation Damage in Semiconductors (Dunod, Paris, 1964),

p. 9724. G.D. Watkins and J.W. Corbett, Phys. Rev. 138, A543 (1965)25. G.W. Ludwig and H.H. Woodbury, Sol. St. Phys. 13, 223 (1962)26. W. Kaiser, P.H. Keck, and C.F. Lange, Phys. Rev. 101, 1264 (1956)27. J.W. Corbett, R.S. McDonald, and G.D. Watkins, J. Phys. Chem. Sol. 25,

873 (1964)28. G.G. DeLeo, G.D. Watkins, and W. Beal Fowler, Phys. Rev. B 25, 4972 (1982)29. A. Zunger and U. Lindefelt, Phys. Rev. B 26, 5989 (1982)30. T.L. Estle, S.K. Estreicher, and D.S. Marynick, Hyp. Inter. 32, 637 (1986)

and Phys. Rev. Lett. 58, 1547 (1987)31. S.T. Pantelides in Deep Centers in Semiconductors, ed. S.T. Pantelides (Gor-

don and Breach, New York, 1986)32. G.F. Koster and J.C. Slater, Phys. Rev. 95, 1167 (1954) and 96, 1208 (1954)33. J. Callaway, J. Math. Phys. 5, 783 (1964)34. J. Callaway and A.J. Hughes, Phys. Rev. 156, 860 (1967) and 164, 1043

(1967)35. J. Bernholc and S.T. Pantelides, Phys. Rev. B 18, 1780 (1978)36. J. Bernholc, N.O. Lipari, and S.T. Pantelides, Phys. Rev. Lett. 41, 895 (1978)37. G.A. Baraff and N. Schluter, Phys. Rev. Lett. 41, 892 (1978)38. G.A. Baraff, E.O. Kane, and M. Schluter, Phys. Rev. Lett. 43, 956 (1979)39. G.A. Baraff, and M. Schluter, Phys. Rev. B 30, 1853 (1984)40. M. Scheffler, J.P. Vigneron, and G.B. Bachelet, Phys. Rev. Lett. 49, 1756

(1982)41. U. Lindefelt, Phys. Rev. B 28, 4510 (1983)42. U. Lindefelt and A. Zunger, Phys. Rev. B 30, 1102 (1984)43. G.A. Baraff and M. Schluter, Phys. Rev. B 28, 2296 (1983)44. R. Car, P.J. Kelly, A. Oshiyama, and S.T. Pantelides, Phys. Rev. Lett. 52,

1814 (1984)45. D.N. Talwar, M. Vandevyver, and M. Zigone, J. Phys. C 13, 3775 (1980)

24 David A. Drabold and Stefan K. Estreicher

46. R.M. Feenstra, R.J. Hauenstein, and T.C. McGill, Phys. Rev. B 28, 5793(1983)

47. R. Car, P.J. Kelly, A. Oshiyama, and S.T. Pantelides, Phys. Rev. Lett. 54 360(1985)

48. F. Aryasetiawan and O. Gunnarsson, Rep. Prog. Phys. 61, 237 (1998)49. M. Scheffler and A. Schindlmayr, in Theory of Defects in Semiconductors, ed.

D. A. Drabold and S.K. Estreicher (Springer, Berlin, 2006)50. J. Friedel M. Lannoo, and G. Leman, Phys. Rev. 164, 1056 (1967)51. R.P. Messmer and G.D. Watkins, Phys. Rev. Lett. 25, 656 (1970) and Phys.

Rev. B 7, 2568 (1973)52. F.P.Larkins, J. Phys. C: Sol. St. Phys. 4, 3065 (1971)53. R.P. Messmer and G.D. Watkins, Proc. Reading Conf. on Radiation Dam-

age and Defects in Semiconductors, e.d. J.E.Whitehouse (Insitute of Physics,1973), p. 255

54. A. Mainwood, J. Phys. C: Sol. St. Phys. 11, 2703 (1978)55. J.W. Corbett, S.N. Sahu, T.S. Shi, and L.C. Snyder, Phys. Lett. 93A, 303

(1983)56. P.Deak and L.C. Snyder, Radiation Effects and Defects in Solids 111-112, 77

(1989)57. P. Deak and L.C. Snyder, Phys. Rev. B 36, 9619 (1987)58. P. Deak, L.C. Snyder, and J.W. Corbett, Phys. Rev. B 45, 11612 (1992)59. J. Miro, P. Deak, C.P. Ewels, and R. Jones, J. Phys.: Cond. Matt. 9, 9555

(1997)60. G.G. DeLeo, G.D. Watkins, and W.B. Fowler, Phys. Rev. B 23, 1819 (1981)61. G.G. DeLeo, W.B. Fowler, and G.D. Watkins, Phys. Rev. B 29, 1851 (1984)62. T.A. Halgren and W.N. Lipscomb, J. Chem. Phys. 58, 1569 (1973)63. D.S. Marynick and W.N. Lipscomb, Proc. Nat. Acad. Sci. (USA) 79, 1341

(1982)64. S.K. Estreicher, A.K. Ray, J.L. Fry, and D.S. Marynick, Phys. Rev. Lett. 55,

1976 (1985)65. See e.g. S.K. Estreicher, Phys. Rev. B 41, 9886 (1990) and 41, 5447 (1990)66. See e.g. S.K. Estreicher, Phys. Rev. B 60, 5375 (1999)67. A.A. Bonapasta, A. Lapiccirella, N. Tomassini, and M. Capizzi, Phys. Rev. B

36, 6228 (1987)68. E. Artacho and F. Yndurain, Sol. St. Comm. 72, 393 (1989)69. R. Luchsinger, P.F. Meier, N. Paschedag, H.U. Suter, and Y. Zhou, Phil.

Trans. R. Soc. Lond. A 350, 203 (1995)70. P. Hohenberg and W. Kohn, Phys. Rev. 136B, 468 (1964); W. Kohn and L.J.

Sham, Phys. Rev. 140A, 1133 (1965); L.J. Sham and W. Kohn, Phys. Rev.145, 561 (1966)

71. W. Kohn, Rev. Mod. Phys. 71, 1253 (1999)72. See, e.g., M. Saito and A. Oshiyama, Phys. Rev. B 38, 10711 (1988)73. R. Jones and A. Sayyash, J. Phys. C 19, L653 (1986)74. R. Jones and P.R. Briddon, in Identification of Defects in Semiconductors, ed.

M. Stavola, Semiconductors and Semimetals 51 (Academic, Boston, 1998)75. J.D. Holbech, B. Bech Nielsen, R. Jones, P. Sitch, and S. Oberg, Phys. Rev.

Lett. 71, 875 (1993)76. R. Jones, B.J. Coomer, and P.R. Briddon, J. Phys. : Condens. Matter 16,

S2643 (2004)

1 Defect theory: an armchair history 25

77. S. Ogut and J.R. Chelikowsky, Phys. Rev. Lett. 83, 3852 (1999) and Phys.Rev. B 64, 245206 (2001)

78. D.R. Hamann, M. Schluter, and C. Chiang, Phys. Rev. Lett. 43, 1494 (1979)79. G.B. Bachelet, D.R. Hamann, and M. Schluter, Phys. Rev. B 26, 4199 (1982)80. L. Kleinman and D.M. Bylander, Phys. Rev. Lett. 48, 1425 (1982)81. R. Car and M. Parrinello, Phys. Rev. Lett. 55, 2471 (1985)82. O. Sankey and D.J. Niklewski, Phys. Rev. B 40, 3979 (1989)83. A. Zunger and A. Katzir, Phys. Rev. B 11, 2378 (1975)84. S.G. Louie, M. Schluter, J.R. Chelikowsky, and M.L. Cohen, Phys. Rev. B 13,

1654 (1976)85. W. Pickett, M.L. Cohen, and C. Kittel, Phys. Rev. B 20, 5050 (1979)86. J. H. Van Vleck, Phys. Rev. 49 232 (1936).87. In this prescient paper on Heisenberg’s theory of ferromagnetism, Van Vleck

introduces what later was called the ‘Van Vleck Catastrophe’, and emphasizesthe fundamentally non-local character of quantum mechanics as expressedin Equation 1.1, the associated factorial growth in complexity, and its direimplications to attempts to compute many-particle states for large systems.

88. W. Kohn, Phys. Rev. Lett, 76 3168 (1996).89. X. P. Li, R. W. Nunes and D. Vanderbilt, Phys. Rev. B 47 10891 (1993).90. S. N. Taraskin, D. A. Drabold and S. R. Elliott, Phys. Rev. Lett. 88 196405

(2002).91. See R. M. Martin, Electronic Structure, Basic Theory and Practical Applica-

tions, Cambridge University Press, Cambridge (2004).92. D. R. Hartree, Proc. Cambridge Philos. Soc. 24 89 (1928).93. F. Fock, Z. Phys. 61 126 (1930).94. P. Fulde, Electron Correlations in Molecules and Solids, Springer, Berlin

(1993).95. L. H. Thomas, Proc. Cambridge Philos. Soc. 23 542 (1927).96. E. Fermi, Z. Phys. 48 73 (1928).97. R. G. Parr and W. Yang, Density-functional theory of atoms and molecules,

Oxford, Clarendon (1989).98. P. Hohenberg and W. Kohn, Phys. Rev. 136 B864 (1964); W. Kohn and L.

J. Sham, Phys. Rev. 140 A1133 (1965).99. D. M. Ceperley and G. J. Alder, Phys. Rev. Lett. 45, 566 (1980).

100. P. Ordejon, E. Artacho and J. M. Soler, Phys. Rev. B 53, R10441 (1996);101. F. Buda, G.L. Chiarotti, R. Car, and M. Parrinello, Phys. Rev. Lett. 63, 4294

(1989)102. A.A. Demkov, J. Ortega, O.F. Sankey, and M.P. Grumbach, Phys. Rev. B 53,

10441 (1995)103. D. Sanchez-Portal, P. Ordejon, E. Artacho, and J.M. Soler, Int. J. Quant.

Chem. 65, 453 (1997)104. E. Artacho, D. Sanchez-Portal, P. Ordejon, A. Garcıa, and J.M. Soler, Phys.

Stat. Sol. (b) 215, 809 (1999)

2 Supercell methods for defect calculations

Risto M. Nieminen

COMP/Laboratory of Physics, Helsinki University of Technology, POB 1100,FI-02015 HUT, Finland [email protected]

2.1 Introduction

An important role for theory and computation in studies of defects in semi-conductors is to provide means for reliable, robust interpretation of defectfingerprints observed by many different experimental techniques, such asdeep-level transient spectroscopy, various methods based on positron annihi-lation (PA), local-vibrational-mode (LVM) spectroscopy, and spin-resonancetechniques. [1] High-resolution studies can provide a dizzying zoo of defect-related features, the interpretation and assignment of which requires accuratecalculations of both the defect electronic and atomic properties as well asquantitative theory and computation also for probe itself.

Materials processing can also draw significant advantages of the predictivecomputational studies. The kinetics of defect diffusion and reactions during

thermal treatment depend crucially on the atomic-scale energetics (for-mation energies, migration barriers, etc., derived from the bond-breaking,bond-making and other chemical interactions between the atoms in question.Accurately calculated estimates for defect and impurity energetics can consid-erably facilitate the design of strategies for doping and thermal processing toachieve the desired materials properties. Parameter values calculated atom-istically, from the electronic degrees of freedom, can be fed into multiscalemodeling methods for defect evolution, such as kinetic Monte Carlo, cellularautomaton, or phase-field simulation tools. [2]

A defect in otherwise perfect material breaks the crystalline symmetryand introduces the possibility of localized electronic states in the funda-mental semiconducting or insulating gap. Depending on the position of thehost-material Fermi level of the host semiconductor, these states are eitherunoccupied or occupied by one or more electrons, depending on their degen-eracy and the spin assignment. The defects thus appear in different chargestates, with varying degrees of charge localization around the defect center.To achieve a new ground state, the neighboring atoms relax around the defectto new equilibrium positions. For a point defect a new point symmetry groupcan be defined.

The energy levels in the gap, also known as ‘ionization levels’, ‘occupa-tion energy levels’ or ‘transition levels’, correspond to the Fermi level posi-tions where the ground state of the system changes from one charge state

28 Risto M. Nieminen

to another. Their values can be computationally estimated by the ‘ΔSCF’approach, i.e. from total-energy differences between different charge states.These gap levels can be probed experimentally by temperature-dependentHall conductivity measurements, deep-level-transient spectroscopy (DLTS),and photoluminescence (PL). The nature of the defect-related gap-state wavefunctions can be examined with electron paramagnetic resonance (EPR) andelectron-nuclear double-resonance (ENDOR) measurements.

Experimentally, the defect energy levels can often be located with the pre-cision of the order of 0.01 eV with respect to host material band edges. Theproper interpretation of the levels and the physical features associated withthem pose demanding challenges for theory and computation. At present,the computational accuracy of the level position is typically a few tenths ofan eV. It follows that sometimes different calculations for the same physicalsystem lead to very different conclusions and interpretations of the experi-ments. Defect calculations may also fail to predict correctly the symmetry-breaking distortions around defects undisputedly revealed by experiments.While theoretical modeling has had considerable success and a major impactin semiconductor physics, there are still severe limitations to its capabilities.The purpose of this chapter is to point out and discuss these.

Density-functional theory (DFT) [3] is the workhorse of atomic-scale com-putational materials science and is also widely used to study defects in semi-conductors, predict their structures and energetics, vibrational and diffusionaldynamics, and elucidate their electronic and optical properties, as observedby the various experimental probes. The central quantity in DFT is the elec-tron density. The ground-state total energy is a functional of the electrondensity and can be obtained via variational minimization of the functional.The wave-mechanical kinetic energy part of the total-energy functional isobtained not from the density directly but through a mapping to a non-interacting Kohn-Sham system. The mean-field electrostatic Hartree energyis obtained from the density, as are in principle the remaining exchange andcorrelation terms, coded into the exchange-correlation functional.

The purpose of this chapter is to examine critically the methodologicalstatus of calculations of defects in semiconductors, especially those basedon the so-called supercell methods. As will be discussed in more detail be-low, there are three main sources of error in such calculations. The first isthe proper quantum-mechanical treatment of electron-electron interactions,which at the DFT level is dependent on the choice of the exchange-correlationfunctional. The popular local- or semi-local density approximations lead tounderestimation, sometimes serious, of the semiconducting gap. The under-lying reasons for this are the neglect of self-interactions and the unphysicalcontinuity of the exchange-correlation energy functional as a function of leveloccupation number. This has naturally serious consequences to the mappingof the calculated defect electronic levels onto the experimental energy gap.The second source of errors is related to the geometrical description of the

2 Supercell methods for defect calculations 29

defect region, often known as the finite-size effect. The third source of erroris the numerical implementation of DFT, i.e. the self-consistent solution ofthe effective-particle Kohn-Sham equations and the evaluation of the totalelectronic energy.

2.2 Density-functional theory

The quantum physics of electrons in materials is governed by the Schrodingerequation. Density-functional theory (DFT) provides a parameter-free frame-work for casting the formidable many-electron problem to a numericallytractable form involving only the three spatial coordinates of the N inter-acting electrons (as opposed to 3N in the full solution of the Schrodingerequation).

The fundamental starting point is the DFT expression for the total energyEtot of the electronic system:

Etot [{ψiσ(r), {Rα } ] = h2

2m

∑i,σ

⟨ψiσ(r)

∣∣−12∇2

∣∣ψiσ(r)⟩

+e2

2

∫ ∫drdr′ n(r)n(r′)

|r−r′| +∫drVion(r)n(r) + 1

2

∑α,β;α �=β

ZαZβ

|Rα−Rβ |

+Exc [n↑(r), n↓(r)]

(2.1)

where the total electronic density n(r) =∑i,σ

fiσ |ψiσ(r)|2 is expressed as a

summation of the one-electron states ψiσ(r), dependent on the spin indexσ =↓, ↑ and with the occupation number fiσ. Above, Vion is the nuclear(ionic) potential, and Zα denotes the charge of ion α at Rα.

The coupled Kohn-Sham eigenvalue equations

− h2

2m∇2ψiσ(r) + Veff [n↑(r), n↓(r)]ψiσ(r) = εiσψiσ(r) (2.2)

resulting from the minimization of Etot with respect to the density recast thecomplex many-body interactions into an effective single-electron potentialVeff via the exchange-correlation functional Exc. In practice, the functionalExc has to be approximated. The popular choices are functionals where ex-change and correlation are treated as functions of the local density and/or itsgradients. This sounds drastic, because electronic Pauli exchange is a mani-festly non-local object, as demonstrated by the Hartree-Fock theory. However,there is a partial cancelation of errors, and these methods are surprisinglyrobust and accurate. The generalization to spin degrees of freedom is alsostraightforward in scalar-relativistic DFT.

A useful primer for designing DFT calculations has been written by Matts-son et al. [4] The design of any calculation, whether using periodic bound-ary conditions, finite clusters or embedding techniques, involves the choice

30 Risto M. Nieminen

of the exchange-correlation functional. The choice, whether a particular fla-vor of local-density (LDA) or local spin-density (LSDA) approximation, ageneralized-gradient approximation (GGA), full Hartree-Fock-type exchange,screened exchange, or a ‘hybrid’ between a non-local orbital-based functionaland a density-dependent functional, defines the physical accuracy of the cal-culation. The numerical accuracy of the calculation depends on such techni-cal things as basis-set completeness, accuracy of integration in both real andreciprocal space, convergence criteria etc. The model accuracy depends nat-urally on how faithfully the chosen supercell, cluster or embedding methodsdescribe the desired situation.

In the context of defects in semiconductors, the ground-state propertiescan be obtained via the minimization of the total energy Etot with respect tothe charge density n(r) and the ionic positions Rα through the Hellmann-Feynman interatomic forces.This enables the calculation of not only the for-mation energy of a defect in a given lattice position, but also its low-energyvibrational excitations, and any observables accessible via the charge and spindensities, such as positron-annihilation parameters [5] or hyperfine fields. [6,7]

2.3 Supercell and other methods

A popular way to calculate defect energetics from first principles (i.e. fromelectronic degrees of freedom) is based on the supercell idea. In this methodone confines the atoms defining the defect area of interest into an other-wise arbitrary box, which is then repeated infinitely in one or more spatialdirections (see Fig. 2.1). In other words, this box (the supercell) now be-comes the new unit cell of the system, and periodic boundary conditions areapplied at one or more of its boundaries. For a point-defect assembly in athree-dimensional solid, the system now becomes a three-dimensional peri-odic defect array. For a line defect (e.g. a dislocation) the result is regularline-defect network. For a surface, the system becomes a sandwich of materialslabs interlaced by vacuum regions. Typical sizes of the supercell are nowa-days 64 atoms or larger in a cubic system, and the increasing computationalpower has enabled supercells with hundreds of atoms.

The supercell method is one of the three main approaches to defect cal-culations. The other two are the finite-cluster methods [8, 9] and the Green’sfunction embedding techniques. [10,11] In the former the interesting defect issimply incorporated in a finite atomic cluster (with often atoms added to thecluster surface to saturate any dangling electron bonds). The finite-clustermethod is well suited for studies of local defect properties (such as vibrationalmodes), provided that the cluster is large enough and thus the surface effectssmall. However, care should be taken to make sure that manipulating the clus-ter surface does not interfere with the defect-localized electrons. Estreicheret al. [12] have shown how this can influence for example the defect-relatedhyperfine fields.

2 Supercell methods for defect calculations 31

Fig. 2.1. Schematic presentation of the supercell construction. (a) lattice vacancyin a solid, (b) surfaces in a stack of slabs, (c) isolated molecules.

The embedding techniques, in turn, match the perturbed defect regionto the known DFT Green’s function of the unperturbed host material, andoffer in principle the best way to study isolated defects. However, the nu-merical implementation of the Green’s function method is challenging foraccurate interatomic forces and long-range atomic relaxations, as it requiresa well-localized defect potential and usually short-range basis functions. Thussupercell methods have surpassed the Green’s function methods in popularity.

The great advantage of supercell calculations is that the periodic bound-ary conditions allow the utilization of the many efficient techniques derived

for the quantum physics of periodic systems. The wave vectors of theBrillouin zone (BZ) in the reciprocal space of the supercell are good quantumnumbers, and the standard ‘band-structure methods’ of periodic solids can beapplied in full force. Particularly fast computation methods can be performedby using Fourier analysis (plane-wave basis sets), as they adopt naturally toperiodic boundary conditions and offer spatially uniform resolution.

The supercell approach enables the full relaxation of the structure to mini-mize total energy, and the calculation of the formation and migration energiesof defects in different charge states as a function of the Fermi level positionof the host semiconductor, and as a function of the chemical potentials ofthe atoms building up the material. Thus the method can be convenientlyadapted to describe such effects as the host material stoichiometry on theenergetics of isolated defects in compound semiconductors. [13] From super-cell calculations one can extract several useful physical properties, such as:(i) the probabilities of certain types of defect to form under given chemicaland thermodynamical conditions during the growth, (ii) the basic nature (ac-ceptor, donor, deep level) of the defect electronic states, (iii) the migrationand recombination barriers, (iv) vibrational modes, hyperfine fields, positron-annihilation parameters and other ground-state properties , and (v) within

32 Risto M. Nieminen

certain limitations, excited-state (for example optical) properties as well. Theground-state properties are those most reliably accessible. For example, thedefect-induced electronic energy levels in the fundamental gap are obtainedas differences between formation energies for different charge states.

2.4 Issues with the supercell method

The obvious drawback of the supercell methods for defect calculations is thatthe periodicity is artificial and can lead to spurious interactions between thedefects. They have a finite density, and do not necessarily mimic the truephysical situation with an aperiodic, very low-density defect distribution.The supercell method, widely used and with demonstrated success in de-fect studies, requires a critical examination of the finite-size and periodicityeffects.

The first consequence of the finite-size supercell approximation is thebroadening of the defect-induced electronic levels to ‘defect bands’, with abandwidth of the order of 0.1 eV for 64-atom supercells. This translatesinto a difficulty in placing accurately the ionization levels, also in the total-energy basedΔSCF approach. The one-electron Kohn-Sham states associatedwith the defect are often assigned a position by averaging over the supercellBrillouin zone, or just values at the Brillouin zone origin (the Γ point) arediscussed as they exhibit the full symmetry of the defect. It can be arguedthat the defect-state dispersion has a smaller effect on the total energieswhich include summations and integrations of the Kohn-Sham states overthe entire Brillouin zone. Thus ionization levels determined from the total-energy differences would also be less sensitive to the defect-band dispersion.However, this is hard to prove systematically.

Another source of difficulty is the accurate determination of the host crys-tal band edges (valence band maximum, conduction-band minimum), whichare the natural reference energies for the defect-induced gap states. In thedefect-containing supercell, the band-edge states themselves are affected bythe defect. The band edge positions are often determined by aligning a chosenreference level between the defect-containing and perfect solid. For example,the effective potential in a localized region far from the defect can be alignedwith the potential in the same region in a perfect crystal. The calculatedband-edge distances from the reference energy in a perfect crystal can thenbe used to align the band edges in the system containing defect. Alternatively,one can consider a deeper, core-level energy of an atom as a local reference.Another possibility is to define as the ‘crystal zero’ the electrostatic potentialat a Wigner-Seitz cell boundary far away from the defect. This is particu-larly convenient when the atomic-sphere approximation (ASA) is used in thecontext of minimal basis sets, such as the LMTO. [14]

Another, more difficult problem arises with charged defects. In order toavoid divergences in electrostatic energies, a popular solution is to introduce

2 Supercell methods for defect calculations 33

a homogeneous neutralizing background charge to the supercell array, whichenables the evaluation of electrostatic (Coulomb) energies. This, however, in-troduces an electrostatic interaction between the periodic charge distributionin the supercells and the background, which vanishes only at the limit of

infinitely large supercells. The influence of the fictitious charge has to besubtracted in the end, and this is a highly nontrivial task, as discussed below.

Point defects induce elastic stress in the host lattice, which is relievedby ion displacements, i.e. lattice relaxation. The lattice relaxation pattern isrestrained by the supercell geometry. The argument, often used in supercellcalculations, is that the ion displacements vanish near the borders of thesupercell. However, this does not necessarily guarantee that the long-rangeionic relaxations are correctly described, as the supercell symmetry itself mayfix the positions of the border ions. According to elastic continuum theory, thestrain field at large distances from the point defect should fall off as |r|−3 andthe ionic displacements should fall off as |r|−2. In tetrahedrally coordinatedcovalent materials the distances between the periodic images along the rigid[110] zigzag chains are typically more important to the convergence of latticerelaxations than the volume relaxations. A case in point is the vacancy insilicon, [15] where the sense of the lattice relaxation changes from outward toinward only at large enough cell sizes. The similar behavior is also observedin finite-cluster calculations. [16]

The finite size of the supercell thus restricts also the ionic relaxationsaround the defect. The relaxation pattern is truncated midway between adefect and its nearest periodic replica. In the case of long-range relaxationsthis cutoff may be reflected dramatically close to the defect, as was demon-strated by Puska et al. by detailed calculations for vacancies in silicon. [15]

Further comparison of the supercell to the finite-cluster and Green’s func-tion methods reveals the following. The supercell and Green’s function meth-ods have a well-defined electron chemical potential (they are coupled to areservoir of electrons) whereas the cluster method does not. This has conse-quences for the treatment of ionization levels, which is somewhat problematicfor the cluster method. The finite-size effect is there in the cluster method aswell, as the clusters can contain spurious surface-related effects. The Green’sfunction method is mathematically elegant and in principle the best for iso-lated defects, but its computational implementation is difficult in view of theaccuracy required for total electronic energies and derived quantities (suchas interatomic Hellmann-Feynman forces).

Finally, it is important to note that defect and impurity calculationsshould as a rule be carried out using the theoretical lattice constant, op-timized for the bulk unit cell. This is crucial in order to avoid spurious elasticinteractions with defects or impurities in the neighboring supercells. The pur-pose is to investigate properties of isolated defects or impurities in the dilutelimit. If the volume of the defect-containing supercell is relaxed (in additionto relaxing the positions of the atoms near the defect), the calculation would

34 Risto M. Nieminen

in fact correspond to finding the lattice parameter of the system containingan ordered array of defect at a high concentration.

2.5 The exchange-correlation functionals and thesemiconducting gap

The fundamental semiconductor gap Eg is defined as the difference betweenthe ionization energy I and the electron affinity EA of the bulk material,

Eg = I −EA (2.3)

As both quantities can be written in terms of total energies of systems withdifferent numbers of electrons, the value of the gapEg is in principle a ground-state property and can thus be calculated within the Kohn-Sham DFT. Thefundamental gap can also be written as

Eg = EKSg +Δxc . (2.4)

EKSg is the difference between the highest occupied and lowest unoccupied

Kohn-Sham eigenvalues εiσ. Δxc is the discontinuity (as a function of theoccupation number) of the exchange-correlation potential at the Fermi level.

The estimation of semiconductor band gaps and positions is one of theclassic failures of local(-spin-)density (L(S)DA) formulation of the Kohn-Sham DFT. There are two origins for this error. First is the self-interactionerror inherent in the LDA potential, and the second is the vanishing of thediscontinuity Δxc of the local-density exchange potential as a function of thelevel occupation at the Fermi level. They lead to both wrong absolute energypositions and too small or entirely absent band gaps for many materials. Thisis not remedied by semi-local approximations for exchange and correlation,such as the generalized-gradient approximations (GGA) where the disconti-nuity also vanishes. In fact, the experience is that GGA approximations tendto make the gap even smaller than local-density approximations. [17] Thiscan be traced to the property of GGA methods increasing the interatomicdistances and lattice constants compared to LSDA. It is another well-knownflaw of local-density methods that they overbind molecules and solids and un-derestimate bond and lattice distances. GGA methods correct and sometimesovercorrect this, but give even smaller band gap values.

A qualitatively opposite problem is encountered with the Hartree-Fockmethod, which is known to considerably overestimate the band gaps. More-over, its application to metallic systems leads to well-known problems, such asthe vanishing density of states at the Fermi level. Thus, intuitively, a hybridmethod ‘interpolating’ between the two seems an attractive approach.

Non-local descriptions of exchange are expected to improve the Kohn-Sham band gaps as they remove the self-exchange error and exhibit discon-tinuity at the Fermi level. In the ‘exact exchange’ (EXX) formulation one

2 Supercell methods for defect calculations 35

evaluates the exchange energy and potential using the Kohn-Sham (ratherthan Hartree-Fock) wavefunctions. Stadele et al. [18] have shown that thediscontinuity Δx in the exchange potential is much larger than the

band gap for several semiconductors, which points to the existence of alarge cancelation between the exchange Δx and the corresponding correlationΔc.

The EEX method has been reported to give very good values for theband gap and many other properties for sp semiconductors. [19] However,when applied to rare-gas solids, EXX fails to reproduce the experimentalband gaps. [20] A related problem is the position of the d band in manysolids. Again, the self-interaction error in local-density methods displaces thed states and contributes to the band-gap error. The d band positions can beimproved by the explicit removal of self-interactions, [21] but the applicationof the non-local EXX does not seem to lead to a consistent improvement ofboth d band positions and band gap values.

This points to the important interplay between treatment of non-localexchange and the description of the lower-lying and core states in semicon-ductors and insulators. It has recently been pointed out [22] that the lack ofexplicitly treated core-valence interactions can lead to an anomalously goodagreement of the calculated Kohn-Sham gap with the experimental gap in spsemiconductors. The inclusion of these interactions worsens the agreementwhile leading to a consistent treatment of semiconductors and insulators. Itthus appears that the EXX alone does not solve the band-gap problem, al-though it is a marked improvement over LDA or GGA. A major drawbackis also the high computational cost of EEX, which has so far prohibited itswidespread use in defect calculations.

In the screened-exchange LDA (sx-LDA) scheme [23] the Kohn-Shamsingle-particle orbitals are used to construct a non-local exchange operator,again improving the description compared to the local-density approximation.The many-body correlations and screening missing from the Hartree-Fockmethod are treated in a form of model screening and LDA correlation. Thisapproach can solve the gap problem, but contains a phenomenological con-stant (e.g. the Thomas-Fermi screening constant). Lento and Nieminen [24]have applied this method for a prototypical supercell defect calculation (va-cancy in Si). While the method can in principle give accurate total energiesand thus ionization levels (and the correct gap), its full exploitation is stillprohibitively expensive computing-wise. As with other hybrid functionals andwith EXX, the computational cost is typically two orders of magnitude largerthan with LDA or GGA. Thus the calculations with full lattice relaxationhave so far been limited to supercells not large enough to allow the correctpattern to emerge (see Sec. 16 below).

There are other suggested approaches to overcome the failure of DFTto produce accurate band gaps and excited-state properties. However, cur-rently no method is available that would go beyond standard DFT and yet

36 Risto M. Nieminen

be feasible for total-energy calculations for the large supercells required toinvestigate defects. The class of methods, such as the GW approach. [25]which are mainly aimed at calculating the (quasiparticle) band structureand properties derived thereof, are prohibitively expensive for large super-cells. Presently such calculations can be carried out with supercells with lessthan twenty atoms. However, this size is perfectly adequate for treating well-localized excitations, such as self-trapped excitons. [26]

The gap problem may sometimes make calculations for certain defectlevels impossible. The too narrow gap may induce the defect level to be aresonance in the conduction or valence band. In this case some charge stateswould not be accessible at all. It should be noted that this can also happen injust some part of the Brillouin zone because of the defect level dispersion inthe supercell approach. A possible remedy is then to simply occupy the defectlevel, ignoring the fact the state lies above the conduction band minimum atother k points.

Apart from underestimating the band gap and thus hampering the posi-tioning the possible ionization levels, the local (or quasi-local)-density approx-imation for exchange and correlation can sometimes lead to other difficultieswith the defect-localized electron states. This has recently been discussed byLaegsgaard et al. [27] They considered a substitutional Al impurity on theSi site in SiO2. LDA and GGA calculations predict a structure which hasthe full tetrahedral symmetry. However, ENDOR experiments show unam-biguously that the defect wave function is localized on just one of the fourneighboring oxygen atoms, as the hyperfine splittings are radically differentfrom those expected for four equivalent neighbors. The source of this prob-lem can again be traced to the self-interaction error in LDA and GGA. InLDA/GGA, there is a penalty for localizing a single electron as the cancela-tion between self-exchange/correlation and self-Coulomb interactions is notcomplete. Thus one overestimates the Coulomb energy of a localized state,which then leads to charge delocalization. In the true situation, there is justa single electron state contributing to the Coulomb energy. Thus the calcu-lated charge delocalization is completely due to self-interaction. If exchange istreated correctly as in the Hartree-Fock theory, self-interaction is eliminatedand one obtains the correct distorted structure with the wave function local-ized on a single oxygen atom. As noted above, the problem with Hartree-Fockis that it seriously overestimates the band gap and spin splittings, and is assuch usually not a good tool for calculations of semiconductor defects.

A self-interaction-corrected (SIC) scheme for the case of hole trapping inSiO2 was recently suggested by d’Aveszac et al. [28] In their approach the

Coulomb energy arising from the charge density of a single defect or-bital and the associated contribution to the exchange-correlation energy areexplicitly subtracted. This is technically straightforward for methods usinglocalized basis sets, but can also be implemented using projection techniqueswith plane-wave methods. The physical idea of this self-interaction correction

2 Supercell methods for defect calculations 37

is that delocalized states do not suffer as much from self-interaction error aslocalized states do. In the standard SIC [29, 30] approaches there is someambiguity in choosing the states to which the correction should be applied.For defect calculations, the localized state in the gap is well-defined and theapplication of the correction is quite straightforward.

The LDA/GGA gap error is a persistent and rather harmful problemin semiconductor defect physics. The Kohn-Sham eigenvalues of Eq. 2.2 areLagrange parameters used to orthogonalize the states in total-energy min-imization, but cannot be interpreted as true electron excitation energies.These could be obtained from quasi-particle theories such as Hedin’s GWapproach [31] or time-dependent density-functional theory. [32] The GWmethod is computationally expensive and cannot be applied to defect su-percells with many atoms. Its ability to give converged total energies is notclear. However, it is a powerful approach for doing perturbative calculationsfor optical and other excitation spectra, and can be generalized to includethe final-state electron-hole interactions through the Bethe-Salpeter equa-tion. [33]

One simple, obvious and often used possibility is to use the ‘scissor op-erator’, i.e. to correct the band gap by simply shifting the unoccupied andoccupied states further away from each other. This of course destroys theself-consistency between Kohn-Sham eigenvalues and eigenfunctions, whichmay be detrimental for observables calculated from them. For defect physics,the question arises whether to shift the defect levels or not in such a process.Intuitively, one would expect acceptor-like levels move with the valence-bandedge and donor-like states with the conduction-band edge , while levels deepin the band gap would not move at all. This reasoning is, of course, quitearbitrary and unsatisfactory. A recent example of these difficulties is the workon the oxygen vacancy in ZnO, where the LDA gap is again considerably inerror. Two calculations [34,35] have addressed this system using the LDA+Umethod, [36] where an on-site Coulomb repulsion U is added ‘by hand’ whencalculating the Zn d-states. This shifts the occupied d bands and subsequentlythe valence band minimum downwards. The defect level associated with theoxygen vacancy then moves deeper into the gap from the valence-band max-imum. If one leaves the defect level in place and shifts just the conductionbands up, there is a single level in the lower half of the gap. If, on the otherhand, one decides from LDA+U calculation how much valence/conductionband character the state has, one ends up moving the state further up in thegap, into the upper half. As the experiments are ambiguous at this time, thematter is undecided and again calls for a better solution to the gap problem.

Finally, the Quantum Monte Carlo (QMC) methods which look promisingfor addressing total-energy calculations for defects and impurities. They ap-proach the problem of interacting electrons by addressing the correlations ex-plicitly and build in the Pauli exchange by starting from a Slater determinant.The fixed-node QMC method has been applied to study self-interstitials in

38 Risto M. Nieminen

Si. [37] The formation energy of the split-<110> interstitial defect was foundto be significantly higher with QMC than LDA, which can be explained bythe upward shift of the defect-induced level in the band gap. Further workincludes QMC studies of optical and diffusive properties of vacancies in dia-mond. [38] If QMC methods can be extended to large enough supercells, toforce and relaxation calculations, and to secured convergence with respect tok point summation, they can provide a most useful set of benchmarks foraccurate total-energy calculations for defects in semiconductors.

The above discussion may give an overly pessimistic impression of the ca-pabilities of DFT calculations for defects in semiconductors. The other sideof the coin is much brighter. There is a vast inventory of ‘standard’ DFTcalculations which have produced robust and reliable results for semiconduc-tor defects and have thus decisively contributed to their identification. Anillustrative example is the extensive work on III-V nitride semiconductors,recently reviewed by Van de Walle and Neugebauer. [39]

2.6 Core and semicore electrons: pseudopotentials andbeyond

The widespread use and success of DFT in predicting various materialsproperties stems largely from the successful application of the pseudopo-tential(PP) concept. Only relatively few ‘valence’ electrons of an atom arechemically active. The replacement of chemically inert ‘core’ electrons withan effective core potential both reduces the number of electronic degrees offreedom and avoids the numerical challenge posed by rapidly varying wavefunctions near the nucleus. Efficient numerical techniques such as the fastFourier transform (FFT) can be used. The replacement of the all-electronpotential with a PP is, however, a non-trivial task where a balance has to bestruck between optimal transferability (accurate reproduction of all-electronatom behavior) and computational efficiency (slow spatial variability).

Several methods are popular for generating PPs. They include the gener-alized norm-conserving pseudopotentials of Hamann, [40] Troullier and Mar-tins, [41] Hartwigsen et al. [42] and the ultra-soft, non-norm-conserving PPsof Vanderbilt. [43] It is important to realize that the true figure of merit of aPP is not how well its results match experiment, but how well it reproducesthe results of all-electron calculations when using otherwise similar methods.

The separable PPs commonly used in the context of plane-wave calcu-lations, such as those of the Kleinman-Bylander [44] form, are built fromnorm-conserving PPs. The separation process can sometimes add complica-tions for example in the form of so-called ghost states. The construction ofreliable ultrasoft pseudopotentials [43] or projected-augmented-wave (PAW)potentials [45, 46] presents additional challenges, but the latter is quicklygaining in popularity as it has proven a particularly robust and transparentapproximation.

2 Supercell methods for defect calculations 39

To compare calculated values reported in literature for defects in semicon-ductors, specific details of the PP generation are important for the reproduc-tion of a given result. Several parameters usually enter the PP construction,such as the core radii for different angular-momentum channels and the corematching radii. The effectiveness of a given PP needs to checked and vali-dated in every new chemical environment before being used in quantitativestudies.

Constructing PPs thus involves compromises, and in reporting computa-tional results it is important to state those compromises clearly. In particular,the choice of a given state as an ‘active’ valence state vs. its treatment viaperturbative, nonlinear core-valence corrections [47] are crucial. For exam-ple, it has been demonstrated [48] that a nonlinear core-valence correction isnecessary to get spin-polarized states of first row atoms O and N properlydescribed. Unlike previously assumed, the 1s core in these atoms is not sodeep as to be uninvolved in the chemistry important in defining the valence-electron potential.

An interesting approach that yields bulk band structures in good agree-ment with experiment is based on the idea of self-interaction and relaxation-corrected pseudopotentials (SIRC). [49] The removal of self-interactions hasimportant consequences for ionization levels in the semiconducting gap, asdiscussed above. However, it seems that the SIRC approach does not allow asimple self-consistent calculation of total energies.

Finally, it should be emphasized that consistency in the choice of theexchange-correlation functional for pseudopotentialconstruction and their ap-plication is highly advisable. The problems of using a different functional forthe two have been demonstrated, especially in the cases where polarizablesemicore states are present. [50]

2.7 Basis sets

The use of DFT is not limited to static electronic-structure problems such astotal energies and bond distances. The potential-energy (Born-Oppenheimer)surface defined by the positions of the nuclei defines the interatomic forces,and in recent years there has been rapid growth in the use of first-principlesmolecular dynamics (FPMD) within the finite-temperature DFT frameworkfor studying many materials properties, including atomic vibration and mi-gration, and the full equation of state (EOS). If the force fields or pressureor stress tensor of the system are of interest, additional care must be takento ensure that the calculated quantities are of sufficient numerical accuracy.Whereas errors in the total energy are second order with respect to errors inthe electron density or Kohn-Sham eigenfunctions, errors in the interatomicforces and pressure are first order. The errors are significantly affected by thecompleteness and accuracy of the basis set.

A popular choice is the plane-wave expansion for the Kohn-Sham states,

40 Risto M. Nieminen

ψiσ(r) = ψnσk(r) =∑G

ψnσk(G) exp [−i(k + G) · r] . (2.5)

The summation is over the reciprocal lattice vectors G of the superlattice.In plane-wave calculations the accuracy of the basis set is determined by

the chosen kinetic energy cutoff or maximum wavevector Ecut = h2(k+Gmax)2

2m .It is important to realize that while a lower cutoff may be sufficient for theconvergence of total energies, it may be totally unacceptable for calculatinginteratomic force constants or pressure/stress tensors.

The reason that forces are more sensitive to the plane-wave cutoff thantotal energies is that the real-space force formula contains the derivatives withrespect to coordinates (see Eq. (2.6)). This derivative will introduce an extrafactor of the reciprocal lattice vectors G into the reciprocal-space formulaeand the maximum value Gmax is determined by the cutoff. It also follows thatexchange-correlation functionals involving gradients of the density (such asGGA) are more sensitive to the cutoff than LDA.

The issue of basis set sufficiency is even more critical when using localbasis sets, for example linear combinations of numerical atomic orbitals [51]or Gaussians. [52] A great advantage of local-orbital methods is the muchsmaller size of the basis (typically up to twenty basis functions per atom), re-flecting their economy when fitting rapidly-varying wavefunctions. However,as a discrete set the convergence of a local basis is not simply controlled bya single parameter. Local basis sets always require ‘optimization’ and experi-ence, and tested basis sets are collected into libraries with limited possibilitiesfor modification. The words of caution

about plane-wave basis set convergence apply even more strongly to localbasis sets. While a well-designed basis set (for example, the ‘double-zeta pluspolarization’ (DZP)) does match the accuracy of a fully-converged plane-wave calculation, the omission of the polarization term or calculation withsingle s and p functions can lead to large errors. This is particularly truefor forces. For atoms far from equilibrium (e.g. under large strain) it maybe necessary to increase variational freedom by expanding the basis to triplefunctions to achieve convergence. In some cases, especially for systems withlarge open volumes, it is advisable to augment the basis set by introducing‘floating orbitals’, i.e. basis functions not centered at atoms.

Finally, one should mention the real-space methods quickly gaining popu-larity. [53,54] They use finite-difference approximations for the kinetic-energyoperator (finite elements are also a possibility), and the quality of the basisset can be controlled simply by the choice of the spacing of the real-spacegrid. The great advantage of real-space techniques is their easy adaptabil-ity to different boundary conditions (periodic, open etc.), which is especiallyuseful for studies of defects in low-symmetry nanostructures, where supercellmethods can be quite wasteful.

2 Supercell methods for defect calculations 41

2.8 Time-dependent and finite-temperature simulations

Within the Born-Oppenheimer approximation, the electronic and ionic mo-tion and timescales are well separated. First-principles molecular dynamics(FPMD) can be implemented on the Born-Oppenheimer hypersurface by cal-culating an instantaneous value for the electronic total energy Etot[{Ri}] asdetermined by all the ionic positions {Ri}. The forces acting on the atomsare then evaluated as

F i = −∂Etot

∂Ri(2.6)

and Newton’s equations of motion are solved for the ionic degrees of free-dom, using standard algorithms for the second-order temporal differentialequations. Periodic boundary conditions are routinely implemented, and al-low simulations of disordered amorphous or liquid phases. A fixed-volumesupercell with conserved total energy corresponds to the microcanonical sta-tistical ensemble. The constant-temperature (canonical) ensemble can be sim-ulated using the Nose-Hoover thermostat. [55] Allowing the supercell volumeto change as a dynamical variable leads to the constant-pressure ensemble,and structural phase transformations are enabled by making the directionalvectors spanning the supercell also dynamical. [56]

The molecular dynamics techniques also enable free-energy simulations.The thermodynamic integration formula for the (Helmholtz) free energy reads

ΔF =

1∫0

dλ 〈H1 −H0〉λ (2.7)

where ΔF is the free-energy difference between the actual system with theHamiltonian H1 and a reference system with the Hamiltonian H0, and 〈...〉λdenotes a temporal average along an isobaric-isothermal trajectory generatedfrom the Hamiltonian

H(λ) = λH1 + (1 − λ)H0 (2.8)

Under ergodic conditions this is equivalent to an ensemble average. If thefree energy of the reference system is known, Eq. (2.7) allows one to computefree energy of the actual system. The FPMD method is particularly welladaptable to the supercell geometry.

The quantum dynamics associated with the electronic degrees of freedomcan be described by the time-dependent Schrodinger equation. Within thedensity-functional formulation of many-body systems, this leads to the pow-erful formulation of time-dependent density-functional theory. [57] It can becast in the form of time-dependent Kohn-Sham equations with a general-ized, time-dependent exchange-correlation potential. These equations can beexplicitly solved by time-propagation algorithms and starting from an appro-priately prepared initial state. The time evolution of the wave functions and

42 Risto M. Nieminen

the density can then be analyzed to obtain the desired physical quantities,such as the full response of the system to an external, polarizing field orexciting pulse.

The Kohn-Sham equations with an external time-dependent perturbationcan also be treated in the linear-response limit if the perturbation is weak.As the Kohn-Sham states are non-interacting, the linear-response function ofthe system can be constructed. The linearization leads to integro-differentialequations with an exchange-correlation kernel derived from the description ofExc. Linear-response TDDFT is a popular method [58] to estimate excitationenergies, polarizabilities and inelastic-loss properties (such as photoabsorp-tion cross sections), especially for finite systems. The treatment of extendedsystems (periodic boundary conditions) within TDDFT requires the consis-tent handling of electronic currents in the system, and is best characterizedas time-dependent current-density functional theory (TCDFT). [59]

2.9 Jahn-Teller distortions in semiconductor defects

The Jahn-Teller effect is the removal of electronic degeneracy by spontaneouslowering of spatial symmetry of atoms surrounding a defect, which leads tothe lowering of total energy. This is a consequence of a general theorem, [60]with several well-known consequences in solid-state physics.

It is easiest to discuss the Jahn-Teller mechanism by using the monova-cancy and a substitutional metal impurity in a tetrahedrally bonded semi-conductor (Si) as examples. These same systems will later be presented asexamples of state-of-the-art supercell DFT calculations in Sec. 16.

2.9.1 Vacancy in silicon

In the pioneering linear-combination-of atomic-orbitals (LCAO) model byWatkins [61] (see Fig. 2.2), the electronic configuration of a neutral vacancyin Si (VSi) with full cubic (Td) symmetry is a2

1t21, analogous to a Si atom

([Ne]3s22p2). The a1 state lies in the valence band (is a resonance) and isdoubly occupied. The other state t1 is in the fundamental gap and can containup to six electrons. It is pushed into the gap by the repulsive potential dueto the removal of a positive ion in the lattice. In compound semiconductors,therefore, the states are expected to lie higher for anion vacancies with alarger number of removed valence electrons than for cation vacancies.

The filling of the gap states with rising Fermi level is associated with therelaxations of the atoms neighboring the defect. If there are no electrons inthese levels, the Td symmetry of the ideal vacancy (ideal crystal) is preserved.The only possible relaxation is of the breathing-mode symmetry around thedefect center. Adding one or more electrons to the levels splits the degener-acy through the Jahn-Teller effect, and lowers the symmetry through atomicrelaxation. The main component of this distortion is tetragonal, giving D2d

2 Supercell methods for defect calculations 43

Td

a1

t 2

+2

a1

t 2

Td → D2 d

+1

a1 a1

e

b2

b2

e

D2d → D2 d

0

a1′a1″

b1

b2

D2d → C2v

−1Fig. 2.2. The splitting of the defect-realted gap states for a vacancy in Si. Thepoint symmetry group is denoted for defects in different charge states. The arrowsdenote the population of the two spin states.

Td

=

D2d (A)

=

D2d (B)

C2v

Fig. 2.3. The bounding boxes for the Si vacancy with distortions of different pointsymmetry orientation.

symmetry. This can occur in two senses, giving either a short, broad (‘typeA’) or long, thin (‘type B’) shape for the box bounding the nearest-neighboratoms (see Fig. 2.3). This determines the order of the resulting b2 and estates. For VSi, the latter option turns out to be valid.

44 Risto M. Nieminen

Ev

Ec

Cu

3d10

4s1

CuSi

e

t2

a1

t2*

VSi

a1

t2

Fig. 2.4. The evolution of the defect-related states of substitutional Cu in Si fromthose for a Cu atom and Si vacancy.

One electron added to the gap level thus results in a singly positivelycharge vacancy with D2d symmetry. Adding more electrons usually implieshigher total energies because of the intra-electron Coulomb repulsion betweenthe localized electrons in the gap states. However, adding a second electronto the vacancy in Si can enhance the pairing distortions (while retaining D2d

symmetry) and the Jahn-Teller splitting, possibly outweighing the increasedCoulomb repulsion. This is the negative-U phenomenon [62] often encoun-tered in semiconductor defects. Another way of stating the effect is to saythat the energy to remove two electrons from V 0

Si is less than to remove oneelectron. Thus V +

Si would be unstable.A trigonal distortion from Td symmetry occurs when a third electron is

added and the vacancy is in the charge state q = −1. This lowers the sym-metry to or C2v or C3v and splits the e state to b2 and b1. The quantitativeresults of state-of-the-art supercell calculations for the charge-state depen-dent relaxation and how they match the Watkins model are in Sec. 16,

2.9.2 Substitutional copper in silicon

Neutral CuSi [63] is isoelectronic with the negatively charged VSi (seeFig. 2.4). The triplet state t32 is partially occupied, in the band gap, and sus-ceptible to the Jahn-Teller effect. If on-site electronic repulsion would prevail,the high-spin S = 3/2 states would be stabilized, as for example in the caseof negatively charged substitutional Ni in diamond or the silicon vacancy inSiC. [64] Spin-orbit coupling also affects the orbital degeneracy, especiallyfor the heavier impurity elements.

In general, it is in principle possible to reach higher positive and negativecharge states by adding/removing electrons to/from the localized levels. Thestability of a given charge state requires accurate calculation of its total

2 Supercell methods for defect calculations 45

energy, and is subject to the computational limitations discussed above andfurther expanded upon Secs. 13-15.

2.10 Vibrational modes

The phonon modes of a vibrating solid can be calculated using the power-ful linear-response formalism, [65] where the DFT total-energy hypersurfaceenters as the ground state and determines the elastic response. This formal-ism has been implemented using different basis sets for bulk calculations,including short-range local orbitals. [66] For defects in semiconductors, it ishowever the localized vibrational modes (LVMs) associated with defects thatare often useful fingerprints for defect identification. In the case of defect-associated localized modes, a more economical route than the full phononresponse is provided by considering only a single supercell. This is becausethe interaction between the defect replicas can be made small by using alarge enough cell, whereby the vibrations do not mix and there is no phonondispersion. The total energy of the supercell can be written as a Taylor seriesaround the equilibrium positions in the harmonic approximation as

Etot[{Rαi + sαi}] = Etot[{Rαi } +12

∑αi,βj

∂2Etot

∂Rαi∂Rβjsαisβj + ... (2.9)

where sαi is the ith Cartesian component of the atomic displacement of theion α with the coupling constants

Φαiβj ≡ ∂

∂Rαi(∂Etot

∂Rβj) (2.10)

are conveniently obtained as numerical derivatives of the (Hellmann-Feynman)forces on the atoms as they are shifted a small distance from their equilibriumpositions around the defect (‘frozen phonons’) . The frequencies ω and ampli-tudes u of the localized vibrations can then be obtained as the normal-modesolutions to the equations

−ω2uαi +∑βj

Dβjαiuβj = 0 (2.11)

where Dβjαi is the dynamical matrix. If Mα is the mass of ion α,

Dβjαi ≡ 1√

MαMβ

Φαiβj (2.12)

In practical LVM calculations, selected atoms in the supercell are dis-placed to all three Cartesian directions, and after each displacement theground-state electronic total energy is obtained and the Hellmann-Feynman

46 Risto M. Nieminen

forces calculated. This is done for all atoms which are a priori considered im-portant for the description of the local vibrational modes around the defect.The dynamical matrix is then calculated by finite-difference

approximation using these forces and displacements. The normal modesand the corresponding vibrational frequencies can then be obtained by diago-nalizing the dynamical matrix. The harmonic modes can easily be quantizedand the

zero-point vibrational motion estimated. Within the harmonic approxima-tion, the phonon spectrum can be used to estimate the vibrational entropy ofa given defect, which can then be entered as the prefactor when estimatingabsolute defect concentrations in thermal equilibrium.

Alternatively, the phonon spectrum, including the localized modes, canbe obtained by MD simulation. The simulation time has to be long enoughso that all the relevant vibrational modes have undergone several oscilla-tion periods. The vibrational frequencies can then be obtained by Fourier-transforming the velocity autocorrelation function of the moving ions. Thisusually requires that the LVM is well separated from the density of statesfor host lattice vibrations. The MD approach includes also the anharmoniccontributions to the vibrations, but not the zero-point motion.

2.11 Ionisation levels

The formation energy of a defect or impurity X in the charge state q is

Ef [Xq ] = Etot [Xq ]− Etot [bulk] −∑

i

niμi + q [μe + Ev] (2.13)

where Etot [Xq ] is the total energy from a supercell calculation with a defector an impurity in the cell, and Etot [bulk] is the total energy of the equivalentbulk supercell. ni indicates the number of atoms of type i (host or impurity),either added (ni > 0) or removed (ni < 0) from the defected supercell, andμi denote the corresponding atomic chemical potentials. They represent theenergy of the reservoirs with which atoms are exchanged when assemblingthe crystal in question. μe is the electron chemical potential (Fermi level),referenced here to the valence-band maximum Ev in the bulk material. Thevalence-band maximum of defect supercell has to be aligned with that in thebulk (see below).

For a monoatomic solid, the formation energy for a defect in the chargestate q reads

Efq = Edef

q −Nμ+ q(EV + μe) (2.14)

where Edefq is the total energy of the defect-containing supercell and N total

number of atoms in the it.The thermodynamic ionization (transition) level of a given defect Ed(q/q′)

is defined as the Fermi level position where the charge states qand q′ have

2 Supercell methods for defect calculations 47

equal total energy. In experiments where the final charge state can relax to itsequilibrium configuration after the transition (excitation), this is the energylevel that would be observed. The ionization levels are therefore observed inDLTS experiments or (for shallow levels near band edges ) as the thermalionization energies derived from temperature-dependent conductivity or Halleffect data.

The optical energy Eoptd (q/q′) associated with a transition between charge

states q and q′ is defined similarly to the thermodynamic transition energy,but now the final state with charge q′ is calculated using the atomic geometryof the initial state q. The optical level is observed in ‘vertical’ absorptionexperiments where the final charge state cannot relax to its equilibrium.In emission experiments, on the other hand, the initial excited state hasevolved towards its minimum-energy configuration, which is different fromthe ground-state structure. optical emission and absorption peaks are thusseparated by the so-called Stokes shift. This poses challenges for both theexperimental assignment of peaks and the theoretical analysis.

For a two-component (compound) semiconductor, Eq. (2.13) can berewritten [13] in a form that reflects the stoichiometry of the host material.It is gauged by the chemical potential difference Δμ of an AB compound as

Δμ = μA − μB − (μA(bulk) − μB(bulk)) (2.15)

with an allowed range−ΔH ≤ Δμ ≤ ΔH (2.16)

where the lower limit corresponds to A-rich and the upper limit B-rich mate-rial.ΔH is the heat of formation of the compound from its bulk constituents,

ΔH = μA(bulk) + μB(bulk) − μAB(bulk) (2.17)

By defining

Ef1 [Xq ] ≡ Etot [Xq ] − 1

2 (nA + nB)μAB(bulk)−1

2(nA − nB)[μA(bulk) − μB(bulk)] + qEv(2.18)

one can write Eq. (2.13) in the short form

Ef [Xq ] = Ef1 [Xq ] + qμe −

12(nA − nB)Δμ (2.19)

Eqs. 2.14 or 2.19 provide a basis for evaluating the ionization level betweencharge states q and q+1 as the value of μe where the two formation energiesare equal. This is a widely used technique for defects in semiconductors. Itrequires the accurate determination of the supercell total energies, and usesthe valence-band maximum as the absolute reference energy (alternatively,the electron chemical potential can be measured from the conduction-bandminimum). Although it makes no explicit reference to the eigenvalues of theKohn-Sham gap states, the method can suffer from the DFT underestimationof the band gap, especially if the state in question is far from the referenceenergy, close to the opposite band edge.

48 Risto M. Nieminen

2.12 The marker method

The problems of band-gap underestimation and valence-band alignment canto some extent be avoided in using the so-called marker method for calculat-ing ionization level positions. [67] Let us define the configuration energy fora defect labeled d as the total energy difference per unit charge between twocharge states

Cd(q/q′) = (Etot

[Xq′]

−Etot [Xq ])/(q − q′) (2.20)

If there is a measured or otherwise accurately known reference (‘marker’)defect m with the ionization level Em(p/p′) between charge states p and p′

in the region of interest in the semiconductor band gap, one can calculate itsconfiguration energy Cm(p/p′) and the configuration energy difference of dfrom the marker as

Dd(q/q′) = Cd(q/q′) − Cm(p/p′) (2.21)

Then the ionization level of interest can be estimated as

Ed(q/q′) ≈ Dd(q/q′) + Em(p/p′)(2.22)

The idea is to achieve systematic cancelation of computational errors. Thishappens best when the defects d and m are similar and all computationalparameters are the same.

The reference energy may also be taken as the valence band maximumand the conduction band bottom of the bulk supercell with the same numberof atoms as the defect supercell, whereby

Cbulkm (0/+ 1) = Ev and Cbulk

m (−1/0) = Ec (2.23)

and the band gap is

Eg = Cbulkm (−1/0) −Cbulk

m (0/+ 1) (2.24)

The marker method can provide a useful alternative for ionization-level de-termination, but requires the knowledge of suitable known transition level inthe vicinity of the level under scrutiny.

2.13 Brillouin-zone sampling

For a perfect (no defect) lattice, the convergence of electronic propertiescan be achieved either by increasing the number of k points in the BZ orthe product of the number of atoms (N) in the calculational unit cell andthe number of k points. The computational cost increases linearly with the

2 Supercell methods for defect calculations 49

number of k points but is proportional to at least N3. Thus increasing simplythe number of k points would seem most economical. However, for defectcalculations the spurious defect-defect interactions are a more subtle issue.For defects in metals already quite small supercells may give well-convergedresults if the number of k points is large enough. For defects in semiconductorsthe situation is more difficult. Their description involves localized gap states.The description of these is not straightforwardly improved as the number ofk points increases, i.e. the detailed choice of the k-point sampling may affectthe convergence.

In supercell calculations, the evaluation of a ground-state property P ofthe system, such as total energy, requires integration over the Brillouin zone(BZ) of the reciprocal cell

P = (1VBZ

)∫

BZ

d3k∑

n

pn(k)f(εn(k)) (2.25)

where VBZ is the volume of the BZ, n enumerates Kohn-Sham stateswith the wave-vector k and eigenvalue εn(k), and pn defines the physicalproperty. f(εn(k)) is the occupation number of the Kohn-Sham state, givenby the Fermi-Dirac distribution around the chemical potential μe as

f(εn(k)) =1

eμe−εn(k)

kBT + 1(2.26)

For semiconductors with a finite gap between occupied and unoccupied states,the integrand in Eq. (2.25) is continuous, and consequently the integral can inprinciple be replaced by a finite sampling of discrete k points. The computa-tional cost increases linearly with the number of k points, and ‘special point’schemes are popular to reduce the computational cost to obtain the desiredaccuracy. [68] In particular, the uniform k point mesh approach suggested byMonkhorst and Pack [69] has been widely used in practical calculations. Forvery large actual supercell sizes, the BZ shrinks towards a point (the Γ point,the k= 0 of the BZ). Γ point sampling offers additional savings in computingas the Kohn-Sham wave functions are purely real at k=0.

The simplest scheme to sample the BZ in supercell calculations is touse the Γ point only. When the size of the supercell increases, the wavefunctions calculated correspond to several k points of the underlying perfectbulk lattice, and the perfect lattice k space is evenly sampled. In order toimprove the description of in particular that of the delocalized bulk-like statesand the description of the electron density, it is beneficial to use k points otherthan the Γ point. Thereby also components with wavelengths larger than thesupercell lattice constant are included. This idea leads to the special k-pointschemes mentioned above. They are widely used to sample the BZ also indefect calculations.

The accuracy of a given k point mesh depends naturally on the supercellsize (BZ volume). Makov et al. [70] utilized this fact to suggest sampling

50 Risto M. Nieminen

meshes that would in fact extrapolate the integration result towards largerunit-cell sizes. They proposed a scheme to choose k points for supercell cal-culations so that the electronic defect-defect interactions are minimized inthe total energy.

Defect calculations often focus on total-energy differences, and a tacit as-sumption is that errors due to BZ sampling largely cancel out when taking thedifferences (bulk vs. defect, for example) treated with equal-size supercells.This assumption is not a priori warranted and should be checked carefully.It has been demonstrated [71] that even for relatively large supercells thesampling errors can depend on the defect type and charge state and thusdo not cancel when taking the difference. Shim et al. [71] investigated va-cancy and interstitial defects in diamond and silicon, and found that for agiven supercell size, the k point sampling errors in the total energy can varyconsiderably depending on the charge state and defect type.

2.14 Charged defects and electrostatic corrections

Although the supercell approximation describes accurately and in an eco-nomical way the crucial local rearrangement of bonding between atoms andthe underlying crystal structure, it also introduces artificial long-range in-teractions between the periodic images. The most dramatic artefact is thedivergence of the overall electrostatic (Coulomb) energy for charged defectsin the periodic superlattice.

This divergence of the total energy of charged defects in the supercellapproximation is usually circumvented by the introduction of a fictitiousneutralizing background charge, often in the form of a uniform ‘jellium’ dis-tribution. The influence of this neutralizing charge in the total energy of thesupercell needs to be included. This is known as the electrostatic or Madelungcorrection ΔEc.

With the neutralizing background added, in the large-supercell limit theelectrostatic interaction of a charged defect with its periodic images in anoverall neutralized system becomes in principle negligible. However, there isno guarantee that the convergence of the Coulomb energy as a function ofthe supercell size is particularly fast. [72] In fact, classical electrostatics forlocalized charges in an overall neutral system predicts an asymptotic L−1

dependence, where L = 3√V and V is the supercell volume. This scaling law

is unfortunately slow in converging, and moreover its prefactor is in generalunknown. Consequently, electrostatic errors are easily introduced into total-energy calculations and they do not necessarily cancel when taking energydifferences, as the magnitude depends on the details of the charge distributionwithin the supercell. Over the years, several attempts to reliably estimateΔEc have been proposed.

Leslie and Gillan [73] and Makov and Payne [74] have developed correctionformulae to be applied for electrostatic correction of charged-defect arrays.

2 Supercell methods for defect calculations 51

They considered an array of localized charges immersed in a structurelessdielectric and neutralized by jellium compensation. By considering the mul-tipole expansion of the defect charge distribution, the correction formula canbe derived in the form

ΔEc ≈ − q2α

2εL− 2πqQ

3εL3+ O(L−5) (2.27)

where q is the charge of the defect, α the Madelung constant of the super-lattice, and ε the static dielectric constant of the host material. For cubicgeometries, α is 2.8373, 2.8883 and 2.885 for SC, BCC and FCC supercells,respectively. The first term corresponds to a point-charge array and a com-pensating background in a uniform dielectric.

The second term in the right-hand side of Eq. (2.27) arises from theshape-dependent interaction charge distribution inside the supercell with theneutralizing background. The parameter Q is the second radial moment ofthe defect charge density

Q =∫d3rr2[ρd(r) − ρb(r)] (2.28)

Kantorovich [75] re-examined the method for arbitrary supercell shapes, andsuggested a modified formula ignoring dipole-dipole interactions.

There is obvious ambiguity in this correction scheme. First, the staticdielectric constant is introduced as an external parameter and is not con-sistently defined. Secondly, the correction scheme is derived assuming thatthe charge perturbation introduced by the defect is well localized within thesupercell. This is strictly speaking not valid, as in classical theory the polar-ization and thus the aperiodic screening charge distribution extend over thewhole macroscopic crystal. Thus, as described in detail by Lento et al., [76]Q depends on the size of the supercell and Eq. (2.27) does not lead to awell-defined correction value. Nevertheless, Eq. (2.27), often known as theMakov-Payne formula, is popular in estimations of electrostatic corrections.

Segev and Wei [77] showed that for charges with a Gaussian distributionwith a certain width σ interacting with the background, the Madelung correc-tion should vanish as σ becomes large. Moreover, situations may arise wherelocal symmetry-breaking distortions in fact induce a net dipole in the super-cell. Dipole contributions cannot be discarded in such a situation, as theydo not cancel in the total-energy differences between perfect and defectedsupercells.

Shim et al. [78] have systematically studied the behavior of the electro-static correction for supercells of different sizes in the case vacancy and in-terstitial defects in diamond. For negatively charged vacancies and positivelycharged interstitials, the formation energies show a clear dependence on thesupercell size and are in qualitative agreement with the Makov-Payne trend.For positively charged vacancies and negatively charged interstitials, the elec-

52 Risto M. Nieminen

trostatic corrections are weak. An analysis of the spatial charge density dis-tributions reveals that these large variations in electrostatic terms with defecttype originate from differences in the screening of the defect-localized charge.A strongly localized charge is close to the point-ion model, whereas delocal-ized defect states spread out and result in weak Madelung corrections. Toconvince oneself of the proper elimination of the spurious electrostatic en-ergy, one should in the general case carry out a proper scaling analysis of theL−1 behavior and extrapolate to the infinite-L limit.

Another approach is based on fitting a multi-center Gaussian distributionto the defect charge density, and calculating and subtracting explicitly theelectrostatic interactions between the Gaussian distributions and the back-ground. This approach has so far only been used for charged molecules invacuum, [79] but it may be possible to generalize to defect supercells withaperiodic densities.

A different route to the electrostatic correction for charged defects wastaken by Schultz, [80] based again on the linearity of the Poisson equationbut avoiding the neutralizing homogeneous background. An aperiodic defectcharge distribution ρLM (r) is constructed so that it matches the electrostaticmoments of the system up to a given order. When this charge density is sep-arated from the supercell charge distribution, the remaining periodic distri-bution has zero net charge and is momentless. Thus its electrostatic energycan be accurately calculated within the periodic boundary conditions. TheCoulomb energy of ρLM (r) is calculated using ‘cluster boundary conditions’,with the surrounding polarizable, defect-free crystal replaced by perfect, non-polarizable bulk crystal. There is no analytical formula similar to Eq. (2.27)for this local-moment counter-charge (LMCC) method. Its main content isto use periodic boundary conditions to solve the Kohn-Sham equations butnot for the Poisson equation.

One can then calculate the electrostatic energy of a single defect-containingsupercell surrounded by cells with the perfect crystal charge density. In orderto correctly align the arbitrarily referenced potentials, Schultz [81] has sug-gested a method where a common reference potential can be calculated forall types of defects and also for the valence band edge of the perfect crystal.

The LMCC method should in principle lead to a rigorous 1/L conver-gence of the supercell energy, with a prefactor proportional to (1 − ε−1)q2.This method does not need a compensating background, and there are no in-teractions between the defect charge and its periodic images. It also providesa way to define a reference energy independent of the defect charge state foraligning the band edges. The implementation of the LMCC method wouldthus seem very much worth the effort for calculations of charged defects.

2 Supercell methods for defect calculations 53

2.15 Energy-level references and valence-band alignment

As is obvious from Eq. (2.13), another important parameter for supercell-based total-energy calculations is the position of the valence-band maximum(VBM), EV , which is usually taken as the reference energy for the electronchemical potential. The position of the VBM of the defect-containing su-percell is different from that of the defect-free supercell, and this differencedepends in general on the charge state of the defect. Consequently, a valence-band alignment (VBA) correction is usually applied by matching the twovalues. The magnitude of this correction is

ΔEV BA =⟨V eff

bulk

⟩−⟨V eff

defect

⟩(2.29)

where V eff is the effective Kohn-Sham potential and the brackets denoteaveraging over the bulk and defect supercells, respectively.

Other potential energy references are naturally possible. One possibility isto align a electronic core or semicore level energy (in all-electron calculations),or define all energies with respect to the so-called crystal zero, the potentialenergy at the surface of a neutral Wigner-Seitz cell. This choice is particularlysimple and useful when using the atomic-sphere approximation (ASA), forexample in the context of the linear-muffin-tin-orbital (LMTO) method.

2.16 Examples: the monovacancy and substitutionalcopper in silicon

The literature on supercell calculations of defects in semiconductors is vastand not even a partial review is attempted here. Instead, we discuss in moredetail a recent systematic case study, where the practical implementationof two popular approaches were critically examined. The study involves theprototypical (but far from trivial) point defects in Si, the monovacancy VSi

and substitutional copper CuSi.The fundamental nature of VSi means that it is a much studied defect. It

has four unpaired electrons, and in a simple LCAO picture they occupy fourorbitals in the atoms surrounding the vacancy in the diamond structure. [61]These induce electronic states in the gap, which are modified by symmetry-lowering structural distortions driven by the Jahn-Teller effect.

A comprehensive study of the convergence of the formation energy of theneutral silicon vacancy as a function of the supercell size has been givenby Puska et al. [15] (see also Ref. [82]). The result is shown in Fig. 2.5. Itturns out that very large supercells are necessary for proper convergence ofthe vacancy formation energy, and that the relaxation pattern around thevacancy undergoes a qualitative change (from outward to inward relaxation)as the supercell size grows. This is also shown by calculations for large

54 Risto M. Nieminen

Fig. 2.5. Formation energy of the neutral vacancy in Si. The supercell size (numberof atoms) is given on the top of each panel, and the k-point set used for Brillouin-zone sampling is indicated as 1 = Γ , 2 = 23, 3 = 33, 4 = (1/4, 1/4, 1/4) and5 = Γ + L. From Ref. [15].

finite clusters. [16] The total-energy hypersurface is flat, especially fornegatively charged defects, which makes the unambiguous determination ofthe point symmetry difficult. This is also reflected in the difficulty of deter-mining the ionization levels accurately. This will now be discussed for bothVSi and CuSi.

CuSi has a similar structure to VSi: the vacancy is perturbed by theCu 3d states that lie in the valence band. The Jahn-Teller theorem predictsthat CuSi should undergo a symmetry-lowering distortion with magnitudeconstrained by the presence of the metal atom. Spin-orbit coupling effectsmay also reduce the amount of distortion by lifting the orbital degeneracy.

2 Supercell methods for defect calculations 55

2.16.1 Experiments

The greater structural freedom of VSi introduces another possible (and oftenencountered) complication. The total energy gained by pairing electrons indangling bonds, associated with a structural distortion, can outweigh theirmutual Coulomb repulsion. This is the famous ‘negative effective-U phe-nomenon, which leads to unexpected ordering of the ionization levels in theband gap. In the case of VSi the pairing for the neutral defect is much strongerthan for the singly positive +1 state, which shifts the ionization (donor) levelEd (0/+1) below that for the second donor Ed (+1/+2). Thus the singlypositive silicon vacancy is a metastable state, a fact confirmed with DLTSand EPR measurements by Watkins et al. [83] The energy position measuredin these experiments gives for the double donor level the value Ed (+1/+2)≈ Ev + 0.13 eV, while EPR studies show that V+

Si ionized by photoexci-tation decays to the neutral state from a donor level at Ed (0/+1) ≈ Ev +0.05 eV. EPR studies together with stress-alignment experiments also demon-strate that the Jahn-Teller effect causes structural distortions in accord witha straightforward one-electron model. This model predicts that the neutraland positively charged defects have D2d symmetry, while the negative chargestates have C2v symmetry. The energies of

the acceptor levels Ed (-1/0) and Ed (-2/-1) are experimentally unknown.For CuSi, there are accurate experimental values [84] for its ionization

levels and thus it provides a good testing ground for theory and computation,even if experimental structural information is lacking. The defect has a singledonor level at Ed (0/+1) ≈ Ev + 0.207 eV, an acceptor level Ed (-1/0) at≈ Ev + 0.478 eV ≈ Ec − 0.69 eV, and a double acceptor at Ed (-2/-1)at Ec − 0.167 eV.

The Jahn-Teller distortions for VSi and CuSi (and more generally sub-stitutional transition-metal impurities in Si) possess two components. Themain one is tetragonal in character and gives the defects D2d symmetry. Thedirection of this distortion may occur in two senses, and this determines therelative energetic ordering of the resulting electronic states. The two sensescorrespond to two different shapes for the bounding box with {100} faces thatcontains the four atoms surrounding the center: one is short and broad whilethe other is long and thin. They correspond to different relative lengths forthe six Si-Si distances between these four atoms. For the D2d point group,four the distances belong to one equivalent set and two belong to secondequivalent set. One type of distortion (type A) has the pair longer than thefour, and for the other (type B) it is the other way round. Type A splitsthe t2 state into a singlet a1 state above the doublet e state, while type Bdoes the opposite. Transition-metal impurities relax in the A-pattern, andmonovacancies in Si relax in the opposite sense, B.

When the system contains sufficient electrons to occupy the e state, aweaker trigonal distortion is expected. It lowers the symmetry to C2v. Theequivalent pair of Si-Si lengths in the D2d case become unequal. This splits

56 Risto M. Nieminen

the e state into orbitals of b1 and b2 character. Spin-orbit coupling may alsoaffect the splitting of these states for transition-metal impurities. [85]

2.16.2 Calculations

In a recent study, Latham et al. [86] have carried out a systematic quan-titative study of the electrical levels and associated structural relaxationsof VSi and CuSi. To conduct the study, two different computer programs,AIMPRO [87] and VASP [88] were used. The former uses localized Gaussianorbitals as the basis set for the Kohn-Sham wavefunctions, while the chargedensity is expanded in plane waves. VASP uses plane waves throughout. InAIMPRO, the core electrons of atoms are represented by pseudopotentialsconstructed according to the Hartwigsen-Goedecker-Hutter (HGH) norm-conserving scheme. [42] The VASP package includes pseudopotentials basedon the Vanderbilt ultrasoft [43] construction or, alternatively, the projected-augmented-wave (PAW) method. [45] The Cu 3d electrons are included ex-plicitly in the HGH and PAW schemes, while they are not in the Vanderbiltpseudopotential.

In both methods, the exchange-correlation energy is evaluated using theLSDA formula described by Perdew and Wang. [89] Unit cells of 216 atomsare used in all cases, and the supercell band structure is sampled by using theMonkhorst-Pack scheme with 23k points, folded according to the symmetryof the system and shifted to avoid the Γ point. This enables the total energyconvergence of better than 10−4 eV per supercell, with the plane wave kineticenergy cutoffs chosen to meet this requirement (in AIMPRO the Coulombenergy is evaluated also using the plane-wave expansion). The actual valueof the cutoff depends on the atoms present and the chosen pseudopotential.For charged defects, a compensating background charged is introduced asdiscussed in Sec. 13, and a finite-size scaling is performed.

For bulk Si , both methods reproduce the lattice parameter a and thebulk modulus B0 well (see Table 6.2). The calculation of the band gap Eg isa deficiency of LSDA-DFT formalism, as discussed above. The two methodsgive similar values, Eg(AIMPRO) = 0.71 while Eg(VASP) = 0.60 eV. TheVASP results does not depend on whether the core electrons are treated bythe ultrasoft pseudopotential or PAW.

Table 2.1. Calculated and measured values for the lattice parameter a and thebulk modulus B0 of Si.

Method a (A) B0 (GPa)

AIMPRO+HGH 5, 395 97.0VASP+VUS 5.390 87.7VASP+PAW 5.403 72.8experiment 5.431 97.9

2 Supercell methods for defect calculations 57

All the defect electron levels considered here are sufficiently deep so thatthe energies of donors can be given with respect to Ec and acceptors withrespect to Ev. As reference (marker) energies Latham et al. choose addi-tion/removal energies of ideal Si supercells with same size (216 atoms) asused for the defect calculations. This means that the energy levels are ref-erenced to the nearest ideal-crystal band edge (i.e. the marker method withrespect to the two band edges). None of the studied defects are so shallowthat their energy would be between the theoretical and true band gap. If thiswere the case, it would be necessary to use the opposite ideal-crystal bandedge as the reference potential.

A summary of the calculated electrical levels, compared with experimentaldata, is presented in Table 2.2.

Table 2.2. Calculated and measured electrical levels [86] for VSi and CuSi.

Transition AIMPRO VASP VASP Ref. 15 Expt.state VUS PAW [61,84]

VSi 0/+2 Ev + 0.0 Ev + 0.06 Ev + 0.15 Ev + 0.09VSi +1/+2 Ev + 0.05 Ev + 0.11 Ev + 0.19 Ev + 0.13VSi 0/+1 Ev − 0.04 Ev + 0.02 Ev + 0.11 Ev + 0.05VSi -1/0 Ec − 0.31 Ec − 0.34 Ec − 0.57 existsVSi -2/-1 Ec − 0.43 Ec − 0.23 Ec − 0.40 existsVSi -2/0 Ec − 0.37 Ec − 0.29 Ec − 0.49CuSi 0/+1 Ev + 0.17 Ev + 0.07 Ev + 0.17 Ev + 0.21CuSi -1/0 Ec − 0.50 Ec − 0.45 Ec − 0.34 Ec − 0.69CuSi -2/-1 Ec − 0.24 Ec − 0.18 Ec − 0.26 Ec − 0.17

When both program packages are set to use similar thresholds in termsof energies and forces to decide structural optimization, the outcome showssome differences. The AIMPRO package finds that the direction of the maintetragonal distortion (D2d symmetry) is in the expected direction, type Bfor VSi and type A for CuSi. The VASP package also finds type B D2d forneutral VSi. VASP yields type A D2d distortion for CuSi when the Vander-bilt pseudopotentialis used, but no significant deviation from the perfect Td

symmetry when PAW is used.Both methods give equal energies for the (0/+1) level of CuSi. Regard-

less of constraint choice, no clear evidence is seen for the expected trigonalcomponent of the distortion of CuSi. The situation is more complex for VSi.While the optimized structure in positive and neutral charge states has D2d

symmetry as expected, the unconstrained lowest-energy structure is a D3d

symmetry ‘split-vacancy’ configuration. The split-vacancy configuration isone where a Si atom is located between two unoccupied lattice sites, andis 0.04 eV lower in energy than the ideal Td monovacancy. This result fornegatively charged vacancies was also found by Puska et al. [15]

58 Risto M. Nieminen

According to Latham et al., the energy differences between the C2v andC3v configurations of VSi in the q = −1 and q = −2 charge states are 0.02 and0.20 eV, respectively. This means that VSi in the C3v symmetry is calculatedto have a negative-U behavior also for the acceptor levels. Little is presentlyknown from experiment about the negative charge states other than thatthey apparently exist.

The calculated energy for the (-1/0) acceptor level of CuSi near midgapis somewhat deeper than the measured value, while the second acceptor isshallower. The Jahn-Teller effect moves these levels apart, moving each by0.1 eV. Thus the calculations seem to slightly underestimate the magnitudeof the Jahn-Teller effect. It is reasonable to suppose that a similar patternwill be found for the acceptor states of VSi if they can be measured, and theexpected C2v model without a negative-U effect would prevail.

2.17 Summary and conclusions

Calculations based on the density-functional theory and the supercell methodenable quantitative estimates for the formation energies, diffusion barriers,structural parameters and electrical levels of defects in semiconductors. More-over, vibrational entropies and free energies can be estimated from the calcu-lated force fields, which also enable first-principles molecular dynamics simu-lations. It is also possible to feed the first-principles results for migration andreaction barriers into kinetic Monte Carlo simulations based on master equa-tions. This opens up the possibility for multiscale simulations of annealingkinetics and related phenomena. Quantitatively accurate total-energy calcu-lations open the way to first-principles thermodynamics and kinetics.

However, reaching the desired quantitative accuracy in the total-energycalculations is not always straightforward, and requires careful considerationof several possible sources of error. In particular, the positioning of defect-related electronic levels in the fundamental semiconducting gap can be quiteproblematic. The sources of errors may be divided into three main categories.One is the underlying theory, especially the treatment of electronic exchangeand correlation. Local and semilocal approximations do not properly describethe discontinuity of the corresponding density functional, which contributesto the underestimation of the fundamental gap. The second source of errorare the finite-size effects inherent in the supercell method, which need properscaling analysis as function of the cell size. These errors include the spu-rious defect-defect interaction, dispersion of defect-related electronic states,incomplete sampling of the reciprocal cell, and the constrained relaxation ofatoms around the defect. The third source of errors are the approximationsrequired to construct a specific implementation, including the pseudopoten-tialgeneration, basis set construction, and numerical accuracy of the algo-rithms.

2 Supercell methods for defect calculations 59

The power of the density-functional methods is considerable in revealingand rationalizing the systematic trends in electronic and structural propertiesof defects in semiconductors. These include the nature (acceptor or donor) ofthe deep levels, their spin structure, point symmetry, and energetics. Theycan also predict the localized vibrational modes associated with defects, andprovide the starting point (for example the ground-state electron density)for quantitative calculations of many experimentally observed characterizingsignals (such as positron-annihilation parameters). Total-energy calculationsalso reveal the details of the potential-energy hypersurfaces for defects movingin the lattice, which makes it possible to study the long-time kinetics by usingMonte Carlo techniques.

Quantitatively accurate supercell calculations for defects in semiconduc-tors are notoriously difficult and demand considerable computational re-sources. While many confusing and seemingly contradictory results have beenpublished in the scientific literature, the available computational methodshave steadily matured. With the increase of computer power available forsuch calculations, it is now possible to perform well-benchmarked calcula-tions with considerable predictive power.

Acknowledgments

I have benefited from several useful discussions with Chris Latham and MariaGanchenkova. This work has been supported by Academy of Finland throughthe Center of Excellence Grant (2000-2005).

References

1. J. Bourgoin and M. Lannoo, Point Defects in Semiconductors II. ExperimentalAspects, Springer Series in Solid State Sciences 35 (Springer, Heidelberg, 1983).

2. R.M. Nieminen, J.Phys.: Condens. Matter 14, 2859 (2002).3. W. Kohn and L.J. Sham, Phys. Rev. 136, B864 (1964); R.M. Martin, Electronic

Structure: Basic Theory and Practical Methods(Cambridge University Press,Cambridge 2004).

4. A.E. Mattsson, P.A. Schultz, M.P. Desjarlais, T.R. Mattsson, and K. Leung,Modelling Simul. Mater. Sci. Eng. 13, R1 (2005).

5. M.J. Puska and R.M. Nieminen, Rev. Mod. Phys. 66, 841 (1994).6. C.G. van de Walle and P.E. Blochl, Phys. Rev. B 47, 4244 (1993).7. C.J. Pickard and F. Mauri, Phys. Rev. B 63, 245101 (2001); C.J. Pickard and

F. Mauri, Phys. Rev. Lett. 88, 086403 (2002).8. R.P. Messmer and G.D. Watkins, Phys. Rev. Lett. 25, 656 (1970).9. S. Ogut, H. Kim and J.R. Chelikowsky, Phys. Rev. B 56, R11353 (1997).

10. G. Baraff and M. Schluter, Phys. Rev. B 19, 4965 (1979).11. O. Gunnarsson, O. Jepsen and O.K. Andersen, Phys. Rev. B27, 7144 (1983);

M.J. Puska, O. Jepsen, O. Gunnarsson and R.M. Nieminen, Phys. Rev. B 34,2695 (1986).

60 Risto M. Nieminen

12. S.K. Estreicher and D.S. Marinyck, Phys. Rev. Lett. 56, 1511 (1986).13. S.B. Zhang and J.E. Northrup, Phys. Rev. Lett. 67, 2339 (1991).14. H. Dreysse (ed.), Electronic Structure and Physical Properties of Solids. The

Uses of the LMTO Method, Lecture Notes in Physics (Springer, Berlin 2000).15. M.J. Puska, S. Poykko, M. Pesola and R.M. Nieminen, Phys. Rev. B 58, 1318

(1998).16. S.Ogut and J.R. Chelikowsky, Phys. Rev. B 64, 245206 (2001).17. K. Laaksonen, H.P. Komsa, E. Arola, T.T. Rantala and R.M. Nieminen, Phys.

Rev. B (in press).18. M. Stadele, J.A. Majewski, P. Vogl and A. Gorling, Phys. Rev. Lett. 79, 2089

(1997).19. M. Stadele, M. Moukara, J.A. Majewski, P. Vogl and A. Gorling, Phys. Rev.

B 59, 10031 (1999).20. R.J. Magyar, A. Flezsar and E.K.U. Gross, Phys. Rev. B 69, 045111 (2004).21. D. Vogel, P. Kruger and J. Pollmann, Phys. Rev. B 54, 5495 (1996).22. S. Sharma, J.K. Dewhurst and C. Ambrosch-Draxl, Phys. Rev. Lett. 95, 136402

(2005).23. A. Seidl, A. Gorling, P. Vogl, J.A. Majewski and M. Levy, Phys. Rev. B 53,

3764 (1996).24. J. Lento and R.M. Nieminen, J. Phys.: Condens. Matter 15, 4387 (2003).25. M.S. Hybertsen and S.G. Louie, Phys. Rev. B 34, 5390 (1986).26. S. Ismail-Beigi and S.G. Louie, Phys. Rev. Lett. 95, 156401 (2005).27. J. Laegsgaard and K. Stokbro, Phys. Rev. Lett. 86, 2834 (2001).28. M. d’Aveszac, M. Calandra and F. Mauri, Phys. Rev. B 71, 205210 (2005).29. J. Perdew and A. Zunger, Phys. Rev. B 23, 5048 (1981).30. A. Svane and O. Gunnarsson, Phys.Rev. B 37, 9919 (1988).31. L. Hedin, Phys. Rev. 139, A796 (1965).32. G. Onida, L. Reining and A. Rubio, Rev. Mod. Phys. 74, 601 (2002).33. M. Rohlfing and S.G. Louie, Phys. Rev. B 62, 4927 (2000).34. A. Janotti and C.G. van de Walle, Appl. Phys. Lett. 87, 122102 (2005).35. S. Lang and A. Zunger, Phys. Rev. B 72, 035215 (2005).36. V.I. Anisimov, F. Aryasetiawan and A.I. Liechtenstein, J.Phys.: Cond. Matter

9, 767 (1997).37. W.K. Leung, R.J. Needs, G. Rajagopal, S. Itoh and S. Ihara, Phys. Rev. Lett.

83, 2351 (1999).38. R.Q. Hood, P.R.C. Kent, R.J. Needs and P.R. Briddon, Phys. Rev. Lett. 91,

076403 (2003).39. C.G. Van de Walle and J. Neugebauer, J. Appl. Phys. 95, 3851 (2004).40. D.R. Hamann, Phys. Rev. B 40, 2980 (1989).41. N. Troullier and J.L. Martins, Phys. Rev. B 43, 1993 (1991).42. C. Hartwigsen, S. Goedecker and J. Hutter, Phys. Rev. B 58, 3641 (1998).43. D. Vanderbilt, Phys. Rev. B 41, 7892 (1990).44. L. Kleinman and D.M. Bylander, Phys. Rev. Lett. 48, 1425 (1982).45. P.E. Blochl, Phys. Rev. B 50, 17953 (1994).46. G. Kresse and D. Joubert, Phys. Rev. B 59, 1758 (1999).47. S.G. Louie, S. Froyen and M.L. Cohen, Phys. Rev. B 26, 1738 (1982).48. D. Porezag, M.R. Pederson and A.V. Liu, Phys. Rev. B 60, 14132 (1999).49. D. Vogel, P. Kruger and J. Pollmann, Phys. Rev. B 55, 12836 (1997).

2 Supercell methods for defect calculations 61

50. M. Fuchs, M. Bockstedte, E. Pehlke and M. Scheffler, Phys. Rev. B 57, 2134(1998).

51. P.Ordejon, E. Artacho and J.M. Soler, Phys. Rev. B 53, R10441 (1995); J.M.Soler, E. Artacho, J.D. Gale, A. Garcia, J. Junquera, P. Ordejon and D.Sanchez-Portal, J. Phys.: Condens. Matter 14, 2745 (2002).

52. O. Treutler and R. Ahlrichs, J. Chem. Phys. 102, 346 (1995).53. T.L.Beck, Rev. Mod. Phys. 72, 1041 (2000).54. M. Heiskanen, T. Torsti, M.J. Puska and R.M. Nieminen, Phys. Rev. B 63,

245106 (2001).55. S. Nose, J. Chem. Phys. 81, 511 (1984); G.H.Hoover, Phys. Rev. A 31, 1695

(1985).56. M. Parrinello and A. Rahman, J. Appl. Phys. 52, 7182 (1980).57. E. Runge and E.K.U. Gross, Phys. Rev. Lett. 52, 997 (1984).58. M. Petersilka, U.J. Gossmann and E.K.U. Gross, Phys. Rev. Lett. 76, 1212

(1996).59. G. Vignale and W. Kohn, Phys. Rev. Lett. 77, 2037 (1996)60. H. A. Jahn and E. Teller, Proc. Roy. Soc. A 161, 220 (1937).61. G.D. Watkins, The Lattice Vacancy in Silicon (Gordon and Breach, Yverdon,

Switzerland 1992).62. G. Baraff, E.O. Kane and M. Schluter, Phys. Rev. Lett. 43, 956 (1979).63. G.D.Watkins, Physica B 117-118, 9 (1983).64. L. Torpo, R.M. Nieminen, S. Poykko and K. Laasonen, Appl. Phys. Lett. 74,

221 (1999).65. S. Baroni, S. de Gironcoli and A. Dal Corso, Rev. Mod. Phys. 73, 515 (2001).66. J.M. Pruneda, S.K. Estreicher, J. Junquera, J. Ferrer and P. Ordejon, Phys.

Rev. B 65, 075210 (2002).67. J. Coutinho, V.J. B. Torres, R. Jones and P.R. Briddon, Phys. Rev. B 67,

035205 (2003).68. D.J. Chadi and M.L. Cohen, Phys. Rev. B 8, 5747 (1973).69. H.J. Monkhorst and J.D. Pack, Phys. Rev. B 13, 5188 (1976).70. G. Makov, R. Shah, and M.C. Payne, Phys. Rev. B 53, 15513 (1996).71. J. Shim, E.-K. Lee, Y.J. Lee and R.M. Nieminen, Phys. Rev. B 71, 035206

(2005).72. S.W. de Leeuw, J.W. Perram, and E.R. Smith, Proc. Roy. Soc. London, Ser. A

373, 27 (1980).73. M. Leslie and M.J. Gillan, J. Phys. C 18, 973 (1985).74. G. Makov and M.C. Payne, Phys.Rev. B 51, 4014 (1995); M.R. Jarvis, I.D.

White, R.W. Godby, and M.C. Payne, Phys. Rev. B 56, 14972 (1997).75. L.N. Kantorovich, Phys. Rev. B 60, 15476 (1999).76. J. Lento, J.-L. Mozos and R.M. Nieminen, J. Phys.: Condens. Matter 14, 2637

(2002).77. D. Segev and S.H. Wei, Phys. Rev. Lett. 91, 126406 (2003).78. J. Shim, E.-K. Lee, Y.J. Lee and R.M. Nieminen, Phys. Rev. B 71, 245204

(2005).79. P. Blochl, J. Chem. Phys. 103, 7482 (1995).80. P.A. Schultz, Phys. Rev. B 60, 1551 (1999).81. P.A. Schultz, Phys. Rev. Lett. 84, 1942 (2000).82. D.A. Drabold, J.D. Dow, P.A. Fedders, A.E. Carlsson, and O.F. Sankey, Phys.

Rev. B 42, 5345 (1990).

62 Risto M. Nieminen

83. G.D. Watkins and J.R. Troxell, Phys. Rev. Lett. 44, 593 (1980).84. S. Knack, J. Weber and H. Lemke, Physica B 273-274, 387 (1999); S. Knack,

J. Weber, H. Lemke and H. Riemann, Phys. Rev. B 65, 165203 (2002).85. G.D. Watkins and P.M. Williams, Phys. Rev. B 52, 16575 (1995).86. C.D. Latham, M. Ganchenkova, R.M. Nieminen, S. Nicolaysen, M. Alatalo, S.

Oberg and P.R. Briddon, to be published.87. J. Coutinho, R. Jones, P.R. Briddon and S. Oberg, Phys. Rev. B 62, 10824

(2000).88. G. Kresse and J. Furthmuller, Comp. Mater. Sci. 6, 15 (1996); Phys. Rev. B

54, 11169 (1996).89. J.P. Perdew and Y. Wang, Phys. Rev. B 46, 12947 (1992).

3 Marker-method calculations for electrical

levels using Gaussian-orbital basis-sets

J P Goss, M J Shaw and P R Briddon

School of Natural Science, University of Newcastle, Newcastle upon Tyne, NE17RU [email protected], [email protected],

[email protected]

3.1 Introduction

It is well known that the electrical and optical properties of a crystalline ma-terial are influenced by impurities and other defects. In particular, conduc-tivity at a given temperature is related to the free-carrier concentration, thatin turn depends on the depth of the donor or acceptor-level of the dominantdopant. Therefore one requires dopants with the smallest ionization energy,a value that is limited to the effective-mass level of the host material [1].Conversely, electron and hole traps deep in the band-gap are detrimentalto the conductivity because they reduce the free-carrier concentration andthe resultant charged defects are scattering centers that reduce carrier mobil-ity. Deep levels are also non-radiative recombination centers affecting opticalproperties.

The importance of the electrical characteristics of defects has driven ahuge effort to develop reliable computational methods that can both explainexperimental observations and predict new dopants with the desired proper-ties. Currently, most calculations in this field are performed using density-functional theory (DFT) [2, 3], chiefly simulating crystalline materials usingperiodic boundary conditions (PBCs), and plane-wave (PW) basis-sets [4] torepresent the wave-functions of the electrons.

With PBCs, a section of host material containing the dopant or defectis constructed and this ‘supercell’ is repeated periodically throughout space.Such a construction represents a crystal of defects, and thus the use of PWs isa natural choice. If the defect and its periodic images are sufficiently spatiallyseparated they do not interact significantly and the calculated propertiesrepresent those of isolated defects. This means that in order to obtain reliabledata for defects (especially shallow dopants with large wave function extents)large supercells must be used. However, the computational cost increasesrapidly with system size and in practice most of today’s calculations containup to a few hundred atoms.

The DFT-PBC approach has at least two major problems when calculat-ing electrical properties. The first is that within most commonly used local-density (LDA) or generalized-gradient (GGA) approximations, the band-gapis underestimated, sometime catastrophically. For example,crystalline Ge, in

64 J P Goss, M J Shaw and P R Briddon

reality a semiconductor with a band-gap 0.74 eV [5], is found to be a semi-metal [6]! Secondly, in order to obtain information about the capacity fordefects to adopt different charge states one needs to know the properties ofcharged systems. Strictly this is not possible with PBCs, but charged systemscan be approximated by the use of a uniform background charge that exactlycancels the charge on the defect. Even then, this represents a space-fillingarray of charges that have an associated electrostatic energy that does notcorrespond to the properties of isolated defects.

These problems are both mitigated by the use of atomic clusters. Heredefects are imbedded into large sections of the host material, but instead ofrepeating it periodically, the surface of the material is passivated (usuallyby hydrogen) effectively forming a large molecule. The cluster geometry im-poses an artificial confining potential increasing the band-gap and localizingelectronic states. However, although this effect counteracts the inherent un-derestimate of the band-gap, the surface–defect interactions are of a similarnature to the defect–defect interactions in supercells.

Although there are several sources of error in the calculations, many ofthem are somewhat systematic in nature, and hence relative locations ofelectrical-levels may be more reliable than absolute values. This has led tothe use of so-called marker-methods [7–9], where one references the systemof interest to one which is well understood: for example, the donor-level of Pin diamond is known, whereas those of As and Sb are not. Theory shows thatthey are shallower than phosphorus [10] and may therefore represent an im-provement in the production of n-type diamond. (Of course, such calculationsdo not reflect on the relative solubilities of these dopants.)

Finally, let us introduce the role of the basis-functions used to describe theelectrons in our problem. The use of PWs does not represent a fundamentalproblem, but in order to treat many systems of interest where the wavefunctions and charge-density vary rapidly in space, the use of PWs oftennecessitates additional effort (such as high energy cut-offs) or the use ofso-called ultra-soft pseudo-potentials where one systematically removes theaspects of the electrons which are difficult to represent efficiently with PWsdue to their localized nature. It is true that PWs may be viewed as a near idealbasis-set for effective-mass-like shallow-levels, since such donor and acceptorstates are built out of the band-edge Bloch-states. However, for deep levelsthe wave functions are often highly localized and can be characterized rathermore simply by the use of molecular orbitals.

The DFT package AIMPRO (Ab-Initio Modeling PROgram) uses a real-space basis made up from Gaussian functions [11, 12]. The real-space basisalso allows the same electronic structure methods to be applied to atomicclusters and molecules. In this paper, we first present a detailed descrip-tion of the basis-sets used in our calculations. In Sect. 3.3 we describe thecomputational methods for calculating electrical levels, and in particular the

3 Electrical levels with Gaussian bases 65

marker-method and present a review a range examples of its use in differentgroup-IV materials in Sect. 3.4.

3.2 Computational method

Our intention is to detail the calculation of electrical levels and not reviewthe entire formalism used in the calculations. However, briefly, AIMPRO is anelectronic structure program using LDA- and GGA-DFT. Atoms are treatedusing pseudo-potentials [13] to remove the core electrons, and the distributionof the electrons solved self-consistently for a given set of atoms. AIMPRO canbe used with a range of pseudopotentialtypes, including the semi-non-localforms [14–16], and the dual-space separable pseudopotentials [17]. In additionto the efficiencies offered by their separability, the latter pseudopotentialsoffer extended norm conservation, accounting for a large number of occupiedan unoccupied atomic levels. Currently, by default, AIMPRO uses the Perdew-Wang functional [18] for PBC calculations and a Pade approximation tothis functional [19] for cluster calculations, although a number of alternativefunctionals are available. Historically, cluster calculations used an extremelyfast parameterized analytic form for the exchange-correlation functional asdescribed in detail previously [12]. GGA calculations are performed using aWhite-Bird [20] implementation of the GGA functional [21].

AIMPRO can be used both in periodic and cluster modes and the atomicstructures optimized either using static minimization methods or moleculardynamics. From these calculations a range of observables can be derived,such as the electrical levels introduced above, vibrational mode frequenciesand symmetries, electronic structures, hyperfine and zero-field splitting ten-sors, electron energy loss spectra, migration barriers, and so on. However,in contrast to the majority of PBC calculations, AIMPRO represents the wavefunctions using a range of real-space, Gaussian-orbital functions.

3.2.1 Gaussian basis-set

In order to construct and solve the Hamiltonian for our systems we expandour wave functions in terms of a basis-set. Many different basis types areused for first-principles calculations, and each has associated advantages anddisadvantages. There is not a firm consensus on the optimum basis-set touse, and since the choice of basis-set plays a significant role in determiningthe structure and nature of the calculations, the particular bases used andissues concerning them are worthy of some comment here. In the discussionwhich follows we shall consider only the arguments surrounding the selectionof basis-sets for pseudopotential calculations: clearly all-electron calculations,which must describe the rapid fluctuations of the wave functions in the ioniccore region place additional demands on the basis-set.

66 J P Goss, M J Shaw and P R Briddon

As alluded to above, of the many basis-sets which are routinely used,perhaps the most significant distinction lies between the expansion in termsof PWs, and the use of basis functions which are localized functions in real-space. AIMPRO uses the latter type, specifically a set of Cartesian Gaussianorbitals (CGOs) centered on atoms and possibly other positions. These basisfunctions are products of a simple Gaussian and a polynomial in the positionvector relative to the center of the Gaussian:

φi,n1n2n3(r) = xn1yn2zn3e−αir2, (3.1)

where αi reflects the spatial extent of the function; larger(smaller) valuescorrespond to more localized (diffuse). The term xn1yn2zn3 defines the an-gular dependence, and relates to the spherical-harmonic associated with theangular-momentum and magnetic quantum-numbers, l and m. For example,a px-orbital has n1 = 1, n2 = n3 = 0. These polynomial combinations enablethe basis to describe atomic-like states of arbitrary angular-momentum.

The basis-set is defined by the combination of Gaussian exponents andangular-momentum to which the combinations of polynomial extends. Theadvantages and disadvantages of PW and these real-space basis-sets are al-ready documented in the literature [12], however it is useful to consider someof the key differences between them.

A key advantage of CGOs is that they can be non-uniformly distributed inspace, lending a more flexible basis to regions which require it (for examplein the region of a defect in a crystal). Such an approach cannot be takenwith PW bases: the region with greatest complexity defines the density ofPWs everywhere. For systems containing a small region in which the wavefunctions vary rapidly a high energy cut-off might be required, and this densePW basis-set must exist throughout the whole crystal. This leads to a verylarge Hamiltonian and slow calculation. A particular example of this problemis where surfaces are treated using a PBC via the inclusion of regions ofvacuum.

An advantage of a PW representation is that the basis functions are neces-sarily orthogonal. Consequently, the number of PWs can be increased withoutlimit while retaining the stability of the calculation. CGOs are not orthog-onal and the use of too large a basis-set can result in numerical instabilitydeveloping. The stability of the basis becomes a significant issue when oneattempts to address the question of how one should choose the basis for aparticular problem. For a PW calculation it is possible to ensure that one hasarrived at a converged result by systematically increasing the PW cut-off. Fora CGO basis-set it is not possible to perform a systematic test to ensure thata convergent basis has been arrived at. Indeed, the choice of exponents forthe functions in the basis-set is a non-trivial one. We shall discuss the detailsof the basis-set optimization procedure in more detail below.

We now consider the CGOs in more detail. The task is to obtain a setof exponents for which the pseudo-wave-functions are accurately described

3 Electrical levels with Gaussian bases 67

without rendering the calculations unstable. In order to achieve this, thevariational principle is employed: when one compares any two basis-sets, thatwhich allows the wave functions to describe the true wave function most ac-curately results in the lowest total-energy. The minimization of the totalenergy with respect to the exponents of the CGOs therefore results in theoptimum basis-set for a given number of functions. This process may then berepeated for different numbers of functions to assess whether the basis is largeenough. Again, the number of CGOs included cannot be increased withoutlimit as the calculations become unstable. However, it is typically possibleto obtain a convergent set of exponents prior to the onset of instability. Ofcourse, an increased number of exponents increases the size of the Hamilto-nian, and in turn the computational cost. Typically, the optimum number ofbasis functions is obtained by balancing the absolute quality of the basis andthe associated accuracy of the results against the computational load.

In addition to the number and value of the exponents, αi, the polyno-mial functions included with a given Gaussian (s, p, d, f and so on) mustalso be chosen. In general a basis set contains a range of values of angular-momentum and each CGO is treated independently. The angular-momentumhas a dramatic impact upon the number of functions in the basis set: for ex-ample, angular-momentum up to s, p and d results in 1, 4 and 10 functionsper exponent, respectively [22].

As an example, four exponents per atom each of which include s and p,gives rise to 16 functions per atom. Increasing the basis to include d-functionsincreases the basis size to 40, and since diagonalization of the Hamiltonianscales as the cube of the basis size, the time to obtain a total energy wouldincrease by more than an order of magnitude.

It is therefore paramount that basis functions of higher angular-momentumare used selectively where the physical demands of the system merit. Indeed,using a small number of high angular-momentum functions is common prac-tice in Hartree-Fock calculations where the basis sets are relatively small dueto the extreme computational cost of this method, and are built up usingthe angular-momentum of the atomic species with the addition of a single,higher angular-momentum orbital as a polarization function.

For most calculations all of the CGOs are centered on atoms, and movewith the atoms during structural optimization. It is also possible to centerbasis functions at sites where there are no atoms. For example, historically,centering basis functions at appropriate positions on the bonds between adja-cent atoms in the crystal was used to improve the description of the angularvariations due to the formation of bonding orbitals. Locating basis functionsat bond-centers allows for polarization to be incorporated without increasingthe maximum angular-momentum, reducing the computational cost, but in-troducing a poorly defined set of sites: during relaxation, structural changesmay change the atoms between which bonding is present.

68 J P Goss, M J Shaw and P R Briddon

However, increases in the speed of computational resources, together withimproved algorithms has meant that bond-centered orbitals have largely beensuperseded by the use of basis sets containing higher angular-momentum. Thehigher angular-momentum components provide the

necessary angular fluctuations without the need for the additional com-plexity of bond centered basis functions. Another situation in which localizedbasis functions may be centered away from atoms is the use of ‘ghost’ atomswhen treating surfaces. In this case, basis functions are placed in the vacuumregion as though atoms were there (although of course, no atomic potentialis included). This approach can provide additional basis functions to helpdescribe the evanescent wave function variations close to the surface.

For the majority of calculations the basis functions are atom-centeredCGOs. A set of basis functions is therefore associated with each atom. Thesize of the basis-set, and the particular exponents in it, will be optimizedfor each species of atom. Additionally, it is sometimes appropriate to treatdifferent atoms of the same species with different basis-sets within the samecalculation (for example surface atoms may require a larger basis than bulkatoms in a surface calculation).

In the basis-sets specified so far, all of the CGOs are independent basisfunctions, with coefficients free to change during the calculation, and eachcontributes to the overall dimension of the Hamiltonian. It is possible, how-ever, to exploit the physical properties of the system to take linear combina-tions CGOs to form more complex basis functions that provide a comparableaccuracy but with a reduced number of independent parameters. These arereferred to as contracted basis-set and are developed in the following way.

A small reduction in the basis-sets involving d-type (and higher angular-momentum) functions can be obtained by taking linear combinations of thefunctions in Eqn. 3.1. Formally,

ψi,nlm =∑

cnlm,n1n2n3φi,n1n2n3 . (3.2)

In the case containing up to quadratic prefactors there are ten functionsφ [22], from which we may choose to produce a set of nine functions, ψ,which transform with angular-momentum l = 0, 1, 2.

A more radical approach to reduce the size of a basis set uses combinationsof these ψi,nlm:

Ψnlm =∑

Cnli ψi,nlm, (3.3)

where the number of functions of the form of Ψ is dramatically less than therelated set of ψ. Particular examples of these contracted bases appear belowin Sect. 3.2.3.

3.2.2 Choice of exponents

As mentioned above, the values of αi may be chosen by consideration of thetotal energy. Typically, this optimization process is performed for a prototyp-

3 Electrical levels with Gaussian bases 69

ical example of the system to be studied. For example, in the case of studiesof defects in silicon, the basis associated with the Si atoms is chosen to bestrepresent a bulk Si cell. A problem of optimizing the values of the exponentsis a multi-dimensional minimization problem, and may be tackled by one ofthe many standard minimization procedures available. AIMPRO allows for theoptimization of the exponents by a downhill simplex or a conjugate gradientminimization of the total energy.

Given that a typical element will require of the order of four differentexponents in its basis-set, the free optimization of the exponents is a four-dimensional minimization problem. Calculations have shown that the four-dimensional energy surface generally is very flat and undulating, resulting inmany local minima, and causing difficulties in producing a unique choice ofexponent. This is mitigated by the use of ‘even-tempered’ basis-sets in whichthe n exponents are constrained to be in a geometric series, and although itis generally true that an unrestricted basis set will lower the total energy, itis usually found that this is often not a significant effect.

For contracted basis-sets, as with the full Gaussian basis-sets, fixed pa-rameters (in this case both coefficients and exponents) may be obtained withreference to an appropriate prototypical system. Each combined contractedbasis function adds just one to the dimension of the Hamiltonian, and yetcontains variations over a number of different length scales which reflect theproperties of the physical system used in the fitting procedure. Once thecontraction coefficients have been optimized, which is typically performedusing a simplex optimization within AIMPRO, they remain fixed during thesubsequent self-consistent-field calculations.

The optimization of the exponents is an important step prior to com-mencing a real calculation. Clearly, a poorly chosen basis may impact onthe accuracy of the final calculation, by affecting the accuracy of the wavefunction description. However, as we shall demonstrate in the example be-low, the physical properties of the material (for example lattice constant andbulk modulus) are not sensitive to the fine-tuning of the basis. Only whenthe basis-set has been carelessly chosen is there a significant impact upon thephysical properties.

3.2.3 Case study: bulk silicon

Let us consider the case of a range of basis-sets for bulk silicon. Here, weshall consider the optimization of exponents in a four exponent basis-set,with angular-momenta up to d on the first and second exponent, and upto p on the third and fourth exponent (where the first α is smallest). Werefer to this as a ddpp basis. As an initial guess for the exponents we shalluse the values {0.1, 0.3, 1.5, 4.0}. The range of these values results from asimple consideration of the length scales of the CGOs, the lowest exponent(longest ranging Gaussian) allowing our basis to represent functions extend-ing into the space between adjacent atoms, the highest exponent (shortest

70 J P Goss, M J Shaw and P R Briddon

range Gaussian) allowing us to describe the short-ranged variations of thevalence pseudo-wave-functions. (Note: the basis-sets are described here forpseudopotential calculations - if all-electron calculations are to be performed,very much larger bases would be required, the highest exponents of whichwould be far greater than those needed for pseudopotentials.)

Ene

rgy

(au)

Iteration

−7.928

−7.927

−7.926

−7.925

−7.924

−7.923

0 5 10 15 20 25 30

Fig. 3.1. Total energy of Si bulk unit cell during basis exponent optimization. Theenergy is shown as a function of the number of simplex iterations

We show in Fig. 3.1 the total-energy as a function of the number of stepsin a simplex optimization of this basis. Note how the reduction in energy isrelatively steep in the early stages followed by a long tail, reflecting how amodest attempt to obtain exponents that reflect the properties of the systemis sufficient to obtain a reasonably well-converged total-energy.

Several of the partially optimized bases were used to compute the lat-tice constant and bulk modulus of the Si by fitting to the Birch-Murnaghanequation of state. These data are presented in Table 3.1.

Since the LDA is generally expected to obtain lattice constants to around1% accuracy, and bulk moduli to around 5-10% it can be seen from Table 3.1that the variation in the structural parameters with basis is relatively small.In particular, the initial, somewhat arbitrary parameters yield reasonableresults, and after only around ten iterations the changes in lattice constantand bulk modulus become minimal on this scale.

We must also consider the role of the angular-momentum in the CGOs.The size of basis-set increases with higher angular-momentum, slowing thecalculation but providing a more flexible and accurate basis-set. There is acompromise to make between accuracy and computational cost. Typically,for materials like diamond and silicon pdpp or ddpp bases are found to besufficient. However, in calculations of defects it is often the case that heavierbases are used for the atoms at the defect itself, such as dddd. This exploitsthe localized nature of the CGOs, improving the accuracy of the basis in a

3 Electrical levels with Gaussian bases 71

Table 3.1. Total energies (Hartrees) at energy minimum, lattice constants (au)and bulk moduli (GPa) for Si basis-sets at various stages of optimization. For com-parison, the experimental lattice constant and bulk modulus of silicon are 10.263 auand 97.9 GPa, respectively [23, 24].

Iterations Total Energy Lattice Constant Bulk Modulus

1 −7.9234 10.24 91

4 −7.9250 10.23 91

6 −7.9269 10.21 94

15 −7.9271 10.21 95

24 −7.9276 10.21 93

30 −7.9280 10.21 94

100 −7.9280 10.20 96

particular region with little effect on the overall accuracy or computationalcost of the calculation.

Table 3.2. Total energies (Hartrees) at energy minimum, lattice constants (au)and bulk moduli (GPa) for different optimized Si basis-sets.

Basis Nb. functions per atom Total energy Lattice constant Bulk modulus

dddd 40 −7.9335 10.17 95

ddpp 28 −7.9280 10.20 96

pdpp 22 −7.9268 10.21 95

pppp 12 −7.8974 10.35 85

Table 3.2 details the effect on the structural properties of silicon for vari-ous basis-sets. The additional cost of the extra functions in the dddd basis isunnecessary since the results are effectively the same as the computationallycheaper pdpp basis. However, the pppp basis-set is seen to be too small andthe omission of any higher angular-momentum functions prevent a sufficientlyaccurate description of the directional bonds found in covalent semiconduc-tors.

As with the plain CGO basis, a number of different contracted basis-sets can be defined according to the number of different combinations of thefunctions, and the angular-momentum to which they extend. To label thesecontracted basis-sets we draw on the conventional nomenclature of quantumchemistry, and specify, for example a 4G basis to be one in which four CGOs(the ‘G’) of different exponents are combined into fixed functions. Separatecombinations are formed for n and l representing the occupied atomic va-lence states of the element in question. For example for the s and p orbitalsin silicon this results in a total of four independent functions, one s- and

72 J P Goss, M J Shaw and P R Briddon

three p-polynomial combinations. The same contraction coefficients are usedfor different components of a given function (for example px, py and pz),although are of course included as independent basis-functions. This levelof contraction is known as a minimal basis-set as they cannot change shapeduring the calculation.

To improve the flexibility of the basis-set, a second set of contracted coef-ficients may be defined resulting in an additional set of four basis functions (atotal of eight basis functions, two s-types and 6 p-type). In our nomenclaturethis would be referred to as 44G [25]. Such basis-sets are greatly restricteddue to the fact that the highest angular-momentum which can be representedis limited to the angular momentum of the occupied atomic states. Althoughthese can describe the isolated atom well, higher angular-momentum compo-nents are needed to describe directional bonds. To account for these we mayintroduce additional ‘polarization’ functions of higher angular-momentumCGOs (for example d functions in silicon) with a single free exponent. Theinclusion of polarization functions in a basis-set is indicated by a ∗ in the no-tation, for example 44G∗. The addition of the polarization functions increasesthe basis from 8 to 13 functions per atom, which remains considerably smallerthan the ddpp plain Gaussian basis.

We have extensively studied the effect of reducing the degrees of freedomin our basis-sets by moving to increasingly small contracted sets. In Table 3.3,the minimum total energy, lattice parameter and bulk modulus are presentedfor contracted basis-sets, and compared with the uncontracted ddpp basis.

Table 3.3. Total energies (Hartrees) at energy minimum, lattice constants (au)and bulk moduli (GPa) for contracted basis-sets 44G∗, 4G∗, and 4G, as defined inthe text.

Basis Nb. functions per atom Total Energy Lattice Constant Bulk Modulus

ddpp 28 −7.9280 10.20 9644G∗ 13 −7.9269 10.19 974G∗ 9 −7.9226 10.21 1034G 4 −7.8854 10.39 91

A 44G∗ basis for materials such as Si and diamond is found to be as ef-fective as a 28 function ddpp basis. In other words, the 44G∗ basis provides aconvergent description of our wave functions for our typical host elements, ata much reduced cost in terms of the Hamiltonian. These contracted basis-setsare therefore used in many of our calculations, giving an optimum perfor-mance with regard to computational cost. Cheaper bases still, such as the4G basis are found to result in wider variations of lattice constant and bulkmodulus, as well as giving poor electronic band structures, particularly inthe conduction band.

3 Electrical levels with Gaussian bases 73

Although the 4G bases are not suitable for routine calculations, there area number of potential applications for these basis-sets. One possibility is thatthese extremely cheap bases (just four functions per atom in Si) could beused for bulk regions in very large unit cell calculations. Fuller bases couldbe used to describe the essential regions of the material (for example thedefect region and adjacent atoms) with the 4G bases filling the remainderof the bulk-like unit cell. For such schemes to work it is necessary to matchthe optimum lattice constant of the two bases to avoid unphysical internalstrain at the boundary between the two different bases. Although we haveimplemented this approach and tested the idea with encouraging results, ithas yet to be used in a real application.

3.2.4 Charge density expansions

In AIMPRO the charge density, n(r), is also fitted to a set of basis functions.With the PBC n(r) is expanded in PWs, so that the approximate chargedensity

n(r) =∑G

n(G)eiG·r, (3.4)

where all PWs with G2/2 < Ecut−off are included, with the energy cut-offchosen according to the atomic species present. The number of PWs is suf-ficient to make n a very accurate representation of the charge density n(r)given by the underlying Gaussian basis-set. This does not incur significantpenalties in either speed or memory as the expansion is performed only forone function, n(r), rather than each occupied band as is the case in conven-tional fully PW basis-set calculations. As a consequence the energy cut-offcan be very large: typical values are 300Ry and 80Ry for carbon and silicon,respectively.

In a cluster calculation, the charge density is expanded in uncontractedGaussians (Eqn. 3.1) typically up to d-type functions. More details regard-ing the charge-density basis functions in cluster calculations can be foundelsewhere [12].

3.3 Electrical levels

We now turn our attention to the calculation of electrical levels. Since themarker-method, which forms the main part of this section, can be understoodin terms of the more commonly used formation energy method, we brieflydescribe this and its associated problems.

Before we do so, however, it is important to define what is meant byan electrical level. The electrical levels of a dopant (or any active defect)correspond to a thermodynamic property of the system, and relates to the

74 J P Goss, M J Shaw and P R Briddon

chemical potential, μe of the electrons (sometimes erroneously referred to asa Fermi-level, which is strictly

only a zero-temperature quantity). Where μe lies below, say, the donor-level of an impurity, energy is released by moving an electron from the defectto the electron-reservoir at μe, and so this transfer of charge proceeds. Con-versely, if μe is higher in the band-gap than the donor-level of the defectit would cost energy to move the electron from the donor to the electron-reservoir, and therefore it does not happen. It should therefore be noted thatthese levels relate to a change of charge state and cannot be obtained fromthe electronic structure of any one of the charge states involved.

3.3.1 Formation energy

The charge-state (q) dependent, zero-temperature formation energy of a sys-tem of atoms and electrons (X) is usually written [26]:

Ef(X, q) = Et(X, q) −( ∑

atoms

μi

)+ q {Ev(X, q) + μe} + ξ(X, q), (3.5)

where Et is the total-energy calculated for the system such as obtained usingAIMPRO, μi and μe are the atomic and electron chemical-potentials, Ev(X, q)is the energy of the valence-band top and ξ(X, q) is a term that takes intoaccount artifacts introduced by the computational framework. For PBCsξ(X, q) reflects defect–defect interactions, whereas for clusters it relates todefect–surface interactions including quantum-confinement. Usually the rele-vant Et(X, q) is that of the lowest energy configuration of X in charge stateq, but not always, as noted below.

For PBCs ξ may take the form of a power series in moments of the electrondistribution, which can be viewed as terms arising from an array monopoles,dipoles, quadrupoles and so on. The magnitudes of each term may be takenfrom a simple model, such as that of Makov and Payne [27] where only theterms in a Madelung energy and a quadrupole term are typically retained.The material is taken into account simply using the static dielectric-constant,ε. Note, the Madelung term scales as q2 and hence grows rapidly with qso that, for example, for a cubic supercell of diamond of side length 2a0,the Madelung correction for q = ±3e is already the order of the band-gap!This simple approach has received considerable criticism over recent years,so other, often computationally demanding approaches have been adopted[28–31].

Notwithstanding the details of ξ(X, q), one can use Ef(X, q) to estimatethe electrical levels of X. The approach is to determine the thermodynam-ically most stable charge state for all relevant values of μe, and the criticalvalues at which charge states change are the electrical levels. For example, theacceptor-level is given by the value of μe that satisfies Ef (X, 0) = Ef (X,−1):

3 Electrical levels with Gaussian bases 75

μe =[Et(X,−1) −Et(X, 0)

]+ [ξ(X,−1) − ξ(X, 0)] −Ev(X,−1), (3.6)

as illustrated in Fig. 3.2. The diagram also illustrates the difficulties in inter-preting levels that lie in the energy-range between the theoretical and exper-imental band-gaps. Note, usually the electrical levels represent the transitionbetween the lowest energy structural configurations of each charge state, butnot always. For example, there may be a considerable energy barrier betweenconformations of a set of atoms, so that the electrical levels relate to thechange of charge state of a specific arrangement of atoms. An example of thisis that of boron-vacancy complexes in silicon, where the equilibrium numberof host sites between the impurity and the lattice-vacancy is charge statedependent [32]. Such metastability must be taken into account when tryingto assign calculated electrical transitions to those measured experimentally.

-0.8

-0.6

-0.4

-0.2

0.0

0.2

0.4

0.6

0.0 0.2 0.4 0.6 0.8 1.0

Ene

rgy

(eV

)

μe (eV)

Ef(X,0)Ef(X,-1)

Ef(X,-1)+ξ

Fig. 3.2. Illustration of the use of the formation energy to obtain an acceptor-levelfor the system X in silicon. Above the acceptor level X− is thermodynamicallymore stable that X0, whereas below the reverse is true. The shaded area indicatesthe region between the theoretical and experimental band-gaps. The inclusion ofξ in the charged formation energy pushes the acceptor-level above the theoreticalvalue for Ec, but it remains deep in the experimental band-gap

This approach has perhaps three chief problems, the first of which thepoorly defined correction term, ξ which we have already touched upon.

The second is that μe is defined relative to the valence band top in thedefective system, which may be difficult to establish with any precision. Ifone does not wish simply to use the approximation Ev(X,−1) = Ev(bulk, 0)then one might use corrections obtained either by examination of the lowestoccupied Kohn-Sham level [33], or of the electrostatic potentials in bulk anddefective systems [34] (see Sect. 3.3.2).

The third, and often most critical problem that may affect the locationof electrical-levels is the underestimate of the band-gap. For instance, theband-gap of hexagonal-ZnO is around 3.4 eV [5], but simple application of

76 J P Goss, M J Shaw and P R Briddon

the LDA- or GGA-DFT leads to band-gaps around 1 eV. Recently, hydrogenhas been shown to be a shallow donor in this material [35, 36], but chargedependent formation-energies yields donor-levels around Ev + 2 eV [37,38].

However, as previously suggested, for a given material it is often foundthat many of the problems introduce systematic errors. Therefore, one choosesalternative reference states that mitigate the charged systems, band-gap errorand other interactions introduced by the geometry: this is the marker method.

3.3.2 Calculation of electrical levels using the Marker Method

We may avoid the problems highlighted above by referencing, say, theacceptor-level of one system to that of another. Eq. 3.6 for two differentsystems, X and Y , where the acceptor-level μe(Y ) is known combine to yieldthe expression:

μe(X) = μe(Y ) +[Et(X,−1) −Et(Y,−1)

]−[Et(X, 0) − Et(Y, 0)

]− [Ev(X,−1) −Ev(Y,−1)]+ [{ξ(X,−1) − ξ(Y,−1)} − {ξ(X, 0) − ξ(Y, 0)}] . (3.7)

If terms in ξ cancel and Ev(X,−1) = Ev(Y,−1) this simply reduces to:

μe(X) = μe(Y ) +[Et(X,−1) −Et(Y,−1)

]−[Et(X, 0) − Et(Y, 0)

](3.8)

This defines an unknown acceptor-level, μe(X), with respect to the knownmarker at μe(Y ) with reference only to total energies. Note, Eqns. 3.7 and 3.8explicitly show that the calculations involve differences of energies betweensystems with the same charge.

However, we have made at least two rather bold assumptions, and it isnecessary to explore where and why they are likely to be valid.

We first examine the notion that Ev(X, q) = Ev(Y, q). Since one can adda constant potential to any system (i.e. the zero of the potential energy scaleis often poorly defined in PBC calculations), the valence-band states may berigidly offset from those of a defect-free system, and one can estimate thepotential difference between X and Y by finding the average electrostaticpotential in bulk-like regions of these systems [34]. For example, for substi-tutional impurities in a diamond-structure material, one might characterizethe difference in the background potentials by finding the difference in thetotal electrostatic potentials at the T-interstitial sites far from the impurities.Alternatively, the potential difference between two systems can be estimatedby the difference in the energies of the lowest occupied levels in X and Y [33]under the assumption that these characterize the average-potentials in thesesystems. In our experience, such corrections may be up to a few tenths of aneV, but are usually small [34].

We now turn to the assumption that the terms ξ cancel. The suggestedcorrections for PBCs of Makov and Payne [27] assume an array of point

3 Electrical levels with Gaussian bases 77

charges, but it is clear from several studies that this approach tends to give arather poor estimate. However, it is instructive to consider ξ(X, q) to be aris-ing from a series of multipole interactions. For chemically similar systems, thelarger terms in the series would naturally be close in magnitude. Therefore,the best accuracy using the marker method is expected when comparingstructurally and chemically similar systems. This remains true even whenconsidering transitions between highly charged states, which is importantbecause of the q2 dependence of the simple Makov-Payne type of correction.

However, there are two important areas where the marker-method may bedifficult to apply. The first is where the availablemarkers are far in energy andchemical nature from that of the system of interest, such as using a shallowdonor as a reference for a very deep donor. Under these circumstances thereis unlikely to be very complete cancellation in the ξ, and hence the error barsin the calculation become more significant. Furthermore, problems associatedwith the underestimate of the band-gap will also become more important.The second is that in some materials there may be no experimental data. Forexample, to our knowledge there are no unambiguous observations of doubledonors or acceptors in diamond.

Where there are no appropriate markers, one might use a bulk system asthe reference, Y , in Eqns. 3.7 and 3.8. Such an approach was detailed for boronin silicon [8] and defects in diamond [9]. Here, the first donor and acceptorlevels of a bulk system are, by definition, Ev and Ec, respectively. However,one should exercise some caution in using a bulk system as a marker; deeplevels and band edges are typically very different in character. Furthermorethe levels of shallow defects are referenced to markers at the other extreme ofthe band-gap, making the underestimate of the band-gap a significant factor.

Finally, in the preceding discussion we have largely emphasized the useof the marker-method for PBCs, but the principles are the same for clustercalculations. Indeed, the use of clusters for narrow-gap materials has twoimportant advantages over the supercell approach. The first is that the lackof periodicity means that there is no dispersion in the electronic structure.Secondly, the confinement of an atomic cluster tends to oppose the typicalunderestimation of the band-gap. For materials such as germanium for whichthe underestimation is acute, this is a very significant, qualitative effect, asreflected in the results presented below.

In the final section we review a number of examples of the application ofthe marker method using AIMPRO in both supercell and cluster modes.

3.4 Application to defects in group-IV materials

3.4.1 Chalcogen-hydrogen donors in silicon

AIMPRO was recently used to analyze the properties of a range of chalcogensand their complexes with hydrogen in silicon. In this study [34] many of the

78 J P Goss, M J Shaw and P R Briddon

issues raised above were explicitly examined, including the role of supercell-size, Brillouin-zone sampling, basis and the average electrostatic potentials.

One significant factor in using chalcogens and their complexes with hy-drogen in this study is that they have been characterized

extensively by experiment, with spin-densities, vibrational-modes and forthe substitutional chalcogens, electrical-levels being available. It was thereforepossible to show that the calculations were able to reproduce important as-pects of the defects of interest other than the electrical characteristics, therebyvalidating the conclusions drawn. In particular, Mulliken bond-populationanalysis indicated that the donor wave functions were being faithfully repro-duced in these calculations, and that the characteristics were similar for thedifferent chalcogen species, indicating that the marker-method should be ap-propriate. Additionally, for the chalcogens the defect-level dispersion are verysimilar suggesting that this effect also would cancel out using an empiricalmarker.

The chalcogens S, Se, and Te lie on-site, each having two donor-levels, aslisted in Table 3.4. The calculated donor-levels listed were obtained using theformation energies with a Madelung correction and the marker-method usingbulk-silicon and S as references.

Table 3.4. Electrical levels relative to Ec (eV) of chalcogen and chalcogen-hydrogen complexes in silicon (see Ref. [34]). Experimental data from [39].

Experiment Formation energy Bulk-marker Sulfur-marker

Defect (0/+) (+/2+) (0/+) (+/2+) (0/+) (+/2+) (0/+) (+/2+)

S 0.29 0.59 0.78 1.25 0.42 0.48 0.29 0.59

Se 0.29 0.54 0.79 1.23 0.40 0.45 0.28 0.55

Te 0.20 0.36 0.75 1.22 0.27 0.28 0.25 0.38

S–H - - 0.41 1.78 0.13 1.18 0.01 1.28

Se–H - - 0.37 1.77 0.10 1.18 −0.02 1.28

Te–H - - 0.27 1.77 0.00 1.18 −0.12 1.28

The shallow nature predicted for the X–H complexes, the structure ofwhich is shown schematically in Fig. 3.3, was viewed to be consistent with thefact that they had not been detected using deep-level transient spectroscopy[40, 41], but had been detected via their vibrational modes [42].

Three important results can be drawn from this study. The first is that theformation-energy approach yields qualitatively erroneous results: S, Se, andTe have no double-donor-level in the band-gap, and the single-donor-levelsare far too deep. This can be traced to the underestimate in the band-gapand the nature of the correction terms, ξ. The second is that, although thesingle donor-levels are reasonably reproduced using the bulk-marker method,the second donor-level is very poor, due to underestimate of the correlation

3 Electrical levels with Gaussian bases 79

Fig. 3.3. Schematics of the partially passivated chalcogen–H complex in silicon.White and black circles represent host and impurity atoms, respectively, H beingrepresented by the smaller black circle. Dashed lines indicated the cubic axes

energy. Finally, the positive result is that, at least for classes of similar defects,the empirical marker method yields very good agreement with experiment.

3.4.2 VO-centers in silicon and germanium

The properties of oxygen in silicon and germanium have been studied ex-tensively as a consequence of the incorporation of O during growth and theproperties that it lends to the materials. Of all the various roles and structuresin which oxygen is involved, one of the most primitive is oxygen substitut-ing for a host atom. Oxygen is a relatively small atom so that this centeris generally referred to as a complex of (interstitial) oxygen with a latticevacancy (VO), the structure of which is shown schematically in Fig. 3.4. VOcenters in the neutral charge state can be understood as follows. The removalof a host atom creates four ‘dangling-bonds’. Divalent oxygen passivates twoof these and the remaining two dangling-bonds then combine together in areconstruction, resulting in a defect which is fully chemically coordinated.

(a) (b)

Fig. 3.4. Schematics of the VO-center in Si and Ge. White and black circles rep-resent host and oxygen atoms, respectively. The dark bond in (a) indicates thereconstruction present in the neutral charge state. (b) shows the unreconstructedcenter characteristic of the negative charge state

80 J P Goss, M J Shaw and P R Briddon

However, these centers can trap additional electrons by breaking the rel-atively weak reconstruction, and the resultant dangling-bonds lead to statesin the band-gap. Experimentally, the VO-center in silicon has an acceptor-level at Ev + 1 eV [43]. However, the DFT calculations using supercells andformation-energies have resulted in a rather poor agreement, predicting anacceptor-level at Ev + 0.4 eV [44]. However, recalling that in these calcula-tions the band-gap of bulk silicon is underestimated by more than 50% of theexperimental value, this donor-level is approximately the correct depth belowthe theoretical Ec, which might be viewed as agreeing with experiment, andit is not clear how to interpret this calculation.

The marker-method using atomic clusters [7] where the band-gap un-derestimate is mitigated and using the acceptor-level of interstitial-carbon(Ec − 0.10 eV [45]) as a marker yielded an acceptor-level for VO in siliconat Ec − 0.13eV, in close agreement with the experimental level. Indeed, thesmall error is a testament to how robust the method is since, although theyare close-by in energy, the character of the acceptor wave functions for theVO and carbon-interstitials are not closely related.

The importance of the band-gap error is even greater when consider-ing Ge. Recent calculations have also adopted the cluster configuration ofAIMPRO to analyze a range of defects in this material [46]. In particular, VOexperimentally has acceptor and double-acceptor-levels at Ev + 0.27eV andEv +0.49 eV, respectively [47]. Using AIMPRO and atomic clusters which haveapproximately the experimental band-gap of 0.7 eV the two acceptor-levels ofVO have been calculated relative to the acceptor and double-acceptor-levelsof substitutional Zn at Ev +40 meV and Ev + 100meV, respectively [5]. Thecalculated second acceptor-level is in excellent agreement with the experimen-tal result, lying at Ev + 0.47 eV. However, the first acceptor-level calculatedat around Ev +0.42 eV is in less good agreement, but still within 0.2 eV. Thereason for the larger deviation in the single acceptor level is unclear, but evensuch an error-bar is useful in the context of the more traditional supercellapproaches that suggest this material has no band-gap at all.

3.4.3 Shallow and deep levels in diamond

In contrast to Si and Ge, diamond, with an indirect band-gap of around5.5eV [5], is a good example of an insulator, not obviously to be associatedwith the semiconductor industry. However, p- and n-type diamond can beproduced via doping with boron and phosphorus, respectively, with heavilyboron-doped materials becoming metallic and opaque.

However, although phosphorus acts as a donor, the activation energy isvery high at around 0.6 eV, so the room-temperature ionization fraction, andhence the number of free electrons, is rather small. This has lead to an effortto deduce which, if any, other system may result in a shallow donor-level.

As well as the electrical properties of diamond, the optical transparency,especially in the infra-red, of the pure material has led to applications in

3 Electrical levels with Gaussian bases 81

high-specification optics. However, as-grown diamond may contain deep-leveldefects which absorb and luminesce, and an understanding of these centersis also obviously of relevance.

Regrettably, the number of definitively characterized donor- and acceptor-levels in diamond is relative small, and it is unsurprising that computationalmethods have been used in an attempt to assess both shallow and deepcenters. We first address examples of potential shallow-donors. Since P is thebest-established shallow donor in diamond, it seems a good idea to establishif, as in silicon, the other pnictogens may have useful donor properties. Fig. 3.5plots the calculated donor-levels using AIMPRO as described in [10].

(a)

3.0

3.5

4.0

4.5

5.0

5.5

SbAsPN

Ene

rgy

(eV

)

(b)

3.0

3.5

4.0

4.5

5.0

5.5

SbAsPN

Ene

rgy

(eV

)

Fig. 3.5. Location of donor-levels in the band-gap relative to Ev for pnictogendonors in diamond. (a) and (b) shows values calculated using phosphorus and thebulk supercell being used as a marker, respectively. Empty and filled circles show 64and 216 atom cells, respectively. The dashed and dotted lines show the experimentaldonor levels for P and N, and the shade area shows the experimental conductionband

Fig. 3.5(a) shows the use of the experimental donor-level [48] for P atEc − 0.6 eV as a marker for supercells containing 64 or 216 atoms (cubes ofside length 2a0 and 3a0, respectively). The trends are the same for both curvesand both supercells reproduce the experimental donor-level of nitrogen.

Fig. 3.5(b) shows the same data, but referenced to bulk-diamond. Now,the marker is the valence-band top, so it is far (around 5 eV!) from the antic-ipated location of the pnictogen donor-levels. This exemplifies the need forappropriate markers for the method to have a quantitative role. However,irrespective of the marker, the important result in this application is thatshallower donors of a simple nature, i.e. As and Sb, are expected to lead toimproved n-type diamond.

Recently we also used supercell calculations to compare the use of themore traditional formation energy approach with the marker method for arange of centers, and in particular for those for which electrical levels areknown experimentally [9]. Table 3.5 lists some of these results.

The overall agreement between the marker-method and experiment isconspicuously better than using the formation energy approach, indicating

82 J P Goss, M J Shaw and P R Briddon

Table 3.5. Electrical levels (eV) for various centers in diamond from experiment,compared to those calculated using the formation-energy method and the marker-method using a bulk cell as a marker. For the formation-energy method, we quotethe levels above Ev and below the theoretical and experimental locations of Ec

(Ethc = Ev + 4.2 eV and Eexpt

c = Ev + 5.5 eV). Asterisks indicate the marker-method results where the band-gap energy of 5.5 eV has been used to reference tothe opposite band-edge.

Formation energy Bulk-marker

Defect Experiment Ev Ethc Eexpt

c Ev Ec

Acceptors

Bs Ev + 0.37 [49] 0.2 4.0 5.3 0.5* 5.0

Nis Ec − 2.49 [50, 51] 3.3 0.9 2.2 3.0* 2.5

V–Ns Ec − 2.583 [52] 1.8 2.4 3.7 2.2* 3.3

Donors

Ns Ec − 1.7 [53] 3.0 1.2 2.5 4.0 1.5*

Ps Ec − 0.6 [48] 4.2 0.0 1.3 5.2 0.3*

Ns–Ns Ec − 3.8 [54] 0.8 3.6 4.7 1.8 3.7*

that the marker-method may usefully be applied to wide-gap materials forboth deep and shallow electrical levels. For the small set of data presentedin Table3.5 one could adopt any acceptor and donor as an empirical markerand get very similar accuracy.

3.5 Summary

The characterization of the electrical properties of defects and dopants usingstandard methodologies such as LDA- and GGA-DFT is a demanding prob-lem. The most common approach of calculating charge-dependent formationenergies has well-publicized frailties associated with periodic boundary condi-tions for charged systems, but these may be somewhat reduced by employingschemes where the electrical levels of two or more systems are compared.Additionally, when comparing total energies for chemically and structurallysimilar systems in the PBC approach, the dispersion of the defect levels infirst Brillouin-zone and multipole interaction, which are present even for neu-tral systems, are likely to be comparable and cancel to a large degree.

In addition, for electrical levels in narrow-gap materials, such as germa-nium, the use of carefully constructed atomic clusters also significantly re-duces the underestimation of band-gaps in these methods, allowing a fullyquantitative analysis of the levels in these materials. We have also shown thatthe marker-method is effective in wide-gap materials where excellent agree-ment may be obtained over the full range of the band-gap, provided that onehas a suitable marker available.

3 Electrical levels with Gaussian bases 83

Since the marker-method is applicable where looking a similar systems,it may be of particular use when studying dilute alloys such as SiGe. Herea known marker in silicon (for a Si-rich alloy) could be used to determinethe effect of having a nearby Ge atom, such as that of VO-centers in suchalloys [55].

The use of bulk materials as markers is a qualified success in the sensethat single donor and acceptor-levels are in many cases in good agreementwith experiment. However, multiple charges introduce larger errors and, forexample with the chalcogen species in silicon, the second-donor-levels are inmuch less good agreement.

We conclude that where experimental data allow, the most accurate quan-titative method for predicting electrical levels in crystalline semiconductorsand insulators is by comparison of a chemically and structurally similar de-fect for which the electrical levels are known. Other marker species, includingthe bulk supercells, are likely to introduce larger errors, but in our experienceeven then the results are rather favorable.

Acknowledgments

The authors would like to thank J. Coutinho, R. Jones and colleagues foruseful discussions and access to data presented in this paper.

References

1. G. Burns, Solid State Physics (Academic Press, Orlando, International edition,1985).

2. P. Hohenberg and W. Kohn,Phys. Rev. 136, 864 (1964).

3. W. Kohn and L. J. Sham, Phys. Rev. 140, A1133 (1965).4. Plane-waves represent functions in a Fourier-transformed phase-space and are

of the form exp(ik.r), where k is the wave-vector which is related to an energyby E = h2k2/2m.

5. S. M. Sze, Physics of Semiconductor Devices (Wiley, New York, 2nd edition,1981).

6. G. E. Engel and W. E. Pickett, Phys. Rev. B 54, 8420 (1996).7. A. Resende et al., Phys. Rev. Lett. 82, 2111 (1999).8. J.-W. Jeong and A. Oshiyama, Phys. Rev. B 64, 235204 (2001).9. J. P. Goss et al., Diamond and Related Mater. 13, 684 (2004).

10. S. J. Sque et al., Phys. Rev. Lett. 92, 017402 (2004).11. R. Jones and P. R. Briddon, The ab initio cluster method and the dynamics of

defects in semiconductors, chapter 6 in Semiconductors and Semimetals 51A(Academic Press, Boston, 1998).

12. P. R. Briddon and R. Jones, Phys. Stat. Sol. (b) 217, 131 (2000).13. W. E. Pickett, Comput. Phys. Rep. 9, 115 (1989).

84 J P Goss, M J Shaw and P R Briddon

14. D. R. Hamann, M. Schluter, and C. Chiang, Phys. Rev. Lett. 43, 1494 (1979).15. G. B. Bachelet, D. R. Hamann, and M. Schluter, Phys. Rev. B 26, 4199 (1982).16. N. Troullier and J. L. Martins, Phys. Rev. B 43, 1993 (1991).17. C. Hartwigsen, S. Goedecker, and J. Hutter, Phys. Rev. B 58, 3641 (1998).18. J. P. Perdew and Y. Wang, Phys. Rev. B 45, 13244 (1992).19. S. Goedecker, M. Teter, and J. Hutter, Phys. Rev. B 54, 1703 (1996).20. J. A. White and D. M. Bird, Phys. Rev. B 50, R4954 (1994).21. J. P. Perdew, K. Burke, and M. Ernzerhof, Phys. Rev. Lett. 77, 3865 (1996).22. It should be noted that the linear combinations of six functions with n1 +

n2 + n3 = 2 in equation 3.1 produce the five d-type functions (xy, xz, yz,2z2 −x2−y2 and x2 −y2), and an s-type function (x2 +y2 +z2). Our standardpractice in using the uncontracted basis sets discussed here is to include thisadditional s-type function, and hence the n1 = n2 = n3 = 0, n1 + n2 + n3 = 1and n1 + n2 + n3 = 2 terms yield ten rather than nine functions.

23. E. R. Cohen and B. N. Taylor, J. Res. of the Nat. Bureau of Standards 92, 85(1987).

24. H. J. McSkimin and P. Andreatch Jr., J. Appl. Phys. 35, 2161 (1964).25. This is similar to the ‘31’ part of the quantum chemists standard 6-31G basis

set. The only difference is that in our two contractions are combinations of allfour underlying Gaussians; the 6-31G basis has 3 functions contracted togetherand one function left uncontracted, a more restrictive prescription.

26. S. B. Zhang and J. E. Northrup, Phys. Rev. Lett. 67, 2339 (1991).27. G. Makov and M. C. Payne, Phys. Rev. B 51, 4014 (1995).28. H. Nozaki and S. Itoh, Phys. Rev. E 62, 1390 (2000).29. U. U.Gerstmann et al., Physica B 340–342, 190 (2003).30. C. W. M. Castleton and S. Mirbt, Phys. Rev. B 70, 195202 (2004).31. J. Shim et al., Phys. Rev. B 71, 035206 (2005).32. J. Adey et al., Phys. Rev. B 71, 165211 (2005).33. P. Deak et al., Phys. Stat. Sol. (b) 325, 139 (2003).34. J. Coutinho et al., Phys. Rev. B 67, 035205 (2003).35. S. F. J. Cox et al., Phys. Rev. Lett. 86, 2601 (2001).36. D. M. Hofmann et al., Phys. Rev. Lett. 88, 45504 (2002).37. C. G. Van de Walle, Phys. Rev. Lett. 85, 1012 (2000).38. M. G. Wardle, J. P. Goss, and P. R. Briddon, Phys. Rev. B 71, 155205 (2005).39. H. G. Grimmeiss and E. Janzem, in Deep centers in semiconductors (Gordon

and Breach, Switzerland, 2nd edition, 1996), p.97.40. G. Pensl et al., Appl. Phys. Lett. 51, 451 (1987).41. G. Roos et al., J. Appl. Phys. 67, 1897 (1990).42. R. E. Peale, K. Muro, and A. J. Sievers, Mater. Sci. Forum 65–66, 151 (1990).43. G. D. Watkins and J. W. Corbett, Phys. Rev. 121, 1001 (1961).44. M. Pesola et al., Phys. Rev. B 60, 11449 (1999).45. L. W. Song and G. D. Watkins, Phys. Rev. B 42, 5759 (1990).46. R. Jones et al., 11th GADEST conference, Giens, France, 2005.47. V. P. Markevich et al., Appl. Phys. Lett. 81, 1821 (2002).48. E. Gheeraert et al., Phys. Stat. Sol. (a) 174, 39 (1999).49. P. A. Crowther, P. J. Dean, and W. F. Sherman, Phys. Rev. 154, 772 (1967).50. D. M. Hofmann et al., Phys. Rev. B 50, 17618 (1994).51. R. N. Pereira et al., J. Phys. Cond. Matter 13, 8957 (2001).52. J. W. Steeds et al., Diamond and Related Mater. 9, 397 (2000).53. R. Farrer, Solid State Commun. 7, 685 (1969).54. G. Davies, J. Phys. C 9, L537 (1976).55. V. P. Markevich et al., Phys. Rev. B 69, 125218 (2004).

4 Dynamical Matrices and Free energies

Stefan K. Estreicher and Mahdi Sanati

Physics Department, Texas Tech University, Lubbock TX [email protected]

4.1 Introduction

Vibrational spectroscopy provides essential experimental data about defectsin semiconductors, not only because of the microscopic nature of the infor-mation it provides but also because many of the quantities measured can becalculated from first principles. Fourier transform infrared absorption (FTIR)and Raman spectroscopies often produce sharp lines associated with local vi-brational modes (LVMs) of impurities lighter than the hosts atoms. Theselines are often above the highest normal mode of the crystal, the Γ phononbut, when the phonon density of states has a gap (as in the case of GaN [1]),between the acoustic and optic modes. isotope substitutions provide elementidentification as well as information about how many atoms of a given speciesare part of the defect. Uniaxial stress experiments give the symmetry of thedefect through the splitting of the IR or Raman lines. Annealing studies pro-vide various activation energies. Examples of such studies and a discussionof the techniques are found in Ref. [2]

By themselves, the measured LVMs, isotope shifts, symmetry, and acti-vation energies are often insufficient to identify of a unique structure for thedefect. However, they are precious to theorists who use first-principles tech-niques to calculate the structures (symmetry), LVMs and their isotope shifts,binding, migration and/or reorientation energies, and thus prove or disprovethe assignment of an IR or Raman spectrum to a specific defect. The in-terplay between vibrational spectroscopy and first-principles theory has ledto the unambiguous identification of many defects and has provided realistictests of the accuracy of theory.

For many years, the calculation of defect-related vibrational modes hasbeen limited to those LVMs that are IR or Raman active. The stretch mode ofa Si-H bond, in a Si vacancy for example, can be predicted quite accuratelyby calculating the total energy of a supercell containing this defect in itsequilibrium configuration as well as for several (at least 2) positions of Halong the Si-H bond. The 1-dimensional potential along this axis is fittedto a polynomial and the frequency and zero-point energy of this mode arecalculated. The same can of course also be done for wag modes. There aremany examples of such calculations. [3–5]

86 Stefan K. Estreicher and Mahdi Sanati

This method works well, but a lot of additional and very useful informa-tion can be obtained when the entire dynamical matrix of the supercell iscalculated. For a system of N atoms and therefore 3N normal modes, thedimension of this matrix is 3N × 3N and its calculation is computation-ally expensive. The eigenvalues of the dynamical matrix are all the normalmode frequencies of the supercell: acoustic and optic phonons as well asdefect-related modes. These can be LVMs, located above the Γ phonon, reso-nant and/or pseudolocal vibrational modes (pLVMs). Resonant modes occurwhen the strain (associated with a defect) stretches or compresses host-atombonds, resulting in new vibrational frequencies near the Γ. pLVMs are local-ized impurity-related impurity modes buried in the phonon continuum. Suchmodes are sometimes visible experimentally as phonon sidebands in photolu-minescence (PL) spectra. [6,7] Two copper-related centers in Si have recentlybeen theoretically identified [8,9] thanks to such pLVMs. Further, the knowl-edge of all the normal mode frequencies allows the construction of phonondensities of state, which are needed to calculate (Helmholtz) free energies.

The eigenvectors of the dynamical matrix make it possible to find (andidentify the symmetry of) all the localized modes associated with any atomor group of atoms in the supercell. The eigenvectors can also be used to pre-pare the supercell in thermal equilibrium and any temperature, thus allowingconstant temperature molecular dynamics (MD) simulations to be performedwithout the need thermalization or even a thermostat. This in turns makesit possible to calculate vibrational lifetimes and decay channels of specificmodes as a function of temperature. For all these reasons, the calculation ofdynamical matrices is well worth its cost.

Our results are obtained from self-consistent, first-principles theory basedon local density-functional theory in 64 host atoms periodic supercells. Thecalculations are performed with the SIESTA code. [10, 11] The exchange-correlation potential is that of Ceperley-Alder [12] as parameterized byPerdew and Zunger. [13] Norm-conserving pseudopotentials in the Kleinman-Bylander form [14] are used to remove the core regions from the calculations.The basis sets for the valence states are linear combinations of numericalatomic orbitals of the Sankey type, [15,16] generalized to be arbitrarily com-plete with the inclusion of multiple-zeta orbitals and polarization states. [10]We use double-zeta (two sets of s and p orbitals) for first and second-rowatoms (H through Ne) and add a set of polarization functions (e.g., one setof d orbitals) for third-row atoms and below. The charge density is projectedon a real-space grid with an equivalent cutoff of 150 Ry to calculate theexchange-correlation and Hartree potentials. A 2 × 2 × 2 Monkhorst-Pack kpoint sampling [17] is used to optimize the structures.

An overview of dynamical matrices is in section 4.2. Examples of LVMsand pLVMs are given in section 4.3. The calculation of vibrational lifetimesis in section 4.4. Vibrational free energies and specific heats are discussed insection 4.5, and the properties of defects at finite temperatures in section 4.6.

4 Dynamical Matrices and Free energies 87

4.2 Dynamical matrices

The calculation of dynamical matrices is a well-known topic discussed in manytextbooks, such as Ref. [18]. Further, the calculation of the force constantmatrix F is implemented in many (if not all) software packages that have MDcapabilities. The brief summary below is therefore only necessary in order todefine the notation to be used in this chapter and make a few comments.

We consider a solid represented by periodic supercells of N atoms of massmα (with α = 1, 2, . . . ,N). The nuclei oscillate around their equilibrium po-sitions. The dynamical matrix is related to the force constant matrix by

Dαβ,ij =Fαβ,ij√mαmβ

(4.1)

where i, j = x, y, z are Cartesian indices.The calculation of Fαβ,ij is computationally intensive for the type of sys-

tems required in the study of defects in semiconductors. Today’s typical su-percells contain of the order of 100 atoms, and there is little doubt that severalhundreds (or even thousands) atoms will become the norm in the near future.Once F is known, changing the masses of selected atoms (that is, playing withisotopes) is trivial. Techniques to calculate F are well-known, ranging fromthe direct ‘frozen phonon’ approach implemented in many MD software pack-ages to MD-based methods based on correlation functions, [19, 20] or linearresponse theory. [21–23] Note that dynamical matrices can also be computedusing order-N methods. [24]

The eigenvalues of the dynamical matrix are the normal mode frequenciesof the system ωs, and the corresponding eigenvectors es

α,i are orthonormal:

∑α,i

esα,i es′

α,i = δss′ and∑

s

esα,i es

βj = δαβδij . (4.2)

In the harmonic approximation, the normal-mode coordinates qs are

qs(t) = As cos(ωst + ϕs) =∑α,i

√mα uα,i es

α,i , (4.3)

where the amplitudes As are temperature-dependent and the uα,i are Carte-sian displacements away from equilibrium.

This relationship can be used to prepare a system in equilibrium at atemperature T, then perform MD runs at constant temperature without theneed for thermalization or even a thermostat. [25] Indeed, the unknown am-plitudes of the normal modes can be obtained by requiring that the kineticenergy of each mode is, on the average, kBT/2, that is

<12

(∂qs

∂t

)2

>=<12ω2

s A2s sin2(ωst + ϕs)>=

14ω2

s A2s =

12kBT . (4.4)

88 Stefan K. Estreicher and Mahdi Sanati

Thus, the average amplitude of the normal mode s is <As>=√

2kBT/ωs.Note that assigning the amplitudes As =<As > to each mode s implies

that the total energy of each mode is exactly kBT. It is better to pick arandom distribution

ζs =∫ Es

0

1kBT

e−E/kBTdE (4.5)

with 0 < ζs < 1. This leads to As =√−2kBT ln(1 − ζs)/ωs. Thus, in the har-

monic approximation, the Cartesian coordinates and corresponding velocitiesneeded to prepare the system in equilibrium at the temperature T are

uαi =√

2kBTmα

∑s

1ωs

√− ln(1 − ζs) cos(ωst + ϕs) es

αi (4.6)

and∂uαi

∂t= −

√2kBTmα

∑s

√− ln(1 − ζs) sin(ωst + ϕs) es

αi . (4.7)

The initial phases 0 ≤ ϕs < 2π are random, as each mode has a randomamount of kinetic and potential energy at the time t = 0. A similar transfor-mation has been used by Gavartin and Stoneham [26] to calculate the energydissipation in quantum dots. However, we discuss it here in the context ofestablishing initial conditions with the appropriate random distribution of en-ergies and phases. This is used in Section 4.4 to calculate vibrational lifetimesand decay channels from first principles.

4.3 Local and pseudolocal modes

The simplest and most immediate consequence of the knowledge of the dy-namical matrix is the identification of all the localized modes of a defect, andtheir symmetry. Indeed, a plot of

∑i |es

α,i|2 vs. s (that is, vs. the normal modefrequencies ω) for a specific atom α (or set of atoms) provides quantitativeinformation about the localization of the modes associated with the givenatom(s). For example, Fig. 4.1 shows the localized modes associated withbond-centered (bc) hydrogen in crystalline Si (H+

bc): the asymmetric stretch(IR active) of H is a LVM at 2004cm−1 (observed [27] at 1998cm−1), the twodegenerate wag modes (not observed) are pLVMs, far below the Γ phonon,at 264 cm−1, and the symmetric stretch is a pLVM at 452 cm−1 (also notobserved). The latter does not involve any H motion at all, but shows upwhen

∑i |es

α,i|2 includes the two Si nearest neighbors to H.Figure 4.2 shows the eigenvectors of the dynamical matrix in one of the

two degenerate wag modes at 264 cm−1. Although the H displacement inthis mode is large, it accounts for only about half the displacements of allthe atoms in the cell. Note that the eigenvectors only give the relative am-plitudes of the atomic displacements. Their absolute values depend on thetemperature, and are exaggerated in the figure.

4 Dynamical Matrices and Free energies 89

Fig. 4.1. LVMs and pLVMs associated with H+bc in the Si64 supercell. The plots

of d2(α) =∑

i|es

α,i|2 where α is H (left) or its two Si nearest neighbors (right)show the vibrational modes localized on the impurity and its two Si neighbors. Thevertical dashed lines show the calculated Γ phonon.

Thus, plotting∑

i |esα,i|2 quantifies the localization of all the local modes

associated with an atom or group of atoms. The corresponding eigenvectorsallow the identification of the symmetry of the mode, which is needed toestablish whether a mode is IR active or not, and whether it can producephonon sidebands in PL spectra or not. [6] This is precisely how two low-frequency copper-related defects have recently been identified. [28,29] A veryuseful (and free) software package [30] readily reads dynamical matrices andgraphically displays the eigenvectors for any chosen normal mode.

4.4 Vibrational lifetimes and decay channels

The vibrational lifetimes of the LVMs of light impurities in Si have recentlybeen measured by transient bleaching spectroscopy. [31, 32] Surprisingly, thelifetimes of nearly identical H-related LVMs differ by some two orders ofmagnitude. For example, at low temperatures, the measured lifetimes ofthe 2072 cm−1 mode of the di-vacancy–di-hydrogen complex (VH · HV), the1998 cm−1 mode of H+

bc, [31], and the 2062 cm−1 mode of the H∗2 pair [32]

are 295ps, 8ps, and 4ps, respectively. Since the decay of all these modesmust involve at least four phonons (experimentally, six phonon processes areproposed), it is not clear why the lifetimes of LVMs with almost identicalfrequencies vary by two orders of magnitude or why any of them would beshort-lived at all.

90 Stefan K. Estreicher and Mahdi Sanati

Fig. 4.2. Eigenvectors of the dynamical matrix in one of the 264 cm−1 wag modesof H+

bc. The H atom is the small black ball, the grey balls are all Si. The size of thearrows is exaggerated to make the smaller oscillation amplitude more visible. Thearrows only show the relative amplitudes of the motion of the atoms in this mode

The calculations of vibrational lifetimes begin with the dynamical matri-ces of the supercells containing the defects. The eigenvectors of this matrixare used to determine the equilibrium background temperature of the cellat the time t = 0, as discussed in Section 4.2. This is used to prepare thecell in equilibrium at the same sample temperatures as in the experimentalwork. The initial excitation of the LVM of interest is assigned to be its zero-point energy plus one phonon, that is 3hω/2. This mimics the laser excitationof the specific LVM. Then, constant-temperature (classical) MD simulationsare performed, with a time step of 0.3fs in the case of H. Since ab-initio MDsimulations are limited to real times of a few times 10ps, it is critical to beable to begin the MD runs at a relatively elevated background temperature,because the measured lifetimes are much shorter at higher temperatures. Thecalculations are then repeated at lower temperatures to monitor the increasein the lifetime until such low temperatures that the calculations become com-putationally prohibitive.

4 Dynamical Matrices and Free energies 91

Fig. 4.3. Plot of the energy (eV) of the asymmetric stretch mode of H+bc at

2004 cm−1 in the Si64 supercell, at the background temperature of 75K vs. time(ps). The energies of 194 other modes of the cell are plotted as well. The first stepof the decay involves the pLVMs of the defect, which themselves have very shortlifetimes

At every time step, the 3N Cartesian coordinates of the atoms are writtenas linear combinations of the 3N normal modes of the supercell. Thus, theenergies of all the modes can be plotted as a function of time. This approachallows not only the decay of the LVM of interest to be calculated, but alsothe identification of the receiving modes, all of it as a function of time forvarious background temperatures of the cell. Much of this work is still underway, but it appears that the pLVMs of the defect play a critical role in thedecay process. These pLVMs are different in each case.

We illustrate this approach with the case of the asymmetric stretch ofH+

bc. [31] The result of the run performed at a background temperature of75K is shown in Fig. 4.3. The calculated decay of the LVM produces a goodexponential fit leading to a calculated lifetime of 7.8ps at 75K, a value whichagrees very well with experiment. [31] The determination of the decay chan-nels is still ongoing. Since the LVMs frequency is around 2000 cm−1, a min-imum of 4 phonons is required. Examination of the time-dependence of theenergies of the normal modes of the cell shows that both wag modes and thesymmetric stretch of the defect (at 262cm−1 and 452cm−1, respectively) playan important role in the first step of the decay.

92 Stefan K. Estreicher and Mahdi Sanati

4.5 Vibrational free energies and specific heats

Although the calculation of potential energy surfaces is playing a most use-ful role in our understanding of the behavior of defects in semiconductors,the real world functions at non-zero temperatures. Samples undergo variousthermal annals, they are implanted and exposed to light, and most devicesfunction at or above room temperature. They physics and chemistry of de-fects is obviously temperature-dependent, as one observes processes such asdiffusion and association or dissociation at various temperatures.

In all these processes, the Gibbs free energy is the most relevant quan-tity since virtually all experiments are done at constant pressure. However,we focus here on the Helmholtz free energy. Working at constant volumerather than constant pressure is appropriate up to several hundred degreesCelsius in most semiconductors because their thermal expansion coefficientis very small. In Si for example, this coefficient is 4.68 × 10−6 K−1 at roomtemperature (RT) and the phonon frequencies shift slowly with T. Indeed,the difference between the constant-pressure and constant-volume specificheats [35] CP − CV is 0.0165 J/molK at room temperature, a correction ofonly 0.08% to CP=20 J/molK. Thus, in the case of semiconductors such asSi and in the temperature range where the use of harmonic dynamical ma-trices is justified, calculating the phonon density of states g(ω) at T=0Kand ignoring temperature dependence of the lattice constant are reasonableapproximations. Note that this ignores the frequency shifs associated withthe anharmonicity. This greatly simplifies the calculations. In fact, it rendersthem possible since calculating temperature-dependent phonon densities ofstate is a formidable task.

The phonon densities of states of perfect (defect-free) crystals are nor-mally calculated from the dynamical matrix of the primitive unit cell evalu-ated at thousands of q points in the Brillouin zone (BZ) of the lattice. Whenstudying defects, large periodic supercells must be used, and the BZ of thesupercell is distinct from that of the perfect solid. The eigenvalues of the dy-namical matrix only provide a small number of normal mode frequencies, andthe phonon density of states extrapolated from those few hundred frequen-cies lead to rather poor g(ω)’s. However, evaluating the dynamical matrix atmany q points in the BZ of the supercell works very well. Figure 4.4 showsthree phonon densities of state obtained from harmonic dynamical matricesextrapolated at about 90 q points in the BZ of the supercell. The dynami-cal matrices were calculated with different basis sets and Monkhorst-Pack kpoint sampling in the Si64 supercell. The best fit to the measured data [36]is obtained with the largest basis set (double-zeta polarized) and k pointsampling (2 × 2 × 2).

In the harmonic approximation, the Helmholtz free energy is given by

Fvib(T) = kBT∫ ∞

0

ln{sinh(hω/2kBT)}g(ω)dω (4.8)

4 Dynamical Matrices and Free energies 93

Fig. 4.4. Phonon densities of state g(ω) calculated from harmonic dynamical ma-trices of the Si64 supercell evaluated at 90 q points in the BZ of the cell. The dashedline (first peak at low frequency) was obtained with a double-zeta basis set and a1 × 1 × 1 Monkhorst-Pack k-point sampling, the solid line (second peak) with adouble-zeta polarized basis set and a 2× 2× 2 k-point sampling, and the long-dashline with a double-zeta polarized basis set and a 1 × 1 × 1 k-point sampling. Asexpected, the solid line best matches the experimental data [36]

where kB is the Boltzmann constant. In the perfect cell, the integration iscarried out up to the Γ phonon. With a defect in the supercell, the integralextends up to the highest normal mode of the cell (perturbed Γ phonon) andbecomes a simple sum for the higher LVMs. Note that Fvib(T = 0) gives thetotal zero-point energy. Once Fvib is calculated, the vibrational entropy andspecific heat at constant volume are given by

Svib = −(∂Fvib

∂T

)V

CV = −T(∂2Fvib

∂T2

)V

. (4.9)

The latter can be compared to the measured Cp in order to determine upto which temperature the constant-volume and harmonic approximations are

94 Stefan K. Estreicher and Mahdi Sanati

Fig. 4.5. Calculated and measured peak in the specific heat over T3 for variousisotopically pure samples of Ge. The dashed line (expt: crosses) are for 73Ge, thesolid line (expt: circles) are for the natural isotopic abundance, and the dotted line(expt: squares) are for 70Ge

appropriate. We have demonstrated that the approximations work quite wellup to some 700K in the case of c-C, Si, Ge, and GaN (see Refs. [1,37,38]), andthat the agreement with even fine features is very good at low temperature.Figure 4.5 shows the calculated isotopic-dependent peak in C/T3 in Ge. Thetemperature at which it is predicted to occur and the splitting associatedwith various isotopes quantitatively reproduce the experimental data. [39]

These tests give us confidence that the phonon densities of states cal-culated in the manner described above are also accurate when defects arepresent in the same supercell, and therefore that the calculated Helmholtzvibrational free energies are accurate up to several hundred degrees Celsius.Of course, the Fvib obtained for a defect in a 64 host-atoms cell correspond toa defect concentration of about 1.5 atomic percent, which is very high. How-ever, the few calculations we have performed with 128 and 216 atoms cellsshow that cell size effects are not very significant, probably because g(ω) isonly the weight function used in the integration, and small changes in thisfunction have a minor impact of the result.

4 Dynamical Matrices and Free energies 95

4.6 Theory of defects at finite temperatures

We are always interested in energy differences. This could be the energydifference between two configurations of the same defect, the energy differencebetween a bound complex and its dissociation products (binding energy), etc.At finite temperatures, such a total free energy difference may contain severalcontributions:

ΔE = ΔU + ΔFvib + ΔFe/h + ΔFrot + · · · − TΔSconfig . (4.10)

ΔU is the potential energy difference obtained (in this paper) from first-principles density-functional theory. The vibrational free energy ΔFvib hasbeen discussed above.

Depending on the defect under study, there may be different concentra-tions of electrons (in the conduction band) or holes (in the valence band)because different configurations of a given defect often have different electri-cal activities. This has nothing to do with the electrons or holes provided bythe background dopants, which can be large. Instead, ΔFe/h refers only tothe change in the number of free carriers associated with the two configura-tions of the defect under study, because the contributions of the background(dopant-associated) charge carriers cancel out. Unless one is dealing with un-usually high changes in carrier concentrations, this term is very small andcan be neglected. [33]

Additional contributions to the free energy are associated with energylevels arising from rotational, magnetic, spin or other degrees of freedomspecific to a particular defect. These terms can often be calculated directlyfrom the appropriate partition functions. [33] One example is provided bythe rotational free energies of interstitial H2, HD, and D2 molecules [40] inSi. The rotational energies are Ej = j(j + 1)h2/MR2, where M is the nuclearmass and R the internuclear separation. Since H2 consists of two protons(fermions), there are three ortho states (even combinations of spin: ↑↑, ↓↓,and ↑↓ + ↓↑) for which only odd values of j are allowed, and one para state(odd combination of spin: ↑↓ − ↓↑) for which only even values of j are allowed.This leads to the familiar set of two Raman lines with 3:1 intensity ratio. SinceD2 consists of two bosons, the total wavefunction must be symmetric, and asimilar argument leads to two Raman lines with 1:2 intensity ratios. Thus,the rotational free energy per molecule is given by

Frot(T) = −gokBT lnZo − gpkBT ln Zp (4.11)

where the partition functions are

Zo = Σj=1,3,5,...(2j + 1)e−j(j+1)θ/T (4.12)

Zp = Σj=0,2,4,...(2j + 1)e−j(j+1)θ/T (4.13)

96 Stefan K. Estreicher and Mahdi Sanati

and {go, gp} = {3/4, 1/4} for H2 and {1/3, 2/3} for D2. Since HD has nosymmetry restrictions, all the value of j are allowed (single Raman line) andFrot(T) = −kBT lnZ with Z = Σj(2j + 1)e−j(j+1)θ/T. The effective tempera-ture θ = h2/2IkB contains the (classical) moment of inertia I of the molecule,which depends on the host. For free H2, θ = 85.4K. In Si, the molecule hasa longer bond length than in free space [41] and θ = 73.0, 48.7 and 36.5 Kfor H2, HD and D2, respectively. The values of the rotational free energiesat 77, 300, and 800K for interstitial H2, HD, and D2 in Si are in Table 4.1.The largest rotational free energy per molecule occurs in the case of HD.When comparing the free energies of the interstitial H2 molecule with theinterstitial H∗

2 complex (which consists of a Si-Si replaced by two Si-H bonds:Si − Hbc · · ·Si − Hab) which has no rotational degrees of freedom, the sumΔFvib +ΔFrot clearly favors H2 at higher temperatures. [33] This predictionis consistent with the fact that samples hydrogenated at high temperaturesthen rapidly quenched [42] show the presence of only H2 molecules.

Table 4.1. Rotational free energy Frot (eV) for interstitial H2, HD, and D2 in Si

T (K) 77 300 800

H2 −0.005 −0.030 −0.129

HD −0.004 −0.048 −0.194

D2 −0.004 −0.040 −0.168

The importance of the configurational entropy term depends on the situ-ation. When comparing the two metastable configurations of the CH∗

2 com-plex, [33] ΔSconfig is exactly zero. When comparing interstitial H2 and H∗

2,the configurational entropy contribution is not zero but very small. Indeed,interstitial H2 occupies tetrahedral interstitial (t) sites while H∗

2 is at a bc site.Since there are twice as many bc as t sites, a (very) small ΔSconfig results.

The situation is very different when calculating binding free energies.Indeed, for a complex {A,B} with dissociation products A and B, there arevastly different numbers of configurations for the dissociated species than forthe complexes, sometimes leading to surprisingly large values for ΔSconfig.The calculation must be done using realistic concentrations of the speciesinvolved, and its details depend on the specific situation.

The difference in configurational entropy per complex is ΔSconfig =(kB/[{A,B}]) ln(Ωpair/Ωnopair), where [{A,B}] is the number of complexes,and Ωpair and Ωnopair are the number of configurations with all possible com-plexes forming and with all complexes dissociated, respectively. We set {A,B}to be ‘dissociated’ when no B species is within a sphere of radius rc of any A.The results are not very sensitive to the actual value of rc. The sites of A, B,and {A,B} are known and the concentrations [A] and [B] are estimated fromexperiment. If [A] is larger than [B], the maximum number of complexes is

4 Dynamical Matrices and Free energies 97

[{A,B}] = [B]. Although a real sample has traps for the dissociation productsA and/or B that are distinct from isolated A and/or B in a perfect crystal,we ignore this additional complication.

We consider here two boron-oxygen complexes in Si. Both of them containan interstitial oxygen dimer ({Oi}2) trapped at either substitutional (Bs) orinterstitial (Bi) boron. Thus, we consider the binding free energies of the{Bs,Oi,Oi} and {Bi,Oi,Oi} complexes (both in the +1 charge state). We donot discuss here the reasons why these complexes are important, how theyform, and the consequences of complex formation for the sample at hand.These issues are discussed elsewhere. [43] The configurations of the complexesand their dissociation products have been obtained from conjugate gradientgeometry optimizations in the appropriate charge states. The binding ener-gies at T=0K are ΔU = 0.54 and 0.61 eV for {Bs,Oi,Oi} and {Bi,Oi,Oi},respectively. All the vibrational free energies have been calculated. We arefaced now with the calculation of ΔSconfig.

Since we know the equilibrium sites of all the species involved, we knowthe number of equivalent orientations. However, we need to assume the con-centrations of the various species in order to calculate the total number ofconfigurations. We assume a sample with N = 5 × 1022 substitutional sites,[{Oi}2] = 1014 oxygen dimers, [Bs] = 1019 substitutional and [Bi] = 1014

interstitial boron impurities. In a 1 cm3 sample, the number of sites for Bs,split-interstitial sites for Bi and staggered or square configurations [44] for{Oi}2 is 5× 1022. The numbers depend on the sample, but these are realisticvalues which could correspond to an actual experimental situation.

At low temperatures, all the Bs’s traps one {Oi}2. The number of ways onecan arrange [{Oi}2] dimers among N sites is N!/[{Oi}2]!(N− [{Oi}2])!. Each{Oi}2 traps at one Bs and each {Bs,Oi,Oi} has 12 equivalent orientations.leading to 12[{Oi}2 ] possibilities. The remaining [Bs] − [{Oi}2] borons aredistributed among the remaining N − 12[{Oi}2] sites. Thus, the number ofconfigurations for {Bs,Oi,Oi} complexes is

Ωpairs =12[{Oi}2]N!(N − 12[{Oi}2])!

[{Oi}2]!(N − [{Oi}2])!([Bs] − [{Oi}2])!(N − [Bi]− 11[{Oi}2])!(4.14)

At high temperatures, all the {Bs,Oi,Oi} complexes are dissociated. Wecan arrange [Bs] borons among N substitutional sites in N!/[Bs]!(N − [Bs])!ways. If rc = 10A, no oxygen dimer is within a sphere of radius 10A ofany boron, implying that about 150 substitutional sites around each Bs arenot allowed, which means that the number of sites available for the [{Oi}2]oxygen dimers is N − 150[Bs]. The number of configurations for the {Oi}2’sis therefore (N−150[Bs])!/[{Oi}2]!(N−150[Bs]− [{Oi}2])!. Thus, the numberof configurations for the dissociated complexes is

Ωnopairs =N!(N − 150[Bs])!

[Bs]!(N− [Bs])![{Oi}2]!(N− 150[Bs] − [{Oi}2])!(4.15)

98 Stefan K. Estreicher and Mahdi Sanati

Using Sterling’s formula and an expansion for ln(1 + ε) with ε 1, we get

ΔSconfig = kB

(ln

12[Bs]N

+278[Bs]

N− [{Oi}2]

[Bs]

). (4.16)

Similar calculations for {Bi,Oi,Oi} lead to

ΔSconfig = kB

(ln

24[{Oi}2]N

− 1 + 32[Bi]N

+[{Oi}2]

N

). (4.17)

With the concentrations assumed, this gives ΔSconfig = −0.515 and−1.538 meV/K for {Bs,Oi,Oi} and {Bi,Oi,Oi}, respectively.

The difference between these situations is huge, and the reason for it isquite obvious. Consider an {A,B} complex which dissociates into A and B. IfA and/or B are abundant (as is the case for Bs), there are many configurationsresulting in pairs and relatively few configurations with A away from B.On the other hand, when both A and B are scarce (as is the case for Bi

and {Oi}2), there are far fewer ways to make pairs and a great number ofdissociated configurations. The binding free energies of the {Bs,Oi,Oi} and{Bi,Oi,Oi} complexes are plotted as a function of temperature in Fig. 4.6.As shown in Ref. [43], the contribution of ΔFvib is very small and the slopeis almost entirely determined by the difference in configurational entropy.

Thus, for a given {A,B} complex, the smaller the concentration of A or B,the larger the configurational entropy associated with the dissociated speciesand the smaller the entropy associated with complex formation. Then, theslope of Eb(T) is much steeper. The opposite holds if A and/or B exist inhigh concentrations. In the example discussed in this paper, changing onecomponent of the complex from Bs to Bi changes the relevant concentrationfrom 1019 to 1014, and this change of five orders of magnitudes roughly triplesΔSconfig.

Note that above the temperature T0 where Eb(T0) = 0, the interactionsbecome repulsive. The value of T0 depends on Eb(0) and on the slope, that ison ΔSconfig, which in turns depends on the concentrations of the dissociationproducts in the sample.

4.7 Discussion

First-principles calculations of properties of defects in periodic supercells havebecome quantitative in many respects. The configurations, energetics, se-lected LVMs, spin densities and, to a lesser extent, electrical activities oflocalized defects can be predicted with very good accuracy. However, theknowledge of the entire dynamical matrix of the system provides much moreinformation.

The eigenvalues of this matrix give all the local, pseudo-local, and resonantdefect-related modes, as well as the crystal phonon frequencies. They can

4 Dynamical Matrices and Free energies 99

Fig. 4.6. Binding free energies of the {Bs,Oi,Oi} and {Bi,Oi,Oi} complexes in Si.The latter complex will not form above room temperature, where the interactionsbetween Bs and {Oi}2 are repulsive

be used to obtained high-quality phonons densities of states which in turnallow the calculation of vibrational free energies. Although the latter arelimited to the constant volume (and, in our case, harmonic) approximation,the calculated specific heats show that the results are reliable up to severalhundred degrees Celsius.

The eigenvectors of the dynamical matrix allow the localization of lo-cal modes to be quantified and their symmetry predicted. The eigenvectorscan also be used to prepare MD simulation in thermal equilibrium at anytemperatures without the need for lengthy thermalizations or even a thermo-stat. This feature is needed to calculate vibrational lifetimes as a function oftemperature.

The calculation of defect energetics at finite temperatures is relativelystraightforward once the vibrational free energy is known. Indeed, rotationaland other contributions can be obtained (or approximated) analytically. Themost tricky, and sometimes most critical, part is the contribution of theconfigurational entropy. One example has been discussed here in detail. The

100 Stefan K. Estreicher and Mahdi Sanati

result depends on the concentrations of the species involved, that is, on thesample.

The binding free energy of an {A,B} complex in a crystal varies lin-early with temperature, with a slope largely dominated by the difference inconfigurational entropy between {A,B} and A away from B. If R is a dis-sociation rate, R exp{−Eb/kBT} = R exp{ΔSconfig/kB} exp{−Eb(0)/kBT},and an Arrhenius plot yields a straight line with slope −Eb(0)/kB and in-tercept (ln R + ΔSconfig/kB). Thus, Arrhenius plots of the dissociation of an{A,B} complex in samples containing different concentrations of A and/orB should produce parallel lines since the slopes are the same but the inter-cepts differ. This suggests a way to measure configurational entropies. If wetake R = 1011 and ΔSconfig = −0.5 or −1.0meV/K, the intercepts will be at25.3− 5.8 = 19.5 or 25.3− 11.6 = 13.7, a measurable change.

Even though the potential energy surface describing the interactions be-tween A and B has a pronounced minimum when the {A,B} complex forms,these interactions become repulsive at temperatures T > T0, where the criti-cal temperature T0 is defined from Eb(T0) = 0. If complex formation beginsat a temperature near (but below) T0, the variations of the slope of Eb(T)with time (that is: as complex formation takes place and the concentrations ofthe various species change) will cause the interactions to shift from attractiveto repulsive, probably resulting a maximum precipitate size.

Finally, since the difference in configurational entropy depends on theconcentrations [A] and [B] in the sample, the binding free energy of a spe-cific complex {A,B} at a specific temperature will be different in samplescontaining different concentrations of A or B.

Many of these consequences of the temperature dependence of bindingfree energies are unexpected, not to say counterintuitive, because many ofus are used to thinking in terms of potential energy alone. Yet, regardlessof the value of the configurational entropy or the way it is calculated, theconsequences are inescapable.

Acknowledgments

This work is supported in part by the R.A. Welch Foundation and the Na-tional Renewable Energy Laboratory.

References

1. R.K. Kremer, M. Cardona, E. Schmitt, J. Blumm, S.K. Estreicher, M. Sanati,M. Bockowski, I. Grzegory, T. Suski, and A. Jezowski, Phys. Rev. B 72, 075209(2005).

2. S.J. Pearton, J.W. Corbett, and M. Stavola: Hydrogen in Crystalline Semicon-ductors (Springer, Berlin, 1991)

4 Dynamical Matrices and Free energies 101

3. S. Limpijumnong and C.G. Van de Walle, Phys. Rev. B 68, 235203 (2003)4. A.F. Wright, C.H. Seager, S.M. Myers, D.D. Koleske, and A.A. Allerman, J.

Appl. Phys. 94, 2311 (2003)5. A. Carvalho, R. Jones, J. Coutinho, and P.R. Briddon, J. Phys.: Condens.

Matter 17, L155 (2005)6. G. Davies, Phys. Rep. 176, 83 (1989)7. S. Knack, Mat. Sci. Semic. Proc. 7, 125 (2005)8. S.K. Estreicher, D. West, J. Goss, S. Knack, and J. Weber, Phys. Rev. Lett.

90, 035504 (2003)9. S.K. Estreicher, D. West, and M. Sanati, Phys. Rev. B 72, R13532 (2005)

10. D. Sanchez-Portal, P. Ordejon, E. Artacho, and J.M. Soler, Int. J. Quant.Chem. 65, 453 (1997)

11. E. Artacho, D. Sanchez-Portal, P. Ordejon, A. Garcıa, and J.M. Soler, Phys.Stat. Sol. (b) 215, 809 (1999)

12. D.M. Ceperley and B.J. Adler, Phys. Rev. Lett. 45, 566 (1980)13. S. Perdew and A. Zunger, Phys. Rev. B 32, 5048 (1981)14. L. Kleiman and D.M. Bylander, Phys. Rev. Lett. 48, 1425 (1982)15. O.F. Sankey and D.J. Niklevski, Phys. Rev. B 40, 3979 (1989); O.F. Sankey,

D.J. Niklevski, D.A. Drabold, and J.D. Dow, Phys. Rev. B 41, 12750 (1990)16. A.A. Demkov, J. Ortega, O.F. Sankey, and M.P. Grumbach, Phys. Rev. B 52,

1618 (1995)17. H.J. Monkhorst and J.D. Pack, Phys. Rev. B 13, 5188 (1976)18. T. Inui, Y. Tanabe, and Y. Onodera: Group Theory and ts Applications in

Physics (Springer, Berlin, 1990)19. M.P. Allen and D.J. Tildesley: Computer Simulations of Liquids (Clarendon,

Oxford, 1987)20. See e.g., J.L. Gavartin and D.J. Bacon, Comp. Mater. Sci. 10, 75 (1998)21. S. Baroni, P. Gioannozzi, and A. Testa, Phys. Rev. Lett. 58, 1861 (1987)22. A. Fleszar and X. Gonze, Phys. Rev. Lett. 64, 2961 (1990); X. Gonze and C.

Lee, Phys. Rev. B 55, 10355 (1997)23. J.M. Pruneda, S.K. Estreicher, J. Junquera, J. Ferrer, and P. Ordejon, Phys.

Rev. B 65, 075210 (2002)24. P. Ordejon, D.A. Drabold, R.M. Martin, and S. Itoh, Phys. Rev. Lett. 75, 1324

(1995)25. D. West and S.K. Estreicher, J. Phys.: Condens. Matter (submitted)26. J.L. Gavartin and A.M. Stoneham, Phil. Trans. R. Soc. Lond. A 361, 275

(2003)27. M. Budde, G. Luepke, C. Parks Cheney, N.H. Tolk, and L. Feldman, Phys.

Rev. Lett. 85, 1452 (2000)28. S.K. Estreicher, D. West, J. Goss, S. Knack, and J. Weber, Phys. Rev. Lett.

90, 035504 (2003)29. S.K. Estreicher, D. West, and M. Sanati, Phys. Rev. B 72, R13532 (2005)30. For information about the Molekel visualization package, see

www.cscs.ch/molekel/31. M. Budde, G. Luepke, C. Parks Cheney, N.H. Tolk, and L.C. Feldman, Phys.

Rev. Lett. 85, 1452 (2000)32. G. Lupke, X. Zhang, B. Sun, A.Fraser, N.H. Tolk, and L.C. Feldman, Phys.

Rev. Lett. 88, 135501 (2002)33. S.K. Estreicher, M. Sanati, D. West, and F. Ruymgaart, Phys. Rev. B 70,

125209 (2004)

102 Stefan K. Estreicher and Mahdi Sanati

34. M. Sanati and S.K. Estreicher, Phys. Rev. B (submitted)35. P. Flubacher, A.J. Leadbetter, and J.A. Morrison, Phil. Mag. 4, 273 (1959). In

this paper, ‘cal/g atom’ should read ‘cal/mol’ (or the numbers be divided bythe atomic mass of Si)

36. F. Widulle, T. Ruf, M. Konuma, I. Silier, M. Cardona, W. Kriegseis, and V.I.Ozhogin, Sol. St. Comm. 118, 1 (2002)

37. M. Cardona, R.K. Kremer, M. Sanati, S.K. Estreicher, and T.R. Anthony, Sol.St. Commun. 133, 465 (2005)

38. M. Sanati, S.K. Estreicher, and M. Cardona, Sol. St. Commun. 131, 229 (2004)39. W. Schnelle and E. Gmelin, J. Phys. Condens. Matter 13, 6087 (2001)40. For a review of interstitial hydrogen molecules in semiconductors, see S.K.

Estreicher, Acta Phys. Polon. A 102, 403 (2002)41. S.K. Estreicher, K. Wells, P.A. Fedders, and P. Ordejon, J. Phys.: Condens.

Matter 13, 62 (2001)42. E.E. Chen, M. Stavola, and W.B. Fowler, Phys. Rev. B 65, 245208 (2002)43. M. Sanati and S.K. Estreicher, Phys. Rev. B 72, 165206 (2005)44. J. Adey, R. Jones, D.W. Palmer, P.R. Briddon, and S. Oberg, Phys. Rev. Lett.

93, 055504 (2004)

5 The calculation of free energies in

semiconductors: defects, transitions and phase

diagrams

E. R. Hernandez1 , A. Antonelli2, L. Colombo3, and and P. Ordejon1

1 Institut de Ciencia de Materials de Barcelona (ICMAB–CSIC), Campus deBellaterra, 08193 Barcelona, Spain [email protected]

2 Instituto de Fısica Gleb Wataghin, Universidade Estadual de Campinas,Unicamp, 13083-970, Campinas, Sao Paulo, Brazil

3 SLACS (INFM-CNR) and Department of Physics, University of Cagliari,Cittadella Universitaria, I-09042 Monserrato (Ca), Italy

5.1 Introduction

The free energy plays a central role in understanding the thermal propertiesof materials. From it, other properties of a material may be derived, suchas the internal energy, the volume, entropy, etc. The phase behaviour of amaterial is controlled by the values of the free energy of its different phases,and the concentrations of defects or impurities, as well as their partitionamong coexisting phases, are defined by their chemical potentials, which arethemselves obtained from the free energy. It is clear,

therefore, that it is highly desirable to have efficient computational toolsthat can evaluate the free energy of modelled materials in different conditionsof temperature and pressure (or volume). These methods should be accurate,and ideally it should be possible to combine them with first principles elec-tronic structure methods, which provide an accurate picture of the structure,bonding and energetics of materials.

In this chapter we provide a self-contained description of recent theoreti-cal developments which have contributed to making free energy calculationsmore accessible and efficient, and we illustrate their capabilities with a se-ries of applications, mostly in the field of semiconductors, the topic of thisbook. Our intention in writing this chapter has been to illustrate the po-tential of these techniques, some of which are still relatively little known.On the other hand, we have not intended to provide an exhaustive reviewof free energy techniques, nor to dwell overmuch on the technical details ofimplementation. Both topics are covered at length in the excellent book byFrenkel and Smit [2], and also to some extent in the earlier book by Allen andTildesley [1]. Neither has it been our intention to list the many examples ofapplications of free energy techniques to problems in materials science, chem-ical physics, geology or biomolecular systems. Applications in some of thesefields are reviewed to some extent in the articles by Rickman and LeSar [3]and by Ackland [4]. A very recent review of some of the techniques that will

104 E. R. Hernandez, A. Antonelli, L. Colombo, and and P. Ordejon

be discussed here, in particular the reversible scaling technique, is that ofde Koning and Reinhardt [5].

5.2 The Calculation of Free energies

Unlike the total internal energy, which depends only on the positions andvelocities of the system at a single point in phase space, the free energydepends on all configurations (all state points) in the phase space volumeaccessible to the system of interest in its given conditions of temperature andvolume (or pressure). This is, in essence, why it is more arduous to calculatethe free energy in atomistic simulations of condensed matter systems. Fromstatistical mechanics we have that the free energy at temperature T has theform

F = −kBT lnZ, (5.1)

where kB is Boltzmann’s constant and Z is the partition function. In Canon-ical ensemble conditions, i.e. constant number of particles, N , constant vol-ume V and constant temperature T , the partition function has the followingexpression:

ZNV T =1

N ! h3N

∫V

d Γ e−βH(Γ), (5.2)

where Γ represents a point in phase space, i.e. the 6N-dimensional spaceformed by the 3N momenta and 3N coordinates of the system, d Γ is thecorresponding volume element, h is Planck’s constant, H is the Hamiltonianof the system, β = (kBT )−1, and only configurations of volume V contributeto the integral. By writing an integral we are implicitly assuming that thesystem under consideration is being described by classical mechanics. In thequantum case the integral would be replaced by a discrete sum over quantumstates.

Frequently it is desirable to consider the system under conditions of con-stant temperature and constant pressure (instead of constant volume), condi-tions which correspond to the so-called isothermal-isobaric orNPT ensemble.In this case, the partition function Z is generalised to

ZNPT =1

N ! h3N

∫ ∞

0

d V e−βPV

[∫V

d Γ e−βH(Γ)

]. (5.3)

In this case the free energy is called Gibbs free energy (and we will label itG), to distinguish it from the canonical free energy (also called Helmholtz freeenergy).

The partition function plays the role of a normalisation factor in thermo-dynamical averages. Consider for example some property A, which depends

5 Free energies in semiconductors 105

on the positions, and/or velocities of the particles in the system. Then thecanonical thermal average of A would be given by

〈A〉NV T =

∫V d Γ A(Γ ) e−βH(Γ)∫

V d Γ e−βH(Γ)

, (5.4)

where the denominator can be recognised as the canonical partition func-tion of Eq. (5.2) save for some factors which are cancelled when taking thequotient of integrals. Such thermal averages are readily estimated by thestandard simulation techniques of (canonical) molecular dynamics (MD) orMonte Carlo (MC), but it is important to note that these techniques directlyestimate the quotient of integrals appearing in Eq. (5.4), and they cannotevaluate each of the integrals separately. Also note that the free energy is notitself a thermal average [it does not have the form of Eq. (5.4)], and thereforeit cannot be obtained by direct MD or MC simulation.

5.2.1 Thermodynamic integration and adiabatic switching

Equations (5.2) and (5.3) clearly illustrate the difficulty of evaluating thefree energy. These integrals are multidimensional, with as many dimensionsas degrees of freedom in the system, and except for very simple models suchas the ideal gas or the harmonic solid, they cannot be evaluated analytically.Furthermore, their high dimensionality precludes any attempt of evaluationby numerical quadrature methods. Thus it would seem that we are facedagainst an unsurmountable difficulty, though fortunately this is not the case.A way out of the hurdle is given by considering the dependence of the Hamil-tonianH on some parameter λ, and asking ourselves how does the free energychange with λ. It is immediately apparent that

∂F

∂λ=

∫Vd Γ

(∂H∂λ

)e−βH(Γ)∫

Vd Γ e−βH(Γ)

=⟨∂H

∂λ

⟩. (5.5)

Eq. (5.5) shows that, while the free energy itself is not a thermal average,its derivative with respect to any parameter λ is a thermal average, andtherefore it can be evaluated employing conventional simulation techniques.This observation forms the basis of the technique known as thermodynamicintegration or coupling parameter [6, 7] method for evaluating free energies,or rather free energy differences. Consider a system, for example a solid,described by some potential U(r), where r stands for the set of (generalized)coordinates specifying the positions of all degrees of freedom, for which wewish to evaluate the free energy at some given temperature and volume. Nowimagine that there is another system, similar in some sense to the system ofinterest, for which the free energy is known at the temperature and volumeat which we wish to know it for the system of interest. We noted earlierthat one of the systems for which the canonical partition function can be

106 E. R. Hernandez, A. Antonelli, L. Colombo, and and P. Ordejon

evaluated analytically is the harmonic solid in which each atom is tied toits equilibrium lattice site by means of a harmonic spring. We will refer tothe system for which the free energy is known as the reference system, andwill assume it is described by a potential Uref (r). Let us now define theλ-dependent Hamiltonian

Hλ =12

∑i

p2i

mi+ λU(r) + (1 − λ)Uref (r). (5.6)

Notice that when λ = 0 the Hamiltonian corresponds to that of the referencesystem (in our example the Einstein solid), while when λ = 1 it is that of thesystem of interest. A direct application of Eq. (5.5) tells us that the derivativeof the free energy Fλ associated with Hλ at λ ∈ [0, 1] is

∂Fλ

∂λ= 〈U(r) − Uref (r)〉λ. (5.7)

Thus, to obtain the free energy difference between the system of interest andthe reference all we need to do is to integrate over λ,

ΔF = F − Fref =∫ 1

0

dλ 〈U(r) − Uref (r)〉λ. (5.8)

Since Fref is known, the sought F follows immediately. As stated above,the λ-dependent thermal averages 〈U(r) − Uref (r)〉λ are readily obtainedfrom standard simulation techniques. In practise, a discrete set of λ valuesin the interval [0, 1] is chosen (usually of the order of 5 to 10), and at eachone of them an equilibrium simulation is carried out with Hamiltonian Hλ,from which the average 〈U(r) − Uref (r)〉λ is obtained. Then the integral inEq. (5.8) is evaluated numerically. Note that because one samples directlythe potential U(r) − Uref (r) using either MD or MC simulation techniques,no harmonic approximation is involved here, and thus anharmonic effects areautomatically taken into account.

For this procedure to work, the chosen reference system must be in somesense similar to the target system. By this we mean that as λ is changed, thesystem described by HamiltonianHλ [Eq. (5.6)] should not undergo any phasetransitions. If that were to occur, some of the work spent in transforming thereference system into the system of interest (or vice-versa) would actually bespent in the latent heat of the phase transition, and the resulting free energyestimate would be inaccurate. Also, with a view to making the numericalintegration of Eq. (5.8) accurately, it is desirable that the quantity U(r) −Uref(r) is subject to fluctuations as small as possible, and this is anothercriterion of similarity between target and reference systems. We have alreadymentioned that a useful reference for free energy calculations of solids isthe Einstein crystal, but this would not be a good reference for a liquid.Fortunately, there are a number of simple model liquids, such as the Lennard-Jones fluid [8] or the inverse-power fluid [9], which have been extensively

5 Free energies in semiconductors 107

studied, and for which the free energy has been tabulated in a wide rangeof temperature and pressure (density) conditions, and therefore these modelsserve the purpose of reference systems for liquids. Of course, for systems inthe gas phase the ideal gas is an adequate reference.

Eq. (5.8) reminds us again of the fact that it is harder to obtain freeenergies than it is to calculate thermal averages. A thermal average is typ-ically obtained from a single equilibrium (MD or MC) simulation, while toevaluate the free energy of the same system one needs several simulations.The effort is increased further if the free energy must be evaluated at othertemperatures or volumes (pressures). It can easily be seen that the amountof work necessary to find a coexistence point, let alone map out a phaseboundary (where two coexisting phases have the same free energy), soonbecomes a daunting task, if one must resort to the schemes described thusfar. Fortunately, starting in the early 90s, a number of alternative techniquesof increased efficiency have been developed, which considerably amelioratethis situation, and thanks to them the task of calculating phase boundaries,and even entire phase diagrams is nowadays more accessible. The first suchdevelopment was proposed by Watanabe and Reinhardt [10], who showedthat accurate estimations of ΔF could be obtained from a single simulationwith HamiltonianHλ [Eq. (5.6)], during which the parameter λ is slowly (i.e.quasi-adiabatically) switched from 0 to 1 (or vice-versa). Thus the task ofperforming several equilibrium simulations at different values of λ to obtainΔF is reduced to that of performing a single non-equilibrium simulation ona system which quasi-adiabatically transmutes from the reference system tothe system of interest (or the reverse). During this non-equilibrium simula-tion one computes the irreversible work (reversible in the adiabatic limit)Wirr(t, λ=0→1), given by

Wirr(t, λ=0→1) =∫ t

0

dt′ λ {U [r(t′)] − Uref [r(t′)]} , (5.9)

where λ = dλ/dt′ is the rate of change of λ, and the time variable correspondsto either real time in an MD simulation, or to simulation time in an MCsimulation (each MC sweep corresponding to a unit of simulation time). Inthe adiabatic limit, Wirr(t=∞, λ=0→1) would be equal to the free energydifference, ΔF , but since in practise strict adiabaticity cannot be attained,Wirr(t, λ=0→1) is only an approximation toΔF , due to entropic dissipationeffects (see below). Nevertheless, experience shows that Eq.(5.9) providessatisfactorily accurate results with modest computational efforts, indeed moremodest than those required in a full thermodynamic integration calculation.We will refer to the method of Watanabe and Reinhardt as the adiabaticswitching method.

The adiabatic switching method is based on two key observations. Thefirst one is that, by virtue of Liouville’s theorem, any classical trajectory, evenone generated by a time dependent Hamiltonian, preserves the phase space

108 E. R. Hernandez, A. Antonelli, L. Colombo, and and P. Ordejon

volume. In other words, the trajectory generated by a time-dependent clas-sical Hamiltonian will evolve on a phase space hyper-surface which, thoughitself changing shape with time, encloses a fixed amount of phase space vol-ume. The second observation, due to Hertz [10], applies when the evolutionof the Hamiltonian is slow, or adiabatic. Under these conditions, the evolvinghyper-surface of phase space on which the trajectory moves corresponds, ateach instant t, to a constant energy shell of energy H(t). Therefore, a tra-jectory generated by an adiabatically evolving Hamiltonian preserves the en-tropy, defined as S = kB lnΩ, where Ω is the (invariant) phase space volumeenclosed by the adiabatically evolving phase space hyper-surface. This is themechanical analog of the observation in thermodynamics that an adiabaticprocess (i.e. one which proceeds along a succession of equilibrium states) ex-erted on a system preserves its entropy. The consequence of this is that, sinceΔS = 0 when λ in Eq. (5.6) is adiabatically switched along the trajectorygenerated by Hλ in canonical ensemble conditions, the free energy changeinvolved in this switching process can be measured by the internal energychange alone. As pointed out above, in the limit of strict adiabaticity, wewould have that ΔF = Wirr(t=∞, λ=0→ 1), but since the switching pro-cess cannot be truly adiabatic there will be some entropic dissipative effects,which will cause Wirr(t, λ=0→1) to be an upper bound of ΔF . In fact, onecan also run the switching simulation in reverse, i.e. switching λ from 1 to 0,and the resulting irreversible work, Wirr(t, λ=1→ 0) then provides a lowerbound to ΔF . This proves to be a convenient way of determining error barsfor the estimated free energy [5].

5.2.2 Reversible scaling

Another important development allows one to obtain the free energy in aquasi-continuous range of temperatures, starting from a reference temper-ature at which the free energy is known, from either a previous adiabaticswitching or thermodynamic integration calculation. This method, known asreversible scaling , was introduced by de Koning, Antonelli and Yip [11], andis based on the formal equivalence that exists, from a statistical mechanicsperspective, between scaling the temperature and scaling the potential. In-deed, the canonical partition function of a system described by potential U(r)at temperature T is

ZNV T =1

N ! Λ3N(T )

∫V

d r e−U(r)kBT , (5.10)

where the factor Λ(T ) = (h2/2πmkBT )1/2 results from integrating out themomenta, and we have assumed that all particles are identical. Now considera system described by a scaled potential, λU(r), at a different temperature,T0. For this system, the partition function would be

5 Free energies in semiconductors 109

ZNV T0(λ) =1

N ! Λ3N(T0)

∫V

d r e−λU(r)kB T0

=λ3N/2

N ! Λ3N(T0/λ)

∫V

d r e−U(r)

kB T0/λ . (5.11)

Now, it is easy to see that, except for a factor of λ3N/2, the partition functionof the scaled system at temperature T0/λ and that of the unscaled system attemperature T are identical, if we impose that T0 = λT . From this it is easyto derive the following relation between the free energies of the scaled andunscaled systems:

F (T )T

=Fs(T0, λ)T0

+32NkB lnλ (5.12)

In words, we have related the free energy of a system at temperatureT to that of an appropriately scaled system at temperature T0 such thatT0 = λT . From a formal point of view not much has been done. However,using the technique of adiabatic switching described in 5.2.1 it is very simpleto calculate Fs(T0, λ), at fixed temperature T0 but in a quasi-continuous rangeof λ values in some interval λ ∈ [λmin, λmax], and by virtue of Eq. (5.12),each of these Fs(T0, λ) values corresponds to the free energy of the unscaledsystem at a quasi-continuous range of temperatures T ∈ [T0/λmax, T0/λmin].Fs(T0, λ) is obtained as

Fs(T0, λ) = Fs(T0, λ0) +WNV Tirr (λ0→λ), (5.13)

where Fs(T0, λ0) is the free energy of the scaled system at the initial value ofthe scaling parameter λ0 (usually either λmin or λmax) and WNV T

irr (λ0 →λ)is the work done by switching the scaling parameter from λ0 to λ, which,using the adiabatic switching technique is estimated as

WNV Tirr (λ0→λ) =

∫ t

0

d t′ λ U [r(t′)]. (5.14)

Thus, instead of having to run a separate adiabatic switching or thermo-dynamic integration calculation to obtain the free energy of the system ofinterest for each temperature in a discrete set spanning the desired rangeof temperatures, one can perform a single adiabatic switching calculation ofthe scaled system at some appropriately chosen temperature T0, and simplyquasi-continuously vary the scaling parameter λ from λmin to λmax (or vice-versa). Note that in Eq. (5.14) above each time step leads to a new value ofλ and a corresponding value of WNV T

irr (λ0→λ), which, through Eq. (5.12),leads to an estimate of F (T ) at temperature T = T0/λ.For the sake of simplicity, we have presented the method in the context of

the canonical ensemble, but the same ideas can be applied in the isothermal-isobaric case [12], which is of more interest when it comes to studying phase

110 E. R. Hernandez, A. Antonelli, L. Colombo, and and P. Ordejon

transitions. In this case one finds that it is necessary not only to scale thepotential energy but the pressure as well, so that the scaled system pressureis Ps(λ) = λP , P being the pressure of the unscaled system. Then Eq. (5.12)transforms into

G(T, P )T

=Gs[T0, Ps(λ)]

T0+

32NkB lnλ, (5.15)

relating the Gibbs free energies of the scaled and unscaled systems, andGs(T0, Ps, λ) = Gs[T0, Ps(λ0), λ0] +WNPT

irr (λ0→λ), with

WNPTirr (λ0→λ) =

∫ t

0

d t′ λ{U [r(t′)] +

dPs(λ)dλ

V (t′)}, (5.16)

which is obtained from an adiabatic switching calculation carried out underisobaric-isothermal conditions.

The reversible scaling technique discussed above provides an efficient pro-cedure for calculating coexistence points between different phases of a givenmaterial or substance. Using reversible scaling in the isothermal-isobaric en-semble one can easily obtain the Gibbs free energy of the two different phases,at constant pressure P, in a range of temperatures bounding the values wherethe coexistence point is expected to be located. The temperature at which thetwo free energies match is by definition the coexistence temperature of thetwo phases at pressure P. This strategy is illustrated for the particular caseof the melting point of Si in the diamond phase at 0 pressure in Fig. (5.1). Siwas simulated with the semi-empirical tight binding (TB) [13] Hamiltoniandue to Lenosky and coworkers [14], which describes very well the structuraland thermal properties of Si in several of its phases [15].

5.2.3 Phase boundaries and phase diagrams

If one wishes to map out an entire phase boundary, rather than just a singlecoexistence point between two phases in equilibrium, one option would beto iterate the procedure illustrated in Fig. (5.1) to obtain new coexistencetemperatures at different pressures. However, an alternative method existswhich is often more efficient and practical, since it does not require any fur-ther calculation of free energies along the phase boundary. This techniquewas pioneered by Kofke [17, 18], and is known as Gibbs-Duhem or Clausius-Clapeyron integration. In this method, starting from some previously deter-mined coexistence point along the desired phase boundary, one numericallysolves the Clausius-Clapeyron equation:

d T

dP= T

ΔV

ΔH. (5.17)

This equation relates the slope of the phase boundary, dT/dP , at thecurrent coexistence point, given by temperature T and pressure P , with the

5 Free energies in semiconductors 111

1530 1540 1550 1560 1570 1580Temperature (K)

−5.17

−5.16

−5.15

−5.14

−5.13

−5.12

Gib

bs F

ree

Ene

rgy

(eV

/ato

m)

DiamondLiquid

Fig. 5.1. Gibbs free energy of the diamond and liquid phases of Si in the neigh-bourhood of the zero pressure melting point, as obtained from reversible scalingsimulations with the Lenosky et al. [14] model. The simulations contain 128 atomsin each phase, and four k-points were used to sample the Brillouin zone. It can beseen that the two free energies become equal at approximately 1551 K, and abovethis temperature the liquid becomes thermodynamically more stable. More detailsof these calculations can be found in [15, 16].

differences of molar volumes ΔV and molar enthalpies ΔH of the coexistingphases. Thus, if ΔV and ΔH are known at the current coexistence point,one can estimate how the equilibrium temperature will change if the pressurechanges by some small amount. ΔV and ΔH can be obtained from standardequilibrium simulations using MD or MC. Note that the interface betweenthe coexisting phases need not be taken into account: each phase is inde-pendently simulated in a separate simulation box, at identical conditions oftemperature and pressure. Once the average volume and enthalpy of eachphase is known with sufficient accuracy, new coexistence conditions are de-rived from Eq. (5.17), and the process is iterated until the phase boundaryhas been mapped out. The stability and accuracy of the method have beenanalysed in detail by Kofke [18]. This method has been extensively employed,and some particular examples of its use will be illustrated in the next section.

A parallelism can be established between Kofke’s Clausius-Clapeyron in-tegration method and thermodynamic integration. In the latter, one performsa series of equilibrium simulations at different λ values in the range [0, 1] withthe aim of obtaining the free energy of the system. In Clausius-Clapeyron in-tegration, one performs a series of equilibrium simulations at different valuesof the independent variable (T or P ), and uses Eq. (5.17) to find how thedependent variable (P or T ) adapts to the change in the independent variable

112 E. R. Hernandez, A. Antonelli, L. Colombo, and and P. Ordejon

such that the two phases remain in equilibrium. However, as we saw in 5.2.1,the calculation of the free energy can be made more efficiently using adia-batic switching, where instead of several (c.a. 5-10) equilibrium simulations, asingle non-equilibrium quasi-adiabatic simulation is employed. It is thereforenatural to ask if a similar gain in efficiency is possible in Clausius-Clapeyronintegration, and the answer is affirmative, as de Koning et al. [12] have shown.The Clausius-Clapeyron equation [Eq. (5.17)] is arrived at by demanding thatthe change in free energy due to a small change in the independent variablebe the same in both phases. For illustrative purposes, let us consider the casein which the temperature T plays the role of independent variable, while thepressure P is the dependent one. Like in the reversible scaling method, letus now consider scaled versions of the two phases at constant temperatureT0 and pressure PS . When λ = 1 these correspond to some initial coexis-tence point. The free energies of the scaled phases will be GS,a and GS,b,respectively. In the scaled case, now the temperature is fixed, and the roleof the independent variable is played by λ. Therefore, as λ is changed, thescaled phases will depart from equilibrium, unless the scaled pressure PS(λ)evolves in such a way as to make the change in GS,a equal to that in GS,b. Byrequiring that this be the case, one arrives at a Clausius-Clapeyron equationfor PS(λ), which reads

dPS(λ)dλ

= −ΔUΔV

, (5.18)

where ΔU is the difference of internal energies between the two phases. Ifλ is stationary, the values of ΔU and ΔV would be obtained from stan-dard equilibrium simulations, and everything would be exactly as in the caseof the Clausius-Clapeyron integration of Kofke, but working with the scaledphases. However, Eq. (5.18) suggests that, as in the reversible scaling method,λ can be varied quasi-continuously (i.e. quasi-adiabatically). In these condi-tions, ΔU and ΔV will adapt to the changing value of λ, and PS(λ) willevolve according to Eq. (5.18) such that the scaled phases remain in equilib-rium. Through the scaling relations T = T0/λ, P = λPS(λ), the equilibriumpressure P (T ) at temperature T will be obtained for the unscaled phases.de Koning et al. demonstrated the usefulness of the method by calculatingthe melting curve of the Lennard-Jones model for pressures between 0 upto 160 reduced units, obtaining results matching those obtained previouslyby Agrawal and Kofke [19] employing the standard (equilibrium) Clausius-Clapeyron integration method. In the next section we will see another ap-plication, in which this dynamic Clausius-Clapeyron integration method hasbeen recently used to obtain the phase diagram of Si as predicted by a semi-empirical TB model.

5 Free energies in semiconductors 113

5.3 Applications

Let us now discuss some examples of applications of the techniques describedin section 5.2 to the study of thermal properties of defects, phase transi-tions and phase diagrams. Our focus, given the theme of this book, will bemostly on semiconductors, though occasionally we will mention examples ofapplications to other types of materials, due to their importance or to their il-lustrative value. Let us remark that we do not intend to provide an exhaustivereview of the literature concerned with thermal properties of semiconduct-ing materials, as this would be beyond the scope of this chapter. Our aim israther to illustrate the capabilities of the novel techniques discussed above.

5.3.1 Thermal properties of defects

Most calculations of the free energy and other thermal properties of defectsin semiconductors to be found in the literature rely on the use of the quasi-harmonic approximation, and since this topic is going to be covered at lengthin Chapter 8, we will not dwell much on it here. However, we must mentiontwo studies which employed techniques described in section 5.2, and whichtherefore go beyond the quasi-harmonic approximation. Both studies illus-trate nicely the potential of the techniques discussed in this chapter, andmake evident the need for incorporating the effects of anharmonicity at suf-ficiently high temperatures.

The first example is the study of Jaaskelainen et al. [20], who computedthe free energy of formation of the vacancy and self-interstitial in Si as afunction of temperature. A thorough study of self–diffusion requires accuratefree energy calculations (aimed at predicting temperature-dependent equi-librium concentrations) and extensive diffusivity simulations (aimed at com-puting migration energies and diffusivity prefactors) for all relevant kinds ofnative defects. In their study, Jaaskelainen and coworkers considered onlyself–interstitial (I) and vacancy (V) defects.

Once the formation free energies F fI,V = Ef

I,V +TSfI,V , as well as migration

energies EmI,V and diffusivity prefactors d0

I,V are known, the self-diffusioncoefficient DSD(T ) can be cast in the form

DSD(T ) = d0I exp(−E

fI − TSf

I

kBT) exp(−Em

I /kBT ) +

d0V exp(−E

fV − TSf

V

kBT) exp(−Em

V /kBT ) (5.19)

so that a direct theory vs. experiment comparison is possible.The thermodynamic integration (TI) method [2] was adopted by Jaaskelainen

et al. [20] to evaluate F fI,V in c-Si by means of the tight binding (TB) model

provided by Kwon et al. [21]. The ensemble average appearing in Eq. (5.8)

114 E. R. Hernandez, A. Antonelli, L. Colombo, and and P. Ordejon

was performed during constant-volume, constant-temperature molecular dy-namics (TBMD) simulations in a 64±1 atom periodically-repeated cell. Thethermodynamical integration was performed over 16 λ points, while free en-ergy calculations were performed at four different temperatures, namely 300,500, 1000 and 1400 K. The formation entropy Sf

I for the self-interstitial defectwas found to be almost constant with temperature, the average value beingSf

I = 11.2kB. First-principles calculations [5] predict a value of SfI ∼ 10kB

when including anharmonic terms through TI calculations as well. The caseof a vacancy is more complicated due to its high mobility [22], which effec-tively adds a sizeable contribution of migration entropy in the TI free energycalculations. This is confirmed by the fact that the computed value of Sf

V var-ied in the range 10.2–11.7 kB in the selected temperature interval, with anaverage value of Sf

V ∼ 10.8kB. In the case of the vacancy, in order to preventthe double counting of the migration entropy, which is already taken intoaccount by the TBMD simulations aimed at measuring d0

V , short observationruns were performed, taking care to select only those simulations where Vdiffusion actually did not take place. The resulting average formation (i.e.configurational+vibrational) entropy was Sf

V ∼ 8.8kB.These TBMD results for Sf

I and SfV are in good qualitative agreement

with first–principles calculations by Blochl et al. [5] in the sense that thedifference in the entropies of formation for interstitials and vacancies is ofthe order of 1-2 kB in both studies, with a larger formation entropy for theinterstitial.

Diffusivity constants were finally obtained by using the migration pref-actors and energies computed by means of the same TBMD functional, fol-lowing Tang et al. [22]. The overall reliability of this TI-TBMD approachis summarised in Fig. (5.2), where the silicon TBMD total self-diffusion co-efficient DSD = DI + DV is compared with state-of-the-art experimentaldata [23,24]. As can be seen, there is extremely good agreement between thedifferent sets of experimental data and the theoretical results. These temper-ature-dependent self-diffusion values should be therefore useful in modellingSi bulk processing.

As noted above, Jaaskelainen et al. implicitly assumed that the main con-tributors to self diffusion in Si are the vacancy and interstitial defects, thusneglecting the possible contribution of other kinds of defects, such as clus-ters of interstitials or vacancies. Indeed, both first principles simulations [28]and tight binding temperature-accelerated MD simulations have revealed thatsmall clusters of defects can also make substantial contributions to the mobil-ity. In particular, the study of Cogoni et al. [29] shows quite conclusively thatin all range of temperatures the mobility of the interstitial dimer is compara-ble to that of the single interstitial. At sufficiently high temperatures (above1400 K) even the trimer diffuses fairly rapidly. However, it should be noticedthat to better estimate the significance of these clusters of interstitials toself-diffusion in Si, their concentrations, as well as their mobilities have to be

5 Free energies in semiconductors 115

Fig. 5.2. Temperature dependence of the total self-diffusion coefficient in silicon.The continuous line are the experimental results of ref. [23], the dotted line are theexperimental measurements of ref. [24], and the dot-dash line are the theoreticalresults of ref. [?]. from Jaaskelainen et al., Phys. Rev. B 64, 233203 (2001).

taken into account. Since their formation energies are larger than that of thesingle interstitial, it is likely that in equilibrium conditions their net contribu-tion to self-diffusion will be rather small. On the other hand, the processingof Si samples may incur non-equilibrium defect populations, and under theseconditions the contribution of clusters of interstitials to self-diffusion couldbe important.

Let us now focus on another aspect of the thermal properties of Si. Atroom temperature, Si is a brittle material, but as the temperature is raised,a dramatic change in its mechanical properties takes place in a very narrowrange of temperatures around 873 K, and Si becomes a ductile material [30].It is accepted that this change is due to the fact that at this temperaturedislocations can be more easily nucleated and emitted at the tip of cracks,blunting the crack tip and making its propagation through the material moredifficult. In Si there are two sets of closely packed {111} slip planes, calledshuffle and glide. While dislocations nucleated at the shuffle set have highPeierls stress and low mobility, the dislocations on the glide set can split intopartials, having a smaller Peierls stress and higher mobility. It is believedthat the brittle-ductile transition in silicon can be related to the change indominance of one set over the other. Over 10 years ago, Rice and collabora-tors [31] proposed a model in which the resistance to nucleation of dislocationsat the crack tip can be measured by the so-called unstable stacking energy,which is the lowest energy barrier that needs to be crossed when one half ofa crystal slips relative to the other half. Through a combination of densityfunctional theory calculation with Vineyard’s transition state theory, Kaxi-ras and Duesbery [32] have shown that an abrupt transition from shuffle to

116 E. R. Hernandez, A. Antonelli, L. Colombo, and and P. Ordejon

400 600 800 1000 1200 1400 1600

-20

0

20

GLIDE

SHUFFLE

corrected EDIP DFT-LDA Exp. brittle fracture stress

Pres

sure

(kb

ar)

T(K)

Fig. 5.3. Coexistence curves separating the preferable nucleation of dislocations onshuffle planes versus glide set. The full line are the results obtained from the EDIPpotential after applying a correction for the overestimation of γus, the unstablestacking energy, by this potential. The dotted line shows the results obtained withDFT LDA calculations, and the dashed horizontal line gives the value of the ex-perimental brittle fracture stress. Figure reprinted with permission from de Koninget al., Phys. Rev. B 58, 12555 (1998). Copyright (1998) by the American PhysicalSociety.

glide dominance happens with temperature. In this approach, the tempera-ture effects were considered without taking into account the real dynamicsof the atoms on either side of the slip plane. The calculations by de Kon-ing et al. [33] using the adibatic switching method within MD simulationsfully included such temperature effects. Fig (5.3) shows the phase diagramfor preferable nucleation of dislocations on shuffle plane versus glide set. Onecan see from the figure that the inclusion of all anharmonic effects in thecalculations by de Koning et al. did not change substantially the diagrampreviously obtained by Kaxiras and Duesbery.

5.3.2 Melting of Silicon

Probably the first calculation of the melting point of crystalline Si from sim-ulation was carried out by Broughton and Li [34], employing the well-knownStillinger-Weber [35] potential. Broughton and Li calculated the zero pressuremelting point by obtaining the free energy of the crystal and liquid phases

5 Free energies in semiconductors 117

at different temperatures. The free energy of the crystal was calculated bythermodynamic integration at different temperatures, taking as reference sys-tem the harmonic solid. For the liquid phase, the free energy was obtainedby first switching off the attractive two-body part of the potential to avoidthe occurrence of a possible first-order liquid-vapour transition, and then ex-panding the system to zero density. They obtained a melting temperature of1691 K, with an estimated error of ±20 K. This value is in excellently goodagreement with the currently accepted experimental value of 1687 K [36], anagreement which is perhaps not so surprising, as the Stillinger-Weber poten-tial was parametrised on the basis of data from both the crystal and liquidphases. The careful free energy tabulation of the Stillinger-Weber potentialfor the crystal and liquid phases of Si achieved by Broughton and Li has al-lowed subsequent studies (see below) to use this model as a reference systemfor free energy calculations of Si employing higher levels of theory.

In a subsequent study, Cook and Clancy [37] compared several empiri-cal potential models in their ability to describe the thermal behaviour of Siand Ge; in particular they considered the well-known Tersoff potential [38],and a form of the embedded atom potential modified to describe covalentbonding (Modified Embedded Atom Method, MEAM) [39]. In this work themelting point for each model was determined employing the two-phase coex-istence method, i.e. by simulating the solid and liquid in a large cell in directcoexistence. It was found that the Tersoff model for Si lead to a melting tem-perature of 2547 K, well above the experimental value of 1687 K, while theMEAM showed the opposite behaviour, sub-estimating the melting tempera-ture at 1475 K. Both models correctly predict, like does the Stillinger-Weberpotential, a slight increase of the density upon melting, though all modelsunderestimate somewhat the crystal and liquid densities. A similar study,comparing several tight binding models in their ability to reproduce the liq-uid structure, and the melting temperature of Si has been recently carriedout by Kaczmarski et al. [15].

Very soon after the proposal of the adiabatic switching technique byWatanabe and Reinhardt [10], Sugino and Car [48] employed this technique,combined with first principles Car-Parrinello dynamics [49], to study themelting of Si. These authors employed as reference system the Stillinger-Weber potential for both the crystal and liquid phases, for which the free en-ergy had been previously determined by Broughton and Li [34] as a functionof temperature, as discussed above. Sugino and Car thus calculated the freeenergy difference between the reference model and a first principles descrip-tion of Si employing density functional theory (DFT) with the local densityapproximation (LDA) for the exchange and correlation energy, combined withthe pseudopotential approximation. They predicted a melting temperature of1350± 100 K, some 300 K below the experimental value. This error was at-tributed to the relative de-stabilisation of the solid phase with respect to theliquid in the LDA, which would result in a reduction of the melting point.

118 E. R. Hernandez, A. Antonelli, L. Colombo, and and P. Ordejon

Any small error in the difference of free energies translates into a rather largeerror in the melting temperature, and hence these kind of calculations re-quire very stringent accuracy constraints. More recently Alfe and Gillan [50]have revisited the theme of the melting of Si from first principles simulations.These authors, employing the LDA approximation obtained a melting tem-perature of 1300±50 K, in good agreement with the value reported earlier bySugino and Car. They also performed calculations employing a generalisedgradient approximation (GGA) to the exchange and correlation energy, ob-taining in this case a somewhat improved value of the melting temperatureof 1492± 50 K, which, although still below the experimental value by nearly200 K, clearly improves upon the LDA result. This observation confirms thesuggestion made by Sugino and Car that the underestimation of the meltingpoint was most likely attributable to errors in the exchange and correla-tion functional, and not in any other of the approximations involved in thesimulations (pseudopotential, basis set, etc). It is worth noting that Si is atroublesome case, as the melting point separates two phases of different elec-tronic character: while the crystal is semiconducting, the liquid is metallic.Errors due to the approximate nature of the functionals may in general beof different size in insulators and metals. Indeed, when systems in which theelectronic character of the material does not change upon melting have beenstudied, DFT methods can predict melting temperatures within a few tensof K of the experimental result. This is the case for Al, which has been stud-ied by de Wijs et al. [51], who reported a value of Tm = 890±20 K, comparedto an experimental value of 933.47 K.

Both liquid silicon (l-Si) and water present an anomalous behaviour of thedensity and specific heat with temperature. Also, like water, l-Si is a very poorglass former, in the sense that it is extremely difficult to experimentally obtainthe amorphous phase by quenching from the melt. In the case of silicon, as amatter of fact, this has never been achieved. Due to these similarities, it hasbeen proposed that, as in water, supercooled l-Si could present polymorphism,as well as polyamorphism. Experiments have suggested the existence of afirst-order-like liquid-liquid (l-l) transition in supercooled l-Si [40, 41]. Also,it has been considered, based on experimental results, that the transition fromamorphous silicon (a-Si) to l-Si (l-a transition) would be also a discontinuoustransition [40]. From the computational point of view, MD simulations usingthe Stillinger-Weber potential [42] have determined a l-l transition at about1060 K, but have not found evidence of a l-a transition. Other computationalresults, using the EDIP [43] model for silicon, found a first-order-like l-atransition at 1170 K and no evidence of a l-l transition [44]. In a recentwork using the reversible scaling method within MC simulations and alsousing the EDIP model, Miranda and Antonelli [45] have shown that a l-ltransition occurs at 1135 K and a l-a transition takes place at 843 K. Fig. (5.4)shows the normalised excess entropy of the liquid phase with respect thecrystalline phase as a function of the normalised temperature. From the figure

5 Free energies in semiconductors 119

Fig. 5.4. Kauzmann plot for environment-dependent interatomic potential (EDIP)-Si. Normalized entropy difference between the disordered and crystalline phasesaccording to the EDIP model as a function of normalised temperature. Reprintedwith permission from Miranda and Antonelli, J. Chem. Phys. 120, 11672 (2004).Copyright (2004), American Institute of Physics.

one can see that as the temperature is lowered, first a discontinuous transitiontakes place, and as the temperature is lowered further, a glass-like transitionoccurs, beyond which the excess entropy becomes constant. This would bein agreement with recent experiments [46] which have indicated that a glass-like transition takes place at about 1000 K. Also, the configurational entropyobtained by Miranda and Antonelli for the amorphous phase is in agreementwith that obtained by Vink and Barkema [12] using a methodology based ongraph and information theories.

5.3.3 Phase diagrams

As discussed above, the efficient free energy techniques exposed in section 5.2,combined with Kofke’s Clausius-Clapeyron integration method, can be em-ployed to calculate phase diagrams of materials entirely from simulation. Twoparticularly interesting examples of applications of these techniques are thedetermination of the phase diagram of water by Sanz et al. [52], and thephase diagram of carbon by Ghiringhelli et al. [53].

The phase diagram of water is very rich, with at least nine stable icephases documented. Many different empirical potentials aiming at describingthis complicated system have been proposed in the literature, and Sanz andcoworkers [52] have

compared two of them, namely the so-called TIP4P model of Jorgensen etal. [54], and the SPC/E model of Berendsen et al. [55], in their ability to re-

120 E. R. Hernandez, A. Antonelli, L. Colombo, and and P. Ordejon

0.01

0.1

1

10

100 150 200 250 300 350 400 450

P (

GP

a)

T (K)

I

II

VI

VIII VII

liquidV

III

0.01

0.1

1

10

150 200 250 300 350 400 450

P (

GP

a)

T (K)

I

II

VI

VIII VII

liquid

V

III

0.01

0.1

1

10

100 150 200 250 300 350 400 450

P+

0.1

(GP

a)

T (K)

I

II

VI

VIII VII

liquid

Fig. 5.5. Phase diagram of water. The left panel shows the phase diagram predictedby the TIP4P potential; the left panel that obtained from the SPC/E model, and thecentral panel displays the experimental phase diagram. The SPC/E phase diagramhas been shifted by 0.1 GPa in order to make phase I visible, as this phase appears atsmall negative pressures, according to this model. Figure reprinted with permissionfrom Sanz et al., Phys. Rev. Lett. 92, 255701 (2004). Copyright (2004) by theAmerican Physical Society.

produce the experimental phase diagram of water. Sanz et al. used standardthermodynamic integration to locate coexistence points between the differ-ent phases, and then obtained the corresponding coexistence lines by usingClausius-Clapeyron integration as proposed by Kofke [17,18]. The theoreticalphase diagrams derived from TIP4P and SPC/E models are compared withthe experimental water phase diagram in Fig. (5.5).

As can be seen in Fig. (5.5), the phase diagram predicted by the TIP4Pmodel agrees fairly well with the experimental one, while that resulting fromthe SPC/E model deviates noticeably. The TIP4P model correctly predictsice phases I, II, III, V, VI, VII and VIII to be stable, as indeed they are foundto be in real water, while the SPC/E model, on the other hand, predictswrongly that phases III and V are metastable (i.e. there is no temperature-pressure domain in which these phases are found to have the lowest Gibbsfree energy).

A second example, more in the line of the theme of this book, is providedby the phase diagram of carbon obtained by Ghiringhelli et al. [53]. If mod-elling water accurately is a significant challenge for empirical potentials, asthe work of Sanz et al. [52] so nicely illustrates, the situation with carbonis by no means easier. The three phases considered in the study of Ghir-inghelli and coworkers, namely graphite, diamond and the liquid phase, havemarkedly different structures and types of bonding, and devising an empir-ical potential which is capable of reproducing not only these structural andbonding differences, but also the thermodynamic behaviour of each phase, isa significant achievement. In this study, the recently developed model of Losand Fasolino [56] was used. Ghiringhelli et al. proceeded, like in the previ-ously discussed example of water, by first locating coexistence points betweenthe different phases (graphite-diamond, graphite liquid, and diamond-liquid),

5 Free energies in semiconductors 121

0 10 20 30 40 50 60Pressure [GPa]

0

1000

2000

3000

4000

5000

6000

7000

Tem

pera

ture

[K

]

D

L

Calc. Graphite melting lineCalc. Graphite-Diamond coex.Calc. Diamond melting lineExp. Graphite melting line (Bundy 1996)Exp. Graphite melting line (Togaya 1997)Exp. Graphite-Diamond coex. (Bundy 1961)0 K Graphite-Diamond coex. (Bundy 1996)

G

Fig. 5.6. Phase diagram of carbon as obtained by Ghiringhelli and coworkers [53].This figure shows the phase diagram only up to pressures of 60 GPa, but thediamond melting line was followed up to pressures of 400 GPa. The solid blacktriangle, blue diamond and red square are the three coexistence points from whichlater the phase boundaries were obtained by Clausius-Clapeyron integration. Thepink solid circle is an experimental estimate of the liquid-graphite-diamond triplepoint. Figure reprinted with kind perission from Ghiringhelli et al., Phys. Rev. Lett.94, 145701 (2004). Copyright (2005) by the American Physical Society.

and then using Clausius-Clapeyron integration to determine the correspond-ing phase diagram. A portion of the phase diagram they obtained is shown inFig. (5.6), where also some experimentally measured coexistence points areshown for comparison.

The phase diagram of carbon is a challenge not only for simulation, buteven more so for experiments. Indeed the melting temperature of graphiteoccurs well above 4000 K. The melting line of diamond has not to date beenexperimentally characterised. The situation is a bit better for the graphite-diamond coexistence line, for which some measurements exist. As can be seenfrom Fig. (5.6), the carbon model of Los and Fasolino [56] predicts a phasediagram which agrees well with the available experimental data. In particularthe slope of the graphite-diamond coexistence curve seems to be well repro-duced, although the model predicts this transition to occur at slightly higherpressures than observed experimentally. The graphite melting line occurs atsomewhat lower temperatures (ca. 4000 K) than the experimental values,and is predicted to be monotonic with a slight positive pressure derivative,while it appears that the experimental curves show a maximum at around6 GPa. It should be noticed, however, that the experimental values do notcorrespond to real measurements, and are in fact indirect estimations basedon the ambient pressure value [53].

122 E. R. Hernandez, A. Antonelli, L. Colombo, and and P. Ordejon

A final example that we wish to discuss is that of the phase diagram ofSi obtained recently by Kaczkarski and coworkers [16]. These authors em-ployed the tight binding model of Lenosky et al. [14], as it is known that noempirical potential currently available predicts correctly the structure of theliquid phase in this system. The range of temperatures and pressures con-sidered by Kaczmarski et al. encompassed four phases, namely the diamondphase, which is the stable one at ambient conditions, the liquid phase, theβ-Sn structure, into which the diamond phase transforms when subject toa sufficient amount of compressive pressure, and the Si36 clathrate phase,which becomes stable at negative (expansive) pressures. Like in the case ofC, not much experimental data exists concerning the equilibrium conditionsof these phases of Si. It is known, however, that the zero pressure meltingpoint of the diamond phase occurs at 1687 K, and that its melting line hasa negative slope, which results from the fact that the liquid phase is denserthan the solid, while having a higher enthalpy. In fact, some potential modelsand even some tight binding ones, fail to reproduce this behaviour correctly,and would consequently produce a qualitatively wrong melting behaviour,even if some of them predict the zero pressure melting point rather accu-rately. In a previous study, Kaczmarski et al. [15] compared different tightbinding models in their ability not only to accurately reproduce the meltingtemperature of the Si diamond phase, but also the structural properties ofthe liquid. The choice of the model due to Lenosky et al. for the study of thephase diagram was based on the results of that comparison.

One difference between the previous examples of water and C and thework of Kaczmarski et al. [16] is that in the latter the dynamical versionof the Clausius-Clapeyron integration method due to de Koning et al. wasused. Also, coexistence points were located from free energies obtained fromthe reversible scaling and adiabatic switching techniques. The combined useof these techniques considerably reduced the computational costs comparedto what would have been required if standard thermodynamic and Clausius-Clapeyron integration had been used instead.

Fig. (5.7) shows the computed phase diagram for Si, compared with avail-able experimental data. It is seen that the overall features of the phase di-agram are qualitatively well reproduced by the model. In particular, thenegative slope of the melting line of the diamond phase is correct, even if theactual absolute slope value is probably underestimated. The diamond-β-Sncoexistence line is also predicted to have a negative slope, though again thevalue seems to be smaller than the experimental one. The coexistence linebetween the diamond and clathrate phases is in good agreement with theonly available experimental data on this transition [57].

5 Free energies in semiconductors 123

−5 0 5 10 15 20Pressure (GPa)

0

200

400

600

800

1000

1200

1400

1600

1800

Tem

pera

ture

(K

)

III

L

C

Fig. 5.7. Silicon phase diagram as predicted by the Lenosky model. The con-tinuous and dashed black curves are calculated coexistence lines. Dashed curvesindicate phase boundaries in regions where the separated phases are metastable,while continuous black curves separate thermodynamically stable phases; uncer-tainty bounds estimated at specific points of the phase diagram (marked by filledcircles) are provided by the error bars. For comparison purposes, a schematic phasediagram summarising the experimental data is shown with red dotted lines, andexperimental data at specific temperatures and pressures is shown in the form ofcoloured symbols. The asterisk corresponds to the zero-pressure melting point ofphase I (diamond), 1687 K [36]; the brown circle is the zero-pressure melting pointof the (metastable) clathrate phase (C), at 1473 K [57]; the blue diamond is thediamond-liquid-β-Sn triple point, with estimated coordinates of 1003 ± 20 K and10.5 ± 0.2 GPa [58]; red squares and triangles indicate the pressures at which theβ-Sn phase (II) was first observed and where the diamond phase ceased to be de-tected, respectively, in the experiments of Voronin et al. [58]; left and right pointingorange triangles give the same information as obtained by Hu et al. [59]; finally, thedownward pointing green triangle is the estimated diamond-clathrate-liquid triplepoint, at 1710 K and −2.5 GPa [57]. Figure reprinted from Kaczmarski, Bedoya-Martınez and Hernandez, Phys. Rev. Lett. 94, 095701 (2005). Copyright (2005) bythe American Physical Society.

124 E. R. Hernandez, A. Antonelli, L. Colombo, and and P. Ordejon

5.4 Conclusions and outlook

In this chapter we have given an overview of recent theoretical developmentsin the methodology of free energy calculations, and we have also provided aselection of applications, most of them in the fields of semiconductors, whichillustrate current capabilities and provide a glimpse of what the future maybring in this topic. Many outstanding problems still remain, however, and ap-plications in fields such as metallurgy and mineral science, particularly whenvariable chemical composition is involved, are still extremely challenging. Wefeel sure, nevertheless, that future theoretical and methodological develop-ments will render such problems more accessible than they are at present.

Acknowledgements

E.R.H. and P.O. thank the Spanish Ministry of Education and Science(MEC) for funding their work through project BFM2003-03372-C03 andthe bilateral Italian-Spanish collaborative action HI2003-0337; A.A. acknowl-edges the support from the Brazilian funding agencies FAPESP, CNPq andFAEP-UNICAMP; L.C. acknowledges financial support under project MIUR-FIRB2001 (contract RBAU01LLX2).

References

1. M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, (ClarendonPress, Oxford 1987).

2. D. Frenkel and B. Smit, Understanding Computer Simulation (Academic Press,New York 1996).

3. J. M. Rickman and R. LeSar, Annu. Rev. Mater. Res. 32, 195 (2002).4. G. J. Ackland, J. Phys.: Condens. Matter 14, 2975 (2002).5. M. de Koning and W. P. Reinhardt, in Handbook of Materials Modeling , (S. Yip,

Ed.), Springer, 2005.6. J. G. Kirkwood, J. Chem. Phys. 3, 300 (1935).7. D. Frenkel and A. J. C. Ladd, J. Chem. Phys. 81, 3188 (1984).8. J. K. Johnson, J. A. Zollweg and K. E. Gubbins, Mol. Phys. 78, 591 (1993).9. D. A. Young and F. J. Rogers, J. Chem. Phys. 81, 2789 (1984).

10. M. Watanabe and W. P. Reinhardt, Phys. Rev. Lett. 65, 3301 (1990).11. M. de Koning, A. Antonelli and S. Yip, Phys. Rev. Lett. 83, 3973 (1999).12. M. de Koning, A. Antonelli and S. Yip, J. Chem. Phys. 115, 11025 (2001).13. C. M. Goringe, D. R. Bowler and E. R. Hernandez, Rep. Prog. Phys. 60, 1447

(1997).14. T. J. Lenosky, et al., Phys. Rev. B 55, 1528 (1997).15. M. Kaczmarski, R. Rurali and E. R. Hernandez, Phys. Rev. B 69, 214105

(2004).16. M. Kaczmarski, O. N. Bedoya-Martınez and E. R. Hernandez, Phys. Rev. Lett.

94, 095701 (2005).

5 Free energies in semiconductors 125

17. D. A. Kofke, Mol. Phys. 78, 1331 (1993).18. D. A. Kofke, J. Chem. Phys. 98, 4149 (1993).19. R. Agrawal and D. A. Kofke, Mol. Phys. 85, 43 (1995).20. A. Jaaskelainen, L. Colombo and R. Nieminen, Phys. Rev. B 64, 233203 (2001).21. I. Kwon, R. Biswas, C. Z. Wang, K. M. Ho and C. M. Soukoulis, Phys. Rev. B

49, 7242 (1994).22. M. Tang, L. Colombo, J. Zhu and T. Diaz de la Rubia, Phys. Rev. B 55, 14279

(1997).23. H. Bracht, E. E. Haller and R. Clark-Phelps, Phys. Rev. Lett. 81, 393 (1998).24. T. Y. Tan and U. Gosele, Appl. Phys. A: Solids Surf. 37, 1 (1985).25. P. E. Blochlm, E. Smargiassi, R. Car, D. B. Lacks, W. Andreoni and S. T. Pan-

telides, Phys. Rev. Lett. 70, 2435 (1993).26. R. Car, P. Blochl and E. Smargiassi, Mater. Sci. Forum 83-87, 433 (1992).27. E. Smargiassi and R. Car, Phys. Rev. B 53, 9760 (1996).28. S. K. Estreicher, M. Garaibeh, P. A. Fedders and P. Ordejon, Phys. Rev. Lett.

86, 1247 (2001).29. M. Cogoni, B. P. Uberuaga, A. F. Voter and L. Colombo, Phys. Rev. B 71,

121203 (2005).30. J. Samuels and S. G. Roberts, Proc. R. Soc. London, Ser. A 421, 1 (1989).31. J. R. Rice, J. Mech. Phys. Solids 40, 239 (1992); J. R. Rice and G. E. Beltz,

ibid 42, 333 (1994); Y. Sun and G. E. Beltz, Mater. Sci. Eng. A 170, 67 (1993).32. E. Kaxiras and M. S. Duesbery, Phys. Rev. Lett. 70, 3752 (1993).33. M. de Koning, A. Antonelli, M. Z. Bazant, E. Kaxiras and J. F. Justo, Phys.

Rev. B 58, 12555 (1998).34. J. Q. Broughton and X. P. Li, Phys. Rev. B 35, 9120 (1987).35. F. H. Stillinger and T. A. Weber, Phys. Rev. B 31, 5262 (1985).36. R. Hull (Ed.), Properties of Crystalline Silicon (Inspec, London 1999).37. S. J. Cook and P. Clancy, Phys. Rev. B 47, 7686 (1993).38. J. Tersoff, Phys. Rev. Lett., 56, 632 (1986); Phys. Rev. B, 37, 6991 (1988);

Phys. Rev. B, 38, 9902 (1988).39. M. I. Baskes, J. S. Nelson and A. F. Wright, Phys. Rev. B 40, 6085 (1989).40. E. P. Donovan, F. Spaepen, D. Turnbull, J. M. Poate and D. C. Jacobson, J.

Appl. Phys. 57, 1795 (1985).41. S. K. Deb, M. Wilding, M. Somayazulu and P. F. McMillan, Nature 414, 528

(2001).42. S. Sastry and C. A. Angell, Nature Materials 2, 739 (2003).43. J. F. Justo, M. Z. Bazant, E. Kaxiras, V. V. Bulatov and S. Yip, Phys. Rev.

B 58, 2539 (1998).44. P. Keblinski, M. Z. Bazant, R. K. Dash and M. M. Treacy, Phys. Rev B 66,

064104 (2002).45. C. R. Miranda and A. Antonelli, J. Chem. Phys. 120, 11672 (2004).46. A. Hedler, S. L. Klaumunze and W. Wesch, Nature Materials 3, 804 (2004).47. R. L. C. Vink and G. T. barkema, Phys. Rev. Lett. 89, 076405 (2002).48. O. Sugino and R. Car, Phys. Rev. Lett. 74, 1823 (1995).49. R. Car and M. Parrinello, Phys. Rev. Lett.

55, 2471 (1985).50. D. Alfe and M. J. Gillan, Phys. Rev. B 68, 205212 (2003).51. G. A. de Wijs, G. Kresse and M. J. Gillan, Phys. Rev. B 57, 8223 (1998).52. E. Sanz, C. Vega, J. L. F. Abascal and L. G. MacDowell, Phys. Rev. Lett. 92,

255701 (2004).

126 E. R. Hernandez, A. Antonelli, L. Colombo, and and P. Ordejon

53. L. M. Ghiringhelli, J. H. Los, E. J. Meijer, A. Fasolino and D. Frenkel, Phys.Rev. Lett. 94, 145701 (2004).

54. W.L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey andM. L. Klein, J. Chem. Phys. 79, 926 (1983).

55. H. J. C. Berendsen, J. R. Grigera and T. P. Straatma, J. Phys. Chem. 91, 6269(1987).

56. J. H. Los and A. Fasolino, Phys. Rev. B 68, 024107 (2003).57. P. F. McMillan, Nature Materials, 1, 19 (2002).58. G. A. Voronin, C. Pantea, T. W. Zerda, L. Wang and Y. Zhao, Phys. Rev. B.

68, 020102 (2003).59. J. Z. Hu, L. D. Merkle, C. S. Menoni and I. L. Spain, Phys. Rev. B 34, 4679

(1986).

6 Quantum Monte Carlo techniques and

defects in semiconductors

R.J. Needs

Theory of Condensed Matter Group, Cavendish Laboratory, University ofCambridge, Madingley Road, Cambridge, CB3 0HE, United [email protected]

6.1 Introduction

Quantum Monte Carlo (QMC) methods in the variational (VMC) and diffu-sion (DMC) forms are stochastic methods for evaluating expectation valuesof many-body wave functions. [1] One of the main attractions of these meth-ods is that the computational cost scales roughly as the square or cube of thenumber of particles, which is very favorable compared with quantum-chemicalmany-body

techniques, allowing applications to condensed matter. Applications ofQMC to defects in semiconductors are still in their infancy but, as I willdescribe, these methods possess features which make them well-suited foraddressing important issues in the field. In this chapter I first discuss whywe expect VMC and DMC calculations to be valuable in studying defectsin semiconductors. I then review the techniques themselves, concentratingon the issues most pertinent to defects in solids. Finally I describe recentapplications to point defects in crystalline silicon, diamond and magnesiumoxide, ending with a perspective on the future of such studies.

It is important to treat electron correlation effects accurately when de-scribing the energetics of condensed matter systems. Defects in semiconduc-tors are not special in this regard, but they exemplify almost all the difficultiesthat may be encountered. For example, we would like to calculate the for-mation and migration energies of defects. These quantities are related to theenergy differences between structures, which may involve atoms with differ-ent coordination numbers and bond lengths. We might also be interested,for example, in the properties of a 3d transition metal impurity in a semi-conductor, whose electronic properties are difficult to describe accurately.We may also be interested in the excited electronic states of defects, whichmay involve calculating the energies of multiplets. These are severe tests ofelectronic structure methods, which can only be met by techniques whichincorporate a sophisticated description of electronic correlation.

QMC methods have the potential to provide very accurate energies and,in principle, there are no restrictions on the types of electronic state that canbe treated. The N2 −N3 scaling of the computational cost with the numberof electrons N allows applications to quite large systems, which are certainlysufficient for studies of defects in semiconductors. However, the prefactor is

128 R.J. Needs

large, and a QMC calculation is generally much more expensive than thecorresponding Density-Functional Theory (DFT) one.

The DMC calculations of Ceperley and Alder [2] provided accurate ener-gies for the homogeneous electron gas, which have been used in constructingparameterized density functionals. Since then, the DMC method has beenused in studies of the electrons in molecules, clusters and solids. The powerof the DMC method for describing electronic correlation in large systemshas been amply demonstrated by recent applications, including studies ofpoint defects in semiconductors [3, 4], the reconstruction of the silicon (001)surface [5] and its interaction with H2 [6], and the calculation of opticalexcitation energies [7, 8]. The methodology of QMC calculations is being im-proved rapidly which, together with advances in computing technologies, willallow them to be applied to more complicated systems. I believe that QMCwill play an important role in the future of electronic structure computations.They will not replace DFT studies, but rather complement them by providinghigher accuracy, albeit at considerably higher cost.

6.2 Quantum Monte Carlo methods

The VMC method is conceptually very simple, the energy being calculatedas the expectation value of the Hamiltonian with an approximate many-bodytrial wave function. In the more sophisticated DMC method the estimate ofthe ground state energy is improved by performing an evolution of the wavefunction in imaginary time.

6.2.1 The VMC method

The variational theorem of quantum mechanics states that, for a proper trialwave function ΨT , the variational energy,

EV =∫Ψ∗

T (R)HΨT (R) dR∫Ψ∗

T (R)ΨT (R) dR, (6.1)

is an upper bound on the exact ground state energy E0, i.e., EV ≥ E0. InEq. (6.1), ΨT is the many-body wave function, the symbol R denotes the3N -dimensional vector of the electron coordinates, and H is the many-bodyHamiltonian.

The VMC method was first used by McMillan in a study of liquid 4He. [9]To facilitate the stochastic evaluation, EV is written as

EV =∫p(R)EL(R) dR , (6.2)

where the probability distribution p is

6 Quantum Monte Carlo techniques and defects in semiconductors 129

p(R) =|ΨT (R)|2∫|ΨT (R)|2 dR , (6.3)

and the local energy, EL, is

EL(R) = Ψ−1T HΨT . (6.4)

In VMC one uses a Metropolis algorithm [10] to sample the distributionp, generating M configurations, Ri, drawn from p. The variational energy isthen estimated as

EV � 1M

M∑i=1

EL(Ri) . (6.5)

Eq. (6.2) is an importance sampling transformation of Eq. (6.1). Eq. (6.2)exhibits the zero variance property ; as the trial wave function approaches anexact eigenfunction (ΨT → φi), the local energy approaches the correspondingeigenenergy, Ei, everywhere in configuration space. As ΨT is improved, EL

becomes a smoother function of R and the number of sampling points, M ,required to achieve an accurate estimate of EV is reduced.

VMC is a simple and elegant method. There are no restrictions on theform of trial wave function which can be used and it does not suffer froma fermion sign problem. The main problem is, essentially, that you get outwhat you put in. Even if the underlying physics is well understood it is verydifficult to prepare trial wave functions of equivalent accuracy for differentsystems, and therefore the energy differences between them will be biased. Weuse the VMC method mainly to optimize parameters in trial wave functions,see section 6.2.4, and our main calculations are performed with the moresophisticated DMC method, which is described in the next section.

6.2.2 The DMC method

The DMC procedure largely overcomes the problem of bias inherent in VMC.In DMC the operator exp(−tH) is used to project out the ground statefrom the initial state. This can be viewed as solving the imaginary-timeSchrodinger equation,

− ∂

∂tΦ(R, t) =

(H −ET

)Φ(R, t) , (6.6)

where t is a real variable measuring the progress in imaginary time and ET

is an energy offset. The solution of Eq. (6.6) can be obtained by expandingΦ(R, t) in the eigenstates of the Hamiltonian,

Φ(R, t) =∑

i

ci(t)φi(R) , (6.7)

130 R.J. Needs

which leads to

Φ(R, t) =∑

i

exp[−(Ei − ET )t] ci(0)φi(R) . (6.8)

For long times one finds

Φ(R, t→ ∞) � exp[−(E0 − ET )t] c0(0)φ0(R) , (6.9)

which is proportional to the ground state wave function, φ0. If we interpretthe initial state,

∑i ci(0)φi, as a probability distribution, then the time evo-

lution process can be thought of as taking a set of points in configurationspace distributed according to the initial state and subjecting them to dif-fusion and branching (branching is also referred to as “birth” and “death”of configurations). The diffusion process arises from the kinetic energy oper-ator and the branching from the potential energy. So far we have assumedthat the wave function can be interpreted as a probability distribution, buta wave function for more than two identical fermions must have positive andnegative regions. One can construct algorithms which use two distributionsof configurations, one for positive regions and one for negative, but such al-gorithms suffer from a fermion sign problem which manifests itself by thesignal-to-noise ratio decaying exponentially with imaginary time.

The fixed-node approximation [11, 12] allows us to avoid these problems.This method is equivalent to placing infinite potential barriers everywhereon the nodal surface of a trial wave function ΨT . (The nodal surface is the3N−1 dimensional surface in configuration space on which the wave functionis zero and across which it changes sign.) The infinite potential barriers haveno effect if the trial nodal surface is correct, but if it is incorrect the energy isalways raised. The DMC projection can then be done separately in each nodalpocket, which gives the lowest energy solution in each pocket. It thereforefollows that the DMC energy is always less than or equal to the VMC energywith the same trial wave function, and always greater than or equal to theexact ground state energy, i.e., EV ≥ ED ≥ E0. In addition, it turns out thatall the nodal pockets of the exact fermion ground state are equivalent [13]and, if ΨT has this tiling property , we do not need to sample all of the pockets.

Such a fixed-node DMC algorithm is extremely inefficient and a vastlysuperior algorithm can be obtained by introducing an importance samplingtransformation. [14, 15] Consider the mixed distribution,

f(R, t) = ΨT (R)Φ(R, t) , (6.10)

which has the same sign everywhere if the nodal surface of Φ(R, t) is con-strained to be the same as that of ΨT (R). Substituting in Eq. (6.6) for Φ weobtain

−∂f∂t

= −12∇2f + ∇ · [vf ] + [EL − ET ]f , (6.11)

6 Quantum Monte Carlo techniques and defects in semiconductors 131

where the 3N -dimensional drift velocity is defined as

v(R) = Ψ−1T (R)∇ΨT (R) . (6.12)

The three terms on the right-hand side of Eq. (6.11) correspond to diffu-sion, drift, and branching processes, respectively. The importance samplingtransformation has several consequences. First, the density of configurationsis increased where ΨT is large. Second, the rate of branching is now con-trolled by the local energy which is generally a much smoother function thanthe potential energy. Finally, the importance sampling actually enforces thefixed-node constraint in Eq. (6.11). In practice one has to use a short timeapproximation [1] for the Green’s function diffusion equation which meansthat occasionally configurations attempt to cross the nodal surface, but suchmoves are simply rejected.

This algorithm generates the distribution f , but fortunately we can cal-culate the corresponding DMC energy using the formula

ED =∫fEL dR∫f dR

(6.13)

� 1M

M∑i=1

EL(Ri) . (6.14)

The importance-sampled fixed-node fermion DMC algorithm was first usedby Ceperley and Alder. [2]

It is not possible to obtain directly from f the proper expectation valuesof operators which do not commute with the Hamiltonian, such as the chargedensity or pair correlation functions. There are, however, three methods bywhich such expectation values can be obtained: the approximate (but oftenvery accurate) extrapolation technique [16], the forward walking technique(explained in Hammond et al. [17]) which is

formally exact but statistically poorly behaved, and the reptation QMCtechnique of Baroni and Moroni [18], which is formally exact and well be-haved, but quite expensive.

6.2.3 Trial wave functions

It should be clear that the trial wave function is of central importance inVMC and DMC calculations. The trial wave function introduces importancesampling and controls both the statistical efficiency and the accuracy ob-tained. In DMC the accuracy depends only on the nodal surface of the trialwave function via the fixed-node approximation, while in VMC it dependson the entire trial wave function, so that VMC energies are more sensitive tothe quality of the approximate wave function than DMC energies.

QMC calculations require a compact trial wave function, and most studiesof electronic systems have used the Slater-Jastrow form, in which a pair ofup- and down-spin determinants is multiplied by a Jastrow correlation factor,

132 R.J. Needs

ΨT = eJ(R)D↑(R↑)D↓(R↓) , (6.15)

where eJ is the Jastrow factor and the D are determinants of single particleorbitals. The quality of the single particle orbitals is very important, andthey are often obtained from DFT or Hartree-Fock (HF) calculations.

The Jastrow factor is a positive, symmetric, function of the electron coor-dinates, which is chosen to be small when electrons are close to one another,thereby introducing electron correlation into the wave function. The Jastrowfactor is also used to enforce the electron-electron Kato cusp conditions. [19]The idea is that when two electrons are coincident the contribution to thelocal energy from the Coulomb interaction diverges, and therefore the exactwave function must have a cusp at this point so that the kinetic energy oper-ator supplies an equal and opposite divergence. It seems very reasonable toenforce these conditions on a trial wave function as they are obeyed by theexact wave function, and in VMC, and more particularly DMC, calculationsthey are very important because divergences in the local energy can lead topoor statistical behaviour.

Many excellent QMC results have been obtained using Slater-Jastrowwave functions, but it may be profitable to consider other forms. One variantis to use more than one pair of determinants,

ΨT = eJ(R)∑

i

αiDi,↑(R↑)Di,↓(R↓) , (6.16)

where the αi are coefficients. This approach is not suitable for improvingthe energy per atom of a solid because the number of determinants requiredwould have to increase exponentially with system size. It is, however, usefulwhen considering a point defect with an open shell electronic configurationin a solid, in which case the electronic states of the multiplet may be de-scribed by a small number of determinants. An example of this is describedin Sec. 6.6.3. Another possibility is to include backflow correlations, whichmay be derived by demanding that the wave function conserves the localcurrent. [20] This approach has been used successfully in QMC studies ofthe electron gas [21], but it has not been explored much for electrons in realatomic systems. [22] Including backflow correlations improves the nodal sur-face of a wave function and leads to a compact form which is suitable for usein QMC calculations, although it is more complicated than the Slater-Jastrowform and the computational cost is increased.

6.2.4 Optimization of trial wave functions

Optimizing trial wave functions is perhaps the most important technical issuein QMC calculations, and it consumes large amounts of human and comput-ing resources. It is standard to include a number of variable parameters in theJastrow factor whose values are determined by a stochastic optimization pro-cedure. The determinant coefficients, αi, in Eq. (6.16) may also be optimized,and in some calculations the orbitals themselves have been optimized.

6 Quantum Monte Carlo techniques and defects in semiconductors 133

Developing better wave function optimization techniques is currently avery active field, with many new approaches being tested. Rather than delv-ing into the technical details of this enterprise, I will mention some of thekey issues. Ideally the wave function should be optimized within DMC, butthis is generally much too costly and so optimizations are performed at theVMC level. Two measures of wave function quality have generally been con-sidered. The most obvious idea is to minimize the expectation value of theVMC energy, although more commonly the variance of the VMC energy hasbeen minimized. There are good reasons to suppose that better DMC resultswould be obtained by using energy-minimized trial wave functions, ratherthan variance-minimized ones. First, it seems likely that a better nodal sur-face would be obtained by minimizing the energy rather than the variance.Second, it turns out that the variance of the DMC energy is approximatelyproportional to the difference between the VMC and DMC energies, ratherthan the variance of the VMC energy. [23] Why then has it been more com-mon to minimize the variance of the VMC energy? The answer is simply thatit has proved easier to devise algorithms for minimizing the variance whichare stable and efficient in the presence of statistical noise. [24–26] Most of thecurrent effort is being devoted to developing energy minimization schemes, al-though the applications to defects in semiconductors described in this articlewas performed using variance minimization.

6.2.5 QMC calculations within periodic boundary conditions

QMC calculations for defects in solids may be performed using cluster modelsor periodic boundary conditions, just as in other techniques. Periodic bound-ary conditions are preferred because they give smaller finite size effects. Inthe standard supercell approach a point defect in an otherwise perfect crys-tal is modelled by taking a large cell containing a single defect and repeatingit throughout space. Just as in other techniques, one must ensure that thesupercell is large enough for the interactions between defects in different su-percells to be small.

In techniques involving many-electron wave functions such as QMC it isnot possible to reduce the problem to solving within a primitive unit cell(when a defect is present the primitive unit cell is the supercell). Such areduction is allowed in single-particle methods because the Hamiltonian isinvariant under the translation of a single electronic coordinate by a transla-tion vector of the primitive lattice, but this is not a symmetry of the many-body Hamiltonian. Consequently QMC calculations may be performed onlyat a single k-point. It is possible to perform QMC calculations at differentk-points [27, 28] and average the results [29], but this is not exactly equiv-alent to a Brillouin zone integration within single-particle theories. In theapplications described in this article a single k-point was used.

Many-body techniques such as QMC also suffer from another source offinite size effects - the interaction of the electrons with their periodic images.

134 R.J. Needs

Such an effect is absent in local DFT calculations because the interactionenergy is written in terms of the electronic charge density, but the effectis present in HF calculations. These “Coulomb finite size effects” are oftencorrected for using extrapolation techniques (see [1] and references therein),and they may also be reduced using the so-called Model Periodic Coulomb(MPC) interaction. [30]

6.2.6 Using pseudopotentials in QMC calculations

The cost of a DMC calculation increases rapidly with the atomic number, Z,as roughly Z5.5 [23, 31]. This effectively rules out applications to atoms withZ greater than about 10, and consequently pseudopotentials are commonlyused in DMC calculations. The use of non-local pseudopotentials within VMCis quite straightforward, but DMC poses an additional problem, because thematrix element of the imaginary time propagator is not necessarily positive,which leads to a sign problem analogous to the fermion sign problem. Tocircumvent this difficulty an additional approximation known as the pseu-dopotentiallocalization approximation has been used. [32] If the trial wavefunction is accurate the error introduced by the localization approximationis small and is proportional to (ΨT −φ0)2, [33] although the error term can beof either sign. Recently, Casula et al. [34] have introduced a fully-variationaltechnique for dealing with non-local pseudopotentials within DMC.

Currently it is not possible to generate pseudopotentials entirely within aQMC framework, and therefore they are obtained from other sources. Thereis evidence which indicates that HF theory provides better pseudopotentialsthan DFT for use within QMC calculations [35], and recently we have de-veloped smooth relativistic HF pseudopotentials for H to Ba and Lu to Hg,which are suitable for use in QMC calculations. [36, 37]

6.3 DMC calculations for excited states

It might appear that the DMC method cannot be applied to excited statesbecause the imaginary time propagation would evolve it towards the groundstate. However, the fixed node constraint ensures convergence to the lowestenergy state compatible with the nodal surface of the trial wave function. Ifthe nodal surface is exactly that of an eigenstate then DMC gives the exactenergy of that state. If the nodal surface is approximately that of an excitedstate then the DMC energy gives an approximation to the energy of thatexcited state. There is, however, an important difference from the groundstate case, in that the existence of a variational principle for the energy ofan excited state depends on the symmetry of the trial wave function. [38] Inpractice DMC works quite well for excited states. [39, 40]

6 Quantum Monte Carlo techniques and defects in semiconductors 135

6.4 Sources of error in DMC calculations

Here we summarize the potential sources of errors in DMC calculations.

– Statistical error. The standard error in the mean is proportional to 1/√M ,

where M is the number of moves. It therefore costs a factor of 100 incomputer time to reduce the error bars by a factor of 10. On the otherhand, a random error is much better than a systematic one.

– The fixed-node error. This is the central approximation of the DMC tech-nique, and is often the limiting factor in the accuracy of the results.

– Timestep bias. The short time approximation leads to a bias in the fdistribution and hence in expectation values. This bias can be serious andis often corrected for by using several different timesteps and extrapolatingto zero timestep.

– Population control bias. In DMC the f distribution is represented by afinite population of configurations which fluctuates due to the branchingprocedure. The population is usually controlled by occasionally changingthe trial energy, ET , in Eq. (6.11) so that the population returns towards atarget size. This procedure results in a population control bias. In practicethe population control bias is normally extremely small, in fact so smallthat it is difficult to detect.

– Pseudopotentialerrors including the localization error. Pseudopotentialsinevitably introduce errors which can be significant in some cases. Thelocalization error also appears to be quite small, although this has notbeen tested in many cases.

– Finite size errors within periodic boundary conditions. It is important tostudy the finite size effects carefully, and to correct for them. A number ofcorrection procedures have been devised, mainly involving extrapolationto the thermodynamic limit.

In electronic structure theory one is almost always interested in the differ-ence in energy between two similar systems. All electronic structure methodsfor complex systems rely on large cancellations of the errors in the energydifferences between systems. In DMC this helps with all the above sourcesof error except the statistical errors. Fixed-node errors tend to cancel be-cause the DMC energy is an upper bound, but even though DMC oftenretrieves 95% or more of the correlation energy, non-cancellation of nodalerrors is a severe problem for DMC calculations. As mentioned before, themost straightforward way forward is simply to improve the quality of the trialwave functions, but perhaps it might also be possible to balance the errors indifferent systems more successfully than is currently possible.

136 R.J. Needs

6.5 Applications of QMC to the cohesive energies ofsolids

A number of VMC and DMC studies have been performed of the cohesiveenergies of solids. The cohesive energy is calculated as the energy differencebetween the isolated atom and an atom in the solid. This is a severe test ofQMC methods because the trial wave functions used for the atom and solidmust be closely matched in quality. Data for Si [30], Ge [28], C [4], Na [41]and NiO [42] have been collected in Table 6.1. The Local Spin Density Ap-proximation (LSDA) data shows the standard overestimation of the cohesiveenergy, while the QMC data is in good agreement with experiment. Thesestudies have been important in establishing DMC as an accurate method forcalculating the energies of solids.

6.6 Applications of QMC to defects in semiconductors

6.6.1 Using structures from simpler methods

In the applications of QMC to defects in semiconductors reported to date, thestructures were obtained from DFT calculations and the energies recalculatedwithin QMC. Surely this leads to a bias in the energy differences which mightbe considerable? The answer must certainly be yes, it does lead to a bias,but here I argue that it may often be quite small. We can pose the questionmore clearly as “is it sensible to calculate structures using a simple methodof limited accuracy and then recalculate the energy difference between thestructures with a sophisticated method of much higher accuracy?”.

Consider a system with a single structural coordinate Q, and supposethere are two structures of interest, denoted by 1 and 2, corresponding tolocal minima in the energy at coordinates Q1 and Q2, with energies E1 andE2. Around the two energy minima the energy can be expanded as

Around Q1 E(Q) = E1 + α1(Q−Q1)2 (6.17)Around Q2 E(Q) = E2 + α2(Q−Q2)2 . (6.18)

Table 6.1. LSDA, VMC and DMC cohesive energies of solids in eV per atom (eVper 2 atoms for NiO), compared with the experimental values. The numbers inbrackets indicate standard errors in the last significant figure. All calculated valuescontain a correction for the zero-point energy of the solid.

Method Si Ge C Na NiO

LSDA 5.28 4.59 8.61 1.20 10.96

VMC 4.48(1) 3.80(2) 7.36(1) 0.9694(8) 8.57(1)

DMC 4.63(2) 3.85(2) 7.346(6) 1.0221(3) 9.442(2)

Exp. 4.62 3.85 7.37 1.13 9.45

6 Quantum Monte Carlo techniques and defects in semiconductors 137

Suppose further that we have a simple method for calculating the energyE(Q) which gives a smoothly varying error of the form

ΔE(Q) = βQ , (6.19)

where β is a constant.Within the simple method the errors in the coordinates of the equilibrium

structures are obtained by minimizing E +ΔE, which gives

ΔQ1 = − β

2α1, ΔQ2 = − β

2α2, (6.20)

and the error in the energy difference between these structures is

Δ(E1 − E2) = β(Q1 −Q2) −β2

4

(1α1

− 1α2

). (6.21)

Now suppose that we use the structures from the simple method, but recal-culate the energies within a sophisticated method which gives a negligibleerror in the energy. The error in the energy difference between the structuresis then

Δ(E1 − E2) =β2

2

(1α1

− 1α2

). (6.22)

Within the simple method the error in the energy difference is first orderin β, while for the sophisticated method it is second order. The error inthe energy difference within the simple method is proportional to Q1 −Q2,so that the error is large if the structures are very different, while in thesophisticated method the error is independent of Q1 −Q2. This behavior isillustrated in Fig. 6.1. We therefore conclude that using the sophisticatedmethod to calculate the energy difference between structures obtained fromthe simple method leads only to quite small errors. Note also that within thismodel the quadratic coefficients in the energy, α1 and α2, are unaltered bythe linear error term, so that the simple method gives accurate values for thevibrational frequencies.

The model is, of course, highly simplified, but it illustrates my contentionthat it is reasonable to take structures from DFT calculations and recalculatethe energy differences between them within QMC.

6.6.2 Silicon Self-Interstitial Defects

The diffusion of dopant impurity atoms during thermal processing limits howsmall silicon devices can be made, and understanding this effect requires aknowledge of diffusion on the microscopic scale in situations far from equi-librium. Dopant diffusion is mainly controlled by the presence of intrinsicdefects such as self-interstitials and vacancies, and therefore it is importantto improve our understanding of these defects.

138 R.J. Needs

-2 -1 0 1 2Q

0.0

0.2

0.4

0.6

0.8

1.0

E

Fig. 6.1. The energy E versus a structural coordinate Q showing two minima fora model system. Solid line: exact energy, dashed line: approximate energy from asimple theory which has an error of 0.1Q. The error in the energy difference betweenthe two minima obtained from the simple theory is 0.186, while the error in theenergy difference calculated using the accurate energy curve, but evaluated at theminima obtained from the simple theory, is 0.00045.

Both experimental and theoretical techniques have been brought to bearon these defects but, unfortunately, it has not been possible to detect siliconself-interstitials directly and we must rely on theoretical studies to elucidatetheir microscopic properties. Electron paramagnetic resonance studies [43]have determined unambiguously that the symmetry of neutral vacancies insilicon is D2d, which is consistent with a Jahn-Teller distortion. Positron-lifetime experiments have given a value for the enthalpy of formation of aneutral vacancy in silicon of 3.6±0.2 eV. [44]

The self-diffusivity of silicon at high temperatures follows an Arrheniusbehaviour with an activation energy in the range 4.1 − 5.1 eV. [45] Signifi-cant difficulties arise with the interpretation of experimental data when thecontributions to the self-diffusivity from different mechanisms are considered.The self-diffusivity is usually written as the sum of contributions from inde-pendent diffusive mechanisms. The contribution of a particular microscopicmechanism can be written as the product of the diffusivity, Di, and the con-centration, Ci, of the relevant defect, i.e.,

DSD =∑

i

DiCi . (6.23)

Gosele et al. [46] give estimates of the contributions to the self-diffusivity as

DICI = 914 exp(−4.84/kBT ) cm2 s−1 , (6.24)

from self-interstitials and

6 Quantum Monte Carlo techniques and defects in semiconductors 139

DVCV = 0.6 exp(−4.03/kBT ) cm2 s−1 , (6.25)

from vacancies, where kBT is in units of eV. It is believed that self-diffusivityin silicon is dominated by vacancies at low temperatures and self-interstitialsabove 1300 K. The experimental situation regarding the individual values ofDi and Ci is, however, controversial. Indeed, experimental data has been usedto support values of the diffusivity of the silicon self-interstitial, DI, whichdiffer by ten orders of magnitude at the temperatures of around 1100 K atwhich silicon is processed. [47]

The main challenge to theory is to identify the important defects and topredict their properties, comparing with experimental data where available.This includes identifying the diffusive mechanisms of the defects, providingaccurate values for the Di and the equilibrium values of Ci, and reproducingthe experimental data for the self-diffusivity. This programme represents anenormous challenge, and although many theoretical studies of point defectsin silicon have been performed, the goal is still far away.

A number of different structures for silicon self-interstitials have beenpostulated, including the split-〈110〉, hexagonal and tetrahedral defects, seeFig 6.2. The consensus view from modern DFT calculations is that the split-〈110〉 and hexagonal structures are the lowest energy self-interstitials in sili-con [48–50], and that the tetrahedral interstitial is unstable. However, evenhere there has been a surprise. Recent DFT calculations have shown that thehexagonal interstitial is unstable to a distortion out of the hexagonal ring by0.48 A, which lowers the formation energy by 0.03 eV. [50] Another surprisehas been that, within DFT, it turns out that the lowest energy point defectin silicon is neither an interstitial nor the vacancy! Goedecker et al. [49] foundthat a four-fold coordinated defect (FFCD) had a formation energy at least0.5 eV lower than the interstitials and vacancy. It is not clear whether thisdefect could play a role in self-diffusion, as its possible diffusive mechanismshave not yet been studied.

There are some discrepancies between DFT values reported for the for-mation energies of the split-〈110〉 and hexagonal self-interstitials and for theassociated migration energies. However, modern calculations [48–50] indicatethat the sum of the DFT formation and migration energies is significantlysmaller than the exponent of 4.84 eV deduced from experimental studies, seeEq. (6.24).

An interesting suggestion by Pandey [51], based on the results of DFT cal-culations, was that an exchange of neighbouring atoms in the perfect latticeatoms would contribute to self-diffusion

in silicon. In Pandey’s concerted exchange mechanism two nearest neigh-bour atoms interchange via a complicated 3-dimensional path which allowsthe atoms to avoid large energy barriers. [51] One of the main planks ofPandey’s argument was that the calculated value for the activation energy forthis concerted exchange was within the experimental range for self-diffusion of4.1− 5.1 eV. [45] However, as we have already noted, DFT gives even lower

140 R.J. Needs

energies for interstitial and vacancies, and therefore Pandey’s argument isunsatisfactory.

All of the calculations mentioned so far assumed no thermal motion at all,but it is also important, of course, to study finite temperature effects. TwoDFT molecular dynamics studies have been reported [52, 53], but the energybarriers to interstitial self-diffusion are very small (at least within DFT) andhighly accurate calculations are required to obtain reliable results. It is rathereasier to calculate the equilibrium concentration of defects including the ef-fects of thermal vibrations, and a recent study [50] using DFT perturbationtheory gave results which are broadly in agreement with the rather scatteredexperimental data.

Given this background, what role can QMC calculations play in unrav-elling this complicated problem? In the medium term its role must be togive accurate values for the energies of various defect structures. Given thatthe LSDA and Generalized Gradient Approximation (GGA) self-interstitialdefect formation energies can differ by 0.5 eV it is valuable to quantify theerrors in DFT calculations of the defect formation energies. As argued inSec. 6.6.1, it should be sufficient to use defect structures obtained from DFTcalculations for this purpose.

DFT calculations on silicon self-interstitials

Leung et al. [3] and Needs [48] reported LSDA and PW91-GGA results forvarious interstitial defect formation energies and the saddle point of Pandey’sconcerted-exchange mechanism, see Table I. This work also indicated that theLSDA and PW91-GGA equilibrium defect structures are very similar. One ofthe features of the various defect structures is the wide range of inter-atomicbonding they exhibit. In the split-〈110〉 structure the two atoms forming thedefect are four-fold coordinated, but two of the surrounding atoms are five-fold coordinated. The hexagonal interstitial is six-fold coordinated and its sixneighbours are five-fold coordinated. The tetrahedral interstitial is four-foldcoordinated and its four neighbours, which are therefore five-fold coordinated.At the saddle point of Pandey’s concerted exchange two bonds are broken sothat the exchanging atoms and two other atoms are five-fold coordinated.

Table 6.2. LDA, PW91-GGA, and DMC formation energies in eV of the self-interstitial defects and the saddle point of the concerted-exchange mechanism. a16atom supercell, b54 atom supercell.

Defect LDA GGA DMCa DMCb

Split-〈110〉 3.31 3.84 4.96(24) 4.96(28)

Hexagonal 3.31 3.80 4.70(24) 4.82(28)

Tetrahedral 3.43 4.07 5.50(24) 5.40(28)

Concerted Exchange 4.45 4.80 5.85(23) 5.78(27)

6 Quantum Monte Carlo techniques and defects in semiconductors 141

Split−(110) interstitial

Concerted exchange

Hexagonal interstitial

Perfect crystal

Tetrahedral interstitial

Fig. 6.2. The structures of the split-〈110〉, hexagonal, and tetrahedral interstitialdefects, the saddle point of the concerted-exchange mechanism, and the perfectsilicon crystal. After Needs. [48]

The DFT energy barriers to diffusive jumps between the low energy struc-tures were calculated, which gave a path for split-〈110〉-hexagonal diffusionwith a barrier of 0.15 eV (LDA) and 0.20 eV (PW91-GGA), and a barrier forhexagonal-hexagonal diffusive jumps of 0.03 eV (LDA) and 0.18 eV (PW91-GGA).

QMC calculations on silicon self-interstitials

The DMC calculations we performed for 16 and 54 atom fcc structures ob-tained from LDA calculations. The Slater-Jastrow trial wave functions wereformed from determinants of LDA orbitals and Jastrow factors containingup to 64 parameters, whose optimal values were obtained by minimizing thevariance of the energy.

The DMC defect formation energies are shown in Table I. Note that theDMC results for the 16 and 54-atom simulation cells are consistent, indicating

142 R.J. Needs

that the residual finite size effects are small. The clearest conclusion is thatthe DMC formation energies are roughly 1 eV larger than the PW91-GGAvalues and 1.5 eV larger than the LDA values. The hexagonal interstitial hasthe lowest formation energy within DMC, while the split-〈110〉 interstitial isslightly higher in energy, and the tetrahedral interstitial has a considerablyhigher energy. The saddle point of the concerted exchange has an energywhich is too high to explain self-diffusion in silicon.

Calculating the energy barriers to diffusion within DMC is currently pro-hibitively expensive, but they can be roughly estimated as follows. The tetra-hedral interstitial is a saddle point of a possible diffusion path between neigh-bouring hexagonal sites. The DMC formation energy of 5.4 eV for the tetra-hedral interstitial is therefore an upper bound to the formation+migrationenergy of the hexagonal interstitial. The true formation+migration energyis expected to be less than this. Within the LDA there is a diffusion pathfor the hexagonal interstitial with a barrier of only 25% of the tetrahedral-hexagonal energy difference. Applying the same percentage reduction to theDMC barrier then gives an estimate of the formation+migration energy ofthe hexagonal interstitial of 5 eV. This estimate is in good agreement with theexperimental activation energy for self-interstitial diffusion of 4.84 eV. [46]

This study demonstrated the importance of a proper treatment of elec-tron correlation when calculating defect formation energies in silicon. DFTpredicts formation+migration energies which are smaller than those deducedfrom experiment. The larger defect formation energies found in the DMCcalculations indicate a possible resolution of this problem, which might be animportant step in improving our understanding of self-diffusion in silicon.

6.6.3 Neutral vacancy in diamond

The vacancy in diamond is a dominant defect associated with radiation dam-age, and it exhibits a wider variety of physical phenomena than its counter-part in silicon. Self-diffusion in diamond is relevant to growing single crystaldiamond thin-films on non-diamond substrates. DFT studies have suggestedthat self-diffusion in diamond is dominated by vacancies [54] and that inter-stitials do not play a significant role.

A vacancy may be formed by removing an atom, leaving four danglingbonds. The simplest model of the electronic structure of a vacancy is a one-electron molecular defect picture in which symmetry adapted combinationsof the four dangling bond states have a1 and t2 symmetry. These levels are,respectively, singly and triply degenerate. In the neutral defect four electronsare placed in the dangling bond states, giving a a2

1t22 configuration. This

system is unstable to Jahn-Teller distortion and a spontaneous reductionin symmetry occurs in which the t2 states split into a doubly occupied a1

and empty e state, and a structure with D2d symmetry results. This modelis appropriate for the vacancy in silicon but, because of the large electron-electron interaction effects, not for diamond.

6 Quantum Monte Carlo techniques and defects in semiconductors 143

The prominent GR1 optical transition at 1.673 eV [55] is associated withthe neutral vacancy. Uniaxial stress perturbations show that the GR1 transi-tion is between a ground state which is orbitally doubly degenerate with sym-metry E and a orbitally triply degenerate excited state of T symmetry. [56]Both the ground and excited states undergo Jahn-Teller relaxations, but theeffects are dynamic during absorption and the vacancy therefore appears tomaintain the Td point group of an atomic site in diamond. [57]

Correlation effects among the electrons in the dangling bonds are veryimportant, but the strong coupling to the solid state environment must alsobe included. The electronic states of the defect form a multiplet requiringa multi-determinant description [58, 59]. The GR1 optical transition can-not be expressed as a transition between one-electron states, which makesa first-principles approach very difficult. For example, approaches based onone-electron states, such as Kohn-Sham DFT or the GW self-energy ap-proach [60, 61], cannot give a full description of the multiplet structure. VonBarth has suggested an approximate scheme for obtaining multiplet energiesfrom DFT calculations [62], but unfortunately this scheme does not permitthe calculation of the energy of what is believed to be the excited state ofthe GR1 band. [63, 64] QMC appears to be an ideal method for this prob-lem, as one can use a multi-determinant description of the defect states whilemaintaining the favourable scaling with system size.

VMC and DMC calculations on the neutral vacancy in diamond

Hood et al. [4] performed VMC and DMC calculations of the neutral va-cancy in diamond using periodic boundary conditions and a simulation cellcontaining 53 atoms. The relaxed ionic positions were obtained from an LSDAcalculation in which each of the t2 states was fractionally occupied. The near-est neighbor atoms of the vacancy were found to relax outwards by 0.1 A,while the relaxation of the next nearest neighbors was negligible.

The trial wave functions were of the Slater-Jastrow type with the single-particle orbitals obtained from an LSDA calculation using a Gaussian basisset. To calculate the multiplet structure of the electronic states of the vacancy,symmetrized multi-determinants were used with the configuration in whichthe a1 states were fully occupied and the t2 states were doubly occupied. Thisgives rise to 1A1, 1E, 1T2, and 3T1 states, with orbital degeneracies of 1, 2,3, 3, respectively. The corresponding trial wave functions took the form,

Ψ1A1 = eJ(R)(Dx

↑Dx↓ +Dy

↑Dy↓ +Dz

↑Dz↓)

(6.26)

Ψ1E = eJ(R)(2Dx

↑Dx↓ −Dy

↑Dy↓ −Dz

↑Dz↓)

(6.27)

Ψ1T2 = eJ(R)(Dy

↑Dz↓ −Dz

↑Dy↓)

(6.28)

Ψ3T1 = eJ(R)Dyz↑ D↓ , (6.29)

144 R.J. Needs

Fig. 6.3. DMC energy differences from the ground state in eV with statistical errorsfor the lowest symmetry states of the neutral vacancy in diamond. The dotted arrowindicates the lowest spin and orbitally dipole allowed transition. After Hood et al. [4]

In this notation, the superscripts x, y, z label which of the t2 orbitals areincluded in the determinant, in addition to the a1 and lower energy orbitals.The values of the parameters in the Jastrow factors were optimized by min-imizing the variance of the local energy. [24, 25]

Fig. 6.3 shows the DMC energies of the states obtained by Hood et al. [4]The 1E state was found to be the ground state, in agreement with experiment.The state with the lowest energy electronic excitation from the ground statewhich is spin and orbitally dipole allowed was the 1T2 state, and these statesare identified as giving rise to the GR1 line. The associated DMC transitionenergy was calculated to be 1.51(34) eV, which lie within the the statisticalerror bars of the experimental value of 1.673 eV.

The vacancy formation energy was calculated to be 6.98 eV within theLSDA, and 5.96(34) eV within DMC. Both of these values include a correc-tion for the Jahn-Teller relaxation energy of the vacancy, which is estimatedto be 0.36 eV. [63] The DMC formation energy indicates that it is approx-imately one eV more favorable to form a neutral vacancy in diamond thanpredicted by the LSDA. The vacancy formation energy in diamond has yetto be measured.

6.6.4 Schottky defects in magnesium oxide

Alfe and Gillan [65] recently used DMC to study the Schottky defect for-mation energy in MgO. Although MgO is classified as an insulator ratherthan a semiconductor, these calculations illustrate very nicely the methodol-ogy which can be used to calculate the energetics of charged defects within

6 Quantum Monte Carlo techniques and defects in semiconductors 145

QMC. Oxide materials are, of course, of great importance as insulating layersin semiconductor devices.

The Schottky defect energy of MgO is defined to be the energy changewhen a Mg2+ and an O2− ion are removed from a perfect MgO crystal, theions being replaced to form an additional unit cell of perfect crystal. TheSchottky defect energy, ES, governs the concentration of vacancies present inthermal equilibrium.

The methodology used for calculating ES follows that used in DFT cal-culations. One way to model the Schottky defect would be to remove an Mgatom and an O atom from a large supercell, but in electronic structure calcu-lations it is normally preferable to study supercells containing only a singledefect, because otherwise the interactions between the defects would be large.

Consider a crystal containingN cation andN anion sites. Let EN(ν+, ν−)denoted the total energy of a crystal containing ν+ cation vacancies and ν−

anion vacancies. The Schottky defect energy is then

ES = EN(1, 0) + EN(0, 1)− 2(N − 1)N

EN(0, 0) . (6.30)

The systems with a single Mg2+ vacancy and a single O2− vacancy were mod-elled by supercells subject to periodic boundary conditions. These supercellswould each carry a net charge, in which case the electrostatic energy of thesystem cannot be defined, and therefore an appropriate uniform backgroundcharge was added to each supercell to make them charge neutral. In the limitof large N the presence of the uniform background charges makes no differ-ence to ES. In reality a finite value of N must be used, which introduces finitesize errors. The finite size error due to the interaction between the periodicimages of the defects was corrected for by a standard method. [66]

The DMC calculations used N = 27, i.e., 53 atoms for the defective crys-tals and 54 for the perfect crystals, and the relaxed atomic positions wereobtained from LSDA calculations. Slater-Jastrow wave functions were used,with orbitals obtained from plane-wave LSDA calculations, and optimizedJastrow factors. For efficient evaluation within the QMC calculations the or-bitals were re-expanded in a basis of B-splines or “blips”, which are functionswhich are strongly localized in real space. [67]

The DMC calculations gave ES = 7.5 ± 0.53 eV, while equivalent LSDAcalculations using the same pseudopotentialand cell size gave ES = 6.99 eV,and a highly converged LSDA calculation gave ES = 6.76 eV. It is verydifficult to measure the Schottky defect energy, but it is estimated to liewithin the range 4 − 7 eV.

6.7 Conclusions

QMC techniques provide a unified treatment of both the ground and excitedstate energies of correlated electron systems. They are therefore widely ap-

146 R.J. Needs

plicable and hold great promise as a computational method to enhance ourunderstanding of defects in solids.

Considering that QMC is very costly, there seems little point in using itunless the results are reliably more accurate than those

obtained from less costly methods such as DFT. The pursuit of higheraccuracy in QMC calculations is, in my opinion, the most important prac-tical issue facing today’s practitioners. Very high accuracy has already beendemonstrated for small systems, but the situation for larger systems is lessclear. I believe that many careful studies are required before we can confi-dently assert that QMC techniques reliably achieve highly accurate resultsfor complicated systems such as defects in semiconductors.

QMC techniques are currently in a phase of rapid development. The mostchallenging problem in fermion QMC is the infamous fermion sign problem.Solving the sign problem is certainly extremely difficult and perhaps impos-sible. [68] It is thought that exact fermion methods which avoid the signproblem will be exponentially slow on a classical computer. However, it isclear that QMC methods can deliver highly accurate results provided thetrial wave functions are accurate enough. It is important to appreciate thattrial wave functions can be improved using optimization techniques, whereasimproving DFT results requires the development of better density function-als, which seems likely to be a much harder problem. Another importantissue is that currently it is not possible to calculate accurate atomic forceswithin QMC for large systems, and therefore it is not possible to relax defectstructures. So far this problem has been avoided by using relaxed structuresobtained from DFT calculations. The problem of calculating accurate forceswithin QMC for large systems is unlikely to be insurmountable, and I believethat a satisfactory solution will be developed in due course. Although QMCmethods have a long way to go before they are routinely applied to defectsin semiconductors I hope that I have persuaded the reader that such a goalis both desirable and possible.

I have discussed applications of the DMC method to self-interstitials insilicon, the neutral vacancy in diamond and Schottky defects in magnesiumoxide. These studies have demonstrated the feasibility of DMC studies of de-fects in semiconductors, and have already produced interesting results whichchallenge those from other methods.

Acknowledgments

I would like to thank all of my collaborators who have contributed so much toour QMC project. Much of this work has been supported by the Engineeringand Physical Sciences Research Council (EPSRC), UK.

6 Quantum Monte Carlo techniques and defects in semiconductors 147

References

1. For a review of the VMC and DMC methods and applications to solids seeW.M.C. Foulkes, L. Mitas, R.J. Needs, and G. Rajagopal, Rev. Mod. Phys. 73,33 (2001)

2. D.M. Ceperley and B.J. Alder, Phys. Rev. Lett. 45, 566 (1980)3. W.-K. Leung, R.J. Needs, G. Rajagopal, S. Itoh, and S. Ihara, Phys. Rev. Lett.

83, 2351 (1999)4. R.Q. Hood, P.R.C. Kent, R.J. Needs, and P.R. Briddon, Phys. Rev. Lett. 91,

076403 (2003)5. S. B. Healy, C. Filippi, P. Kratzer, E. Penev, and M. Scheffler, Phys. Rev. Lett.

87, 016105 (2001)6. C. Filippi, S. B. Healy, P. Kratzer, E. Pehlke, and M. Scheffler, Phys. Rev. Lett.

89, 166102 (2002)7. A.J. Williamson, J.C. Grossman, R.Q. Hood, A. Puzder, and G. Galli, Phys.

Rev. Lett. 89, 196803 (2002)8. N.D. Drummond, A.J. Williamson, R.J. Needs, and G. Galli, to appear in Phys.

Rev. Lett. (2005)9. W.L. McMillan, Phys. Rev. A138, 442 (1965)

10. N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, and E. Teller,J. Chem. Phys. 21, 1087 (1953)

11. J.B. Anderson, J. Chem. Phys. 63, 1499 (1975)12. J.B. Anderson, J. Chem. Phys. 65, 4121 (1976)13. D.M. Ceperley, J. Stat. Phys. 63, 1237 (1991)14. R.C. Grimm and R.G. Storer, J. Comput. Phys. 7, 134 (1971)15. M.H. Kalos, D. Levesque, and L. Verlet, Phys. Rev. A 9, 257 (1974)16. D.M. Ceperley and M.H. Kalos. in Monte Carlo Methods in Statistical Physics,

edited by K. Binder (Springer, Berlin, 1979), pp 145–19417. B.L. Hammond, W.A. Lester, Jr., and P.J. Reynolds, Monte Carlo Methods in

ab initio quantum Chemistry , (World Scientific, Singapore, 1994)18. S. Baroni and S. Moroni, Phys. Rev. Lett. 82, 4745 (1999)19. T. Kato, Comm. Pure Appl. Math. 10, 151 (1957)20. R.P. Feynman and M. Cohen, Phys. Rev. 102, 1189 (1956)21. Y. Kwon, D.M. Ceperley, and R.M. Martin, Phys. Rev. B 58, 6800 (1998)22. M. Holzmann, D.M. Ceperley, C. Pierleoni, and K. Esler, Phys. Rev. E 68,

046707 (2003)23. D.M. Ceperley, J. Stat. Phys. 43, 815 (1986)24. C.J. Umrigar, K.G. Wilson, and J.W. Wilkins, Phys. Rev. Lett. 60, 1719 (1988)25. P.R.C. Kent, R.J. Needs, and G. Rajagopal, Phys. Rev. B 59, 12344 (1999)26. N.D. Drummond and R.J. Needs, to appear in Phys. Rev. B (2005)27. G. Rajagopal, R.J. Needs, S. Kenny, W.M.C. Foulkes, and A. James, Phys.

Rev. Lett. 73, 1959 (1994)28. G. Rajagopal, R.J. Needs, A. James, S. Kenny, and W.M.C. Foulkes, Phys.

Rev. B 51, 10591 (1995)29. C. Lin, F.H. Zong, and D.M. Ceperley, Phys. Rev. E 64, 016702 (2001)30. P.R.C. Kent, R.Q. Hood, A.J. Williamson, R.J. Needs, W.M.C. Foulkes, and

G. Rajagopal, Phys. Rev. B 59, 1917 (1999)31. A. Ma, N.D. Drummond, M.D. Towler, and R.J. Needs, Phys. Rev. E 71,

066704 (2005)

148 R.J. Needs

32. M.M. Hurley and P.A. Christiansen, J. Chem. Phys. 86, 1069 (1987)33. L. Mitas, E.L. Shirley, and D.M. Ceperley, J. Chem. Phys. 95, 3467 (1991)34. M. Casula, C. Filippi, and S. Sorella, unpublished35. C.W. Greeff and W.A. Lester, Jr., J. Chem. Phys. 109, 1607 (1998)36. J.R. Trail and R.J. Needs, J. Chem. Phys. 122, 014112 (2005)37. J.R. Trail and R.J. Needs, J. Chem. Phys. 122, 224322 (2005)38. W.M.C Foulkes, R.Q. Hood, and R.J. Needs, Phys. Rev. B 60, 4558 (1999)39. A.J. Williamson, R.Q. Hood, R.J. Needs, and G. Rajagopal, Phys. Rev. B 57,

12140 (1998)40. M.D. Towler, R.Q. Hood, and R.J. Needs, Phys. Rev. B 62, 2330 (2000)41. Ryo Maezono, M.D. Towler, Y. Lee, R.J. Needs, Phys. Rev. B 68, 165103 (2003)42. R.J. Needs and M.D. Towler, Int. J. Mod. Phys. B 17, 5425 (2003)43. G.D. Watkins in Defects and Their Structure in Non-metallic Solids, edited by

B. Henderson and A.E. Hughes, (Plenum, New York, 1976), p 20344. S. Dannefaer, P. Mascher, and D. Kerr, Phys. Rev. Lett. 56, 2195 (1986)45. W. Frank, U. Gosele, H. Mehrer, and A. Seeger, in Diffusion in Crystalline

Solids, edited by G.E. Murch and A.S. Nowick (Academic, Orlando, FL, 1985),p 64

46. U. Gosele, A. Plossl, and T.Y. Tan, in Process Physics and Modeling in Semi-conductor Technology, edited by G.R. Srinivasan, C.S. Murthy, and S.T. Dun-ham (Electrochemical Society, Pennington, NJ, 1996), p 309

47. D. Eaglesham, Physics World 8 No. 11, 41 (1995)48. R.J. Needs, J. Phys.: Condensed Matter 11, 10437 (1999)49. S. Goedecker, T. Deutsch, and L. Billard, Phys. Rev. Lett. 88, 235501 (2002)50. Omar Al-Mushadani and R.J. Needs, Phys. Rev. B 68, 235205 (2003)51. K.C. Pandey, Phys. Rev. Lett. 57, 2287 (1986); E. Kaxiras and K.C. Pandey,

Phys. Rev. B 47, 1659 (1993)52. P.E. Blochl, E. Smargiassi, R. Car, D.B. Laks, W. Andreoni, and S.T. Pan-

telides, Phys. Rev. Lett. 70, 2435 (1993)53. S.J. Clark and G.J. Ackland, Phys. Rev. B 56, 47 (1997)54. J. Bernholc, A. Antonelli, T.M. Del Sole, Y. Bar-Yam, and S.T. Pantelides,

Phys. Rev. Lett. 61, 2689 (1988)55. G. Davies and C.P. Foy, J. Phys. C 13, 4127 (1980)56. C.D. Clark and J. Walker, Proc. R. Soc. London Ser. A 234, 241 (1973)57. G. Davies, Rep. Prog. Phys. 44, 787 (1981)58. M. Lannoo and J. Bourgoin, Point Defects in Semiconductors I: Theoretical

Aspects (Berlin, Springer, 1981), pp 141–14559. C.A. Coulson and M.J. Kearsley, Proc. R. Soc. London Ser. A 241, 433 (1957)60. L. Hedin, Phys. Rev. 139, A796 (1965)61. G. Onida, L. Reining, and A. Rubio, Rev. Mod. Phys. 74, 601 (2002)62. U. von Barth, Phys. Rev. A 20, 1693 (1979)63. S.J. Breuer and P.R. Briddon, Phys. Rev. B 51, 6984 (1995)64. A. Zywietz, J. Furthmuller, and F. Bechstedt, Phys. Rev. B 62, 6854 (2000)65. D. Alfe and M.J. Gillan, Phys. Rev. B 71, 220101 (2005)66. M. Leslie and M.J. Gillan, J. Phys. C 18, 973 (1985)67. D. Alfe and M.J. Gillan, Phys. Rev. B 70, 161101 (2004)68. M. Troyer and U.-J. Wiese, Phys. Rev. Lett. 94, 170201 (2005)

7 Quasiparticle Calculations for Point Defects

at Semiconductor Surfaces

Arno Schindlmayr1,2 and Matthias Scheffler2

1 Institut fur Festkorperforschung, Forschungszentrum Julich, D-52425 Julich,Germany [email protected]

2 Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4–6,D-14195 Berlin-Dahlem, Germany [email protected]

We present a quantitative parameter-free method for calculating defect statesand charge-transition levels of point defects in semiconductors. It com-bines the strength of density-functional theory for ground-state total ener-gies with quasiparticle corrections to the excitation spectrum obtained frommany-body perturbation theory. The latter is implemented within the G0W0

approximation, in which the electronic self-energy is constructed non-self-consistently from the Green’s function of the underlying Kohn–Sham system.The method is general and applicable to arbitrary bulk or surface defects.As an example we consider anion vacancies at the (110) surfaces of III–Vsemiconductors. Relative to the Kohn–Sham eigenvalues in the local-densityapproximation, the quasiparticle corrections open the fundamental band gapand raise the position of defect states inside the gap. As a consequence, thecharge-transition levels are also pushed to higher energies, leading to closeagreement with the available experimental data.

7.1 Introduction

The electric properties of semiconductors, and hence their applicability inelectronic devices, are to a large degree governed by defects that are eitherintrinsic or incidentally or intentionally introduced impurities. Considerableefforts, therefore, focus on determining the factors that lead to the formationof point defects and their influence on a material’s electric properties.

The progress in the understanding of the atomic structure of point de-fects at cleaved III–V semiconductor surfaces, which serve as an illustrationin this work, was recently reviewed by Ebert [1]. The possibility to imageindividual defects using scanning tunneling microscopy (STM) with atomicresolution, in particular, has yielded a wealth of data, but as STM providesa somewhat distorted picture of the electronic states close to the Fermi en-ergy, these results cannot (and should not) be identified directly with theatomic geometry. Electronic-structure calculations have hence turned out tobe an indispensable tool for the interpretation of the experimental findings.A deeper discussion of this point together with an example for the practicalstructure determination of a semiconductor surface is given in [2]. For point

150 Arno Schindlmayr and Matthias Scheffler

defects at the (110) surfaces of III–V semiconductors several calculationswere reported [3–8]. They are based on density-functional theory [9], wherethe exchange-correlation energy is typically treated in the local-density ap-proximation (LDA) [10] or generalized gradient approximations (GGA) [11].The agreement with experimental STM data appears to be very good. Forexample, the enhanced contrast of the empty pz orbitals of the two Ga atomsnearest to an anion vacancy at p-type GaAs(110) observed under positive biasand initially interpreted as an outward relaxation [12] could thus be under-stood to result, instead, from a downward local band bending accompaniedby an inward relaxation [3]. The band bending itself is caused by the positivecharge of the defect. Another controversy centers on the lateral relaxation ofthe positively charged anion vacancy. While STM images show a density ofstates preserving the mirror symmetry of the surface at the defect site [12],early theoretical studies of the lattice geometry based on total-energy min-imization produced conflicting evidence for [3] and against [4, 5] a possiblebreaking of the mirror symmetry. Well converged electronic-structure calcu-lations later confirmed that the distortion is indeed asymmetric [6–8] and thatthe apparently symmetric STM image results from the thermally activatedflip motion between two degenerate asymmetric configurations.

In contrast, theoretical predictions of the electronic properties of pointdefects have been less successful and still show significant quantitative de-viations from experimental results. The principal quantities of interest arethe location of defect states in the fundamental band gap as well as thecharge-transition levels. We will carefully distinguish in this chapter betweenthese two quantities: “defect states” or “defect levels” on the one hand and“charge-transition levels” on the other. The former are part of the electronicstructure and can, in principle, be probed by photoemission spectroscopy,although standard spectroscopic techniques are often not applicable due tothe low density of the surface defects. The Franck–Condon principle is typ-ically well justified, because the rearrangement of the atoms happens on amuch slower time scale than the photoemission process. Nevertheless, thecoupling of the electrons to the lattice may be visible in the line widthsand shapes. The defect levels thus contain the full electronic relaxation inresponse to the created hole in direct photoemission or the injected extraelectron in inverse photoemission, but no atomic relaxation. Although inves-tigations of electronic properties frequently rely on the Kohn–Sham eigenval-ues from density-functional theory, these only provide a first approximationto the true band structure, and quantitative deviations from experimentalresults must be expected. In particular, for many materials the eigenvaluegap both in the LDA and the GGA underestimates the fundamental bandgap significantly. Likewise, the position of defect states in the gap cannotbe determined without systematic errors. With these words of warning wenote, however, that the Kohn–Sham eigenvalues constitute a good and welljustified starting point for calculating band structures and defect states [13].

7 Quasiparticle Calculations for Point Defects at Semiconductor Surfaces 151

Therefore, a perturbative approach starting from the Kohn–Sham eigenvaluespectrum is indeed appropriate.

While the measurement of the defect levels (see the discussion on page2 of this chapter) probes the geometry before the electron is added or re-moved, the charge-transition levels are thermodynamic quantities and spec-ify the values of the Fermi energy where the stable charge state of the defectchanges. Therefore, the charge-transition levels are affected noticeably by theatomic relaxation taking place upon the addition or removal of an electron.A quantitative analysis must hence be able to accurately compare the forma-tion energies of competing configurations with different numbers of electrons.Density-functional theory at the level of the LDA or GGA is capable of giv-ing a good account of the atomic geometries for many materials. A criticalfeature is the nonlinearity of the exchange-correlation functional. The pseu-dopotentialapproximation, which effectively removes the inner shells from thecalculation by modeling their interaction with the valence electrons in termsof a modified potential, linearizes the core–valence interaction and thus doesnot treat this contribution correctly. In particular, freezing the d electrons inthe core of a pseudopotentialleads to poor lattice constants and a distortedelectronic structure for some III–V semiconductors, such as GaN, where theGa 3d states resonate strongly with the N 2s states [14]. For GaAs and InPthis is a lesser problem, because the cation d states are energetically wellbelow the anion 2s states and thus relatively inert. As a result, the LDAand the GGA yield only small deviations from the experimental lattice con-stants and provide a good starting point for quasiparticle band-structure cal-culations. However, when competing configurations with a different numberof electrons are compared, the relevant energy differences lack the requiredquantitative accuracy. As we will show in more detail below, the reason isthat these jellium-based approximations of the exchange-correlation energyignore important features of the exact functional, such as the discontinuityof the exchange-correlation potential with respect to a change in the particlenumber. As a consequence, previous calculations of charge-transition levelsbased on the LDA exhibit systematic errors, for example for anion vacanciesat InP(110) [6].

As an alternative approach to the electronic structure of point defects,we employ techniques adapted from many-body perturbation theory that wehave found to be very fruitful in the past [15]. Exchange and correlationeffects are here described by a non-local and frequency-dependent self-energyoperator. The solutions of the ensuing nonlinear eigenvalue equations havea rigorous interpretation as excitation energies and can be identified withthe electronic band structure. We discuss the calculation of defect statesas well as charge-transition levels within this framework and show that theresults improve significantly upon earlier values obtained from the Kohn–Sham scheme in the LDA. As an example we consider anion vacancies atGaAs(110) and InP(110), but the method is general and can also be applied to

152 Arno Schindlmayr and Matthias Scheffler

other defects at surfaces as well as in the bulk. The (110) surfaces are not onlythe natural cleavage planes of III–V semiconductors, but they have severalcharacteristics that make them particularly interesting for defect studies. Asno surface states exist inside the fundamental band gap [16, 17], the Fermienergy of a system that is clean and free from intrinsic defects is not pinnedbut controlled by the doping of the crystal. Only imperfections, such as anionvacancies or antisite defects, can introduce gap states and pin the Fermienergy at the surface. STM can probe filled and empty surface states byreversing the bias voltage [12], and both the GaAs(110) and the InP(110)surface are well characterized experimentally.

This chapter is organized as follows. We start in Sect. 7.2 by reviewingthe computational methods. In Sect. 7.3 we then explain the physics of thedefect-free GaAs(110) and InP(110) surfaces. The calculation of defect levelsis discussed in Sect. 7.4 and that of charge-transition levels in Sect. 7.5,together with a comparison with the available theoretical and experimentaldata. Finally, Sect. 7.6 summarizes our conclusions. Unless where explicitlyindicated otherwise, we use Hartree atomic units.

7.2 Computational Methods

A quantitative analysis of the electronic properties of point defects requirescomputational schemes that describe not only the ground state but alsothe excitation spectrum. While density-functional theory with state-of-the-art exchange-correlation functionals can be used to determine ground-stateatomic geometries, many-body perturbation theory is the method of choicefor excited states. In this work we take the Kohn–Sham eigenvalues in theLDA as a first estimate and then apply the G0W0 approximation for theelectronic self-energy as a perturbative correction. The latter provides a goodaccount of the discontinuity as well as other shortcomings of the LDA. Forthis reason we first review both schemes, emphasizing their strengths as wellas limitations.

7.2.1 Density-Functional Theory

Density-functional theory is based on the Hohenberg–Kohn theorem [9],which observes that the total energy EN,0 of a system of N interactingelectrons in an external potential Vext(r) is uniquely determined by theground-state electron density nN (r). While the Hohenberg–Kohn theoremitself makes no statement about the mathematical form of this functional, ithas inspired algorithms that exploit the reduced number of degrees of free-dom compared to a treatment based on many-particle wave functions. In theKohn–Sham scheme [10], which underlies all practical implementations, thedensity is constructed from the orbitals of an auxiliary noninteracting systemaccording to

7 Quasiparticle Calculations for Point Defects at Semiconductor Surfaces 153

nN(r) = 2∞∑

j=1

fN,j |ϕN,j(r)|2 . (7.1)

The occupation numbers fN,j are given by the Fermi distribution; at zerotemperature they equal one for states below the Fermi energy and zero forstates above. The factor 2 stems from the spin summation. Here we onlyconsider non magnetic systems and thus assume two degenerate spin channelsthroughout, although the formalism can easily be generalized if necessary.The energy functional is decomposed as

EN,0 ≡ E[nN ] = Ts[nN ] +∫Vext(r)nN (r) d3r +EH[nN ] + Exc[nN ] , (7.2)

where Ts[nN ] is the kinetic energy of the auxiliary noninteracting systemand EH[nN ] the Hartree energy. The last term incorporates all remainingexchange and correlation contributions, and is not known exactly. In practicalimplementations it must be approximated, for example by the LDA, whichreplaces the exchange-correlation energy Exc[nN ] by that of a homogeneouselectron gas with the same local density [10]. A variational analysis finallyshows that the total energy is minimized if the single-particle orbitals satisfy[

−12∇2 + Vext(r) + VH([nN ]; r) + Vxc([nN ]; r)

]ϕN,j(r) = εKS

N,jϕN,j(r) .

(7.3)The Hartree potential VH([nN ]; r) and the exchange-correlation potentialVxc([nN ]; r) are defined as functional derivatives of the corresponding en-ergy terms with respect to the density. The eigenvalues εKS

N,j are Lagrangeparameters that enforce the normalization of the orbitals.

We use the Kohn–Sham scheme to determine the equilibrium geometryof clean surfaces and surface defects by relaxing the atomic coordinates andallowing the system to explore its energetically most favorable configuration.Although the Kohn–Sham eigenvalues differ from the true excitation energiesand constitute only an approximation to the quasiparticle band structure,they are often numerically close in practice. This follows from the transition-state theorem of Slater [18] and Janak [19] and allows a correction withinperturbation theory. Here we are interested in the position of the defect state,which may be occupied or unoccupied, relative to the surface valence-bandmaximum. As the defect state is separated from the valence-band maximumby a finite energy difference, the location obtained from the Kohn–Shameigenvalue spectrum contains two sources of errors: In addition to the cho-sen approximation for the exchange-correlation functional, there is anothersystematic error that is due to fundamental limitations of the Kohn–Shamscheme and would also be present if the exact functional was employed. Inorder to understand the latter, we now briefly sketch its origin.

One rigorous result is that the eigenvalue of the highest occupied Kohn–Sham state matches the corresponding quasiparticle energy [20], which is in

154 Arno Schindlmayr and Matthias Scheffler

turn equal to the ionization potential that marks the threshold for photoe-mission, i.e., εKS

N,N = EN,0−EN−1,0. If the highest occupied state correspondsto the valence-band maximum, then the unoccupied defect state equals theelectron affinity εKS

N+1,N+1 = EN+1,0−EN,0 , because it is the next to be pop-ulated by one extra electron added to the system. Unfortunately, this is notthe same as the eigenvalue εKS

N,N+1 of the first unoccupied state obtained from(7.3). The difference is due to the fact that the exchange-correlation potentialof an insulator Vxc([nN+1]; r) = Vxc([nN ]; r) + Δxc + O(1/N) with Δxc > 0changes discontinuously upon addition of an extra electron [21,22]. A similarargument can be made if the defect state is occupied. As a consequence, theKohn–Sham eigenvalue gaps differ systematically from the gaps in the truequasiparticle band structure. The magnitude of the discontinuity is still amatter of controversy but is believed to be significant. For pure sp-bondedsemiconductors like GaAs the LDA underestimates the experimental bandgap by about 50%. The GGA, designed only to improve the total energy,yields a very similar eigenvalue spectrum as the LDA when applied to thesame atomic geometry. An entirely different construction that permits a moresystematic treatment of exchange and correlation is the optimized effectivepotential method [23]. When evaluated to first order in the coupling con-stant, this approach yields the exact exchange potential, which can be usedin band-structure calculations [24]. Remarkably, for many semiconductors theresulting Kohn–Sham eigenvalue gaps are very close to the true quasiparticleband gaps [25]. The significance of this observation is under debate, how-ever, because there are indications that it may not uphold if correlation istreated on the same footing. Preliminary results suggest that the eigenvaluegaps are again close to the LDA values if correlation is included within therandom-phase approximation [26, 27], but there is currently too little datato make a definite statement, and all existing calculations at this level alsocontain additional simplifications, for example a shape approximation for thepotential in the linearized muffin-tin orbital method in [27].

The reasoning above suggests that density-functional theory is still, inprinciple, applicable for calculating the difference in total energy betweenground-state configurations with different electron numbers, which is neededto determine the energetically most favorable charge state of a point defect.However, all jellium-based functionals lack the derivative discontinuity Δxc

of the exact exchange-correlation potential. This neglect reduces the electronaffinity EN+1,0 −EN,0 both in the LDA and the GGA if the additional elec-tron occupies a state separated from the valence-band maximum by a finiteenergy difference. For systems in contact with an electron reservoir, such asdefects in solids, it hence lowers the threshold for an increase of the elec-tron population. This is consistent with the observation that the calculatedcharge-transition levels for materials like InP are significantly smaller thanthe available experimental measurements [6]. The exact exchange potential,an implicit functional of the density defined in terms of the Kohn–Sham

7 Quasiparticle Calculations for Point Defects at Semiconductor Surfaces 155

orbitals, includes a derivative discontinuity [13], but the latter exceeds theexperimental band gap significantly and must hence be partially canceled bya correlation contribution with similar magnitude and opposite sign [25]. Asthe total energies are by construction close to the corresponding Hartree–Fock values, the electron affinities EN+1,0 − EN,0 are grossly overestimated,and no reliable charge-transition levels can be obtained in this way.

Our implementation of density-functional theory employs the plane-wavepseudopotentialmethod in combination with the LDA. We use the parame-terization by Perdew and Zunger [28], which is in turn based on the quan-tum Monte-Carlo data by Ceperley and Alder for the homogeneous electrongas [29]. The norm-conserving pseudopotentials are of the fully separableKleinman–Bylander form [30]. We choose d as the local component for allpseudopotentials except for In, where p is used instead. The Kohn–Shamwave functions are expanded in plane waves with a cutoff energy of 15 Ry.Our calculations are performed with the FHImd code [31, 32]. The bulk lat-tice constants obtained in this way, 5.55 A for GaAs and 5.81 A for InP inthe absence of zero-point vibrations, are in good agreement with previouslypublished data [33] and slightly smaller than the experimental values at roomtemperature by 1.8% and 1.1%, respectively [34]. We use the theoretical lat-tice constants in order to prevent errors resulting from a non equilibriumunit-cell volume during the relaxation of the surface geometries.

7.2.2 Many-Body Perturbation Theory

Many-body perturbation theory [35] provides powerful techniques to analyzethe electronic structure in the gap region, because the framework is designedspecifically to give access to excited states. Quasiparticle excitations createdby the addition or removal of one electron are obtained from the one-particleGreen’s function

G(r, r′; t− t′) = −i〈ΨN,0|T {ψ(r, t)ψ†(r′, t′)}|ΨN,0〉 , (7.4)

where |ΨN,0〉 denotes the ground-state wave function of the interacting elec-tron system in second quantization, ψ†(r′, t′) = exp(iHt′)ψ†(r′) exp(−iHt′)and ψ(r, t) = exp(iHt)ψ(r) exp(−iHt) are the electron creation and annihi-lation operators in the Heisenberg picture, respectively, and the symbol Tsorts the subsequent list of operators according to ascending time argumentsfrom right to left with a change of sign for every pair permutation. TheGreen’s function can be interpreted as a propagator: For t > t′ it describesa process in which an extra electron is added to the system at time t′. Theresulting wave function is, in general, no eigenstate of the Hamiltonian Hbut a linear combination of many eigenstates |ΨN+1,j〉. Between the times t′

and t each projection evolves with its own characteristic phase, and a Fourieranalysis of this oscillatory behavior immediately yields the energy spectrumεN,j = EN+1,j −EN,0 of the accessible excited states. Likewise, for t′ > t the

156 Arno Schindlmayr and Matthias Scheffler

Green’s function describes the propagation of an extra hole between t and t′,yielding the energies εN,j = EN,0−EN−1,j. In mathematical terms, we inserta complete set of eigenstates between the field operators in (7.4) and Fouriertransform to the frequency axis. The resulting expression

G(r, r′;ω) = limη→+0

∞∑j=1

fN,j

ψN,j(r)ψ∗N,j(r

′)ω − εN,j − iη

(7.5)

+ limη→+0

∞∑j=1

(1 − fN,j)ψN,j (r)ψ∗

N,j(r′)

ω − εN,j + iη

shows that the poles of the Green’s function correspond directly to the quasi-particle energies εN,j . Significantly, these not only yield the true band struc-ture, but the highest occupied state also equals the exact ionization potentialEN,0 − EN−1,0 and the lowest unoccupied state the exact electron affinityEN+1,0 − EN,0. Therefore, many-body perturbation theory provides a con-venient way to analyze both defect states and the energetics of charge tran-sitions. The wave functions ψN,j(r) = 〈ΨN−1,j |ψ(r)|ΨN,0〉 for occupied andψN,j(r) = 〈ΨN,0|ψ(r)|ΨN+1,j〉 for unoccupied states are obtained from thequasiparticle equations[

−12∇2 + Vext(r) + VH(r)

]ψN,j (r)

+∫Σxc(r, r′, εN,j)ψN,j(r′) d3r′ = εN,jψN,j(r) . (7.6)

The self-energy Σxc(r, r′, ε) incorporates all contributions from exchange andcorrelation processes. In contrast to the exchange-correlation potential ofdensity-functional theory, it is non-local, energy-dependent and has a finiteimaginary part, which is proportional to the damping rate resulting fromelectron–electron scattering. Together with other relevant decay channels,such as scattering from phonons or impurities, this mechanism is responsiblefor a finite lifetime of the excitations.

For real materials the self-energy can only be treated approximately. Likethe majority of practical applications, we use the G0W0 approximation [36]

Σxc(r, r′; t− t′) = limη→+0

iG0(r, r′; t− t′)W0(r, r′; t− t′ + η) . (7.7)

The Fourier transform of G0(r, r′; t− t′) on the frequency axis is constructedin analogy to (7.5) from the eigenstates of an appropriate mean-field system,in our case from the Kohn–Sham orbitals ϕN,j (r) and eigenvalues εKS

N,j. Thedynamically screened Coulomb interactionW0(r, r′; t− t′) can be modeled indifferent ways. Many implementations use plasmon-pole models, in which thefrequency-dependence is described by an analytic function whose parametersare determined by a combination of known sum rules and asymptotic limits

7 Quasiparticle Calculations for Point Defects at Semiconductor Surfaces 157

[37, 38]. This simplification has the advantage that it facilitates an analytictreatment, but it ignores details of the dynamic screening processes in amaterial and thus constitutes a potential source of errors. Here we take thefull frequency-dependence of the dielectric function into account by employingthe random-phase approximation

W0(k, ω) = v(k) + v(k)P0(k, ω)W0(k, ω) (7.8)

with the bare Coulomb potential v(k) = 4π/|k|2 and the polarizability

P0(r, r′; t− t′) = −2iG0(r, r′; t− t′)G0(r′, r; t′ − t) . (7.9)

The inclusion of dynamic screening derives from the concept of quasiparticles,which comprise an electron or hole together with its surrounding polariza-tion cloud. The composite is called quasiparticle because it behaves in manyways like a single entity. The polarization cloud is created by the repulsiveCoulomb potential and reduces the effective charge of the quasiparticle com-pared to that of the bare particle at its center. The G0W0 expression (7.7)constitutes the leading term in the expansion of the self-energy and is of firstorder in the dynamically screened interaction. Finally, we exploit the formalsimilarity between (7.6) and the Kohn–Sham equations (7.3) by evaluatingthe quasiparticle energies within first-order perturbation theory as

εN,j = εKSN,j + 〈ϕN,j |Σxc(εN,j) − Vxc[nN ]|ϕN,j〉 . (7.10)

The treatment within first-order perturbation theory is justified if the eigen-values of the underlying Kohn–Sham system are already sufficiently closeto the expected quasiparticle band structure. This is guaranteed by thetransition-state theorem [18, 19]. Besides, the orbitals ϕN,j(r) are usually agood approximation to the true quasiparticle wave functions. For the homo-geneous electron gas both are plane waves and thus coincide exactly. Numer-ical calculations for bulk semiconductors indicate an overlap close to unitybetween the Kohn–Sham orbitals in the LDA and the quasiparticle wavefunctions obtained from (7.6) with the G0W0 approximation for states nearthe band edges [39]. Larger effects have been observed for surfaces, especiallyfor image states, because the latter are located outside the surface in a regionwhere the LDA potential is qualitatively wrong [40,41]. Changes in the wavefunctions of other surface states can also be identified but have only a minorinfluence on the quasiparticle energies in the gap region. For GaAs(110) thishas been confirmed explicitly [42].

In principle, the equations (7.4) to (7.9) could be solved self-consistentlyby successively updating the self-energy with the quasiparticle orbitals de-rived from it. This approach is appealing on formal grounds because it makesthe results independent of the original mean-field approximation. In addi-tion, the self-consistent Green’s function satisfies certain sum rules, includ-ing particle-number conservation [43]. In practice, however, this procedure

158 Arno Schindlmayr and Matthias Scheffler

produces a poor excitation spectrum. The reason lies in the mathematicalstructure of Hedin’s equations, which describe the recipe for constructingthe self-energy from the Green’s function by means of functional-derivativetechniques [36]. The G0W0 approximation is the result of a single iteration.Further iterations would not only ensure self-consistency in the Green’s func-tion but would also introduce higher-order self-energy terms, so-called vertexcorrections. If the latter are neglected, then the spectral features deteriorate,as was first demonstrated for the homogeneous electron gas [44]. For bulksemiconductors self-consistency appears to lead to a gross overestimation ofthe fundamental band gap [45] in combination with poor spectral weightsand line widths. We hence follow the established procedure for practical ap-plications and terminate the cycle at the G0W0 approximation. Of course,the final results depend on the input Green’s function in this case.

Another point that has been raised in the same context concerns possi-ble errors resulting from the pseudopotentialapproach, which has tradition-ally dominated applications of the G0W0 approximation. Numerical devia-tions must be expected because the matrix elements of the self-energy in(7.10) are influenced by the pseudoization of the wave functions - that isthe neglect of core states plus appropriate smoothening of the valence statesin the core region - and the treatment of the core–valence interaction inpseudopotentialcalculations. Indeed, a number of early all-electron imple-mentations of the G0W0 approximation, based on the linearized augmentedplane-wave or the linearized muffin-tin orbital method, found quasiparticleband gaps for prototype semiconductors that were significantly smaller thanpreviously reported values and blamed the discrepancy on the pseudopo-tentialapproximation [46, 47]. However, this claim was later refuted by all-electron calculations using plane waves [48, 49]. The issue is currently stillunder debate, but there is mounting evidence that a large part of the ob-served discrepancy is due to insufficient convergence of the early calculationsin combination with deficiencies of the linearized basis sets for the descriptionof high-lying unoccupied states [48,50]. If these factors are properly accountedfor, then the deviation is significantly reduced, and the all-electron results areagain in good agreement with experimental measurements. A certain discrep-ancy with respect to pseudopotentialcalculations still remains, although thedifference is small compared to that from the Kohn–Sham eigenvalues. Whilesome errors of the pseudopotentialapproach can be partially suppressed, forexample through a better description of the core–valence interaction [51],many others are inherent. In particular, the pseudopotentialconstruction alsoentails an incorrect description of high-lying states. A careful analysis of thequantitative implications of the pseudopotentialapproximation is the subjectof active research.

The good quantitative agreement between the G0W0 approximation andexperimental band structures has been demonstrated for a wide range of bulkmaterials, including III–V semiconductors [52–54]. Due to the high compu-

7 Quasiparticle Calculations for Point Defects at Semiconductor Surfaces 159

L Γ X W K Γ

-12

-8

-4

0

4

8

ener

gy (

eV)

Fig. 7.1. Bulk band structure of GaAs. The G0W0 approximation (straight lines)opens the too small Kohn–Sham eigenvalue gap in the LDA (dashed lines) and isin very good agreement with the experimental band gap. The calculation is carriedout at the theoretical lattice constant 5.55 A, and the valence-band maximum is setto zero in both schemes.

tational cost in present implementations, which stems from the evaluationof the non-locality of the propagators, their frequency-dependence, and theneed to include a large number of unoccupied states, there have been rela-tively few applications to more complex systems so far, however. Furthermore,these often contain additional simplifications: the two available studies of thequasiparticle band structure of GaAs(110) both employed plasmon-pole mod-els instead of the more accurate random-phase approximation [55, 56], andthe only published results for InP(110) were obtained within an even morerestrictive tight-binding formalism [57].

For a quasiparticle band-structure calculation the matrix elements of theself-energy in (7.6) must be evaluated in the frequency domain for stateswith a given wave vector k. Therefore, most practical implementations of theG0W0 approximation choose a reciprocal-space representation for all propa-gators. The cell-periodic part of the wave functions is often expanded in planewaves [52, 55], although localized basis sets like Gaussian orbitals have alsobeen used [54]. The disadvantage of the representation in reciprocal space isthat the products (7.7) and (7.9) turn into numerically expensive multidi-mensional convolutions. Therefore, we employ a representation in real spaceand imaginary time [58, 59], in which the self-energy and the polarizabilitycan be calculated by simple multiplications. The projection on wave vectorsand imaginary frequencies used to solve the Dyson-type equation (7.8) canbe done efficiently by exploiting fast Fourier transforms. The imaginary timeand frequency arguments are chosen because the functions are smoother on

160 Arno Schindlmayr and Matthias Scheffler

these axes and can be sampled with fewer grid points, although they containexactly the same information. The physical self-energy on the real frequencyaxis is eventually recovered by an analytic continuation.

As an illustration of the G0W0 approximation we show the calculated bulkband structure of GaAs in Fig. 7.1. The band gap is direct and located at Γin the center of the Brillouin zone. While the LDA underestimates the fun-damental band gap and yields a Kohn–Sham eigenvalue gap of only 0.78 eVat a lattice constant of 5.55A, the subsequent addition of the self-energy cor-rection opens the band gap to 1.55 eV, which is in very good agreement withthe experimental value of 1.52 eV [34]. As we measure all energies relative tothe valence-band maximum, we set the latter to zero and align the two setsof curves at this point. The principal effect of the G0W0 approximation is arigid upward shift of the conduction bands, although the dispersion is alsoslightly modified, as can be seen by the reduced band width at the bottom ofthe valence band. The band structure of InP looks very similar. In this casewe obtain a Kohn–Sham eigenvalue gap of 0.76 eV at the theoretical latticeconstant 5.81 A and a quasiparticle band gap of 1.52 eV that is again close tothe experimental value 1.42 eV [34]. Incidentally, the band gap depends sensi-tively on the lattice constant. For GaAs it decreases at a rate of −4.07 eV/Ain the LDA and −4.59 eV/A in the G0W0 approximation when the latticeconstant increases. For InP the values are −3.13 eV/A and −3.68 eV/A.

7.3 Electronic Structure of Defect-Free Surfaces

Before focusing on defect states we first briefly discuss the electronic struc-ture of the defect-free GaAs(110) and InP(110) surfaces. The nonpolar (110)surface, illustrated in Fig. 7.2, is the natural cleavage plane of the zincblendelattice, because it cuts the smallest number of bonds per unit area. Boththe cations and anions in the terminating layer are threefold coordinatedand each possess one dangling bond extending into the vacuum. Due to thedifferent electron affinities of the two species, charge is transfered from thedangling bonds of the cations to those of the anions. This charge transferis the driving force for a structural relaxation, which consists of an outwardmovement of the anion atoms and a corresponding inward movement of thecation atoms [33]. As a result, the orbitals of the latter rehybridize from ansp3 towards an energetically more favorable sp2 bonding situation in a nearlyplanar environment; the empty pz-like orbitals perpendicular to this plane arepushed to higher energies and form an unoccupied surface band. At the sametime, the three bonds between the anions and their neighboring group-IIIatoms are rearranged at almost right angles and become more p-like in char-acter; the non bonding electron pairs in the fourth orbital pointing away fromthe surface are in turn lowered in energy and give rise to an occupied surfaceband. The relaxation preserves the C1h point-group symmetry with a singlemirror plane perpendicular to the [110] direction.

7 Quasiparticle Calculations for Point Defects at Semiconductor Surfaces 161

Fig. 7.2. Geometry of the anion vacancy at the GaAs(110) surface. The As atomsare shown in light and the Ga atoms in dark grey. The three Ga atoms nearest tothe vacancy are indicated by arrows.

In Fig. 7.3 we show the calculated band structure of the clean and defect-free GaAs(110) surface at the theoretical lattice constant 5.55A, modeled asa slab element placed in a supercell with periodic boundary conditions in alldirections. The slab consists of six atomic layers, of which the top three layersare allowed to relax while the three base layers are kept fixed at their idealbulk positions. The dangling bonds at the bottom of the slab are passivatedby pseudoatoms with noninteger nuclear charges of 0.75 and 1.25 for anionand cation termination, respectively. The slabs are separated by a vacuumregion equivalent to four atomic layers. The grey-shaded regions in the fig-ure mark the projection of the G0W0 bulk bands onto the two-dimensionalsurface Brillouin zone, shown in the inset. The dashed lines indicate the oc-cupied and unoccupied surface band obtained from the LDA, the straightlines are the corresponding G0W0 results. For the calculation of the ground-state density and the Kohn–Sham eigenvalues we used four Monkhorst–Packk-points [60] in the irreducible part of the Brillouin zone, while the self-energy was evaluated at the four high-symmetry points Γ, X

′, M, and X. In

our implementation the k-point set enters merely as the reciprocal grid ofthe real-space mesh used to describe the nonlocality of G0(r, r′; t − t′) andΣxc(r, r′; t − t′) [58]; the four selected k-points correspond to a real-spacemesh that extends over four surface unit cells. This is sufficient, because thecorrelation length is of the order of the interatomic distance [61]. The positionof the Kohn–Sham surface bands relative to the corresponding bulk bandswas determined by aligning the electrostatic potential in the central part ofthe slab with that of the bulk. We then applied the self-energy correctionindependently to surface and bulk bands and again chose the valence-bandmaximum as the energy zero. From the figure it is evident that the G0W0

approximation has only a small effect on the dispersion of the surface bands.

162 Arno Schindlmayr and Matthias Scheffler

Γ X’ M X Γ-1.5

-1.0

-0.5

0.0

0.5

1.0

1.5

2.0

2.5

3.0

3.5

ener

gy (

eV)

Γ X’

MX

Fig. 7.3. Surface band structure of GaAs(110) in the LDA (dashed lines) and theG0W0 approximation (straight lines) at the theoretical lattice constant 5.55 A. Thegrey-shaded regions mark the projected G0W0 bulk bands. The inset indicates thetwo-dimensional surface Brillouin zone.

While the position of the occupied surface band relative to the valence-bandedge at Γ remains almost unchanged, the upward shift of the unoccupiedsurface band (0.86eV) slightly exceeds that of the bulk conduction bands(0.77eV). The larger impact of the self-energy correction on the surface gapwas noticed before [55, 56]; our result for the gap correction lies between thetwo previously published values. Small deviations of about 0.1 eV are dueto differences in the implementations, for example the reliance on plasmon-pole models in [55, 56] compared to the random-phase approximation for thedynamically screened Coulomb interaction in this work. Both the occupiedand the unoccupied surface band are in close proximity to the projected bulkbands and, in fact, overlap with them in large parts of the Brillouin zone.As neither extends into the fundamental gap between the bulk valence andconduction band edges at Γ, they do not pin the surface Fermi level.

Although prevalent in electronic-structure calculations for surfaces, thesupercell approach is a drastic alteration of the system’s geometry whose in-fluence must be carefully monitored, because the occurrence of electric mul-tipole moments may lead to artificial long-range interactions between the pe-riodic slabs. Static dipoles, if present, can be eliminated in density-functionaltheory [62]. With this correction the limit of isolated slabs is quickly reached.Dynamic dipoles are always created in dielectric media, however, and con-tribute to the polarizability in the G0W0 approximation. Actually, there aretwo contributions that must be distinguished. The first is the dynamic po-larization between the slabs, which gives rise to an additional slowly varyingpotential that reduces the band gap. This effect can be understood and even

7 Quasiparticle Calculations for Point Defects at Semiconductor Surfaces 163

quantitatively modeled in terms of classical image charges [63]; for the geom-etry used in this work it amounts to about 0.1 eV. The model thus allows ana posteriori extrapolation to the limit of an isolated slab. The second contri-bution is the finite width of the slab, which increases the gap due to quan-tization effects. At present there is no obvious cure for this problem withinthe G0W0 approximation, and as the two effects counteract each other, wehave not applied any partial correction to eliminate the dynamic polariza-tion between the slabs either. In principle, both problems could be avoidedby studying systems comprised of semi-infinite matter and vacuum regions.Within density-functional theory several methods have indeed been proposedfor this purpose [64–67]. Their efficiency relies on the fact that a perturbationbreaking the translational symmetry, such as a surface or defect, modifies theeffective potential only in its immediate vicinity. In the G0W0 formalism thiscannot be exploited to the same degree, because all propagators are explic-itly non-local, and a much larger simulation cell must hence be taken intoaccount. So far, only one G0W0 calculation for an effectively one-dimensionalsystem, a semi-infinite jellium surface, has been reported [68].

In preparation for later applications to larger supercells, we repeated theG0W0 calculation with lower cutoff energies and found that the self-energycorrection to the surface gap remained stable up to 10 Ry. The reason for therapid convergence, which is well known and can be exploited to reduce thecomputational expense considerably, is that the kinetic energy and the elec-trostatic Hartree potential, the two largest contributions to the quasiparticleenergies, are already included in the Kohn–Sham eigenvalues in (7.10); thematrix elements of the self-energy are less sensitive to the number of planewaves. Besides, we obtained essentially the same surface gap when reducingthe width of the slab to four layers. For this geometry and a cutoff energyof 10Ry, we performed test calculations in which we included up to 1049unoccupied bands in the Green’s function. With 379 bands the results arealready converged within 0.02 eV, sufficient for our purpose.

7.4 Defect States

The geometry of the anion vacancy is illustrated in Fig. 7.2 for the GaAs(110)surface. The removal of the As atom leaves each of the three Ga atoms sur-rounding the vacancy with a dangling bond. As a consequence, the two GaI

atoms in the first layer move downwards while the GaII atom in the secondlayer moves upwards and forms two new bonds with the GaI atoms acrossthe void. Its coordination number thus increases from four to five, while thethreefold coordination of the GaI atoms remains unchanged. The relaxationpreserves the C1h point-group symmetry, except in the positive charge statewhere an asymmetric distortion that pushes the unoccupied defect level inthe band gap to higher energies is more favorable [6, 8]. We find that thedistortion lowers the total energy by 0.17 eV for GaAs and 0.11 eV for InP.

164 Arno Schindlmayr and Matthias Scheffler

The anion vacancy gives rise to three electronic states, all localized at theGaI–GaII bond pair. They are labeled as 1a′, 1a′′ and 2a′, where a′ denotesstates that are even with respect to the mirror plane and a′′ denotes statesthat are odd. Although the asymmetric relaxation in the positive charge statedestroys this symmetry and deforms the orbitals slightly, it leaves the orderof the states intact, and we continue to use the same notation for simplicity.The 1a′ state is located several eV below the valence-band maximum andthus always filled with two electrons, while the 2a′ state is too high in energyto become populated. Only the 1a′′ state falls inside the fundamental bandgap. Depending on the doping, it can be occupied either by zero, one ortwo electrons, which corresponds to the positive, neutral or negative chargestate, respectively. It is important to note that the charge state influencesthe defect geometry, as the GaI–GaII bonds contract with increasing electronoccupancy, reflecting the bonding character of the 1a′′ state. In the followingwe examine the position of this defect level for a given charge state; thequestion which charge state is preferred under specific conditions, such asdoping, is answered in the next section.

For the density-functional calculations we choose a supercell consisting of2×4 surface unit cells and six atomic layers. For charged systems we includea uniform charge density with opposite sign in order to compensate the extraelectron or hole and restore overall neutrality in the supercell. Instead of awell defined defect state, the supercell periodicity gives rise to an artificialdispersion as illustrated in Fig. 7.4. At the surface of each slab the defectsform a rectangular grid whose lattice parameters ax and ay equal the dimen-sions of the supercell. Since the 1a′′ state is odd with respect to the mirrorplane, one can regard it as a p orbital that exhibits π-type bonding alongthe [001] and σ-type bonding along the [110] direction. We hence consider atight-binding model

εKSk = εKS

1a′′ + 2V KS1π cos(kxax) + 2V KS

1σ cos(kyay) (7.11)+V KS

2 cos(kxax) cos(kxax) + 2V KS3π cos(2kxax) + 2V KS

3σ cos(2kyay)

with parameters fitted to the calculated Kohn–Sham band, where εKS1a′′ equals

the eigenvalue of a single defect and the other parameters have the meaningof hopping integrals. The above expression includes interactions up to third-nearest neighbors and reproduces the dispersion with a correlation coefficientclose to 0.9999 for all systems under consideration.

As the G0W0 formalism involves non-local propagators, the amount ofdata that must be processed grows rapidly with the system size. In order tolimit the computational expense we use a smaller (2×2) supercell and fouratomic layers instead of the (2×4) cell to determine the self-energy correc-tion of the 1a′′ state. The stronger defect–defect interaction along the [110]direction increases the dispersion but does not change it qualitatively. As thepresence of the defect does not modify the range of the non-local propagatorsappreciably, we use one k-point and 1500 unoccupied bands for the construc-

7 Quasiparticle Calculations for Point Defects at Semiconductor Surfaces 165

X/4 Γ X’/2-1.5

-1

-0.5

0

0.5

1

1.5

2

2.5

3

3.5

ener

gy (

eV)

LDA

GW

asym.

sym.

(a)

X/4 Γ X’/2-1.5

-1

-0.5

0

0.5

1

1.5

2

2.5

3

3.5

Γ X’

X M

(b)

Fig. 7.4. Calculated 1a′′ defect level of the As vacancy at GaAs(110) in the (a)positive and (b) negative charge state (dashed line : LDA; solid line : G0W0). Theartificial dispersion in the LDA is due to the supercell periodicity; the horizontallines mark the actual defect level ε1a′′ and the corresponding G0W0 result. As thecalculations in (a) refer to the constrained symmetric relaxation, the additionalshift due to the asymmetric distortion is shown separately. The inset in (b) indicatesthe downfolded Brillouin zone for the (2×4) supercell.

tion of the Green’s functionG0, which corresponds to the same Brillouin-zonesampling as the four k-points and 379 bands that we found satisfactory forthe (1×1) unit cell of the defect-free surface. For the positively charged Asvacancy at GaAs(110) we evaluated the self-energy correction in the entireBrillouin zone [15]. By relating the calculated quasiparticle dispersion to atight-binding model equivalent to (7.11), we found very similar values for theparameters V3π and V3σ as in the LDA, which implies that the influence ofthe third-nearest neighbors on the self-energy correction is negligible. Thisobservation can be exploited as follows. At k′ = (2π/4)(1/ax, 1/ay, 0) thecontributions from the first- and second-nearest neighbors vanish, so thatthe Kohn–Sham eigenvalue is given by

εKSk′ = εKS

1a′′ − 2V KS3π − 2V KS

3σ (7.12)

and the corresponding quasiparticle energy by

εk′ = ε1a′′ − 2V3π − 2V3σ . (7.13)

If the surviving hopping integrals on the right-hand sides coincide, thenthe identity ε1a′′ − εKS

1a′′ = εk′ − εKSk′ holds. On the other hand, the self-energy

correction is defined through the relation

εk′ − εKSk′ = 〈ϕKS

k′ |Σ(εKSk′ ) − Vxc|ϕKS

k′ 〉 . (7.14)

166 Arno Schindlmayr and Matthias Scheffler

Instead of a scan over the whole Brillouin zone, we can hence determine theself-energy correction for an isolated defect from a single calculation at k′.The quasiparticle results are obtained by adding this correction to the Kohn–Sham value εKS

1a′′ of the larger (2×4) supercell, where the position of the latterrelative to the valence-band maximum can be established more accurately.We estimate that the uncertainty of the final quasiparticle energies, whichresults from the discrete k-point sampling, the approximate treatment of thecore–valence interaction in the pseudopotentialapproach, the finite size of thesupercell, and other convergence factors amounts to 0.1–0.2eV.

We performed explicit G0W0 calculations for the positive and negativecharge states, in the first case using a constrained symmetric relaxation. Theresults for GaAs(110) are displayed in Fig. 7.4. For the neutral anion vacancythe supercell contains an odd number of electrons; in combination with therequirement of spin degeneracy this leads to fractional occupation numbers ineach spin channel. At present we cannot treat such systems. For the positivecharge state with the proper asymmetric distortion the defect level in theG0W0 approximation is not calculated directly but deduced as follows. Asthe lowest unoccupied state, the 1a′′ defect level equals the electron affinity,i.e., ε1a′′ = Evac(0, Qasym

+ ) − Evac(+, Qasym+ ), where Evac(q, Q) denotes the

total energy of a surface featuring an anion vacancy with the actual electronpopulation q ∈ {+, 0,−} and a geometric structure optimized for the chargestate Q ∈ {Qasym

+ , Qsym+ , Q0, Q−}. This expression can be rewritten as

ε1a′′ =[Evac(0, Qasym

+ ) −Evac(0, Qsym+ )

]+[Evac(0, Qsym

+ ) −Evac(+, Qsym+ )

]+[Evac(+, Qsym

+ ) − Evac(+, Qasym+ )

](7.15)

by adding and subtracting intermediate configurations. The second term onthe right-hand side equals the quasiparticle energy for the correspondingsymmetric relaxation, which can be calculated with less computational costby exploiting the C1h point-group symmetry. The other two terms are simpletotal-energy differences between the symmetric and the asymmetric geometryfor a constant number of electrons; both are positive and can be obtained fromdensity-functional theory

The calculated defect levels are summarized in Table 7.1 for GaAs(110)and Table 7.2 for InP(110). Our LDA results are consistent with most of thepreviously published calculations [3, 5, 7]. A notable exception are the valuesby Zhang and Zunger [3] for the negative charge state of the As vacancy atGaAs(110) and for the constrained symmetric geometry of the positive chargestate in Table 7.1. After a full relaxation of the asymmetric distortion in thelatter case, the discrepancy vanishes, however. The agreement with the resultsby Kim and Chelikowsky [5] for the same system is very good, except for adifference of 0.1 eV for the neutral charge state. The origin of these deviationscannot be traced, because both groups of authors give very little informationabout their computational details, and the size of the corresponding Kohn–Sham eigenvalue gap is not stated. For all configurations with symmetric

7 Quasiparticle Calculations for Point Defects at Semiconductor Surfaces 167

Table 7.1. Calculated defect levels for the As vacancy at GaAs(110) in eV relativeto the valence-band maximum. For the positive charge state the first column refersto the asymmetric and the second to the constrained symmetric relaxation. Thequasiparticle band gap of 1.55 eV in this work, calculated at the theoretical latticeconstant 5.55 A, is close to the experimental value of 1.52 eV.

charge state (+1) (0) (−1)

LDA (this work) 0.70 (0.06) 0.13 0.24

LDA (Ref. [3]) 0.73 (0.41) 0.5

LDA (Ref. [5]) (0.06) 0.23 0.24

G0W0 (this work) 1.08 (0.59) 0.48

geometries the defect level moves steadily upwards with increasing electronpopulation. The asymmetric distortion has a large effect on the unoccupieddefect level and pushes it to significantly higher energies. Concomitant withthe opening of the fundamental band gap, the G0W0 approximation adds apositive quasiparticle correction to all defect levels and predicts larger valuesthan the LDA for all charge states. The size of the self-energy shift dependsboth on the charge state and the geometry, although the differences are of thesame order as the error bar of the calculation. We note that our numericalvalues differ slightly from those reported earlier in [15]. The discrepancy is dueto an improved description of the anisotropic screening in the slab geometryin this work but lies within the estimated overall error bar of the calculation.For the P vacancy at InP(110) we obtain a similar picture, although the defectlevels are at higher energies both in the LDA and the G0W0 approximation.

Unfortunately, due to experimental difficulties no direct measurements ofthe defect states by photoemission are available. In this situation one can onlytry to extract values from indirect methods like surface photovoltage imagingwith STM. One study using this technique claimed a value of 0.62±0.04eVfor the As vacancy in the positive charge state, based on the known positionof the sample’s Fermi energy 0.09eV above the valence-band maximum and

Table 7.2. Calculated defect levels for the P vacancy at InP(110) in eV relativeto the valence-band maximum. For the positive charge state the first column refersto the asymmetric and the second to the constrained symmetric relaxation. Thequasiparticle band gap of 1.52 eV in this work, calculated at the theoretical latticeconstant 5.81 A, slightly exceeds the experimental value of 1.42 eV.

charge state (+1) (0) (−1)

LDA (this work) 0.89 (0.39) 0.49 0.60

LDA (Ref. [7]) (0.326) 0.479 0.580

G0W0 (this work) 1.36 (0.91) 1.01

168 Arno Schindlmayr and Matthias Scheffler

a local band bending of 0.53 eV [69]. An STM measurement on a differentsample found a band bending of only 0.1 eV, however [12]. The origin of thisdiscrepancy is unclear but points to a strong influence of experimental con-ditions and/or sample quality. Because of this uncertainty, no experimentalvalues are included in the tables.

7.5 Charge-Transition Levels

The occurrence of different charge states is a direct consequence of the posi-tion of the defect level inside the fundamental band gap: in p-doped materialsthe defect state lies above the Fermi energy and is hence depopulated, whereasit is fully occupied in n-doped materials with a higher Fermi energy close tothe conduction-band edge. Indeed, a charge state of (+1) has been confirmedexperimentally for anion vacancies at p-GaAs(110) [70] and p-InP(110) [71]and a charge state of (−1) at n-GaAs(110) [72] and n-InP(110) [73]. In atheoretical treatment the stable charge state can be identified by comparingthe different formation energies

Eform(q, μA, εF) = Evac(q, Qq) + μA + qεF − Esurf , (7.16)

where we use the same notation for the total energy of the vacancy as in theprevious section. The chemical potential μA of the anion atoms is controlledby the partial pressure and temperature, εF denotes the Fermi level and Esurf

is the total energy of the defect-free surface. The qualitative behavior of theformation energies is illustrated in Fig. 7.5: due to their different slopes thestable charge state with the lowest formation energy changes from positiveto negative as the Fermi energy varies between the valence-band maximumand the conduction-band minimum. In between there may be a region wherethe neutral vacancy is stable. The charge-transition levels are defined as thevalues of the Fermi energy where the curves intersect and the stable chargestate changes. They are given explicitly by ε+/0 = Evac(0, Q0)−Evac(+, Q+)for the transition from q = +1 to q = 0 and ε0/− = Evac(−, Q−)−Evac(0, Q0)for the transition from q = 0 to q = −1.

All quantities in (7.16) are ground-state energies and can, in principle,be calculated within density-functional theory. As explained above, however,the parameterizations most commonly used in practical implementations lackessential properties of the exact exchange-correlation functional, so that thecharge-transition levels obtained in this way suffer from systematic errors.For a more accurate quantitative description we use the same trick as in(7.15) and rewrite ε+/0 as

ε+/0 = [Evac(0, Q0) − Evac(0, Q+)] + [Evac(0, Q+) −Evac(+, Q+)] (7.17)

by adding and subtracting the total energy Evac(0, Q+) of a configurationwith the atomic geometry of the positively charged vacancy but with one

7 Quasiparticle Calculations for Point Defects at Semiconductor Surfaces 169

VBM

∋+/0 ∋0/− ∋

CBMFermi energy

form

atio

n en

ergy

VA

+

VA

0

VA

Fig. 7.5. Qualitative behavior of the formation energies for anion vacancies in thepositive (V+

A), neutral (V0A), and negative (V−

A) charge state. The Fermi energyis limited by the valence-band maximum and the conduction-band minimum. Thecharge-transition levels ε+/0 and ε0/− mark the values of the Fermi energy wherethe stable charge state with the lowest formation energy changes.

extra electron. In this way the charge-transition level is naturally decom-posed into two distinct contributions. The first term on the right-hand sideis purely structural and describes the relaxation energy of the neutral sys-tem from the atomic structure optimized for the positive charge state to itsown equilibrium geometry. It is always negative and can be calculated withindensity-functional theory. We obtain −0.59 eV for GaAs(110) and −0.54 eVfor InP(110) when taking the asymmetric distortion into account. The sec-ond term on the right-hand side is purely electronic and equals the electronaffinity of the positively charged vacancy, which in a many-body frameworkcorresponds to the lowest unoccupied state, i.e., the empty 1a′′ defect level inthe band gap. This was already calculated within the G0W0 approximationin the preceding section and can be taken from Tables 7.1 and 7.2. In thesame spirit we rewrite ε0/− as

ε0/− = [Evac(−, Q−) −Evac(0, Q−)] + [Evac(0, Q−) − Evac(0, Q0)] . (7.18)

The first term now equals the ionization potential of the negatively chargedvacancy, corresponding to the highest occupied quasiparticle state. Againthis is the 1a′′ defect level, which in this case is filled with two electrons, andthe G0W0 results can be taken from the tables in the previous section. Thesecond term is the energy difference between the electrically neutral systemin the atomic structure optimized for the positive charge state and its ownrelaxed geometry. This contribution is always positive and can be obtainedfrom density-functional theory. We obtain 0.12eV for GaAs(110) and 0.08 eVfor InP(110). The structural component is much smaller than for ε+/0 becausethere is no symmetry-breaking distortion in this case, only a minor reductionof the Ga–Ga bond length across the vacancy.

Incidentally, the charge-transition levels within the LDA, which are usu-ally obtained according to the definitions ε+/0 = Evac(0, Q0) − Evac(+, Q+)

170 Arno Schindlmayr and Matthias Scheffler

Table 7.3. Calculated charge-transition levels for the As vacancy at GaAs(110) ineV relative to the valence-band maximum. For ε+/0 the first column refers to theasymmetric and the second to the constrained symmetric relaxation.

transition level ε+/0 ε0/−

LDA (this work) 0.24 (0.07) 0.15

LDA (Ref. [3]) 0.32 0.4

LDA (Ref. [5]) (0.10) 0.24

G0W0 (this work) 0.49 (0.32) 0.60

and ε0/− = Evac(−, Q−) − Evac(0, Q0) by evaluating the total-energy differ-ences directly, can also be decomposed into structural and electronic energycontributions. The latter are not given by the Kohn–Sham eigenvalues inTables 7.1 and 7.2, however. Instead, they must be calculated with the helpof the transition-state theorem [18] from intermediate configurations withhalf-integer occupation numbers of 1/2 and 3/2 electrons, respectively.

In Table 7.3 we summarize the results for the As vacancy at GaAs(110).In contrast to earlier studies [3,5] that found a small energy window in whichthe neutral charge state is stable, our own calculation at the level of the LDApredicts ε+/0 > ε0/− if the correct asymmetric distortion is taken into accountand hence a direct transition from the positive to the negative charge state,but the small energy difference is within the uncertainty of the calculation.The G0W0 approximation, on the other hand, reverses this ordering and si-multaneously moves all charge-transition levels to higher energies. The valuesfor the constrained symmetric relaxation are merely shown for the purpose ofcomparison with earlier work. The deviations from earlier LDA studies, espe-cially by Zhang and Zunger [3], are related to differences of similar magnitudein the defect states, which were already mentioned above.

The results for the P vacancy at InP(110) are given in Table 7.4. For thismaterial a quantitative experimental measurement of ε+/0, obtained witha combination of scanning tunneling microscopy and photoelectron spec-troscopy, is available [6]. Consistent with previous studies [6, 7], we findthat the LDA significantly underestimates the experimental value. The larger

Table 7.4. Calculated charge-transition levels for the P vacancy at InP(110).

transition level ε+/0 ε0/−

LDA (this work) 0.47 (0.39) 0.54

LDA (Ref. [6]) 0.52 (0.45)

LDA (Ref. [7]) 0.388 0.576

G0W0 (this work) 0.82 (0.79) 1.09

Expt. (Ref. [6]) 0.75±0.1

7 Quasiparticle Calculations for Point Defects at Semiconductor Surfaces 171

G0W0 result, however, lies within the error bar of the experimental measure-ment. We take this as a confirmation of our approach and an indicator thatthe other defect states and charge-transition levels calculated within the sameframework are also meaningful. Nevertheless, further calculations for differentsystems are necessary to establish the general validity of this scheme.

7.6 Summary

We have presented a parameter-free method for calculating defect states andcharge-transition levels of point defects in semiconductors. Compared to pre-vious studies that extracted these quantities directly from the self-consistentiteration of the Kohn–Sham equations, it apparently corrects important er-rors that are inherent in all jellium-based exchange-correlation functionalsand employs a separation of structural and electronic energy contributions.While the former are accurately obtainable in density-functional theory, weuse many-body perturbation theory and the G0W0 approximation for theself-energy to calculate the latter with proper quasiparticle corrections. Thescheme is general and can be applied to arbitrary bulk or surface defects. Asan example we examined the electronic structure of anion vacancies at the(110) surfaces of III–V semiconductors. For the As vacancy at GaAs(110)our calculation indicates that all three charge states including the neutralconfiguration are stable, in contrast to the LDA that predicts a direct tran-sition from the positive to the negative charge state. Due to a general lackof experimental data, a direct comparison between theoretical and experi-mental values is only possible for the charge-transition level ε+/0 of the Pvacancy at InP(110). In this case our calculation is in good agreement withthe experimental result and constitutes a clear improvement over previousLDA treatments. Nevertheless, besides an improvement of experimental tech-niques, further developments in theoretical and computational procedures arehighly desirable. Our method only opens the door to novel approaches; withpresent implementations the results are derived at a cost that may be infea-sible for more complex systems, and the numerical uncertainty of 01–0.2eVis often of the same order as the relevant energy differences. The study of ex-citons, which e.g. play an important role at GaAs(110) [74], further requiresan extension of the mathematical framework beyond single quasiparticles.

Acknowledgments

We thank Magnus Hedstrom, Gunther Schwarz, and Jorg Neugebauer forfruitful collaborations in the course of this work and Philipp Ebert for usefuldiscussions. This work was funded in part by the EU through the NanophaseResearch Training Network (Contract No. HPRN-CT-2000-00167) and theNanoquanta Network of Excellence (Contract No. NMP-4-CT-2004-500198).

172 Arno Schindlmayr and Matthias Scheffler

References

1. P. Ebert: Curr. Opin. Solid State Mater. Sci. 5, 211 (2001)2. V.P. LaBella, H. Yang, D.W. Bullock, P.M. Thibado, P. Kratzer, M. Scheffler:

Phys. Rev. Lett. 83, 2989 (1999)3. S.B. Zhang, A. Zunger: Phys. Rev. Lett. 77, 119 (1996)4. H. Kim, J.R. Chelikowsky: Phys. Rev. Lett. 77, 1063 (1996)5. H. Kim, J.R. Chelikowsky: Surf. Sci. 409, 435 (1998)6. P. Ebert, K. Urban, L. Aballe, C.H. Chen, K. Horn, G. Schwarz, J. Neugebauer,

M. Scheffler: Phys. Rev. Lett. 84, 5816 (2000)7. M.C. Qian, M. Gothelid, B. Johansson, S. Mirbt: Phys. Rev. B 66, 155326

(2002)8. M.C. Qian, M. Gothelid, B. Johansson, S. Mirbt: Phys. Rev. B 67, 035308

(2003)9. P. Hohenberg, W. Kohn: Phys. Rev. 136, B864 (1964)

10. W. Kohn, L.J. Sham: Phys. Rev. 140, A1133 (1965)11. J.P. Perdew, K. Burke, M. Ernzerhof: Phys. Rev. Lett. 77, 3865 (1996)12. G. Lengel, R. Wilkins, G. Brown, M. Weimer: Phys. Rev. Lett. 72, 836 (1994)13. A. Gorling, Phys. Rev. A 54, 3912 (1996)14. V. Fiorentini, M. Methfessel, M. Scheffler: Phys. Rev. B 47, 13353 (1993)15. M. Hedstrom, A. Schindlmayr, M. Scheffler: phys. stat. sol. (b) 234, 346 (2002)16. L. Sorba, V. Hinkel, H.U. Middelmann, K. Horn: Phys. Rev. B 36, 8075 (1987)17. B. Reihl, T. Riesterer, M. Tschudy, P. Perfetti: Phys. Rev. B 38, 13456 (1988)18. J.C. Slater: Adv. Quantum Chem. 6, 1 (1972)19. J.F. Janak: Phys. Rev. B 18, 7165 (1978)20. C.-O. Almbladh, U. von Barth: Phys. Rev. B 31, 3231 (1985)21. J.P. Perdew, M. Levy: Phys. Rev. Lett. 51, 1884 (1983)22. L.J. Sham, M. Schluter: Phys. Rev. Lett. 51, 1888 (1983)23. A. Gorling, M. Levy: Phys. Rev. A 50, 196 (1994)24. A. Gorling: Phys. Rev. B 53, 7024 (1996)25. M. Stadele, J.A. Majewski, P. Vogl, A. Gorling: Phys. Rev. Lett. 79, 2089

(1997)26. R.W. Godby, M. Schluter, L.J. Sham: Phys. Rev. B 37, 10159 (1988)27. T. Kotani: J. Phys. Condens. Matter 10, 9241 (1998)28. J.P. Perdew, A. Zunger: Phys. Rev. B 23, 5048 (1981)29. D. Ceperley, B. Alder: Phys. Rev. Lett. 45, 566 (1980)30. L. Kleinman, D.M. Bylander: Phys. Rev. Lett. 48, 1425 (1982)31. M. Bockstedte, A. Kley, J. Neugebauer, M. Scheffler: Comput. Phys. Commun.

107, 187 (1997)32. M. Fuchs, M. Scheffler: Comput. Phys. Commun. 119, 67 (1999)33. J.L.A. Alves, J. Hebenstreit, M. Scheffler: Phys. Rev. B 44, 6188 (1991)34. Landolt-Bornstein: Numerical Data and Functional Relationships in Science

and Technology – New Series, vol III/41A, ed by O. Madelung, U. Rossler,M. Schulz (Springer Berlin Heidelberg 2001)

35. G.D. Mahan: Many-Particle Physics, 3rd edn (Springer Berlin Heidelberg 2000)36. L. Hedin: Phys. Rev. 139, A796 (1965)37. W. von der Linden, P. Horsch: Phys. Rev. B 37, 8351 (1988)38. G.E. Engel, B. Farid: Phys. Rev. B 47, 15931 (1993)39. M.S. Hybertsen, S.G. Louie: Phys. Rev. B 34, 5390 (1986)

7 Quasiparticle Calculations for Point Defects at Semiconductor Surfaces 173

40. I.D. White, R.W. Godby, M.M. Rieger, R.J. Needs: Phys. Rev. Lett. 80, 4265(1998)

41. M. Rohlfing, N.-P. Wang, P. Kruger, J. Pollmann: Phys. Rev. Lett. 91, 256802(2003)

42. O. Pulci, F. Bechstedt, G. Onida, R. Del Sole, L. Reining: Phys. Rev. B 60,16758 (1999)

43. A. Schindlmayr, P. Garcıa-Gonzalez, R.W. Godby: Phys. Rev. B 64, 235106(2001)

44. B. Holm, U. von Barth: Phys. Rev. B 57, 2108 (1998)45. W.-D. Schone, A.G. Eguiluz: Phys. Rev. Lett.: 81, 1662 (1998)46. W. Ku, A.G. Eguiluz: Phys. Rev. Lett. 89, 126401 (2002)47. T. Kotani, M. van Schilfgaarde: Solid State Commun. 121, 461 (2002)48. M.L. Tiago, S. Ismail-Beigi, S.G. Louie: Phys. Rev. B 69, 125212 (2004)49. K. Delaney, P. Garcıa-Gonzalez, A. Rubio, P. Rinke, R.W. Godby: Phys. Rev.

Lett. 93, 249701 (2004)50. C. Friedrich, A. Schindlmayr, S. Blugel, T. Kotani: to be published51. E.L. Shirley, X. Zhu, S.G. Louie: Phys. Rev. B 56, 6648 (1997)52. R.W. Godby, M. Schluter, L.J. Sham: Phys. Rev. B 35, 4170 (1987)53. X. Zhu, S.G. Louie: Phys. Rev. B 43, 14142 (1991)54. M. Rohlfing, P. Kruger, J. Pollmann: Phys. Rev. B 48, 17791 (1993)55. X. Zhu, S.B. Zhang, S.G. Louie, M.L. Cohen: Phys. Rev. Lett. 63, 2112 (1989)56. O. Pulci, G. Onida, R. Del Sole, L. Reining: Phys. Rev. Lett. 81, 5374 (1998)57. S.J. Jenkins, G.P. Srivastava, J.C. Inkson: Surf. Sci. 352–254, 776 (1996)58. M.M. Rieger, L. Steinbeck, I.D. White, H.N. Rojas, R.W. Godby: Comput.

Phys. Commun. 117, 211 (1999)59. L. Steinbeck, A. Rubio, L. Reining, M. Torrent, I.D. White, R.W. Godby:

Comput. Phys. Commun. 125, 105 (2000)60. H.J. Monkhorst, J.D. Pack: Phys. Rev. B 13, 5188 (1976)61. A. Schindlmayr: Phys. Rev. B 62, 12573 (2000)62. J. Neugebauer, M. Scheffler: Phys. Rev. B 46, 16067 (1992)63. P. Eggert, C. Freysoldt, P. Rinke, A. Schindlmayr, M. Scheffler: Verhandl. DPG

(VI) 40, 2/491 (2005)64. N.D. Lang, A.R. Williams: Phys. Rev. B 18, 616 (1978)65. P.J. Braspenning, R. Zeller, A. Lodder, P.H. Dederichs: Phys. Rev. B 29, 703

(1984)66. M. Scheffler, C. Droste, A. Fleszar, F. Maca, G. Wachutka, G. Barzel: Physica

B 172, 143 (1991)67. J. Bormet, J. Neugebauer, M. Scheffler: Phys. Rev. B 49, 17242 (1994)68. G. Fratesi, G.P. Brivio, L.G. Molinari: Phys. Rev. B 69, 245113 (2004)69. S. Aloni, I. Nevo, G. Haase: Phys. Rev. B 60, R2165 (1999)70. K.J. Chao, A.R. Smith, C.K. Shih: Phys. Rev. B 53, 6935 (1996)71. P. Ebert, X. Chen, M. Heinrich, M. Simon, K. Urban, M.G. Lagally: Phys. Rev.

Lett. 76, 2089 (1996)72. C. Domke, P. Ebert, K. Urban: Phys. Rev. B 57, 4482 (1998)73. P. Ebert, K. Urban, M.G. Lagally: Phys, Rev. Lett. 72, 840 (1994)74. O. Pankratov, M. Scheffler: Phys. Rev. Lett. 75, 701 (1995)

8 Multiscale modelling of defects in

semiconductors: a novel molecular dynamics

scheme

Gabor Csanyi1, Gianpietro Moras2, James R. Kermode3, Michael C.Payne4, Alison Mainwood5, Alessandro De Vita6

1 Cavendish Laboratory, University of Cambridge, Madingley Road, CB3 0HE,United Kingdom [email protected]

2 Department of Physics, King’s College London, Strand, London, UnitedKingdom [email protected] and Cavendish Laboratory, Universityof Cambridge, Madingley Road, CB3 0HE, United Kingdom [email protected]

3 Cavendish Laboratory, University of Cambridge, Madingley Road, CB3 0HE,United Kingdom [email protected]

4 Department of Physics, King’s College London, Strand, London, UnitedKingdom [email protected]

5 Department of Physics, King’s College London, Strand, London, UnitedKingdom, DEMOCRITOS National Simulation Center and CENMAT-UTS,Trieste, Italy alessandro.de [email protected]

8.1 Introduction

Over the last twenty years, the improvement in computational modellingof materials problems has been remarkable. This is due to the significantincrease in the capacity and speed of computers matched (and arguably sur-passed) by the ingenuity of those who write the computational codes. It wouldbe invidious to do other than refer the reader to other papers in this volumeto support these assertions. However, these advances are also challenged bythe complexities of systems that need to be modelled.

There has been a great deal written about different scales of modelling,usually where the results of calculations on one scale are used to determinethe parameters that are used to model the material at a larger scale. Manyof the embedding approaches are designed in this way. Additionally, wheredynamic processes are being modelled, it may be useful to treat differenttime scales by different means. A combination of molecular dynamics andquantum mechanical modelling using a first principles technique for example,can model only a few hundred atoms for a period of a few picoseconds. Usingmore approximate methods, one may be able to extend the number of atomsto hundreds or thousands, or the time period by a few orders of magnitude.In order to extend the time period to the several seconds for some biologicalprocesses, or years or aeons for geological ones, the activation energies andreaction pathways from the static or picosecond MD calculations have tobe inserted into Monte Carlo models or analytical rate equations and theirvariants.

176 Csanyi et al.

In this paper, we suggest a selection of problems where the complexityof such an approach makes it difficult or impossible to treat every lengthscale with a separate calculation, largely because the physical processes onthe various length scales are strongly coupled to each other. We then go onto describe a method which allows some of these complex problems to betackled.

8.2 A hybrid view

Radiation or implantation damage

An important technique for doping semiconductors is to implant the dopantwith an energy (or range of energies) that determines the location of thedoped area or layer. However, in this process damage in the form of vacan-cies and interstitials and their complexes are formed, some of which can beelectrically active. Annealing may heal most of the damage (e.g. [1]), but in-variably some defects remain and can be detected by photoluminescence andpositron annihilation spectroscopy [2]. Intrinsic defects have very low activa-tion energies for diffusion, so they may migrate long distances before they aretrapped by dopants, or other radiation–induced defects. The same problemsarise when the radiation damage of silicon particle detectors is modelled. Avery comprehensive empirical analysis of the processes involved in the an-nealing of damage was undertaken by Huhtinen [3], but he did not attemptto understand the microscopic processes. However, his analysis showed thatsome long range migration of radiation damage products and displaced im-purities was occurring. It is well within the current state of the art to modelthe dynamics of the initial impact of a particle or ion with a silicon atom in asmall finite cluster (or periodic supercell), and the subsequent displacementof that atom. If the damage trail formed by the particle, the knock-on atomand any further damage products can all be contained in the finite cluster orsupercell, then a reasonable model of the full process can be built up. Morecommonly, though, one or more of the damage products may exit the clusterin a direction not easily predicted from the initial conditions, or the intrin-sic defects may migrate longer distances than the dimensions of the largestclusters. The problem is then to construct a method to follow the importantspecies such that their interactions are treated with the required level of ac-curacy, while keeping the whole problem to within acceptable computationallimits.

Point defect diffusion

A very similar problem is the annealing of point defects in semiconductors,where an additional complication is the chemical interaction between theimpurity and the defects in the host lattice. As an example, atomic hydro-gen in silicon has a stable site in a bond–centered position, which is only

8 Multiscale modelling of defects in semiconductors 177

stable when the bond is greatly elongated—a self–trapped site [4]. It alsohas a metastable site at the tetrahedral site, where the migration barrier isvery low. To model the migration, a full molecular dynamics model mustbe used, but it must also be capable of following the long–range migrationof the hydrogen through the tetrahedral site routes. For hydrogen, a propermodel using quantum mechanics for the proton as well as for the electronsis necessary, and has been developed in a limited way [5], but the principlesillustrated by this example apply to the diffusion of many other impuritiesand defects. Classical interatomic potentials are available for some of the sim-pler chemical systems, but they tend to be constructed for bonding situationswhich are close to equilibrium, and have rather more questionable validitywhere the local environment of the host–defect complex is greatly distorted.

Both the relative stability and the rate of migration of point defects canbe altered by the presence of strain fields and strain gradients which areubiquitous in semiconductor systems. While the point defects typically alterthe local atomic structure only on a scale of a nanometer, the strain fieldsof epitaxial layers extend much further due to lattice mismatch. Dislocationsalso give rise to a slow-decaying strain field. This means that while the quan-tum mechanical treatment of new bonding arrangements near each individualdefect can be accomplished using a few hundred atoms, the environment inwhich the diffusion and possible interaction of the defects takes place needsto be represented by tens of thousands of atoms or more.

Dislocation motion

The strength of real materials is dominated by the behaviour of its dislo-cations. Around the core of a dislocation, there is a 1–D region in whichthe bonding of the host is distorted. Once there is a kink in the dislocation,by which means it moves, the distortions become much more pronounced,and this part of the crystal must be modelled by particularly accurate tech-niques, see Figure 8.1 for a schematic view. Moreover, the movement of thekink means that this highly distorted region also moves, which typically in-volves bond breaking and forming. More complexity arises when dislocationsinteract with each other or with grain boundaries, and they may also besources (or sinks) for point defects.

Grain boundaries

Nanocrystalline silicon and other semiconductors are becoming more andmore technologically important. Similarly, the crystal quality of many of thenewer wide band gap semiconductors used in electronics (the nitrides, siliconcarbide, diamond) is not nearly as perfect as silicon—it is no longer acceptableto ignore dislocations and grain boundaries when trying to understand thedegradation or limitations of their electronic properties. The exact role thatgrain boundaries play in plasticity is far from being completely understood.In particular, the grain boundaries need to be investigated as sources and

178 Csanyi et al.

Fig. 8.1. Schematic view of a screw dislocation with a double kink. In covalentmaterials, dislocations move by the formation of double kinks. Gray circles approx-imately indicate the region in which active rebonding takes place

sinks for dislocations. Significant departure from the equilibrium crystallinestructure is present. This takes the form of a mixture of locally disorderedbonding and longer range elastic distortion—a situation which needs a mul-tiscale description. Furthermore, the grain boundaries can act as traps fordopants, electrons and holes, adding to the local chemical complexity. Onceagain, although the longer range interactions are well described by empiricalpotentials, the description of the local chemistry requires quantum mechani-cal accuracy.

Fracture

Fracture is perhaps the textbook example of a multiscale problem. The im-portance of understanding both the catastrophic failure of brittle materialsand the conditions under which normally ductile materials become brittlecannot be overemphasized. Classical continuum models of the stress field doan excellent job in predicting the enhancement of stress near the crack tip.However, the divergence of components of the stress tensor at the tip implythat elasticity theory must break down in this region since it is not applica-ble when distortions go beyond the harmonic range. Indeed, during failurethe interatomic bonds are stretched well into the anharmonic region and ul-timately are broken. In covalent systems this leaves dangling bonds, whoseenergetics can only be described by a fully quantum mechanical treatment.Moreover, the influence of impurities and corrosive agents on failure processeshas never been explored theoretically at this level of detail, yet there is ev-idence for the critical role they play, e.g. in fracture of silicon notches in a

8 Multiscale modelling of defects in semiconductors 179

Fig. 8.2. Displacement field of a propagating crack under uniaxial tension in theopening mode (analytical continuum solution)

wet environment [6, 7] and in the controlled fabrication of thin Si films byhydrogen implantation [8–10].

The key aspect of brittle fracture that makes it amenable to hybrid mod-elling is that the propagating crack is atomically sharp [11]. Thus, althoughthe crack surface that is opening up is two dimensional, the active region,where bonds are being broken, is a one dimensional line, perpendicular tothe direction of propagation. The strongly coupled multiscale aspect of crackpropagation comes about because the opening crack gives rise to a stress fieldthat has a singularity (in the continuum approximation), diverging as 1/

√r

where r is the distance ahead of the crack tip, and in turn, it is this largestress that breaks the bonds thus advances the crack forward. The extensionof the stress field is nevertheless very large. If we wish to represent all theatoms that contribute significantly to elastic relaxation as the stress field ad-vances, we need to include several tens of thousands of atoms. Only then canwe hope to be able to quantitatively predict critical loading levels.

8.3 Hybrid simulation

The general pretext in a hybrid simulation is that we are going to use twodifferent physical models to describe the system. It is understood that one ofthese is accurate enough to describe

the physics of the activated regions properly, which, in practice, meanswe will use some sort of quantum mechanical description, e.g. a tight bind-ing (TB) formalism, or indeed density functional theory (DFT); henceforth

180 Csanyi et al.

this first model will be referred to as the QM model. Using such sophisti-cated modelling comes at a price, in the very real terms of computing costs.With the currently available hardware, it is typically possible to simulatethousands of atoms using TB, and hundreds of atoms using DFT for tensof picoseconds. These are evidently modest capabilities if we are faced withany of the above problems, and hence we need to employ a second model,to be used in parts of the system that are not activated at a given instantin time. This second model should capture correctly the basic topology ofbonding and the response to small deformations, while at the same time berelatively inexpensive to compute. empirical atomistic models fit the bill per-fectly. It should be noted that the recent progress in the field of empirical“ball and spring” type modelling has been in the direction of refining the an-alytical forms in an attempt to extend their range of applicability [12–14,64].However, once we have decided to employ a hybrid technology there is littlebenefit in complicating this second, classical model. It is more important thatit is as robust as possible and uses only a small number of free parameterssince in any particular situation where the classical model would fail we willuse the QM model.

In theory, one could consider not just two models, but a whole hierarchyof ever more coarse grained descriptions of physical systems. The next one upfrom classical atomistic models would be a continuum finite element model. Inour experience, the cost ratio between the QM and the classical models is soextreme, that for all intents and purposes the classical atoms can be regardedas “free” in most applications. If we are at liberty to consider millions ofclassical atoms, the necessity of a continuum description is less pressing.The standard technical tools of classical molecular dynamics can thus beused on this large, atomistically modelled region to impose the correct elasticor thermal boundary conditions of the problem under study. The boundaryregions between the QM and classical zones need, on the contrary, a specialtreatment and a separate discussion.

Boundary problems

It is not hard to recognize that the biggest challenge in designing a hybridsimulation scheme is to overcome the problems associated with the artificialboundary that separates regions of the system that are described by thetwo different models. In fact there are two distinct issues that have to beaddressed. The first one generally goes under the name of termination, andit concerns the accurate computation of forces on atoms near the boundary. Ifwe attempt the most naive partitioning and simply omit the classical atomsfrom the quantum mechanical calculation and vice versa, the atoms closeto the boundary will behave like those near an open surface. This is clearlywrong. To get accurate forces on these atoms, we should “trick” the systeminto believing that these surface atoms are really bulk atoms, while avoiding

8 Multiscale modelling of defects in semiconductors 181

implicit inclusion of all the atoms from the other region (which is the pointof the hybrid simulation in the first place).

The next simplest solution is to use so–called “termination atoms”, whichis standard practice in the hybrid simulation schemes of biological systems(called “QM/MM” models [16]). Here, atoms from the region we are presentlynot considering are also taken away, but wherever covalent bonds are bro-ken by this removal, a monovalent atom (typically a hydrogen) is placed tosaturate the dangling bond. For the classical model, this procedure can beseen to be sufficient. The classical description of covalent bonding is verynear–sighted, i.e. the energy of the “last” classical atom can be essentiallyperfectly recovered using the terminator atom technique. Moreover, since theclassical total energy is usually expressed as a simple sum of atomic energies,the terminator atom can simply be excluded from this sum. It has to be notedthat if we are dealing with a strongly ionic material, even our classical

model will not be so near–sighted, and the problems that are usuallyassociated with the termination of the quantum region in standard QM/MMtreatments apply to the classical part as well.

Terminating the quantum mechanical subsystem is more tricky. Quantummechanics is not a nearest–neighbour model, so even if our free surface isterminated by monovalent terminator atoms, the atoms close to the boundarywill feel an artificial environment. Secondly, it is more or less impossiblein any QM scheme to exclude the terminator atoms from the total energyin a consistent fashion, so we have just traded artificial surface atoms forartificial terminator atoms—not an altogether satisfactory situation. Thesetwo problems are solved by the same trick: we stop worrying about the totalenergy and concentrate on obtaining accurate forces on the atoms in the QMregion, which is achieved by allowing a thicker termination region, ratherthan just terminator atoms. In other words, we include a shell of nominallyclassical atoms (a thickness of about a nanometer is sufficient in silicon)in the QM calculation, which ensures that the forces that we compute forthe QM atoms are accurate. Now in contrast to the total energy, the forcesare local quantities, so it is trivial to exclude the forces on atoms in the“contaminated” termination region from the subsequent calculations: fromthe QM calculation, we only keep the forces on atoms in the original QMregion, not the termination region.

Mechanical matching

And thus we squarely arrive at the second major issue associated with theboundary. Suppose we have computed forces with the desired method in eachregion, we now have to propagate the system forward in time along its tra-jectory. On the two sides of the boundary, we originally resolved to computewith different models, but these yield different, and incompatible trajectoriesfor the atoms on the two sides. The set of forces we just computed (from twodifferent models) are not the derivatives of some total energy and do not nec-

182 Csanyi et al.

essarily add up to zero for an isolated system. If we simply plugged them intoan integrator formula (e.g. velocity Verlet [17]), the trajectory may be unsta-ble. The solution to this problem forms the basis of our recently developedmethod [18–20] called “Learn on the fly” (LOTF): instead of using the setof forces directly, we consider a simplistic universal “ball and spring” model,but allow every spring to be different, and also to change with time. At everytime step, the set of forces we computed using the classical and quantummodels are used as targets in an optimization of the spring constants, whichare tuned until the forces derived from the universal model match their tar-gets. The trajectory integrator is then used on the universal model, whoseforces are now consistent across the whole system. The universal model in ef-fect “interpolates” across the boundary, closely matching the target on bothsides, while maintaining global consistency.

In practice, a number of simplifications immediately arise. The targetforces do not have to be computed at every time step, as the universal modelcan be considered an instantaneous interpolator not just in space, but alsoin time and so it can be used with unchanged spring constants for a fewtime steps without incurring a significant deviation from the true hybridtrajectory. More importantly however, the scheme is greatly simplified if theuniversal model is chosen to be the classical model itself! Thus an alternativeand complementary point of view is that we take the classical model andremove the constraints from its parameters, allowing them to be different foreach atom. We then tune them in the QM region to incorporate the quantummechanically accurate force information.

Another important consequence of this scheme is that we are at liberty tomove the QM region in space without upsetting the stability of the simulation.The universal model will adapt to the target forces at each optimization,regardless of where and how we obtained the set of target forces. The use ofa thick transition region to pad the quantum mechanical calculation meansthat as we shift the quantum region in space together with its padding theHamiltonian in the nominal quantum region changes smoothly.

8.4 The LOTF scheme

The universal potential

We now describe in detail the LOTF scheme which is based on the aboveideas. The choice of the universal force model that we will use to interpolatethe quantum and classical models is crucial. There are two obvious choices.One, already alluded to above, is to use the classical model itself. This meansthat there would only be two force models in our scheme, a classical modelwith variable parameters and a quantum model. Most of the applications inthe following sections have been carried out using this approach. An alterna-tive and more flexible choice is to pick a universal force model that offers a

8 Multiscale modelling of defects in semiconductors 183

1.5 2 2.5 3 3.5 4−3

−2

−1

0

1

2

r [A]

eV

−1 −0.5 0 0.5 10

20

40

60

80

100

cos θ

eV

Fig. 8.3. Examples of cubic spline potentials for the two body (top) and threebody terms (bottom). The large dots represent spline knots, the red dotted linesare the corresponding functions of the Stillinger–Weber potential [46].

good compromise between expressive power and robustness. In particular, wewill be changing the model parameters on the fly, so we would like the poten-tial form to change smoothly as we change these parameters. In our searchfor the parameter set optimised at any given point in the simulation, we willmake use of gradient search techniques, so the derivative of the potentialwith respect to its parameters should be straightforward and easy to evalu-ate. Keeping the bond lengths and bond angles as fundamental coordinates,a cubic spline functional form fits the above criteria [21].

For the two body term, we can take a spline with four knots (e.g. atr = {1.5, 2.0, 2.5, 3.5} A for Si, for other elements scaled appropriately by theatomic radius). The spline function is completely determined by its valuesat the knot points and its derivatives at the two endpoints. We fix boththe value and the derivative at the outer endpoint to be zero, and at theinner endpoint to match the corresponding values of the quantum model fora dimer. This leaves two free parameters, the value of the spline at the twoinner knots. Keeping these two values negative guarantees the existence of asingle minimum in the spline function between the endpoints.

For the three body term, we could keep the traditional quadratic form inthe cosine of the bond angle θ, but with a variable curvature and positionof the minimum. To allow more flexibility, and in particular, an asymmetricangular dependence, we can again take a cubic spline of cos θ, with two knotsat −1 and 1, the free parameters are then the values and derivatives at theendpoints. As long as we restrict the derivatives to have the appropriate sign,

184 Csanyi et al.

we can guarantee that the spline will have a single minimum in (−1, 1) forany set of values of the parameters.

Figure 8.3 shows a pictorial illustration of the splines. Since the splinefunctions are linear functions of the values at the knot points, our free pa-rameters, evaluating the splines and their derivatives at fixed atomic positionsfor different parameter values is trivial and fast.

Parameters

Since we want to optimize potential parameters associated with atoms closeto the quantum region, the choice of initial values is important. Unless theelastic constants of our classical and quantum models match closely, acousticwaves will partially reflect from the artificial quantum/classical boundary.Therefore, the starting values of classical parameters have to be set to matchthe elastic constants of our quantum model. During the dynamics, as thequantum region moves, we have two choices for atoms which are no longer inthe quantum region: either we reset the classical parameters to their initialvalues, or we leave them with last fitted parameters. Which is best dependson the application. In the case of defect diffusion, once the defect leaves aparticular area, it makes sense to reset the parameters. In other cases, if thephysical effects in the

quantum region have resulted in some permanent topological change, e.g.the opening of a new surface, possibly followed by surface reconstruction, wemight want to keep the last fitted parameters, as they could give a better de-scription of the new topology than the original parameters. It seems difficultto formulate an optimal strategy for the general case.

Algorithm

The following sequence of steps constitutes the hybrid scheme, a graphicalillustration is shown in Figure 8.4.

1. Initialize universal force model parameters.2. Extrapolate the atomic trajectory for n steps using fixed model param-

eters.3. Identify atoms which need quantum treatment. This is done using geo-

metric and topological criteria, e.g. atoms are selected if they are under–or overcoordinated. For each new application, a suitable selection cri-terion has to be designed, which captures the relevant processes. Theuser-defined goal of the calculation may determine the exact recipe usedhere. Indeed in calculations involving surface chemical reactions in a slabgeometry, we may choose to treat quantum mechanically only one sur-face of the slab, thus effectively halving the accurately rendered area butdoubling the simulation time for a given computer budget.

4. Quantum mechanical calculations are carried out to obtain accurateforces on the selected quantum atoms.

8 Multiscale modelling of defects in semiconductors 185

α(λ)=α + 6(α − α )(λ /3 − λ /2)

α1

10

R’2 α

R’1

α0

3

0

2

0 αR

( , )

( , )

0

α

α11R

2R

( , )

2

( , )

α( , )0

1

Fig. 8.4. Predictor–corrector style parameter fitting. The black and red lines repre-sent the predictor and corrector part of the trajectory, respectively. The gray circlesdenote the assumed domain of validity of the newly fitted parameters at each fitpoint (black dots).

5. Optimize the force model parameters to match the target forces bothin the quantum and classical regions.

6. Interpolate the trajectory over the previous n steps between the oldand new parameters. This achieves a smooth evolution of parameters intime.

7. Back to 2.

8.5 Applications

Point defects

We now present a series of applications of the LOTF scheme in silicon.In all cases, the classical model is that of Stillinger-Weber [46] and, unlessstated otherwise, the quantum model is empirical tight binding with variousparametrizations [24, 98]. The first application is more of a quantitative testof the algorithm, rather than a real application which necessitates a hybridmethodology. We consider the diffusion of point defects in crystalline Si, andshow that the LOTF scheme, using a spherical QM region around the movingdefect, recovers the same diffusion coefficient as that calculated from a fullyquantum mechanical trajectory, while both are significantly different fromthe case of a purely classical trajectory. Figure 8.5 shows the diffusivity as afunction of temperature for a vacancy in silicon calculated in three differentways. The LOTF simulations have been performed with two different QMpackages (different orthogonal tight binding parametrizations), which—not

186 Csanyi et al.

0.7 0.8 0.9 1−6

−5.5

−5

−4.5

−4

1000/T [K]

log 10

D [c

m2 /s

ec]

Stillinger−Weber

Kwon et al.

Lenosky et al.

Fig. 8.5. Vacancy diffusion rate in Si as a function of inverse temperature, ascomputed using fully quantum (black), classical(blue) and hybrid(red) models. Theunit cell had 63 atoms.

0 10 20 30 40 50 60 70 80 90 100time [ps]

0

5

10

15

20

msd

[nm

²]

TB (T=1030±34 K)corrected LOTF (T=977±23 K)LOTF (T=757±23 K)SW (T=1020±44 K)

Fig. 8.6. Mean square displacement of a H atom diffusing in Si. Quantum andclassical runs are represented by black, blue lines, respectively. The green curvecorresponds to the hybrid simulation with a Nose–Hoover thermostat, which isunable to maintain, by itself, the correct kinetic temperature of the H atom. Oncethis is corrected for results match the diffusion rate of the fully quantum mechanicalsimulation (red line).

surprisingly—give different values. Critically however, the LOTF simulationreproduces the correct curve, corresponding to whichever QM package wasused to get the target forces. Further details can be found in [19] and [20].

8 Multiscale modelling of defects in semiconductors 187

Figure 8.6 shows the mean square displacement of a hydrogen atom, dif-fusing in crystalline silicon at 1000 K. This test shows up an interestingaspect of the LOTF simulation. Because the algorithm retunes the classicalmodel continuously, it effectively uses a time–dependent Hamiltonian, andthus we can no longer define a total energy which is a constant of motion. Aslow (local) heating or cooling of the systems may occur during microcanonicsimulations, or in constant temperature simulations using an inappropriatelytuned thermostat, in spite of the forces being all the time accurate within afew percent. The problem is, however, lifted in the canonic ensemble if a suit-able thermostat is used to stabilize the dynamics. In the present case, withtwo different atomic species and a large mass ratio (∼ 28), a simple velocityrescaling thermostat is unable to maintain the correct kinetic temperaturefor both the hydrogen and the silicon atoms. This problem can be correctedwith the use of a more sophisticated thermostat. Figure 8.7 shows the tem-perature evolution of the same system, where instead of a real QM engine,LOTF was used to track the classical forces, but the force on the H atomswas scaled by 0.9. The same behaviour is observed as shown in Figure 8.6:without a thermostat, the system is unstable. With a simple velocity rescal-ing thermostat, the kinetic temperature of the hydrogen is different from thesystem (in this case, much larger), while using a Langevin thermostat [25]eliminates this discrepancy. The Langevin thermostat adds a dissipative anda fluctuating term to the forces and these are precisely balanced to achievethe desired temperature. In contrast to other schemes, the Langevin thermo-stat does not rely on efficient coupling of phonons of the hydrogen and thebulk, but couples to each particle directly and separately.

Dislocation glide

To demonstrate a simple application of the hybrid scheme in a system whichdefinitely needs quantum mechanical accuracy yet is already too large foran ordinary simulation, we consider the glide of partial dislocations. Siliconpartial dislocations move by forming kinks that zip along the dislocation line.The most prevalent partial dislocation in silicon is the 30◦ partial, and thekink that most easily forms and migrates on it is the left–kink [26].

Figure 8.8 shows a pair of such partials, oppositely directed, so that thewhole unit cell is periodic, and the partials enclose a stacking fault. The unitcell is also skewed slightly, thus forcing the existence of a left kink on each par-tial. In a molecular dynamics simulation at high temperature (900–1100 K),the partials move toward each other, and eventually annihilate, leaving be-hind a number of scattered point defects (mostly four–fold–coordinated de-fects [27]). There are a number of salient features of the LOTF simulationthat are worth pointing out and which are not observed in a purely classicalsimulation. Firstly, the glide is an order of magnitude faster in the hybridsimulation. This does not just indicate a lower energy barrier for kink migra-tion, but the configurations that are most prevalent in the hybrid simulation

188 Csanyi et al.

0

2000

4000

T

No thermostatSystem THydrogen T

0

2000

4000

T

Velocity rescaling

0 5 10 15 20 25 30 35 40 450

1000

2000

Time /ps

T

Langevin

Fig. 8.7. Temperature evolution of a hydrogen atom in a 64 atom silicon cell usingdifferent thermostats with the hybrid scheme. The top panel shows that in such asmall system, there is a significant energy drift due to the time dependent Hamilto-nian. The middle panel shows that a simple velocity rescaling thermostat that, likethe Nose–Hoover, couples to the entire system via one extra degree of freedom tothe entire system, is unable to maintain the correct kinetic temperature for the lightH atom. The bottom panel shows that a more sophisticated Langevin thermostat,which couples to each atom individually, maintains the correct temperature for allspecies of atoms.

are very different from those in the classical case. The most striking exampleis the equilibrium state of the left kink itself, shown in Figure 8.9. Althoughthe equilibrium configurations in the quantum and classical models agree atzero temperature, at high temperature, the quantum mechanical free energyminimum corresponds to a kink configuration with a square and an ejectedanti–phase defect. During the classical simulation, the kink spent a largeproportion of the time trapped in a metastable state which had the effect ofpinning the partial.

A unique advantage of a modular hybrid scheme with a black–box quan-tum engine is that it is easy to investigate the effect of chemical defects.Swapping the empirical tight binding quantum engine for density functionaltight binding (DFTB) [28] enables us to simulate the interaction of the partialwith dopants without having to worry about constructing a classical modelfor a new type of atom. If we start the simulation after placing a boron atomat the position shown in in green Figure 8.8, the right hand partial is pinnedand remains completely stationary, while the left hand partial advances asbefore.

8 Multiscale modelling of defects in semiconductors 189

Fig. 8.8. A pair of oppositely directed 30◦ partial dislocations enclosing a stackingfault. The complete unit cell consists of 4536 atoms, but only the (111) planecontaining the stacking fault is shown. The red (Si) and green (B impurity) atomsare undercoordinated, so they and the surrounding blue (Si) atoms are treatedquantum mechanically.

Fig. 8.9. Detail of the dislocation kink, showing the configuration that correspondsto equilibrium at zero temperature (dotted lines), and the what is prevalent at 900 Kin the hybrid simulation.

Brittle fracture

In recent years, the problem of brittle fracture of covalent materials has beena prototypical problem addressed using hybrid methodology [29–31]. Fig-ure 8.10 depicts the von–Mises strain around a stationary crack tip, as cal-culated using the Stillinger–Weber potential by optimizing the positions of

190 Csanyi et al.

Fig. 8.10. Von–Mises strain field around a stationary crack tip in the openingmode in silicon.

the interior atoms, while holding the top and bottom rows of atoms fixed,corresponding to a given load in the opening mode. The stress concentratesnear the tip and when the load reaches a critical level, the most strained bondgives way and the crack propagates forward. While analytical values for therequired loading for crack propagation are easy to compute in the continuumapproximation, it is well known that the discreteness of the atomic latticegives rise to a barrier to this process, called lattice trapping [33, 34, 75]. Theheight of this barrier depends sensitively on the atomistic model employedand classical models typically overestimate this barrier, resulting in a muchhigher critical loading. The barrier associated with the Stillinger–Weber po-tential in particular is so high that, when the crack finally does propagate, somuch elastic energy is released, that the opening surfaces roughen, the cracktip blunts, emits dislocations, resulting in a reduction of the local stress fieldand an arrest to crack propagation: in other words, the ductile behaviouris incorrectly predicted. An example of such “ductile” crack propagation isshown in Figure 8.11. In contrast, real cracks in silicon at low temperatureare atomically flat and propagate continuously above the critical load. Thisis reproduced by the hybrid simulation, a snapshot of which is shown in Fig-ure 8.12. This simulation at 300 K shows the 5–7–5 Pandey reconstruction ofthe opening (111) surfaces. The motion of the crack tip is tracked with thequantum region by noting which atoms have changed their number of nearestneighbours since the start of the simulation, and treating all atoms quantummechanically within 5 A of these.

8.6 Summary

To conclude this overview, we reiterate that a large class of problems inthe area of semiconductor mechanical properties are inherently multiscale,

8 Multiscale modelling of defects in semiconductors 191

Fig. 8.11. Snapshot of ‘ductile’ crack propagation using the Stillinger–Webermodel. Defects are produced in the crack opening region and dissipate energy;the onset of cracking is not sharp.

and that significant advances in computer simulation will only be made byaddressing the different length scales directly and simultaneously. We haveintroduced a promising new scheme that scales well to large system sizes andthat deals with the intricacies of a hybrid simulation with fully controllableapproximations. The scheme is very general, and can be extended to dealwith arbitrary classical force-field potentials, one foreseen extension being,in particular, in the direction of modeling long range (classical electrostatic)interactions.

References

1. L. Pelaz, L. A. Marques, J. Barbolla, J. Appl. Phys. 96, 5947 (2004).2. M. Bruzzi et al, Nucl. Instr. Meth. A 514, 189 (2005)3. M. Huhtinen, Nucl. Instr. Meth. A 491, 194 (2002)4. S. K. Estreicher, Mat. Sci. Eng. Reports, 14, 319 (1995).5. A. Kerridge, A. H. Harker and A. M. Stoneham, J. Phys. Cond. Mat. 16, 8743

(2004).6. C. L. Muhlstein, E. A. Stach, R. O. Ritchie, Acta Mater. 50, 3579 (2002).7. H. Kahn, R. Ballarini, J. J. Bellante, A. H. Heuer, Science 298, 1215 (2002).8. J. Du, W. H. Ko, D. J. Young, Sensor. Actuator. A 112, 116 (2004).9. P. Nguyen, I. Cayrefourcq, K. K. Bourdelle, A. Bougassol,E. Guiot, N. Ben Mo-

hamed, N. Sousbie and T. Akatsu, J. Appl. Phys. 97, 083527 (2005).10. J. Weber, T. Fischer, E. Hieckmann, M. Hiller and E. V. Lavrov,J. Phys.: Con-

dens. Matter 17, S2303 (2005).11. B. Lawn, Fracture of brittle solids (Cambridge University Press, Cambridge,

1993).12. A.C.T. van Duin et al.: J. Phys. Chem. A 105,9396 (2001).

192 Csanyi et al.

Fig. 8.12. Snapshot from brittle crack propagation, using the hybrid method at300 K. Red atoms have been flagged as “active” because they have changed theirneighbour–count since the start of the simulation, and together with the blue atoms,are treated quantum mechanically. In contrast to the ductile case, the crack prop-agates continuously above critical loading.

13. S.J.Stuart et al.: J. Chem. Phys. 112, 6472 (2000).14. D.G.Pettifor et al.: Comp. Mat. Sci.23,33 (2002).15. D.W.Brenner: Phys. Stat. Solidi B 217,23 (2000).16. Theochem, Special Issue Combined QM/MM Calculations in Chemistry and

Biochemistry, 632 (2003)17. W.C. Swope, H.C. Andersen, P.H. Berens, and K.R. Wilson, J. Chem. Phys.

76, 637 (1982).18. A.De Vita and R. Car, MRS Proc. 491, 473 (1998).19. Gabor Csanyi, T. Albaret, M. C. Payne and A. De Vita: Phys. Rev. Lett. 93

175503 (2004)20. Gabor Csanyi, T. Albaret, G Moras, M. C. Payne and A. De Vita: J. Phys.

Cond. Mat. 17, R691 (2005).21. W.H. Press, S. A. Teukolsky, W. T. Vetterling and B. P. Flannery, Numerical

Recipes in C, 2nd ed. (Cambridge University Press, Cambridge, 1992).22. F Stillinger and T Weber: Phys.Rev.B 31, 5262 (1985).23. I Kwon et al.: Phys. Rev. B 49, 7242 (1994).24. T J Lenosky et al.: Phys. Rev B 55, 1528 (1997).25. D. Quigley and M. I. J. Probert: J. Chem. Phys. 120, 11432, (2004).26. V. V. Bulatov et al.: Philos. Mag. A 81, 1257 (2001).27. S. Goedecker, T.Deutsch and L. Billard: Phys. Rev. Lett. 88 235501 (2002).28. Th. Frauenheim et al.: Phys. Stat. Sol. B217,41 (2000).

8 Multiscale modelling of defects in semiconductors 193

29. N Bernstein: Europhys. Lett. 55, 52 (2001).30. F F Abraham et al.: Comp.Phys. 12,538 (1998).31. S Ogata et al.: Comp. Phys. Comm. 138, 143 (2001).32. R.Perez and P. Gumbsch: Phys. Rev. Lett. 84,5347 (2000).33. P. Gumbsch, Brittle Fracture and the Breaking of Atomic Bonds in ‘Materials

Science for the 21st Century’ (The Society of Materials Science, JSMS, Japan,2001), Vol. A, pp.50-58

34. N Bernstein and D W Hess: Phys. Rev. Lett. 91, 025501 (2003).

9 Empirical molecular dynamics: Possibilities,

requirements, and limitations

Kurt Scheerschmidt

Max Planck Institute of Microstructure Physics, Weinberg 2, D-06120 Halle,Germany [email protected]

9.1 Introduction: Why empirical molecular dynamics ?

Classical molecular dynamics (MD) enable atomistic structure simulations ofnanoscopic systems and are, in principle, a simple tool to approach the many-particle problem. For given interatomic or intermolecular forces one has tointegrate the Newtonian equations of motion assuming suitable boundaryconditions for the box containing the model structure. There are at leasttwo advantages of this technique. The molecular dynamics is deterministicand provides the complete microscopic trajectories, i.e., the full static anddynamic information of all particles is available, from which a large numberof thermodynamic and mechanically relevant properties of the models can becalculated. Further, one can perform simulations which are macroscopicallyrelevant with the present computational power of even desktop computers.With reasonable computational effort models of nanoscale dimensions canbe treated for several millions particles and up to microseconds of real time.Thus, empirical MD has two main fields of application: The search for theglobal energetic minima by relaxing nanoscopic structures and the calculationof dynamical parameters by analyzing the lattice dynamics.

Computer simulations are performed on models simulating the realityby using approximations, reduction, localization, linearization, etc., the va-lidity of which has to be critically evaluated for each problem considered.As sketched in Fig. 9.1, increasing the time and length scales of the mod-els (which is necessary to increase the macroscopic validity and robustnessof the calculations) requires an increasing number of approximations andthus to a reduction in the ability to predict some physical properties. Den-sity functional theory (DFT) and its approximations (e.g. local density LDA,generalized gradient GGA) , and the different kinds of tight-binding treat-ments (TB) up to the bond order potential (BOP) approximations startwith the Born-Oppenheimer (BO) approximation, thus decoupling the ionicand electronic degrees of freedom. Additional approximations such as pseu-dopotentials, gradient corrections to the exchange-correlation potential, andincomplete basis sets for the single-particle states are required for DFT calcu-lations. The specifics of the various methods are discussed in various chaptersin the present book and in a number of review articles on DFT, TB, linearscaling techniques and programs such as SIESTA and CASTEP [1–9]. In

196 Kurt Scheerschmidt

first-principle MD [10], e.g., using DFT or DFT-TB, the electronic systemis treated parameter free and the resulting Hellmann-Feynman forces arethe glue of the ionic interactions. Such simulations are computationally tooexpensive for large systems.

Fig. 9.1. Length and time scales in various modeling methods: DFT = densityfunctional theory, TB = tight-binding approximations, BOP = bond order poten-tials, MD = empirical molecular dynamics, MC, CGM = Monte-Carlo/conjugate-gradient techniques, FEM = finite element methods, CM = continuum mechanics.

Figure 9.1 shows that the empirical MD closes the gap between first princi-ples, macroscopic, and continuum techniques (FEM = finite element methods,CM = continuum mechanics). The latter neglect the underlying interatomicinteractions but allow the description of defects, defect interactions, diffu-sion, growth processes, etc. Stochastic Monte-Carlo techniques (MC) enablethe further increase of the time scales, also at the first-principles level, butwithout access to the dynamics. On the other hand, static energy minimiza-tion (e.g. conjugate gradient methods, CGM) enable a drastic increase in themodel size. However, this does not necessarily provide the global minimumof the potential energy surface. The largest model dimensions (number ofatoms) accessible to empirical MD (or MC, CGM) has approximately thesame extension as the smallest devices in microelectronics and micromechan-ics. Thus, today’s atomistic modeling approaches the size of actual nanoscalesystems.

9 Empirical Molecular Dynamics 197

(a) (c)

(b) (d)

Fig. 9.2. Interface structures at 90◦ twist boundaries (a: Pmm(m)-layer, b: 42m-dreidl, cf. text in sect. 9.4.2), predicted by empirical MD and revealed by ab initioDFT simulations yielding semimetallic/isolating behavior as function of the inter-face bonding state as demonstrated by the corresponding DFT band structures (c)and (d), respectively.

The assumption of the existence, validity, and accuracy of known em-pirical interatomic potentials or force fields in analytic form always ignoresthe underlying electronic origin of the forces, i.e. the quantum structure ofthe interactions (cf. sects. 9.2.4). Therefore, it is important to have betterapproximations, such as the bond order potential (BOP), which is devel-oped from tight-binding approximations and discussed briefly in sect. 9.3.2.Other possibilities to include electronic properties of the interaction consistin continuously re-fitting an empirical potential during the calculation, calledlearning on the fly (see chapter 9 and Ref. [11]). Separated subsystems can betreated at an ab initio level. Suitable handshaking methods can be designedto bridge the embedded subsystem with its surrounding which is treatedsemi-classically (cf. [12–15]). However, well-constructed potentials describingsufficiently accurate physical properties also give physical insights and enablea thorough understanding of the underlying processes [2].

Fig. 9.2 shows a simple example demonstrating the difference in the ap-proximation levels. Using empirical MD the correct structural relaxation ofspecial interfaces created by wafer bonding can be treated, as described laterin sect. 9.4.2. Two configurations exist, a metastable one (Fig. 9.2 a) andthe global minimum structure shown in Fig. 9.2 b, called dreidl [16]. Theelectronic properties demand ab initio level simulations, done here using asmaller periodic subunit of the interface and applying DFT techniques. It is

198 Kurt Scheerschmidt

shown that the higher level approximation reproduce the structure predictedby the semiempirical techniques, whereas the correct energies and electronicproperties (the band structure and the semi-metallic or isolating behavior ofthe interfaces as shown in Figs. 9.2 c,d) can only be described using the DFTformalism.

A second kind of embedding problem occurs because even millions of par-ticles only describe a small part of reality which, even for smaller pieces ofmatter, is characterized by Avogadro number (6×1023 mol−1). Since isolatedsystems introduce strong surface effects, each model has to be embedded insuitable surroundings. For a discussion of various types of boundary condi-tions, see Section 9.3.1, Chapter 9 and Refs. [13–15].

The fundamental requirements of classical MD simulations, the relationto statistical methods and particle mechanics, suitable integration and em-bedding techniques and the analysis of the trajectories are presented in sec-tion 9.2. The enhancements of potentials (bond-order potentials) and bound-ary conditions (elastic embedding) are discussed in section9.3. Selected appli-cation of semi-empirical MD (relaxation of quantum dots and wafer bondedinterfaces) are in sect. 9.4 together with some examples from the literature.They strongly depend on the approximations assumed in the simulations.

9.2 Empirical molecular dynamics: Basic concepts

The main steps for applying empirical molecular dynamics consist of the inte-gration of the basic equations (cf. sect. 9.2.1) using a suitable interaction po-tential and embedding the model in a suitable surrounding (cf. sects. 9.2.4 and9.2.3, respectively). Textbooks of classical molecular dynamics, e.g. [17–23],describe the technical and numerical details, and provide a good insight intopossible applications and the physical properties, which may be predictable.Here, only the main ideas of empirical MD simulations, viz. the basic equa-tions, methods of numerical treatment and the analysis of the trajectoriesare discussed.

9.2.1 Newtonian equations and numerical integration

The basic equations of motion solved in empirical MD are the Newtonianequations for N particles (i = 1, ..., N) characterized by their masses mi,their coordinates ri, and the forces acting on each particle f i. These mayinclude external forces F i and interatomic interactions f i =

∑j �=i f ij:

miri = f i = −∂V(r1, r2, ..., rN )∂ri

. (9.1)

The assumption that a potential V exists presupposes conservative (non-dissipative) interactions and needs some more general considerations, cf.

9 Empirical Molecular Dynamics 199

sect. 9.2.2. If one assumes pairwise central potentials, V(r1, r2, ..., rN ) =∑′i,jV(rij) with rij = |rj − ri| (the dash means that the sum is restricted

to i < j), it follows that the virial of the external and internal forces isM =

∑′i,jrijf ij =

∑iriF i. If the system is isolated (microcanonical en-

semble), the conservation of total energy E = K + V (with kinetic energyK =

∑imiri

2) is guaranteed:

E = K + V =∑

i

miri · ri −∑

i

rif i = 0. (9.2)

Given an initial configuration of particles and suitable boundary condi-tions (cf. sect. 9.2.3), the differential equations can be integrated using oneof the standard methods, e.g. Runge-Kutta techniques, Predictor-Correctormethods, Verlet or Gear algorithm. The numerical integration is equivalentto a Taylor expansion of the particle positions ri(t + δt) at a later time interms of atomic positions ri and velocities vi. The forces f i are the timederivatives of ri with an increasing order at previous time steps:

ri(t+ δt) = ri(t) + a1δtvi(t) + a2(δt)2f i(t) + ...+ O(δtn). (9.3)

In addition to good potentials and system restrictions (cf. sects. 9.2.3 and9.2.4, respectively) an efficient and stable integration procedure is requiredto accurately propagate the system. The conservation of energy during thesimulation is an important criterion. An increase in order allows larger timeincrements δt to be used, if the evaluation of the higher derivatives is not tootime consuming, which happens with more accurate potentials, like the BOP(9.3.2). The efficiency and the accuracy of the integration can be controlledby choosing suitable series expansion coefficients aj, j = 1, 2, ..., n−1 and theorder n of the method or by mixing the derivatives of ri at different timesin Eq. 9.3 to enhance the procedure. However, the increment δt, of the orderof the fs, must be at least so small that the fastest particle oscillations aresufficiently sampled.

Better or faster MD calculations may be performed using special acceler-ation techniques, the three most important methods being:

1) localization: By using the linked cell algorithm and/or neighbor lists,it is assumed that for sufficiently rapidly decaying potentials (faster thanCoulomb 1/r for r→ ∞) only a small number of particles have a direct andsignificant interaction. A cutoff rc is defined and the interaction potential isassumed to be zero for r > rc. A transition region is fitted using splines orother suitable functions. Then the system is divided into cells. Their minimumdimension is given by 2rc to avoid self-interaction of the particles (minimumimage convention). Only the interactions of the atoms within a cell and its26 neighboring cells are considered, which reduces the simulation time drasti-cally from the ∝ N2 behavior for an all particle interactions to linear behavior∝ 729cNN , where cN is the average number of particles per cell. The prob-lem is to find a suitable rc for a smooth transition and screening behavior

200 Kurt Scheerschmidt

which includes a sufficient number of next-neighbor atoms and cells. One alsoneeds a suitable criterion to update the neighbor lists and reorder the cellswhenever particles leave a cell during the propagation of the system.

The Coulomb potential may also be screened. However, summations overthe long range 1/r potential, representing infinite point charge distributions,are only conditionally convergent. Thus, it is better to apply the Ewaldmethod (originally developed to calculate cohesive energies and Madelungconstants [24]) and its extensions [25, 26] based on successive charge neutral-ization by including next neighbor shells around the origin.

2) Parallelization: Using replicas, or dividing the structure into severalparts, which are then distributed to different processors, thus allowing par-allelization [27, 28]. The replica technique needs suitable criteria for dividingthe system into small parts with minimum interaction. More importantly,one must bear in mind that the replicas are not independent over long times.A careful control of the time interval after which the communication be-tween the different parts is required in order to achieve the same results asnon-parallelized simulations. This issue is the bottleneck of the technique.

3) Time-stretching: Such techniques as hyperdynamics, temperature ac-celeration, basin-constrained dynamics, on-the-fly Monte-Carlo, and othersare subsumed briefly here, because their common idea consists in replacingthe true time evolution by a shorter one increasing the potential minima,transition frequencies, system temperature, etc.(see Ref. [28]).

9.2.2 Particle mechanics and non equilibrium systems

In classical mechanics a system is characterized either by its LagrangianL(q, q) = K(q, q) − V(q) or its Hamiltonian related to the Lagrangian by aLegendre-transform H(q, p) =

∑qipi−L = K+V(q), where qi,pi are gener-

alized coordinates and momenta (conjugate coordinates), respectively, whichhave to be independent or unrestricted. One derives the generalized momentafrom the derivatives of the Lagrangian pi = ∂L

∂qi, whereas the derivatives of

the Hamiltonianqi =

∂H∂pi

, pi = −∂H∂qi

(9.4)

reproduce Newton’s law of motion9.1.Hamilton’s principle of least action enables a simple generalization of

the mechanics of many particle systems. The integral over the Lagrangianfunction has to be an extremum. Variational methods yields applying theextremal principle:

d

dt

∂L∂qi

− ∂L∂qi

= 0. (9.5)

The advantage of the general formulation 9.5 is that it allows a simple exten-sion to non-conservative systems by including explicit time dependence or tosystems with constraints, such as fixed bond lengths in subsystems, friction

9 Empirical Molecular Dynamics 201

of particles, outer forces, etc. If there are holonomic constraints describingrelations between the coordinates fl(qk) = 0, one gets the generalized addi-tional coordinates alk = ∂fl/∂qk and al = ∂fl/∂t creating an additional termon the right hand side of Eq. (9.5) in the form

∑l λlalk with the set of La-

grange multipliers λl. This formalism is the basis for the enhancement of theboundary conditions in sect. 9.3.1 and the handshaking methods mentionedabove.

In addition, the Lagrangian and Hamiltonian formalisms correlate clas-sical dynamics to statistical thermodynamics and to quantum theory, andallow the evaluation of properties from the trajectories (cf. sect. 9.2.5). Theset of time-dependent coordinates q (the configuration space) and time-dependent momenta p (momentum space) together is called phase-spaceΓ = (q,p) = (q1, q2, ..., qn,p1,p2, ...,pn). Presenting Γ in a 6-dimensionalhyperspace yields N trajectories, one for each molecule, and allows the studyof the behavior of the system using statistical methods. It is called the μ-space(following Ehrenfest) or the Boltzmann molecule phase space. The whole Γ -set as one trajectory in a 6N -dimensional hyperspace presents the Gibbsphase space, with the advantage to include better interactions and restric-tions. However, the system must now be described as a virtual assembly forstatistical relations. The phase space Γ contains the complete information onthe microscopic state of the many particle system. Several basic quantitiesmay be derived such as the phase space flow or the ‘velocity field’ Γ = (q, p).Applying the relations 9.4 yields the Liouville equation

∇ΓΓ =∑

(∂qi

∂qi+∂pi

∂pi) = 0, (9.6)

showing that the flow Γ behaves like an incompressible liquid, i.e. althoughthe μ-trajectories are independent, their related phase space volume is con-stant.

An equivalent formulation of the Liouville statement 9.6 is the equationof continuity for Γ , ρ, and the local change of the density ρ(q,p, t), simi-lar to Heisenberg equation for observables in quantum theory. The densityρ(q,p, t) describes the probability of finding the system within the region p, qand p+dp, q+dq of the phase space volume dpdq. Because the systems mustbe somewhere in the phase space, the density can be normalized in the entirephase space. The integral over the whole phase space of the non-normalizeddensity yields the partition function. Integration over (N-k) particles yield thek-particle phase space density ρ(k)(q,p, t) =

∫ρ(N)(q,p, t)dp(N−k)dq(N−k),

and the integration over the whole momentum space the corresponding k-particle distribution function n(k)(q) =

∫ρ(N)(q,p, t)dq(N−k)dp. Very im-

portant for the trajectory analysis (sect. 9.2.5) are the cases k = 1, 2 definingthe (radial) pair-distribution functions g(qi, qj):

n(1)(qi)n(1)(qj)g(qi, qj) = n(2)(qi, qj). (9.7)

202 Kurt Scheerschmidt

The phase-space trajectories allow the classification of the dynamics ofmany particle systems. They may be Poincare-recurrent (the same phasespace configuration occurs repeatedly), Hamiltonian (non-dissipative), con-servative (no explicit time dependence for the Hamiltonian), or integrable(number of constants of motion equals the number of degrees of freedom, re-sulting in stable periodic or quasiperiodic systems). Non-integrable systemsmay be ergodic or mixing etc., i.e., the trajectory densely covers differenthypersurfaces in the phase space and allow the characterization of differentkinds of instabilities of the system.

9.2.3 Boundary conditions and system control

As mentioned above, even for systems considered large in MD simulations, thenumber of atoms is small compared to real systems and therefore dominatedby surface effects. They are caused by interactions at free surfaces or withthe box boundaries. In order to reduce non-realistic surface effects, periodicboundary conditions are applied, or the box containing the model (supercell)has to be enlarged so that the influence of the boundaries may be neglected.For further discussion and enhancements including elastic embedding, seeSec. 9.3.1. Periodic boundary conditions means, the supercell is repeatedperiodically in all space directions with identical image frames or mirrorcells. In contrast to the case of fixed boundaries, under periodic boundaryconditions the particles can move across the boundaries. If this happens, thepositions r have to be replaced by r−α where α is a translation vector to theimage frame to which the particle is moved. Particle number, total mass, totalenergy, and momenta are conserved in periodic boundary conditions, but theangular momenta are changed and only an average of them is conserved. Itshould be mentioned here that periodic boundary conditions repeat defects,which in small supercells create high defect concentrations. Anti-periodic andother special boundary conditions may be chosen, where additional shifts ofthe positions and momenta along the box borders allow to correct a disturbedperiodic continuation and describe reflective walls at free surfaces.

According to the boundary conditions and system restrictions, the virtualGibbs entities are either isolated microcanonical ensembles with constantvolume, total energy, and particle numbers (NVE ensembles) or systems thatexchange and interact with the environment. Closed ensembles (e.g. canonicalNVT or isothermic-isobar NPT systems) have only an energy exchange with athermostat, whereas open systems have in addition particle exchange with theenvironment (e.g. grand-canonical TμV, where μ is the chemical potential).

Simple tools exist to equilibrate systems at a well-defined temperature.For example, velocity rescaling is numerically equivalent to a simple exten-sion of the original microcanonical system with non-holonomic constraints inthe Lagrangian formalism and is called the Berendsen thermostat. A sim-ilar simple extension may be used for pressure rescaling. It is called theBerendsen thermostat [29] and works conserving the Maxwell distribution

9 Empirical Molecular Dynamics 203

and yielding maximum entropy. However, such non-isolated ensembles arebetter described using constraints in the Lagrangian similar to those dis-cussed in Sec. 9.2.2. The extension of L in equation 9.5 by adding termsLconstr introduces additional generalized variables having extra equations ofmotion and also additional force terms in the Newtonian equations 9.1. Someof the most important methods for Lconstr are given below without furthercomments:

– Nose-bath and generalized Nose-Hoover thermostat [30–32]: Lconstr =Ms2/2 − nfT ln(s) with the fictive or the whole mass M =

∑mi, the

degrees of freedom nf = 3N + 1, and the new generalized variable splaying the role of an entropy.

– Andersen isothermic-isobaric system control (NPH ensemble) Lconstr =V 2/2 +PV introduces volume and pressure as generalized variables [33].

– Generalized stresses according to Parrinello and Rahman [34] NTLσ-ensembles: the Lagrangian is extended by Lconstr = −1/2TrΣG with ageneralized symmetric tensor Σ and the metric tensor G of the crys-tal structure. A further specialization, e.g. for constant strain ratesNTLxσyyσzz-ensemble [35], is in principle a mixture of the isothermic-isobaric system with the generalized stress constraint.

– Brownian fluctuations, transport and flow processes, density and othergradients, Langevin damping ∝ v(t), etc. demand non-equilibrium MD[36, 37], which is mostly done including the perturbation as a suitablevirtual field F(p, q, t) into the equations of motion via the Lagrange for-malism.

9.2.4 Many body empirical potentials and force fields

Empirical potentials and force fields exist with a wide variety of forms, andalso different classification schemes are used according to their structure, ap-plicability, or physical meaning. It makes no sense to describe a large numberof potentials or many details in the present review, therefore only some of theexisting and most used potentials are briefly listed below (a good overviewcan be found in the MRS-Bulletin from February 1992 and the special issueof Phil. Mag. B 73, 1996). A classification by Finnis [2] (and many referencestherein) describes recent developments and discusses in an excellent way thejustification of the potentials by first principle approximations, which is im-portant for the physical reliability and the insight into the electronic structureof potentials. Interatomic forces are accurate only if the influence of the lo-cal environment according to the electronic structure is included. The mostimportant property of a potential is transferability, that is the applicabilityof the potential to varied bonding environments. An important additionalcriterion is its ability to predict a wide range of properties without re-fittingthe parameters. The number of fit parameters decreases as the sophisticationof the force fields increases. According to [2] one has:

204 Kurt Scheerschmidt

Pair potentials - Valid for s-p bonded metals and mostly approximated bya sum of pairwise potentials. They may be derived as the response to aperturbation in jellium, which can be visualized in the pseudoatom pictureas an ionic core and a screening cloud of electrons.Ionic potentials - To use Coulomb interactions directly for ionic structures,the problem of screening (Ewald summation, Madelung constant) has to beconsidered as mentioned above. The interactions were originally described byBorn and others as the rigid ion approximation. Starting from the Hohenberg-Kohn-Sham formulation of the DFT-LDA, shell or deformable ion modelsmay be developed beyond the rigid ion approximation, where the additionalshell terms [38] look like electron density differences in noble gases.Tight binding models - Different derivations and approximations for TB-related potentials exist and are nowadays applicable to semiconductors, tran-sition metals, alloys and ionic systems. The analytic bond order potential issuch an approximation and will be discussed in more detail in sect. 9.3.2.Hybrid schemes - Combinations of pair potentials with TB approxima-tions are known as generalized pseudopotentials, effective media theories(EMT [39]), environmental dependent ionic potentials (EDIP [40]), or em-bedded atom models (EAM [41]); for details cf. [2].

From the empirical point of view the simplest form of potentials maybe considered to be a Taylor series expansion of the potential energy withrespect to 2-, 3-,..., n- body atomic interactions. Pair potentials have a shortrange repulsive part, and a long range attractive part, e.g. of the Morseor Lennard-Jones (LJ) type, mostly ‘12-6’, i.e. ar−12 − br−6. LJ-potentialsare successfully applied to noble gases, biological calculations, or to modellong range van der Waals interactions (e.g. [42]). For quasi crystals [43] anLJ potential was constructed with two minima in a golden number distancerelation. However, simple pair potentials are restricted in their validity tovery simple structures or to small deviations from the equilibrium. Thereforemany-body interactions are added and fitted for special purposes, e.g. the MDof molecules and molecule interactions [44,45]. Separable 3-body interactionsare widely used: Stillinger-Weber [46] (SW), Biswas-Hamann [47], and Takai-Halicioglu-Tiller [48] potentials. The SW is perhaps the best known 3-bodytype potential. It includes anharmonic effects necessary to reproduce thethermal lattice expansion of Si and Ge [49]. Hybrid force fields are sometimesused to include the interaction of different types of atoms, such as Born-Mayer-Huggins, Rahmann-Stillinger-Lemberg terms and others applicable tosilicate glasses and interdiffusing metal ions or water molecules [50–54].

As mentioned above, other force fields are developed from first princi-ple approximations which combine sufficient simplicity with high rigor. Theyare not based upon an expansion involving N-body interactions (cluster po-tentials) . The more or less empirical forms of TB-potentials and effectivemedium force fields are the modified embedded atom model (MEAM, [55]for cubic structures and references therein for other structures), the Finnis-

9 Empirical Molecular Dynamics 205

Sinclair (FS) [56] and the Tersoff type (TS) potentials [57–59]. The TS po-tential is an empirical bond-order potential with the functional form:

V (rij) = ae−λrij − bije−μrij . (9.8)

The bonds are weighted by the bond order bij = F (rik, rjk, γijk) includingall next neighbors k �= i, j, which gives the attractive interaction the formof an embedded many body term. The different parameterizations (TI, TII,TIII) of the Tersoff potential have been intensively tested. Other parame-terizations exist [60]. They involve other environmental functions and firstprinciple derivations [61, 62], as well as extensions to include further interac-tions, H in Si [63], C, Ge [64], C-Si-H [65, 66], AlAs, GaAs, InAs, etc [67].Finally, a re-fitted MEAM potential with SW terms is available for Si [68].multipole expansions replaced by spherical harmonics [69] are an alternativeto TS-potentials.

A comparative study of empirical potentials shows advantages and dis-advantages in the range of validity, physical transparency, fitting and accu-racy as well as applicability [70]. Restrictions exist for all empirical potentialtypes, even if special environmental dependencies are constructed to enhancethe elastic properties near defects. In addition, not all potentials are applica-ble to long range interactions, and the electronic structure and the nature ofthe covalent bonds can only be described indirectly. Thus, it is of importanceto find physically motivated semi-empirical potentials, as mentioned aboveand discussed in Sect.9.3.2 for TB-based analytic BOP’s. The parameters ofthe empirical force field have to be fitted to experimental data or first princi-ple calculations. First, the cohesive energy, lattice parameter and stability ofthe crystal structures have to be tested or fitted. The bulk modulus, elasticconstants, and phonon spectra are very important properties for the fit. Thefollowing section describes some of the quantities which may be used for thefit or to be evaluated from the MD simulations if not fitted. Very impor-tant details concerning point defects and defect clusters - necessary to getthe higher order interaction terms - are given by the energy and structure ofsuch defects.The data may be given by DFT or TB dynamics or geometryoptimizations, as e.g. [71–74].

9.2.5 Determination of properties

Static properties of systems simulated by empirical MD can be directly cal-culated from the radial- or pair distribution function 9.7. Dynamic propertiesfollow from the trajectories using averages or correlations:

〈A〉t = limt→∞

1t− t0

t∫t0

dtA(r(t)), C(t) = 〈A(τ )B(t + τ )〉τ , (9.9)

which in principle correspond to time averages. However, they are sampled atdiscrete points, so that it is necessary to choose suitable sampling procedures

206 Kurt Scheerschmidt

to reduce the effects of the finite size of the system, stochastic deviations,and large MD runs.

Two basic relations are central for the analysis of the properties. Theergodic hypothesis states that the ensemble average 〈A〉e is equal to thetime average 〈A〉t, which relates the averages to the measurement of a sin-gle equilibrium system. The Green-Kubo formula limt→∞

〈[A(t)−A(t0)]2〉t

t−t0=∫∞

t0dτ (〈A(τ )A(t0)〉e) relates mean square deviations with time correlations.

The diffusion coefficient D = limt→∞ 16Nt〈

∑[rj(t) − rj(0)]2〉 (Einstein rela-

tion) is equivalent to the velocity autocorrelation function, which is a specialform of the Green-Kubo formula D = 1

3N

∫∞0 〈

∑vj(t) · vj(0)〉. Similarly one

uses the cross-correlation of different stress components to obtain shear vis-cosities. Other transport coefficients may be derived analogously.

In thermodynamic equilibrium the kinetic energy K per degree of freedomis determined by the equipartition theorem K = 〈

∑imiv

2i /2〉 = 3NkBT/2

(kB=Boltzmann constant) which yields a measure of the system temperatureT. The strain tensor σ and the pressure P are obtained from the generalizedvirial theorem σkl = 1/V [

∑j vjk · vjl +

∑ij rijk · f ijl]. Thus, the pressure

is the canonical expectation value P = 1/3V [2K−M] of the total virial M.Alternatively, one can use the pair distribution in the form P = ρkBT −ρ2∫∞

0 g(r)∂U∂r 4πr3dr, which may be useful for correcting sampling errors.

Using the densities ρ(r,p) as defined above and Z =∫ρ(r,p)dr as nor-

malization, statistical mechanics deals with ensemble averages, which in gen-eral are written as

〈Q〉e = 1/Z∫Q(r,p)ρ(r,p)dr . (9.10)

All thermodynamic functions may be derived from the partition func-tion Z, for example the free energy F (T, V, N) = −kBT ln(Q(T, V, N)).In addition, one can obtain all the thermodynamic response coefficients.With the internal energy U, one computes the isochoric heat capacityCV = (∂U/∂T )N,V and thus a measure for the quality of the temperatureequilibration (〈T 〉2 −〈T 2〉)/〈T 2〉 = 3/2N(1− 3kN)/2cV . From the volume Vfollows the isothermal compression χT = −1/V (∂V/∂P )N,T , etc. One has tochoose according to the different ensembles:

– Microcanonical: ρNV E = δ(H(r,p) − E) conserving entropy S(E,N,V).– Canonical: ρNV T = e(H/kBT ∝ eV (r)/kBT conserving free energy F(T,V,N)– Isothermic-isobar: ρNPT = e(H−TS)/kBT conserving Gibbs free energy

G(T,P,N)– Grand canonical: ρTμV = e(H−μN)/kB T conserving Massieu function

J(T,μ,V) (Legendre transformation in entropy representation).

Finally it should be mentioned that the Fourier transform of pair distribu-tions is connected to the scattering functions in X-ray, neutron and electron

9 Empirical Molecular Dynamics 207

diffraction. MD-relaxed structure models allow the simulation of the trans-mission electron microscope (TEM) or high resolution electron microscope(HREM) image contrast and therefore make the contrast analysis more quan-titative. For this purpose, snapshots of the atomic configurations are cut intothin slices, which are folded with atomic scattering amplitudes and each otherto describe the electron scattering (multi-slice formulation of the dynamicalscattering theory), cf. the applications in sect. 9.4 using this technique tointerpret HREM investigations of quantum dots and bonded interfaces.

9.3 Extensions of the empirical molecular dynamics

Coupling of length and time scales in empirical MD means bridging the firstprinciples particle interactions and the box environment (see 9.1). It can bedone either using embedding and handshaking or by a separate treatment anda transfer parameter between the subsystems. MD simulations of the crackpropagation [75] and the analysis of submicron MEMS [76] are successfullyapplications of the FEM-coupling between MD and an environmental con-tinuum. In sect. 9.3.1 enhanced boundary conditions for MD are discussed,where the coupling between MD and an elastic continuum is a handshakingmethod based on an extended Lagrangian [77, 78]. The main steps in thedevelopment of an analytic TB based BOP [79] are sketched in sect. 9.3.2 asan example for using enhanced potentials.

9.3.1 Modified boundary conditions: Elastic embedding

Elastic continua may be coupled to MD when the potential energy of aninfinite crystal with a defect as shown in Fig. 9.3a is approximated in theouter region II by generalized coordinates ak [78]:

E({ri}, {rj}) = E({ri}, {ak}). (9.11)

In the defect region I, characterized by large strains, the positions of atomsri (i = 1, ..., N) are treated by empirical MD. The atomic positions rj (j > N)in the outer regions II and III result from the linear theory of elasticity

rj = Rj + u(0)(Rj) + u(Rj, {ak}), (9.12)

where the fields u0(R) and u(R) describe the displacements of atoms fromtheir positions Rj in the perfect crystal, and satisfy the equilibrium equationsof a continuous elastic medium with defects; u0(R) is the static displacementfield of the defect and independent of the atomic behavior in I; u(R) isrelated to the atomic shifts in region I and can be represented as a multipoleexpansion:

u(R, {ak}) =∞∑

k=1

akU(k)(R) (9.13)

208 Kurt Scheerschmidt

over homogeneous eigensolutions U(k)(R) [77]. The U(k) are rapidly decreas-ing with R, thus the sum in Eq. 9.13 is truncated to a finite number K ofterms.

Fig. 9.3. Dislocation geometry (a, I= MD-region, II=elastic, III=overlap, cf. text)to apply elastic boundary conditions for an dipole of 60◦ dislocations, and snapshotsduring MD annealing: 500K (b), 600K (c), 0K (d).

The equilibrium positions of atoms in the entire crystal are obtained fromthe minimization of the potential energy given in Eq. (9.11) with respect tori and ak. This is equivalent to a dynamic formulation based on the extendedLagrangian as discussed above (sects. 9.2.2 and 9.2.3) with the extension 9.11:

L =N∑

i=1

mir2i

2+

K∑k=1

μak2

2−E({ri}, {ak}). (9.14)

Here mi are the atomic masses and μ is a parameter playing the role ofmass for the generalized coordinates ak. If μ is proper chosen, the phononsare smooth from I to II and the outer regions oscillate slower than the MDsubsystem as demonstrated in the snapshots Fig. 9.3b-d. The correspondingsystem defining the forces reads

Fi = −∂E({ri}, {ak})∂ri

, (9.15)

Fk = −∂E({ri}, {ak})∂ak

= −∑j>N

∂E({ri}, {rj})∂rj

∂rj

∂ak=∑j∈II

FjU(k)(Rj).

(9.16)These must vanish in the equilibrium Fi = Fk = 0. The sum in Eqs. (9.16)expands only over the atoms in region II surrounding region I, since in regionIII the linear elasticity alone is sufficient and the forces on these atoms arenearly vanishing. However, II must be completely embedded in region III.The forces on atoms in II must be derived from the interatomic potential,therefore the size of region II must be extended beyond the potential cut-offrc. The Lagrangian 9.14 results in the equations of motion

mri = Fi, (i = 1, ..., N), μak = Fk, (k = 1, ..., K), (9.17)

9 Empirical Molecular Dynamics 209

thus extending Newton’s equations by equivalent ones for the generalizedcoordinates.

9.3.2 Tight-binding based analytic bond-order potentials

As discussed before, the use of TB methods allows much larger models thanaccessible to DFT. TB formulations exist in many forms (e.g. [2,6–8,80–82]).The application of an analytical BOP, developed mainly by Pettifor andcoworkers [79,83–87], may further enhance empirical MD by providing betterjustified force fields while allowing faster simulations than numerical TB-MD. The approximations to develop analytic TB potentials from DFT maybe summarized by the following steps (cf. also [2, 88]). (1) Construct theTB matrix elements by Slater-Koster two-center integrals including s- and p-orbitals, (2) transform the matrix to the bond representation, (3) replace thediagonalization by Lanczos recursion, (4) get the momenta of the density ofstates from the continued fraction representation of the Green’s function upto order n for an analytic BOP-n potential. The basic ideas sketched with afew more details are as follows.

The cohesive energy of a solid in the TB formulation can be written interms of the pairwise repulsion Urep of the atomic cores and the energy dueto the formation of electronic bonds

Ucoh. = Urep + Uband − Uatoms = 1/2∑i,j

′φ(rij) + 2

∑n(occ)

ε(n) −∑iα

Natomiα εiα

(9.18)where the electronic energy ε of the free atoms has to be subtracted from theenergy of the electrons on the molecular orbitals ε. Replacing ε(n) with theeigenstates of the TB-Hamiltonian for orbital α at atom i,

ε(n) = ε(n)〈n|n〉 = 〈n|ε(n)|n〉 = 〈n|H|n〉 =∑

iα,jβ

C(n)iα Hiα,jβC

(n)jβ , (9.19)

the electronic contributions to the cohesive energy Uband − Uatom = Ubond +Uprom can be rearranged and separated in the diagonal and off-diagonal parts:

Ubond = 2∑iα,jβ

′ρjβ,iαHiα,jβ, Uprom =

∑iα

[2ρiα,iα −Natom

]εiα. (9.20)

The third contribution, the promotion energy Uprom compares the occu-pancy of the atomic orbitals of the free atoms with the occupation of thecorresponding molecular orbitals [89]. The bond energy Ubond describes theenergy connected with the exchange or hopping of electrons between arbi-trary pairs of atomic neighbors {i(j), j(i)}, the factor 2 is due to the spindegeneracy. The transition matrix elements of the Hamiltonian are hoppingenergies, and their transition probability is given by the corresponding ele-ment of the density matrix. The contribution for one bond between the atoms

210 Kurt Scheerschmidt

i and j is thus characterized by the part of the density called the bond-orderΘjβ,iα which may be expressed as a trace:

U i,jbond =

∑α,β

Hiα,jβΘjβ,iα = Tr(HΘ) (9.21)

Besides the bond energy, the force exerted on any atom i must be givenanalytically and therefore one needs the gradient of the potential energy inEq. (9.18) at the position of the atom i:

−F i =∂Ucoh

∂ri=∂Ubond

∂ri+ 1/2

j �=i∑j

∂φ(rij)∂ri

. (9.22)

This expression includes electronic bonds and ionic pairwise repulsions fromall atoms of the system. The general form is still expensive to cope with forthe simulation of mesoscopic systems. O(N) scaling behavior is provided ifthe cohesive energy of the system is approximated by the Tersoff potentialin Eq. 9.8, where bije−μrij is replaced by U i,j

bond and a suitable cutoff withresulting balanced interatomic forces is added.

To find an efficient way to obtain the bond energy in a manner thatscales linearly with the system size, too, the derivative ∂Ubond

∂riof the bond

energy in Eq. (9.22) is replaced assuming a stationary electron density ρ inthe electronic ground state and applying the Hellmann-Feynman theorem.The forces can now be obtained without calculating the derivatives of theelectronic states and leads to the Hellmann-Feynman force [90, 91]:

−F HFi =

∑iα,jβ

′Θjβ,iα

∂Hiα,jβ

∂ri. (9.23)

Both the elements of the density matrix and the hopping elements of theHamiltonian are functions of the relative orientation and separation of thebonding orbitals and have been calculated by Slater and Koster [92] assuminga linear combination of atomic orbitals (LCAO). To introduce the Slater-Koster matrix elements, usually denoted by ssσ, spσ, ppπ, etc., correspondingto the contributing orbitals and its interaction type, the TB-Hamiltonianoperator is transformed to a tri-diagonal form. This can be done using theLanzcos transformation [93]:

〈UmH〉Un = anδm,n + bnδm,n−1 + bn+1δm,n+1, (9.24)H|Un〉 = an|Un〉 + bn|Un−1〉 + bn+1|Un+1〉, (9.25)

where the orthonormal basis at the higher level n is recursively developedfrom |U0〉. The coefficients an, bn are the elements of the continuous fractionof the Green’s function below.

The off-diagonal elements of the density matrix are related to the Green’sfunction [94] ,

9 Empirical Molecular Dynamics 211

ρiα,jβ = − 1π

limη→0

�∫ Ef

dE Giα,jβ(Z), (9.26)

where the complex variable Z = E + iη is the real energy E with a positive,imaginary infinitesimal to perform the integration (theorem of residues). Thisinter-site Green’s function can be connected to the site-diagonal Green’s func-tion [94] via

Giα,jβ(Z) =∂GΛ

00(Z)∂Λiα,jβ

+GΛ00(Z)δi,jδα,β, (9.27)

and the latter can be evaluated recursively [95] using the coefficients of theLanczos recursion algorithm as mentioned above:

GΛ00(Z) =

1

Z − aΛ0 − (bΛ

1 )2

Z−aΛ1 − (bΛ

2)2

Z−aΛ2

−...

. (9.28)

The bond order can now be expressed in terms of the derivatives of therecursion coefficients aΛ

n and bΛn ,

Θiα,jβ = −2

[ ∞∑n=0

χΛ0n,n0

∂aΛn

∂Λiα,jβ+ 2

∞∑n=1

χΛ0(n−1),n0

∂bΛn∂Λiα,jβ

], (9.29)

with the response function χ0m,n0(Ef) = 1π

limη→0 �∫ Ef dEGΛ

0m(Z)GΛn0(Z)

and the elements G0n = Gn0 following from the system of equations (Z −an)Gnm(Z) − bnGn−1,m(Z) − bn+1Gn+1,m(Z) = δn,m. The more recursioncoefficients included in Eq. (9.29), the more accurately the bond order willbe approximated. The recursion coefficients are related to the moments of thelocal density of states (LDOS) [94] and the side diagonal Green’s function ofEqs. (9.26,9.27) relates to the LDOS itself. Therefore the recursive solutionof (9.28) gives an approximation to LDOS in terms of its moments [96]

μ(n)iα =

∫Enniα(E)dE =

∑alljkβk

Hiα,j1β1Hj1β1,j2β2 ...Hjn−1βn−1,iα (9.30)

and may be interpreted as self-returning (closed) loops of hops of the lengthn for electrons over neighboring atoms. The local atomic environment de-fines the LDOS via the moments 9.30, which in turn is used to calculate thebond order in Eq. 9.21 and the (local) atomic force 9.23. The remaining freeparameters in the analytic form 9.8 with U i,j

bond instead of bij may be fittedin the usual way. This is still a hard task, because the bond and the pro-motion energy involve different parameters as the repulsive part and cut-offparameters and screening functions for all terms have to be included. Besidesan accurate fit, the BOP requires well parameterized TB matrix elements orparameter optimizing, and the problem of transferability [88, 97–99] have tobe considered separately. For BOP of order n = 2 the bond-order term in the

212 Kurt Scheerschmidt

TS-representation reads bij = (−ssσij + ppσij)Θiσ,jσ − 2ppπijΘiπ,jπ and thenumerical behavior of BOP2 and TS are approximately equivalent. The de-tails for higher order BOP are given in the papers of the Pettifor group. [2,88]The bij terms of the analytic BOP4 involve complex angular dependencies,partially beyond those neglected in Pettifor’s formalism. For structures withdefects as well as the wafer bonding of diamond surfaces where π-bonds can-not be neglected, BOP can be found in [100–103] and will soon be publishedelsewhere with detailed derivations.

9.4 Applications

It is impossible to review the rapidly growing number of successful applica-tions of empirical MD in materials research. A few representative examplesmay give an impression of the wide range of problems considered. Isolatedpoint defects are mainly simulated to check the quality of the fit of the poten-tial parameters. Surface reconstructions, adatom and absorption phenomena,growth processes, and especially extended defect structures and interactionscan be investigated in detail. The analysis of dislocation core structures [108],the use of core structure data for studying dislocation kink motion [109] orthe dislocation motion during nanoindentation [110] are examples. Interfaceinvestigations have a long tradition, as illustrated in the standard book forgrain boundary structure [111] and growth [112]. Heterophase interfaces,e.g., using a Khor-de-Sama potential for Al and TS for SiC [113] to simulateAl/SiC interfaces, demand special attention to the correct description of themisfit [114] to get good interface energies. Simulations with the Tersoff po-tential and its modifications yield the correct diameter of the critical nucleifor the growth of Ge nanocrystals in an amorphous matrix [115], allow thestudy of growth, strains, and stability of Si- and C- nanotubes [116,117], andSiC surface reconstructions to propose SiC/Si interface structures [118]. Areview of atomistic simulations of diffusion and growth on and in semicon-ductors [119] demonstrates the applicability of SW- and TS -potentials incomparison with data from TB- and DFT-calculations.

In the following two examples of our work are discussed. empirical MDsimulations of first, high resolution electron microscopy (HREM) image con-trasts in quantum dots (sect. 9.4.1) and second, the physical processes atinterfaces during wafer bonding (sect. 9.4.2).

9.4.1 Quantum dots: Relaxation, reordering, and stability

A quantum dot (QD) is a nanometer scaled island or region of suitable ma-terial free-standing on or embedded in semiconductor or other matrices. Thepossibility to arrange QDs into complex arrays implies many opportunities forscientific investigations and technological applications. However, dependingon the growth techniques applied (mainly MBE and MOCVD), the islands

9 Empirical Molecular Dynamics 213

differ in size, shape, chemical composition and lattice strain, which stronglyinfluences the confinement of electrons in nanometer scaled QDs. Shape, sizeand strain field of single QDs, as well as quality, density, and homogeneity ofequisized and equishaped dot arrangements determine the optical properties,the emission and absorption of light, the lasing efficiency, and other optoelec-tronic device properties [120, 121]. A critical minimum QD size is requiredto confine at least one electron/exciton in the dot. A critical maximum QDsize is related to the separation of the energy levels for thermally induced de-coupling. Uniformity of the QD size is necessary to ensure coupling of statesbetween QDs. The localization of states and their stability depend further oncomposition and strain of the QDs. The strain relaxation at facet edges andbetween the islands is the driving force behind self-organization and lateralarrangement, vertical stacking on top or between buried dots, or pre-orderingby surface structuring. In addition, an extension of the emission range to-wards longer wavelengths needs a better understanding and handling of thecontrolled growth via lattice mismatched heterostructures or self-assemblingphenomena (see, e.g., [122]).

A wide variety of imaging methods are used to investigate growth, self-assembly, and physical properties of quantum dots. Among these the cross-section HREM and the plan-view TEM imaging techniques are suitable meth-ods to characterize directly shape, size, and strain field [123]. But the HREMand TEM techniques images are difficult to interpret phenomenologically, es-pecially when separating shape and strain effects. Modeling is essential touniquely determine the features and provide contrast rules.

In Fig. 9.4, the atomic structure of an InGaAs-QD in a GaAs matrix (forother systems cf. [123,124]) is prescribed by geometric models and relaxed byMD simulations. Very different dot shapes have been proposed and theoret-ically investigated: lens-shaped dots, conical islands, volcano type QDs, andpyramids with different side facets of type {011}, {111}, {112}, {113}, {136},and both {011} + {111} mixed, etc. Some of these and a spherical cap areschematically presented in Fig. 9.4a; in simulations one or two monolayers(ML) thick wetting layers are included, too. The most important differencebetween the various structures is the varying step structure of the facetsdue to their different inclination. There are at least two reasons to inves-tigate these configurations [123]. First, small embedded precipitates alwayshave facets; a transition between dome-like structure and pyramids due tochanges in spacer distance, change the number and arrangement of the facets,and thus strain and electronic properties. Second, for highly facetted struc-tures the continuum elasticity is practically inapplicable and finite-elementcalculations must be done in 3-D instead of 2-D. However, the technique ofab initio MC provides a more accurate prediction of the QD-shapes than em-pirical MD [125]. The embedding of one perfect {011} pyramid in a matrix isdemonstrated in Fig. 9.4b after pre-relaxation. Fig. 9.4c shows typical anneal-ing behavior during empirical MD calculation, characterized by the potential

214 Kurt Scheerschmidt

Fig. 9.4. Structure modeling and image simulation of different pyramidal-shapedquantum dot configurations: a) different faceting, truncation, and wetting of pyra-midal start models (matrix removed, models related to (001)-base planes), b) re-laxed complete model of a {011} pyramid (base length about 6nm, 10 × 10 × 10-supercell length 10 nm), c) energy relaxation of a {011} quantum dot ( potentialEpot and total Etot energy versus time steps), d) cross-section HREM and brightfield diffraction contrast simulated for model (b) before and after relaxation assum-ing a standard 400kV-microscope at Scherzer focus (i.e. Δ= -40nm, Cs = 1mm,α=1.2nm−1, t=9nm, δ=8 nm−1, cf. [123]).

9 Empirical Molecular Dynamics 215

Epot and the total energy Etot per atom. The energy difference Etot −Epot isequal to the mean kinetic energy and thus directly related to the temperatureof the system. After the pre-relaxation of 5ps at 0K, an annealing cycle fol-lows, 60ps stepwise heat up to 200K and cool down to 0K, equilibrating thesystem at each heating step. The example here demonstrates a short cycle;most of the embedded QDs are relaxed at each T-step for at least 10,000 timesteps of 0.25fs, followed by annealing up to about 600K (the temperature isnot well defined with empirical potentials but is below the melting temper-ature). Whereas the structure in Fig. 9.4b is less strained, highly strainedconfigurations occur due to the self-interaction of the QD in small supercellswhich correspond to a stacked sequence with very small dot distances. Theextension of the supercell chosen in the simulations depends on the extensionof the QD to be investigated, as well as on the overlap of the strain fields atthe borders, to avoid self-interactions. Whereas Fig. 9.4b shows a relativelysmall supercell, investigations are made for supercells of up to 89×89×89 unitcells with a base length of the QD of 9nm containing several million atoms.For the image simulations subregions of the supercells are used, sliced intoone atomic layer and applying the multi-slice image simulation technique.By comparing imaging for structures before and after relaxation, Figs. 9.4ddemonstrates the enormous influence of the relaxations on the image con-trast in cross-section HREM and TEM [123, 124]. In summarizing some ofthe results one can state that MD calculations with SW (CdZnSe) and TII(InGaAs, Ge) allow us to obtain well relaxed structures, in contrast to thestatic relaxations performed with the Keating potential [104, 105]. With thisinsight into the atomic processes of rearranging and straining QD’s at anatomic level the growth conditions for quantum dots may be enhanced as afirst step to tailor their properties.

9.4.2 Bonded interfaces: tailoring electronic or mechanicalproperties?

Wafer bonding, i.e. the creation of interfaces by joining two wafer surfaces,has become an attractive method for many practical applications in micro-electronics, micromechanics or optoelectronics [126]. The macroscopic prop-erties of bonded materials are mainly determined by the atomic processesat the interfaces (clean and polished hydrophobic or hydrophilic surfaces, asschematically shown in Fig. 9.5) during the transition from adhesion to chem-ical bonding. For this elevated temperature or external forces are required, ascould be revealed by MD simulations of hydrogen passivated interfaces [127]or of silica bonding [54]. Thus, describing atomic processes is of increasinginterest to support experimental investigations or to predict bonding behav-ior. Already slightly rescaled SW-potentials predict the bond behavior viabond breaking and dimer reconfiguration as shown in the snapshots explain-ing the possibility of covalent bonding at room temperature for very cleanhydrophobic surfaces under UHV conditions [128].

216 Kurt Scheerschmidt

Fig. 9.5. Wafer bonding at Si[100] surfaces: simulations of surface rearrangementfor perfect alignment and slow heating. The figure describes the evolution of theatomic relaxation leading to bonded wafers.

Whereas the bonding of two perfectly aligned, identical wafers give asingle, perfectly bonded wafer without defects, miscut of the wafer results insteps on the wafer surfaces and edge dislocations at the bonded interfacesare created. In Fig. 9.6 the red color describes the potential energy abovethe ground state during wafer bonding over two-atomic steps. The upperterraces behave like perfect surfaces and the dimerized starting configurationof Fig. 9.6a is rearranged and new bonds are created. The energy gained isdissipated and for slow heat transfer the avalanche effect leads to bondingof the lower terraces, too. Two 60◦ dislocations remain and, depending onthe rigid shift of the start configuration, an additional row of vacancies [128].The dislocations may split as simulated and observed in experiments [129].

Bonding wafers with rotational twist leads to a network of screw dislo-cations at the interface. A special situation is the 90◦ twist, which alwaysoccurs between monoatomic steps. A Stillinger-Weber potential applied toa 90◦ twist bonded wafer pair [16] yields a metastable fivefold coordinated

9 Empirical Molecular Dynamics 217

Fig. 9.6. MD simulation of wafer bonding over surface steps. Note the dislocationrelaxation for the initial configuration (left), a snapshot during annealing up to900K, after 12.5ps (middle) and the final relaxed state (right).

interface with a mirror symmetry normal to the interface characterized bya Pmm(m)-layer group (cf. Fig. 9.2(a)). Using the Tersoff or BOP-like po-tentials [102] and metastable or well-prepared starting configurations allowsfurther structure relaxation and energy minimization. Figure 9.2(b) showsthis relaxed configuration, which is (2x2) reconstructed and consists of struc-tural units with a 42m-(D2d) point group symmetry, called the 42m-dreidl.It should be emphasized that the dreidl structure is found to be the mini-mum energy configuration also in DFT-LDA simulations [16]. However, theenergies differ from those given in [130,131].Much more important is the mod-ification of the band structure due to the different interface relaxation whichmay enable engineering the electronic properties: whereas the metastableconfiguration (cf. Fig. 9.2(c)) yields semi-metallic behavior, the dreidl struc-ture (cf. Fig. 9.2(d)) yields a larger band gap than in perfect lattices. Thedreidl interface structure and its band structure modification is very similarto the essential building blocks proposed by Chadi [132] for group-IV mate-rials. They describe geometry and properties of the transformation of Si andGe under pressure and the special allo -phases as a new class of crystallinestructures.

A small misalignment of the wafers during wafer bonding yields bondedinterfaces with twist rotation resulting in a checkerboard-like interface struc-ture [133, 134]. Fig. 9.7 shows some of the resulting minimum structuresgained for higher annealing temperatures and different twist rotation an-gles. Before the bonding process takes place, the superposition of the twowafers looks like a Moire pattern in the projection normal to the interface.After bonding and sufficient relaxation under slow heat transfer conditions,almost all atoms have a bulk-like environment separated by misfit screw dis-locations, which may have many kinks. The screw dislocation network of thebonded wafer has a period half of those of the Moire pattern. One finds moreimperfectly bonded regions around the screw dislocations for smaller twistangles, whereas bonding at higher angles result in more or less widely spreadstrained interface regions. In Fig. 9.7 bonding with orthogonal start config-urations (a,b) is compared to those with parallel dimer start configurations

218 Kurt Scheerschmidt

Fig. 9.7. MD relaxations of bonding rotationally twisted wafers ([001] and [110]views) with different angles: a) 2.8◦, 22nm box, orthogonal dimers, [001] view; b)6.7◦, 9.2nm box, orthogonal dimers, [001] and [110] views; c) as a) and d) as b)with parallel dimers.

(c,d). Thus the bonding of Figs. 9.7 (a,b) may be considered as bonding withan additional 90◦ twist rotation. Clearly the periodicity of the defect regionis twice of those of (b,c) smoothing out the interface, but creating additionalshear strains. Irrespective of the chosen twist angles and box dimensions allfinal structures yield bond energies of approximately 4.5eV/atom at 0K, how-ever, varying slightly with the twist angle. A maximum occurs between 4o

and 6o twist related to a change of the bonding behavior itself. The higherthe annealing temperature the better the screw formation. In conclusion,simulation of the atomic processes at wafer bonded interfaces offers not onlythe tailoring of electronic properties, it is also an important step in under-standing how to control the creation of special interface structures for strainaccommodation or pre-patterned templates (compliant substrates) [126] byrotational alignment.

9.5 Conclusions and outlook

Molecular dynamics simulations based on empirical potentials provide a suit-able tool to study atomic processes which influence macroscopic materialsproperties. The applicability of the technique is demonstrated calculatingquantum dot relaxations and interaction processes with defect creation atwafer bonded interfaces. A brief overview describes the method itself andits advantages and limitations, i.e. the macroscopic relevance of the simula-tions versus the limited transferability of the potentials. The quality of thesimulations and the reliability of the results depend on the coupling of theatomistic simulations across length and time scales. Whereas the Lagrangeformalism is well established for the embedding into suitable environments at

9 Empirical Molecular Dynamics 219

the continuum level, the approximations of first principles based potentialsneed continuous future work.

References

1. W. Kohn: Rev. Mod. Phys. 71, 1253 (1999)2. M.W. Finnis: Interatomic Forces in Condensed Matter, Oxford Series on Ma-

terials Modelling (Oxford University Press, Oxford, 2003)3. K. Ohno, K. Esforjani, Y. Kawazoe: Computational Materials Science: From

ab initio to Monte Carlo Methods (Springer, Berlin, 1999)4. M.C. Payne, M.P. Teter, D.C. Allen, et al: Rev. Mod. Phys. 64, 1045 (1992)5. G. Galli, A. Pasquarello: First-Principles Molecular Dynamics, In: Computer

Simulations in Chemical Physics, ed by M.P. Allen, D.J. Tildesley (KluwerAcad. Publ. 1993) pp. 261-313

6. Th. Frauenheim, G. Seifert, M. Elstner, et al: Phys. Status Solidi B 217, 41(2000)

7. J.M. Soler, E. Artacho, J.D. Gale, et al: J. Phys.: Condens. Matter 14, 2745(2002)

8. D.R. Bowler, T. Miyazaki, M.J. Gillan: J. Phys.: Condens. Matter 14, 2781(2002)

9. M.D. Segall, Ph.J.D. Lindan, M.J. Probert, et al: J. Phys.: Condens. Matter14, 2717 (2002)

10. R. Car, M. Parinello: Phys. Rev. Lett. 55, 2471 (1985)11. G.C. Csany, T. Albaret, M.C. Payne, A. DeVita: Phys. Rev. Lett. 93, 175503

(2004)12. P. Ruggerone, A. Kley, and M. Scheffler: Bridging the length and time scales:

from ab initio electronicstructure calculations to macroscopic proportions In: Comments on CondensedMatter Physics ed by D.A. King, D.P. Woodruff (Elsevier Science Ltd., Ams-terdam 1997) 8, in press, arXiv:cond-mat/9802012

13. R.M Nieminen: J. Phys.: Condens. Matter 14, 2859 (2002)14. R.E. Rudd, J.Q. Broughton: Phys. Status Solidi B 217, 251 (2000)15. B.I. Lundquist, A. Bogicevic, S. Dudy, et al: Comput. Mater. Sci. 24, 1 (2002)16. K. Scheerschmidt, D. Conrad, A. Belov: Comput. Mater. Sci. 24, 33 (2002)17. J. M. Haile: Molecular Dynamics Simulation : Elementary Methods (Wiley,

New York 1992)18. D. C. Rapaport: The Art of Molecular Dynamics Simulation (Cambridge Univ.

Press, Cambridge 1998)19. R. Haberlandt: Molekulardynamik: Grundlagen und Anwendungen (Vieweg-

Lehrbuch Physik, Braunschweig, Wiesbaden 1995)20. W.G. Hoover: Molecular Dynamics, Lecture Notes in Physics 258 (Springer,

Berlin Heidelberg New York Tokyo 1986)21. I.G. Kaplan: Theory of Molecular Interactions, Studies in Physical and Theo-

retical Chemistry 42, (Elsevier Science Ltd., Amsterdam 1986)22. M. Sprik: Introduction to molecular dynamics methods. In: Monte Carlo and

Molecular Dynamics of Condensed Matter Systems, ed by K. Binder and G.Ciccotti, Conf. Proc. Vol. 49, Chapter 2, (SIF, Bologna, 1996) pp. 47-87

220 Kurt Scheerschmidt

23. D.W. Heermann: Computer Simulation Methods in Theoretical Physics,(Springer, Berlin Heidelberg New York Tokyo 1990)

24. J.M. Ziman: Principles of the Theory of Solids, (Cambridge University Press,Cambridge 1972)

25. S.W. deLeeuw, J.W. Perram, E.R. Smith: Proc. Roy. Soc. London A, 373, 27(1980)

26. D. Wolf, P. Keblinski, S.R. Phillpot, J. Eggebrecht: J. Chem. Phys. 110, 8254(1999)

27. A.F. Voter: Phys. Rev. B 57, 13985 (1998)28. A.F. Voter, F. Montalenti, T.C. Germann: Ann. Rev. Mater. Res. 32, 321

(2002)29. W. Rullmann, H. Gunsteren, H. Berendsen: Mol. Phys. 44, 69 (1981)30. S. Nose: Mol. Phys. 52, 255 (1984)31. S. Nose: J. Chem. Phys. 81, 511 (1984)32. W.G. Hoover: Phys. Rev. A 31, 695 (1985)33. H.C. Andersen: J. Chem. Phys. 72, 2384 (1980)34. M. Parinello, A. Rahman: J. Appl. Phys. 52, 7182 (1981)35. L. Yang, D.J. Srolowitz, A.F. Yee: J. Chem. Phys. 107, 4396 (1997)36. G.P. Morriss, D.J.Evans: Phys. Rev. A 35, 792 (1987)37. D.J.Evans, G.P. Morriss: Statistical Mechanics of Nonequilibrium Liquids

(Academic Press, London and New York 1990)38. N.A. Marks, M.W. Finnis, J.H. Harding, N.C. Pyper: J. Chem. Phys. 114, 4406

(2001)39. K.W. Jacobsen, J.K. Norskov, M.J. Puskas: Phys. Rev. B 35, 7423 (1987)40. N.A. Marks: J. Phys.: Condens. Matter 14, 2901 (2002)41. M.S. Daw: Phys. Rev. B 39, 7441 (1989)42. Y.C. Wang, K. Scheerschmidt, U. Gosele: Phys. Rev. B 61, 12864 (2000)43. H.R. Trebin: Comp. Phys. Comm. 121-122, 536 (1999)44. A.K. Rappe, C.J. Casewit, K.S. Colwell, et al: J. Am. Chem. Soc. 114, 10024

(1992)45. W.F. van Gunsteren, H.J.C. Berendsen: Angew. Chem., Int. Ed. 29, 992 (1990)46. F.H. Stillinger, T.A. Weber: Phys. Rev. B 31, 5262 (1985)47. R. Biswas, D.R. Hamann: Phys. Rev. B 36, 6434 (1987)48. T. Takai, T. Halicioglu, W.A. Tiller: Scr. Metal. 19, 709 (1985)49. C.P. Herrero: J. Mater. Res. 16, 2505 (2001)50. A.K. Nakano, B.J. Berne, P. Vashista, R.K. Kalia: Phys. Rev. B 39, 12520

(1989)51. S. H. Garofalini: J.Non-Cryst. Solids 120, 1 (1990)52. D. Timpel, K. Scheerschmidt, S.H. Garofalini: J. Non-Cryst. Solids 221, 187

(1997)53. D. Timpel, K. Scheerschmidt: J. Non-Cryst. Solids 232-234, 245 (1998)54. D. Timpel, M. Schaible, K. Scheerschmidt: J. Appl. Phys. 85, 2627 (1999)55. M.I. Baskes: Phys. Rev. B 46, 2727 (1992)56. M.W. Finnis, J.E. Sinclair, Philos. Mag. A 50, 45 (1984)57. J. Tersoff: Phys. Rev. Lett. 56, 632 (1986)58. J. Tersoff: Phys. Rev. B 38, 9902 (1989)59. J. Tersoff: Phys. Rev. B 39, 5566 (1989)60. W. Dodson: Phys. Rev. B 35, 2795 (1987)61. G.C. Abell: Phys. Rev. B 31, 6184 (1985)

9 Empirical Molecular Dynamics 221

62. D.W. Brenner: Phys. Status Solidi B 217, 23 (2000)63. M.V.R. Murty, H.A. Atwater: Phys. Rev. B 51, 4889 (1995)64. D.W. Brenner: Phys. Rev. B 42, 9458 (1990)65. A.J. Dyson, P.V. Smith: Surf. Sci. 355, 140 (1996)66. K. Beardmore, R. Smith: Philos. Mag. A 74, 1439 (1996)67. K. Nordlund, J. Nord, J. Frantz, J. Keinonen, Comput. Mater. Science 18, 504

(2000)68. T.J. Lenosky, B. Sadigh, E. Alonso, et al: Model. Simul. Mater. Sci. Eng. 8,

825 (2000)69. S.D. Chao, J.D. Kress, A. Redondo: J. Chem. Phys. 120, 5558 (2004)70. H. Balamane, T, Halicioglu, W.A. Tiller: Phys. Rev. B 46, 2250 (1992)71. M. Kohyama, S. Takeda: Phys. Rev. B. 60, 8075 (1999)72. M. Gharaibeh, S.K. Estreicher, P.A. Fedders: Physica B 273-274, 532 (1999)73. S.K. Estreicher: Phys. Status Solidi B 217, 513 (2000)74. W. Windl: Phys. Status Solidi B 226, 37 (2001)75. S. Kohlhoff, P. Gumbsch, H.F. Fischmeister: Philos. Mag. A 64, 851 (1991)76. R.E. Rudd, J.Q. Broughton: Phys. Rev. B 58, R5893 (1998)77. A.Yu. Belov: Dislocations Emerging at Planar Interfaces In: Elastic Strain

Fields and Dislocation Mobility, ed by V.L. Indenbom, J. Lothe (Elsevier Sci-ence Ltd., Amsterdam 1992) pp. 391-446

78. J.E. Sinclair: J. Appl. Phys 42, 5321 (1971)79. D.G. Pettifor: Phys. Rev. Lett. 63, 2480 (1989)80. C.M. Goringe, D.R. Bowler, E. Hernandez: Rep. Prog. Phys. 60, 1447 (1997)81. A.P. Horsfield, A.M. Bratkovsky: J. Phys.: Condens. Matter 12, R1 (2000)82. D.A Papaconstantopoulos, M.J. Mehl: J. Phys.: Condens. Matter 15, R413

(2003)83. D.G. Pettifor, I.I. Oleinik: Phys. Rev. B 59, 8487 (1999)84. D.G.Pettifor,I.I.Oleinik, D.Nguyen-Manh, V.Vitek: Comput. Mater. Sci. 23, 33

(2002)85. D.G. Pettifor: From Exact to Approximate Theory: The Tight Binding Bond

Model and Many-Body Potentials. Springer Series, Vol. 48, (Springer, BerlinHeidelberg New York Tokyo 1990)

86. D.G. Pettifor, I.I. Oleinik: Phys. Rev. Lett. 18, 4124 (2000)87. D.G. Pettifor, M.W. Finnis, D. Nguyen-Manh, et al: Mat. Sci. Eng. A 365, 2

(2004)88. A. P. Sutton: Electronic Structure of Materials, (Clarendon Press, Oxford

1994)89. A.P. Sutton, M.W. Finnis, D.G. Pettifor, Y. Ohta: J. Phys. C 35, 35 (1988)90. H. Hellmann: Einfuhrung in die Quantenchemie (Deuticke Verlag, Leipzig

1937)91. R.P. Feynman: Phys. Rev. 56, 340 (1939)92. J.C. Slater and G.F. Koster: Phys. Rev. 94, 1498 (1954)93. C. Lanczos: J. Res. Natl. Bur. Stand. 45, 225 (1950)94. A.P. Horsfield, A.M. Bratovsky, M. Fear, et al: Phys. Rev. B 53, 12694 (1996)95. R. Haydock: Recursive solution of the Schrodinger equation, Solid State

Physics, Vol. 35, (Academic Press, London and New York 1980)96. F. Ducastelle: J. Phys. 31, 1055 (1970)97. O.F. Sankey, D.J. Niklewski: Phys. Rev. B 40, 3979 (1989)98. I. Kwon, R. Biswas, C.Z. Wang, et al: Phys. Rev. B 49, 7242 (1994)

222 Kurt Scheerschmidt

99. T.J. Lenosky, J.D. Kress, I. Kwon, et al: Phys. Rev. B 55, 1528 (1997)100. D. Conrad: Molekulardynamische Untersuchungen fur Oberflachen und Grenz-

flachen von Halbleitern. Thesis, Martin-Luther-Universitat, Halle (1996)101. V. Kuhlmann: Entwicklung analytischer bond-order Potentiale fur die em-

pirische Molekulardynamik. Thesis, Martin-Luther-Universitat, Halle (2004)102. D. Conrad, K. Scheerschmidt: Phys. Rev. B 58, 4538 (1998)103. D. Conrad, K. Scheerschmidt, U. Gosele: Appl. Phys. Lett. 77, 49 (2000)104. P.N. Keating: Phys. Rev. 145, 637 (1966)105. Y. Kikuchi, H. Sugii, K. Shintani: J. Appl. Phys. 89, 1191 (2001)106. M.Z. Bazant, E. Kaxiras, J.F. Justo: Phys. Rev. B 56, 8542 (1997)107. K. Scheerschmidt, D. Conrad, Y.C Wang: Comput. Mater. Sci. 22, 56 (2001)108. T. Harry, D.J. Bacon: Acta Mater. 50, 195 (2002)109. V.V. Bulatov, J.F. Justo, W. Cai, et al: Philos. Mag. A 81, 1257 (2001)110. E.T. Lilleodden, J.A. Zimmermann, S.M. Foiles, W.D. Nix: J. Mech. Phys.

Solids 51, 901 (2003)111. D. Wolf, S. Yip (Eds.): Materials Interfaces: Atomic-Level Structure and

Properties (Chapman & Hall, London 1992)112. A.J. Haslam, D. Moldovan, S.R. Phillpot, et al: Comput. Mater. Sci. 23, 15

(2002)113. X. Luo, G. Qian, E.G. Wang, Ch. Chen: Phys. Rev. B 59, 10125 (1999)114. R. Benedek, D.N. Seidman, C. Woodward: J. Phys.: Condens. Matter 14, 2877

(2002)115. K.J. Bording, J. Tafto: Phys. Rev. B 62, 8098 (2000)116. J.W. Kang, J.J. Seo, H.J. Hwang: J. Nanosci. Nanotechnol. 2, 687(2002)117. D. Srivasta, D.W. Brenner, J.D. Schall, et al: J. Physical Chem. B 103, 4330

(1999)118. C. Koitzsch, D. Conrad, K. Scheerschmidt, U. Gosele: J. Appl. Phys. 88, 7104

(2000)119. E. Kaxiras: Comput. Mater. Sci. 6, 158 (1996)120. D. Bimberg, M. Grundmann, N. N. Ledentsov: Quantum Dot Heterostructures

(John Wiley & Sons., Chicester 1999)121. L. W. Wang, A. Zunger: Pseudopotential Theory of Nanometer Silicon Quan-

tum Dots, In: Semiconductor Nanoclusters - Physical, Chemical, and CatalyticAspects, ed by P.V. Kamat, D. Meisel (Elsevier Science, Amsterdam, New York1997)

122. H. Lee, J. A. Johnson, M. Y. He, et al: Appl. Phys. Lett. 78, 105 (2001)123. K. Scheerschmidt, P. Werner: Characterization of structure and composition

of quantum dots by transmission electron microscopy. In: Nano-Optoelectronics:Concepts, Physics and Devices, ed by M. Grundmann (Springer, Berlin Heidel-berg New York Tokyo 2002) Chapter 3, pp. 67-98

124. K. Scheerschmidt, D. Conrad, H. Kirmse, R. Schneider, W. Neumann: Ultra-microscopy 81, 289 (2000)

125. E. Pehlke, N. Moll, A. Kley, M. Scheffler: Appl. Phys. A 65, 525 (1997)126. Q.Y. Tong, U. Gosele: Semiconductor Wafer Bonding: Science and Technology

(Wiley, New York 1998)127. D. Conrad, K. Scheerschmidt, U. Gosele: Appl. Phys. Lett. 71, 2307 (1997)128. D. Conrad, K. Scheerschmidt, U. Gosele: Appl. Phys. A 62, 7 (1996)129. A.Y. Belov, R. Scholz, K. Scheerschmidt: Philos. Mag. Lett. 79, 531 (1999)130. A.Y. Belov, D. Conrad, K. Scheerschmidt, U. Gosele: Philos. Magazine A77,

55 (1998)

9 Empirical Molecular Dynamics 223

131. A.Y. Belov, K. Scheerschmidt, and U. Gosele: Phys. Status Solidi A 159, 171(1999)

132. D.J. Chadi: Phys. Rev. B 32, 6485 (1985)133. K. Scheerschmidt: MRS Proceedings Volume 681E, I2.3 (2001)134. K. Scheerschmidt, V. Kuhlmann: Interface Sci. 12, 157 (2004)

10 Defects in Amorphous Semiconductors:

Amorphous Silicon

D. A. Drabold and T. A. Abtew

Department of Physics and Astronomy, Ohio University, Athens [email protected]

10.1 Introduction

This book is primarily concerned with defects in crystalline materials. Inthis chapter, we depart from this and discuss defects in amorphous materi-als. In the chapter of Simdayankin and Elliott, a most interesting feature ofamorphous materials, photoresponse, is discussed in detail. As in the caseof crystals, defects determine key features of the electronic, vibrational andtransport properties of amorphous materials [1–3]. These are often preciselythe properties that are relevant to applications. In this chapter we stronglyemphasize amorphous silicon (a-Si), while not focusing on it exclusively. Thisis significantly due to the background of the authors and constraints on thelength of the chapter, but it is also true that a-Si offers at least a somewhatgeneric theoretical laboratory for the study of disorder, defects, and otheraspects of amorphous materials in general.

The outline of this chapter is as follows: First, we briefly survey amorphoussemiconductors. Next, we define the notion of defect, already a somewhatsubtle question in an environment which is intrinsically variable in structure.In the fourth section, we discuss the generic electronic and vibrational at-tributes of defects in amorphous solids, and briefly comment on the currentmethods for studying these systems. In section 10.5, we discuss calculationsof the dynamics of defects in the a-Si:H network, of critical importance tothe stability and practical application of the material.

10.2 Amorphous Semiconductors

Amorphous materials and glasses are among the most important to the hu-man experience with applications ranging from primitive obsidian weaponsto opto-electronics. Today, every office in the world contains storage me-dia for computers (for example DVD-RW compact disks) which exploit thespecial reversible laser-driven amorphization/crystallization transition prop-erties of a particular GeSbTe glass. The same computer uses crystalline Sichips which depend critically upon the dielectric properties of a-SiO2. Withincreasing pressure on global energy markets, it is notable that some of themost promising photovoltaic devices are based upon amorphous materials

226 D. A. Drabold and T. A. Abtew

such as a-Si:H because of the small expense of the material (compared tocrystalline devices) and the ability to grow thin films of device quality mate-rial over wide areas. The internet depends upon fiber optic glass light pipeswhich enable transmission of information with bandwidth vastly exceedingthat possible with wires. This list is the proverbial tip of the iceberg.

Experimental measurements for structure determination, such as X-ray orneutron diffraction, lead easily to a precise determination of the structure ofcrystals. Nowadays, protein structures with thousands of atoms per unit cellare readily solved. The reason for this impressive success is that the diffractiondata (structure factor) consists of a palisade of sharply-defined spikes (theBragg peaks), arising from reflection from crystal planes. For amorphous ma-terials, the same experiments are rather disappointing; smooth broad curvesreplace the Bragg peaks. The wavelength-dependent structure factor maybe interpreted in real space through the radial distribution function (RDF),which is easily obtained with a Fourier transform of the structure factor. Incrystals, the RDF consists again of spikes, and the radii at which the spikesoccur are the neighbor distances. In the amorphous case, the RDF is smooth,and normally has broadened peaks near the locations of the first few peaks ofthe crystal. This similarity in the small-r peak positions reflects a tendencyof amorphous materials to attempt to mimic the local order of the crystallinephase in the amorphous network, but usually with modest bond length andbond angle distortions. At distances beyond several nearest-neighbor spac-ings, similarities between crystalline and amorphous pair correlations wane,and the pair correlations decay to zero after tens of Angstroms in the amor-phous material (this number is system dependent).

From an information theoretic point of view (for example, considerationof the Shannon-Jaynes information entropy [4]), it is clear that there is lessinformation inherent to the (smooth) data for the amorphous case relative tothe crystal. So while it is possible to invert the diffraction data from crystalsto obtain structure determination (with some assumptions to fix the phaseproblem), such a process fails for the amorphous case and many studies haveemphasized the multiplicity of distinct structures possible which can repro-duce measured diffraction data. In this sense we are in a situation ratherlike high energy or nuclear physics where there are sum rules which mustbe obeyed, but the sum rules by themselves offer an extremely incompletedescription of the physical processes. The first goal of theory work in amor-phous materials is to obtain an experimentally realistic model consistent withall trusted experiments and also accurate total energy and force calculations.

Early models of amorphous materials were made by hand, but this ap-proach was soon supplanted by simulations with the advent of digital com-puters. A variety of modeling schemes based upon simulated atomic dynamics(so called molecular dynamics), Monte Carlo methods, methods based uponattempts to obtain the structure from the diffraction measurements (mostsensibly done with constraints to enforce rules on bonding) and even hy-

10 Defects in Amorphous Semiconductors: Amorphous Silicon 227

brids between the last two are in current use [5–8]. Impressive progress hasbeen made in the last 25 years of modeling, and highly satisfactory mod-els now exist for many amorphous materials, among these: Si, Si:H, Ge, C,silica, elemental and binary chalcogenides [9–12]. Empirically, systems withlower mean coordination (floppier in the language of Thorpe [13]) are easierto model than higher coordination systems. Multinary materials tend to bemore challenging than elemental systems as the issue of chemical ordering be-comes important and also these systems are harder to model accurately withconventional methods. empirical potentials, tight binding schemes and ab ini-tio methods are used for modeling the interatomic interactions [5, 14–16]. Inthe area of defects in amorphous semiconductors, it is usually the case thatfirst principles interactions are required.

Where electronic properties of amorphous semiconductors are concerned,k-space methods are not useful as the crystal momentum is not a good quan-tum number (the translation operator does not commute with the Hamilto-nian). Rather, one focuses on the density of electron states, and on individualeigenvectors of the Hamiltonian, especially those near the Fermi level. Theelectronic density of states is usually qualitatively like that of a structurally-related crystal, but broadened by disorder. In Fig. 10.1, we illustrate thesefeatures with a “real” calculation on a 10 000 atom model of a-Si [11, 17].The sharp band edges expected in crystals are broadened into smooth “bandtails”. The tails are created by structural disorder; early Bethe lattice calcu-lations demonstrated that strained bond angles lead to states pushed out ofthe band (into the forbidden optical gap), and loosely, the more severe thedefect, the further into the gap. It is known from optical studies of amorphoussemiconductors that all such systems have band tails decaying exponentiallyinto the gap. While the optical spectrum is a convolution involving both tails,photoemission studies have separately probed the valence tail and conductiontails particularly in a-Si:H [18], and found that both decay exponentially,albeit with different rates and a temperature dependence (quite different forthe two tails) that suggests the importance of thermal disorder beside thestructural order we have stressed so far [21].

The field of amorphous semiconductors is a vast area in which many booksand thousands of papers have been written, and it is impossible to do eventhe specialized topic of defects justice. For this reason, we emphasize a fewquestions of particular current interest, and recommend the comprehensivetreatments of Elliott and Zallen [19, 20] for general discussion of amorphousmaterials including defects.

228 D. A. Drabold and T. A. Abtew

Fig. 10.1. Electronic density of states near a band edge in 10 000 atom modelof amorphous Si. Insets show |ψ|2, indicating parts of the cell in which the defectwave functions are localized. The mobility edge separating localized and extendedstates) is indicated. The “band tail” extends from about -15.5 eV to the mobilityedge [17].

10.3 Defects in Amorphous Semiconductors

10.3.1 Definition for defect

Static Network

A complete discussion of defects in amorphous materials may be found inChapter 6 of the book by Elliott [19]. Evidently, the term defect implies adeparture from the typical disorder of an amorphous network. Perhaps it isnot surprising that it is not possible to produce a definition of defect with-out some arbitrary character. Thus, in defining a coordination defect, oneintroduces a distance which defines whether a pair of atoms is bonded or not.So long as the model structure contains no bonds very close to the criticallength, the definition is unambiguous. In amorphous materials there is usu-ally a well-defined (deep) minimum in the pair-correlation function betweenthe first and second neighbor peaks, which gives a reasonable justification forselecting the distance at which the minimum occurs as cutoff-length for defi-nition of coordination. When one accounts for thermal motion of the atoms,the situation is murkier as we describe below.

10 Defects in Amorphous Semiconductors: Amorphous Silicon 229

If the primary interest is electronic properties of the amorphous semi-conductor, one can introduce an electronic criterion to identify defects. Asfor the geometrical criterion, such a definition is necessarily formulated asa sufficient departure from the mean. An electronic defect may be definedas a structural irregularity which produces a sufficiently localized electroniceigenstate. The ambiguous point here is the modifier “sufficiently”: isolatedstates in the middle of the gap (such as a three-fold ‘dangling bond’ statein a-Si) are easily recognized as defects with a spatially confined wave func-tion. Other less localized states stemming from other network defects maybe better characterized as part of the band tailing. In the same spirit, thechemical bond order (essentially off-diagonal elements of the single-particledensity matrix) may be used to identify defects, but a cutoff will be requiredas always to define the line between bonded and not! A simple quantitativemeasure of localization is given below.

To contrast the geometrical and electronic definitions of defects, well-localized electronic eigenstates do correspond to network irregularities (eitherstructural, chemical or both) and sufficiently large geometrical irregularitiesmanifest themselves as localized states in a spectral gap in the electronicspectrum (note that this does not necessarily have to be the optical gap (thegap which contains the Fermi level) – other spectral gaps or even the extremalband edges can exhibit these states).

Dynamic Network

The schemes to define defects outlined above are consistent for sufficientdeparture from the mean structure. To a surprising degree however, roomtemperature thermal disorder implies the existence of thermally induced ge-ometrical defect fluctuations. Thus, in an ab initio simulation of a-Si, thenumber of five-fold “floating” bonds varied between 0 and 10 in a 2ps room-temperature run for a 216 atom unit cell with coordination defined by a2.74 A bond distance [21]. At face value this is astonishing, suggesting that5% of the atoms change their coordination in thermal equilibrium at roomtemperature! If the electronic and geometrical definitions were identical, itwould imply that states should be jumping wildly into and out of the gap. Ofcourse this is not the case, though it is true that instantaneous Kohn-Shamenergy eigenvalues in the gap do thermally fluctuate [21]. This example il-lustrates that geometrical defects are not necessarily electronic defects. Onecan also easily see that electronic defects are not necessarily coordinationdefects: while an atom may nominally possess ideal (8-N rule) coordination,substantial bond angle distortions can also induce localized electronic states.We present new simulations of defect dynamics in Section 10.5.

230 D. A. Drabold and T. A. Abtew

10.3.2 Long time dynamics and defect equilibria

It is apparent from a variety of experiments in a-Si:H that there are reversible,temperature-dependent changes that occur in defect type and concentration.At its simplest, the idea is that particular defects have a characteristic energyin an amorphous material. It may be that these energies are not substantiallyremoved from the typical conformations of the host. Because the amorphousnetwork does not possess long-range order, it is possible for a small num-ber of atoms to move to convert one kind of defect into another, without ahuge cost in energy. These ideas can predict temperature dependence of someelectronic properties, for example quantities like the density and character ofband tail states [22–25]. By analyzing a variety of experiments, Street andcoworkers [26] have demonstrated that the idea of defect equilibrium is es-sential. At present the approach is phenomenological to the extent that themicroscopic details of the defects involved are not certainly known or evenneeded, but in principle these ideas could be merged with formation energiesand other information from ab initio simulations to make a first principlestheory of defect equilibria. The potential union between the phenomenolog-ical modeling and simulation is also potentially a good example of ’bridgingtime scales’ though there is much work still required on the simulation sideto realize this goal.

10.3.3 Electronic Aspects of Amorphous Semiconductors

Localization

A remarkable feature of electron states in crystals is that the wave functionshave support (are nonzero) throughout the crystal, apart from surface ef-fects. Such states are called extended. The utility of Bloch’s theorem is thatit reduces the formulation of the electronic structure problem for the infiniteperiodic system to a simpler form: calculation within a single unit cell on agrid of k crystal momentum points. Thus, for crystals, information aboutthe electronic structure is provided by band structure diagrams. Such dia-grams are not useful in amorphous systems, and furthermore, in amorphousmaterials, some states may not be extended.

To model the electronic consequences of disorder, it is usual to adopt atight-binding (either empirical or ab initio) description of the electron states,and a Hamiltonian schematically of the form:

H =∑

i

εi|i〉〈i| +∑

ij,i �=j

Jij |i〉〈j|. (10.1)

To keep the number of indexes to a minimum, we suppose that there is oneorbital per site (indexed by i). In this equation, the εi specify the atomicenergy at each site (in an elemental system, this would be constant for all

10 Defects in Amorphous Semiconductors: Amorphous Silicon 231

sites) and Jij is a hopping integral which depends on the coordinates ofthe atoms located at sites i and j. The topological disorder of a particularrealization of an amorphous network (e.g. a structural model) is manifestedthrough the off-diagonal or hopping terms Jij. An enormous effort has beendevoted to studying the properties of the eigenvectors of H , and also thecritical properties (e.g. of quantum phase transitions [17]).

In realistic studies of the electronic localized-delocalized transition [17](here “realistic” means large scale electronic structure calculations in whichthe disorder in the Hamiltonian matrix is obtained from experimentally plau-sible structural models, rather than by appealing to a random number gen-erator), it is found that the eigenstates near midgap are spatially compact(localized) and for energies varied from midgap into either the valence orconduction bands, the states take “island” form: a single eigenstate may con-sist of localized islands of charge separated by volumes of low charge density.The islands themselves exhibit exponential decay, and the decay lengths forthe islands increase modestly from near mid gap to the mobility edge. Thestates become extended with approach to the mobility edge by proliferation(increasing number) of the islands [17, 22]. The qualitative nature of this lo-calized to extended transition is similar for electronic and vibrational disorderof diverse kinds and also conventional Anderson models. Thus, in importantways the localized to extended transition is universal [17].

Characterizing localized states

As we have discussed, the concept of localized states is a central one foramorphous materials and indeed defective systems in general. The questionimmediately arises: given an electronic eigenvector expressed in some repre-sentation, how can we quantify the degree of localization? The most widelyadopted gauge of localization is the so-called “inverse participation ratio”(IPR), which is defined as:

I(E) =N∑

n=1

q(n, E)2, (10.2)

where N is the number of atoms and q(n, E) is a Mulliken (or other) chargeon atom n, and the analysis is undertaken for a particular energy eigenstatewith energy E. This measure ranges from N−1 (N number of sites) for anideally extended state to 1 for a state perfectly localized to one site. One canuse other measures, such as the information entropy, but at a practical level,we have found that the IPR and information entropy produce qualitatively(not quantitatively) similar results for the localization. IPR is a simple toolfor categorizing the zoo of extended, localized and partly localized states ofamorphous semiconductors with defects [27]. As quantum chemists know, allsuch definitions are basis dependent, and thus care is needed in the interpre-tation of local charges, IPR and related quantities [28].

232 D. A. Drabold and T. A. Abtew

Locality of interatomic interactions

For atomistic simulation of any materials, a fundamental measure of thespatial non-locality of interatomic interactions is the decay of the single-particle density matrix, or equivalently, the decay characteristics of the bestlocalized Wannier functions that may be constructed in the material. Thedensity matrix is defined as:

ρ(r, r′) =∑

i occupied

ψ∗i (r′)ψi(r), (10.3)

where ψi are eigenfunctions of the Hamiltonian in position representation. Ina typical condensed matter system, the ψi are oscillating functions, almostall of which will be delocalized through space. Consequently, one can expect“destructive interference” effects as in other wave phenomena when manywiggling functions are superposed as in Equation 10.3, which can make ρ(r, r′)decay rapidly for large |r − r′|. If H is the Hamiltonian, then the electroniccontribution to the total energy is E = Tr(ρH). If the trace is carried out inthe position representation, one can then see that the decay of the densitymatrix provides information about the locality of the interatomic potential.The details of the chemistry and structure of the material determine thisdecay length, and the full results even for the asymptotic decay in a greatlysimplified two band model are too complicated to reproduce here [29]. For amaterial with a finite optical gap as we assume in this article, the decay isultimately exponential and the decay is faster for larger optical gaps.

Recently, the density matrix has been explicitly computed for a struc-turally realistic 4096 atom model [10] of a-Si which was fourfold coordinated,but with some large bond angle distortions. There are two key conclusionsfrom this work: (1) the spatial non-locality of interatomic interactions is verysimilar in a-Si and c-Si (because the density matrix decays in similar fashionfor both) and (2) defect centers in a-Si (in this model associated with bondangle distortions) have Wannier functions with asymptotic decay similar toordinary (tetrahedral) sites (of course the short range behavior involving thefirst few neighbors may be quite different) [22].

10.3.4 Electron correlation energy: electron-electron effects

For a localized single-particle defect state, one must consider the possibility ofthe state being occupied by zero, one or two (opposite spin) electrons. Consid-eration of the energetics of these various occupations led in the seventies andeighties to insights into the nature of defects, particularly in chalcogenide ma-terials. Typically the electron-electron Coulomb repulsion costs a net energy(usually labeled “U”) to enable two electrons to occupy the same localizedspatial state. This is called a “positive U” defect center (the name comes fromthe symbol used in a Hubbard Hamiltonian to model the electron-electron

10 Defects in Amorphous Semiconductors: Amorphous Silicon 233

interaction). In a-Si:H, U>0 for the dangling bond. It is worth mentioningthat the accurate calculation of U is a challenging task for the conventionaldensity functional methods, as these are essentially mean-field methods andappear to consistently overestimate the de-localization of defect states.

When the possibility of structural relaxation is considered as the secondopposite spin electron is added to the localized state, a negative effective U,implying a net energy lowering for double occupation, can emerge. This isobserved in chalcogens, and as pointed out by early workers, explains thepinned Fermi level and the diamagnetism of the materials (that is, lack ofunpaired spins and so absence of ESR signal). It follows that double occu-pancy implies that defects are charged, and the chemistry (p bonding) ofthe chalcogens like S or Se leads to the occurrence of “Valence AlternationPair” defects. Thus, careful simulations reveal that isolated dangling bondsin a-Se are unstable, and are likely to convert instead into VAPs [30]. Elliottdiscusses these points thoroughly [19].

10.4 Modeling Amorphous Semiconductors

10.4.1 Forming Structural Models

For amorphous materials we immediately face a challenging problem: Whatis the atomistic structure of the network? Usually there a strong tendency forparticular local structure (chemical identity of neighbors, and geometricalbonding characteristics), but this preference is only approximately enforced:there is a characteristic distribution of bond lengths and bond angles whichis dependent upon both the material and its preparation. Thus, in “betterquality” a-Si samples, almost all of the atoms are four-coordinated, and thebond angle between a reference Si atom and two neighbors is within about 10degrees of the tetrahedral angle: a strong echo of the chemistry and structureof diamond. In binary glasses (for example GeSe2), Ge is essentially alwaysfour-fold coordinated, and Ge only rarely bonds to Ge. Also, as one wouldexpect from crystalline phases of GeSe2 or simple considerations of chemicalbonding, the glass is made up predominantly of GeSe4 tetrahedral, againwith Se-Ge-Se bond angles close to the tetrahedral angle [31–37].

As reviewed by Thorpe [38], the first attempt to understand glasses wasbased upon the idea that amorphous materials were microcrystalline with avery fine grain size. Eventually, it became clear that this model could notexplain the structural experiments that were available. The idea that wasultimately accepted was advanced in 1932 by Zachariasen for amorphousSiO2, the “Continuous Random Network” (CRN) model [39]. Here, the localchemical requirements (four-coordinated Si bonded only to two-coordinatedO, and no homopolar bonds) were enforced, but there was local disorder inbond angles and to a smaller degree, bond lengths. Such networks do notpossess long range structural order. Zachariesen’s ideas were taken seriously

234 D. A. Drabold and T. A. Abtew

much later starting in the sixties, when comparisons to experiment showedthat the idea had real promise.

The next step was to use computers to help in making the models. Anearly computation was performed using a Monte Carlo approach. The ideawas to put atoms inside a simulated box and move the atoms at random[40]. At each step the radial distribution function was computed and if arandom move pushed the model closer to the experimental data, the movewas kept, otherwise moves were retained with Metropolis probability. This isvery similar in spirit to a current method, the so-called Reverse Monte Carlo(RMC) method [8]. These approaches are information theoretic in spirit: usethe available information to produce the model. While this is an eminentlyreasonable idea, the problem is that merely forcing agreement with diffractiondata (pair correlations) grossly underconstrains the model: there are manyconfigurations which agree beautifully with diffraction data but make nosense chemically or otherwise.

In the eighties, Wooten, Weaire and Winer (WWW) [9], introduced aMonte Carlo scheme for tetrahedral amorphous materials with special bond-switching moves, and energetics described by Keating springs and by requir-ing the network to be fourfold coordinated. They applied their method toa-Si and a-Ge with remarkable success. Here, the disorder of the real ma-terial was somehow captured by what appeared then to be a completely adhoc procedure. This method and improved versions of it are still the “goldstandard” for creating models of amorphous Si. Years later, Barkema andMousseau [11] showed that the likely reason for the success of the WWWscheme is that on long time scales, the WWW moves occur quite naturally!

Nowadays, the great majority of computer simulations are done usingthe molecular dynamics method. The idea is to mimic the actual processof glass formation. To start with, one needs an interatomic potential thatdescribes the interactions in the material. Typically, a well-equilibrated liquidis formed not far above the melting point, then the kinetic energy of thesystem is gradually reduced (by using some form of dissipative dynamics,such as velocity rescaling at each time step). Eventually there is structuralarrest (the computer version of the glass transition), and a structural modelresults which may be useful for further study. Such calculations are oftenuseful, but it needs to be clearly understood that there is little real similarityto the actual quenching process for glasses, which proceeds far more slowly innature. As a final oddity, the “melt-quench” approach is often used even formaterials which do not form glasses (for example, a-Si). Results are especiallypoor for this case.

10.4.2 Interatomic Potentials

To perform a molecular dynamics simulation, or even a simulation of a mererelaxation of a network near a defect, it is necessary to compute the forceson the atoms in the material. This is never easy, and offers a particular

10 Defects in Amorphous Semiconductors: Amorphous Silicon 235

challenge in amorphous materials, as there are usually a variety of bondingenvironments in the network, and empirical potentials tend to be most accu-rate/reliable near conformations that were used in a fitting process used toobtain them. The presence of defects exacerbates this further, as coordinationor chemical order may be radically different for the defect relative to the restof the material (and thus harder to describe with a simple interatomic in-teraction). Also, the complexity of interatomic potentials grows rapidly withthe number of distinct atomic species, so that even for binary systems, thereare very few reliable empirical potentials available.

The reason why it is difficult to compute accurate interatomic potentialsis that the interatomic forces are obtained from the electronic structure ofthe material. Thus, the details of bonding, electron hybridization, all dependin minute detail upon the local environment (coordination, bond lengths,bond angles, on the environment of the neighboring atoms and so on). Theway out is to adopt an approach in which the electronic structure is directlycomputed in some approximate form. This has been done with some successusing empirical tight-binding Hamiltonians [41, 42].

A tremendous advance for defects in semiconductors, but a particularbenefit for study of amorphous materials was the advent of practical densityfunctional codes. The union of density functional methods and molecular dy-namics is now mature, and one can obtain excellent canned codes which canbe used to undertake simulations of complex systems. Of course there is im-portant background knowledge (of amorphous materials, electronic structureand simulation) needed.

10.4.3 Lore of approximations in density functional calculations

Defect calculations typically must be carried out using robust approxima-tions for the Kohn-Sham states. The density functional, basis set upon whichthe Kohn-Sham orbitals are represented, and spin polarization are the mainquantities which need to be considered. Where the basis set is concerned, theprime issue is completeness: the adequacy of the basis functions to approxi-mate the “true” (complete basis limit) Kohn-Sham orbitals. Spin polarizationis of particular importance in simulations if there are unpaired spins in themodel (as for example a singly occupied dangling bond state at the Fermilevel). The choice of density functionals is most important for an accurateestimate of energetics: the use of gradient approximations tends to amelioratethe tendency of LDA to overbind.

One of the most important quantities for calculations involving defectsis the positioning of defect energy levels, and also accurate estimates of thedefect wave functions. DFT is in principle the wrong choice for either ofthese, since it is formally only a ground state theory, and also because onlythe charge density (that is, sum of the squares of the occupied Kohn-Shamorbitals) is formally meaningful within the derivation of the Kohn-Sham equa-tions. However, it is clear that this viewpoint is unduly restrictive, and the

236 D. A. Drabold and T. A. Abtew

Fig. 10.2. GeSe2 density of electron states: comparison of experiment and theory(Gaussian-broadened Kohn-Sham eigenvalues) [43]. The Fermi-level is at zero forboth curves.

density of Kohn-Sham eigenvalues is useful for comparing to the single par-ticle density of states as measured for example in photoemission (see Fig.10.2 taken from reference [43]). The Kohn-Sham DOS does systematicallyunderestimate the optical gap (often by a factor of 2). Curiously, local basisfunction calculations with a limited (single zeta) basis tend to give approxi-mately the correct gap as the incompleteness of the basis tends to exaggeratethe gap thus partly (and fortuitously) fixing the underestimate intrinsic tothe Kohn-Sham calculation. A proper job of describing these states requiresmethods beyond DFT. In this book, there are two relevant chapters, thatof Scheffler on the so called “GW” methods (these provide self-energy cor-rections to DFT) and also the chapter of Needs on quantum Monte Carlo.GW calculations of Blase and coworkers have shown that the Kohn-Sham or-bitals tend to exaggerate the extent of localized states in crystalline systems;one should expect a similar effect for amorphous systems. Unfortunately theunderstanding of these points is only empirical at present.

10.4.4 The electron-lattice interaction

We have recently shown in some generality that localized electron states as-sociated with defects exhibit a large coupling to the lattice [27]. Thus, for

10 Defects in Amorphous Semiconductors: Amorphous Silicon 237

localized electronic eigenstates, deformations of the lattice in the vicinity ofthe localized state lead to significant changes in the associated electroniceigenvalue. This effect is easily tracked with first principles MD (in suchan approach, Kohn-Sham eigenvectors are computed for each instantaneousionic configuration). Beside this empirical observation, it is possible to doa simple analysis of the vibration-induced changes in electron energies us-ing the Hellmann-Feynman theorem [45] and exploiting the locality of thestates to show that the electron-lattice coupling is roughly proportional tothe localization (as gauged by inverse participation ratio).

Previous thermal simulations with Bohn-Oppenheimer dynamics have in-dicated that there exists a large electron-phonon coupling for the localizedstates in the band tails and in the optical gap [46]. Earlier works on chalco-genide glasses by Cobb and Drabold [47] have emphasized a strong correlationbetween the thermal fluctuations as gauged by root mean square (RMS) vari-ation in the LDA eigenvalues and wave function localization of a gap or tailstate (measured by inverse participation ratio (Equation 10.2). Drabold andFedders [24] have shown that localized eigenvectors may fluctuate dramati-cally even at room temperature. Recently, Li and Drabold relaxed the adia-batic (Born-Oppenheimer) approximation to track the time-development ofelectron packets scattered by lattice vibrations [46]. We have recently shownfor localized electron states, that the electron-phonon coupling Ξn(ω) (cou-pling electron n and phonon ω) approximately satisfies:

Ξ2n(ω) ∼ In × f(ω), (10.4)

where f(ω) does not depend upon n, In is the IPR of electron state n. It isalso the case that the thermally-induced variance of electronic eigenvalue n,〈δλ2

n〉 ∝ In. It is remarkable that a simple correlation exists between a staticproperty of the network (the IPR) and a dynamical feature of the system, thethermally induced fluctuation in the Kohn-Sham energy eigenvalues. Thesepredictions are easily verified from thermal simulation as reported elsewhere.The main assumption in obtaining this connection is that the electron statesunder study are localized. Beside the thermally induced changes in electronicenergies, there are also significant variations in the structure of the Kohn-Sham eigenstates, another consequence of the large electron-phonon couplingfor localized states.

10.5 Defects in Amorphous Silicon

Geometry/Topology

Probably the most heavily studied amorphous semiconductor is Si. In itsunhydrogenated form, the material is not electronically useful, as there aretoo many defect states in the optical gap. On the other hand, hydrogenated

238 D. A. Drabold and T. A. Abtew

a-Si (a-Si:H) can be grown in a variety of ways, and can be prepared withdefect concentrations of order 10−16, which is small enough to enable manyelectronic applications [48]. Of critical importance for these applications, it ispossible to dope a-Si:H n(p) type with P(B) donors (acceptors). The dopingefficiency of the material is very low (meaning that large amounts - of order1% impurity is needed to move the Fermi level significantly).

In a-Si, diffraction measurements show that these materials are verytetrahedral (typically more than 99.9% of atoms are four-coordinated andhave bond angles distributed around the tetrahedral angle (typically with aFWHM of order 10 degrees). The key defect is the three-fold coordinated Siatom, the sp3 dangling bond, which is known on the basis of experiment andsimulation to produce a midgap state. Such states are directly observable inElectron Spin Resonance (ESR) measurements [49, 50].

Several papers have speculated on the importance of five-fold floatingbonds [51], but their significance is still uncertain, and it appears that suchdefects would not produce midgap states, but rather states near the conduc-tion band edge. Depending upon the details of the local bonding environment,the levels associated with these states could move slightly from their ideal lo-cations. Such configurations are popular in MD simulations of a-Si (quenchesfrom the liquid). It is unclear however whether this implies the existence offloating bonds in a-Si:H or if there is an overemphasis on higher coordinatedsites because the liquid is roughly 6-fold coordinated.

Defects play many roles in a-Si:H, and one of the most interestingdefect-dependent properties is associated with light-induced metastability,the “Staebler-Wronski” effect [52], which is usually interpreted as light-induced creation of defect centers (probably dangling bonds). A remarkablecollection of experiments and models have been undertaken to understandthis effect which is obviously important for photovoltaic applications, but forthin film device applications (like thin-film transistors) as well [54–67]. Thiseffect is addressed in the chapter of Simdayankin and Elliott.

Level of approximations: a cautionary tale

We have undertaken a systematic study of the level of calculation needed tofaithfully represent the electron states, total energies and forces in a-Si withinthe density functional LDA approximations. This work was performed withthe powerful local basis code SIESTA [69]. This calculation was carried outfor a-Si:H, but we expect that many of the conclusions should at least beconsidered when applying density functional techniques to other amorphousor glassy materials. In fact studies analogous to ours for silicon should beimplemented for more complex amorphous materials. For a unique “one stopreference” to methodological issues for density functional methods, see thebook of Martin [68].

In a nutshell, a proper calculation of defect states and geometry, especiallyin amorphous materials, is difficult. Every approximation needs to be checked

10 Defects in Amorphous Semiconductors: Amorphous Silicon 239

and optimized. Depending upon the type of question being asked more or lesssophisticated approximations may be required. We divide the discussion intoseveral categories:

(1) Pseudopotentials: One of the most important and fortunately reli-able approximations used in electronic structure calculations is the pseudopo-tential. This is a means to separate the atomic core and valence regions, andenables the use of only valence electrons in the calculation of the Kohn-Shamorbitals. Even for a relatively light atom like Si, this allows a calculationinvolving 4 electrons per atom rather than 14. When one reflects that merediagonalization of the Hamiltonian scales as the cube of the number of elec-trons, the payoff is clear. After many years of work, the lore of pseudopoten-tials is fairly mature, though one must test a new potential carefully beforeusing it widely.

(2) Basis Set: The most obvious approximation in any large scale DFcalculation is the use of a finite basis set to represent a set of continuousfunctions (the Kohn-Sham orbitals and the charge density). In a plane wavecalculation, it is easy to check for completeness, as the only “knob” is theplane wave cutoff (or number of reciprocal lattice vectors). Care is neededwith plane wave calculations as defect states can be quite spatially compactand therefore difficult to approximate without a large collection of reciprocallattice vectors.

For local basis codes, ab initio or empirical, completeness is a delicatequestion. Most current codes use basis orbitals much in the spirit of theLinear Combination of Atomic orbitals (LCAO) method of chemistry, withs, p, d, f states. The minimal basis is defined to be a set of atom-centeredfunctions which is just adequate to represent the occupied atomic orbitalson that atom. The minimal basis has very limited variational freedom. Thefirst improvement on the minimal basis is introducing two functions withthe symmetry of the original single zeta (SZ) functions. Quantum chemistsname this a “double zeta” (DZ)basis. A suitably selected double zeta basiscan reproduce expansion and contraction in local bonding. The zeta prolif-eration can continue, though it is uncommon to proceed beyond triple-zetain practical calculations. After adequately filling out the basis orbitals withthe symmetries of the states for the ground-state atom, one proceeds to thenext shell of states which are unoccupied in the atomic ground state. Theseare called polarization functions (so named because the loss of symmetrycaused by application of a (polarizing) electric field distorts the ground statefunctions into the next angular momentum shell).

In a-Si:H we find that the choice of basis affects virtually everything. Thelocalization of the Kohn-Sham states depends dramatically on the basis. Asone might guess, the less complete basis sets tend to overestimate the local-ization of the Kohn-Sham eigenvectors (as there are fewer channels for thesestates to admix into). What was surprising is that the degree of localization,measured by IPR for a well-isolated dangling bond defect, varies by a factor

240 D. A. Drabold and T. A. Abtew

of two between a single zeta (four orbital per site) and a double-zeta polarized(DZP) (thirteen orbital per site) basis. Similar effects are seen for defects incrystals. Since the single-particle density matrix, the total energy and inter-atomic forces depends upon the Kohn-Sham eigenvectors, it is to be expectedthat defect geometry, vibrational frequencies and dynamical properties areall influenced by choice of basis.

(3) Spin Polarization: Fedders and co-workers [70] have shown that, inorder to correlate the degree of localization from dangling bond states withESR experiments, it is not enough to look at the wave functions, but to thenet spin polarization near the dangling bond. The reason is that the spindensity includes contributions from electronic states other than the localizeddefect wave function, which contribute to make the spin polarization morelocalized than the specific localized state wave function. In order to confirmthis result (obtained by Fedders et al. on cells of a-Si:H) in our structuralmodels, we performed calculations allowing for spin polarization in our frozenlattice models, using the DZP basis set. We were not able to find a spinpolarized solution for any of the amorphous cells. The reason is the existenceof two interacting dangling bonds, which favors the formation of a spin singletwith two electrons paired. In order to force the appearance of a spin momentin our models, we introduce an unpaired spin by removing a single electronfrom the system. We find a contribution of almost 50% to net spin by thecentral dangling bond and its neighbors (the central atom alone contributing38%). However, the Mulliken charge contribution to the wave function of thecorresponding localized state from the defect site is only 0.29e. The hydrogen-terminated dangling bond sites also contribute about 10% of the net spin.The remainder is somewhat distributed uniformly at the other sites. For well-isolated dangling bonds in a-Si, about 54% of the net spin localization sits onthe dangling bond and its nearest neighbors, in reasonable agreement withthe experiment [50, 71].

The conclusion is that, for a dangling bond defect state, there is a largedifference between spin localization and wave function localization. The de-gree of spin localization is greater than that of the wave function localizationat the dangling bond site. To our knowledge, no experimental methods existfor measuring the extent wave function localization on the dangling bondorbital as opposed to spin.

(4) Gradient Corrected Density Functional: This has been inade-quately studied, although there is no reason to expect large changes in struc-ture with the use of gradient corrections. Typically bond lengths may changemodestly and cohesive energies improve when compared to experiment. Gen-erally one expects gradient corrections to at least partly repair the tendencyof LDA to overbind.

(5) Brillouin zone sampling: Many current calculations of defects arecarried out with a cell which is then periodically repeated to eliminate surfaceeffects. In particular, this construction clearly yields a crystal with a unit cell

10 Defects in Amorphous Semiconductors: Amorphous Silicon 241

with typically several hundred or more atoms (such a number is necessary tomeaningfully sample the disorder characteristic of the material). Thus, theuse of periodic (Born-Von Karman) boundary conditions really amounts toconsideration of a crystal with a large unit cell. Thus, there is a new (andcompletely artificial) band structure (k dispersion) associated with the con-struction, a Brillouin zone etc. It is of course true that as the cell gets larger,the bands become flatter, thus reducing the significance of the periodicity.For computational convenience, total energies and forces are inevitably com-puted at the center (k=0) of the Brillouin zone, though in principle thesequantities involve quadrature over the first Brillouin zone. However, if resultsfor total energies and especially forces depend upon k in any significant way,then it is doubtful that the cell was selected to be large enough in the firstplace. For delicate energetics (an all too familiar state of affairs for defects), itis particularly important to test that the cell is big enough. In our experiencea few hundred atoms in a cubic cell is adequate.

Defect Identification

From simulation studies, one finds the expected point defects of coordinationtype (three-fold dangling bonds and fivefold atoms floating bonds), and alsostrain defects (nominally four-coordinated structures with large deviations inthe bond angles). The dangling bond defect produces electronic states nearthe middle of the gap, and floating bonds near the conduction edge. Straindefects are associated with the valence and conduction band tails.

Defect Dynamics and Diffusion

An important, but under appreciated aspect of defects in amorphous semi-conductors, is their dynamics. In hydrogenated amorphous silicon, the mo-tion of defects and the motion of hydrogen (which are evidently related)are correlated with some of the most important physical properties of theamorphous matrix, such as the light-induced degradation of the material(Staebler-Wronski effect) [52, 53]. The expectation is that H motion consistsof small oscillations in a particular potential well associated with a given lo-cal environment with rare escape events until the diffusing particle falls intoanother trap. The time between such rare events depends upon the heightof the barrier separating the two metastable configurations. Sophisticatedmethods exist for determining diffusion pathways and events, particularlythe Activation-Relaxation Technique (ART) of Barkema and Mousseau [72].In a remarkable paper, these authors applied ART with a simple potential todirectly compute the atomic “moves” in a model of a-Si. For large barriersand very rare events, there is no substitute for a study of ‘ART’ type.

For smaller barriers, we have seen that it is possible to extract interestingshort-time diffusive dynamics directly from MD simulation. In simulations of

242 D. A. Drabold and T. A. Abtew

Fig. 10.3. Trajectory of one of the hydrogen atoms in our simulation at T = 300K, the trajectory clearly shows the diffusion and trapping of H atom. The totaltime for the graph is 1.0 ps.

Ag dynamics in chalcogenide glass hosts, we found that it was not difficultto track the motion of the Ag atoms from simulations of order 50 ps. The ex-istence of trapping centers, and even some information about trap geometry,and temperature dependence residence times was obtained [73]. In a-Si:H,we have found that an analogous computation produces new insight into themotion of both Si and H atoms. Using a small cell (61 Si and 10 H atoms)with two dangling bonds and no other defects, we employed SIESTA withhigh level approximations (a double-zeta polarized basis) to monitor atomicmotion. On the time scale of 1 ps we have shown the trajectory of one ofthe H atoms in Fig. 10.3 [74]. Representative of the majority of H atoms inthe cell, this particular trajectory shows the diffusion of the hydrogen atom,including trapping. While the H atoms are diffusing in the cell it is followedby breaking old bonds and forming new bonds. We have plotted the timeevolution of undercoordinated and floating bonds formed as a consequenceof hydrogen diffusion in Fig. 10.4 [74].

In our simulation we have observed rearrangements of the atoms while thehydrogen atoms are diffusing in the cell. The diffusion of H causes formationand breaking of bonds. We have also observed the formation of a metastablestates that trap the mobile hydrogen atoms. The mechanisms for the forma-tion of these structures follow breaking of H atoms from the Si-H bonds andfollowed by diffusion in the cell. These mobile H atoms then collide with an-other Si+DB structure and form a bond. These processes continue until twohydrogens form a bond to a single Si atom or to two nearby Si atoms to form

10 Defects in Amorphous Semiconductors: Amorphous Silicon 243

0 2000 4000 6000 8000 10000time (*0.1fs)

0

4

8

12

Und

erco

ordi

nate

d bo

nds

average

0 2000 4000 6000 8000 10000time (*0.1fs)

0

4

8

12

Flo

atin

g bo

nds

average(a) (b)

Fig. 10.4. (a) Total number of undercoordinated atoms and (b) Total numberof overcoordinated atoms in our simulation at T = 300 K. The total time for thegraph is 1.0 ps.

0 5000 10000time(*0.1fs)

−2

0

2

4

6

8

Coo

rdin

atio

n

H−65averageSi−51average

0 5000 10000time (*0.1fs)

−2

0

2

4

6

Coo

rdin

atio

n

Si−24H−69average

(b)(a)

Fig. 10.5. Time evolution of coordination for few selected atoms (H-69, H-65, Si-24, and Si-51) at room temperature T = 300 K. (a) time evolution of atoms, Si-24which is far from the diffusing H with coordination 4 and H-69. (b) time evolutionof selected atoms: Si-51, which is close to the diffusing H, and H-65.

a metastable conformations. In Fig. 10.5, [74] we have shown the time evo-lution of the coordination for selected atoms. Fig. 10.5 (a) shows the stablecoordination of a Si atom (Si-24) which is fully coordinated since there is nodiffusing atom in its vicinity and of H-69 which has an average coordinationof 1. However, the Si atom (Si-51) and H-65 which are shown in Fig. 10.5 (b)shows change in the coordination as a function of time. This change in thecoordination becomes more stable (1 for H and 4 for Si) after the formationof a metastable configuration. Our results suggest that atoms which are rea-

244 D. A. Drabold and T. A. Abtew

sonably far from the diffusing hydrogen have an average coordination of 4 forSi and 1 for H. However, the coordination of the atoms, for instance Si-51, inthe direction of the diffusing H atoms changes with time. This suggest thatwhile the H atoms are diffusing in the cell there are breaking and formationof bonds.

Other researchers have emphasized the significance of defect and H mo-tion. In particular, Street [75] has pioneered a phenomenological approach tothe long-time dynamics with the defect pool model. Our work on extremelyshort time scales should ultimately be smoothly connected to the inferredlong-time defect dynamics from the defect equilibria models. While we arevery far from this goal at present, it is possible to imagine impacts of bothapproaches on each other within a few years. Ideally, the short-time simula-tions could provide ab initio input into the energetics of the dynamics, andeven information about residence times for various defects.

Acknowledgements

It is a pleasure to thank a number of collaborators for their contributions tothe work reported here: R. Atta-Fynn, P. Biswas, J. Dong, S. R. Elliott, P. A.Fedders, H. Jain, J. Li, J. Ludlam, N. Mousseau, P. Ordejon, S. N. Taraskinand X. Zhang. DAD thanks the National Science Foundation for supportingthis work.

References

1. C. R. A. Catlow (ed.): Defects and Disorder in Crystalline and AmorphousSolids, (Kluwer Academic publishers, the Netherlands 1994) Vol 418

2. J. D. Joannopoulos and G. Lucovsky (ed.): The physics of Hydrogenated Amor-phous Silicon II, (Springer, Berlin Heidelberg New York Tokyo 1984) Vol. 56

3. M. Stutzmann, D. K. Biegelsen, and R. A. Street: Phy. Rev. B 35, 5666 (1987)4. E. T. Jaynes: Probability Theory: The logic of science, (Cambridge University

Press, Cambridge 2003)5. R. Car and M. Parrinello: Phy. Rev. Lett. 60, 204 (1988)6. O. Gereben and L. Pusztai: Phys. Rev. B 50, 14136 (1994)7. J. K. Walters and R. J. Newport: Phys. Rev. B 53, 2405 (1996)8. R. L. McGreevy: J. Phys.: Condens. Matter 13, R877 (2001)9. F. Wooten, K. Winer, and D. Weaire: Phy. Rev. Lett. 54, 1392 (1985)

10. B. R. Djordjevic, M. F. Thorpe and F. Wooten: Phys. Rev. B 52, 5685 (1995)11. G. T. Barkema and N. Mousseau: Phy. Rev. B 62, 4985 (2000)12. R. L. C. Vink and G. T. Barkema: Phys. Rev. B 67, 245201 (2003)13. M.F. Thorpe, J. Non Cryst. Sol. 57, 355 (1983).14. A. E. Carlsson: In Solid State Physics, Advances in research and applications,

Vol. 43, ed by H. Ehrenreich and D. Turnbull (Academic, New York 1990) pp1

10 Defects in Amorphous Semiconductors: Amorphous Silicon 245

15. F. Ercolessi and J. B. Adams: Europhys. Lett. 26, 583 (1994)16. A. Nakano, L. Bi, R. K. Kalia, and P. Vashishta: Phys. Rev. B 49, 9441 (1994)17. J. J. Ludlam, S. N. Taraskin, S. R. Elliott and D. A. Drabold: J. Phys.: Con-

dens. Matter 17, L321 (2005)18. S. Aljishi, L. Ley and J. D. Cohen: Phys. Rev. Lett. 64, 2811 (1990)19. S. R. Elliott: Physics of Amorphous Materials, 2nd edn (Longman Scientific &

Technical, London; Wiley, New York 1990)20. R. Zallen: The Physics of Amorphous Solids, (Wiley-Interscience, New York

1998)21. D. A. Drabold, P. A. Fedders, S. Klemm and O. F. Sankey: Phys. Rev. Lett.

67, 2179 (1991)22. J. Dong and D. A. Drabold: Phys. Rev. Lett. 80, 1928 (1998)23. P. A. Fedders, D. A. Drabold and S. Nakhmanson: Phys. Rev. B 58, 15624

(1998)24. D. A. Drabold and P. A. Fedders: Phys. Rev. B 60, R721 (1999)25. D. A. Drabold: J. Non-Cryst. Solids 266-269, 211 (2000)26. R. A. Street, J. Kakalios, and T. M. Hayes: Phys. Rev. B 34, 3030 (1986)27. R. Atta-Fynn, P. Biswas, P. Ordejon and D. A. Drabold: Phys. Rev. B 69,

085207 (2004)28. N. Szabo and N. S. Ostlund: Modern Quantum Chemistry (Dover, New York

1996)29. S. N. Taraskin, D. A. Drabold and S. R. Elliott: Phys. Rev. Lett. 88, 196405

(2002)30. X. Zhang and D. A. Drabold: Phys. Rev. Lett. 83, 5042 (1999)31. M. Cobb, D. A. Drabold, and R. L. Cappelletti: Phys. Rev. B 54, 12162 (1996)32. C. Massobrio, A. Pasquarello, and R. Car: Phys. Rev. Lett. 80, 2342 (1998)33. I . Petri, P. S. Salmon, and H. E. Fischer: Phys. Rev. Lett. 84, 2413 (2000)34. X. Zhang and D. A. Drabold: Phys. Rev. B 62, 15 695 (2000)35. P. S. Salmon and I. Petri, J. Phys.: Condens. Matter 15, S1509 (2003)36. D. A. Drabold, J. Li and D. Tafen: J. Phys.: Condens. Matter 15, S1529 (2003)37. D. N. Tafen and D. A. Drabold: Phys. Rev. B 68, 165208 (2003)38. P. Boolchand (ed.): Insulating and Semiconducting Glasses, (World Scientific,

Singapore 2000)39. W. H. Zachariasen: J. Am. Chem. Soc. 54, 3841 (1932)40. R. Kaplow, T. A. Rowe and B. L. Averbach: Phys. Rev. 168, 1068 (1968)41. A. P. Sutton, M. W. Finnis, D. G. Pettifor etal: J. Phys. C: Solid State Phys.

21, 35 (1988)42. N. Bernstein and E. Kaxiras: Phys. Rev. B 56, 10488 (1997)43. P. Biswas, D. N. Tafen, and D. A. Drabold: Phys. Rev. B 71, 054204 (2005)44. R. Atta-Fynn, P. Biswas and D. A. Drabold: Phys. Rev. B 69, 245204 (2004)45. R. P. Feynman: Phys. Rev. 56, 340 (1939)46. J. Li and D. A. Drabold: Phys. Rev. B 68, 033103 (2003)47. M. Cobb and D. A. Drabold: Phys. Rev. B 56, 3054 (1997)48. J. D. Joannopoulos and G. Lucovsky (ed.): The physics of Hydrogenated Amor-

phous Silicon I, (Springer, Berlin Heidelberg New York Tokyo 1984) Vol. 5549. R. A. Street, D. K. Biegelsen, and J. C. Knights: Phys. Rev. B 24, 969 (1981)50. D. K. Biegelsen and M. Stutzmann: Phys. Rev. B 33, 3006 (1986)51. S. T. Pantelides: Phys. Rev. Lett. 57, 2979 (1986)52. D. L. Staebler and C. R. Wronski: Appl. Phys. Lett. 31, 292 (1977)

246 D. A. Drabold and T. A. Abtew

53. T. A. Abtew and D. A. Drabold, J. Phys. Condens. Matt 18, L1 (2006).54. H. Fritzsche: Annu. Rev. Mater. Res. 31, 47 (2001)55. T. Su, P. C. Taylor, G. Ganguly etal: Phys. Rev. Lett. 89, 015502 (2002)56. S. B. Zhang, W. B. Jackson and D. J. Chadi: Phys. Rev. Lett. 65, 2575 (1990)57. D. J. Chadi: App. Phys. Lett. 83, 3710 (2003)58. M. Stutzmann W. B. Jackson and C. C. Tsai: Phys. Rev. B 32, 23 (1985)59. R. Biswas, I. Kwon and C. M. Soukoulis: Phys. Rev. B 44, 3403 (1991)60. S. Zafar and E. A. Schiff: Phys. Rev. B 40, 5235 (1989)61. S. Zafar and E. A. Schiff: Phys. Rev. Lett. 66, 1493 (1991)62. S. Zafar and E. A. Schiff: J. Non-Cryst. Sol. 137-138, 323 (1991)63. H. M. Branz: Phys. Rev. B 59, 5498 (1999)64. R. Biswas and Y. -P. Li: Phys. Rev. Lett. 82, 2512 (1999)65. N. Kopidakis and E. A. Schiff: J. Non-Cryst. Sol. 266-269, 415 (2000)66. S. B. Zhang and H. M. Branz: Phys. Rev. Lett. 87, 105503 (2001)67. T. A. Abtew, D. A. Drabold, and P. C. Taylor: Appl. Phys. Lett. 86, 241916

(2005)68. R. M. Martin, Electronic Structure: Basic Theory and Practical Methods (Cam-

bridge University Press, Cambridge) 2004.69. J. M. Soler, E. Artacho, J. D. Gale etal : J. Phys.: Condens. Matter 14, 2745

(2002)70. P. A. Fedders, D. A. Drabold, Pablo Ordejon, G. Fabricius, E. Artacho, D.

Sanchez-Portal and J. Soler: Phys. Rev. B 60, 10594 (1999)71. T. Umeda, S. Yamasaki, J. Isoya et al: Phys. Rev. B 59, 4849 (1999)72. N. Mousseau, G. T. Barkema: Phys. Rev. E 57, 2419 (1998)73. D. N. Tafen, D. A. Drabold, and M. Mitkova: Phys. Stat. Sol. (b) 242, R55

(2005) ibid Phys. Rev. B 72 054206 (2005).74. T. A. Abtew and D. A. Drabold (unpublished).75. R. A. Street, Hydrogenated Amorphous Silicon (Cambridge University Press,

Cambridge 1991).

11 Light-Induced Effects in Amorphous and

Glassy Solids

S.I. Simdyankin and S.R. Elliott

Department of Chemistry, University of Cambridge, Lensfield Road,Cambridge CB2 1EW, United Kingdom, [email protected], [email protected]

In this chapter, we discuss how exposure to light can affect the properties ofdisordered materials and review our recent computational studies of thesephenomena. Familiarity with the preceding contribution by Drabold andAbtew is beneficial for understanding this chapter.

11.1 Photo-induced metastability in Amorphous Solids:an Experimental Survey

11.1.1 Introduction

Electromagnetic (EM) radiation can interact with amorphous solids in a num-ber of ways, depending on the wavelength of the radiation. EM radiation canbe either absorbed or scattered by a solid. In the resonant process of ab-sorption, the photon energy must match a transition energy between twoquantum states in the material, e.g. for atomic vibrations corresponding toinfrared (IR) wavelengths, or electronic transitions from valence to conduc-tion band across a bandgap in a semiconductor, corresponding to visible/nearultraviolet (UV) wavelengths (depending on the magnitude of the bandgapenergy, Eg). Scattering of EM radiation may be elastic, or inelastic when en-ergy is exchanged between a photon and quantum states of a solid. Examplesinclude X-ray scattering (diffraction) from the electrons in atoms, Raman(inelastic light) scattering from atomic vibrations (e.g. molecular-like modes)and Brillouin (inelastic light) scattering from acoustic phonon-like vibrationalexcitations. The photon-solid interaction can be interrogated in terms of ei-ther the ultimate state of the photon (as in the above examples), or in termsof changes in the internal state of the solid. In the latter case, it is oftenchanges in the electronic degrees of freedom that are probed: for the case ofmost amorphous semiconductors with bandgaps in the range 1 < Eg < 3eV,this involves near-IR or visible light excitation (although for more ionic ox-ides, e.g. vitreous (v-)SiO2 with Eg ∼ 10eV, UV light excitation is required).Electrons optically excited into the conduction band (CB), or holes excitedinto the valence band(VB), can be probed, for example, by their enhancedelectrical transport (photoconductivity) or by their radiative recombination(photoluminescence or fluorescence). At sufficiently high light intensities,

248 S.I. Simdyankin and S.R. Elliott

the light-solid interaction can become non-linear, this optical non-linearitypermitting many new phenomena, such as second-harmonic generation, three-and four-wave mixing, to occur. Electronic excitation of structural coordina-tion defects, e.g. dangling bonds, can be probed by electron spin (paramag-netic) resonance (ES(P)R).

The optically-induced phenomena outlined above can be exhibited byall kinds of semiconducting/insulating solid, whether crystalline or amor-phous. However, certain features unique to the amorphous state in general,and to certain types of amorphous materials in particular, mean that somephoto-induced phenomena are special to amorphous semiconducting materi-als, particularly those which are metastable, i.e. which remain after cessationof irradiation.

One general characteristic of amorphous semiconductors of relevance inthis connection is the occurrence of disorder-induced spatial localization ofelectronic states in the band-tail states extending into the gap between theVB and CB [1]. The presence of continuous bands of localized states in thesetail states, in the energy interval between the “mobility” edges in the VBand CB, marking the transition point between localized and delocalized (ex-tended) electron states [2], as well as localized states deep in the bandgaparising from coordination defects [1], can have a profound influence on thenature of the photon-solid interaction. These localized states have an en-hanced electron-lattice interaction (see the preceding Chapter by Draboldand Abtew), meaning that optical excitation of such electronic states can havea disproportionate effect on the surrounding atomic structure. Secondly, theradiative-recombination lifetime for an optically-created electron-hole pairtrapped in localized tail states can be very considerably longer than whenthe photo-excited carriers are in extended states (as is always the case forcrystals), thereby allowing possible non-radiative channels to become signif-icant (e.g. involving atomic-structural reconfiguration).

Other relevant aspects are more materials specific. Optically-inducedmetastability associated with structural reorganisation is likely to be moreprevalent in those materials in which (some) atoms have a low degree ofnearest-neighbour connectivity (e.g. being two-fold coordinated, rather thanfour-fold coordinated), thereby imparting a considerable degree of local struc-tural flexibility. In addition, structural reorganisation following optically-induced electronic excitation is more probable if the (VB) electronic statesinvolved correspond to easily-broken weak bonds.

One class of materials satisfying the above constraints consists of chalco-genide glasses, namely alloys of the Gp VI chalcogen elements (S, Se, Te)with other (metalloid) elements, e.g. B, Ga, P, As, Sb, Si, Ge etc. These ma-terials are “lone-pair” semiconductors in which (if the chalcogen content issufficiently high) the top of the VB comprises chalcogen p-π lone-pair (LP)states [3]. Interatomic, “non-bonding” (Van der Waals-like) interactions in-volving such LP states are appreciably weaker than for normal covalent bonds

11 Light-Induced Effects in Amorphous and Glassy Solids 249

(those states lying deeper in the VB). Since the ground-state electronic con-figuration of chalcogen atoms is s2p4, the occurrence of non-bonding LP elec-tron states means that each chalcogen atom is ideally two-fold coordinated bycovalent bonds to its nearest neighbours. Thus chalcogenide glasses, havinga combination of localized electronic tail states, low atomic coordination andweak bonds associated with optically accessible states at the top of the VB,are ideal candidates for exhibiting photo-induced effects.

11.1.2 Photo-induced effects in chalcogenide glasses

Amorphous chalcogenide materials exhibit a plethora of photo-induced phe-nomena. Such changes can be variously dynamic (i.e. present only whilst amaterial is illuminated) or metastable (i.e. the effects remain after cessationof illumination). Furthermore, the changes may be either scalar or vectoralin nature (respectively independent or dependent on either the polarizationstate, or the propagation direction, of the inducing light). Finally, metastablechanges may be irreversible, or reversible with respect to thermal or opticalannealing.

Examples of dynamic effects are the afore-mentioned phenomena of pho-toluminescence and photoconductivity [1] and optical non-linearity [4], whichare common to all materials. (Chalcogenide glasses generally exhibit ex-tremely large optical non-linearities because of the highly electronically po-larizable nature of chalcogen atoms present [4].) However, a (scalar) dynamiceffect, which is characteristic of chalcogenide glasses, is photo-induced fluidity,wherein the viscosity of a glass (e.g. As2S3 ) decreases on illumination withsub-bandgap light or, in other words, light can cause viscous relaxation in astressed glass [5,6]. An example of a vectoral dynamic photo-effect in chalco-genide glasses is the opto-mechanical effect [7, 8], wherein linearly-polarizedlight incident on a chalcogenide-coated clamped microcantilever causes it todisplace upwards or downwards, depending on whether the polarization axisis respectively parallel to perpendicular to the cantilever axis, as a result ofanisotropic photo-induced strains introduced into the chalcogenide-cantileverbimorph.

Metastable photo-induced changes are perhaps the most interesting (andapplicable) of the effects observed in chalcogenide glasses. One such is photo-darkening (or bleaching), wherein the optical absorption edge of the materialshifts to lower energies (hence the material gets darker at a given wavelength),or to higher energies (bleaching), on illumination with (sub-) bandgap light.For reviews, see refs. [9, 10]. Arsenic-based materials with higher sulphurcontents and germanium sulphide materials seem to favour photo-bleaching,for reasons that are not clear at present. irreversible shifts of the opticalabsorption edge occur in virgin (as-evaporated) thin films of amorphouschalcogenides containing structurally-unstable molecular fragments from thevapour phase [11] that are particularly vulnerable to photo-induced (or ther-mal) change. Reversible photodarkening is observed in bulk glasses and well-

250 S.I. Simdyankin and S.R. Elliott

annealed thin films: optical illumination causes a red-shift of the absorptionedge, whilst subsequent thermal annealing to the glass-transition tempera-ture, Tg, erases the effect [9, 10].

Photodarkening (bleaching) appears to be associated with photo-inducedstructural changes, such microscopic changes being manifested as macro-scopic volume changes. Photo-contraction is particularly prevalent in virginobliquely-deposited thin films of amorphous chalcogenides having a columnar-like microstructure [12], but photo-expansion is commonly associated withphotodarkening in chalcogenide bulk glasses or thin films [9,10]. Giant photo-expansion (up to 5% linear expansion) occurs for sub-bandgap illumina-tion [13]. Another metastable photo-induced structural change exhibited bychalcogenide materials is photo-induced crystallization and amorphization,as used in rewriteable “phase-change” CDs and DVDs [14, 15]. The amor-phization process there is believed to result from a photo-induced melting andsubsequent very rapid quenching of the material to the glassy state. However,illumination of certain crystalline chalcogenide materials (e.g. As50Se50) canalso cause athermal photo-amorphization [16]. Light can also cause “chemi-cal” changes in chalcogenide glasses. For example, overlayers of certain metals(notably silver) diffuse into the undoped bulk glass on illumination [17, 18].On the other hand, Ag-containing chalcogenide glasses, rich in Ag, exhibitthe opposite effect, namely photo-induced surface deposition, wherein themetal exsolves from the glassy matrix on illumination [19, 20].

A particularly interesting metastable vectoral photo-induced effect exhib-ited by chalcogenide glasses is photo-induced optical anisotropy (POA), firstobserved in ref. [21], and manifested in absorption, reflection (i.e. refractiveindex) and scattering (for a review, see [10, 22]). Illumination of an initiallyoptically isotropic glass by linearly-polarized light causes the material to be-come dichroic and birefringent. In the case of well-annealed thin films andbulk glasses, the effect is completely reversible optically (i.e. the induced opti-cal axis is fully rotated when the light-polarization vector is rotated), and thePOA can be annealed out thermally (but at a lower temperature than is thecase for scalar photodarkening) or optically (using unpolarized or circularlypolarized light).

11.2 Theoretical studies of photo-induced excitations inamorphous materials

From the above very brief review, it is apparent that amorphous chalcogenidematerials exhibit a wide range of interesting photo-induced phenomena. Al-though the experimental phenomenology is, for the most part, well developed,a proper theoretical understanding is still lacking. Until very recently, theo-retical models have been confined to “hand-waving” models [9, 10], involvingsimple notions of chemical bonding and the effects of light (e.g. bond-breakingand defect formation). However, the predictive, and even descriptive, power of

11 Light-Induced Effects in Amorphous and Glassy Solids 251

such approaches is very limited, and for a microscopic (atomic-level) under-standing of photo-induced phenomena in amorphous chalcogenides, properquantum-mechanical calculations need to be performed. Obviously, this is anextremely challenging task, and two essentially opposing methods for makingprogress in this regard, using different approximations, can be employed. Thefirst employs all-electron quantum-chemical calculations on small clusters (afew tens) of atoms, in which the optimized electronic and atomic configura-tions and energies are found for the ground and electronically-excited states(see, e.g. [23–26]). This favours accuracy over dynamics. The other method,(discussed in the Chapter by Drabold and Abtew), namely ab initio molecu-lar dynamics (MD) simulations, takes a converse approach: atomic dynamicsare followed at the expense of accuracy. More detail will be given in the fol-lowing, but the advantages and disadvantages of these two approaches canbe summarized as follows. Quantum-chemical methods can only deal withvery small clusters (particularly for excited-state calculations) and only ini-tial (ground-state) and final (excited-state) configurations can be studied,not the intermediate dynamics, but the energetics are the most accurate.Ab initio MD, on the other hand, provides full information about atomic dy-namics, but only over very short time scales (typically a few picoseconds) andfor relatively few atoms (typically less than a hundred). In order to make thecalculations involved tractable, numerous more-or-less severe approximationsneed to be invoked (e.g. the local density approximation (LDA) in densityfunctional theory (DFT)), which make certain results imprecise (e.g. underes-timation of the bandgap). Moreover, the Kohn-Sham (KS) orbitals resultingfrom DFT are ground-state quantities and hence formally inadmissible fora consideration of excited-state behaviour. Nevertheless, use of these theo-retical methods has provided very useful atomistic information of help inunderstanding photo-induced phenomena in chalcogenide glasses. Some jus-tification for using both occupied and virtual KS orbitals for qualitative, andsometimes partially quantitative, analysis of electronic structure is providedby comparing the shape and symmetry properties, as well as the energy or-der, of these orbitals with those obtained by wavefunction-based (e.g. at theHartree-Fock level of theory [27]) and GW [28] calculations. There is a verystrong similarity between GW and LDA states, and often identical DFT andGW quasiparticle wavefunctions are assumed in practical calculations [29].

11.2.1 Application of the Density-Functional-based Tight-Bindingmethod to the case of amorphous As2S3

One approximate ab initio-based MD scheme that has proved very useful inunderstanding the electronic behaviour of chalcogenide glasses is the Density-Functional-based Tight-Binding (DFTB) method developed by Frauenheimand coworkers. Although this is reported in refs. [30–32], since this approachis not described elsewhere in this book, we give here a brief description of

252 S.I. Simdyankin and S.R. Elliott

this method and review our recent results on the canonical chalcogenide glassa-As2S3 obtained by using the DFTB method [33–35].

Although the DFTB method is semiempirical, it allows one to improveupon the standard tight-binding description of interatomic interactions by in-cluding a DFT-based self-consistent second order in charge fluctuation (SCC)correction to the total energy [31]. The flexibility in choosing the desired ac-curacy while computing the interatomic forces brings about the possibilityto perform much faster calculations when high precision is not required, andrefine the result if needed.

As described in ref. [36], the SCC-DFTB model is derived from density-functional theory (DFT) by a second-order expansion of the DFT total en-ergy functional with respect to the charge-density fluctuations δn′ = δn(r ′)around a given reference density n′0 = n0(r ′):

E =occ∑i

〈ψi|H0|ψi〉

+12

∫∫ ′( 1|r − r ′| +

δ2Exc

δn δn′

∣∣∣∣n0

)δn δn′ (11.1)

− 12

∫∫ ′ n′0n0

|r − r ′| +Exc[n0] −∫Vxc[n0]n0 +Eii,

where∫dr and

∫dr′ are expressed by

∫and

∫ ′, respectively. Here, H0 =H[n0] is the effective Kohn-Sham Hamiltonian evaluated at the referencedensity and the ψi are Kohn-Sham orbitals. Exc and Vxc are the exchange-correlation energy and potential, respectively and Eii is the core-core repul-sion energy.

To derive the total energy of the SCC-DFTB method, the energy contri-butions in Eq. (11.1) are further subjected to the following approximations:1) The Hamiltonian matrix elements 〈ψi|H0|ψi〉 are represented in a basis ofconfined, pseudoatomic orbitals φμ,

ψi =∑

μ

ciμφμ. (11.2)

To determine the basis functions φμ, the atomic DFT problem is solved byadding an additional harmonic potential ( r

r0)2 to confine the basis functions

[30]. The Hamiltonian matrix elements in this LCAO basis, H0μν, are then

calculated as follows. The diagonal elements H0μμ are taken to be the atomic

eigenvalues and the non-diagonal elements H0μν are calculated in a two-center

approximation:

H0μν = 〈φμ|T + veff [n0

α + n0β]|φν〉 μεα, νεβ, (11.3)

which are tabulated, together with the overlap matrix elements Sμν withrespect to the interatomic distance Rαβ. veff is the effective Kohn-Sham

11 Light-Induced Effects in Amorphous and Glassy Solids 253

potential and n0α are the densities of the neutral atoms α.

2) The charge-density fluctuations δn are written as a superposition of atomiccontributions δnα,

δn =∑α

δnα, (11.4)

which are approximated by the charge fluctuations at the atoms α, Δqα =qα − q0α. q0α is the number of electrons of the neutral atom α and the qαare determined from a Mulliken-charge analysis. The second derivative of thetotal energy in Eq. (11.1) is approximated by a function γαβ , whose functionalform for α �= β is determined analytically from the Coulomb interaction oftwo spherical charge distributions, located at Rα and Rβ. For α = β, itrepresents the electron-electron self-interaction on atom α.3) The remaining terms in Eq. (11.1), Eii and the energy contributions, whichdepend on n0 only, are collected in a single energy contribution Erep. Erep isthen approximated as a sum of short-range repulsive potentials,

Erep =∑α �=β

U [Rαβ], (11.5)

which depend on the interatomic distances Rαβ.With these definitions and approximations, the SCC-DFTB total energy

finally reads:

Etot =∑iμν

ciμciνH

0μν +

12

∑αβ

γαβΔqαΔqβ + Erep. (11.6)

Applying the variational principle to the energy functional (11.6), one obtainsthe corresponding Kohn-Sham equations:∑

ν

cνi(Hμν − εiSμν ) = 0, ∀ μ, i (11.7)

Hμν = 〈φμ|H0|φν〉 +12Sμν

∑ζ

(γαζ + γβζ )Δqζ, (11.8)

which have to be solved iteratively for the wavefunction expansion coeffi-cients ciμ, since the Hamiltonian matrix elements depend on the ciμ due tothe Mulliken charges. Analytic first derivatives for the calculation of inter-atomic forces are readily obtained, and second derivatives of the energy withrespect to atomic positions are calculated numerically.

The repulsive pair potentials U [Rαβ] are constructed by subtracting theDFT total energy from the SCC-DFTB electronic energy (first two terms onthe right-hand side of eq.(11.6)) with respect to the bond distance Rαβ for asmall set of suitable reference systems.

To summarize, in order to determine the appropriate parameters for a newelement, the following steps have to be taken. First, DFT calculations have to

254 S.I. Simdyankin and S.R. Elliott

be performed for the neutral atom to determine the LCAO basis functions φμ

and the reference densities n0α. Here the confinement radius can in principle

be chosen to be different for the density (rn0 ) and each type of atomic orbital

(rs,p,d0 ). The value of r0 is usually taken to be the same for s- and p-functions.

In a minimal basis, this yields a total number of two adjustable parametersfor elements in the first and second rows, while there are three if d-functionsare included. After this, the different matrix elements can be calculated andthe pair potentials U [Rαβ] are obtained as stated above for every combinationof the new element with the ones already parameterized.

The single-particle KS occupied, ψi, and unoccupied, ψj, orbitals andthe corresponding KS excitation energies ωij = εj − εi are sometimes usedfor qualitative analysis of electronic excitations. Although the KS excitationenergies can be considered as an approximation to true excitation energiesωI [37] (see also end of the introduction to Sec. 11.2), the correction terms tothis approximation can be very significant in many cases. Modelling electronicexcitations by varying the occupation of the KS orbitals results in mixedquantum states with undefined contributions from singlet and triplet states.We have performed modelling of electronic excitations at the level of KSorbitals and energies and have attempted to validate the results by usingmore exact methods. One such method is based on time-dependent density-functional response theory (TD-DFRT) [38, 39], where the true excitationenergies are found by solving the following eigenvalue problem:∑

ijσ

[ω2ijδikδjlδστ + 2

√ωijKijσ,klτ

√ωkl]F I

ijσ = ω2IF

Iklτ . (11.9)

Here σ and τ are spin indices. The indices i, k correspond to occupied, and j,l to unoccupied, KS orbitals, respectively. The coupling matrix K is definedas (see ref. [40] for a more detailed description of the method):

Kijσ,klτ =∫∫ ′

ψi(r)ψj(r)(

1|r− r′| +

δ2Exc

δnσ δn′τ

)ψk(r′)ψl(r′), (11.10)

where we use a notation consistent with that in Eq. (11.2).Our structural models of a-As2S3 were obtained by a “melt-and-quench”

procedure [35] similar to the one described in the preceding chapter byDrabold and Abtew. The structural quality of models can be assessed byexamining the radial distribution function (RDF) and the structure factor(see fig. 11.1). Apart from general good agreement with experimental data,an interesting feature revealed by fig. 11.1(a) is the shape of the first peakof the RDF. The shoulders on both sides of this peak indicate the presenceof homopolar (As-As or S-S) bonds in the material. Such bonds (chemicaldefects), as well as coordination (or topological) defects, are easily detectablein computer models. An example of a configuration featuring an As-As bondis shown in fig. 11.2. Special significance can be attributed to the presence of

11 Light-Induced Effects in Amorphous and Glassy Solids 255

(a)

0

1

2

3

4

5

6

0 1 2 3 4 5 6 7 8 9

g(r)

r [Å]

simulationexperiment

(b)

−2

−1

0

1

2

0 2 4 6 8 10 12 14 16 18 20 22 24

Q(S

(Q)−

1) [Å

−1 ]

Q [Å−1]

simulationexperiment

Fig. 11.1. (a) Pair-correlation functions for a 200-atom model of a-As2S3 and theneutron-diffraction experiment (ref. [41]). (b) Reduced structure factors (interfer-ence functions) for the same model and experiment.

five-membered rings in models with an appreciable concentration of homopo-lar bonds (models with all-heteropolar bonds contain only an even number ofatoms in all rings). When such rings share some of the bonds, the resultinglocal structure is close to that of cage-like molecules (e.g. As4S4), as foundin the vapor phase and in some chalcogenide molecular crystals. Fig. 11.2(a)shows two such bond-sharing rings. Upon breaking the two bonds connectingthe rings to the rest of the network, the distance between the two freed arsenicatoms (connected by a dashed line in fig. 11.2(a)) could be reduced, thus pro-ducing another As-As homopolar bond and this group of atoms would thenform an As4S4 molecule (shown in fig. 11.2(b)). Evidence of the presence ofsuch molecules in bulk AsxS1−x glasses from Raman-scattering experimentshas recently been reported in ref. [42]. Our result shows that the As4S4 frag-

256 S.I. Simdyankin and S.R. Elliott

(a) (b)

Fig. 11.2. (a) Fragment of a 200-atom model of a-As2S3 : two bond-sharing five-membered rings and the two AsS3 groups connected to this structure. The danglingbonds show where the displayed configuration connects to the rest of the amorphousnetwork. The dashed line connects two As atoms, which, if brought nearer together,would close up to form an As4S4 molecule shown in panel (b). The shading of theAs atoms (all with three neighbors) is darker than that of the S atoms (all withtwo neighbors).

ments may not only form discrete cage-like molecules but also be embeddedinto the amorphous network. We verified that the vibrational signatures ofthe As4S4 fragment from models 1 and 2 are similar to those from an isolatedAs4S4 molecule, apart from a few very symmetric modes of the latter. The ob-served tendency for formation of quasi-molecular structural groups suggeststhat amorphous chalcogenides can be viewed as nanostructured materials.

Localization of the electronic states near the optical band-gap edges isof great interest for studies of photoinduced phenomena. The inverse par-ticipation ratios (IPR, defined in the chapter by Drabold and Abtew) for amodel of a-As2S3 are shown in Fig. 11.3. The general picture is that, at thetop of the valence band, the eigenstates are predominantly localized at whatcan be called sulphur-rich regions, where several sulphur atoms are closerthan about 3.45 A, i.e. their interatomic distances are on the low-r side ofthe second peak in g(r) shown in Fig. 11.1(a) or some of these atoms formhomopolar S-S bonds. For instance, most of the HOMO (highest occupiedmolecular orbital) level in this model is localized at two sulphur atoms sep-arated by 3.42 A and which are part of the molecule-like fragment depictedin Fig. 11.2(a). By inspecting the projected (local) IPRs in Fig. 11.3(b) atthe optical gap edges, it is seen that the IPRs are greatest for the S atoms.It appears that the localization at the top of the valence band is facilitatedby the proximity of the lone-pair p orbitals in the sulphur-rich regions. Thisobservation is consistent with a result for a-GeS2 [44].

11 Light-Induced Effects in Amorphous and Glassy Solids 257

(a)

0

0.2

0.4

0.6

0.8

1

LED

OS

[eV

-1] As, s As, p As, d

0

0.2

0.4

0.6

0.8

1

-15-10 -5 0 5

LED

OS

[eV

-1]

E [eV]

S, s

-15-10 -5 0 5

E [eV]

S, p

-15-10 -5 0 5

E [eV]

S, d

0

0.2

0.4

0.6

0.8

1

-15 -10 -5 0 5

ED

OS

[eV

-1]

E [eV]

Total

simulationexperiment

(b)

0

0.05

0.1

0.15

0.2As, s

IPR

As, p

IPR

As, d

IPR

0

0.05

0.1

0.15

0.2

-15-10 -5 0 5

E [eV]

S, s

IPR

-15-10 -5 0 5

E [eV]

S, p

IPR

-15-10 -5 0 5

E [eV]

S, d

IPR

0

0.05

0.1

0.15

0.2

-15 -10 -5 0 5

E [eV]

Total

IPR

Fig. 11.3. Local and total electronic density of states (a) and inverse participationratios (b) for a 200-atom model of a-As2S3. The Fermi energy is at the energyorigin. The experimental data in (a) are obtained from ref. [43].

At the bottom of the conduction band, the states tend to localize atvarious anomalous local configurations, such as four-membered rings, S-S ho-mopolar bonds (some of these bonds are in five-membered rings) and valence-alternation pairs (coordination defects). For example, the LUMO (lowest un-occupied molecular orbital) state for the structure depicted in fig. 11.5(a,b)is localized on an overcoordinated S atom (with three bonded neighbours).

We found that a basis set of s, p and d Slater-type orbitals for all atoms isan essential prerequisite for the observation of overcoordinated defects [36,45].It is possible to analyse individual contributions of orbitals of each type tothe total EDOS and IPR, and these contributions are shown in Fig. 11.3.

As mentioned above, in the context of photoinduced metastability, a greatdeal of significance is attributed to the presence of topological and/or chem-ical defects [9]. It is therefore imperative to create models both with and

258 S.I. Simdyankin and S.R. Elliott

A

E FG

HI

C

D

B

(a)

A

D

BC

H

I

E

F

G

(b) (c)

Fig. 11.4. Planar trigonal, [S3]+ (marked by the letter ‘A’), and “seesaw”, [As4]

(marked by letter ‘E’), configurations in (a) a fragment of a 60-atom model of a-As2S3 (the dangling bonds show where the displayed configuration connects to therest (not shown) of the network) and, (b) and (c), charged isolated clusters(thedangling bonds are terminated with hydrogen atoms). The shading of the atoms isthe same as in fig. 11.2.

without such defects in a theoretical investigation that attempts to be con-clusive. Defect-free models can be produced by “surgical” manipulations. Forexample, atoms with undesired coordination can be removed from the model,and, in order to eliminate chemical defects, one can iteratively apply the fol-lowing algorithm. First, a sulphur atom is inserted in the middle of each As-Ashomopolar bond. Second, each S-S bond is replaced by a single sulphur atomlocated at its mid-point so that each local As-S-S-As configuration is turnedinto As-S-As. Third, the distance between each newly introduced S atom andits two nearest arsenic atoms in the newly created As-S-As units is reducedin order to increase the bonding character of the As-S bonds stretched bythe above manipulation. Fourth, the modified configuration is relaxed in anMD run. We have found that only a few iterations can be sufficient in orderto obtain models with all-heteropolar (As-S) bonds.

Elimination of the defects in our models removes some electronic statesin the optical band gap. As a result, the band gap broadens, which can beviewed as an artificial bleaching of the material. The states at the band edges,however, are still localized due to disorder. It is possible that exposure to lightmay lead to bond breaking and defect creation even in such “all-heteropolar”

11 Light-Induced Effects in Amorphous and Glassy Solids 259

B

C

A

D

E

D

C

B

A

E

(a) (c)

(b) (d)

Fig. 11.5. An As4S10H8 cluster containing both defect centers [As4]− (marked by

letter ‘A’) and [S3]+ (‘B’). The shading of the atoms is the same as in Fig 11.2.

The black solid lines signify elongated bonds. (a) Optimized ground-state geometry.Bond lengths are (A): AC=3.00, AD=2.32, and BE=2.4. (b) Isosurfaces correspond-ing to the value of 0.025 of electron density in the HOMO (darker red surface) andLUMO (lighter cyan surface) states for the structure shown in (a). (c) Optimizedgeometry in the first singlet excited state. Bond lengths are (A): AC = 2.43, AD =2.44, and BE = 2.82. (d) Same as (b), but for the structure shown in (c).

models. Indeed, As-S bond elongation/breaking has been observed in all-heteropolar clusters [23,24] and cage-like molecules (unpublished). Such bondbreaking under irradiation can lead to the creation of self-trapped excitons(STEs), i.e. oppositely charged defect pairs, as shown in ref. [29] for silicondioxide where, following Si-O bond breaking, the hole is localized at theoxygen defect centre and the electron at the silicon defect centre. In analogywith the case of SiO2, it is possible that positively charged chalcogen defectsand negatively charged non-chalcogen defects are introduced in amorphouschalcogenides by a similar mechanism. In the As-S system, such STEs canpossibly lead to creation of the [As4]−-[S3]+ defect pairs described below.

These defects, as well as those which are always present in real materi-als and “as-prepared” models, may mediate local structural rearrangementsupon photon absorption. Creation of additional states at optical band-gapedges (photodarkening) of a-As2S3 exposed to light has long been attributedto the generation by illumination of defects in excess of the thermal equi-

260 S.I. Simdyankin and S.R. Elliott

librium concentration [9], but the exact mechanism of defect creation andthe nature of the defects are still enigmatic. Possible candidate defect types,notably valence-alternation pairs, have been proposed over the years [9]. Nor-mally, such defect pairs contain singly-coordinated chalcogen atoms havingdistinct spectroscopic signatures [46]. Experimentally, the concentration ofthese defects is estimated to be rather small [47], i.e. 1017 cm−3, comparedwith the atomic density of about 2 × 1025 cm−3, in order quantitatively toaccount for the observed magnitude of the photo-induced effects.

In our simulations, in addition to the charged coordination defects previ-ously proposed to exist in chalcogenide glasses, a novel defect pair, [As4]−-[S3]+ (see fig. 11.4), consisting of a four-fold coordinated arsenic site in a“seesaw” configuration and a three-fold coordinated sulfur site in a near-planar trigonal configuration, was found in several models [33]. Such defectpairs are unusual in two ways. First, there is an excess of negative charge inthe vicinity of the normally electropositive pnictogen (As) atoms and, second,there are no under-coordinated atoms with dangling bonds in these local con-figurations. The latter peculiarity may be the reason why such defect pairshave not yet been identified experimentally. These defect pairs, however, areconsistent with the STEs described above.

Although electronic excitations, where one electron is promoted fromHOMO to LUMO Kohn-Sham states [48], are not especially realistic, wesimulate such excitations in order qualitatively to assess defect stability withrespect to (optically-induced) electronic excitations. In some models, [S3]+-S−

1 defect pairs are converted into [As4]−-[S3]+ pairs as a result of the elec-tronic excitation. The presence of these defect pairs introduces additionallocalized states at the optical band-gap edges with energies quantitativelyconsistent with the phenomenon of photodarkening [34].

A possible mechanism of conversion of [S3]+-S−1 defect pairs into [As4]−-

[S3]+ pairs is illustrated in fig. 11.5, which shows an As4S10H8 cluster con-taining an [S3]+-S−

1 pair in the ground state. Geometry optimization in thefirst singlet excited state within the linear-response approximation to time-dependent (TD) density-functional theory (which gives a much better de-scription of excited states compared with HOMO-to-LUMO electron exci-tations [40]) leads to a redistribution of electron density, so that the singlycoordinated S atoms become attached to a normally coordinated As atomthus forming an [As4]− defect. At the same time, the [S3]+ defect breaksup (we observed that bond breaking/elongation in all our models generallyoccurs at the groups of atoms where the LUMO is localized, indicating theexpected antibonding character of LUMO states.). Perhaps in bulk materialssimilar rearangements can lead to the creation of an [S3]+ centre at a differ-ent location, which is suggested by the observation that, in our simulationsof photoexcitations in supercell

models, the [S3]+ centres after excitation are not necessarily located atthe same S atom as before the excitation.

11 Light-Induced Effects in Amorphous and Glassy Solids 261

References

1. S.R. Elliott: Physics of Amorphous Materials, 2nd edn (Longman Scientific andTechnical, London 1990)

2. J.J. Ludlam, S.N. Taraskin, S.R. Elliott: J. Phys.: Condens. Matter 17, L321(2005)

3. S.R. Elliott: Chalcogenide Glasses. In: Materials Science and Technology, vol 9,ed by J. Zarzycki (VCH, Weinheim 1991) p 375

4. A. Zakery, S.R. Elliott: J. Non-Cryst. Solids 330, 1 (2003)5. H. Hisakuni, K. Tanaka: Science 270, 974 (1995)6. S.N. Yannopoulos: Photo-plastic effects in chalcogenide glasses: Raman scat-

tering studies. In: Photo-Induced Metastability in Amorphous Semiconductors,ed by A.V. Kolobov (Wiley-VCH, Weinheim 2003) p 119

7. P. Krecmer, A. M. Moulin, T. Rayment et al: Science 277, 1799 (1997)8. M. Stuchlik, S.R. Elliott: Proc. IEE A: Science, Measurement and Technology

151, 131 (2004)9. A.V. Kolobov (ed): Photo-Induced Metastability in Amorphous Semiconductors

(Wiley-VCH, Weinheim 2003)10. K. Shimakawa, A. Kolobov, S. R. Elliott: Adv. Phys. 44, 475 (1995)11. S.A. Solin, G.V. Papatheodorou: Phys. Rev. B 15, 2084 (1977)12. S. Rajagopalan, K.S. Harshavardhan, L.K. Malhotra et al: J. Non-Cryst. Solids

50, 29 (1982)13. H. Hisakuni and K. Tanaka: Appl. Phys. Lett. 65, 2925 (1994)14. S.R. Ovshinsky: Phys. Rev. Lett. 21, 1450 (1968)15. T. Ohta, S.R. Ovshinsky: Phase-change optical storage media. In: Photo-

Induced Metastability in Amorphous Semiconductors, ed by A.V. Kolobov(Wiley-VCH, Weinheim 2003) p 310

16. S.R. Elliott, A.V. Kolobov: J. Non-Cryst. Solids 128, 216 (1991)17. A. Kolobov, S.R. Elliott: Adv. Phys. 40, 625 (1991)18. T. Wagner, M. Frumar: Optically-induced diffusion and dissolution in met-

als in amorphous chalcogenides. In: Photo-Induced Metastability in AmorphousSemiconductors, ed by A.V. Kolobov (Wiley-VCH, Weinheim 2003) p 160

19. S. Maruno, T. Kawaguchi: J. Appl. Phys. 46, 5312 (1975)20. T. Kawaguchi: Photo-induced deposition of silver particles on amorphous semi-

conductors. In: Photo-Induced Metastability in Amorphous Semiconductors, edby A.V. Kolobov (Wiley-VCH, Weinheim 2003) p 182

21. V.G. Zhdanov, V.K. Malinovskii: Sov. Tech. Phys. Lett. 3, 943 (1977)22. V.M. Lyubin, M.L. Klebanov: Photo-induced anisotropy in chalcogenide glassy

semiconductors. In: Photo-Induced Metastability in Amorphous Semiconduc-tors, ed by A.V. Kolobov (Wiley-VCH, Weinheim 2003) p 91

23. T. Uchino, D.C. Clary, S.R. Elliott: Phys. Rev. Lett. 85, 3305 (2000)24. T. Uchino, D.C. Clary, S.R. Elliott: Phys. Rev. B 65, 174204 (2002)25. T. Uchino, S.R. Elliott: Phys. Rev. B 67, 174201 (2003)26. T. Mowrer, G. Lucovsky, L.S. Sremaniak et al: J. Non-Cryst. Solids 338, 543

(2004)27. R. Stowasser, R. Hoffman: J. Am. Chem. Soc. 121, 3414 (1999)28. M.S. Hybertsen, S.G. Louie: Phys. Rev. B 34, 5390 (1986)29. S. Ismail-Beigi, S.G. Louie: Phys. Rev. Lett. 95, 156401 (2005)30. D. Porezag, T. Frauenheim, T. Kohler et al: Phys. Rev. B 51, 12947 (1995)

262 S.I. Simdyankin and S.R. Elliott

31. M. Elstner, D. Porezag, G. Jungnickel et al: Phys. Rev. B 58, 7260 (1998)32. T. Frauenheim, G. Seifert, M. Elstner et al: J. Phys.: Condens. Matter 14, 3015

(2002)33. S.I. Simdyankin, T.A. Niehaus, G. Natarajan et al: Phys. Rev. Lett. 94, 086401

(2005)34. S.I. Simdyankin, M. Elstner, T.A. Niehaus et al: Phys. Rev. B 72, 020202

(2005)35. S.I. Simdyankin, S.R. Elliott, Z. Hajnal et al: Phys. Rev. B 69, 144202 (2004)36. T.A. Niehaus, M. Elstner, Th. Frauenheim et al: J. Mol. Struct. THEOCHEM

541, 185 (2001)37. A. Gorling: Phys. Rev. A 54, 3912 (1996)38. M.E. Casida, in: Recent Advances in Density Functional Methods, ed by D.P.

Chong, Part I (World Scientific, Singapore 1995) p 15539. M.E. Casida, in: Recent Developments and Applications of Modern Density

Functional Theory, ed by J.M. Seminario, Theoretical and ComputationalChemistry, Vol 4 (Elsevier Science, Amsterdam 1996) p 391

40. T.A. Niehaus, S. Suhai, F. Della Sala et al: Phys. Rev. B 63, 085108 (2001)41. J.H. Lee, A.C. Hannon, S.R. Elliott: eprint: cond-mat/040258742. D.G. Georgiev, P. Boolchand, K.A. Jackson: Philos. Mag. 83, 2941 (2003)43. S.G. Bishop, N.J. Shevchik: Phys. Rev. B 12, 1567 (1975)44. S. Blaineau, P. Jund: Phys. Rev. B 70, 184210 (2004)45. S.I. Simdyankin, S.R. Elliott, T.A. Niehaus et al: Effect of defects in amor-

phous chalcogenides on the atomic structure and localisation of electronic eigen-states. In: Computational Modeling and Simulation of Materials III, vol A, edby P. Vincenzini, A. Lami, F. Zerbetto (Techna Group s.r.l., Faenza, Italy 2004)p 149

46. M. Kastner, D. Adler, H. Fritzsche: Phys. Rev. Lett. 37, 1504 (1976)47. A. Feltz: Amorphous Inorganic Materials and Glasses (VCH, Weinheim 1993)

p 20348. X. Zhang, D.A. Drabold: Phys. Rev. Lett. 83, 5042 (1999)