Statistical Physics of Biomolecules: An Introduction

Daniel M. Zuckerman

CRC Press, Taylor & Francis Group

CRC Press is an imprint of the Taylor & Francis Group, an Informa business

Contents

Preface xix

Acknowledgments xxi

Chapter 1 Proteins Don't Know Biology 1

1.1 Prologue: Statistical Physics of Candy, Dirt, and Biology 1

1.1.1 Candy 1

1.1.2 Clean Your House, Statistically 2

1.1.3 More Seriously 3

1.2 Guiding Principles 4

1.2.1 Proteins Don't Know Biology 4

1.2.2 Nature Has Never Heard of Equilibrium 4

1.2.3 Entropy Is Easy 5

1.2.4 Three Is the Magic Number for Visualizing Data 5

1.2.5 Experiments Cannot Be Separated from "Theory" 5

1.3 About This Book 5

1.3.1 What Is Biomolecular Statistical Physics? 5

1.3.2 What's in This Book, and What's Not 6

1.3.3 Background Expected of the Student 7

1.4 Molecular Prologue: A Day in the Life of Butane 7

1.4.1 Exemplary by Its Stupidity 9

1.5 What Does Equilibrium Mean to a Protein? 9

1.5.1 Equilibrium among Molecules 9

1.5.2 Internal Equilibrium 10

1.5.3 Time and Population Averages 11

1.6 A Word on Experiments 11

1.7 Making Movies: Basic Molecular Dynamics Simulation 12

1.8 Basic Protein Geometry 14

1.8.1 Proteins Fold 14

1.8.2 There Is a Hierarchy within Protein Structure 14

1.8.3 The Protein Geometry We Need to Know, for Now 15

1.8.4 The Amino Acid 16

1.8.5 The Peptide Plane 17

1.8.6 The Two Main Dihedral Angles Are Not Independent 17

1.8.7 Correlations Reduce Configuration Space, but Not Enough to Make Calculations Easy 18

1.8.8 Another Exemplary Molecule: Alanine Dipeptide 18

1.9 A Note on the Chapters 18

Further Reading 19

Chapter 2 The Heart of It All: Probability Theory 21

2.1 Introduction 21

2.1.1 The Monty Hall Problem 21

2.2 Basics of One-Dimensional Distributions 22

2.2.1 What Is a Distribution? 22

2.2.2 Make Sure It's a Density! 25

2.2.3 There May Be More than One Peak: Multimodality 25

2.2.4 Cumulative Distribution Functions 26

2.2.5 Averages 28

2.2.6 Sampling and Samples 29

2.2.7 The Distribution of a Sum of Increments: Convolutions 31

2.2.8 Physical and Mathematical Origins of Some Common Distributions 34

2.2.9 Change of Variables 36

2.3 Fluctuations and Error 36

2.3.1 Variance and Higher "Moments" 37

2.3.2 The Standard Deviation Gives the Scale of a Unimodal Distribution 38

2.3.3 The Variance of a Sum (Convolution) 39

2.3.4 A Note on Diffusion 40

2.3.5 Beyond Variance: Skewed Distributions and Higher Moments 41

2.3.6 Error (Not Variance) 41

2.3.7 Confidence Intervals 43

2.4 Two+ Dimensions: Projection and Correlation 43

2.4.1 Projection/Marginalization 44

2.4.2 Correlations, in a Sentence 45

2.4.3 Statistical Independence 46

2.4.4 Linear Correlation 46

2.4.5 More Complex Correlation 48

2.4.6 Physical Origins of Correlations 50

2.4.7 Joint Probability and Conditional Probability 51

2.4.8 Correlations in Time 52

2.5 Simple Statistics Help Reveal a Motor Protein's Mechanism 54

2.6 Additional Problems: Trajectory Analysis 54

Further Reading 55

Chapter 3 Big Lessons from Simple Systems: Equilibrium Statistical Mechanics in One Dimension 57

3.1 Introduction 57

3.1.1 Looking Ahead 57

3.2 Energy Landscapes Are Probability Distributions 58

3.2.1 Translating Probability Concepts into the Language of Statistical Mechanics 60

3.2.2 Physical Ensembles and the Connection with Dynamics 61

3.2.3 Simple States and the Harmonic Approximation 61

3.2.4 A Hint of Fluctuations: Average Does Not Mean Most Probable 63

3.3 States, Not Configurations 65

3.3.1 Relative Populations 65

3.4 Free Energy: It's Just Common Sense... If You Believe in Probability 66

3.4.1 Getting Ready: Relative Populations 67

3.4.2 Finally, the Free Energy 68

3.4.3 More General Harmonic Wells 69

3.5 Entropy: It's Just a Name 70

3.5.1 Entropy as (the Log of) Width: Double Square Wells 71

3.5.2 Entropy as Width in Harmonic Wells 73

3.5.3 That Awful Σ p ln p Formula 74

3.6 Summing Up 76

3.6.1 States Get the Fancy Names because They're Most Important 76

3.6.2 It's the Differences That Matter 77

3.7 Molecular Intuition from Simple Systems 78

3.7.1 Temperature Dependence: A One-Dimensional Model of Protein Folding 78

3.7.2 Discrete Models 80

3.7.3 A Note on 1D Multi-Particle Systems 81

3.8 Loose Ends: Proper Dimensions, Kinetic Energy 81

Further Reading 83

Chapter 4 Nature Doesn't Calculate Partition Functions: Elementary Dynamics and Equilibrium 85

4.1 Introduction 85

4.1.1 Equivalence of Time and Configurational Averages 86

4.1.2 An Aside: Does Equilibrium Exist? 86

4.2 Newtonian Dynamics: Deterministic but Not Predictable 87

4.2.1 The Probabilistic ("Stochastic") Picture of Dynamics 89

4.3 Barrier Crossing—Activated Processes 89

4.3.1 A Quick Preview of Barrier Crossing 89

4.3.2 Catalysts Accelerate Rates by Lowering Barriers 91

4.3.3 A Decomposition of the Rate 91

4.3.4 More on Arrhenius Factors and Their Limitations 92

4.4 Flux Balance: The Definition of Equilibrium 92

4.4.1 "Detailed Balance" and a More Precise Definition of Equilibrium 94

4.4.2 Dynamics Causes Equilibrium Populations 94

4.4.3 The Fundamental Differential Equation 95

4.4.4 Are Rates Constant in Time? (Advanced) 95

4.4.5 Equilibrium Is "Self-Healing" 96

4.5 Simple Diffusion, Again 97

4.5.1 The Diffusion Constant and the Square-Root Law of Diffusion 98

4.5.2 Diffusion and Binding 100

4.6 More on Stochastic Dynamics: The Langevin Equation 100

4.6.1 Overdamped, or "Brownian," Motion and Its Simulation 102

4.7 Key Tools: The Correlation Time and Function 103

4.7.1 Quantifying Time Correlations: The Autocorrelation Function 104

4.7.2 Data Analysis Guided by Time Correlation Functions 105

4.7.3 The Correlation Time Helps to Connect Dynamics and Equilibrium 106

4.8 Tying It All Together 106

4.9 So Many Ways to ERR: Dynamics in Molecular Simulation 107

4.10 Mini-Project: Double-Well Dynamics 108

Further Reading 109

Chapter 5 Molecules Are Correlated! Multidimensional Statistical Mechanics 111

5.1 Introduction 111

5.1.1 Many Atoms in One Molecule and/or Many Molecules 111

5.1.2 Working toward Thermodynamics 112

5.1.3 Toward Understanding Simulations 112

5.2 A More-than-Two-Dimensional Prelude 112

5.2.1 One "Atom" in Two Dimensions 113

5.2.2 Two Ideal (Noninteracting) "Atoms" in 2D 114

5.2.3 A Diatomic "Molecule" in 2D 115

5.2.4 Lessons Learned in Two Dimensions 119

5.3 Coordinates and Forcefields 119

5.3.1 Cartesian Coordinates 119

5.3.2 Internal Coordinates 120

5.3.3 A Forcefield Is Just a Potential Energy Function 121

5.3.4 Jacobian Factors for Internal Coordinates (Advanced) 123

5.4 The Single-Molecule Partition Function 124

5.4.1 Three Atoms Is Too Many for an Exact Calculation 125

5.4.2 The General Unimolecular Partition Function 126

5.4.3 Back to Probability Theory and Correlations 127

5.4.4 Technical Aside: Degeneracy Number 128

5.4.5 Some Lattice Models Can Be Solved Exactly 129

5.5 Multimolecular Systems 130

5.5.1 Partition Functions for Systems of Identical Molecules 131

5.5.2 Ideal Systems—Uncorrelated by Definition 132

5.5.3 Nonideal Systems 132

5.6 The Free Energy Still Gives the Probability 133

5.6.1 The Entropy Still Embodies Width (Volume) 134

5.6.2 Defining States 134

5.6.3 Discretization Again Implies S ~ -Σ p ln p 135

5.7 Summary 135

Further Reading 135

Chapter 6 From Complexity to Simplicity: The Potential of Mean Force 137

6.1 Introduction: PMFs Are Everywhere 137

6.2 The Potential of Mean Force Is Like a Free Energy 137

6.2.1 The PMF Is Exactly Related to a Projection 138

6.2.2 Proportionality Functions for PMFs 140

6.2.3 PMFs Are Easy to Compute from a Good Simulation 141

6.3 The PMF May Not Yield the Reaction Rate or Transition State 142

6.3.1 Is There Such a Thing as a Reaction Coordinate? 143

6.4 The Radial Distribution Function 144

6.4.1 What to Expect for g(r) 145

6.4.2 g(r) Is Easy to Get from a Simulation 146

6.4.3 The PMF Differs from the "Bare" Pair Potential 148

6.4.4 From g(r) to Thermodynamics in Pairwise Systems 149

6.4.5 g(r) Is Experimentally Measurable 149

6.5 PMFs Are the Typical Basis for "Knowledge-Based" ("Statistical") Potentials 150

6.6 Summary: The Meaning, Uses, and Limitations of the PMF 150

Further Reading 151

Chapter 7 What's Free about "Free" Energy? Essential Thermodynamics 153

7.1 Introduction 153

7.1.1 An Apology: Thermodynamics Does Matter! 153

7.2 Statistical Thermodynamics: Can You Take a Derivative? 154

7.2.1 Quick Reference on Derivatives 154

7.2.2 Averages and Entropy, via First Derivatives 155

7.2.3 Fluctuations from Second Derivatives 157

7.2.4 The Specific Heat, Energy Fluctuations, and the (Un)folding Transition 157

7.3 You Love the Ideal Gas 158

7.3.1 Free Energy and Entropy of the Ideal Gas 159

7.3.2 The Equation of State for the Ideal Gas 160

7.4 Boring but True: The First Law Describes Energy Conservation 160

7.4.1 Applying the First Law to the Ideal Gas: Heating at Constant Volume 161

7.4.2 Why Is It Called "Free" Energy, Anyway? The Ideal Gas Tells All 162

7.5 G vs. F: Other Free Energies and Why They (Sort of) Matter 164

7.5.1 G, Constant Pressure, Fluctuating Volume—A Statistical View 164

7.5.2 When Is It Important to Use G Instead of F? 166

7.5.3 Enthalpy and the Thermodynamic Definition of G 168

7.5.4 Another Derivative Connection—Getting P from F 169

7.5.5 Summing Up: G vs. F 170

7.5.6 Chemical Potential and Fluctuating Particle Numbers 171

7.6 Overview of Free Energies and Derivatives 173

7.6.1 The Pertinent Free Energy Depends on the Conditions 173

7.6.2 Free Energies Are "State Functions" 174

7.6.3 First Derivatives of Free Energies Yield Averages 174

7.6.4 Second Derivatives Yield Fluctuations/Susceptibilities 174

7.7 The Second Law and (Sometimes) Free Energy Minimization 175

7.7.1 A Kinetic View Is Helpful 175

7.7.2 Spontaneous Heat Flow and Entropy 175

7.7.3 The Second Law for Free Energies—Minimization, Sometimes 177

7.7.4 PMFs and Free Energy Minimization for Proteins—Be Warned! 179

7.7.5 The Second Law for Your House: Refrigerators Are Heaters 181

7.7.6 Summing Up: The Second Law 181

7.8 Calorimetry: A Key Thermodynamic Technique 182

7.8.1 Integrating the Specific Heat Yields Both Enthalpy and Entropy 182

7.8.2 Differential Scanning Calorimetry for Protein Folding 183

7.9 The Bare-Bones Essentials of Thermodynamics 183

7.10 Key Topics Omitted from This Chapter 184

Further Reading 184

Chapter 8 The Most Important Molecule: Electro-Statistics of Water 185

8.1 Basics of Water Structure 185

8.1.1 Water Is Tetrahedral because of Its Electron Orbitals 185

8.1.2 Hydrogen Bonds 185

8.1.3 Ice 186

8.1.4 Fluctuating H-Bonds in Water 187

8.1.5 Hydronium Ions, Protons, and Quantum Fluctuations 187

8.2 Water Molecules Are Structural Elements in Many Crystal Structures 188

8.3 The pH of Water and Acid-Base Ideas 188

8.4 Hydrophobic Effect 190

8.4.1 Hydrophobicity in Protein and Membrane Structure 190

8.4.2 Statistical/Entropic Explanation of the Hydrophobic Effect 190

8.5 Water Is a Strong Dielectric 192

8.5.1 Basics of Dielectric Behavior 193

8.5.2 Dielectric Behavior Results from Polarizability 194

8.5.3 Water Polarizes Primarily due to Reorientation 195

8.5.4 Charges Prefer Water Solvation to a Nonpolar Environment 196

8.5.5 Charges on Protein in Water = Complicated! 196

8.6 Charges in Water + Salt = Screening 197

8.6.1 Statistical Mechanics of Electrostatic Systems (Technical) 198

8.6.2 First Approximation: The Poisson-Boltzmann Equation 200

8.6.3 Second Approximation: Debye-Hückel Theory 200

8.6.4 Counterion Condensation on DNA 202

8.7 A Brief Word on Solubility 202

8.8 Summary 203

8.9 Additional Problem: Understanding Differential Electrostatics 203

Further Reading 204

Chapter 9 Basics of Binding and Allostery 205

9.1 A Dynamical View of Binding: On- and Off-Rates 205

9.1.1 Time-Dependent Binding: The Basic Differential Equation 207

9.2 Macroscopic Equilibrium and the Binding Constant 208

9.2.1 Interpreting Kd 209

9.2.2 The Free Energy of Binding ΔG_bind Is Based on a Reference State 210

9.2.3 Measuring Kd by a "Generic" Titration Experiment 211

9.2.4 Measuring Kd from Isothermal Titration Calorimetry 211

9.2.5 Measuring Kd by Measuring Rates 212

9.3 A Structural-Thermodynamic View of Binding 212

9.3.1 Pictures of Binding: "Lock and Key" vs. "Induced Fit" 212

9.3.2 Many Factors Affect Binding 213

9.3.3 Entropy-Enthalpy Compensation 215

9.4 Understanding Relative Affinities: ΔΔG and Thermodynamic Cycles 216

9.4.1 The Sign of ΔΔG Has Physical Meaning 216

9.4.2 Competitive Binding Experiments 218

9.4.3 "Alchemical" Computations of Relative Affinities 218

9.5 Energy Storage in "Fuels" Like ATP 220

9.6 Direct Statistical Mechanics Description of Binding 221

9.6.1 What Are the Right Partition Functions? 221

9.7 Allostery and Cooperativity 222

9.7.1 Basic Ideas of Allostery 222

9.7.2 Quantifying Cooperativity with the Hill Constant 224

9.7.3 Further Analysis of Allostery: MWC and KNF Models 227

9.8 Elementary Enzymatic Catalysis 229

9.8.1 The Steady-State Concept 230

9.8.2 The Michaelis-Menten "Velocity" 231

9.9 pH and pKa 231

9.9.1 pH 232

9.9.2 pKa 232

9.10 Summary 233

Further Reading 233

Chapter 10 Kinetics of Conformational Change and Protein Folding 235

10.1 Introduction: Basins, Substates, and States 235

10.1.1 Separating Timescales to Define Kinetic Models 235

10.2 Kinetic Analysis of Multistate Systems 238

10.2.1 Revisiting the Two-State System 238

10.2.2 A Three-State System: One Intermediate 242

10.2.3 The Effective Rate in the Presence of an Intermediate 246

10.2.4 The Rate When There Are Parallel Pathways 250

10.2.5 Is There Such a Thing as Nonequilibrium Kinetics? 251

10.2.6 Formalism for Systems Described by Many States 252

10.3 Conformational and Allosteric Changes in Proteins 252

10.3.1 What Is the "Mechanism" of a Conformational Change? 252

10.3.2 Induced and Spontaneous Transitions 253

10.3.3 Allosteric Mechanisms 254

10.3.4 Multiple Pathways 255

10.3.5 Processivity vs. Stochasticity 255

10.4 Protein Folding 256

10.4.1 Protein Folding in the Cell 257

10.4.2 The Levinthal Paradox 258

10.4.3 Just Another Type of Conformational Change? 258

10.4.4 What Is the Unfolded State? 259

10.4.5 Multiple Pathways, Multiple Intermediates 260

10.4.6 Two-State Systems, Φ Values, and Chevron Plots 262

10.5 Summary 264

Further Reading 264

Chapter 11 Ensemble Dynamics: From Trajectories to Diffusion and Kinetics 265

11.1 Introduction: Back to Trajectories and Ensembles 265

11.1.1 Why We Should Care about Trajectory Ensembles 265

11.1.2 Anatomy of a Transition Trajectory 266

11.1.3 Three General Ways to Describe Dynamics 267

11.2 One-Dimensional Ensemble Dynamics 271

11.2.1 Derivation of the One-Dimensional Trajectory Energy: The "Action" 272

11.2.2 Physical Interpretation of the Action 274

11.3 Four Key Trajectory Ensembles 275

11.3.1 Initialized Nonequilibrium Trajectory Ensembles 275

11.3.2 Steady-State Nonequilibrium Trajectory Ensembles 275

11.3.3 The Equilibrium Trajectory Ensemble 276

11.3.4 Transition Path Ensembles 276

11.4 From Trajectory Ensembles to Observables 278

11.4.1 Configuration-Space Distributions from Trajectory Ensembles 279

11.4.2 Finding Intermediates in the Path Ensemble 280

11.4.3 The Commitment Probability and a Transition-State Definition 280

11.4.4 Probability Flow, or Current 281

11.4.5 What Is the Reaction Coordinate? 281

11.4.6 From Trajectory Ensembles to Kinetic Rates 282

11.4.7 More General Dynamical Observables from Trajectories 283

11.5 Diffusion and Beyond: Evolving Probability Distributions 283

11.5.1 Diffusion Derived from Trajectory Probabilities 284

11.5.2 Diffusion on a Linear Landscape 285

11.5.3 The Diffusion (Differential) Equation 287

11.5.4 Fokker-Planck/Smoluchowski Picture for Arbitrary Landscapes 289

11.5.5 The Issue of History Dependence 291

11.6 The Jarzynski Relation and Single-Molecule Phenomena 293

11.6.1 Revisiting the Second Law of Thermodynamics 294

11.7 Summary 294

Further Reading 295

Chapter 12 A Statistical Perspective on Biomolecular Simulation 297

12.1 Introduction: Ideas, Not Recipes 297

12.1.1 Do Simulations Matter in Biophysics? 297

12.2 First, Choose Your Model: Detailed or Simplified 298

12.2.1 Atomistic and "Detailed" Models 299

12.2.2 Coarse Graining and Related Ideas 299

12.3 "Basic" Simulations Emulate Dynamics 300

12.3.1 Timescale Problems, Sampling Problems 301

12.3.2 Energy Minimization vs. Dynamics/Sampling 304

12.4 Metropolis Monte Carlo: A Basic Method and Variations 305

12.4.1 Simple Monte Carlo Can Be Quasi-Dynamic 305

12.4.2 The General Metropolis-Hastings Algorithm 306

12.4.3 MC Variations: Replica Exchange and Beyond 307

12.5 Another Basic Method: Reweighting and Its Variations 309

12.5.1 Reweighting and Annealing 310

12.5.2 Polymer-Growth Ideas 311

12.5.3 Removing Weights by "Resampling" Methods 312

12.5.4 Correlations Can Arise Even without Dynamics 313

12.6 Discrete-State Simulations 313

12.7 How to Judge Equilibrium Simulation Quality 313

12.7.1 Visiting All Important States 314

12.7.2 Ideal Sampling as a Key Conceptual Reference 314

12.7.3 Uncertainty in Observables and Averages 314

12.7.4 Overall Sampling Quality 315

12.8 Free Energy and PMF Calculations 316

12.8.1 PMF and Configurational Free Energy Calculations 317

12.8.2 Thermodynamic Free Energy Differences Include All Space 318

12.8.3 Approximate Methods for Drug Design 320

12.9 Path Ensembles: Sampling Trajectories 321

12.9.1 Three Strategies for Sampling Paths 321

12.10 Protein Folding: Dynamics and Structure Prediction 322

12.11 Summary 323

Further Reading 323

Index 325