Statistical Physics of Biomolecules: An Introduction
Daniel M. Zuckerman
CRC Press, Taylor & Francis Group
CRC Press is an imprint of the
Taylor & Francis Group, an Informa business
Contents
Preface xix
Acknowledgments xxi
Chapter 1 Proteins Don't Know Biology 1
1.1 Prologue: Statistical Physics of Candy, Dirt, and Biology 1
1.1.1 Candy 1
1.1.2 Clean Your House, Statistically 2
1.1.3 More Seriously 3
1.2 Guiding Principles 4
1.2.1 Proteins Don't Know Biology 4
1.2.2 Nature Has Never Heard of Equilibrium 4
1.2.3 Entropy Is Easy 5
1.2.4 Three Is the Magic Number for Visualizing Data 5
1.2.5 Experiments Cannot Be Separated from "Theory" 5
1.3 About This Book 5
1.3.1 What Is Biomolecular Statistical Physics? 5
1.3.2 What's in This Book, and What's Not 6
1.3.3 Background Expected of the Student 7
1.4 Molecular Prologue: A Day in the Life of Butane 7
1.4.1 Exemplary by Its Stupidity 9
1.5 What Does Equilibrium Mean to a Protein? 9
1.5.1 Equilibrium among Molecules 9
1.5.2 Internal Equilibrium 10
1.5.3 Time and Population Averages 11
1.6 A Word on Experiments 11
1.7 Making Movies: Basic Molecular Dynamics Simulation 12
1.8 Basic Protein Geometry 14
1.8.1 Proteins Fold 14
1.8.2 There Is a Hierarchy within Protein Structure 14
1.8.3 The Protein Geometry We Need to Know,
for Now 15
1.8.4 The Amino Acid 16
1.8.5 The Peptide Plane 17
1.8.6 The Two Main Dihedral Angles Are Not
Independent 17
1.8.7 Correlations Reduce Configuration Space, but Not
Enough to Make Calculations Easy 18
1.8.8 Another Exemplary Molecule: Alanine Dipeptide 18
1.9 A Note on the Chapters 18
Further Reading 19
Chapter 2 The Heart of It All: Probability Theory 21
2.1 Introduction 21
2.1.1 The Monty Hall Problem 21
2.2 Basics of One-Dimensional Distributions 22
2.2.1 What Is a Distribution? 22
2.2.2 Make Sure It's a Density! 25
2.2.3 There May Be More than One Peak:
Multimodality 25
2.2.4 Cumulative Distribution Functions 26
2.2.5 Averages 28
2.2.6 Sampling and Samples 29
2.2.7 The Distribution of a Sum of Increments:
Convolutions 31
2.2.8 Physical and Mathematical Origins of Some
Common Distributions 34
2.2.9 Change of Variables 36
2.3 Fluctuations and Error 36
2.3.1 Variance and Higher "Moments" 37
2.3.2 The Standard Deviation Gives the Scale of a
Unimodal Distribution 38
2.3.3 The Variance of a Sum (Convolution) 39
2.3.4 A Note on Diffusion 40
2.3.5 Beyond Variance: Skewed Distributions
and Higher Moments 41
2.3.6 Error (Not Variance) 41
2.3.7 Confidence Intervals 43
2.4 Two+ Dimensions: Projection and Correlation 43
2.4.1 Projection/Marginalization 44
2.4.2 Correlations, in a Sentence 45
2.4.3 Statistical Independence 46
2.4.4 Linear Correlation 46
2.4.5 More Complex Correlation 48
2.4.6 Physical Origins of Correlations 50
2.4.7 Joint Probability and Conditional Probability 51
2.4.8 Correlations in Time 52
2.5 Simple Statistics Help Reveal a Motor Protein's
Mechanism 54
2.6 Additional Problems: Trajectory Analysis 54
Further Reading 55
Chapter 3 Big Lessons from Simple Systems: Equilibrium Statistical
Mechanics in One Dimension 57
3.1 Introduction 57
3.1.1 Looking Ahead 57
3.2 Energy Landscapes Are Probability Distributions 58
3.2.1 Translating Probability Concepts into the
Language of Statistical Mechanics 60
3.2.2 Physical Ensembles and the Connection with
Dynamics 61
3.2.3 Simple States and the Harmonic Approximation 61
3.2.4 A Hint of Fluctuations: Average Does Not Mean
Most Probable 63
3.3 States, Not Configurations 65
3.3.1 Relative Populations 65
3.4 Free Energy: It's Just Common Sense... If You Believe in
Probability 66
3.4.1 Getting Ready: Relative Populations 67
3.4.2 Finally, the Free Energy 68
3.4.3 More General Harmonic Wells 69
3.5 Entropy: It's Just a Name 70
3.5.1 Entropy as (the Log of) Width: Double
Square Wells 71
3.5.2 Entropy as Width in Harmonic Wells 73
3.5.3 That Awful Σp ln p Formula 74
3.6 Summing Up 76
3.6.1 States Get the Fancy Names because They're Most
Important 76
3.6.2 It's the Differences That Matter 77
3.7 Molecular Intuition from Simple Systems 78
3.7.1 Temperature Dependence: A One-Dimensional
Model of Protein Folding 78
3.7.2 Discrete Models 80
3.7.3 A Note on 1D Multi-Particle Systems 81
3.8 Loose Ends: Proper Dimensions, Kinetic Energy 81
Further Reading 83
Chapter 4 Nature Doesn't Calculate Partition Functions: Elementary
Dynamics and Equilibrium 85
4.1 Introduction 85
4.1.1 Equivalence of Time and Configurational Averages 86
4.1.2 An Aside: Does Equilibrium Exist? 86
4.2 Newtonian Dynamics: Deterministic but Not Predictable 87
4.2.1 The Probabilistic ("Stochastic") Picture of
Dynamics 89
4.3 Barrier Crossing—Activated Processes 89
4.3.1 A Quick Preview of Barrier Crossing 89
4.3.2 Catalysts Accelerate Rates by Lowering Barriers 91
4.3.3 A Decomposition of the Rate 91
4.3.4 More on Arrhenius Factors and Their Limitations 92
4.4 Flux Balance: The Definition of Equilibrium 92
4.4.1 "Detailed Balance" and a More Precise Definition
of Equilibrium 94
4.4.2 Dynamics Causes Equilibrium Populations 94
4.4.3 The Fundamental Differential Equation 95
4.4.4 Are Rates Constant in Time? (Advanced) 95
4.4.5 Equilibrium Is "Self-Healing" 96
4.5 Simple Diffusion, Again 97
4.5.1 The Diffusion Constant and the Square-Root Law
of Diffusion 98
4.5.2 Diffusion and Binding 100
4.6 More on Stochastic Dynamics: The Langevin Equation 100
4.6.1 Overdamped, or "Brownian," Motion and Its
Simulation 102
4.7 Key Tools: The Correlation Time and Function 103
4.7.1 Quantifying Time Correlations: The
Autocorrelation Function 104
4.7.2 Data Analysis Guided by Time Correlation
Functions 105
4.7.3 The Correlation Time Helps to Connect Dynamics and Equilibrium 106
4.8 Tying It All Together 106
4.9 So Many Ways to ERR: Dynamics in Molecular
Simulation 107
4.10 Mini-Project: Double-Well Dynamics 108
Further Reading 109
Chapter 5 Molecules Are Correlated! Multidimensional Statistical
Mechanics 111
5.1 Introduction 111
5.1.1 Many Atoms in One Molecule and/or Many Molecules 111
5.1.2 Working toward Thermodynamics 112
5.1.3 Toward Understanding Simulations 112
5.2 A More-than-Two-Dimensional Prelude 112
5.2.1 One "Atom" in Two Dimensions 113
5.2.2 Two Ideal (Noninteracting) "Atoms" in 2D 114
5.2.3 A Diatomic "Molecule" in 2D 115
5.2.4 Lessons Learned in Two Dimensions 119
5.3 Coordinates and Forcefields 119
5.3.1 Cartesian Coordinates 119
5.3.2 Internal Coordinates 120
5.3.3 A Forcefield Is Just a Potential Energy Function 121
5.3.4 Jacobian Factors for Internal Coordinates
(Advanced) 123
5.4 The Single-Molecule Partition Function 124
5.4.1 Three Atoms Is Too Many for an Exact
Calculation 125
5.4.2 The General Unimolecular Partition Function 126
5.4.3 Back to Probability Theory and Correlations 127
5.4.4 Technical Aside: Degeneracy Number 128
5.4.5 Some Lattice Models Can Be Solved Exactly 129
5.5 Multimolecular Systems 130
5.5.1 Partition Functions for Systems of Identical
Molecules 131
5.5.2 Ideal Systems—Uncorrelated by Definition 132
5.5.3 Nonideal Systems 132
5.6 The Free Energy Still Gives the Probability 133
5.6.1 The Entropy Still Embodies Width (Volume) 134
5.6.2 Defining States 134
5.6.3 Discretization Again Implies S ~ −Σp ln p 135
5.7 Summary 135
Further Reading 135
Chapter 6 From Complexity to Simplicity: The Potential of Mean Force 137
6.1 Introduction: PMFs Are Everywhere 137
6.2 The Potential of Mean Force Is Like a Free Energy 137
6.2.1 The PMF Is Exactly Related to a Projection 138
6.2.2 Proportionality Functions for PMFs 140
6.2.3 PMFs Are Easy to Compute from a Good
Simulation 141
6.3 The PMF May Not Yield the Reaction Rate or Transition
State 142
6.3.1 Is There Such a Thing as a Reaction Coordinate? 143
6.4 The Radial Distribution Function 144
6.4.1 What to Expect for g(r) 145
6.4.2 g(r) Is Easy to Get from a Simulation 146
6.4.3 The PMF Differs from the "Bare" Pair Potential 148
6.4.4 From g(r) to Thermodynamics in Pairwise
Systems 149
6.4.5 g(r) Is Experimentally Measurable 149
6.5 PMFs Are the Typical Basis for "Knowledge-Based"
("Statistical") Potentials 150
6.6 Summary: The Meaning, Uses, and Limitations of
the PMF 150
Further Reading 151
Chapter 7 What's Free about "Free" Energy? Essential Thermodynamics 153
7.1 Introduction 153
7.1.1 An Apology: Thermodynamics Does Matter! 153
7.2 Statistical Thermodynamics: Can You Take a Derivative? 154
7.2.1 Quick Reference on Derivatives 154
7.2.2 Averages and Entropy, via First Derivatives 155
7.2.3 Fluctuations from Second Derivatives 157
7.2.4 The Specific Heat, Energy Fluctuations, and the
(Un)folding Transition 157
7.3 You Love the Ideal Gas 158
7.3.1 Free Energy and Entropy of the Ideal Gas 159
7.3.2 The Equation of State for the Ideal Gas 160
7.4 Boring but True: The First Law Describes Energy Conservation 160
7.4.1 Applying the First Law to the Ideal Gas: Heating at Constant Volume 161
7.4.2 Why Is It Called "Free" Energy, Anyway? The
Ideal Gas Tells All 162
7.5 G vs. F: Other Free Energies and Why They (Sort of)
Matter 164
7.5.1 G, Constant Pressure, Fluctuating Volume—A
Statistical View 164
7.5.2 When Is It Important to Use G Instead of F? 166
7.5.3 Enthalpy and the Thermodynamic Definition of G 168
7.5.4 Another Derivative Connection—Getting P from F 169
7.5.5 Summing Up: G vs. F 170
7.5.6 Chemical Potential and Fluctuating Particle
Numbers 171
7.6 Overview of Free Energies and Derivatives 173
7.6.1 The Pertinent Free Energy Depends on the
Conditions 173
7.6.2 Free Energies Are "State Functions" 174
7.6.3 First Derivatives of Free Energies Yield
Averages 174
7.6.4 Second Derivatives Yield
Fluctuations/Susceptibilities 174
7.7 The Second Law and (Sometimes) Free Energy Minimization 175
7.7.1 A Kinetic View Is Helpful 175
7.7.2 Spontaneous Heat Flow and Entropy 175
7.7.3 The Second Law for Free
Energies—Minimization, Sometimes 177
7.7.4 PMFs and Free Energy Minimization for
Proteins—Be Warned! 179
7.7.5 The Second Law for Your House: Refrigerators Are Heaters 181
7.7.6 Summing Up: The Second Law 181
7.8 Calorimetry: A Key Thermodynamic Technique 182
7.8.1 Integrating the Specific Heat Yields Both Enthalpy and Entropy 182
7.8.2 Differential Scanning Calorimetry for Protein
Folding 183
7.9 The Bare-Bones Essentials of Thermodynamics 183
7.10 Key Topics Omitted from This Chapter 184
Further Reading 184
Chapter 8 The Most Important Molecule: Electro-Statistics of Water 185
8.1 Basics of Water Structure 185
8.1.1 Water Is Tetrahedral because of Its Electron
Orbitals 185
8.1.2 Hydrogen Bonds 185
8.1.3 Ice 186
8.1.4 Fluctuating H-Bonds in Water 187
8.1.5 Hydronium Ions, Protons, and Quantum Fluctuations 187
8.2 Water Molecules Are Structural Elements in Many Crystal Structures 188
8.3 The pH of Water and Acid-Base Ideas 188
8.4 Hydrophobic Effect 190
8.4.1 Hydrophobicity in Protein and Membrane
Structure 190
8.4.2 Statistical/Entropic Explanation of the
Hydrophobic Effect 190
8.5 Water Is a Strong Dielectric 192
8.5.1 Basics of Dielectric Behavior 193
8.5.2 Dielectric Behavior Results from Polarizability 194
8.5.3 Water Polarizes Primarily due to
Reorientation 195
8.5.4 Charges Prefer Water Solvation to a Nonpolar Environment 196
8.5.5 Charges on Protein in Water = Complicated! 196
8.6 Charges in Water + Salt = Screening 197
8.6.1 Statistical Mechanics of Electrostatic Systems (Technical) 198
8.6.2 First Approximation: The Poisson-Boltzmann
Equation 200
8.6.3 Second Approximation: Debye-Hückel Theory 200
8.6.4 Counterion Condensation on DNA 202
8.7 A Brief Word on Solubility 202
8.8 Summary 203
8.9 Additional Problem: Understanding Differential
Electrostatics 203
Further Reading 204
Chapter 9 Basics of Binding and Allostery 205
9.1 A Dynamical View of Binding: On- and Off-Rates 205
9.1.1 Time-Dependent Binding: The Basic Differential
Equation 207
9.2 Macroscopic Equilibrium and the Binding Constant 208
9.2.1 Interpreting Kd 209
9.2.2 The Free Energy of Binding ΔG_bind Is Based on a
Reference State 210
9.2.3 Measuring Kd by a "Generic" Titration
Experiment 211
9.2.4 Measuring Kd from Isothermal Titration
Calorimetry 211
9.2.5 Measuring Kd by Measuring Rates 212
9.3 A Structural-Thermodynamic View of Binding 212
9.3.1 Pictures of Binding: "Lock and Key" vs.
"Induced Fit" 212
9.3.2 Many Factors Affect Binding 213
9.3.3 Entropy-Enthalpy Compensation 215
9.4 Understanding Relative Affinities: ΔΔG and
Thermodynamic Cycles 216
9.4.1 The Sign of ΔΔG Has Physical Meaning 216
9.4.2 Competitive Binding Experiments 218
9.4.3 "Alchemical" Computations of Relative
Affinities 218
9.5 Energy Storage in "Fuels" Like ATP 220
9.6 Direct Statistical Mechanics Description of
Binding 221
9.6.1 What Are the Right Partition Functions? 221
9.7 Allostery and Cooperativity 222
9.7.1 Basic Ideas of Allostery 222
9.7.2 Quantifying Cooperativity with the Hill Constant 224
9.7.3 Further Analysis of Allostery: MWC and KNF
Models 227
9.8 Elementary Enzymatic Catalysis 229
9.8.1 The Steady-State Concept 230
9.8.2 The Michaelis-Menten "Velocity" 231
9.9 pH and pKa 231
9.9.1 pH 232
9.9.2 pKa 232
9.10 Summary 233
Further Reading 233
Chapter 10 Kinetics of Conformational Change and Protein Folding 235
10.1 Introduction: Basins, Substates, and States 235
10.1.1 Separating Timescales to Define Kinetic
Models 235
10.2 Kinetic Analysis of Multistate Systems 238
10.2.1 Revisiting the Two-State System 238
10.2.2 A Three-State System: One Intermediate 242
10.2.3 The Effective Rate in the Presence of an
Intermediate 246
10.2.4 The Rate When There Are Parallel Pathways 250
10.2.5 Is There Such a Thing as Nonequilibrium Kinetics? 251
10.2.6 Formalism for Systems Described by Many States 252
10.3 Conformational and Allosteric Changes in Proteins 252
10.3.1 What Is the "Mechanism" of a Conformational
Change? 252
10.3.2 Induced and Spontaneous Transitions 253
10.3.3 Allosteric Mechanisms 254
10.3.4 Multiple Pathways 255
10.3.5 Processivity vs. Stochasticity 255
10.4 Protein Folding 256
10.4.1 Protein Folding in the Cell 257
10.4.2 The Levinthal Paradox 258
10.4.3 Just Another Type of Conformational
Change? 258
10.4.4 What Is the Unfolded State? 259
10.4.5 Multiple Pathways, Multiple Intermediates 260
10.4.6 Two-State Systems, Φ Values, and Chevron
Plots 262
10.5 Summary 264
Further Reading 264
Chapter 11 Ensemble Dynamics: From Trajectories to Diffusion
and Kinetics 265
11.1 Introduction: Back to Trajectories and Ensembles 265
11.1.1 Why We Should Care about Trajectory Ensembles 265
11.1.2 Anatomy of a Transition Trajectory 266
11.1.3 Three General Ways to Describe Dynamics 267
11.2 One-Dimensional Ensemble Dynamics 271
11.2.1 Derivation of the One-Dimensional Trajectory Energy: The "Action" 272
11.2.2 Physical Interpretation of the Action 274
11.3 Four Key Trajectory Ensembles 275
11.3.1 Initialized Nonequilibrium Trajectory Ensembles 275
11.3.2 Steady-State Nonequilibrium Trajectory Ensembles 275
11.3.3 The Equilibrium Trajectory Ensemble 276
11.3.4 Transition Path Ensembles 276
11.4 From Trajectory Ensembles to Observables 278
11.4.1 Configuration-Space Distributions from
Trajectory Ensembles 279
11.4.2 Finding Intermediates in the Path Ensemble 280
11.4.3 The Commitment Probability and a
Transition-State Definition 280
11.4.4 Probability Flow, or Current 281
11.4.5 What Is the Reaction Coordinate? 281
11.4.6 From Trajectory Ensembles to Kinetic Rates 282
11.4.7 More General Dynamical Observables from
Trajectories 283
11.5 Diffusion and Beyond: Evolving Probability
Distributions 283
11.5.1 Diffusion Derived from Trajectory Probabilities 284
11.5.2 Diffusion on a Linear Landscape 285
11.5.3 The Diffusion (Differential) Equation 287
11.5.4 Fokker-Planck/Smoluchowski Picture for
Arbitrary Landscapes 289
11.5.5 The Issue of History Dependence 291
11.6 The Jarzynski Relation and Single-Molecule
Phenomena 293
11.6.1 Revisiting the Second Law of Thermodynamics 294
11.7 Summary 294
Further Reading 295
Chapter 12 A Statistical Perspective on Biomolecular Simulation 297
12.1 Introduction: Ideas, Not Recipes 297
12.1.1 Do Simulations Matter in Biophysics? 297
12.2 First, Choose Your Model: Detailed or Simplified 298
12.2.1 Atomistic and "Detailed" Models 299
12.2.2 Coarse Graining and Related Ideas 299
12.3 "Basic" Simulations Emulate Dynamics 300
12.3.1 Timescale Problems, Sampling Problems 301
12.3.2 Energy Minimization vs. Dynamics/Sampling 304
12.4 Metropolis Monte Carlo: A Basic Method and
Variations 305
12.4.1 Simple Monte Carlo Can Be Quasi-Dynamic 305
12.4.2 The General Metropolis-Hastings Algorithm 306
12.4.3 MC Variations: Replica Exchange and Beyond 307
12.5 Another Basic Method: Reweighting and Its Variations 309
12.5.1 Reweighting and Annealing 310
12.5.2 Polymer-Growth Ideas 311
12.5.3 Removing Weights by "Resampling" Methods 312
12.5.4 Correlations Can Arise Even without Dynamics 313
12.6 Discrete-State Simulations 313
12.7 How to Judge Equilibrium Simulation Quality 313
12.7.1 Visiting All Important States 314
12.7.2 Ideal Sampling as a Key Conceptual Reference 314
12.7.3 Uncertainty in Observables and Averages 314
12.7.4 Overall Sampling Quality 315
12.8 Free Energy and PMF Calculations 316
12.8.1 PMF and Configurational Free Energy Calculations 317
12.8.2 Thermodynamic Free Energy Differences Include
All Space 318
12.8.3 Approximate Methods for Drug Design 320
12.9 Path Ensembles: Sampling Trajectories 321
12.9.1 Three Strategies for Sampling Paths 321
12.10 Protein Folding: Dynamics and Structure Prediction 322
12.11 Summary 323
Further Reading 323
Index 325