Introduction to Biomolecular Structure and Modeling
Dhananjay Bhattacharyya, Biophysics Division, Saha Institute of Nuclear Physics, Kolkata [email protected]

Post on 14-Dec-2015

  • Slide 1

Introduction to Biomolecular Structure and Modeling
Dhananjay Bhattacharyya, Biophysics Division, Saha Institute of Nuclear Physics, Kolkata [email protected]

  • Slide 2

Biomolecular Structures

These are determined experimentally by X-ray crystallography, nuclear magnetic resonance spectroscopy, neutron diffraction and Raman spectroscopy, and also by theoretical methods.

  • Slide 3

Bragg's law: $2d \sin\theta = n\lambda$

  • Slide 4

The nucleic acid backbone is connected to any one of four different bases.

  • Slide 5

A, G, T, C

  • Slide 6

A-DNA, B-DNA, Z-DNA

  • Slide 7

Proteins (polymers) are made up of amino acids (monomer units). There are twenty different amino acids with different shapes, sizes and electrostatic properties. These amino acids form covalent bonds to give a linear polypeptide chain.

  • Slide 8

Alanine, Phenylalanine, Serine, Cysteine

  • Slide 9

Glutamic acid (negatively charged), Arginine (positively charged)

  • Slide 10

Amino acids are joined together by covalent bonds called peptide bonds, which are structurally very important.

  • Slide 11

α-helix: hydrogen bonding between residues i and i+4

  • Slide 12

β-sheet: hydrogen bonding between residues i and j, i+1 and j-1 (antiparallel), or i and j, i+1 and j+1 (parallel)

  • Slide 13

Coordinate systems:
External coordinates, such as (x, y, z), (r, θ, φ), (r, θ, z)
Internal coordinates (bond length, bond angle, torsion angle)

  • Slide 14

Bond length, bond angle, torsion angle

  • Slide 15

Internal to external coordinates

  • Slide 16

Generated coordinates

H    0.000000    0.000000    0.000000
C    0.000000    0.000000    1.089000
C    1.367073    0.000000    1.572333
C    2.050610   -1.183920    1.089000
C    3.417683   -1.183920    1.572333
H   -0.513360    0.889165    1.452000
H   -0.513360   -0.889165    1.452000

  • Slide 17

  • Slide 18

Theoretical modeling of biomolecules: quantum mechanics based methods, statistics based methods, and classical or molecular mechanics methods.

  • Slide 19

Peptide modeling initiated in India by G.N.
Ramachandran (1950s)

Postulates:
1. Each atom occupies an impenetrable spherical volume.
2. The radius of the sphere depends on the atom type.
3. No two atomic spheres can overlap unless the atoms are covalently bonded.

  • Slide 20

Normal and extreme-limit (in parentheses) contact distances (Å) used by Ramachandran and co-workers. Values are listed row by row in transcript order.

Between: H, N, O, C, P, S
H: 2.0 (1.9), 2.4 (2.2), 2.65 (2.5)
N: 2.7 (2.6), 2.9 (2.8), 3.2 (3.1), 3.1 (3.0)
O: 2.7 (2.6), 2.8 (2.7), 3.2 (3.1), 3.1 (2.9)
C: 3.0 (2.9), 3.4 (3.2), 3.3 (3.1)
P: 3.5 (3.3)
S:

  • Slide 21

Original Ramachandran plot (figure labels: fully allowed regions, partially allowed regions)

  • Slide 22

Ramachandran plot for 202 proteins at 1.5 Å or better resolution

  • Slide 23

Variation of the angle τ by 5° is allowed to fit the observed phi-psi values of protein structures.

  • Slide 24

Schrödinger equation: quantum mechanics

Time dependent (3-dimensional): $i\hbar\,\partial\Psi(\mathbf{r},t)/\partial t = \left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\Psi(\mathbf{r},t)$

Time independent: $\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\psi(\mathbf{r}) = E\,\psi(\mathbf{r})$

  • Slide 25

DFT formalism with B3LYP

Pseudoeigenvalue (Kohn-Sham) equation: $h_i^{KS}\chi_i = \varepsilon_i\chi_i$, where the potential due to exchange-correlation, $V_{xc} = \delta E_{xc}/\delta\rho$, is defined by

$E_{xc}^{B3LYP} = (1-a)E_x^{LSDA} + aE_x^{HF} + b\,\Delta E_x^{B} + (1-c)E_c^{LSDA} + cE_c^{LYP}$

with a, b and c as parameters obtained from a fit to experimental data for sample compounds. The $E_x$ terms are for electron exchange and the $E_c$ terms are for correlation.

C.J. Cramer (2002) Essentials of Computational Chemistry, John Wiley & Sons Ltd.

  • Slide 26

1. Input data (atom coordinates, basis sets)
2. Generate input guess density (overlap integrals)
3. Construct the potential and solve the Kohn-Sham equation
4. Generate output densities from the solutions to the Kohn-Sham equations
5. Are the input and output densities the same?
YES: analyze the electronic population. NO: repeat the cycle using the output density as the input density.

Flow chart describing the DFT methodology

  • Slide 27

Strengths of different H-bonds from 33 non-canonical base pairs

Base pair   Edges   Orientation   E (kcal/mol)
G:C         W:W     C             -26
A:U         W:W     C             -14
G:U         W:W     C             -15
A:G         H:S     T             -10
A:G         s:s     T             -6
A:U         H:W     T             -13
A:A         H:H     T             -10
G:A         W:W     C             -15
G:A         S:W     T             -11
A:A         W:W     T             -12
A:U         W:W     T             -13
A:A         H:W     T             -11

H-bond counts shown with the pairs (in transcript order): 2=>NH..O, 1=>NH..N, 1=>NH..O, 1=>NH..N, 2=>NH..O, 2=>NH..N, 1=>NH..N, 1=>CH..O, 1=>NH..O, 1=>NH..N, 2=>NH..N, 1=>NH..O, 1=>NH..N, 2=>NH..N, 1=>NH..O, 1=>NH..N, 1=>NH..O, 1=>NH..N

  • Slide 28

The energy components considered, $E_{NH..O}$, $E_{NH..N}$, etc., are taken to be additive. Additional stability for each base pair may come from van der Waals, dipole-dipole and other interactions. In the least-squares fit, these residual terms (errors) should be smallest for the best fit.

Type of H-bond   E (kcal/mol)
N-H..O           -7.82
N-H..N           -5.62
O-H..N           -6.89
C-H..O           -1.33
C-H..N           -0.67

A. Roy, M. Bhattacharyya, S. Panigrahi, D. Bhattacharyya (2008) J. Phys. Chem. B (in press)

  • Slide 29

Netropsin-like drugs bind in the narrow and deep minor groove of B-DNA.

  • Slide 30

Actinomycin D-like drugs insert between two stacked base pairs by distorting the DNA double helix.

  • Slide 31

DNA kinks by 90° at the dyad location while binding to two subunits of Catabolite Activator Protein (CAP).

  • Slide 32

TATA-box binding protein transforms the interfacing DNA region into an A-DNA-like structure.

  • Slide 33

Smooth DNA curvature induced by histone proteins in chromatin (nucleosome)

  • Slide 34

Definition and nomenclature of base pair doublet parameters

  • Slide 35

Calculation of base pair parameters by NUPARM

Local step parameters:

Mean local helix axis: $Z_m = X_m \times Y_m$, where $X_m = X_{axis1} + X_{axis2}$ and $Y_m = Y_{axis1} + Y_{axis2}$
$M$ is the base pair center-to-center vector

Tilt: $2.0 \sin^{-1}(-Z_m \cdot Y_1)$
Roll: $2.0 \sin^{-1}(Z_m \cdot X_1)$
Twist: $\cos^{-1}\left((X_1 \times Z_m) \cdot (X_2 \times Z_m)\right)$
Shift ($D_x$): $M \cdot X_m$
Slide ($D_y$): $M \cdot Y_m$
Rise ($D_z$): $M \cdot Z_m$

  • Slide 36

Partial list of DNA crystal structures available at http://ndbserver.rutgers.edu
ID       Length   Sequence
bd0001   12       ACCGACGTCGGT
bd0003   12       ACCGGTACCGGT
bd0004   12       CGCGAATTCGCG
bd0006   10       GGCCAATTGG
bd0011   12       CGCAAATATGCG
bd0014   12       CGCGAATTCGCG
bd0015   10       CCGCCGGCGG
bd0017   9        CGCGCGGAG
bd0018   11       GCGAATTCGCG
bd0019   12       GGCGAATTCGCG
bd0022   12       ACCGGCGCCACA
bd0023   10       CCAGTACTGG
bd0024   10       CCGAATGAGG

  • Slide 37

  • Slide 38

Average structural parameters from crystal structures

Base-pair step   Size of database   Tilt (°)   Roll (°)   Twist (°)   Rise (Å)
G:G              37                 -0.24      5.80       30.99       3.46
G:C              106                -0.33      -5.37      38.52       3.32
C:G              157                0.66       3.81       36.26       3.46
A:A              116                -0.01      0.67       35.92       3.21
A:T              54                 0.20       -0.60      32.76       3.25
T:A              18                 -0.02      0.07       40.39       3.30
A:C              20                 -0.37      0.97       32.73       3.43
C:A              47                 -0.19      2.17       37.75       3.48
A:G              34                 0.16       5.34       31.92       3.44
G:A              55                 -0.23      0.52       38.40       3.14

  • Slide 39

DNA bending: experiment and theory

Sequence          Experimental $R_L$   Theoretical bending (d/l)
Random            1.00                 0.98
(AAANNNNNNN)n     1.23                 0.85
(AAAANNNNNN)n     1.60                 0.81
(AAAAANNNNN)n     2.00                 0.74
(AAAAAANNNN)n     2.31                 0.72
(AAAAAAAANN)n     2.21                 0.67
(AAAAAAAAAN)n     1.73                 0.82

  • Slide 40

Curved DNA models built from crystal parameters: $(A_3G_7)_n$, $(A_6G_4)_n$, $(A_{10})_n$

  • Slide 41

Bond angle deformation

Deformation from the equilibrium value costs energy. The simplest form of the energy penalty is: $E_\theta = k_\theta(\theta - \theta_0)^2$

  • Slide 42

Bonds are also stretchable, but at a cost of energy (figure label: bond breaking energy).

  • Slide 43

Ethane (three-fold symmetry), ethylene (two-fold symmetry)

  • Slide 44

Interaction between instantaneous atomic dipoles and induced atomic dipoles

Minimum energy position: $r_{ij}^{0}$

Normal and extreme-limit (in parentheses) contact distances (Å) used by Ramachandran and co-workers. Values are listed row by row in transcript order.

Between: H, N, O, C, P, S
H: 2.0 (1.9), 2.4 (2.2), 2.65 (2.5)
N: 2.7 (2.6), 2.9 (2.8), 3.2 (3.1), 3.1 (3.0)
O: 2.7 (2.6), 2.8 (2.7), 3.2 (3.1), 3.1 (2.9)
C: 3.0 (2.9), 3.4 (3.2), 3.3 (3.1)
P: 3.5 (3.3)
S:

  • Slide 45

Force field for biomolecular simulation

  • Slide 46

E(x, y, z), E(x+1, y, z), E(x+2, y, z), ...
Search for the conformation with the lowest energy

  • Slide 47

Multivariable optimization: an NP-hard problem

Systematic grid search procedure: impossible because of the large number of variables
Guided grid search: depends on choice
Approximate method based on a Taylor series, the Newton-Raphson method: $x_{k+1} = x_k - E'(x_k)/E''(x_k)$

  • Slide 48

Energy landscape of typical biomolecules (figure axes: energy against positional variables)

  • Slide 49

(Figure labels: energy; always accept; accept; reject)

Uniformly generated random numbers are used: a move is accepted if $\exp(-\Delta U/kT)$ is greater than the random number and rejected otherwise.

1. Conformation 0: calculate the energy ($E_i$).
2. Alter the conformation randomly.
3. Calculate the energy ($E_{i+1}$).
4. Calculate the Boltzmann factor $\exp(-(E_{i+1} - E_i)/kT)$.
5. If the Boltzmann factor is greater than the random number, accept the conformation.
6. Repeat the procedure.

  • Slide 50

Deterministic method: molecular dynamics

Verlet algorithm: $r(t+\Delta t) = 2r(t) - r(t-\Delta t) + \frac{F(t)}{m}\Delta t^2$

  • Slide 51

Leapfrog-Verlet algorithm

(Figure: a time axis on which energies E are evaluated at the full steps $t_0$, $t_0+\Delta t$, $t_0+2\Delta t$, $t_0+3\Delta t$, $t_0+4\Delta t$ and velocities v at the half steps $t_0-\tfrac{1}{2}\Delta t$, $t_0+\tfrac{1}{2}\Delta t$, $t_0+\tfrac{3}{2}\Delta t$, $t_0+\tfrac{5}{2}\Delta t$, $t_0+\tfrac{7}{2}\Delta t$.)

  • Slide 52

  • Slide 53

Time scale of vibrational motions

Type                                      Wave no. (cm^-1)   Period T_p = λ/c (fs)   T_p/π (fs)
O-H, N-H stretch                          3200-3600          9.8                     3.1
C-H stretch                               3000               11.1                    3.5
O-C-O asymm. stretch                      2400               13.9                    4.5
C≡C, C≡N stretch                          2100               15.9                    5.1
C=O (carbonyl) stretch                    1700               19.6                    6.2
C=C stretch; H-O-H bend                   1600               20.8                    6.4
C-N-H, H-N-H bend                         1500               22.2                    7.1
C-N stretch (amides)                      1250               26.2                    8.4
Water libration (rocking); C=C-H bending  700                41.7                    13

  • Slide 54

Simple pendulum

Average position of a simple pendulum
Period of measurement of position: ~2.3 T
Recommended period of measurement: ~T/10

  • Slide 55

Duration of simulation

Protein folding requires 1 µs to 1 ms.
Ligand binding/dissociation requires 1 µs.

No.
of steps = 1 ms / Δt = $10^{-3}$ s / $10^{-15}$ s = $10^{12}$

This calls for faster computers, for engaging several computers in parallel, and for increasing Δt with the SHAKE, RATTLE or LINCS algorithms.

  • Slide 56

Software for molecular simulation

Accelrys, MOE, SYBYL, TATA-BioSuite (composite packages, costly)
CHARMM, AMBER (for simulation, special academic price)
GROMACS, NAMD (for simulation, free)
MOLDEN (for molecule building, free)
GAMESS (for QM calculation, free)

  • Slide 57

(Figure labels: heating phase, equilibration)

  • Slide 58

Dickerson dodecamer sequence: d(CGCGAATTCGCG)2

  • Slide 59

  • Slide 60

CURVES calculated values

  • Slide 61

  • Slide 62

S replaces O in the backbone of substituted DNA. This yields two chiral conformers of the DNA, PS-R and PS-S.

S. Mukherjee and D. Bhattacharyya (2004) Biopolymers 73, 269-282

  • Slide 63

  • Slide 64

  • Slide 65

(Figure panel labels: PS-R, PS-S; normal PO, PS-R, PS-S)

  • Slide 66

Students: Dr. Debashree Bandyopadhyay, Dr. Shayantani Mukherjee, Dr. Kakali Sen, Mr. Sudipta Samanta

Collaborators: Dr. Rabi Majudar, Dr. Samita Basu, Dr. Sangam Banerjee, Dr. Abhijit Mitra (IIIT, Hyderabad), Dr. N. Pradhan (NIMHANS, Bangalore)

Partially supported by CSIR, DBT and CAMCS (SINP)
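
The relation between external and internal coordinates on Slides 13 to 16 can be checked numerically. The following is a minimal sketch, not the program that generated the slide: it takes the first four atoms listed on Slide 16, assumes the coordinates are in Å, and recovers a bond length, a bond angle and a torsion angle from them. The helper names are hypothetical, and the torsion sign follows the usual atan2 convention, which is an assumption about the convention used in the talk.

```python
import math

# First four atoms from Slide 16 (assumed to be in Angstrom).
ATOMS = [
    ("H", (0.000000, 0.000000, 0.000000)),
    ("C", (0.000000, 0.000000, 1.089000)),
    ("C", (1.367073, 0.000000, 1.572333)),
    ("C", (2.050610, -1.183920, 1.089000)),
]
h, c1, c2, c3 = (xyz for _, xyz in ATOMS)


def sub(a, b):
    """Vector a - b."""
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def bond_length(p, q):
    """Distance between atoms p and q."""
    return norm(sub(q, p))


def bond_angle(p, q, r):
    """Angle p-q-r in degrees, with q at the vertex."""
    u, v = sub(p, q), sub(r, q)
    return math.degrees(math.acos(dot(u, v) / (norm(u) * norm(v))))


def torsion_angle(p, q, r, s):
    """Signed torsion p-q-r-s in degrees, computed with atan2."""
    b1, b2, b3 = sub(q, p), sub(r, q), sub(s, r)
    y = norm(b2) * dot(b1, cross(b2, b3))
    x = dot(cross(b1, b2), cross(b2, b3))
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    print("C-C bond length:", round(bond_length(c1, c2), 3))
    print("H-C-C bond angle:", round(bond_angle(h, c1, c2), 2))
    print("H-C-C-C torsion:", round(torsion_angle(h, c1, c2, c3), 1))
```

The listed coordinates are consistent with a C-H bond of 1.089, a C-C bond of 1.45 and tetrahedral bond angles, which is what one would expect if they were built from ideal internal coordinates.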
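
The acceptance procedure listed on Slide 49 can be sketched in a few lines. This is an illustration under stated assumptions rather than the code used for the talk: the one-dimensional `double_well` energy function, the step size, the seed and the value kT = 0.6 (roughly kcal/mol at room temperature) are all hypothetical choices made only to have something runnable.

```python
import math
import random


def metropolis_accept(e_old, e_new, kT, u):
    """Slide 49 rule: accept if exp(-(E_new - E_old)/kT) > u, with u uniform in [0, 1)."""
    if e_new <= e_old:
        return True  # downhill moves are always accepted
    return math.exp(-(e_new - e_old) / kT) > u


def double_well(x):
    """Hypothetical 1-D energy landscape with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def run_monte_carlo(energy, x0, kT, n_steps, max_step=0.5, seed=0):
    """Random walk with Metropolis acceptance; returns the lowest-energy point seen."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best_x, best_e = x, e
    for _ in range(n_steps):
        x_trial = x + rng.uniform(-max_step, max_step)  # alter conformation randomly
        e_trial = energy(x_trial)
        if metropolis_accept(e, e_trial, kT, rng.random()):
            x, e = x_trial, e_trial
            if e < best_e:
                best_x, best_e = x, e
    return best_x, best_e


if __name__ == "__main__":
    print(run_monte_carlo(double_well, x0=3.0, kT=0.6, n_steps=5000))
```

Passing the random number `u` into `metropolis_accept` explicitly keeps the acceptance rule deterministic and easy to check in isolation; the sampler supplies it from a seeded generator.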
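
The leapfrog scheme of Slide 51 and the step-count estimate of Slide 55 can be illustrated on a toy system. The sketch below assumes a one-dimensional harmonic oscillator with unit mass and force constant, which is not from the slides; it is chosen only because the exact trajectory, x(t) = cos(t), is known. Function names are hypothetical.

```python
import math


def harmonic_accel(x, k=1.0, m=1.0):
    """Acceleration of a 1-D harmonic oscillator (assumed toy system)."""
    return -k * x / m


def leapfrog(x0, v0, accel, dt, n_steps):
    """Leapfrog integration: velocities live on half steps, positions on full steps."""
    # Estimate v(t0 - dt/2) from the initial velocity and acceleration.
    v_half = v0 - 0.5 * dt * accel(x0)
    x = x0
    for _ in range(n_steps):
        v_half += accel(x) * dt  # v(t + dt/2) = v(t - dt/2) + a(t) dt
        x += v_half * dt         # x(t + dt)   = x(t) + v(t + dt/2) dt
    return x, v_half


def steps_needed(total_time_s, dt_s):
    """Number of integration steps to cover total_time_s with time step dt_s."""
    return round(total_time_s / dt_s)


if __name__ == "__main__":
    x_end, _ = leapfrog(1.0, 0.0, harmonic_accel, dt=0.01, n_steps=628)
    print("x after t = 6.28:", x_end, "exact:", math.cos(6.28))
    print("steps for 1 ms at 1 fs:", steps_needed(1e-3, 1e-15))
```

With a time step of about one hundredth of the natural time unit the numerical position stays close to the exact cosine after a full period, which is in line with the recommendation on Slide 54 to sample well below the period of the fastest motion.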