8/14/2019 Lecture 10' - Structural Transitions of Polypeptides
Coil-Helix Transitions
The transition between a random coil and a helical structure, also called the coil-helix transition,
is an important component of protein folding pathways.
The term "random coil" refers to a set of equivalent coil-like structures: each is unfolded relative to a typical helical structure
(α-helix, 3₁₀ helix, β-strand, etc.).
Focus: thermodynamic properties of these transitions.
For simplicity, we begin with a homopolymer: a polypeptide of identical amino acids,
and focus on the transition from random coil to α-helix.
We investigate this coil-helix transition
using a statistical thermodynamic treatment: the Zipper Model.
The Nucleation of the α-helix
The nucleation step in α-helix formation involves formation of an H-bond between:
the keto oxygen of residue j, and
the amide hydrogen of residue j+4.
This requires the torsion angles to
assume the mean values
φ = -57°, ψ = -47°,
which is entropically unfavorable.
This H-bond helps to stabilize
the helical structure. Its energetic favorability, however,
relies on isolation of the H-bond
from competition with water,
as is the case in a folded protein's interior
or in a non-polar solvent.
Model Polypeptide System
As a result, many shorter oligopeptides form an α-helix only in organic solvent (e.g., octanol).
Certain long polypeptides, however,
can be induced to form an α-helix in aqueous solution.
Classic example (Zimm and Bragg, 1959): poly-[γ-benzyl-L-glutamate]
in 80% dichloroacetic acid / 20% ethylene dichloride
undergoes a coil-to-helix transition when heated.
This is the opposite of protein denaturation in aqueous solution, and is due to dichloroacetic acid's ability to form strong H-bonds
with the amide nitrogens in the coil.
Here, α-helix formation is thus endothermic:
since ΔH° > 0, the process is entropy driven (ΔS° > 0),
consistent with release of solvent as the helix forms.
Estimating s
To apply the Zipper model to a transition, s must be related to experimentally measured quantities:
The midpoint of the melting transition:
the melting temperature, Tm.
ΔH° for adding 1 helical unit onto a pre-existing helix:
the enthalpy of helix growth, ΔH°g.
Method: experimental determination of s.
s is modeled as a micro-equilibrium constant of helix formation.
In practice, ΔH°g is determined by comparing the Tm values of polymers of
2 different lengths.
This allows s to be modeled within the Zipper model,
while σ is adjusted to yield the best fit.
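The explicit relation between s, Tm and ΔH°g did not survive in the text above. A minimal sketch, assuming the usual van't Hoff form with s = 1 at Tm and a temperature-independent ΔH°g (both are assumptions, and the function name and the Tm used in the usage note are hypothetical):

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)

def propagation_s(temp_k, tm_k, dh_growth_kcal):
    """van't Hoff estimate of s(T).

    Assumes s = 1 at the melting temperature tm_k and a constant
    enthalpy of helix growth dh_growth_kcal (kcal/mol).
    """
    return math.exp(-(dh_growth_kcal / R_KCAL) * (1.0 / temp_k - 1.0 / tm_k))
```

For example, `propagation_s(310.0, 300.0, 0.89)` is slightly larger than 1, reflecting an endothermic helix growth step that becomes more favorable on heating.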
Comparison with Experiment
Points are experimental data (2 sets, Doty et al.), determined by optical rotation
for several lengths: N = 26, 1500 residues.
Curves are predicted values of the fraction of helical
residues,
estimated by the Zipper model.
s values were computed using
ΔH°g = 0.89 kcal/mol > 0: helix formation is endothermic,
thus s increases with T.
2 fitted σ values are shown: σ = 1 x 10^-4 (dashed)
and σ = 2 x 10^-4 (solid).
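The predicted curves can be sketched numerically. The block below assumes the standard single-helix zipper partition function, q = 1 + σ Σ_k (N-k+1) s^k, which is not written out in these slides; the function names are illustrative only.

```python
def zipper_partition(n, s, sigma):
    """Zipper-model partition function for an n-residue chain.

    A single helical stretch of k residues has weight sigma * s**k
    and can be placed in (n - k + 1) positions; the all-coil state has weight 1.
    """
    return 1.0 + sigma * sum((n - k + 1) * s**k for k in range(1, n + 1))

def fraction_helix(n, s, sigma):
    """Mean fraction of helical residues at propagation parameter s."""
    q = zipper_partition(n, s, sigma)
    mean_k = sigma * sum(k * (n - k + 1) * s**k for k in range(1, n + 1)) / q
    return mean_k / n
```

With σ = 2 x 10^-4 and s = 1.05, `fraction_helix(1500, 1.05, 2e-4)` is close to 1 while `fraction_helix(26, 1.05, 2e-4)` remains small, which is the qualitative N-dependence seen in the data.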
General Transition Characteristics
The transition shows a large N-dependence, even though s and σ are length-independent.
Cooperativity of the transition increases with N,
as measured by the narrowness of the transition.
The relative contributions of the parameters are also strongly N-dependent.
In short polymers, nucleation (σ) is dominant:
initial formation is unfavorable,
and propagation is strongly inhibited.
At large N, propagation (s)
dominates:
the nucleation penalty is distributed
over more residues,
and s quickly dominates as T increases.
Validating the Size of the Fitted σ
Application of a Zipper model predicts the coil to α-helix transition to be highly cooperative:
fitted value, σ = 10^-4.
This prediction can be separately validated as follows. The statistical weight of nucleation is σs.
This accounts for formation of 1 H-bond. s accounts for the balance between ΔH° and ΔS°
for only 1 residue.
σ accounts for the cooperativity of nucleation: nucleation restricts the angles of 4 residues
to values typical of an α-helix (φ = -57°, ψ = -47°), but the ΔS° for only 1 of these 4 is included in s.
Thus, the entropy cost of the other 3 residues is carried by σ:
$\sigma = \exp[3\Delta S^\circ_{res}/R]$
Net entropy change per residue:
$\Delta S^\circ_{res} = R \ln W_{helix} - R \ln W_{coil} \approx -R \ln 9 = -18$ J/(mol K).
Substitution yields the estimate σ = 1.5 x 10^-3.
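The arithmetic behind this estimate can be checked directly. The sketch below assumes W_helix = 1 and W_coil = 9, which is what the step to -R ln 9 implies; the function name and defaults are illustrative.

```python
import math

R = 8.314  # gas constant in J/(mol K)

def nucleation_sigma(w_helix=1, w_coil=9, n_extra=3):
    """Return (dS_res, sigma) for the entropic estimate of sigma.

    dS_res = R ln(W_helix) - R ln(W_coil); sigma = exp(n_extra * dS_res / R),
    where n_extra is the number of restricted residues not already counted in s.
    """
    ds_res = R * math.log(w_helix) - R * math.log(w_coil)
    sigma = math.exp(n_extra * ds_res / R)
    return ds_res, sigma
```

With the unrounded entropy (about -18.3 J/(mol K)) this gives σ = 9^-3, roughly 1.4 x 10^-3; using the rounded value of -18 J/(mol K) reproduces the 1.5 x 10^-3 quoted above.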
The Coil to 3₁₀-Helix Transition
s and σ always depend on the nature of the transition: other transitions exhibit different s and σ values.
Example: sequences of type (AAAAK)nA,
where A, K = alanine, lysine, respectively,
convert from 3₁₀
helices to α-helices
when n is increased from 3 to 4, i.e., with increasing total polymer length.
This suggests that the 3₁₀ helix
is easier to initiate (from a coil) than an α-helix:
σ(3₁₀) > σ(α-helix)
but, once initiated, the α-helix is more easily propagated:
s(α-helix) > s(3₁₀)
The difference in σ is due to conformational entropies:
nucleation of a 3₁₀ helix fixes the torsion angles of one fewer residue:
α-helix: H-bond between residues j and j+4.
3₁₀-helix: H-bond between residues j and j+3.
Sequence Dependence of σ, s
For regular α-helix formation (or melting), both σ and s are sequence dependent: each type of residue is characterized by a different s value.
The cooperativity, σ, of helix formation will also vary, but with the mean residue content.
Ex.: in lysine-containing polymers,
σ ≈ 10^-3:
nucleation is 10x more favorable than in poly-[γ-benzyl-L-glutamate].
The impact of residue differences on α-helix stability
is studied using host-guest peptides: energetic variations due to a single, internal switched
residue are measured, with essentially no variation in σ.
Emphasis: determination of the variation in s.
Host-Guest Parameters
Begin with a host α-helix (yyyyyyy):
y = some residue stable as an α-helix; transition free energy per residue: ΔG°(y).
Replace 1 y with a guest residue (X), yielding the sequence yyy-X-yyy.
Measure ΔΔG° (kJ/mol) for all X values;
then $\Delta G^\circ(X) = \Delta G^\circ(y) + \Delta\Delta G^\circ(X)$. The values yield α-helix propensities.
Here, they are shown normalized relative to the ΔG° of Gly,
since Gly lacks a Cβ (R = H).
All but Pro are more favorable than Gly. Pro is a strong α-helix breaker.
Conversion from ΔG°(X) to s:
assume σ is approximately independent of X; then $s = \exp[-\Delta G^\circ(X)/RT]$.
Here, s = 1.0 means neutral favorability.
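The conversion from a per-residue free energy to s is a one-line calculation. A small sketch, assuming ΔG° is given in kJ/mol and a default temperature of 298.15 K (the function name and default are illustrative choices, not from the slides):

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol K)

def s_from_dg(dg_kj_per_mol, temp_k=298.15):
    """Propagation parameter s = exp(-dG/RT) for a guest residue."""
    return math.exp(-dg_kj_per_mol / (R_KJ * temp_k))
```

A residue with ΔG° = 0 gives s = 1 (neutral); negative ΔG° gives s > 1 (helix former) and positive ΔG° gives s < 1 (helix breaker).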
Modeling Melting Initiation
Consider an N-residue polypeptide
in the fully helical conformation: hhhhhh
weight: $\sigma s^{N}$.
Melting of the helix
can occur by 2 fundamentally different processes:
Melting of an end residue:
2 conformations: chhhhh and hhhhhc.
total weight: $2\sigma s^{N-1}$.
Melting of a middle residue:
N-2 conformations of the form hhhchhh, each with 2 helix-islands.
total weight: $(N-2)\sigma^{2} s^{N-1}$.
More generally, a conformation with j helix-islands
will contain j factors of σ.
This motivates our Zipper model.
End vs. Middle Melting
The relative probabilities of initiating melting at the end vs. the middle
are estimated by a ratio of statistical weights:
$P_e/P_m = 2\sigma s^{N-1} / [(N-2)\sigma^{2} s^{N-1}] \approx 2/(N\sigma)$.
We have 2 opposing factors:
N = number of central melting points; σ = penalty for initiating melting at a given middle point.
Assuming the typical experimental value (σ = 2 x 10^-4),
Pe/Pm = 1 when N ≈ 10^4.
For short helices (N < 10^4
residues), σ dominates:
the transition initiates at the ends.
For long helices (N > 10^4
residues), N overcomes σ:
denaturation may then proceed from the middle. These trends are observed experimentally in globular proteins.
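The crossover length follows from setting the weight ratio to 1. A brief sketch using the exact ratio 2/((N-2)σ) rather than the large-N approximation (function names are illustrative):

```python
def end_to_middle_ratio(n, sigma):
    """P_e / P_m = 2 sigma s^(N-1) / ((N-2) sigma^2 s^(N-1)) = 2 / ((N-2) sigma)."""
    return 2.0 / ((n - 2) * sigma)

def crossover_length(sigma):
    """Chain length at which end and middle melting are equally likely."""
    return 2.0 / sigma + 2.0
```

For σ = 2 x 10^-4, `crossover_length(2e-4)` is about 10^4 residues, so a 100-residue helix has `end_to_middle_ratio(100, 2e-4)` far above 1 and melts from its ends.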
Predicting Protein Structure
Given sets of (s, σ) values for all 20 amino acids, for formation of all types of 2°
structure
(coil to α-helix, β-strand, 3₁₀ helix, etc.),
we should be able to apply a Zipper model
to predict the probability of adopting each type of structure during folding and melting of globular proteins.
Added complication:
s = exp(-ΔG°/RT) also depends on external factors,
e.g., ΔG°
varies with residue environment. Globular proteins offer 2 distinct environments:
hydrophobic: buried residues in the protein interior.
hydrophilic: solvent-accessible residues at the protein surface.
Meaningful assignment of an s value to each residue j
demands knowledge of whether j is buried, in each context.
Chou and Fasman
Statistical thermodynamics is not routinely used to model protein folding.
However, many statistical methods for predicting 2°
structure have been developed.
These incorporate many of the essential features of the Zipper model.
The method of Chou and Fasman (1974)
begins with an empirical set of residue parameters,
defined not by measured transition energies (ΔG°),
but by the statistical tendency of each residue to form each
type of structure,
as determined from the mole fractions present in actual
protein crystals.
The first parameters used data from 64 different proteins.
Chou and Fasman Parameters
For each type of amino acid, three types of parameters are computed:
Pα = propensity to form an α-helix.
Pβ = propensity to form a β-sheet.
Pt = propensity to turn (adopt a coil).
Example: determination of Pα values. First, a mole fraction is computed for each residue type, i:
fα(i) = occurrence of i in an α-helix / occurrence of i in
the data set.
Secondly, an average α-helical amino acid is defined,
with the average value of fα(i):
$\langle f_\alpha \rangle = \sum_i f_\alpha(i) / 20$
Parameter Pα(i) is then defined as a relative tendency:
$P_\alpha(i) = f_\alpha(i) / \langle f_\alpha \rangle$
really just a weighted average.
Repeated for each type of 2°
structure.
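The two-step definition above maps directly onto a few lines of code. This is a sketch only: the function name is hypothetical, the counts in the usage note are toy numbers rather than real database values, and the mean is taken over however many residue types are supplied (20 for a full data set).

```python
def helix_propensities(helix_counts, total_counts):
    """Relative helix propensities P(i) = f(i) / <f>.

    f(i) = occurrences of residue type i in helices / occurrences of i overall;
    <f> is the unweighted mean of f(i) over the residue types supplied.
    """
    f = {aa: helix_counts.get(aa, 0) / total_counts[aa] for aa in total_counts}
    mean_f = sum(f.values()) / len(f)
    return {aa: f_aa / mean_f for aa, f_aa in f.items()}
```

With toy counts `helix_propensities({"A": 60, "G": 30}, {"A": 100, "G": 100, "P": 100})`, the result is approximately `{"A": 2.0, "G": 1.0, "P": 0.0}`: values above 1 indicate formers, values below 1 indicate breakers.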
The Favorability of Propagation
Parameters Pα and Pβ correspond to the mean propagation terms, s,
in their respective Zipper models, averaged over solvent conditions.
Ex.: Pα corresponds conceptually to s
in the Zipper model of the coil to α-helix transition.
A qualitative propensity is also assigned to each residue,
relative to each type of structure.
e.g., relative to α-helix formation, residues are categorized as:
Strong Helix Formers (H)
Average Helix Formers (h)
Weak Helix Formers (I)
Indifferent (i)
Weak Helix Breakers (b)
Strong Helix Breakers (B)
Again, this is repeated for each type of 2°
structure.
Chou-Fasman Parameter Set
Comparison with Host-Guest parameters:
relative favorabilities:
general agreement,
with differences in the ordering.
P values play the role of
propagation terms, s.
Proline:
low Pα and Pβ,
due to restricted
torsion angles.
α-helix and β-sheet breaker.
Glycine:
low Pα, but high Pt:
great conformational
freedom.
3rd residue of a Type II turn.
The Cooperativity of Nucleation
The cooperativity of 2°
structure formation, i.e., the statistical unlikelihood of nucleation,
is also included in the Chou-Fasman model,
but implicitly, in the rules of region assignment, e.g., whether a sub-sequence is helix, sheet, or coil.
Regions of 2° structure are assigned by inspection, where any 2°
structure requires a string of residues of similar
propensity.
Example: for an α-helix, initiation of a helix requires a contiguous set of helix formers
(H, h, or I, with I given ½-weight), clearly modeling the cooperativity of helix nucleation.
Nucleated helices propagate through residues H, h, I, and i,
and terminate when two or more helix breakers are encountered, again modeling the cooperativity of the process.
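The helix rules above can be turned into a small scanning routine. This is a rough sketch, not the published algorithm: the nucleation threshold (4 former-equivalents in a 6-residue window) is an assumption not stated in these slides, the input is a string of class labels (H, h, I, i, b, B) rather than amino acids, and all names are hypothetical.

```python
FORMER_WEIGHT = {"H": 1.0, "h": 1.0, "I": 0.5}  # I counts as half a former
BREAKERS = {"b", "B"}

def _breaker_pair(classes, i):
    """True if positions i and i+1 both exist and are helix breakers."""
    return 0 <= i and i + 1 < len(classes) and \
        classes[i] in BREAKERS and classes[i + 1] in BREAKERS

def assign_helix(classes, window=6, min_formers=4.0):
    """Return a string of 'H' (helix) / 'C' (other) per residue.

    window and min_formers are assumed values for the nucleation rule.
    """
    n = len(classes)
    helix = [False] * n
    for start in range(n - window + 1):
        segment = classes[start:start + window]
        if sum(FORMER_WEIGHT.get(c, 0.0) for c in segment) < min_formers:
            continue  # not enough helix formers to nucleate here
        left, right = start, start + window - 1
        # Propagate in both directions until two consecutive breakers are met.
        while right + 1 < n and not _breaker_pair(classes, right + 1):
            right += 1
        while left - 1 >= 0 and not _breaker_pair(classes, left - 2):
            left -= 1
        for i in range(left, right + 1):
            helix[i] = True
    return "".join("H" if h else "C" for h in helix)
```

For the class string `"iiHHhHHIiibBii"`, the helix nucleates in the former-rich stretch and stops at the `bB` pair, giving `"HHHHHHHHHHCCCC"`; a single isolated breaker does not terminate propagation.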
Example: Chou-Fasman Method
Applied to the first 24 residues of adenylate kinase, the method predicts 2 structures:
an N-terminal string with α-helix forming tendency; mean weight: <Pα> = 1.39.
a 2nd string with both α-helix and β-sheet forming tendency;
mean β-tendency higher: <Pβ> = 1.56.
Experimentally, the strings correspond to an α-helix and a β-sheet. A β-turn (specific coil) is also observed,
as predicted by a hydropathy-based modification by Rose (1978).
Example (cont.)
Applied to the remainder of adenylate kinase, and also compared with a 2nd method (Nagano):
the best results are provided by a joint method,
which here obtains ~70% accuracy.
Evaluating Accuracy
The most widely used measure is the overall, per-residue, 3-state accuracy (Q3):
Q3 = [(PH + PE + PC)/N] x 100%
N = total number of residues.
PX = number of correctly predicted residues in state X.
X = α-Helix (H), β-shEet (E), or Coil (C).
Although other measures exist,
Q3 is the most conceptually simple.
Pioneering method by Chou and Fasman:
overall accuracy of only about Q3
= 50%,
as assessed on a database of 267 known structures.
Initially very popular, due to its conceptual simplicity.
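Q3 is simple enough to compute in a few lines. A minimal sketch, assuming the predicted and observed states are given as equal-length strings over H, E and C (the function name is illustrative):

```python
def q3(predicted, observed):
    """Per-residue 3-state accuracy in percent.

    Counts positions where the predicted state equals the observed state,
    i.e. P_H + P_E + P_C, and divides by the total number of residues.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)
```

For instance, predicting `"HHCC"` for a region observed as `"HHHH"` scores Q3 = 50%.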
Improvements on Chou-Fasman
Many improvements have appeared, differing in parameter definition and application.
An in-depth consideration is beyond the scope of this course;
however, success is correlated with the addition of relevant
statistical information:
[1] Information regarding residue context, i.e., the propensity of a residue to adopt a given state
is determined by its n neighboring residues,
as compared with observations in a database.
We examine the GOR method (Garnier, 1987).
[2] Information regarding homologous proteins.
The protein is first subjected to multiple alignment,
to identify homologous proteins.
Prediction is then based on consensus propensities.
We examine the PHD method (Rost and Sander, 1993).
The GOR Method
The propensity of a residue to adopt state S is defined not only by its own identity,
as in Chou-Fasman,
but also by the identities of neighboring residues.
GOR uses a 17-residue window:
a central, predicted residue + 8 flanking residues
on each side.
e.g., residues 4-20 are used to predict the state of the 12th
residue (F) of adenylate kinase.
The GOR Method (cont.)
Using sequences in the database, 3 scoring matrices, MS, were first constructed: one for each of the 3 basic structural states, S = {H, E, C}.
Each is a 20x17 matrix, with elements mxy:
row, x = amino acid type (e.g., Ala); column, y = residue position within the window.
mxy = the probability that the residue at position y is of type x, given that the central residue is in state S.
So, the sum of the mxy values in each column is 1.
Again, each matrix is constructed in advance, from observed frequencies in the database
(e.g., from all known protein structures).
Any candidate sequence is evaluated at each position, k, by applying each scoring matrix, MS:
Residue k is taken as the central residue of the window,
and the elements mxy are summed over all 1 ≤ y ≤ 17, where x is the type of the residue at window position y.
The highest of the 3 sums (H, E, and C) yields the prediction, S, for that residue's state.
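The matrix construction and scoring steps described above can be sketched as follows. This is a simplified illustration of the procedure as stated on the slide, not the GOR program itself: the function names are hypothetical, window positions that fall off the chain ends are skipped, and columns with no observations fall back to a uniform distribution (both are assumptions).

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
STATES = "HEC"

def build_matrices(examples, half_width=8):
    """Build one 20 x (2*half_width+1) probability matrix per state.

    examples: list of (sequence, state_string) pairs of equal length.
    matrices[S][y][x] = P(residue type x at window position y | centre in state S).
    """
    width = 2 * half_width + 1
    counts = {s: [dict.fromkeys(AMINO_ACIDS, 0) for _ in range(width)] for s in STATES}
    for seq, states in examples:
        for k, state in enumerate(states):
            for y in range(width):
                pos = k - half_width + y
                if 0 <= pos < len(seq):
                    counts[state][y][seq[pos]] += 1
    matrices = {}
    for s in STATES:
        matrices[s] = []
        for column in counts[s]:
            total = sum(column.values())
            # Assumption: unobserved columns become uniform so each column sums to 1.
            matrices[s].append({x: (c / total if total else 1.0 / len(AMINO_ACIDS))
                                for x, c in column.items()})
    return matrices

def predict_state(matrices, seq, k, half_width=8):
    """Sum m_xy over the window centred on residue k; return (best_state, scores)."""
    scores = {}
    for s in STATES:
        total = 0.0
        for y in range(2 * half_width + 1):
            pos = k - half_width + y
            if 0 <= pos < len(seq):
                total += matrices[s][y][seq[pos]]
        scores[s] = total
    return max(scores, key=scores.get), scores
```

With a toy training pair `("AAAAGGGG", "HHHHCCCC")` and `half_width=1`, the centre of `"AAA"` is predicted as H. Sequences containing letters outside the 20 standard amino acids would raise a `KeyError` in this sketch.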
Example: The GOR Method
An application of GOR IV is shown below,
run at the Network Protein Sequence @nalysis site
(http://npsa-pbil.ibcp.fr/cgi-bin/secpred_gor4.pl)
on the 1st 24 residues of adenylate kinase:
this method correctly predicts the β-sheet and β-turn.
The α-helix at residues 1-8 is predicted as coil,
although its structural propensity to form a
helix is noted (blue line).
Overall, GOR IV has an accuracy of Q3 = 64.4%.
Use of Multiple Alignments
The method of multiple alignments was first used to
aid in protein 2°
structure prediction
by Zvelebil (1987), in combination with the GOR I method:
accuracy improved by 9%.
Basic idea:
Given a sequence to be evaluated,
identify a set of homologous (i.e., similar) sequences,
each with > 25% sequence identity.
2°
structure prediction is then based on consensus propensities.
One popular multiple-alignment-based method:
The PHD Method:
Profile network from HeiDelberg,
which combines sequence homology information with a neural network.
Profile network from HeiDelberg
The PHD Method (Rost and Sander, 1993) combines sequence homology information
with the optimization strength of a 2-layer neural network.
1st Layer: Raw Predictions
Input:
fractions of the 20 types of residue at each multiple-alignment position
in a 13-residue window around the evaluated residue, k.
total of 20x13 = 260 input nodes.
Output: probability for each state (PH, PE, PC).
2nd Layer: Elimination of Infeasible Structures
Input: output of the first layer,
after application to each residue in the chain.
Output: refined probabilities; refines the raw predictions of the 1st layer,
e.g., HHHEEHH becomes HHHHHHH.
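The 260-node first-layer input can be illustrated by building the profile window explicitly. This sketch covers only the input encoding, not the network; the function names are hypothetical, gaps are ignored when computing column fractions, and window positions beyond the chain ends are filled with zeros (assumptions, since the slide does not specify either).

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def column_profile(alignment, pos):
    """Fractions of the 20 residue types in one alignment column (gaps ignored)."""
    column = [seq[pos] for seq in alignment if seq[pos] in AMINO_ACIDS]
    if not column:
        return [0.0] * len(AMINO_ACIDS)
    return [column.count(aa) / len(column) for aa in AMINO_ACIDS]

def window_input(alignment, k, half_width=6):
    """Flattened 20 x (2*half_width+1) input vector centred on position k."""
    length = len(alignment[0])
    vector = []
    for pos in range(k - half_width, k + half_width + 1):
        if 0 <= pos < length:
            vector.extend(column_profile(alignment, pos))
        else:
            # Assumption: positions off the chain ends contribute all zeros.
            vector.extend([0.0] * len(AMINO_ACIDS))
    return vector
```

With the default 13-residue window, `len(window_input(alignment, k))` is 20 x 13 = 260, matching the number of input nodes quoted above.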
Example
An application of PHD is shown below, run at the PredictProtein server (Columbia)
(http://dodo.cpmc.columbia.edu/predictprotein/submit_def.html),
with an initial homology search by PSI-BLAST,
on the 1st 24 residues of adenylate kinase.
This method correctly predicts the β-sheet (r10-r14):
E region (blue).
It also predicts the reverse turn (r16-r22): L region (green).
However, the α-helix is predicted to be coil, as in the GOR method,
but a 30% α-helical probability is assigned to the region (red).
Overall PHD accuracy: Q3 = 70.8%.
Conclusion
In this lecture, the helix-coil transition was used to discuss:
The coil to α-helix transition of a model polypeptide system,
poly-[γ-benzyl-L-glutamate],
and the sequence dependence of s and σ.
The tendency of short polypeptides to melt at helix ends.
The lower cooperativity of 3₁₀ helix formation.
Limitations of the Zipper model were then discussed:
The dependence of s on the (unknown) residue
environment,
so that purely statistical methods of prediction are more
usual.
The conceptual relationship between the Zipper model
and statistical methods of predicting protein 2° structure.