8/14/2019 Lecture 10' - Structural Transitions of Polypeptides

1/30


2/30

Coil-Helix Transitions

The transition between a random coil and a helical structure, also called the coil-helix transition, is an important component of protein folding pathways.

The term "random coil" refers to a set of equivalent coil-like structures, each unfolded relative to a typical helical structure (α-helix, 3₁₀ helix, β-strand, etc.).

Focus: thermodynamic properties of these transitions.

For simplicity, we begin with a homopolymer (a polypeptide of identical amino acids) and focus on the transition from random coil to α-helix.

We investigate this coil-helix transition using a statistical thermodynamic treatment: the Zipper Model.


3/30

The Nucleation of the α-helix

The nucleation step in α-helix formation involves formation of an H-bond between:

the keto oxygen of residue j, and

the amide hydrogen of residue j+4.

This requires the torsion angles to assume mean values φ = -57°, ψ = -47°, which is entropically unfavorable.

This H-bond helps to stabilize the helical structure. Energetic favorability, however, relies on isolation of the H-bond from competition with water,

as is the case in a folded protein's interior,

or in a non-polar solvent.


4/30

Model Polypeptide System

As a result, many shorter oligopeptides form an α-helix only in organic solvent (e.g., octanol).

Certain long polypeptides, however, can be induced to form an α-helix in aqueous solution.

Classic Example (Zimm and Bragg, 1959): poly-[γ-benzyl-L-glutamate] in 80% dichloroacetic acid / 20% ethylene dichloride

undergoes a coil-to-helix transition when heated:

the opposite of protein denaturation in aqueous solution, due to dichloroacetic acid's ability to form strong H-bonds with the amide nitrogens in the coil.

Here, α-helix formation is thus endothermic.

Since ΔH° > 0, this process is entropy driven (ΔS° > 0),

consistent with release of bound solvent as the helix forms.


5/30

Estimating s

In order to apply the Zipper model to a transition, s must be related to experimentally measured quantities:

The midpoint of the melting transition: the melting temperature, Tm.

ΔH° for adding 1 helical unit onto a pre-existing helix: the enthalpy of helix growth, ΔH°g.

Method: Experimental determination of s

s is modeled as a micro-equilibrium constant of helix formation.

In practice, ΔH°g is determined by comparing the Tm values of polymers of 2 different lengths.

This allows modeling within the Zipper model,

while σ is adjusted to yield the best fit.
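
Because s is treated as an equilibrium constant, one common way to tie it to Tm and ΔH°g is a van 't Hoff relation. The sketch below is illustrative only: it assumes s = 1 at Tm and a temperature-independent ΔH°g, the function name is hypothetical, and the Tm value is a placeholder rather than a measured number. The ΔH°g of 0.89 kcal/mol is the value quoted in this lecture.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)

def propagation_s(temp_k, tm_k, dh_growth_kcal):
    """Van 't Hoff estimate of s at temp_k.

    Assumes s = 1 at the melting temperature tm_k and an enthalpy of
    helix growth that does not change with temperature.
    """
    return math.exp(-(dh_growth_kcal / R_KCAL) * (1.0 / temp_k - 1.0 / tm_k))

if __name__ == "__main__":
    TM = 300.0  # hypothetical melting temperature (K), for illustration only
    for t in (290.0, 300.0, 310.0):
        print(t, round(propagation_s(t, TM, 0.89), 4))
```

With a positive ΔH°g the computed s rises above 1 as T passes Tm, which is the behavior expected for an endothermic helix-growth step.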


6/30

Comparison with Experiment

Points are experimental data (2 sets; Doty et al.), determined by optical rotation,

for several lengths: N = 26, 1500 residues.

Curves: predicted values of the fractional number of helical residues, h,

estimated by the Zipper model.

s values computed using:

ΔH°g = 0.89 kcal/mol > 0: helix formation is endothermic;

thus, s increases with T.

2 fitted σ values shown: σ = 1 x 10⁻⁴ (dashed);

σ = 2 x 10⁻⁴ (solid).


7/30

General Transition Characteristics

Transition shows a large N-dependence, even though s and σ are length-independent.

Cooperativity of the transition increases with N,

as measured by the narrowness of the transition.

Relative contributions of parameters are also strongly N-dependent.

In short polymers, nucleation (σ) is dominant:

initial formation unfavorable.

Propagation strongly inhibited.

At large N, propagation (s) dominates:

nucleation penalty distributed over more residues;

s quickly dominates as T increases.
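
The N-dependence described above can be explored numerically. The sketch below assumes the standard single-helix (zipper) form, in which a chain holds at most one helical stretch, a stretch of k residues has weight σ s^k, and it can be placed in N - k + 1 positions; the coil is the reference state with weight 1. The function name is hypothetical, and the sums are done in log space only to avoid overflow for long chains.

```python
import math

def zipper_fraction_helix(n, s, sigma):
    """Mean fraction of helical residues for an n-residue chain,
    allowing at most one helical stretch (zipper approximation)."""
    # log-weight of all conformations with one helical stretch of length k
    log_terms = [math.log(n - k + 1) + math.log(sigma) + k * math.log(s)
                 for k in range(1, n + 1)]
    shift = max(0.0, max(log_terms))      # the all-coil state has log-weight 0
    coil = math.exp(-shift)
    helix = [math.exp(t - shift) for t in log_terms]
    q = coil + sum(helix)                 # partition function (rescaled)
    mean_k = sum(k * w for k, w in zip(range(1, n + 1), helix)) / q
    return mean_k / n

if __name__ == "__main__":
    for n in (26, 1500):
        for s in (0.9, 1.0, 1.1, 1.2):
            print(n, s, round(zipper_fraction_helix(n, s, 2e-4), 3))
```

Scanning s for N = 26 and N = 1500 at the same σ shows the longer chain switching from coil to helix over a much narrower range of s.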


8/30

Validating the Size of σ

Fitted application of the Zipper model predicts the coil to α-helix transition to be highly cooperative:

fitted value, σ = 10⁻⁴.

This prediction can be separately validated as follows: statistical weight of nucleation = σs.

This accounts for formation of 1 H-bond. s accounts for the balance between ΔH° and ΔS° for only 1 residue.

σ accounts for the cooperativity of nucleation: nucleation restricts the (φ, ψ) angles of 4 residues

to values typical of an α-helix (φ = -57°, ψ = -47°); the ΔS° for only 1 of these 4 is included in s.

thus, σ = exp[3ΔS°res/R]

Net entropy change/residue:

ΔS°res = R ln Whelix - R ln Wcoil ≈ -R ln 9 = -18 J/mol K.

Substitution yields the estimate, σ = 1.5 x 10⁻³.
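
The arithmetic behind this estimate can be checked in a few lines. The sketch assumes Whelix = 1 and Wcoil = 9 conformations per residue, as implied by the -R ln 9 term; the function names are illustrative.

```python
import math

R = 8.314  # gas constant, J/(mol K)

def residue_entropy(w_helix=1, w_coil=9):
    """Entropy change (J/(mol K)) for fixing one residue in the helix."""
    return R * math.log(w_helix) - R * math.log(w_coil)

def sigma_from_entropy(ds_res, n_extra=3):
    """sigma = exp[n_extra * dS_res / R]; 3 of the 4 fixed residues
    are not already counted in s."""
    return math.exp(n_extra * ds_res / R)

if __name__ == "__main__":
    ds = residue_entropy()
    print(f"dS_res = {ds:.2f} J/(mol K)")
    print(f"sigma (unrounded dS_res) = {sigma_from_entropy(ds):.2e}")
    print(f"sigma (dS_res = -18)     = {sigma_from_entropy(-18.0):.2e}")
```

Using the rounded value of -18 J/mol K reproduces σ ≈ 1.5 x 10⁻³; the unrounded entropy gives 9⁻³ ≈ 1.4 x 10⁻³. Either way the estimate is roughly an order of magnitude larger than the fitted 10⁻⁴.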


9/30

The Coil to 3₁₀-Helix Transition

s and σ always depend on the nature of the transition: other transitions exhibit different s and σ values.

Example: Sequences of type (AAAAK)nA,

where A, K = alanine, lysine, respectively,

convert from 3₁₀ helices to α-helices

when n is increased from 3 to 4, i.e., with increasing total polymer length.

This suggests that the 3₁₀ helix

is easier to initiate (from a coil) than an α-helix:

σ(3₁₀) > σ(α-helix)

but, once initiated, the α-helix is more easily propagated:

s(α-helix) > s(3₁₀)

The difference in σ is due to conformational entropies:

nucleation of a 3₁₀ helix fixes the torsion angles of 1 fewer residue:

α-helix: H-bond between residues j and j+4.

3₁₀-helix: H-bond between residues j and j+3.


10/30

Sequence Dependence of σ, s

For regular α-helix formation (or melting), both σ and s are sequence dependent: each type of residue is characterized by a different s value.

The cooperativity, σ, of helix formation will also vary, but with the mean residue content.

Ex: In lysine-containing polymers, σ ≈ 10⁻³: nucleation is 10x more favorable compared to poly-[γ-benzyl-L-glutamate].

The impact of residue differences on α-helix stability

is studied using host-guest peptides: energetic variations due to a single, internal switched residue are measured; essentially no variation in σ.

emphasis: determination of variation in s.


11/30

Host-Guest Parameters

Begin with a host α-helix (yyyyyyy):

y = some residue stable as an α-helix. transition free energy/residue: ΔG°(y).

Replace 1 y with a guest residue (X). yields sequence: yyy-X-yyy

Measure ΔG° (kJ/mol) for all X values:

then, ΔG°(X) = ΔG°(y) + ΔΔG°(X)

Values yield α-helix propensities.

Here, shown normalized relative to the ΔG° of Gly,

since Gly lacks a Cβ (R = H).

All but Pro are more favorable than Gly. Pro is a strong α-helix breaker.

Conversion from ΔG°(X) to s:

assume σ ~ independent of X. then, s = exp[-ΔG°(X)/RT].

Here, s = 1.0 means neutral favorability.
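
The conversion s = exp[-ΔG°(X)/RT] is easy to sketch. The guest labels and ΔG° values below are hypothetical placeholders, not host-guest measurements, and the temperature of 298.15 K is an assumption.

```python
import math

R = 8.314e-3  # gas constant, kJ/(mol K)

def s_from_dg(dg_kj_per_mol, temp_k=298.15):
    """Propagation parameter s from a transition free energy in kJ/mol."""
    return math.exp(-dg_kj_per_mol / (R * temp_k))

if __name__ == "__main__":
    # Hypothetical free energies (kJ/mol), for illustration only
    example_dg = {"guest_1": -1.0, "guest_2": 0.0, "guest_3": 2.0}
    for name, dg in example_dg.items():
        print(name, round(s_from_dg(dg), 3))
```

A negative ΔG° gives s > 1 (helix favoring), zero gives s = 1 (neutral), and a positive ΔG° gives s < 1 (helix disfavoring).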


12/30

Modeling Melting Initiation

Consider an N-residue polypeptide

in the fully helical conformation: hhhhhh

weight: ω_N = σ s^N.

Melting of the helix

can occur by 2 fundamentally different processes:

melting of an end residue:

2 conformations: chhhhh and hhhhhc.

total weight: ω_(N-1) = 2σ s^(N-1).

melting of a middle residue:

N-2 conformations, of the form hhhchhh; each has 2 helix-islands.

total weight: ω_(N-1) = (N-2) σ² s^(N-1).

More generally, a conformation with j helix-islands

will contain j factors of σ.

this motivates our Zipper model.


13/30

End vs. Middle Melting

The relative probabilities of initiating melting at the end vs. the middle

are estimated by a ratio of statistical weights:

Pe/Pm = 2σ s^(N-1) / [(N-2) σ² s^(N-1)] ≈ 2/(Nσ).

we have 2 opposing factors:

N = number of central melting points. σ = penalty for initiating melting at a given middle point.

Assuming the typical experimental value (σ = 2 x 10⁻⁴),

Pe/Pm = 1 when N ≈ 10⁴.

For short helices (N < 10⁴ residues), σ dominates:

transition initiates at the ends.

For long helices (N > 10⁴ residues), N overcomes σ:

denaturation may then proceed from the middle. These trends are observed experimentally in globular proteins.
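
A short numerical check of the end-versus-middle ratio, using the statistical weights given above (the s^(N-1) factors cancel). Function names are illustrative.

```python
def end_to_middle_ratio(n, sigma):
    """Pe/Pm = 2*sigma*s^(N-1) / [(N-2)*sigma^2*s^(N-1)] = 2 / ((N-2)*sigma)."""
    return 2.0 / ((n - 2) * sigma)

def crossover_length(sigma):
    """Chain length at which Pe/Pm = 1 (exact form, before the large-N limit)."""
    return 2.0 / sigma + 2.0

if __name__ == "__main__":
    sigma = 2e-4
    for n in (20, 100, 1000, 10_000, 100_000):
        print(n, round(end_to_middle_ratio(n, sigma), 3))
    print("crossover N:", crossover_length(sigma))
```

For σ = 2 x 10⁻⁴ the crossover falls at N ≈ 10⁴, so helices of ordinary protein length are expected to fray from the ends.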


14/30

Predicting Protein Structure

Given sets of (s, σ) values for all 20 amino acids, for formation of all types of 2° structures

(coil to α-helix, β-strand, or 3₁₀ helix, etc.),

we should be able to apply a Zipper model

to predict the probability of adopting each type of structure during folding and melting of globular proteins.

Added complication:

s = exp(-ΔG°/RT) also depends on external factors,

e.g., ΔG° varies with residue environment. Globular proteins offer 2 distinct environments:

hydrophobic: buried residues in the protein interior.

hydrophilic: solvent accessible residues at the protein surface.

meaningful assignment of an s value to each residue j

demands knowledge of whether j is buried, in each context.


15/30

Chou and Fasman

Statistical thermodynamics is not routinely used to model protein folding;

however, many statistical methods for predicting 2° structure have been developed.

these incorporate many of the essential features of the Zipper model.

The Method of Chou and Fasman (1974):

begins with an empirical set of residue parameters,

defined not by measured transition energies (ΔG°),

but by the statistical tendency of each residue to form each type of structure,

as determined from the mole fractions present in actual protein crystals.

first parameters used data from 64 different proteins.


16/30

Chou and Fasman Parameters

For each type of amino acid, three types of parameters are computed:

Pα = propensity to form an α-helix.

Pβ = propensity to form a β-sheet.

Pt = propensity to turn (adopt a coil).

Example: Determination of Pα values. First, a mole fraction is computed for each type, i:

χα(i) = occurrence of i in an α-helix / occurrence of i in the data set.

Secondly, an average α-helical amino acid is defined,

with an average value of χα(i):

<χα> = Σi χα(i) / 20

Parameter Pα(i) is then defined as a relative tendency:

Pα(i) = χα(i) / <χα>

really just a weighted average.

Repeated for each type of 2° structure.
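
The two-step normalization above can be sketched as follows. The counts are invented toy numbers for three residue types, not database statistics; a real calculation would use all 20 amino acids, so the mean would be taken over 20 values. The function name is hypothetical.

```python
def helix_propensities(helix_counts, total_counts):
    """Relative helix tendency per residue type.

    Step 1: fraction of each residue type found in helices.
    Step 2: divide by the mean of those fractions over residue types.
    """
    fractions = {aa: helix_counts[aa] / total_counts[aa] for aa in total_counts}
    mean_fraction = sum(fractions.values()) / len(fractions)
    return {aa: f / mean_fraction for aa, f in fractions.items()}

if __name__ == "__main__":
    # Toy counts, for illustration only
    total = {"Ala": 100, "Gly": 100, "Pro": 50}
    in_helix = {"Ala": 60, "Gly": 20, "Pro": 5}
    for aa, p in helix_propensities(in_helix, total).items():
        print(aa, round(p, 2))
```

By construction the propensities average to 1, so values above 1 mark residue types found in helices more often than the average residue, and values below 1 mark those found less often.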


17/30

The Favorability of Propagation

Parameters Pα and Pβ correspond to the mean propagation terms

in their respective Zipper models, averaged over solvent conditions.

Ex.: Pα corresponds conceptually to s in the Zipper model of the coil to α-helix transition.

A qualitative propensity is also assigned to each residue,

relative to each type of structure.

e.g., relative to α-helix formation, residues are categorized as:

Strong Helix Formers (H)

Average Helix Formers (h)

Weak Helix Formers (I)

Indifferent (i)

Weak Helix Breakers (b)

Strong Helix Breakers (B)

Again, this is repeated for each type of 2° structure.


18/30

Chou-Fasman Parameter Set

Comparison w/ Host-Guest Parameters:

relative favorabilities:

general agreement.

differences in the ordering.

P values play the role of propagation terms, s.

Proline:

low Pα and Pβ,

due to restricted torsion angles.

α-helix, β-sheet breaker.

Glycine:

low Pα, but high Pt:

great conformational freedom.

3rd residue of a Type II turn.


19/30

The Cooperativity of Nucleation

The cooperativity of 2° structure formation, i.e., the statistical unlikelihood of nucleation,

is also included in the Chou-Fasman model,

but implicitly, in the rules of region assignment: e.g., whether a sub-sequence is helix, sheet, or coil.

Regions of 2° structure are assigned by inspection, where any 2° structure requires a string of residues of similar propensity.

Example: For an α-helix: initiation of a helix requires a contiguous set of helix formers:

H, h, or I, with I given ½-weight. clearly modeling the cooperativity of helix nucleation.

nucleated helices propagate through residues H, h, I, and i,

and terminate when two or more helix breakers are encountered. again, modeling the cooperativity of the process.
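
The nucleation rule can be sketched as a sliding-window scan over per-residue class labels. The window of 6 residues and the threshold of 4 weighted formers are assumptions taken from the commonly quoted form of the Chou-Fasman rules; the slides do not state them. Only nucleation is shown, not propagation or termination, and the function name is hypothetical.

```python
# Weights for helix formers: H and h count fully, I counts half.
FORMER_WEIGHT = {"H": 1.0, "h": 1.0, "I": 0.5}

def nucleation_sites(classes, window=6, threshold=4.0):
    """Return start indices of windows whose weighted count of helix
    formers reaches the threshold (window and threshold are assumed values)."""
    sites = []
    for start in range(len(classes) - window + 1):
        segment = classes[start:start + window]
        score = sum(FORMER_WEIGHT.get(c, 0.0) for c in segment)
        if score >= threshold:
            sites.append(start)
    return sites

if __name__ == "__main__":
    # One class label per residue: H, h, I, i, b, or B
    print(nucleation_sites("HHhIIibB"))
    print(nucleation_sites("HIIIbbii"))
```

Requiring several formers within one short window is what makes isolated helix-favoring residues insufficient to start a helix, mirroring the small value of σ in the Zipper model.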


20/30

Example: Chou-Fasman Method

Applied to the first 24 residues of adenylate kinase, the method predicts 2 structures:

N-terminal string with α-helix forming tendency. mean weight: <Pα> = 1.39.

2nd string with both α-helix and β-sheet forming tendency.

mean β-tendency higher: <Pβ> = 1.56.

Experimentally, the strings correspond to α-helix, β-sheet. A β-turn (specific coil) is also observed,

predicted by a hydropathy-based modification by Rose (1978).


21/30

Example (cont.)

Applied to the remainder of adenylate kinase, and also compared with a 2nd method (Nagano).

best results provided by a joint method:

here, it obtains ~ 70% accuracy.


22/30

Evaluating Accuracy

The most widely used method: the overall, per-residue, 3-state accuracy (Q3):

Q3 = [(PH + PE + PC)/N] x 100%

N = total number of residues.

PX = number of correctly predicted residues in state X.

X = α-Helix, β-shEet, or Coil.

Although other methods exist, Q3 is the most conceptually simple.

Pioneering method by Chou-Fasman:

overall accuracy of only about Q3 = 50%,

as assessed by a database of 267 known structures.

initially very popular, due to conceptual simplicity.
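
A minimal sketch of the Q3 calculation, assuming predicted and observed structures are given as equal-length strings over the letters H, E, and C. The example strings are made up for illustration and the function name is hypothetical.

```python
def q3(predicted, observed):
    """Per-residue 3-state accuracy, in percent."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    # PX = number of residues correctly predicted in state X
    correct = {state: 0 for state in "HEC"}
    for p, o in zip(predicted, observed):
        if p == o and o in correct:
            correct[o] += 1
    return 100.0 * sum(correct.values()) / len(observed)

if __name__ == "__main__":
    print(q3("HHHEECCC", "HHHHECCC"))  # 7 of 8 residues agree
```

Because Q3 only counts per-residue matches, it does not distinguish a prediction that misplaces a whole element from one that merely frays its ends.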


23/30

Improvements on Chou-Fasman

Many improvements have appeared. These differ based on parameter definition and application.

an in-depth consideration is beyond the scope of this course.

however, success is correlated with the addition of relevant statistical information:

[1] Information regarding residue context, i.e., the propensity of a residue to adopt a given state

is determined by its n neighboring residues,

as compared with observations in a database.

We examine: the GOR method (Garnier, 1987).

[2] Information regarding homologous proteins.

protein first subjected to multiple alignment,

to identify homologous proteins.

prediction then based on consensus propensities.

We examine: the PHD method (Rost and Sander, 1993).


24/30

The GOR Method

Propensity of a residue to adopt state S is defined not only by its own identity,

as in Chou-Fasman,

but also by the identities of neighboring residues.

GOR uses a 17-residue window:

a central, predicted residue + 8 flanking residues on each side.

e.g., residues 4-20 are used to predict the state of the 12th residue (F) of adenylate kinase.


25/30

The GOR Method (cont.)

Using sequences in the database, 3 scoring matrices, MS, were first constructed: one for each of the 3 basic structural states, S = {H, E, C}.

Each is a 20x17 matrix, with elements mxy:

row, x = amino acid type (e.g., Ala). column, y = residue position within the window.

mxy = the probability that residue y is of type x, given that the central residue is in state S.

So, the sum of the mxy values in each column is 1.

Again, each matrix is constructed in advance, from observed frequencies in the database

(e.g., from all known protein structures).

Any candidate sequence is evaluated at each position, k, by applying each scoring matrix, MS:

Residue k is taken as the central residue of the window,

and elements (mxy) are summed for all 1 ≤ y ≤ 17, such that the residue at each window position, y, is of type x.

The highest of the 3 sums (H, E, and C) yields the prediction, S, for that residue's state.
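
The scoring procedure described above can be sketched with toy matrices. The matrices below are hypothetical placeholders (one favored residue type per state), not GOR parameters, and skipping window positions that fall off the chain ends is an assumption; the slides do not say how ends are handled.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WINDOW = 17
HALF = WINDOW // 2  # 8 flanking residues on each side

def toy_matrix(favoured):
    """Hypothetical 20x17 matrix: one residue type has probability 0.24
    in every column, the other 19 have 0.04 (each column sums to 1)."""
    return {aa: [0.24 if aa == favoured else 0.04] * WINDOW
            for aa in AMINO_ACIDS}

def window_scores(sequence, k, matrices):
    """Sum m_xy over the window centred on index k (0-based) for each state."""
    scores = {}
    for state, m in matrices.items():
        total = 0.0
        for y in range(WINDOW):
            pos = k - HALF + y
            if 0 <= pos < len(sequence):  # assumption: ignore positions off the ends
                total += m[sequence[pos]][y]
        scores[state] = total
    return scores

def predict_state(sequence, k, matrices):
    """State whose matrix gives the highest window sum."""
    scores = window_scores(sequence, k, matrices)
    return max(scores, key=scores.get)

if __name__ == "__main__":
    matrices = {"H": toy_matrix("A"), "E": toy_matrix("V"), "C": toy_matrix("G")}
    seq = "A" * 17 + "V" * 17
    print("".join(predict_state(seq, k, matrices) for k in range(len(seq))))
```

Keeping one matrix per state and taking the largest window sum follows the description above; how the real matrices are estimated and combined is outside this sketch.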


26/30

Example: The GOR Method

An application of GOR IV is shown below,

run at the Network Protein Sequence @nalysis site:

http://npsa-pbil.ibcp.fr/cgi-bin/secpred_gor4.pl

applied to the 1st 24 residues of adenylate kinase:

this method correctly predicts the β-sheet and β-turn.

the α-helix at residues (1-8) is predicted as coil,

although its structural propensity to form a helix is noted (blue line).

Overall, GOR IV has an accuracy of Q3 = 64.4%.


27/30

Use of Multiple Alignments

The method of multiple alignments was first used to aid in protein 2° structural prediction

by Zvelebil (1987), in combination with the GOR I method.

accuracy improved by 9%.

Basic idea:

Given a sequence to be evaluated:

identify a set of homologous (i.e., similar) sequences,

each with > 25% sequence identity.

2° structure prediction is then based on consensus propensities.

one popular multiple-alignment-based method:

The PHD Method:

Profile network from HeiDelberg:

combines sequence homology info. with a neural network.


28/30

Profile network from HeiDelberg

The PHD Method (Rost and Sander, 1993) combines sequence homology information

with the optimization strength of a 2-layer neural network.

1st Layer: Raw Predictions

Input: fractions of the 20 types of residue at each multiple-alignment position,

in a 13-residue window around the evaluated residue, k.

total of 20x13 = 260 input nodes.

Output: probability for each state (PH, PE, PC).

2nd Layer: Elimination of Infeasible Structures

input: output of the first layer,

after application to each residue in the chain.

output: refined probabilities. refines the raw predictions of the 1st layer,

e.g., HHHEEHH becomes HHHHHHH
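
The first-layer input (20 residue fractions at each of 13 window positions) can be sketched as a flat vector of 260 numbers. This shows only the input encoding, not the network itself. The aligned sequences are hypothetical toy data, and treating gaps as ignored and off-chain positions as all zeros are assumptions.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
WINDOW = 13
HALF = WINDOW // 2

def column_profile(alignment, col):
    """Fractions of the 20 residue types in one alignment column
    (gap or unknown characters are ignored; an assumption)."""
    residues = [seq[col] for seq in alignment if seq[col] in AMINO_ACIDS]
    if not residues:
        return [0.0] * len(AMINO_ACIDS)
    return [residues.count(aa) / len(residues) for aa in AMINO_ACIDS]

def first_layer_input(alignment, k):
    """Concatenate the profiles of the 13 columns centred on column k."""
    length = len(alignment[0])
    vector = []
    for col in range(k - HALF, k + HALF + 1):
        if 0 <= col < length:
            vector.extend(column_profile(alignment, col))
        else:
            vector.extend([0.0] * len(AMINO_ACIDS))  # assumption: pad ends with zeros
    return vector

if __name__ == "__main__":
    # Hypothetical aligned sequences of equal length
    alignment = ["MEEKLKKTKIIFV", "MEEKLKKSKIIFV", "MDEKLKKTNIIFV"]
    v = first_layer_input(alignment, 6)
    print(len(v))  # 20 x 13 input values
```

Feeding column fractions rather than a single residue letter is how the alignment information enters the prediction: a conserved position yields a sharply peaked profile, a variable one a spread-out profile.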


29/30

Example

An application of PHD is shown below, run at the PredictProtein server (Columbia):

http://dodo.cpmc.columbia.edu/predictprotein/submit_def.html

initial homology search: Psi-Blast.

applied to the 1st 24 residues of adenylate kinase.

This method correctly predicts the β-sheet (r10-r14):

E region (blue).

It also predicts the reverse-turn (r16-r22): L region (green).

however, the α-helix is predicted to be coil, as in the GOR method,

but a 30% α-helical probability is assigned to the region (red).

Overall PHD accuracy: Q3 = 70.8%.


30/30

Conclusion

In this lecture, the helix-coil transition was used to discuss:

The coil to α-helix transition of a model polypeptide system:

poly-[γ-benzyl-L-glutamate];

and the sequence-dependence of s and σ.

The tendency of short polypeptides to melt at helix ends.

The lower cooperativity of 3₁₀ helix formation.

Limitations of the Zipper model were then discussed:

The dependence of s on the (unknown) residue environment,

so that purely statistical methods of prediction are more usual.

The conceptual relationship b/w the Zipper model

and statistical methods of predicting protein 2° structure.
