Chou-Fasman Algorithm for Protein Structure Prediction

By Roshan Karunarathna and Gopika Ravichandra

Contents…

• Importance of the structures of proteins
• Prediction of 2D structures
• Chou-Fasman algorithm
• How it works!

What is the Chou-Fasman algorithm?

• The experimental methods used by biotechnologists to determine the structures of proteins demand sophisticated equipment and time.
• A host of computational methods have been developed to predict the location of secondary structure elements in proteins, either to complement experimental results or to provide insight into them.
• The Chou-Fasman algorithm is an empirical algorithm developed for the prediction of protein secondary structure.


Before we go…

• Structures of proteins…
• Why the study of structures is important…
• Why an algorithm is needed…


Secondary structure prediction

• In either case, amino acid propensities should be useful for predicting secondary structure
• Two classical methods that use previously determined propensities:
  • Chou-Fasman
  • Garnier-Osguthorpe-Robson


Goal…

• Take the primary structure (sequence) and, using rules derived from known structures, predict the secondary structure that is most likely to be adopted by each residue
• Major classes are α-helices, β-sheets and loops


Structural Propensities

• Due to the size, shape and charge of its side chain, each amino acid may “fit” better in one type of secondary structure than another
• Classic example: the rigidity and side chain angle of proline cannot be accommodated in an α-helical structure


Structural Propensities

• Two ways to view the significance of this preference (or propensity):
  • It may control or affect the folding of the protein in its immediate vicinity (amino acid determines structure)
  • It may constitute selective pressure to use particular amino acids in regions that must have a particular structure (structure determines amino acid)


Chou-Fasman method

• Uses a table of conformational parameters (propensities) determined primarily from measurements of secondary structure by CD spectroscopy
• The table consists of one “likelihood” for each structure for each amino acid


Chou-Fasman Algorithm

• Conformational parameters for every amino acid (AA):
  • P(α) = propensity in an alpha helix
  • P(β) = propensity in a beta sheet
  • P(turn) = propensity in a turn
• Based on observed propensities in proteins of known structure


Chou-Fasman propensities (partial table)

Amino Acid   P(α)   P(β)   P(turn)
Glu          1.51   0.37   0.74
Met          1.45   1.05   0.60
Ala          1.42   0.83   0.66
Val          1.06   1.70   0.50
Ile          1.08   1.60   0.50
Tyr          0.69   1.47   1.14
Pro          0.57   0.55   1.52
Gly          0.57   0.75   1.56
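The partial table can be held in a small lookup structure. The following is a minimal Python sketch, not part of the original slides: the names PROPENSITY, favoured_state and mean_propensities are hypothetical, and only the eight residues listed above are covered.

```python
# Propensities copied from the partial table: (P_alpha, P_beta, P_turn).
# Only these eight residues are covered; the slides list no others.
PROPENSITY = {
    "E": (1.51, 0.37, 0.74),  # Glu
    "M": (1.45, 1.05, 0.60),  # Met
    "A": (1.42, 0.83, 0.66),  # Ala
    "V": (1.06, 1.70, 0.50),  # Val
    "I": (1.08, 1.60, 0.50),  # Ile
    "Y": (0.69, 1.47, 1.14),  # Tyr
    "P": (0.57, 0.55, 1.52),  # Pro
    "G": (0.57, 0.75, 1.56),  # Gly
}
STATES = ("helix", "sheet", "turn")


def favoured_state(residue):
    """Return the structure class with the highest propensity for one residue."""
    values = PROPENSITY[residue]
    return STATES[values.index(max(values))]


def mean_propensities(segment):
    """Average each propensity column over a segment (hypothetical helper)."""
    n = len(segment)
    return tuple(sum(PROPENSITY[r][k] for r in segment) / n for k in range(3))
```

For instance, favoured_state("E") gives "helix" and favoured_state("G") gives "turn", which matches reading the largest value off each row of the table.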


Chou-Fasman method

• A prediction is made for each type of structure for each amino acid
• Can result in ambiguity if a region has high propensities for both helix and sheet (higher value usually chosen, with exceptions)


Chou-Fasman method

• Calculation rules are somewhat ad hoc
• Example: method for helix
  • Search for a nucleating region where 4 out of 6 a.a. have P(α) > 1.03
  • Extend until 4 consecutive a.a. have an average P(α) < 1.00
  • If the region is at least 6 a.a. long, has an average P(α) > 1.03, and average P(α) > average P(β), consider the region to be helix
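The helix rules above can be sketched in Python. This is one possible reading of the rules, not the authors' implementation: the function name find_helices is hypothetical, the propensities are limited to the eight residues in the partial table, and the extension step adds a residue as long as the four-residue stretch it completes still averages at least 1.00.

```python
# Helix and sheet propensities for the eight residues in the partial table.
P_ALPHA = {"E": 1.51, "M": 1.45, "A": 1.42, "V": 1.06,
           "I": 1.08, "Y": 0.69, "P": 0.57, "G": 0.57}
P_BETA = {"E": 0.37, "M": 1.05, "A": 0.83, "V": 1.70,
          "I": 1.60, "Y": 1.47, "P": 0.55, "G": 0.75}


def mean(values):
    return sum(values) / len(values)


def find_helices(seq, window=6, needed=4, nucleus_cut=1.03, stop_cut=1.00):
    """Return half-open (start, end) index pairs for predicted helices."""
    pa = [P_ALPHA[r] for r in seq]
    pb = [P_BETA[r] for r in seq]
    helices = []
    prev_end = 0  # never extend left into a region that was already examined
    i = 0
    while i + window <= len(seq):
        # Nucleation: at least `needed` of `window` residues above the cutoff.
        if sum(p > nucleus_cut for p in pa[i:i + window]) < needed:
            i += 1
            continue
        start, end = i, i + window
        # Extension (assumed reading): add a residue while the four-residue
        # stretch ending (or starting) with it averages at least stop_cut.
        while end < len(seq) and mean(pa[end - 3:end + 1]) >= stop_cut:
            end += 1
        while start > prev_end and mean(pa[start - 1:start + 3]) >= stop_cut:
            start -= 1
        # Acceptance: long enough, helix-favouring, and stronger than sheet.
        avg_a, avg_b = mean(pa[start:end]), mean(pb[start:end])
        if end - start >= 6 and avg_a > nucleus_cut and avg_a > avg_b:
            helices.append((start, end))
        prev_end = end
        i = end
    return helices
```

On the made-up sequence "GPGPEAMEAMEAGPGP" this sketch reports one helix spanning indices 2 to 13. A Val/Ile-rich stretch such as "VIVIVIYVIV" nucleates but is rejected, because its average P(β) exceeds its average P(α).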


• Scan the peptide and identify regions where 3 out of 5 contiguous residues have P(β) > 100 (propensities here are scaled by 100, so 100 corresponds to 1.00).
• These residues nucleate β-strands. Extend these in both directions until a set of four contiguous residues has an average P(β) < 100.
• This ends the β-strand.
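The β-strand scan can be sketched the same way, here on the scale of 100 used in the rule. The names strand_nuclei and extend_strand are hypothetical, the values are the P(β) column of the partial table multiplied by 100, and the extension logic is an assumed reading of "extend until four contiguous residues average below 100".

```python
# P(beta) x 100 for the eight residues in the partial table.
P_BETA_100 = {"E": 37, "M": 105, "A": 83, "V": 170,
              "I": 160, "Y": 147, "P": 55, "G": 75}


def strand_nuclei(seq, window=5, needed=3, cut=100):
    """Start indices of every window with at least `needed` residues above `cut`."""
    pb = [P_BETA_100[r] for r in seq]
    return [i for i in range(len(seq) - window + 1)
            if sum(p > cut for p in pb[i:i + window]) >= needed]


def extend_strand(seq, start, window=5, cut=100):
    """Grow a nucleus at `start` in both directions; returns half-open (lo, hi)."""
    pb = [P_BETA_100[r] for r in seq]
    lo, hi = start, start + window
    # Add a residue on the right while the four residues ending with it average >= cut.
    while hi < len(pb) and sum(pb[hi - 3:hi + 1]) / 4 >= cut:
        hi += 1
    # Add a residue on the left while the four residues starting with it average >= cut.
    while lo > 0 and sum(pb[lo - 1:lo + 3]) / 4 >= cut:
        lo -= 1
    return lo, hi
```

For the made-up sequence "GPEVIVYVEPG", strand_nuclei returns the window starts 1 through 5, and extend_strand from the first nucleus covers indices 1 to 9.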


• Any region containing overlapping α and β assignments is taken to be helical or β depending on which of the average P(α) and P(β) for that region is larger.
• If this reduces an α or β region so that it becomes less than 5 residues long, the α or β assignment for that region is removed.
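The overlap rule can be illustrated with a short sketch. It assumes regions are given as half-open (start, end) index pairs, uses only the eight residues from the partial table, and the names trim and resolve_overlap are hypothetical. Ties are given to the β assignment here, which is an arbitrary choice the slides do not specify.

```python
P_ALPHA = {"E": 1.51, "M": 1.45, "A": 1.42, "V": 1.06,
           "I": 1.08, "Y": 0.69, "P": 0.57, "G": 0.57}
P_BETA = {"E": 0.37, "M": 1.05, "A": 0.83, "V": 1.70,
          "I": 1.60, "Y": 1.47, "P": 0.55, "G": 0.75}


def trim(region, lo, hi, min_len):
    """Cut [lo, hi) out of region; keep leftover pieces of at least min_len residues."""
    pieces = [(region[0], min(region[1], lo)), (max(region[0], hi), region[1])]
    return [(s, e) for s, e in pieces if e - s >= min_len]


def resolve_overlap(seq, helix, strand, min_len=5):
    """Return (helix_regions, strand_regions) after settling one overlap."""
    lo, hi = max(helix[0], strand[0]), min(helix[1], strand[1])
    if lo >= hi:
        return [helix], [strand]  # the regions do not overlap
    overlap = seq[lo:hi]
    avg_a = sum(P_ALPHA[r] for r in overlap) / len(overlap)
    avg_b = sum(P_BETA[r] for r in overlap) / len(overlap)
    if avg_a > avg_b:
        # Helix keeps the overlap; the strand loses it and may be dropped.
        return [helix], trim(strand, lo, hi, min_len)
    # Strand keeps the overlap (also on ties, by assumption).
    return trim(helix, lo, hi, min_len), [strand]
```

With a helix at (0, 9) and a strand at (6, 14) on "EAMEAMVIVIVIYV", the overlap "VIV" favours β, so the helix shrinks to (0, 6). If the strand were (6, 13) on "EAMEAMEAMVIVIV", the helix would win the overlap and the four leftover strand residues would be discarded.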


SPASEASDGQSVSV

Residue   P(α)   P(β)
S         77     75
P         55     55
A         142    83
S         77

SPASEASDGQFETTY

Residue   P(α)   P(β)
E         151    37
A         142    83
S         77     75
D         101    54
G         57
Q         111

1) 4 of 6 residues with P(α) > 100
2) Extend RIGHT until 4 contiguous residues have P(α) < 100
3) Calculate ΣP(α) and ΣP(β). Is ΣP(α) > ΣP(β)? (Do not include the last 4 in the sum)

Find the potential alpha helix:

MFCTYYGNNGEHIELMM


Accuracy of Chou-Fasman predictions

• Sequences whose 3D structures are known are processed so that each residue is “assigned” to a given secondary structure class by looking at the backbone angles
• Three classes are most often used (helix=H, sheet=E, turn=C), but sometimes four classes are used (helix, sheet, turn, loop)
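Once every residue carries an observed class and a predicted class, accuracy is commonly summarised as the percentage of residues predicted correctly (often called Q3 for three classes). A minimal sketch, with hypothetical function names and made-up example strings:

```python
def q3(predicted, observed):
    """Percentage of residues whose predicted class equals the observed class."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def per_class_accuracy(predicted, observed, classes="HEC"):
    """For each observed class, the percentage of its residues predicted correctly."""
    result = {}
    for c in classes:
        total = sum(o == c for o in observed)
        if total:  # skip classes that never occur in the observed string
            correct = sum(p == o == c for p, o in zip(predicted, observed))
            result[c] = 100.0 * correct / total
    return result
```

For an observed string "HHHHEEEECC" and a prediction "HHHCEEHHCC", seven of ten residues agree, so q3 returns 70.0, with per-class values of 75.0 for H, 50.0 for E and 100.0 for C.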

Conclusion…..

Confusion matrix for Chou-Fasman method on 78 proteins

Rows: true class. Columns: predicted class.

True   H      E      C      Unknown
H      47.5   3.0    4.3    45.2
E      20.8   16.8   7.1    55.4
C      6.4    3.6    38.0   52.0

Data from Z-Y Zhu, Protein Engineering 8:103-109, 1995

Average accuracy = 54.4%
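The matrix can also be read programmatically. The sketch below assumes that rows are true classes and columns are predicted classes (each row sums to roughly 100), and the function names are hypothetical. It does not attempt to reproduce the quoted average, since the slides do not say how that figure was averaged.

```python
# Confusion matrix from the slide, assuming rows = true class, columns = predicted.
CONFUSION = {
    "H": {"H": 47.5, "E": 3.0, "C": 4.3, "Unknown": 45.2},
    "E": {"H": 20.8, "E": 16.8, "C": 7.1, "Unknown": 55.4},
    "C": {"H": 6.4, "E": 3.6, "C": 38.0, "Unknown": 52.0},
}


def correct_share(true_class):
    """Percentage of residues of a true class that were predicted as that class."""
    return CONFUSION[true_class][true_class]


def correct_share_when_predicted(true_class):
    """Same, but only among residues that received an H, E or C prediction."""
    row = CONFUSION[true_class]
    assigned = row["H"] + row["E"] + row["C"]
    return 100.0 * row[true_class] / assigned
```

Under that assumption, about 47.5% of true helix residues are predicted as helix overall, rising to roughly 87% once the "Unknown" column is excluded, while sheet residues fare much worse on both measures.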

Thank You!
