Protein Structural Prediction

Upload: flora-griffin

Post on 24-Dec-2015

Page 1: Protein Structural Prediction. Protein Structure is Hierarchical

Protein Structural Prediction

Page 2: Protein Structural Prediction. Protein Structure is Hierarchical

Protein Structure is Hierarchical

Page 3: Protein Structural Prediction. Protein Structure is Hierarchical

Structure Determines Function

What determines structure?

• Energy

• Kinematics

How can we determine structure?

• Experimental methods

• Computational predictions

The Protein Folding Problem

Page 4: Protein Structural Prediction. Protein Structure is Hierarchical

Primary Structure: Sequence

• The primary structure of a protein is the amino acid sequence

Page 5: Protein Structural Prediction. Protein Structure is Hierarchical

Primary Structure: Sequence

• Twenty different amino acids have distinct shapes and properties

Page 6: Protein Structural Prediction. Protein Structure is Hierarchical

Primary Structure: Sequence

A useful mnemonic for the hydrophobic amino acids is "FAMILY VW"
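As a small illustration of the mnemonic, the sketch below classifies residues as hydrophobic (H) or polar (P) using only the eight one-letter codes spelled by "FAMILY VW". The function names `to_hp` and `hydrophobic_fraction` are hypothetical, and treating every other residue as polar is a simplifying assumption.

```python
# One-letter amino acid codes spelled out by the "FAMILY VW" mnemonic.
HYDROPHOBIC = set("FAMILYVW")


def to_hp(sequence: str) -> str:
    """Map each residue to 'H' (in the mnemonic) or 'P' (everything else)."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in sequence.upper())


def hydrophobic_fraction(sequence: str) -> float:
    """Fraction of residues that the mnemonic labels hydrophobic."""
    hp = to_hp(sequence)
    return hp.count("H") / len(hp) if hp else 0.0


if __name__ == "__main__":
    seq = "MKTAYIAKQR"  # arbitrary toy sequence, not a real protein
    print(to_hp(seq), hydrophobic_fraction(seq))
```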

Page 7: Protein Structural Prediction. Protein Structure is Hierarchical

Secondary Structure: α, β, & loops

α helices and β sheets are stabilized by hydrogen bonds between backbone oxygen and hydrogen atoms

Page 8: Protein Structural Prediction. Protein Structure is Hierarchical

Secondary Structure: α helix

Page 9: Protein Structural Prediction. Protein Structure is Hierarchical

Secondary Structure: β sheet

β sheet

β bulge

Page 10: Protein Structural Prediction. Protein Structure is Hierarchical

Second-and-a-half-ary Structure: Motifs

beta helix

beta barrel

beta trefoil

Page 11: Protein Structural Prediction. Protein Structure is Hierarchical

Tertiary Structure: Domains

Page 12: Protein Structural Prediction. Protein Structure is Hierarchical

Mosaic Proteins

Page 13: Protein Structural Prediction. Protein Structure is Hierarchical

Tertiary Structure: A Protein Fold

Page 14: Protein Structural Prediction. Protein Structure is Hierarchical

Protein Folds Composed of α, β, other

Page 15: Protein Structural Prediction. Protein Structure is Hierarchical

Quaternary Structure: Multimeric Proteins or Functional Assemblies

• Multimeric Proteins

• Macromolecular Assemblies

Ribosome: protein synthesis

Replisome: DNA copying

Hemoglobin: a tetramer

Page 16: Protein Structural Prediction. Protein Structure is Hierarchical

Protein Folding

• The amino-acid sequence of a protein determines the 3D fold [Anfinsen et al., 1950s]

Some exceptions:
  - All proteins can be denatured
  - Some proteins have multiple conformations
  - Some proteins get folding help from chaperones

• The function of a protein is determined by its 3D fold

• Can we predict 3D fold of a protein given its amino-acid sequence?

Page 17: Protein Structural Prediction. Protein Structure is Hierarchical

The Levinthal Paradox

• Given a small protein (100 aa), assume 3 possible conformations per peptide bond

• $3^{100} \approx 5 \times 10^{47}$ conformations

• Fastest motions take $10^{-15}$ sec, so sampling all conformations would take $5 \times 10^{32}$ sec

• $60 \times 60 \times 24 \times 365 = 31536000$ seconds in a year

• Sampling all conformations would take $1.6 \times 10^{25}$ years

• Yet each protein folds quickly into a single stable native conformation: the Levinthal paradox
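The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch of the back-of-the-envelope estimate; the constant and function names are illustrative, and the inputs (3 conformations per bond, 100 residues, 1e-15 s per sample) are the slide's own assumptions.

```python
CONFORMATIONS_PER_BOND = 3      # assumed on the slide
RESIDUES = 100                  # "small protein" on the slide
FASTEST_MOTION_SEC = 1e-15      # time to sample one conformation
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def levinthal_estimate(n=RESIDUES, k=CONFORMATIONS_PER_BOND, dt=FASTEST_MOTION_SEC):
    """Return (number of conformations, seconds, years) for exhaustive sampling."""
    conformations = k ** n
    seconds = conformations * dt
    years = seconds / SECONDS_PER_YEAR
    return conformations, seconds, years


if __name__ == "__main__":
    conformations, seconds, years = levinthal_estimate()
    print(f"{conformations:.2e} conformations")
    print(f"{seconds:.2e} seconds")
    print(f"{years:.2e} years")
```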

Page 18: Protein Structural Prediction. Protein Structure is Hierarchical

Quick Overview of Energy

| Bond | Strength (kcal/mole) |
|---|---|
| H-bonds | 3-7 |
| Ionic bonds | 10 |
| Hydrophobic interactions | 1-2 |
| Van der Waals interactions | 1 |
| Disulfide bridge | 51 |

Page 19: Protein Structural Prediction. Protein Structure is Hierarchical

The Hydrophobic Effect

• Important for folding, because every amino acid participates!

2.25 Trp

1.80 Ile

1.79 Phe

1.70 Leu

1.54 Cys

1.23 Met

1.22 Val

0.96 Tyr

0.72 Pro

0.31 Ala

0.26 Thr

0.13 His

0.00 Gly

-0.04 Ser

-0.22 Gln

-0.60 Asn

-0.64 Glu

-0.77 Asp

-0.99 Lys

-1.01 Arg

Experimentally determined hydrophobicity levels. Fauchère and Pliska (1983). Eur. J. Med. Chem. 18, 369-375.
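One common way to use such a scale is to average it over a sequence or over a sliding window. The sketch below does that with the values listed above; the function names and the default window length of 5 are illustrative assumptions, not part of the original slides.

```python
# Values copied from the scale listed above, keyed by one-letter amino acid code.
HYDROPHOBICITY = {
    "W": 2.25, "I": 1.80, "F": 1.79, "L": 1.70, "C": 1.54,
    "M": 1.23, "V": 1.22, "Y": 0.96, "P": 0.72, "A": 0.31,
    "T": 0.26, "H": 0.13, "G": 0.00, "S": -0.04, "Q": -0.22,
    "N": -0.60, "E": -0.64, "D": -0.77, "K": -0.99, "R": -1.01,
}


def mean_hydrophobicity(seq: str) -> float:
    """Average scale value over all residues of a non-empty sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(HYDROPHOBICITY[aa] for aa in seq.upper()) / len(seq)


def window_profile(seq: str, window: int = 5) -> list:
    """Mean hydrophobicity of every full-length window along the sequence."""
    return [mean_hydrophobicity(seq[i:i + window])
            for i in range(len(seq) - window + 1)]


if __name__ == "__main__":
    # Toy sequence: a hydrophobic stretch followed by a charged stretch.
    print(window_profile("LLLLLKKKKK"))
```

Windows with a high average are candidates for buried or membrane-spanning segments, though a real analysis would tune the window length to the question being asked.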

Page 20: Protein Structural Prediction. Protein Structure is Hierarchical

Protein Structure Determination

• Experimental
  - X-ray crystallography
  - NMR spectroscopy

• Computational – Structure Prediction (The Holy Grail)

Sequence implies structure; therefore, in principle, we can predict the structure from the sequence alone

Page 21: Protein Structural Prediction. Protein Structure is Hierarchical

Protein Structure Prediction

• Ab initio
  - Use just first principles: energy, geometry, and kinematics

• Homology
  - Find the best match to a database of sequences with known 3D structure

• Threading

• Meta-servers and other methods

Page 22: Protein Structural Prediction. Protein Structure is Hierarchical

Ab initio Prediction

• Sampling the global conformation space
  - Lattice models / discrete-state models
  - Molecular dynamics
  - Pre-set libraries of fragment 3D motifs

• Picking native conformations with an energy function
  - Solvation model: how the protein interacts with water
  - Pair interactions between amino acids

• Predicting secondary structure
  - Local homology
  - Fragment libraries

Page 23: Protein Structural Prediction. Protein Structure is Hierarchical

Lattice String Folding

• HP model: main modeled force is hydrophobic attraction
  - NP-hard in both 2-D square and 3-D cubic lattices
  - Constant-factor approximation algorithms exist
  - Not so relevant biologically
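To make the HP model concrete, here is a minimal sketch for the 2-D square lattice. It assumes the usual convention that the energy is minus the number of H-H pairs that are lattice neighbors but not chain neighbors. The move encoding (`U`, `D`, `L`, `R`) and the function names are illustrative, and the exhaustive search is only feasible for very short chains, which is consistent with the problem being NP-hard.

```python
from itertools import product

# Unit steps on the 2-D square lattice.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def trace(moves):
    """Lattice coordinates visited by a chain that starts at the origin."""
    x, y = 0, 0
    coords = [(x, y)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


def hp_energy(hp, moves):
    """Return -(number of non-bonded H-H contacts), or None if the walk self-intersects."""
    coords = trace(moves)
    if len(coords) != len(hp):
        raise ValueError("need exactly len(hp) - 1 moves")
    if len(set(coords)) != len(coords):
        return None  # not a self-avoiding walk
    index_at = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if hp[i] != "H":
            continue
        # Look only right and up so each neighboring pair is counted once.
        for dx, dy in ((1, 0), (0, 1)):
            j = index_at.get((x + dx, y + dy))
            if j is not None and hp[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


def best_energy(hp):
    """Brute-force minimum energy over all 4**(n-1) move strings (tiny chains only)."""
    energies = (hp_energy(hp, m) for m in product(MOVES, repeat=len(hp) - 1))
    return min(e for e in energies if e is not None)


if __name__ == "__main__":
    print(hp_energy("HPPH", "RUL"), best_energy("HPPH"))
```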

Page 24: Protein Structural Prediction. Protein Structure is Hierarchical

Lattice String Folding

Page 25: Protein Structural Prediction. Protein Structure is Hierarchical

ROSETTA

http://www.bioinfo.rpi.edu/~bystrc/hmmstr/server.php

http://depts.washington.edu/bakerpg/papers/Bonneau-ARBBS-v30-p173.pdf

• Monte Carlo based method

• Limit conformational search space by using sequence-structure motif I-Sites library (http://isites.bio.rpi.edu/Isites/)
  - 261 patterns in library
  - Certain positions in motif favor certain residues

• Remove all sequences with <25% identity

• Find structures of the 25 nearest sequence neighbors of each 9-mer

Rationale
  - Local structures often fold independently of the full protein
  - Can predict large areas of a protein by matching sequence to I-Sites
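The "25 nearest sequence neighbors of each 9-mer" step can be sketched as follows. This is a toy illustration, not ROSETTA's actual scoring: plain fractional identity stands in for whatever sequence similarity measure the real method uses, and the fragment library and function names are hypothetical.

```python
def k_mers(sequence: str, k: int = 9) -> list:
    """All overlapping windows of length k (9-mers by default)."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def identity(a: str, b: str) -> float:
    """Fraction of positions with the same residue; both strings must have equal length."""
    if len(a) != len(b):
        raise ValueError("fragments must have equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def nearest_neighbors(query: str, library: list, n: int = 25) -> list:
    """The n library fragments most similar to the query, best first.

    Ties keep library order because Python's sort is stable.
    """
    return sorted(library, key=lambda frag: identity(query, frag), reverse=True)[:n]


if __name__ == "__main__":
    # Hypothetical fragment library; a real one would come from known structures.
    library = ["ACDEFGHIK", "ACDEFGHIR", "WWWWWWWWW"]
    for fragment in k_mers("ACDEFGHIKL"):
        print(fragment, nearest_neighbors(fragment, library, n=2))
```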


Page 26: Protein Structural Prediction. Protein Structure is Hierarchical

I-Sites Examples

• Non polar helix

Abundance of alanine at all positions

Non-polar side chains favored at positions 3, 6, 10 (methionine, leucine, isoleucine)

• Amphipathic helix

Non-polar side chains favored at positions 6, 9, 13, 16 (methionine, leucine, isoleucine)

Polar side chains favored at positions 1, 8, 11, 18 (glutamic acid, lysine)

Page 27: Protein Structural Prediction. Protein Structure is Hierarchical

ROSETTA Method

• New structures generated by swapping compatible fragments

• Accepted structures are clustered based on energy and structural size

• Best cluster is the one with the greatest number of conformations within 4 Å rms deviation of the center structure

• Representative structures are taken from each of the best five clusters and returned to the user as predictions
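The cluster selection described above can be sketched as a greedy procedure over a precomputed pairwise rmsd matrix. This is a simplified reading of the slide, not ROSETTA's implementation: the function name, the greedy removal of each cluster before picking the next, and the toy matrix are all assumptions.

```python
def top_clusters(rmsd, cutoff=4.0, n_clusters=5):
    """Greedily pick up to n_clusters (center, members) pairs.

    rmsd is a symmetric matrix (list of lists) of pairwise rms deviations in
    angstroms. Each round picks the model with the most remaining neighbors
    within the cutoff, then removes that whole cluster.
    """
    remaining = set(range(len(rmsd)))
    clusters = []
    while remaining and len(clusters) < n_clusters:
        best_center, best_members = None, set()
        for i in sorted(remaining):
            members = {j for j in remaining if rmsd[i][j] <= cutoff}
            if len(members) > len(best_members):
                best_center, best_members = i, members
        clusters.append((best_center, sorted(best_members)))
        remaining -= best_members
    return clusters


if __name__ == "__main__":
    # Toy matrix: models 0-2 resemble each other, as do models 3-4.
    toy = [
        [0, 2, 3, 8, 8],
        [2, 0, 2, 8, 8],
        [3, 2, 0, 8, 8],
        [8, 8, 8, 0, 2],
        [8, 8, 8, 2, 0],
    ]
    for center, members in top_clusters(toy):
        print("center", center, "members", members)
```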


Page 28: Protein Structural Prediction. Protein Structure is Hierarchical

Robetta & Rosetta

Page 29: Protein Structural Prediction. Protein Structure is Hierarchical
Page 30: Protein Structural Prediction. Protein Structure is Hierarchical

Rosetta results in CASP

Page 31: Protein Structural Prediction. Protein Structure is Hierarchical

Rosetta Results

• In CASP4, Rosetta's best models ranged from 6–10 Å rmsd Cα

• For comparison, good comparative models give 2-5 Å rmsd Cα

• Most effective with small proteins (<100 residues) and structures with α helices
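For readers unfamiliar with the metric, Cα rmsd is the root-mean-square distance between corresponding Cα atoms of a model and the native structure. The sketch below assumes the two structures have already been optimally superimposed and that the coordinate lists are in the same residue order; the superposition step itself is not shown, and the function name is illustrative.

```python
import math


def ca_rmsd(model, native):
    """Root-mean-square deviation between two equal-length lists of (x, y, z) Cα coordinates.

    Assumes the structures are already superimposed; no fitting is done here.
    """
    if len(model) != len(native) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xm - xn) ** 2 + (ym - yn) ** 2 + (zm - zn) ** 2
        for (xm, ym, zm), (xn, yn, zn) in zip(model, native)
    )
    return math.sqrt(squared / len(model))


if __name__ == "__main__":
    # Toy coordinates in angstroms: the model is the native shifted by 3 Å along z.
    native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
    model = [(0.0, 0.0, 3.0), (3.8, 0.0, 3.0)]
    print(ca_rmsd(model, native))
```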

Page 32: Protein Structural Prediction. Protein Structure is Hierarchical

Only a few folds are found in nature

Page 33: Protein Structural Prediction. Protein Structure is Hierarchical

The SCOP Database

Structural Classification Of Proteins

FAMILY: proteins that are >30% similar, or >15% similar and have similar known structure/function

SUPERFAMILY: proteins whose families have some sequence and function/structure similarity suggesting a common evolutionary origin

COMMON FOLD: superfamilies that have the same secondary structures in the same arrangement, probably resulting from physics and chemistry

CLASS: alpha, beta, alpha/beta, alpha+beta, multidomain

Page 34: Protein Structural Prediction. Protein Structure is Hierarchical

Status of Protein Databases

SCOP: Structural Classification of Proteins. 1.67 release. 24037 PDB Entries (15 May 2004). 65122 Domains.

| Class | Number of folds | Number of superfamilies | Number of families |
|---|---|---|---|
| All alpha proteins | 202 | 342 | 550 |
| All beta proteins | 141 | 280 | 529 |
| Alpha and beta proteins (a/b) | 130 | 213 | 593 |
| Alpha and beta proteins (a+b) | 260 | 386 | 650 |
| Multi-domain proteins | 40 | 40 | 55 |
| Membrane and cell surface proteins | 42 | 82 | 91 |
| Small proteins | 71 | 104 | 162 |
| Total | 887 | 1447 | 2630 |

EMBL

PDB

Page 35: Protein Structural Prediction. Protein Structure is Hierarchical

Evolution of Proteins – Domains

The number of members in different families obeys a power law

429 families are common to all 14 eukaryotes: 80% of animal domains, 90% of fungi domains

80% of proteins are multidomain in eukaryotes; domains usually combine pairwise in the same order. Why?

Evolution of proteins happens mainly through duplication, recombination, and divergence

Chothia, Gough, Vogel, Teichmann, Science 300:1701-1703, 2003

Page 36: Protein Structural Prediction. Protein Structure is Hierarchical

Homology-based Prediction

• Align query sequence with sequences of known structure, usually >30% similar

• Superimpose the aligned sequence onto the structure template, according to the computed sequence alignment

• Perform local refinement of the resulting structure in 3D

90% of new structures submitted to PDB in the past three years have similar folds in PDB

The number of unique structural folds is small (possibly a few thousand)
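The template selection step can be sketched as follows, assuming the query has already been aligned to each candidate of known structure. Percent identity over non-gap columns is used here as a simple stand-in for "similarity"; the 30% threshold comes from the slide, while the function names and the template identifiers are hypothetical.

```python
def percent_identity(aligned_query: str, aligned_template: str) -> float:
    """Percent identical residues over columns where neither sequence has a gap ('-')."""
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    columns = [(q, t) for q, t in zip(aligned_query, aligned_template)
               if q != "-" and t != "-"]
    if not columns:
        return 0.0
    return 100.0 * sum(q == t for q, t in columns) / len(columns)


def pick_template(alignments: dict, threshold: float = 30.0):
    """Return the id of the best template above the threshold, or None if none qualifies.

    alignments maps a template id to an (aligned_query, aligned_template) pair.
    """
    scored = [(percent_identity(q, t), tid) for tid, (q, t) in alignments.items()]
    passing = [item for item in scored if item[0] > threshold]
    return max(passing)[1] if passing else None


if __name__ == "__main__":
    # Hypothetical pre-computed alignments of one query against two templates.
    alignments = {
        "tplA": ("ACDEFGHIKL", "ACDEFGHIKL"),
        "tplB": ("ACDEFGHIKL", "ACWWWWWWWW"),
    }
    print(pick_template(alignments))
```

The chosen template's backbone would then serve as the starting model for the superposition and refinement steps listed above.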

Page 37: Protein Structural Prediction. Protein Structure is Hierarchical

Examples of Fold Classes

Page 38: Protein Structural Prediction. Protein Structure is Hierarchical

Homology-based Prediction

Raw model

Loop modeling

Side chain placement

Refinement

Page 39: Protein Structural Prediction. Protein Structure is Hierarchical

Homology-based Prediction