protein structure prediction
![Page 1: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/1.jpg)
Protein Data Bank (PDB): 46,818 structures (Oct 2007)
SCOP (Structural Classification Of Proteins):
• 971 folds (major structural similarity)
• 1586 super-families (probable common evolutionary origin)
• 3004 families (clear evolutionary relationship, ~30% identity)
Nearly all folds are known (?)
But 5 million known protein sequences (TrEMBL)
-> need for structure prediction
Protein structure prediction
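The "~30% identity" threshold quoted for SCOP families refers to sequence identity between aligned proteins. A minimal sketch of how percent identity could be computed from a pairwise alignment is given below. The function name, the gap symbol and the second sequence are illustrative assumptions, and other conventions for the denominator exist (e.g. alignment length or shorter sequence length).

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption: both strings come from the same pairwise alignment and
    therefore have the same length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


# Toy example: the second sequence is made up for illustration.
print(percent_identity("AGVLVAGHM", "AGILV-GKM"))  # 75.0
```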
![Page 2: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/2.jpg)
Usually, structure-activity relationships: site-directed mutagenesis, pharmacologic studies, drug design, …
But also:
• genomic studies: recognizing orphan genes
• distant evolution studies
Structure prediction: what for?
Sequences diverge more than structures
![Page 3: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/3.jpg)
Known structures :
Simulations at the atom level:
molecular modelling (enthalpic energy) /
molecular dynamics /normal modes
Methods for protein structural studies
![Page 4: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/4.jpg)
Unknown structures :
Before using molecular mechanics, one must have a "realistic" structure.
3D structure prediction:
1) homology modelling
2) ab initio folding
3) threading
Methods for protein structural studies
![Page 5: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/5.jpg)
Requires a known 3D structure that is homologous to the query sequence
e.g.: Modeller web server (http://www.salilab.org/modeller)
Homology modelling
![Page 6: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/6.jpg)
e.g.: Modeller web server (http://www.salilab.org/modeller)
Homology modelling
![Page 7: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/7.jpg)
Target sequence: AGVLVAGHM . . .
Protein Data Bank (PDB)
generation
Minimisation - energy evaluation
Ab initio folding
Baker et al.
![Page 8: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/8.jpg)
Threading (1)
Protein Data Bank (PDB)
families
![Page 9: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/9.jpg)
Threading (1)
family family core + interactions
Protein Data Bank -> library of cores
![Page 10: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/10.jpg)
Threading (2)
Protein Data Bank (PDB)
Statistics for 3D neighboring residue pairs -> Energy
A L = -1.2
A I = -2.2
...
Other characteristics: residue accessibility, secondary structure, …
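The slide does not say how the pair statistics are turned into energies. One common choice, assumed here, is an inverse-Boltzmann (log-odds) form: E(a, b) = -ln(observed / expected), in arbitrary units. The sketch below uses made-up contact data and a hypothetical function name; real potentials are derived from many PDB structures and usually add distance bins and reference-state corrections.

```python
import math
from collections import Counter


def pair_energies(contacts):
    """Derive log-odds pair energies from a list of residue contacts.

    contacts: list of (res_a, res_b) unordered pairs of residues that are
    neighbours in 3D. Returns {(a, b): energy} with a <= b alphabetically.
    Assumption: expected counts come from random pairing of residues at
    their observed frequencies.
    """
    pair_counts = Counter(tuple(sorted(pair)) for pair in contacts)
    residue_counts = Counter(res for pair in contacts for res in pair)
    n_pairs = len(contacts)
    n_residues = 2 * n_pairs
    energies = {}
    for (a, b), observed in pair_counts.items():
        freq_a = residue_counts[a] / n_residues
        freq_b = residue_counts[b] / n_residues
        # An unordered pair of two different residues can arise in two ways.
        expected = n_pairs * freq_a * freq_b * (1 if a == b else 2)
        energies[(a, b)] = -math.log(observed / expected)
    return energies


# Made-up contacts, for illustration only.
toy_contacts = [("A", "L"), ("A", "L"), ("L", "A"),
                ("A", "G"), ("L", "G"), ("G", "G")]
toy_energies = pair_energies(toy_contacts)
for pair, energy in sorted(toy_energies.items()):
    print(pair, round(energy, 2))
```

Pairs seen more often than expected get a negative (favourable) energy, which matches the sign convention of the values on the slide.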
![Page 11: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/11.jpg)
Threading (3)
core
![Page 12: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/12.jpg)
V I = -2.3
L N = -4.2
L G = -5.1
Threading (3)
Thread the sequence onto the core
![Page 13: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/13.jpg)
N G = -1.3
V I = -2.2
S A = -4.2
Threading (3)
Thread the sequence onto the core
![Page 14: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/14.jpg)
I G = -3.3
N G = -3.0
G L = -2.1
Compute energy for every alignment of the sequence onto the core (many alignments, gaps…)
Threading (3)
Thread the sequence onto the core
-> choose the best core (low energy)
Thread the sequence onto all cores
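The threading loop described on these slides can be sketched as follows. This is a simplified illustration, not the method of any particular server: the cores, their contact lists and the function names are hypothetical, the energy table reuses a few illustrative values from the slides, and only ungapped placements are tried (real methods also explore gaps, typically with dynamic programming).

```python
def alignment_energy(sequence, offset, core_contacts, pair_energy):
    """Sum pair energies over the core's contacts for one ungapped placement."""
    total = 0.0
    for i, j in core_contacts:
        a, b = sequence[offset + i], sequence[offset + j]
        # Pairs missing from the table contribute 0 (a simplifying assumption).
        total += pair_energy.get(tuple(sorted((a, b))), 0.0)
    return total


def thread(sequence, cores, pair_energy):
    """Return (energy, core_name, offset) of the lowest-energy placement."""
    best = None
    for name, (core_length, contacts) in cores.items():
        for offset in range(len(sequence) - core_length + 1):
            energy = alignment_energy(sequence, offset, contacts, pair_energy)
            if best is None or energy < best[0]:
                best = (energy, name, offset)
    return best


# Illustrative pair energies (keys sorted alphabetically).
pair_energy = {("A", "L"): -1.2, ("A", "I"): -2.2, ("I", "V"): -2.3,
               ("L", "N"): -4.2, ("G", "L"): -5.1}

# Hypothetical cores: (number of core positions, contacts between positions).
cores = {
    "core_1": (4, [(0, 3), (1, 2)]),
    "core_2": (5, [(0, 4), (2, 3)]),
}

best = thread("AGVLVAGHM", cores, pair_energy)
print(best)
```

With this toy data the lowest energy comes from placing the sequence on core_1 at offset 3, where L and G fall on contacting positions.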
![Page 15: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/15.jpg)
Threading
Threading methods are under development:
- optimisation of 3D alignments
- better core definition
- statistical assessment of results
Can be used when sequence tools (BLAST or PSI-BLAST) cannot find similarities
![Page 16: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/16.jpg)
Threading
Robetta : http://robetta.bakerlab.org/
3DPSSM : http://www.sbg.bio.ic.ac.uk/~3dpssm/
bioinbgu : http://www.cs.bgu.ac.il/~bioinbgu/form.html
GenTHREADER : http://bioinf.cs.ucl.ac.uk/psipred/psiform.html
FROST : http://genome.jouy.inra.fr/frost/
![Page 17: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/17.jpg)
The end…
![Page 18: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/18.jpg)
![Page 19: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/19.jpg)
A) Quantifying the similarity of pairs of structures (an "all-against-all" comparison) gives the position of each structure in an abstract high-dimensional space. The height of the peaks reflects the population density of folds; the horizontal axes are the first two eigenvectors (i.e. those associated with the two largest eigenvalues), and the vertical axis gives the number of folds. The distribution of architectures is given by the projection onto the plane (proximity on this plane indicates the structural similarity between two proteins).
B) 40% of all known domains are covered by 16 fold classes. These 16 folds are shown here as topology diagrams of secondary structures, in the class of their attractor (the attractor number is the same as in figure A).
Figures from Holm and Sander (1996), "Mapping the protein universe"
![Page 20: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/20.jpg)
Threading: scoring function
![Page 21: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/21.jpg)
Sequence/structure alignment method
![Page 22: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/22.jpg)
Sequence/structure alignment method (2)
![Page 23: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/23.jpg)
Score normalisation
![Page 24: Protein structure prediction](https://reader030.vdocuments.mx/reader030/viewer/2022033101/56812b37550346895d8f4604/html5/thumbnails/24.jpg)