Building 3D protein models: ‘fold recognition / threading’


Upload: byron-singleton

Post on 25-Dec-2015


Page 1

Building 3D protein models: ‘fold recognition / threading’

Page 2

Why make a structural model for your protein?

The structure can provide clues to the function through structural similarity with other proteins

With a structure it is easier to guess the location of active sites

We can apply docking algorithms to the structures (both with other proteins and with small molecules)

With a structure we can plan more precise experiments in the lab

Page 3

Protein Modeling Methods

• Ab initio methods:
– solution of a protein folding problem
– search in conformational space

• Energy-based methods:
– energy minimization
– molecular simulation

• Knowledge-based methods:
– homology modeling
– fold recognition / threading

Page 4

Why do we need Ab Initio Methods?

Data taken from the PDB:

http://www.rcsb.org/pdb/holdings.html

They are needed for new folds, and for sequences with very low sequence identity (<15%) to proteins of known structure.

Page 5

Protein Modeling Methods

• Ab initio methods:
– solution of a protein folding problem
– search in conformational space

• Energy-based methods:
– energy minimization
– molecular simulation

• Knowledge-based methods:
– homology modeling
– fold recognition

Page 6

Predicting Protein Structure: Threading / Fold Recognition

Basis

• It is estimated that there are only around 1,000 to 10,000 stable folds in nature.

• Fold recognition is essentially finding the best fit of a sequence to a set of candidate folds.

• Select the best sequence-fold alignment using a fitness scoring function.

Page 7

The Threading Problem

• Find the best way to “mount” the residue sequence of one protein on a known structure taken from another protein

Page 8

Why is it called threading?

• A specific sequence is threaded through all known folds

• For each fold, estimate the probability that the sequence can adopt that fold

Page 9

Threading: Basic Strategy

[Figure: a query sequence (dhgakdflsdfjaslfkjsdlfjsdfjasd) is compared with each template in a library of folds, taking the template's spatial interactions into account, followed by scoring and selection. Labels: Sequence, Template, Spatial Interactions, Library of folds, Query, Scoring & selection.]

Page 10

Protein Threading

• Conserved Core Segments

[Figure: Protein A and Protein B, with conserved core segments labelled I, J, K and L.]

Page 11

Two structurally similar proteins

Spatial adjacencies (interactions)

Possible threading with a sequence

Page 12

Input/Output of Protein Threading

[Figure: three inputs feed the THREADING step]

• Amino acid sequence a[1..n]
• Core segments C[1..m]
• Pairwise amino acid scoring function g(…)
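A minimal sketch of how these three inputs could fit together, assuming a brute-force search that is only practical for toy sizes. The slides do not define g, so the hydrophobic-contact score below is a hypothetical stand-in, and the names valid_threadings, threading_score, best_threading and the contacts list are illustrative, not taken from any threading package.

```python
# Toy threading: place m core segments, in order and without overlap, on a
# sequence of length n, and score each placement with a pairwise function g.

HYDROPHOBIC = set("AVLIMFWC")  # assumption: a crude hydrophobic alphabet


def g(x, y):
    """Hypothetical pairwise score (lower is better): reward hydrophobic contacts."""
    return -1.0 if x in HYDROPHOBIC and y in HYDROPHOBIC else 0.0


def valid_threadings(n, core_lengths):
    """Yield tuples of start indices, one per core segment, in sequence order."""
    def place(k, first_free):
        if k == len(core_lengths):
            yield ()
            return
        remaining = sum(core_lengths[k:])  # room needed by cores k, k+1, ...
        for start in range(first_free, n - remaining + 1):
            for rest in place(k + 1, start + core_lengths[k]):
                yield (start,) + rest
    yield from place(0, 0)


def threading_score(seq, contacts, starts):
    """Sum g over template contacts; a contact is ((core, offset), (core, offset))."""
    total = 0.0
    for (c1, o1), (c2, o2) in contacts:
        total += g(seq[starts[c1] + o1], seq[starts[c2] + o2])
    return total


def best_threading(seq, core_lengths, contacts):
    """Exhaustively pick the placement with the lowest total score."""
    return min(valid_threadings(len(seq), core_lengths),
               key=lambda starts: threading_score(seq, contacts, starts))


# Two cores of length 2 whose residues face each other pairwise in the template.
seq = "GVLGGGAIG"
core_lengths = [2, 2]
contacts = [((0, 0), (1, 0)), ((0, 1), (1, 1))]
starts = best_threading(seq, core_lengths, contacts)
print(starts, threading_score(seq, contacts, starts))
```

Enumerating every placement grows very quickly with n and m, which is why practical threaders rely on heuristics or more careful search; the sketch only shows what is being optimized.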

Page 13

Fold recognition (Threading)

The sequence (SLVAYGAAM) + known protein folds → structural model

Page 14

Input:

• sequence
• library of folds of known proteins

[Figure legend: H bond donor, H bond acceptor, Glycine, Hydrophobic]

Page 15

[Figure: the sequence scored against three folds from the library]

Fold 1: S = 20, Z = 5
Fold 2: S = 5, Z = 1.5
Fold 3: S = -2, Z = -1

[Figure legend: H bond donor, H bond acceptor, Glycine, Hydrophobic]
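The slide does not say how Z is obtained from S. A common convention, assumed here, is to express each raw score relative to the mean and standard deviation of the scores over the whole fold library. The library scores in the example are invented, so the printed values are not expected to match the slide.

```python
from statistics import mean, pstdev


def z_scores(raw_scores):
    """Convert raw scores S into Z-scores relative to the whole score list."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)  # population standard deviation
    if sigma == 0:
        # All folds score the same, so none stands out.
        return [0.0] * len(raw_scores)
    return [(s - mu) / sigma for s in raw_scores]


# Hypothetical raw scores of one sequence against eight folds.
library_scores = [20, 5, -2, 1, 0, 3, -1, 2]
for s, z in zip(library_scores, z_scores(library_scores)):
    print(f"S = {s:>3}  Z = {z:+.2f}")
```

A Z-score makes hits comparable across queries: a raw score of 20 means little on its own, but a score several standard deviations above the library average suggests the fold is a genuine outlier.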

Page 16

                         Amino acid type
Position on sequence     A     C     D    …     Y    Gop   Gext
1                       10   -50   101    …   -80    100     10
2                      -24    87   -99    …   167    100     10
:                        :     :     :    :     :      :      :
N                                                    100     10
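A matrix of this kind, with one row per position and one column per amino acid type, can be used to score a sequence segment position by position. The sketch below is ungapped, so the Gop and Gext columns are ignored. The PROFILE values are invented for illustration, and treating larger scores as better is an assumption, since the slide does not state the sign convention.

```python
# Hypothetical two-position profile: one dict of per-residue scores per position.
PROFILE = [
    {"A": 2, "C": -1, "D": 4, "Y": -3},
    {"A": -2, "C": 3, "D": -4, "Y": 5},
]


def ungapped_profile_score(segment, profile, unknown=0):
    """Sum the position-specific score of each residue in the segment."""
    if len(segment) != len(profile):
        raise ValueError("segment and profile must have the same length")
    return sum(column.get(res, unknown) for res, column in zip(segment, profile))


def best_window(sequence, profile):
    """Return the start index of the best-scoring window (assumes larger is better)."""
    width = len(profile)
    windows = range(len(sequence) - width + 1)
    return max(windows,
               key=lambda i: ungapped_profile_score(sequence[i:i + width], profile))


print(ungapped_profile_score("DY", PROFILE))
print(best_window("ACDY", PROFILE))
```

Allowing insertions and deletions would require a dynamic-programming alignment in which opening and extending a gap at a given position are charged that position's Gop and Gext values.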

Page 17

Fold recognition / Threading

Disadvantages:

• Threading methods seldom produce the alignment quality needed for homology modeling.

• Fewer than 30% of the predicted first hits are true remote homologues (PredictProtein).

Page 18

Threading resources

• TOPITS: heuristic threader, part of a larger structure prediction system

• 3D-PSSM: integrated system that does its own MSA and secondary structure predictions and then threading

• GenTHREADER: similar to 3D-PSSM

Page 19

Side chain construction

In homology modelling, the side chains are built from the template structures when there is high similarity between the protein being modelled and the templates.

Without such similarity, the side chains can be built using rotamer libraries.

Although the problem is huge (because each side chain influences its neighbours), there are quite successful algorithms for it.

The score is a compromise between the probability of a rotamer and its fitness at a specific position. Comparing the scores of all the rotamers for a given amino acid determines the preferred rotamer.
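One way to read the compromise described above is as a sum of two terms: a library term, -log(probability), and a fitness term that penalizes a poor fit at the position. This particular form, the weight parameter, and all numbers in the example are assumptions for illustration; the slides do not give the actual scoring function.

```python
import math


def rotamer_score(probability, fit_penalty, weight=1.0):
    """Lower is better: rare rotamers and poorly fitting rotamers both cost more."""
    return -math.log(probability) + weight * fit_penalty


def preferred_rotamer(rotamers, weight=1.0):
    """Pick the rotamer with the lowest combined score."""
    return min(rotamers,
               key=lambda r: rotamer_score(r["p"], r["fit_penalty"], weight))


# Hypothetical rotamers for one residue: chi1 angle, library probability,
# and a made-up penalty for clashes with neighbours at this position.
rotamers = [
    {"chi1": -60, "p": 0.50, "fit_penalty": 3.0},
    {"chi1": 180, "p": 0.35, "fit_penalty": 0.2},
    {"chi1": 60, "p": 0.15, "fit_penalty": 0.0},
]
print(preferred_rotamer(rotamers)["chi1"])
```

In this example the most probable rotamer loses because it fits badly at the position, which is the trade-off the slide describes. Scoring one residue at a time ignores the coupling between neighbouring side chains that makes the full problem hard.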

Page 20

In this work we examined differences in the structures of amino-acid side chains around point mutations.

[Figure labels: Phe, Asn]

Conformation: a given set of dihedral angles that defines a structure.

Rotamer: an energetically favourable conformation.

Page 21

Page 22

Ab initio

The sequence (SLVAYGAAM) → structural model

Page 23

Ab initio methods for modelling

This field is of great theoretical interest but has so far had very few practical applications. These methods use no sequence alignments and make no direct use of known structures.

The basic idea is to build an empirical function that simulates the real physical forces and the potentials of chemical contacts.

If we had a perfect function and could scan all possible conformations, we would be able to detect the correct fold.

Page 24

Predicting Protein Structure: Ab Initio Methods

[Figure: flowchart with the labels Sequence, Secondary structure, Prediction, Tertiary structure, Low energy structures, Mean field potentials, Energy Minimization, Predicted structure, Validation.]

Page 25

Ab initio Methods

• Simplified models
– simplified alphabet (HP)
– simplified representation (lattice)

• Build-up techniques

• Deterministic methods
– quantum mechanics
– diffusion equations

• Stochastic searches
– Monte Carlo
– genetic algorithms
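As an illustration of the simplified models listed above, the sketch below evaluates the energy of a chain in the standard HP lattice model on a 2D square lattice. Each residue is H (hydrophobic) or P (polar), and every pair of H residues on neighbouring lattice sites that are not consecutive in the chain contributes -1. The function name and the example coordinates are illustrative choices.

```python
def hp_energy(sequence, coords):
    """Energy of an HP chain on a 2D square lattice (-1 per non-bonded H-H contact)."""
    if len(sequence) != len(coords):
        raise ValueError("one lattice site is needed per residue")
    if len(set(coords)) != len(coords):
        raise ValueError("chain overlaps itself")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must sit on adjacent sites")
    energy = 0
    for i in range(len(sequence)):
        for j in range(i + 2, len(sequence)):  # skip chain neighbours
            if sequence[i] == "H" and sequence[j] == "H":
                dx = abs(coords[i][0] - coords[j][0])
                dy = abs(coords[i][1] - coords[j][1])
                if dx + dy == 1:
                    energy -= 1
    return energy


# The same four-residue chain, extended and folded into a square.
extended = [(0, 0), (1, 0), (2, 0), (3, 0)]
folded = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(hp_energy("HPPH", extended), hp_energy("HPPH", folded))
```

A stochastic search such as Monte Carlo would propose changes to coords and keep those that lower this energy, or occasionally raise it.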

Page 26

Rosetta approach

• Rosetta (David Baker): consistently outstanding performer in the last two CASPs

• Integrated method
– I-Sites: much finer-grained substructures than secondary structures. A library of all the structures in which each amino-acid 9-mer is found (taken from the PDB)
– Heuristic global energy function to estimate the quality of folds
– Monte Carlo search through assignments of I-Sites to minimize the energy function

• Also HMMSTR, an HMM-driven method for assigning I-Sites

Page 27

Rosetta prediction method

• Define a global scoring function that estimates the probability of a structure given a sequence

• Generate a version of I-Sites with fixed-length subsequences (9 amino acids)
– Calculate P(I-Site|sequence) for all sequences and I-Sites

• Generate structures by Monte Carlo sampling of assignments of fixed-size I-Sites to subsequences

• End up with an ensemble of plausible structures
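The sampling step above can be sketched with a toy Metropolis Monte Carlo loop. This is not Rosetta's code. Residues carry a single symbolic state instead of backbone geometry, fragments are 3 residues long instead of 9, candidates are drawn uniformly rather than according to P(I-Site|sequence), and toy_energy is an invented placeholder for the global scoring function. All names are hypothetical.

```python
import math
import random


def toy_energy(states):
    """Invented placeholder: penalize loop residues and changes of state."""
    loops = states.count("L")
    switches = sum(1 for a, b in zip(states, states[1:]) if a != b)
    return loops + switches


def fragment_monte_carlo(n, fragments, energy, steps=2000, temperature=1.0, seed=0):
    """Metropolis sampling over fragment insertions; returns the best state found."""
    rng = random.Random(seed)
    current = ["L"] * n  # start from an all-loop chain
    current_e = energy(current)
    best, best_e = list(current), current_e
    starts = sorted(fragments)
    for _ in range(steps):
        start = rng.choice(starts)
        frag = rng.choice(fragments[start])
        trial = current[:start] + list(frag) + current[start + len(frag):]
        delta = energy(trial) - current_e
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_e = trial, current_e + delta
            if current_e < best_e:
                best, best_e = list(current), current_e
    return best, best_e


n = 12
# Hypothetical library: the same three candidate fragments at every window start.
library = {start: [("H", "H", "H"), ("E", "E", "E"), ("L", "L", "L")]
           for start in range(n - 2)}
best, best_e = fragment_monte_carlo(n, library, toy_energy)
print("".join(best), best_e)
```

Running the loop with several seeds and keeping each result would give an ensemble of low-energy candidates, in the spirit of the last bullet.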

Page 28

Rosetta is way ahead

• CASP 4 results.

• CASP 5 similar, but not as dramatic.

Page 29

Fully automated predictions

• CAFASP-2

• Meta-servers work best
– Integrate predictions from several other servers
– Significantly better predictions than any individual approach

• Several public metaservers available:
– http://bioinfo.pl/Meta/ is best all-around
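The slides do not describe how a meta-server combines the predictions it collects. As a minimal sketch under that caveat, the simplest conceivable rule is a majority vote over each server's top-ranked fold. The server names and fold identifiers below are placeholders, not real servers or results.

```python
from collections import Counter


def consensus_fold(server_hits):
    """Majority vote over the top-ranked fold reported by each server."""
    votes = Counter(hits[0] for hits in server_hits.values() if hits)
    if not votes:
        return None, 0
    fold, count = votes.most_common(1)[0]
    return fold, count


# Hypothetical ranked hit lists from four prediction servers.
server_hits = {
    "server_a": ["fold_1", "fold_7"],
    "server_b": ["fold_3", "fold_1"],
    "server_c": ["fold_1", "fold_3"],
    "server_d": [],  # this server returned nothing
}
print(consensus_fold(server_hits))
```

A real meta-server would typically also weigh each server's confidence scores and the structural similarity between the proposed models rather than counting identical top hits.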