Kihara Lab Protein Structure Prediction Performance in CASP11

Post on 19-Feb-2017


Hyung-Rae Kim, Amit Roy, Daisuke Kihara

http://kiharalab.org

KIHARA LAB (TS333)

Overall Prediction Procedure

(Flowchart; box labels listed in extraction order.)

Server Models

PRESCO Residue Environment Score

Ranking by AA matrices: BLOSUM30, CC80, CCPC, QUIB, QU_C2

Structure Refinement by CHARMM runs

Side-Chain Modeling

5 models

Final 5 models

20-30 models

CABS, 5 clusters

Fragment-interaction potential

Refinement Procedure

(Flowchart; box labels listed in extraction order.)

Add H

Starting Structure

Minimize in Screened Coulomb Potential (SCP)

10 x 10 ns → 5000 snapshots, constraints on secondary structure

MD 20 x 10 ns with SCP

CH27 force field

10 x 10 ns → 5000 snapshots, constraints on all Cα atoms

DFIRE score, RMSD with initial structure

Corr(DFIRE, Irmsd) > 0.4

Discarded (very rare)

Average structure from low DFIRE, Irmsd snapshots from set 1 and 2

Relax at low T MD

Model 1 and 2

Select structures with lowest DFIRE score

Model 3, 4, 5

Hassan, S. A., Guarnieri, F., & Mehler, E. L. (2000). The Journal of Physical Chemistry B, 104(27), 6478-6489.

Mirjalili, V., Noyes, K., & Feig, M. (2014). Proteins: Structure, Function, and Bioinformatics, 82(S2), 196-207.
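The snapshot analysis named in the refinement flowchart (DFIRE score, RMSD from the initial structure, their correlation, and an average over low-DFIRE, low-Irmsd snapshots) can be illustrated with a minimal sketch. This is not the lab's pipeline: the function name, the array layout, the `keep_fraction` parameter, the use of an intersection to define "low DFIRE, Irmsd", and the assumption that snapshots are already superimposed on the starting structure are all hypothetical choices made for illustration.

```python
import numpy as np

def summarize_snapshots(coords, dfire, init_coords, keep_fraction=0.1):
    """Summarize an MD trajectory for refinement (illustrative only).

    coords      : (n_snap, n_atoms, 3) snapshots, assumed pre-superimposed
    dfire       : (n_snap,) statistical-potential score, lower is better
    init_coords : (n_atoms, 3) starting structure
    """
    coords = np.asarray(coords, dtype=float)
    dfire = np.asarray(dfire, dtype=float)
    # RMSD of each snapshot from the initial structure, without refitting.
    diff = coords - np.asarray(init_coords, dtype=float)
    irmsd = np.sqrt((diff ** 2).sum(axis=2).mean(axis=1))
    # Pearson correlation between score and Irmsd across the trajectory.
    corr = np.corrcoef(dfire, irmsd)[0, 1]
    # Assumption: "low" means within the lowest keep_fraction for both values.
    n_keep = max(1, int(round(keep_fraction * len(dfire))))
    low_dfire = set(np.argsort(dfire)[:n_keep])
    low_irmsd = set(np.argsort(irmsd)[:n_keep])
    # Fall back to low-score snapshots if the intersection is empty.
    selected = sorted(low_dfire & low_irmsd) or sorted(low_dfire)
    # Coordinate average of the selected snapshots.
    average = coords[selected].mean(axis=0)
    return corr, selected, average
```

An averaged structure of this kind generally has distorted local geometry, which is consistent with the flowchart following the averaging step with a low-temperature MD relaxation.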

Side-Chain Depth Environment (SDE)

within a sphere of 6 or 8 Å

along the main-chain

Center

(Kim & Kihara, Proteins 2014)

Finding Similar SDE from Database

Structure Database

2536 proteins

500 lowest-RMSD fragments of 9 side-chain centroids, superimposed with the query fragment

Select SDE with the same number of side-chain centroids in the sphere of 8.0/6.0 Å

Query SDE

Compute residue-depth RMSD for corresponding side-chain centroids

Sort by depth RMSD to the query

surface

(Kim & Kihara, Proteins, 2014)
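The filtering and sorting steps of the SDE search (keep environments with the same number of side-chain centroids, compute the residue-depth RMSD over corresponding centroids, sort by that RMSD) can be sketched as below. The function names and the data layout are hypothetical, and the sketch assumes that centroid correspondence is given by list order after superposition; the slide does not say how correspondence is established.

```python
import math

def depth_rmsd(query_depths, cand_depths):
    """RMSD between residue depths of corresponding side-chain centroids."""
    if not query_depths or len(query_depths) != len(cand_depths):
        raise ValueError("environments must be non-empty and of equal size")
    sq = sum((q - c) ** 2 for q, c in zip(query_depths, cand_depths))
    return math.sqrt(sq / len(query_depths))

def rank_similar_sde(query_depths, candidates):
    """Rank database environments by depth RMSD to the query.

    candidates : list of (label, depths) pairs (hypothetical layout).
    Returns (label, rmsd) pairs, most similar first.
    """
    # Keep only environments with the same number of centroids in the sphere.
    same_size = [(lab, d) for lab, d in candidates
                 if len(d) == len(query_depths)]
    scored = [(lab, depth_rmsd(query_depths, d)) for lab, d in same_size]
    return sorted(scored, key=lambda item: item[1])
```

For example, `rank_similar_sde([1.0, 2.0, 3.0], [("a", [1.0, 2.0, 4.0]), ("b", [1.0, 2.0, 3.0]), ("c", [1.0, 2.0])])` drops `"c"` for having a different centroid count and places `"b"` ahead of `"a"`.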

Decoy Evaluation with Protein Residue Environmental Score (PRESCO)

Residue Contact Potential-Based Matrix

CCPC, CC80 matrices:

Contact definition of two residues: any pair of side-chain heavy atoms or Cα atoms less than 4.5 Å apart

Compute a knowledge-based residue contact potential (Gaussian chain reference state, composition correction averaging)

Correlation coefficients of residue pairs are used as values of the amino acid similarity matrix

(Tan, Huang, & Kihara, Proteins, 2006)
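Two of the steps above lend themselves to a short illustration: the 4.5 Å contact test and the conversion of a contact potential into a similarity matrix through correlation coefficients. The sketch below assumes that each residue is passed as an array of its side-chain heavy-atom and Cα coordinates, and that the correlation is taken between rows of the potential matrix; the function names are hypothetical, and the potential itself (Gaussian chain reference state, composition correction) is not reproduced here.

```python
import numpy as np

def in_contact(atoms_a, atoms_b, cutoff=4.5):
    """True if any atom pair between two residues is closer than cutoff (Å).

    atoms_a, atoms_b : (n, 3) coordinates of a residue's side-chain heavy
    atoms plus its Cα atom (assumed input layout).
    """
    a = np.asarray(atoms_a, dtype=float)
    b = np.asarray(atoms_b, dtype=float)
    # All pairwise distances between the two atom sets.
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return bool((dists < cutoff).any())

def similarity_from_potential(potential):
    """Similarity matrix from an (n, n) contact-potential matrix.

    Entry (i, j) is the Pearson correlation between the contact-energy
    profiles (rows) of amino acids i and j.
    """
    return np.corrcoef(np.asarray(potential, dtype=float))
```

Under this reading, two amino acids are similar when they prefer and avoid the same contact partners, regardless of how strongly they interact with each other.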

Structure-derived Amino Acid Similarity Matrices in AAindex

BLAJ010101 - Structural superposition data for identifying potential remote homologues (Blake-Cohen, 2001)

HENS920101 - BLOSUM45 substitution matrix (Henikoff-Henikoff, 1992)

JOHM930101 - Structure-based amino acid scoring table (Johnson-Overington, 1993)

KOLA920101 - Conformational similarity weight matrix (Kolaskar-Kulkarni-Kale, 1992)

KOSJ950115 - Context-dependent optimal substitution matrices for all residues (Koshi-Goldstein, 1995)

MIYS930101 - Base-substitution-protein-stability matrix (Miyazawa-Jernigan, 1993)

OVEJ920101 - STR matrix from structure-based alignments (Overington et al., 1992)

PRLA000101 - Structure derived matrix (SDM) for alignment of distantly related sequences (Prlic et al., 2000)

PRLA000102 - Homologous structure derived matrix for alignment of distantly related sequences (Prlic et al., 2000)

QU_C930101 - Cross-correlation coefficients of preference factors main chain (Qu et al., 1993)

QU_C930102 - Cross-correlation coefficients of preference factors side chain (Qu et al., 1993)

QUIB020101 - STROMA score matrix for the alignment of known distant homologs (Qian-Goldstein, 2002)

Alignment Accuracy by AA Matrices

2761 fold-level protein sequence pairs, Lindahl & Elofsson database

(Tan, Huang, Kihara, Proteins 2006)

Correct alignments: >50% of residues are correctly aligned
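The ">50% of residues correctly aligned" criterion can be made concrete with a small sketch. It assumes that an alignment is represented as a set of (i, j) residue-index pairs and that the fraction is taken over the pairs of the reference (structure-based) alignment; the slide states neither the representation nor the denominator, so both are assumptions, as are the function names.

```python
def fraction_correct(test_pairs, reference_pairs):
    """Fraction of reference aligned pairs reproduced by the test alignment.

    Both arguments are iterables of (i, j) residue-index tuples.
    """
    ref = set(reference_pairs)
    if not ref:
        raise ValueError("reference alignment is empty")
    return len(ref & set(test_pairs)) / len(ref)

def is_correct_alignment(test_pairs, reference_pairs, threshold=0.5):
    """Apply the 'more than 50% correctly aligned' criterion."""
    return fraction_correct(test_pairs, reference_pairs) > threshold
```

With a four-pair reference, a test alignment that reproduces exactly two pairs scores 0.5 and is not counted as correct under the strict inequality, while one that reproduces three pairs is.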

Native Structure Recognition

The last three score columns are combinations of MRE & SDE.

Decoy set          DFIRE  dDFIRE  DOPE    RW       RWplus  OPUS-PSP  GOAP   MRE (CC80)  SDE (QUIB)  BLSM30+QU_C2  BLSM30+QU_C2  CC80+QU_C1  # Targets
4state_reduced     6      7       7       6        6       7         7      7           7           7             7             7           7
Fisa               3      3       3       3        3       3         3      2           2           2             2             3           4
Fisa_casp3         4      4       3       4        4       5         5      2           1           3             3             4           5
Lmds               7      6       7       7        7       8         7      10          6           10            10            10          10
Lattice_ssfit      8      8       8       8        8       8         8      8           8           8             8             8           8
hg_structal        12     16      ----    ----     12      18        22     28          11          27            27            27          29
ig_structal        0      26      ----    ----     0       20        47     61          6           61            61            61          61
ig_structal_hires  0      16      ----    ----     0       14        18     20          6           20            20            20          20
Moulder            19     18      19      19       19      19        19     20          16          20            20            20          20
ROSETTA            20     12      21      20       20      39        45     25          31          41            41            39          58
I-TASSER           49     48      30      53       56      55        45     56          47          56            56            56          56
Total              128    164     98/168  120/168  135     196       226    239         141         255           255           255         278
Z-score            -1.94  -2.52   -2.47   -3.23    -2.13   -2.86     -3.57  6.78        2.14        5.70          5.76          5.65
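The two quantities reported for native structure recognition, a count of targets on which the native is ranked first and a Z-score, can be computed along the following lines. This is a sketch under assumptions: the slide does not define its Z-score, so the version below uses the common convention of measuring the native score against the mean and standard deviation of the decoy scores, and the function names are hypothetical. Under that convention an energy-like function (lower is better) yields a negative Z-score for a well-separated native, while a similarity-like score (higher is better) yields a positive one.

```python
import statistics

def native_is_top(native_score, decoy_scores, lower_is_better=True):
    """True if the native scores strictly better than every decoy."""
    if lower_is_better:
        return all(native_score < s for s in decoy_scores)
    return all(native_score > s for s in decoy_scores)

def native_z_score(native_score, decoy_scores):
    """Z-score of the native relative to the decoy score distribution.

    Assumption: mean and standard deviation are taken over decoys only.
    """
    mean = statistics.mean(decoy_scores)
    sd = statistics.pstdev(decoy_scores)
    if sd == 0:
        raise ValueError("decoy scores have zero variance")
    return (native_score - mean) / sd

def count_recognized(targets, lower_is_better=True):
    """Number of targets whose native is ranked first.

    targets : list of (native_score, decoy_scores) pairs.
    """
    return sum(native_is_top(n, d, lower_is_better) for n, d in targets)
```

Each cell of a per-decoy-set tally would then be `count_recognized` applied to the targets of one decoy set with one scoring function.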

Benchmark on Rykunov & Fiser CASP Set

Comparison against 36 scoring functions. Only showing results of 13 functions.

                            Models only             Native included
Scoring function            Average rank  Ranked 1  Average rank  Ranked 1
MRE (CC80)                  6.77          29        1.32          131
SDE (QUIB)                  2.89          56        1.98          97
Combinations of MRE & SDE:
  BLSM30+QU_C1              6.79          31        1.18          139
  CC80(SDE)+BLSM30(SDE)     2.82          66        1.99          89
QMEAN6                      2.87          85        1.71          113
RWplus                      2.97          57        1.78          106
RW                          3.08          51        1.71          110
QMEANall_atom               3.59          74        1.71          119
QMEANSSE_agree              3.74          62        3.72          39
RF_HA_SRS                   4.65          49        1.38          137
OPUS_CA                     4.72          79        5.13          55
RF_HA                       5.44          62        2.78          112
DOPE                        5.77          54        3.27          95
DFIRE                       6.03          50        5.69          33
Floudas-CM                  7.75          38        7.05          42
Melo-ANOLEA                 9.62          19        5.19          86
Random                      9.72          13.9      10.1          8.3

Highlighting on the slide: best, second best, third best.

Side-Chain Building

(Peterson, Kang, Kihara, Proteins 2014)

T0804 Top 1 models

Kiharalab: TS333_1

Boniecki_pred: TS301_1

Skwark: TS358_1

T0804 Kiharalab Top 1 Model

Native (coordinates not available)

Kiharalab_1: GDT-TS 31.44, GOAP -18178.22

QUARK_5: GDT-TS 30.93, GOAP -14959.68

Best in Top 1 Models
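For readers unfamiliar with the GDT-TS values quoted on these slides: GDT-TS is conventionally the average of the percentages of residues whose Cα atoms lie within 1, 2, 4, and 8 Å of their native positions, with each percentage maximized over superpositions. The sketch below is a simplification, not the official CASP evaluation: it takes per-residue Cα deviations from one fixed superposition as input and skips the superposition search, so it gives a lower bound on the true value. The function name is hypothetical.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT-TS from per-residue Cα deviations (Å).

    distances : list of model-to-native Cα distances under one fixed
    superposition (the real measure searches over superpositions).
    """
    if not distances:
        raise ValueError("no residues to evaluate")
    n = len(distances)
    # Fraction of residues within each distance cutoff.
    fractions = [sum(d <= c for d in distances) / n for c in cutoffs]
    # Average over the four cutoffs, expressed as a percentage.
    return 100.0 * sum(fractions) / len(cutoffs)
```

Scores around 30, as reported here for T0804, correspond to models in which only a minority of residues fall within the tighter cutoffs, which is typical of hard, template-free targets.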

T0804 Server Models Selected by PRESCO

Final Selection

Rank  Model                    GDT-TS
1     QUARK_TS5                30.93
2     myprotein-me_TS4         12.63
3     Zhang-server_TS5         29.77
4     Seok-server_TS2          11.86
5     BAKER_ROSETTASERVER_TS3  12.37

QU_C2 + BLOSUM30

Rank  Model
1     QUARK_TS5
2     TASSER-VMT_TS5
3     myprotein-me_TS4
4     BAKER_ROSETTAS_TS3
5     Zhang-Server-TS1

QU_C1 + QUIB

Rank  Model
1     QUARK_TS5
2     SAM-T08-server_TS3
3     myprotein-me_TS1
4     myprotein-me_TS4
5     TASSER-VMT-TS1

CC80 + BLOSUM30

Rank  Model
1     BAKER_ROSETTAS_TS3
2     myprotein-me_TS4
3     QUARK_TS5
4     myprotein-me_TS1
5     RBO_Aleph_TS3

CCPC + BLOSUM30

Rank  Model
1     QUARK_TS5
2     BAKER_ROSETTAS_TS3
3     myprotein-me_TS4
4     myprotein-me_TS1
5     TASSER-VMT_TS5

QUIB

Rank  Model
1     SAM-T08-server_TS3
2     myprotein-me_TS4
3     QUARK_TS5
4     BAKER_ROSETTAS_TS3
5     BAKER_ROSETTAS_TS2
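To see how much the five matrix-specific top-5 lists for T0804 agree, one can simply count how often each server model appears. The snippet below does only that tally; the slides do not state how the final selection was derived from these lists, so this is a descriptive summary and not the lab's selection rule. Model names are copied verbatim from the slide, including its inconsistent spellings, and the variable names are hypothetical.

```python
from collections import Counter

# Top-5 server models per matrix combination, as listed on the slide.
top5_by_matrix = {
    "QU_C2 + BLOSUM30": ["QUARK_TS5", "TASSER-VMT_TS5", "myprotein-me_TS4",
                         "BAKER_ROSETTAS_TS3", "Zhang-Server-TS1"],
    "QU_C1 + QUIB": ["QUARK_TS5", "SAM-T08-server_TS3", "myprotein-me_TS1",
                     "myprotein-me_TS4", "TASSER-VMT-TS1"],
    "CC80 + BLOSUM30": ["BAKER_ROSETTAS_TS3", "myprotein-me_TS4", "QUARK_TS5",
                        "myprotein-me_TS1", "RBO_Aleph_TS3"],
    "CCPC + BLOSUM30": ["QUARK_TS5", "BAKER_ROSETTAS_TS3", "myprotein-me_TS4",
                        "myprotein-me_TS1", "TASSER-VMT_TS5"],
    "QUIB": ["SAM-T08-server_TS3", "myprotein-me_TS4", "QUARK_TS5",
             "BAKER_ROSETTAS_TS3", "BAKER_ROSETTAS_TS2"],
}

# Count in how many of the five lists each model appears.
counts = Counter(m for models in top5_by_matrix.values() for m in models)

for model, n in counts.most_common():
    print(f"{model}: {n} of {len(top5_by_matrix)} lists")
```

QUARK_TS5 and myprotein-me_TS4 appear in all five lists, and both are also in the final selection; the remaining final picks do not follow from the tally alone.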

T0799-D1 Kiharalab Top 1 Model

Native (coordinates not available)

Kiharalab_1: GDT-TS 19.86, GOAP -33178.17

BAKER-ROSETTASERVER_3: GDT-TS 19.86, GOAP -31360.77

3rd Best in Top 1 Models

T0834-D1 Kiharalab Top 1 Model

3rd Best in Top 1 Models

Kiharalab_1: GDT-TS 37.12, GOAP -26474.14

RBO_ALeph_5: GDT-TS 37.88, GOAP -26865.67

Superimposition with the native (130-192) (D1 also includes 2-37)

Acknowledgements

http://kiharalab.org

@kiharalab

Hyung-Rae Kim

Amit Roy

Lenna Peterson

Daisuke Kihara
