bioinformatics 2 -- lecture 9


Upload: ethelbert-watts

Post on 18-Jan-2018


Page 1: Bioinformatics 2 -- lecture 9

Bioinformatics 2 -- lecture 9

Ramachandran angles

Sidechain chi angles

Rotamers

Dead End Elimination Theorem

Page 2: Bioinformatics 2 -- lecture 9

Backbone angles phi and psi


In 1968, G. N. Ramachandran built a model like this, ala-ala-ala, to explore the relationship between interatomic distances and the two freely rotatable backbone angles, phi and psi. Conformations in which two atoms came too close together were not permissible. What angles were permissible?

Page 3: Bioinformatics 2 -- lecture 9

Ramachandran Plot


[Figure: Ramachandran plot with regions marked "Best" and "Allowed".]

The regions labeled alpha and beta represent valleys of stability, surrounded by a high energy plateau. Values of phi are limited primarily to the range between -60 degrees and -150 degrees. For psi, the range is limited to regions centered about -60 degrees and +120 degrees.

Use SEQ: Measure-->Ramachandran Plot to view all residues in your protein plotted phi versus psi.

This plot is for all amino acids except Pro and Gly.

Page 4: Bioinformatics 2 -- lecture 9

Backbone angle statistics

Colors represent the frequency (in bins of 10° x 10°) of phi/psi angles. E, B and H are most common. L, l and x are found most often in Gly.
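The 10° x 10° binning behind such a frequency plot can be sketched in a few lines. This is a minimal illustration, not the method used to make the slide: the function name `bin_phi_psi` and the sample angles are hypothetical.

```python
from collections import Counter

def bin_phi_psi(angles, bin_size=10):
    """Count (phi, psi) pairs per bin; each key is the lower corner of a bin."""
    counts = Counter()
    for phi, psi in angles:
        # Wrap into [-180, 180) so that +180 and -180 land in the same bin.
        phi = (phi + 180.0) % 360.0 - 180.0
        psi = (psi + 180.0) % 360.0 - 180.0
        key = (int(phi // bin_size) * bin_size, int(psi // bin_size) * bin_size)
        counts[key] += 1
    return counts

# Hypothetical angles: two helix-like residues and one strand-like residue.
sample = [(-62.0, -41.0), (-65.5, -44.0), (-120.0, 130.0)]
counts = bin_phi_psi(sample)
for corner, n in sorted(counts.items()):
    print(corner, n)
```

Coloring each bin by its count gives the kind of frequency map described above.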

Allowed regions are islands. Are bonds really "freely rotatable"?

Page 5: Bioinformatics 2 -- lecture 9

Sidechain angle space -- rotamers

A random sampling of Phenylalanine sidechains, when superimposed, fall into three classes: rotamers.

This simplifies the problem of sidechain modeling. All we have to do is select the right rotamers and we're close to the right answer.

Page 6: Bioinformatics 2 -- lecture 9

Sidechain modeling

Given a backbone conformation and the sequence, can we predict the sidechain conformations?

Energy calculations are sensitive to small changes. So the wrong sidechain conformation will give the wrong energy.

Page 7: Bioinformatics 2 -- lecture 9

Goal of sidechain modeling

Desmet et al, Nature v.356, pp339-342 (1992)

Given the sequence and only the backbone atom coordinates, accurately model the positions of the sidechains.

fine lines = true structure; thick lines = sidechain predictions using the method of Desmet et al.

Page 8: Bioinformatics 2 -- lecture 9

Steric interactions determine allowed rotamers

[Figure: three Newman projections along the CA-CB bond, showing CG on CB relative to N, C=O and H on CA.]

"m" = -60° gauche
"t" = 180° anti/trans
"p" = +60° gauche

3-bond or 1-4 interactions define the preferred angles, but these may differ greatly in energy depending on the atom groups involved.
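As a small illustration of the three staggered classes, the sketch below assigns a measured chi angle to the nearest of -60° ("m"), 180° ("t") and +60° ("p"). The function names are hypothetical, and nearest-ideal-angle assignment is a simplifying assumption; real rotamer libraries use cluster centers that can differ from these ideal values.

```python
def angle_diff(a, b):
    """Smallest signed difference a - b in degrees, in the range [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

# Ideal staggered angles for the three classes named on the slide.
STAGGERED = {"m": -60.0, "t": 180.0, "p": 60.0}

def classify_chi(chi):
    """Return the label of the staggered angle closest to chi (in degrees)."""
    return min(STAGGERED, key=lambda name: abs(angle_diff(chi, STAGGERED[name])))

for chi in (-65.0, 175.0, -170.0, 55.0):
    print(chi, classify_chi(chi))
```

The wrap-around in `angle_diff` matters for the trans class: -170° and +175° are both within 10° of 180°.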

Page 9: Bioinformatics 2 -- lecture 9

Exercise: measure a rotamer

Create a tripeptide TWV, using Protein Builder.

Now, create "meters" for the chi1 and chi2 angles:

Dihedral (from right side menu)

Select N-CA-CB-CG (1-2-3-4)

Select CA-CB-CG-CD1 (2-3-4-5)

[Figure: Trp sidechain with the atoms to select labeled 1 to 5.]
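What a dihedral "meter" reports can be reproduced from four atom positions. The sketch below uses the standard atan2 formula for a torsion angle; the function name and the test coordinates are hypothetical, not taken from the TWV model.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Torsion angle in degrees for atoms p0-p1-p2-p3 (e.g. N-CA-CB-CG for chi1)."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates chosen so the answers are easy to check by eye.
cis = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (1.0, 0.0, 1.0))
trans = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 1.0))
plus90 = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(cis, trans, plus90)
```

Passing the coordinates of N, CA, CB and CG gives chi1; CA, CB, CG and CD1 gives chi2 for Trp.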

Page 10: Bioinformatics 2 -- lecture 9

Trp sidechain is hard to rotate

Rendering the molecule as space filling (Render-->Space filling) allows you to better visualize the contacts.

The W sidechain is shown here lying over the Thr backbone.

Rotamers of W*:

name     chi1    chi2
p-90     +60     -90
p90      +60     +90
t-105    180     -105
t90      180     90
m0       -65     5
m95      -65     95

Page 11: Bioinformatics 2 -- lecture 9

Rotamer Libraries

Rotamer libraries have been compiled by clustering the sidechains of each amino acid over the whole database. Each cluster is a representative conformation (or rotamer), and is represented in the library by the best sidechain angles (chi angles), the "centroid" angles, for that cluster.

Two commonly used rotamer libraries:

*Jane & David Richardson: http://kinemage.biochem.duke.edu/databases/rotamer.php

Roland Dunbrack: http://dunbrack.fccc.edu/bbdep/index.php

*rotamers of W on the previous page are from the Richardson library.
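A rotamer library can be used to ask how close a measured sidechain is to a known rotamer. The sketch below uses the six Trp (chi1, chi2) values listed on the earlier slide; the function names and the choice of a simple Euclidean distance in angle space are assumptions made for illustration.

```python
import math

# Trp rotamers (chi1, chi2) in degrees, as listed on the earlier slide.
TRP_ROTAMERS = {
    "p-90": (60.0, -90.0),
    "p90": (60.0, 90.0),
    "t-105": (180.0, -105.0),
    "t90": (180.0, 90.0),
    "m0": (-65.0, 5.0),
    "m95": (-65.0, 95.0),
}

def angle_diff(a, b):
    """Smallest signed difference a - b in degrees, in the range [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

def nearest_rotamer(chi1, chi2, library):
    """Return (name, distance in degrees) of the library rotamer closest to (chi1, chi2)."""
    def dist(name):
        c1, c2 = library[name]
        return math.hypot(angle_diff(chi1, c1), angle_diff(chi2, c2))
    best = min(library, key=dist)
    return best, dist(best)

name, d = nearest_rotamer(-70.0, 100.0, TRP_ROTAMERS)
print(name, round(d, 1))
```

The returned distance gives a rough answer to "how close?" when comparing measured chi angles against the library.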

Page 12: Bioinformatics 2 -- lecture 9

Exploring Rotamers using MOE

The environment of a buried leucine in 1A07. The interior of a protein is tightly packed. Bad packing produces voids or collisions.

Page 13: Bioinformatics 2 -- lecture 9

Exercise: Rotamer explorer

Open 1A07 from the Protein Database

Edit-->Add hydrogens

Compute-->partial charges

Select an amino acid in the interior.

SE: Edit-->Rotamer Explorer (get from MOE)

Select rotamer with the lowest energy. Are the current chi angles close to the angles of a rotamer? How close? Is it the lowest energy rotamer?

Select “Mutate”. (The coordinates are permanently changed.)

Page 14: Bioinformatics 2 -- lecture 9

Exercise: Rotamer explorer

Select an amino acid on the surface.

SE: Edit-->Rotamer Explorer (get from MOE)

Are the current angles close to a rotamer? Is it the lowest energy rotamer?

What interactions does the best rotamer have?

Mutate.

Then select a nearby sidechain and do the same thing.

How many times would you have to mutate before you could be sure of having the lowest energy rotamer set?

Page 15: Bioinformatics 2 -- lecture 9

Dead end elimination theorem

•There is a global minimum energy conformation (GMEC), where each residue has a unique rotamer.

In other words: GMEC is the set of rotamers that has the lowest energy.

•Energy is a pairwise thing. Total energy can be broken down into pairwise interactions. Each atom is either fixed (backbone) or movable (sidechain).

fixed-fixed: E is a constant, = E_template

fixed-movable: E depends on the rotamer, but is independent of other rotamers

movable-movable: E depends on the rotamer, and depends on surrounding rotamers

Page 16: Bioinformatics 2 -- lecture 9

Theoretical complexity of sidechain modeling

The Global Minimum Energy Configuration (GMEC) is one, unique set of rotamers.

How many possible sets of rotamers are there?

$$n_1 \times n_2 \times n_3 \times n_4 \times n_5 \times \dots \times n_L$$

where $n_1$ is the number of rotamers for residue 1, and so on.

Estimated complexity for a protein of 100 residues, with an average of 5 rotamers per position: $5^{100} \approx 8 \times 10^{69}$

DEE reduces the complexity of the problem from $5^L$ to approximately $(5L)^2$
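The numbers on this slide are easy to check directly. The sketch below multiplies the per-residue rotamer counts and compares the result with the slide's (5L)^2 estimate for DEE; the function name is hypothetical.

```python
import math

def count_rotamer_sets(rotamer_counts):
    """Number of possible rotamer sets: n1 * n2 * ... * nL."""
    return math.prod(rotamer_counts)

L = 100  # residues
n = 5    # average rotamers per position

total = count_rotamer_sets([n] * L)
print(f"{float(total):.1e}")  # prints 7.9e+69

# The slide's estimate of the DEE problem size, for comparison.
dee_scale = (n * L) ** 2
print(dee_scale)  # prints 250000
```

`math.prod` needs Python 3.8 or later; it also handles the realistic case where each residue has a different number of rotamers.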

Page 17: Bioinformatics 2 -- lecture 9

Dead end elimination theorem

• Each residue is numbered (i or j) and each residue has a set of rotamers (r, s or t). So, the notation $i_r$ means "choose rotamer r for position i".

• The total energy is the sum of the three components:

$$E_{global} = E_{template} + \sum_i E(i_r) + \sum_i \sum_{j>i} E(i_r, j_s)$$

The three terms are, in order, the fixed-fixed, fixed-movable and movable-movable energies, where r and s are any choice of rotamers.

NOTE: $E_{global} \ge E_{GMEC}$ for any choice of rotamers.
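The three-part energy sum can be written as a short function. This is a sketch under assumed data structures: the list-of-lists for self energies, the dictionary of pair-energy matrices keyed by (i, j) with i < j, and all the numbers are hypothetical.

```python
def total_energy(choice, e_template, e_self, e_pair):
    """E_global for one rotamer choice; choice[i] is the rotamer index at residue i."""
    n = len(choice)
    energy = e_template                      # fixed-fixed
    for i in range(n):
        energy += e_self[i][choice[i]]       # fixed-movable
    for i in range(n):
        for j in range(i + 1, n):            # each pair counted once
            energy += e_pair[(i, j)][choice[i]][choice[j]]  # movable-movable
    return energy

# Hypothetical energies: 2 residues, 2 rotamers each.
e_template = -10
e_self = [[0, 1], [2, 0]]
e_pair = {(0, 1): [[3, -1], [0, 4]]}

print(total_energy((0, 1), e_template, e_self, e_pair))  # prints -11
print(total_energy((1, 0), e_template, e_self, e_pair))  # prints -7
```

Because the template term is the same for every choice, it never affects which rotamer set is the lowest in energy.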

Page 18: Bioinformatics 2 -- lecture 9

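Dead end elimination theorem

• If $i_g$ is in the GMEC and $i_t$ is not, then we can separate the terms that contain $i_g$ or $i_t$ and rewrite the inequality.

$$E_{GMEC} = E_{template} + E(i_g) + \sum_{j \ne i} E(i_g, j_g) + \sum_{j \ne i} E(j_g) + \sum_{j \ne i} \sum_{k > j,\, k \ne i} E(j_g, k_g)$$

...is less than...

$$E_{notGMEC} = E_{template} + E(i_t) + \sum_{j \ne i} E(i_t, j_g) + \sum_{j \ne i} E(j_g) + \sum_{j \ne i} \sum_{k > j,\, k \ne i} E(j_g, k_g)$$

Canceling the terms common to both expressions (shown in black on the slide), we get:

$$E(i_t) + \sum_{j \ne i} E(i_t, j_g) > E(i_g) + \sum_{j \ne i} E(i_g, j_g)$$

So, if we find two rotamers $i_r$ and $i_t$, and:

$$E(i_r) + \sum_{j \ne i} \min_s E(i_r, j_s) > E(i_t) + \sum_{j \ne i} \max_s E(i_t, j_s)$$

Then $i_r$ cannot possibly be in the GMEC: whatever rotamers the other residues take, replacing $i_r$ with $i_t$ lowers the energy, which contradicts the inequality above if $i_r$ were the GMEC rotamer.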

Page 19: Bioinformatics 2 -- lecture 9

Dead end elimination theorem

$$E(i_r) + \sum_{j \ne i} \min_s E(i_r, j_s) > E(i_t) + \sum_{j \ne i} \max_s E(i_t, j_s)$$

This can be translated into plain English as follows:

If the "worst case scenario" for rotamer t is better than the "best case scenario" for rotamer r, then you can eliminate r.
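The elimination criterion can be sketched as a single test. The data layout (self energies per residue, pair-energy matrices keyed by (i, j) with i < j, and an `alive` dictionary of surviving rotamers) and all the numbers are hypothetical, chosen only to illustrate the inequality.

```python
def pair_energy(e_pair, i, r, j, s):
    """Look up E(i_r, j_s); matrices are stored once, under the key (low, high)."""
    if i < j:
        return e_pair[(i, j)][r][s]
    return e_pair[(j, i)][s][r]

def dee_eliminates(i, r, t, e_self, e_pair, alive):
    """True if the best case for rotamer r at residue i is worse than the worst case for t."""
    best_r = e_self[i][r]
    worst_t = e_self[i][t]
    for j in alive:
        if j == i:
            continue
        best_r += min(pair_energy(e_pair, i, r, j, s) for s in alive[j])
        worst_t += max(pair_energy(e_pair, i, t, j, s) for s in alive[j])
    return best_r > worst_t

# Hypothetical energies: 2 residues, 2 rotamers each.
e_self = [[0, 5], [0, 0]]
e_pair = {(0, 1): [[1, 2], [4, 6]]}
alive = {0: [0, 1], 1: [0, 1]}

print(dee_eliminates(0, 1, 0, e_self, e_pair, alive))  # prints True  (9 > 2)
print(dee_eliminates(0, 0, 1, e_self, e_pair, alive))  # prints False (1 > 11 fails)
```

Only rotamers still listed in `alive` enter the min and max, so each elimination can tighten the bounds for later comparisons.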

Page 20: Bioinformatics 2 -- lecture 9

Exercise: Dead End Elimination

Using the DEE worksheet:

(1) Find a rotamer that satisfies the DEE theorem.

(2) Eliminate it.

(3) Repeat until each residue has only one rotamer.

What is the final GMEC energy?

Page 21: Bioinformatics 2 -- lecture 9

DEE exercise

[Figure: three sidechain positions, 1, 2 and 3, each with rotamers a, b and c.]

Three sidechains. Each with three rotamers. Therefore, there are 3x3x3=27 ways to arrange the sidechains.

• Each rotamer has an energy E(r), which is the non-bonded energy between sidechain and template.

• Each pair of rotamers has an interaction energy E(r1,r2), which is the non-bonded energy between sidechains.
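With only 27 arrangements, the lowest-energy set can be found by trying them all, which is a useful check on a DEE answer. The energies below are hypothetical and are not the worksheet values; the data layout is likewise an assumption.

```python
from itertools import product

# Hypothetical energies: 3 residues, 3 rotamers (a, b, c) each.
e_self = [[0, 1, 5], [0, 2, 0], [1, 0, 10]]
zero = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
e_pair = {
    (0, 1): [[0, 0, 0], [0, -4, 0], [0, 0, 0]],
    (0, 2): [[0, 0, 0], [0, -2, 0], [0, 0, 0]],
    (1, 2): [[0, 0, 0], [0, -1, 0], [0, 0, 0]],
}

def energy(choice):
    """Sum of self energies plus pair energies for one rotamer per residue."""
    e = sum(e_self[i][r] for i, r in enumerate(choice))
    for (i, j), matrix in e_pair.items():
        e += matrix[choice[i]][choice[j]]
    return e

choices = list(product(range(3), repeat=3))  # all 3 x 3 x 3 arrangements
best = min(choices, key=energy)
print(len(choices), best, energy(best))      # prints 27 (1, 1, 1) -4
```

Here the favorable pair energies outweigh the higher self energies of rotamer b, so the lowest-energy set is not the one built from the best individual rotamers. Exhaustive search like this stops being practical long before 100 residues.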

Page 22: Bioinformatics 2 -- lecture 9

DEE exercise

-1 1 13 5 15 5 -1

-2 2 50 5 -10 0 0

0 0 112 5 04 3 0

-1 3 51 5 51 1 -1

-2 0 02 5 05 -1 0

0 12 40 5 31 0 0

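[Worksheet: rows (r1) and columns (r2) are rotamers a, b and c of residues 1, 2 and 3. The cells hold the pair energies E(r1,r2); the margins hold the self energies E(r1) and E(r2).]

Self energies E(r1) and E(r2), for rotamers a b c | a b c | a b c:

0 0 5 | 0 0 0 | 0 0 10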

Page 23: Bioinformatics 2 -- lecture 9

DEE exercise: instructions

(1) The best (worst) energies are found using the worksheet: Add E(r1) to the sum of the lowest (highest) E(r1,r2) for each other residue, using only rotamers r2 that have not been previously eliminated.

(2) There are 9 possible DEE comparisons to make: 1a versus 1b, 1a versus 1c, 1b versus 1c, 2a versus 2b, etc. etc. For each comparison, find the minimum and maximum energy choices of the other rotamers. If the maximum energy of r1 is less than the minimum energy of r2, eliminate r2.

(3) Scratch out the eliminated rotamer and repeat until one rotamer per position remains.

If the “best case scenario” for r1 is worse than the “worst case scenario” for r2 you can eliminate r1.
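The worksheet procedure can be sketched as a loop that keeps eliminating rotamers until a full pass removes nothing. The energies below are hypothetical, not the worksheet values, and the data layout is an assumption. In this small example every residue ends with one rotamer, but in general this simple criterion is not guaranteed to get that far.

```python
def pair_energy(e_pair, i, r, j, s):
    """Look up E(i_r, j_s); matrices are stored once, under the key (low, high)."""
    return e_pair[(i, j)][r][s] if i < j else e_pair[(j, i)][s][r]

def eliminates(i, r, t, e_self, e_pair, alive):
    """True if the best case for r at residue i is worse than the worst case for t."""
    best_r, worst_t = e_self[i][r], e_self[i][t]
    for j in alive:
        if j != i:
            best_r += min(pair_energy(e_pair, i, r, j, s) for s in alive[j])
            worst_t += max(pair_energy(e_pair, i, t, j, s) for s in alive[j])
    return best_r > worst_t

def run_dee(e_self, e_pair):
    """Repeat DEE comparisons until a full pass eliminates nothing."""
    alive = {i: list(range(len(e))) for i, e in enumerate(e_self)}
    changed = True
    while changed:
        changed = False
        for i in alive:
            for r in list(alive[i]):  # iterate over a copy while removing
                if any(t != r and eliminates(i, r, t, e_self, e_pair, alive)
                       for t in alive[i]):
                    alive[i].remove(r)  # scratch out the dead-ending rotamer
                    changed = True
    return alive

# Hypothetical energies: 3 residues, 2 rotamers each.
e_self = [[0, 5], [0, 0], [0, 3]]
e_pair = {(0, 1): [[1, 2], [4, 6]],
          (0, 2): [[0, 1], [1, 0]],
          (1, 2): [[0, 2], [2, 5]]}

alive = run_dee(e_self, e_pair)
print(alive)  # prints {0: [0], 1: [0], 2: [0]}
```

For these numbers the surviving set has a total energy of 1 (self energies 0 + 0 + 0, pair energies 1 + 0 + 0), which matches the minimum found by checking all 8 combinations by hand.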

Page 24: Bioinformatics 2 -- lecture 9

Sequence design using DEE

• Did you notice that Rotamer Explorer in MOE allows you to choose a different sidechain?

Choosing an amino acid for each position, based on the backbone structure and the energy function, is called Protein Design.