allan stewart

28
Face-centered cubic (FCC) lattice models for protein folding: energy function inference and biplane packing Allan Stewart

Upload: brandon-sargent

Post on 02-Jan-2016

27 views

Category:

Documents


0 download

DESCRIPTION

Allan Stewart. Face-centered cubic (FCC) lattice models for protein folding: energy function inference and biplane packing. Proteins carry out the work of the cell. Reményi A et al. Genes Dev. 2003;17:2048-2059. Dogma of computational protein structure prediction (PSP). - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Allan Stewart

Face-centered cubic (FCC) lattice models for protein folding: energy

function inference and biplane packing

Allan Stewart

Page 2: Allan Stewart

Proteins carry out the work of the cell

Reményi A et al. Genes Dev. 2003;17:2048-2059

Page 3: Allan Stewart

Dogma of computational protein structure prediction (PSP)

The biological “native” has the minimum energy conformation over the entire fold landscape.

Controversial whether native is unique or there if there may generally be an ensemble

Page 4: Allan Stewart

Protein folding is NP-hard in most formulations

Reduction to partition problem (Ngo & Marks '92)

Page 5: Allan Stewart

The problem is hard under any reasonable model

The protein energy is minimized iff

there is an assignment of move vectors into two subsets st. the sum of the subsets are equal

We find NP-hardness results for other formulations: Ising model, bin packing, Hamiltonian path. (See Istrail & Lam, Combinatorial Algorithms for Protein Folding in Lattice Models: A Survey of Mathematical Results, 2009)

This sort of worst-case intractability analysis will not improve in even the simplest of models.

Page 6: Allan Stewart

Biological basis of folding principally involves hydrophobic collapse

“Truth is much too complicated to allow anything but approximations.”

~ John von Neumann

Protein primary structure: N - AGECH... - C The tertiary (3D) structure is dependent on

primary structure alone (experimental evidence)

Page 7: Allan Stewart

Biological basis of folding principally involves hydrophobic collapse

Suppose the protein sequence to be a string over the 20-letter amino acid alphabet

Avoiding the complexity (charge, size) of amino acids, we classify the residues as

'H': hydrophobic residues 'P': hydrophilic (polar) residues

The HP model (Ken Dill, 1985) is a simplest framework for folding, and prioritizes H-H interactions.

Page 8: Allan Stewart

Biological basis of folding principally involves hydrophobic collapse

Right: red are hydrophobic, green hydrophilic Reds form a core in the center → Long-range interactions

Page 9: Allan Stewart

There are two camps in protein folding

Off-lattice (continuous mathematics) More flexibility Heuristic methods perform fairly well Optimality of simulation is uncertain

On-lattice (discrete mathematics) Exhaustive enumeration of space Provably timely and near-optimal results But a lot is yet unknown... We don't know if the lattice gives good

prediction

Page 10: Allan Stewart

My Thesis

continuous

discrete

PDBRepo

FCCSC

LatFit

Energy

Fit an Energyfunction

Structures

StatisticalEvaluationof existingmethods

Model

PredictNative

Conjecture/ theorems

discrete

Page 11: Allan Stewart

Face-centered cubic lattice

Lattices are discrete subgroups of R

distinguished by their basis vectors (connectivity) and coordination number

3

Page 12: Allan Stewart

Folding is the minimization of the energy potential function

Protein conformations are Boltzmann distributed

Typical energy functions sum values for each of the pairs of amino acids in the protein sequence.

Caveat foldtor Your fold is only as good as your potential function,

and how hard you work is dependent on the function.

(Some don't)

Page 13: Allan Stewart

Prior work shows that we can do with just a few parameters

HP model typically only scores H-H contacts

This corresponds to a symmetric interaction matrix

Page 14: Allan Stewart

We look for empirical parameters which improve over the 'HP' matrix

• Extract 1198 PDB structures

• Generate decoys

decoys are natives which have been perturbed by roughly 16%

• Count all types of contacts

• Use gradient ascent to optimize choice of parameters

max

Page 15: Allan Stewart

We found an optimum energy function, but not a universal one.

H P S

H -.13 0 -.05

P -.04 -.06

S X

13997

H P S

H -.15 0 0

P -.04 0

S X

13365 13997 → 72% successful prediction. A large fraction of the decoys are very

deceptive

Page 16: Allan Stewart

Pairwise Function Impossibility Conjecture

In collaboration with Warren Schudy and Sorin Istrail, we conjecture that no linear function f which sums pairwise potential satisfies axioms (1) and (2)

We formulate it as an LP with the above as linear constraints.

(1)(2)

Page 17: Allan Stewart

Towards Realistic Models of Folding

“For me, the first challenge for computing science is to discover how to maintain order in a finite, but very large, discrete universe that is intricately intertwined.”

~ Edsgar Dijkstra

Page 18: Allan Stewart

What do lattice algorithms look like? We chop the protein into blocks and align

blocks with high hydrophobicity.

(Hart and Istrail 1995) Use inequalities to bound numbers of contacts

Approximation algorithms

Page 19: Allan Stewart

What do lattice algorithms look like?

Hart and Istrail (1997) show an 86% approximation ratio for a 4x2 biplane on FCC sidechain model

Page 20: Allan Stewart

The biplane is near-optimal, but is it realistic???

We found an optimal center cutting plane through each protein and annotated the hydrophobics lying within distance k.

There is high variance in biplane hydrophobicity.

Roughly 50:50 biplanar to non-biplanar

Page 21: Allan Stewart

The biplane is near-optimal, but is it realistic???

Biplane corresponds best to a globular fold. The alpha helix is a problem!

set alpha beta unstructured solvent

biplanar 30% 19% 49% 46%

non-biplanar 37% 20% 44% 47%

Page 22: Allan Stewart

Rescuing the alpha-helix with the FCC

The alpha-helix is a right-handed helix, ~4 residues per turn.

The FCC places spheres at angles : the

dihedral angles of the helix.

Page 23: Allan Stewart

Idea #1: Find a 4-tuple of alpha vectors in FCC

Alpha bundles from

Pokarowski et al. 2003

Page 24: Allan Stewart

Idea #2: Assemble octahedrons in FCC lattice

Build an octahedron-like conformation with hydrophobics towards center.

Exploits angles of FCC: face angles = 120deg.

7369

Goal

Page 25: Allan Stewart

Conclusion: new frontiers for FCC sidechain folding

Implications for algorithm design Block partitioning Fold block into biplane or octahedron

Can we prove bounds for increasingly complex methods?

If we prove pairwise impossibility, how do we construct our energy function?

Page 26: Allan Stewart

Questions?

Fire away.

“If you don't work on important problems, it's not likely that you'll do important work.”

~ Richard Hamming

Thank you for doing important work.

Page 27: Allan Stewart
Page 28: Allan Stewart

If I reach this slide, something went wrong

“There's no sense in being precise when you don't even know what you're talking about.”

– John von Neumann “It is better to do the right problem the wrong

way than the wrong problem the right way.

— Richard Hamming