Seminar in BioInformatics

A Method for Biomolecular Structural Recognition and Docking Allowing Conformational Flexibility (1998)

Bilha Sandak, Ruth Nussinov and Haim Wolfson

Presented by: Tzahi Sofer

Post on 20-Dec-2015



Lecture Structure

• Overview: the problem, the general idea of the solutions, other approaches.
• Problem definition.
• Preview.
• The algorithm.
• Result analysis and examples.
• Summary and discussion.

Problem Definition

• The problem: generating binding modes between two molecules (a ligand and a receptor), also known as molecular docking.

Overview

• Solving this problem involves recognition of molecular surfaces and depends on the 3-D structures and flexibility of the molecules.
• Our approach allows hinge motions to exist in either the ligand or the receptor molecules, of diverse size.

Overview (cont.)

• We achieve this by adapting a technique from computer vision and robotics (Wolfson, 1991).
• Other docking techniques have enabled hinge movements only in small ligands. Few of them enable partial flexibility in the receptor (DesJarlais 1986; Leach & Kuntz 1992; Rarey, “FlexX” 1996).
• We apply the algorithm to cases of bound and unbound complexes.

Preview

• During the process of molecular association, either of the participating molecules may undergo conformational changes.

[Figure labels: hinge, ligand, receptor]

Preview (cont.)

• Rigid docking vs. flexible docking.
• More than 6 degrees of freedom (3 rotational, 3 translational, 1 relative).
• By allowing flexibility in either the ligand or the receptor, additional candidate inhibitors may be obtained.
• Simultaneous match of all parts of the molecule.

Redefining the problem

• So, the problem is:

“Given a database of known ligands and a newly introduced receptor, recover all ligands which exhibit a substantial partial surface match, without colliding. If the ligands contain hinges, solve the problem by recovering the ligand in a plausible conformation, without having the parts self-collide.”

Method Outcome

• The output: transformations.
• Verification: by docking bound structures.
• Good binding modes are generated as well.

Key Questions

How do we find hinges?

What is the complexity advantage?

How do we represent the model?

The Algorithm

• Overview.
• Phases:
  - Preprocessing.
  - Recognition.
• Complexity analysis.

The Algorithm - Overview

• Representing the surfaces as sets of interest points is a non-trivial task.
• Two major issues:
  - a precise representation.
  - execution time and memory consumption.
• Recall the representation generated by Lin and Nussinov (1996). There is also the sphere representation of Kuntz (1982).

The Algorithm - Overview

• Here, “caps” are used for the receptor and “pits” for the ligand.
• Position the hinges in the model. Hinge locations are determined by one of:
  - comparing conformations of ligands.
  - flexible alignment (FlexProt).
  - narrow regions in the molecule.

The Algorithm - PreProcessing

• Represent the model as an “interest point” set.
• The hinge positions are picked as the origin of a 3-D frame, called the “ligand frame”.
• Store the model in a look-up (hash) table:
  - creating “triplet frames” from triplets of interest points.

Why triplets?

What is a frame?

Frames

• An orthonormal 3-D frame.
• Its basis is orthonormal.
• Here, we compute the basis for each invariant triangle.
• Then, the transformation is computed from the frame’s basis to the unit basis:
  - translational vector.
  - rotational transformation.
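As a minimal sketch of the frame construction described above (not the authors' code), the following Python builds an orthonormal frame from a triplet of interest points and expresses a point in that frame. Placing the origin at the first point and the x-axis along the first edge are assumptions made for illustration, and the names triplet_frame and to_frame are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(dot(v, v))
    if n == 0.0:
        raise ValueError("degenerate (collinear or repeated) points")
    return tuple(x / n for x in v)

def triplet_frame(p1, p2, p3):
    """Return (origin, (ex, ey, ez)) for three non-collinear points.

    Assumption: origin at p1, ex along p1->p2, ez normal to the triangle.
    """
    ex = normalize(sub(p2, p1))
    ez = normalize(cross(sub(p2, p1), sub(p3, p1)))
    ey = cross(ez, ex)  # unit length because ez and ex are orthonormal
    return p1, (ex, ey, ez)

def to_frame(point, frame):
    """Translate by the frame origin, then rotate onto the frame axes."""
    origin, axes = frame
    d = sub(point, origin)
    return tuple(dot(d, e) for e in axes)

frame = triplet_frame((0, 0, 0), (3, 0, 0), (0, 4, 0))
local = to_frame((3, 4, 5), frame)
```

The two steps in to_frame correspond to the translational vector and the rotational transformation listed on the slide.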

The Algorithm - PreProcessing (cont.)

• The side lengths of the triplet triangle serve as an address to the hash table.
• The information stored at each entry is the ligand’s id, the part number, and the transformation between the ligand frame and the triplet frame.
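The look-up table can be sketched roughly as follows. This is an illustration under stated assumptions, not the published implementation: the side lengths are sorted and quantized by a hypothetical resolution parameter, and each record stores the triplet's point indices as a stand-in for the ligand-frame/triplet-frame transformation.

```python
import math
from collections import defaultdict
from itertools import combinations

def side_key(p1, p2, p3, resolution=1.0):
    """Quantized, sorted triangle side lengths used as the table address."""
    sides = sorted((math.dist(p1, p2), math.dist(p2, p3), math.dist(p1, p3)))
    return tuple(int(s // resolution) for s in sides)

def build_table(parts, resolution=1.0):
    """parts maps (ligand_id, part_no) -> list of interest points."""
    table = defaultdict(list)
    for (ligand_id, part_no), points in parts.items():
        for i, j, k in combinations(range(len(points)), 3):
            key = side_key(points[i], points[j], points[k], resolution)
            # Stand-in record: a full version would store the transformation
            # between the ligand frame and this triplet frame.
            table[key].append((ligand_id, part_no, (i, j, k)))
    return table

# Hypothetical one-part ligand with four interest points.
parts = {("MTX", 0): [(0, 0, 0), (3, 0, 0), (0, 4, 0), (0, 0, 1)]}
table = build_table(parts, resolution=2.0)
```

Sorting the sides makes the address independent of the order in which the three points are visited, which is a simplification chosen for this sketch.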

PreProcessing Remarks

• Multiple hinges = multiple transformations (different ligand frames).
• Min/max distance constraints: robustness and reduced matchings.
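The min/max distance constraint can be pictured as a filter on triplets before they reach the table. The sketch below is an assumption about how such a filter might look; the function name and the bounds of 2.0 and 10.0 are hypothetical values, not taken from the paper.

```python
import math
from itertools import combinations

def valid_triplets(points, min_dist, max_dist):
    """Yield only triplets whose three side lengths lie in [min_dist, max_dist]."""
    for a, b, c in combinations(points, 3):
        sides = (math.dist(a, b), math.dist(b, c), math.dist(a, c))
        if all(min_dist <= s <= max_dist for s in sides):
            yield (a, b, c)

points = [(0, 0, 0), (3, 0, 0), (0, 4, 0), (0, 0, 50)]
kept = list(valid_triplets(points, 2.0, 10.0))
```

Very short sides are sensitive to noise and very long sides inflate the number of triplets, so bounding both ends serves the two goals named on the slide.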

The Algorithm - Recognition

• Represent the target as an “interest point” set.
• Candidate models:
  - calculate the triplet frames of the target.
  - compute the “candidate ligand frame” by applying the transformation stored at the matching table entry to the receptor (target) triplet frame.
  - the origins are the candidate hinge locations.
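A rough sketch of how a stored record yields a candidate hinge location is given below. It keeps only the hinge position relative to the triplet frame (the slides describe storing a full transformation), and the frame convention and all function names are assumptions made for illustration.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(v):
    n = math.sqrt(_dot(v, v))
    return tuple(x / n for x in v)

def triplet_frame(p1, p2, p3):
    """Orthonormal frame with origin p1 (an assumed convention)."""
    ex = _unit(_sub(p2, p1))
    ez = _unit(_cross(_sub(p2, p1), _sub(p3, p1)))
    return p1, (ex, _cross(ez, ex), ez)

def hinge_in_frame(hinge, frame):
    """Preprocessing: hinge coordinates relative to a ligand triplet frame."""
    origin, axes = frame
    d = _sub(hinge, origin)
    return tuple(_dot(d, e) for e in axes)

def candidate_hinge(stored_local, target_frame):
    """Recognition: map the stored coordinates through a target triplet frame."""
    origin, axes = target_frame
    return tuple(origin[i] + sum(stored_local[k] * axes[k][i] for k in range(3))
                 for i in range(3))

ligand_frame = triplet_frame((1, 2, 3), (4, 2, 3), (1, 6, 3))
stored = hinge_in_frame((0, 0, 0), ligand_frame)
# The same triangle after the rigid motion (x, y, z) -> (10 - y, x, z).
target_frame = triplet_frame((8, 1, 3), (8, 4, 3), (4, 1, 3))
hinge = candidate_hinge(stored, target_frame)
```

In this example the recovered hinge is the original hinge (0, 0, 0) carried along by the same rigid motion that moved the triangle.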

The Algorithm - Recognition (cont.)

• Choose only high-scoring candidate solutions:
  - the hinge locations are inserted into a look-up table.
  - we pick locations receiving votes from both sides connected to the hinge.
  - the hinge location is the translation from the original hinge position of the ligand to its new candidate location.
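The voting step can be sketched as below, assuming a single hinge joining parts 0 and 1. The bin size, the threshold and the function name are hypothetical; the candidate coordinates reuse the small numbers shown in the slide figure.

```python
from collections import defaultdict

def vote_for_hinges(candidates, bin_size=1.0, threshold=1):
    """candidates: iterable of (part_no, (x, y, z)) candidate hinge locations.

    Returns the bins that received at least `threshold` votes from each of
    at least two different parts.
    """
    votes = defaultdict(lambda: defaultdict(int))
    for part_no, location in candidates:
        key = tuple(int(c // bin_size) for c in location)
        votes[key][part_no] += 1
    return [key for key, by_part in votes.items()
            if len(by_part) >= 2 and all(v >= threshold for v in by_part.values())]

candidates = [
    (0, (1, 2, 3)),
    (1, (1.1, 2.3, 3)),
    (1, (1, 2, 3.2)),
    (0, (7, 7, 7)),   # supported by one part only, so it is dropped
]
accepted = vote_for_hinges(candidates)
```

Binning lets nearby locations such as (1.1, 2.3, 3) and (1, 2, 3.2) count as the same hinge despite noise.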

The Algorithm - Recognition (cont.)

• Verify the conformations:
  - collision check, ligand-receptor.
  - self-collision check, ligand-ligand.
  - done by applying the transformation to the part’s atoms.
  - collision criterion: 2 * van der Waals radii - threshold.
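One way to read the collision criterion is sketched below: two atoms collide when their centres are closer than the sum of their van der Waals radii minus a tolerance (for equal radii this is 2 * radius - threshold). This reading, the radius of 1.7 and the threshold of 0.5 are assumptions for illustration only.

```python
import math

def collides(atoms_a, atoms_b, threshold=0.5):
    """atoms_*: lists of ((x, y, z), vdw_radius). Brute-force pairwise check."""
    for pos_a, r_a in atoms_a:
        for pos_b, r_b in atoms_b:
            if math.dist(pos_a, pos_b) < r_a + r_b - threshold:
                return True
    return False

ligand_part = [((0.0, 0.0, 0.0), 1.7)]
receptor_far = [((3.0, 0.0, 0.0), 1.7)]
receptor_near = [((2.0, 0.0, 0.0), 1.7)]
ok = not collides(ligand_part, receptor_far)
clash = collides(ligand_part, receptor_near)
```

The same function covers the self-collision check if it is called with the atoms of two parts of the same ligand.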

The Algorithm - optimizations & heuristics

• min/max distance: dense regions heuristic.
• regular/rapid run: dividing the receptor’s triplet set into 8 segments and 1 overlapping segment (discarding triplets).
• voting threshold.

The Algorithm - optimizations & heuristics (cont.)

• prune/no prune: pruning transformations in the verification check.
• collision check: only in the respective segment.
• contact percentage/distance/threshold: a screen for the self-collision check.

The Algorithm - Complexity

• PreProcessing:

Im = N * (m/N) * ((m/N) - 1) * ((m/N) - 2) / 6 = O(m³ / 6N²) = O(m³) for N close to 1

where:
  Im - number of triplets.
  N - number of parts.
  m - number of interest points.

The number of triplets is reduced by a factor of N² compared to the rigid model.
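A quick numerical check of the triplet count, assuming the interest points split evenly among the parts; the values m = 60 and N = 2 are made up for illustration.

```python
from math import comb

def triplets_flexible(m, n_parts):
    """Im = N * C(m/N, 3): triplets are formed within each part only."""
    per_part = m // n_parts
    return n_parts * comb(per_part, 3)

def triplets_rigid(m):
    """All triplets over the whole molecule, C(m, 3)."""
    return comb(m, 3)

flexible = triplets_flexible(60, 2)   # 2 * C(30, 3) = 8120
rigid = triplets_rigid(60)            # C(60, 3) = 34220
ratio = rigid / flexible              # about 4.2, close to N**2 = 4
```

The ratio approaches N² as m grows, which matches the reduction stated on the slide.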

The Algorithm - Complexity (cont.)

• An insertion to the hash table: O(1).
• So, the preprocessing phase: O(m³).
• Hash table manipulation:

B = (D / q)³

where:
  B - number of bins in the table.
  D - maxdist - mindist (the distance constraints).
  q - resolution.

The Algorithm - Complexity (cont.)

• R = Im / B

where:
  R - the average number of records in an entry.

So, each look-up is O(R), assuming a homogeneous distribution.
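To make the two formulas concrete, the sketch below plugs in made-up numbers (a distance range of 2 to 12, a resolution of 1, and 8120 triplets); none of these values come from the paper.

```python
def bins(max_dist, min_dist, resolution):
    """B = (D / q)^3 with D = max_dist - min_dist."""
    d = max_dist - min_dist
    return (d / resolution) ** 3

def average_records(n_triplets, n_bins):
    """R = Im / B, assuming triplets spread evenly over the bins."""
    return n_triplets / n_bins

b = bins(12.0, 2.0, 1.0)           # (10 / 1)^3 = 1000 bins
r = average_records(8120, b)       # 8.12 records per entry on average
```

Halving the resolution q multiplies B by 8 and divides R by 8, so finer bins give faster look-ups.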

The Algorithm - Complexity (cont.)

• Recognition, matching stage:

Cm = O(n³ * R)

where:
  n - the number of interest points in the target.
  R - the average access time to a look-up table.

For small bins this is O(n³), but there is a trade-off with accuracy.

The Algorithm - Complexity (cont.)

• Verification stage:

Ccc = O((m * n * f) / (8N))
Cscc = O((m * g) / jump²)

Complexity summary

O(n³ + m³ + Ccc + Cscc) + Lin’s representation….

The Algorithm - Summary

• Preprocessing:
  - represent the model as an interest point set.
  - position the hinges in the model.
  - store the model in a look-up table.
• Recognition:
  - represent the target as an interest point set.
  - recover candidate models from the table.
  - choose only high-scoring solutions.
  - verify by collision checks.

Experimental results

• We investigate bound structures.
• Thus, the “correct” solutions are those with rotations and translations close to zero.
• “Best solution” = low RMSD.
• Good-fitting predictive binding modes are generated as well.
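For reference, the RMSD used to rank solutions can be computed as in this minimal sketch. It assumes the two coordinate lists are already in the same atom order and does no superposition; the function name is hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

docked = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
crystal = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
value = rmsd(docked, crystal)   # every atom is displaced by 2, so the RMSD is 2
```

For a bound test case, a docked pose close to the crystal pose gives an RMSD near zero, which is what "correct" means on the slide.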

Unbound molecular structures

The RMS values for the atoms of the interface are 4.32 Å and 5.72 Å for the first and second parts, respectively.

Examples

• Docking MTX/DHFR
  - Motivation: MTX is an anticancer drug that prevents the replication of cells.
  - MTX: flexible, hinge at the C9 atom.
  - DHFR: rigid.
  - Bound case.
  - Results.

Examples (cont.)

• Docking Maltose/MBP
  - Motivation: transportation of substrates between the inner and outer membranes of a bacterium.
  - Maltose: rigid (and small).
  - MBP: hinge at the Cα atom of GLU111.
  - Unbound case.
  - Although Maltose is small, all of its atoms are in contact with the receptor.

Summary

• A method for docking a ligand into a receptor, allowing hinge motion in either of them.
• A geometric analogy between the problems of object matching in computer vision and those of molecular binding.
• No a priori knowledge of the match.

Discussion

• Advantages and attributes
  - More results compared to the rigid method.
  - 3-D to 3-D matching problem.
  - Full 3-D rotation during matching.
  - Large number of interest points.
  - Handling noisy samples.
  - Relatively low complexity.
  - Diverse sized molecules.
  - Good predictive results.
  - Short execution times (under 1 min.).

Discussion (cont.)

• Disadvantages
  - Not a good method for small parts.
  - Inconsistency.
  - No biological/chemical considerations.

Additional possible features

• Adding considerations of chemical properties.
• Allowing rotational bond movement only, where full 3-D movement is not required.

The end

Bibliography

• Sandak, Nussinov and Wolfson, “A Method for Biomolecular Structural Recognition and Docking Allowing Conformational Flexibility”, Journal of Computational Biology, vol. 5, 1998.
• Shuo Liang Lin, Ruth Nussinov, Daniel Fischer, Haim J. Wolfson, “A Geometry-based Suite of Molecular Docking Processes, Molecular surface representation by sparse critical points”.
• Sandak, Nussinov and Wolfson, “Flexible docking allowing induced fit”, 1998.

[Figure: hinge position (0,0,0); triplet T1 (3,4,5); triangle sides 3, 4, 5]

[Figure: ligand and receptor with triplet T1; candidate hinge locations (1,2,3), (1,2,3), (1.1, 2.3, 3), (1, 2, 3.2); table entries (ligand_part, hinge_location)]

[Figure: ligand (pits) and receptor (caps), bound and unbound]

Connolly’s molecular surface

Probe spheres are rained upon the atoms from all directions, stopping just before a collision.

Set the probe ball in all possible locations of the molecule.

The intersection point between the probe ball and the atoms defines a point on the molecular surface.

In three dimensions this produces a grid.

Connolly’s molecular surface

[Figure labels: probe balls, van der Waals atoms, contact surface, reentrant surface]

Contact surface + reentrant surface = molecular surface

Connolly’s representation

Face – an atomic-size unit of Connolly’s surface.

Convex face – the part of an atom’s van der Waals surface which the probe ball can touch.

Concave face – the part of the probe ball surface which is bordered by three atoms.

Saddle face – the part of the surface between two atoms.

Connolly’s representation

[Figure labels: convex face, concave face, saddle face]

Connolly’s molecular surface

This representation has the ability to visualize the shape complementarity at interfaces. It has become popular for protein recognition problems. But we need a more concise representation….

Critical points creation – basic terms

The representation is a set of critical points, obtained by projecting the gravity center of a Connolly face onto the surface.

Cap – a point on a convex face

Pit – a point on a concave face

Belt – a point on a saddle-shaped face
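The idea of reducing a face to one critical point can be illustrated with the rough sketch below. It takes the gravity centre of a face's sampled surface points and, as a stand-in for a true projection onto the surface, picks the sampled point nearest to that centre. The nearest-point shortcut and the function name are assumptions, not the procedure of Lin et al.

```python
import math

def critical_point(face_points):
    """Return the sampled face point closest to the face's gravity centre."""
    n = len(face_points)
    if n == 0:
        raise ValueError("a face needs at least one surface point")
    centroid = tuple(sum(p[i] for p in face_points) / n for i in range(3))
    # Stand-in for projecting the gravity centre onto the surface.
    return min(face_points, key=lambda p: math.dist(p, centroid))

# Hypothetical convex face sampled by five surface points.
face = [(0, 0, 1), (1, 0, 0.8), (-1, 0, 0.8), (0, 1, 0.8), (0, -1, 0.8)]
cap = critical_point(face)
```

Applied to a convex face this gives a cap, to a concave face a pit, and to a saddle face a belt, so each face contributes a single point to the sparse set.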

A sparse set of critical points

[Figure: dense MS surface (Connolly) compared with the sparse surface]