
CrossMiner: Federated CSD and PDB 3D searching

Version 1.0 – July 2016
CrossMiner ver. 1.2

Table of Contents

CrossMiner Terminology
Introduction
Overview of CrossMiner
Example 1: Searching with a Pharmacophore
Example 2: Editing a Pharmacophore
Example 3: Creating a Pharmacophore from a Reference Molecule & Scaffold Hopping
Exercise 4: Building your Own Feature Database


Introduction

CrossMiner can be thought of as a pharmacophore-based query tool. However, it is much more powerful than traditional pharmacophore query tools, as it allows you to query not only databases of ligands, but also proteins and protein-ligand interactions. CrossMiner includes a preconfigured database of biologically relevant subsets of the Cambridge Structural Database (CSD) and the Protein Data Bank (PDB). The pharmacophore used in the query is interactive, allowing you to easily edit it in a number of ways through a simple user interface. This delivers an overall interactive search experience with application areas in interaction searching, scaffold hopping or the identification of novel fragments for specific protein environments.

The subset of the CSD included with CrossMiner consists of structures which are not organometallic, have an R-factor of at maximum 10%, have 3D coordinates, have no disorder, and are not polymeric (about 234,400 structures total). The included PDB database is a curated Proasis database subset of all co-crystallized ligands, with the binding site defined as all molecules with an atom within a 6 Å radius around the ligand (210,200 binding sites from 58,300 PDB entries).

For further discussion, please refer to the CrossMiner documentation (in the installation directory) or the original paper: Korb O, Kuhn B, Hert J, Taylor N, Cole J, Groom C & Stahl M “Interactive and Versatile Navigation of Structural Databases” J Med Chem, 2016, 59(9):4257, DOI: 10.1021/acs.jmedchem.5b01756

This tutorial is geared towards the novice CrossMiner user who has life science experience. It covers the primary features in searching across ligands, proteins, and ligand-protein interactions. Some of the results may vary, depending on your version of CrossMiner. When you have completed this tutorial, you should be able to build queries, execute searches, and create your own customized feature database.
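The five CSD subset criteria listed above amount to a simple filter over structure records. The sketch below is only an illustration of that logic: the `Entry` record, its field names, and the example refcodes are hypothetical and are not part of CrossMiner or the CSD software.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical, simplified record describing one crystal structure."""
    refcode: str
    is_organometallic: bool
    r_factor_percent: float
    has_3d_coordinates: bool
    has_disorder: bool
    is_polymeric: bool


MAX_R_FACTOR_PERCENT = 10.0  # "an R-factor of at maximum 10%"


def in_crossminer_subset(entry: Entry) -> bool:
    """Return True if the entry passes all five criteria quoted in the text."""
    return (
        not entry.is_organometallic
        and entry.r_factor_percent <= MAX_R_FACTOR_PERCENT
        and entry.has_3d_coordinates
        and not entry.has_disorder
        and not entry.is_polymeric
    )


# Made-up entries, each failing at most one criterion.
entries = [
    Entry("AAAAAA", False, 4.2, True, False, False),   # passes everything
    Entry("BBBBBB", True, 3.1, True, False, False),    # organometallic
    Entry("CCCCCC", False, 12.5, True, False, False),  # R-factor too high
    Entry("DDDDDD", False, 5.0, True, True, False),    # disordered
]
kept = [e.refcode for e in entries if in_crossminer_subset(e)]
print(kept)
```

Only the first made-up entry survives the filter; each of the others fails exactly one criterion.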

CrossMiner Terminology

CrossMiner uses several terms, some common to the field of drug discovery, and some not. For ease, these terms are defined below:

Feature: a point (or vector) which represents a SMARTS query and, in the case of a vector, geometric rules. A feature could be in a protein, ligand, or pharmacophore.

Pharmacophore feature (or vector): a feature that has been selected to be part of a pharmacophore.

Exit vector: a feature that represents a bond of any type. This feature is represented as two mesh spheres. In CrossMiner, directionality in an exit vector does not matter.
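To make "directionality does not matter" concrete, here is a minimal sketch of a direction-insensitive match between a two-sphere exit vector and a bond. The function names, the coordinates, and the shared radius are assumptions for illustration only; this is not CrossMiner's implementation.

```python
import math


def within(point, centre, radius):
    """True if point lies inside the sphere of the given radius around centre."""
    return math.dist(point, centre) <= radius


def exit_vector_matches(feature, bond, radius=1.0):
    """feature and bond are each a pair of 3D points.

    The bond matches if its two atoms fall inside the two spheres
    in either orientation, so direction is ignored.
    """
    (sphere_1, sphere_2), (atom_a, atom_b) = feature, bond
    forward = within(atom_a, sphere_1, radius) and within(atom_b, sphere_2, radius)
    reverse = within(atom_b, sphere_1, radius) and within(atom_a, sphere_2, radius)
    return forward or reverse


feature = ((0.0, 0.0, 0.0), (1.5, 0.0, 0.0))
bond = ((1.4, 0.1, 0.0), (0.1, 0.0, 0.0))  # same bond, listed "backwards"
print(exit_vector_matches(feature, bond, radius=0.5))
```

The bond is listed in the opposite order to the feature, yet it still matches because the reverse orientation is also tested.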


Overview of CrossMiner

CrossMiner is a powerful tool with a simple user interface. This quick section will familiarise you with the basic functions and underlying data components before moving on to exploring some scientific questions.

1. Launch CrossMiner from the Start Menu > CCDC (Windows) or Applications/CCDC (Mac).

This tutorial will refer to several areas in the tool, such as the ‘results browser’ and the ‘pharmacophore and reference feature browser’ (referred to from here on simply as the ‘feature browser’). Note where these components are.

CrossMiner’s underlying data is a database of crystal structure feature annotations. For this tutorial, we will use the included database – subsets of the CSD and the PDB. You will notice that nothing is loaded when you open CrossMiner – this is intentional, to allow the user to load a customized database instead of the default one. To load the default database:

2. Click File >> Load Feature Database, and select: [CCDC installation]\CSD_CrossMiner1.2\databases\csd_pdb_crossminer.feat.

By default, CCDC installation paths are:
Windows: C:\Program Files\CCDC
Mac: /Applications/CCDC

Loading will take a few minutes, and even once the bar hits 100%, it will need a moment to initialise the structures.

3. Once loaded, you will see the CSD and PDB subset databases listed in the database selection window. You can load multiple databases, and use the tick boxes to indicate which databases should be searched.

4. You will also see a list of features in the bottom right, in the feature browser. These are the features used to generate these databases (reference column).


Example 1: Searching with a Pharmacophore

Human cathepsin L plays a major role in protein catabolism, and is implicated in a number of pathological processes. As such, it is a common research target and serves as a good example for novel drug discovery. For this example, we will be using a pharmacophore included with CrossMiner to explore the CSD and PDB for possible hits.

1. Load the cathepsin L pharmacophore by clicking File >> Load Pharmacophore and select: [CrossMiner installation]\example_pharmacophores\catl_s3.cm

2. This will load the pharmacophore into the viewing area. Take a moment to rotate the molecule and understand the different pharmacophore feature points:

P: Protein feature
S: Small molecule feature
Dashed line: intra- or intermolecular constraint. Constrained features must belong either to the same molecule as each other (intra, dashed green line) or to different molecules (inter, dashed red line).
Mesh sphere: the actual feature itself, where the sphere size represents the radius of tolerance.
Solid sphere: the projected virtual point representing the directionality of a hydrogen bond acceptor/donor. A feature can have more than one projected point. For example, an H bond acceptor can have multiple potential lone pair preferred projections.

Note that the color coding is defined in the feature browser, e.g. hydrophobic features are green, hydrogen bond acceptors are red, and so on. The pharmacophore in your viewer only has one projected H bond acceptor. These points correlate to the feature browser: B indicates the Base feature, and V indicates the accompanying Virtual point.

[Figure: the cathepsin L pharmacophore, with callouts: Protein: H bond acceptor feature (mesh) with projected directionality (solid); Protein: H bond donor feature (mesh) with projected directionality (solid); Intramolecular constraint; Small molecule: heavy atoms constraint; Protein: hydrophobic feature constraint]
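The intra/inter rule described in the legend above can be expressed as a small check on which molecule each matched feature belongs to. This is a sketch under assumed data structures (a dictionary from feature name to molecule identifier, and a list of constraint triples); the feature and molecule names are made up.

```python
def constraints_satisfied(assignment, constraints):
    """Check intra/inter constraints for one candidate match.

    assignment:  feature name -> identifier of the molecule it matched in
    constraints: list of (feature_a, feature_b, kind), kind "intra" or "inter"
    """
    for feature_a, feature_b, kind in constraints:
        same_molecule = assignment[feature_a] == assignment[feature_b]
        if kind == "intra" and not same_molecule:
            return False  # features had to share a molecule but do not
        if kind == "inter" and same_molecule:
            return False  # features had to be in different molecules
    return True


constraints = [
    ("heavy_atom_1", "heavy_atom_2", "intra"),  # both in the small molecule
    ("heavy_atom_1", "acceptor", "inter"),      # ligand atom vs protein feature
]

good = {"heavy_atom_1": "ligand", "heavy_atom_2": "ligand", "acceptor": "protein"}
bad = {"heavy_atom_1": "ligand", "heavy_atom_2": "water", "acceptor": "protein"}
print(constraints_satisfied(good, constraints), constraints_satisfied(bad, constraints))
```

The second candidate is rejected because its two heavy-atom features land in different molecules, which violates the intra constraint.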

3. Sphere size is directly correlated to tolerance – the smaller the sphere, the lower the tolerance for geometric matching. You can see the radius of each feature (Å) in the feature browser. By default, the radius is 1 Å.

Note that sometimes the viewer can get quite crowded – an easy way to find a feature is to click its tick box off and on again.
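The relationship between sphere radius and tolerance can be sketched as a distance test: an atom is accepted for a feature when it lies within the feature's radius of the sphere centre. This is a minimal illustration with made-up coordinates, assuming a plain Euclidean distance cut-off; CrossMiner's internal matching may differ.

```python
import math


def matches_feature(atom_xyz, centre_xyz, radius=1.0):
    """True if the atom lies inside the tolerance sphere (radius in Å)."""
    return math.dist(atom_xyz, centre_xyz) <= radius


centre = (0.0, 0.0, 0.0)
atom = (0.6, 0.4, 0.3)  # about 0.78 Å from the centre

print(matches_feature(atom, centre, radius=1.0))  # accepted by the default sphere
print(matches_feature(atom, centre, radius=0.5))  # rejected by a tighter sphere
```

The same atom is accepted by the 1 Å sphere and rejected by the 0.5 Å sphere, which is why shrinking the radii reduces the number of hits.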

4. Click the start button to begin searching across the databases for matches. As the search runs, you will see results populating the results browser window, as well as the progress bar with the total number of hits at top right.

5. This particular search returns a high number of hits, and may take a while. The sphere next to the start search button is green while the search is running. Pause the search by clicking the pause button when you feel you have enough hits (a few hundred will do). Do not stop the search, as that stops the entire process and removes all hits.

[Figure: the Start search and Pause search buttons]

6. By default, all results are overlaid, which makes it easy to appreciate the common motifs matching the pharmacophore. However, this is messy. By clicking on each result in the browser, you can view them one at a time. Or, hold down the shift or ctrl key to select multiple results.

7. Locate the result with the lowest RMSD by clicking on the rmsd column in the browser, to sort in ascending order. The lowest RMSD result in this example is 2HJ_001_1 – yours may be different, depending on when you paused your search. These results all come back with a 2D diagram, with pharmacophore matches indicated. Select your smallest RMSD result by clicking on the row (it does not need to be ticked).

8. For ease of viewing, change the style in the upper left to Capped Sticks. Take a few moments to explore how the returned result matches the pharmacophore query. In particular, note:

• Feature matching is based on the size of the sphere (and thus the tolerance level). Tiny spheres with tight tolerance will result in hits with very close alignment to the centre of the sphere, while larger spheres have a wider area for alignment.
• There are no explicit hydrogens in the protein sites – hydrogen bond donors and acceptors are defined in this database based on feature definitions, which come from expert knowledge about protonation and tautomer state. Thus, there is no way to display hydrogens, as they are inferred. See the reference paper listed in the introduction for more information.

There are a lot of results, likely due to the large radius of the features. Lower the tolerance of the features to get a smaller result set with more precise alignment to the pharmacophore centres. To do this, you will need to edit the pharmacophore, which cannot be done when viewing results.
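Sorting hits by RMSD, as in step 7, can be sketched as follows. The tutorial does not spell out how CrossMiner computes RMSD, so this assumes the usual definition (root mean square of the distances between matched points and feature centres); the hit names and coordinates are made up.

```python
import math


def rmsd(matched_points, feature_centres):
    """Root-mean-square deviation between matched points and feature centres."""
    if not matched_points or len(matched_points) != len(feature_centres):
        raise ValueError("need equally sized, non-empty point lists")
    squared = [math.dist(p, c) ** 2 for p, c in zip(matched_points, feature_centres)]
    return math.sqrt(sum(squared) / len(squared))


centres = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
hits = {
    "hit_a": [(0.5, 0.0, 0.0), (3.0, 0.5, 0.0)],  # both points 0.5 Å off
    "hit_b": [(0.1, 0.0, 0.0), (3.0, 0.0, 0.1)],  # both points 0.1 Å off
    "hit_c": [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],  # exact overlay
}

# Ascending order, like clicking the rmsd column in the results browser.
ranked = sorted(hits, key=lambda name: rmsd(hits[name], centres))
print(ranked)
```

A hit whose matched atoms sit exactly on the feature centres has an RMSD of zero and sorts to the top.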

9. Click the stop button to clear the results and enable editing.

[Figure: the Stop search button]

10. Ensure that Picking mode is set to Edit Pharmacophore.

11. In editing mode, you can double click on the radius sizes in the feature browser to change them. Change the radius of every feature to 0.5 to lower the tolerance. This will result in fewer hits, all of which are more closely aligned with the centres of the pharmacophore features.

Start the search on the new, tighter pharmacophore. It will take a few minutes to pick up hits, as there are far fewer. Let the search run to completion this time, resulting in 20 hits.

12. If cluster hits is selected, then the results are clustered based on the adjacent Tanimoto value. If clustered, you will see representatives of those similar groups in the results viewer (and thus fewer hits than were actually returned). Uncheck the box to see all hits, including those which are very similar to each other.

The lowest RMSD results here are 2XU1 and 2XU3 – which are, unsurprisingly, cathepsin L PDB entries. If you explore the top result (2XU1_001), you’ll see that the hit matches the pharmacophore almost exactly.
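The effect of hit clustering in step 12 (fewer rows shown than hits returned) can be illustrated with a generic Tanimoto-based grouping. The tutorial does not describe CrossMiner's clustering algorithm, so the greedy "leader" scheme, the 0.7 threshold, and the set-based fingerprints below are all assumptions chosen for illustration.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints stored as sets of on-bits."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_hits(fingerprints, threshold=0.7):
    """Greedy leader clustering: each hit joins the first representative
    it is similar enough to, otherwise it starts a new cluster."""
    clusters = {}  # representative name -> list of member names
    for name, fingerprint in fingerprints.items():
        for representative in clusters:
            if tanimoto(fingerprint, fingerprints[representative]) >= threshold:
                clusters[representative].append(name)
                break
        else:
            clusters[name] = [name]
    return clusters


fingerprints = {
    "hit_1": {1, 2, 3, 4},
    "hit_2": {1, 2, 3, 4, 5},  # Tanimoto 0.8 with hit_1
    "hit_3": {7, 8, 9},
    "hit_4": {7, 8, 9, 10},    # Tanimoto 0.75 with hit_3
}
clusters = cluster_hits(fingerprints)
print(list(clusters))  # representatives only
print(clusters)        # all hits, grouped
```

Four hits collapse to two representatives here, in the same way that a clustered results list shows fewer entries than the total number of hits.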


13. To save the two lowest RMSD hits for further work outside CrossMiner, click the tick boxes to the left of 2XU1 and 2XU3 to mark them. Marking hits does not display them and, similarly, displaying them (via shift or ctrl) does not mark them.

14. Click File to display several save options, which will save the visible hits (in the display area), all hits, or the marked hits. For this step, click File >> Save Marked Hits. Save your hits as cathepsin_hits.mol2.

A mol2 file will contain protein residue information, whereas an SDF will not.


Example 2: Editing a Pharmacophore

One of the powers of CrossMiner is the ability to manually interact with and edit a pharmacophore at any time (even while a search is running). For this example, you will be editing the cathepsin L pharmacophore. For ease, start with the default cathepsin L pharmacophore. Reload it as per step 1 in Example 1. If the search starts again, just click stop.

1. Make sure you are in Edit Pharmacophore mode. Right click on the hydrophobic feature to bring up the editing menu. Click Morph Into and then water. You will also need to click Any Molecule, as the water is technically not a part of the protein or small molecule.

2. You’ll notice that the name of the feature does not change. You will need to do this manually by right clicking on the feature and selecting Change Description. Call it water.

The right click menu of features is where a number of things are defined. Use this menu to:

• Define where the feature belongs: a protein, a small molecule, or any.
• Add constraints (Constrain to).
• Change the feature type (Morph Into).
• Change the label of the feature (Change Description).
• Delete a feature (Delete Feature).

3. Change the radius of the virtual ‘acceptor_projected’ (V) from 1.20 to 1.00, and the water radius from 1.50 to 1.00. Run the search.

4. Reducing the feature size still returns a large number of hits. Select one of the low RMSD results and visualize it.

CrossMiner allows you to visually interact with the pharmacophore. You can edit the features numerically in the feature browser, and also manually in the viewer. While this is not a rigorous approach, it is helpful for exploring chemical space.

5. Mouse over the water feature, then click and drag using the middle mouse button. This will change the radius of the water sphere. A larger sphere allows for more flexibility in water placement around the ligand. If you have trouble getting this to work, ensure that the Interactive editing mode button is on (see step 7 below). Change the radius back to 1.00 in the feature browser.

Whenever you edit the pharmacophore, the current result will disappear (as this starts a new search). To edit a pharmacophore overlaid with a molecule in the viewing area, you will need to first set the molecule as a reference.

6. Right click on a hit in the results browser, and click Use as reference. This will load the molecule into the viewer, with all features defined. This means that projection lines between base and virtual features are shown, as well as centroids and projected aromatic virtual points.

7. Switch to interactive editing by clicking on the Interactive editing button. An open hand means editing is off, and a closed hand means editing is on.

[Figure: the Interactive editing button in its off (open hand) and on (closed hand) states; a reference molecule with projected virtual points and a centroid labelled]

8. Drag one of the heavy atom features to a nearby carbon atom. Note that the search starts as soon as you let it go. The reference molecule stays visible even while the search runs. To hide the reference molecule, click the tick box next to reference in the visibility bar. This bar controls the visibility of the reference molecule, hits, constraints, and labels.

Every CrossMiner user has accidentally grabbed a pharmacophore point and dragged it across the screen when they meant to rotate the molecule viewer. There are two major tricks to avoiding this:

• Be sure to turn off ‘interactive editing’ mode when you are not using it.
• Set ‘Picking mode’ back to Pick atoms when you are done editing your pharmacophore.

Ctrl+z (or Edit >> Undo) is very helpful. It will undo the last change made to the pharmacophore. However, the search will start over again, so form good CrossMiner habits early!

9. You can also add new features from the feature browser. Scroll to fluorine in the feature browser and right click, then click Create fluorine. This will drop the new feature into the canvas.

10. Pick a point you would like to align the fluorine to. With Interactive editing mode still on, drag the new fluorine near to the reference point. When the new fluorine feature is clearly closer to that reference point than to any other reference point, right-click the feature, and select Snap To Atom. This will align the feature to the nearest atomic reference point.

This new search probably won’t find any results. This step was just to demonstrate how to add and manoeuvre new features.

11. Save your edited pharmacophore by clicking File >> Save Pharmacophore. Name the file pharmacophore_edited.cm. There are two options for saving a pharmacophore, as defined below.

Save Pharmacophore: Will save the current pharmacophore in a CrossMiner file format (cm).
Save Pymol Pharmacophore: Will save the current pharmacophore in a (py) file format. This is a purely visual format with no scientific information beyond labels (no atom typing, etc).
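The Snap To Atom behaviour in step 10 (align the feature to the nearest atomic reference point) is essentially a nearest-neighbour lookup. The sketch below assumes atoms are given as labelled 3D coordinates; the labels and coordinates are made up, and this is not CrossMiner's own code.

```python
import math


def snap_to_atom(feature_xyz, atoms):
    """Return (label, xyz) of the atom nearest to the feature position.

    atoms: dict mapping an atom label to its (x, y, z) coordinates.
    """
    label = min(atoms, key=lambda name: math.dist(feature_xyz, atoms[name]))
    return label, atoms[label]


reference_atoms = {
    "C1": (0.0, 0.0, 0.0),
    "C2": (1.5, 0.0, 0.0),
    "N1": (0.0, 1.4, 0.0),
}

dragged_fluorine = (1.2, 0.2, 0.0)  # dropped close to C2
label, snapped_xyz = snap_to_atom(dragged_fluorine, reference_atoms)
print(label, snapped_xyz)
```

This also shows why the feature should be clearly closer to the intended atom than to any other before snapping: the nearest atom wins.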


Example 3: Creating a Pharmacophore from a Reference Molecule & Scaffold Hopping

A far more common use of CrossMiner is to create a pharmacophore from a reference molecule. This could mean building a pharmacophore from the ligand of a co-crystallized protein-ligand complex, or manually creating a CrossMiner pharmacophore from a set of atoms representing a pharmacophore created by another mechanism. If you already have work in your viewer, clear it by clicking Edit >> Clear Pharmacophore and Edit >> Clear Reference.

1. Click File >> Load Reference and select 2xu1_ligand.sdf from the workshop data folder. Note that the features are automatically assigned. This is based on the atom types and feature definitions that are part of the loaded database.

2. Right click on features of the reference molecule to define pharmacophore feature points, as you did above. Define the pharmacophore as shown in the figure.

[Figure: pharmacophore defined on the 2xu1 ligand, with callouts: planar_ring_projected (projected away from molecule); hydrophobic; donor_projected; acceptor_projected (oxygen projecting away from molecule)]

3. We want all of these features to be in the same molecule, and thus all to have intra constraints. Rather than manually setting these up, just click the Intra button on the toolbar to automatically constrain all the features.

[Figure: the Intra toolbar button (Set all features to intra constraints)]

Start the search. Immediately a lot of hits will be found in the CSD (as indicated by the 6-letter refcodes in the results browser). But for this experiment, we are only interested in matching ligands from the PDB. Stop the search.

4. Click the tick box next to csd536_crossminer in the database browser to disable searching the CSD. Restart the search and let it run to completion. Scroll through the results, sorted by RMSD. Clearly there are hits from the co-crystallized ligand, but there are also some interesting analogs.
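Telling CSD hits from PDB hits by their identifiers, as described above, can be sketched with two regular expressions. The patterns assume that CSD refcodes are six letters optionally followed by two digits, and that PDB-derived hits start with a four-character PDB code beginning with a digit (as in 2XU1_001); the handling of any suffix after an underscore is a guess, and the example refcodes are made up.

```python
import re

# Assumed identifier shapes; adjust if your results browser shows other forms.
CSD_REFCODE = re.compile(r"^[A-Z]{6}(\d{2})?(_.*)?$")
PDB_HIT = re.compile(r"^[0-9][A-Z0-9]{3}(_.*)?$")


def source_of(hit_id):
    """Classify a hit identifier as 'CSD', 'PDB' or 'unknown'."""
    hit_id = hit_id.upper()
    if CSD_REFCODE.match(hit_id):
        return "CSD"
    if PDB_HIT.match(hit_id):
        return "PDB"
    return "unknown"


hit_ids = ["ABCDEF", "ABCDEF01", "2XU1_001", "2xu3"]
pdb_only = [h for h in hit_ids if source_of(h) == "PDB"]
print(pdb_only)
```

Within CrossMiner itself, unticking the CSD database in the database browser is the simpler route; a filter like this is only useful when post-processing a saved hit list.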

This demonstrates how easily scaffold hopping can be carried out with CrossMiner. As another example, you will experiment with scaffold hopping of PD180970, the co-crystallized ligand of the ABL kinase domain.

5. Load the file 2hzi_ligand.sdf as you did previously. Define the pharmacophore as indicated. For the acceptor_projected, two points are indicated – ensure that both projected virtual points are set to features. Ensure that the features are all set to intra constraints, as above.

[Figure: pharmacophore defined on the 2hzi ligand, with callouts: planar_ring_projected; hydrophobic; exit_vector; acceptor_projected (2)]

For this search, leave both the CSD and PDB databases selected. You’ll see that there is no shortage of great results from the CSD to match this query. Try experimenting with narrowing this result set down. Try restricting some of the radii, or adding in a new feature that your chemistry experience says might be important.


Exercise 4: Building your Own Feature Database

For this exercise you will be building your own feature database from a set of ligands. These ligands do not have pre-configured features, and will rely on CrossMiner to detect and add the features.

1. Select Feature Database >> Create. Select ‘Add’, navigate to the tutorial data folder, and load fviia_ligands01.sdf.

2. You will also need to load the feature definitions – this step allows for flexibility in how features are defined, if you want to add custom features. Click on Add Substructures, and navigate to [CrossMiner Installation]\feature_definitions. Select every entry starting with features_, using the shift key. Click OK.

3. Click Create Feature Database, not OK, to create the database. It will ask you for a save location – indicate the tutorial data directory, and name it fviia_ligands01.feat. Once the indexing has completed, click OK to close the Create Feature Database window.

Try out the new database. Load it by clicking File >> Load Feature Database as you have before. Note that in the database browser, only your database is listed.

4. The simplest test is to see if you can pull back a molecule from the original dataset. Load a reference molecule from the workshop data directory (File >> Load Reference >> 4YT6_4JY.SDF). Define two ‘planar_ring_projected’ pharmacophore points as indicated in the figure. Make sure they have intra constraints. Start the search.

[Figure: reference molecule 4YT6_4JY with two planar_ring_projected pharmacophore points]

5. In CrossMiner version 1.2 you should get back 29 hits, which when clustered are shown as 5 results.

To custom define features (which are based on SMARTS), refer to the CrossMiner documentation.
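If you want to check which files step 2 of this exercise will pick up before selecting them in the dialog, the "every entry starting with features_" rule can be reproduced with a glob pattern. The sketch below runs against a temporary directory with made-up file names, since the actual contents of the feature_definitions folder are not listed in this tutorial.

```python
import tempfile
from pathlib import Path


def feature_definition_files(directory):
    """Return the sorted names of entries whose name starts with 'features_'."""
    return sorted(p.name for p in Path(directory).glob("features_*"))


# Stand-in for the feature_definitions folder; file names are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("features_a.txt", "features_b.txt", "readme.txt"):
        (Path(tmp) / name).touch()
    selected = feature_definition_files(tmp)

print(selected)
```

To use it on a real installation, pass the path of your own feature_definitions directory instead of the temporary one.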
