CrossMiner: Federated CSD and PDB 3D Searching
Version 1.0, July 2016
CrossMiner ver. 1.2
Table of Contents
Introduction
CrossMiner Terminology
Overview of CrossMiner
Example 1: Searching with a Pharmacophore
Example 2: Editing a Pharmacophore
Example 3: Creating a Pharmacophore from a Reference Molecule & Scaffold Hopping
Exercise 4: Building your Own Feature Database
Introduction
CrossMiner can be thought of as a pharmacophore-based query tool. However, it is much more powerful than traditional pharmacophore query tools, as it allows you to query not only databases of ligands but also proteins and protein-ligand interactions. CrossMiner includes a preconfigured database of biologically relevant subsets of the Cambridge Structural Database (CSD) and the Protein Data Bank (PDB). The pharmacophore used in the query is interactive, allowing you to edit it easily in a number of ways through a simple user interface. This delivers an interactive search experience with application areas in interaction searching, scaffold hopping, and the identification of novel fragments for specific protein environments.
The subset of the CSD included with CrossMiner consists of structures which are not organometallic, have an R-factor of at most 10%, have 3D coordinates, have no disorder, and are not polymeric (about 234,400 structures in total). The included PDB database is a curated Proasis database subset of all co-crystallized ligands, with the binding site defined as all molecules with an atom within a 6 Å radius of the ligand (210,200 binding sites from 58,300 PDB entries).
For further discussion, please refer to the CrossMiner documentation (in the installation directory) or the original paper: Korb O, Kuhn B, Hert J, Taylor N, Cole J, Groom C & Stahl M, "Interactive and Versatile Navigation of Structural Databases", J Med Chem, 2016, 59(9):4257, DOI: 10.1021/acs.jmedchem.5b01756
This tutorial is geared towards the novice CrossMiner user who has life science experience. It covers the primary features for searching across ligands, proteins, and ligand-protein interactions. Some of the results may vary, depending on your version of CrossMiner. When you have completed this tutorial, you should be able to build queries, execute searches, and create your own customized feature database.
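The five inclusion rules for the CSD subset can be read as a simple filter. The sketch below is only an illustration of that reading: the `Entry` record, its field names, and the refcodes are made up for the example and are not part of CrossMiner or any CCDC API.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical summary of one crystal structure (not a CCDC type)."""
    refcode: str
    is_organometallic: bool
    r_factor_percent: float
    has_3d_coordinates: bool
    has_disorder: bool
    is_polymeric: bool


def in_crossminer_subset(entry, max_r_factor=10.0):
    """Apply the five inclusion rules listed in the introduction."""
    return (
        not entry.is_organometallic
        and entry.r_factor_percent <= max_r_factor
        and entry.has_3d_coordinates
        and not entry.has_disorder
        and not entry.is_polymeric
    )


# Made-up entries, one passing and two failing a single rule each.
entries = [
    Entry("AAAAAA", False, 4.2, True, False, False),   # passes every rule
    Entry("BBBBBB", False, 12.5, True, False, False),  # R-factor above 10%
    Entry("CCCCCC", True, 3.1, True, False, False),    # organometallic
]
kept = [e.refcode for e in entries if in_crossminer_subset(e)]
print(kept)
```

Only the first entry survives the filter; the other two each break one rule.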
CrossMiner Terminology
CrossMiner uses several terms, some common to the field of drug discovery and some not. For ease, these terms are defined below:
Feature: a point (or vector) which represents a SMARTS query and, in the case of a vector, geometric rules. A feature could be in a protein, ligand, or pharmacophore.
Pharmacophore feature (or vector): simply a feature that has been selected to be part of a pharmacophore.
Exit vector: a feature that represents a bond of any type. This feature will be represented as two mesh spheres. In the case of CrossMiner, directionality in an exit vector does not matter.
Overview of CrossMiner
CrossMiner is a powerful tool with a simple user interface. This quick section will familiarise you with the basic functions and underlying data components before moving on to exploring some scientific questions.
1. Launch CrossMiner from the Start Menu > CCDC (Windows) or Applications/CCDC. This tutorial will refer to several areas in the tool, such as the 'results browser' and the 'pharmacophore and reference feature browser' (referred to from here on simply as the 'feature browser'). Note where these components are.
CrossMiner's underlying data is a database of crystal structure feature annotations. For this tutorial, we will use the included database, which holds subsets of the CSD and the PDB. You will notice that nothing is loaded when you open CrossMiner. This is intentional, so that you can load a customized database instead of the default one. To load the default database:
2. Click File >> Load Feature Database, and select: [CCDC installation]\CSD_CrossMiner1.2\databases\csd_pdb_crossminer.feat.
By default, CCDC installation paths are:
Windows: C:\Program Files\CCDC
Mac: /Applications/CCDC
Loading will take a few minutes, and even once the bar hits 100%, it will need a moment to initialise the structures.
3. Once loaded, you will see the CSD and PDB subset databases listed in the database selection window. You can load multiple databases, and use the tick boxes to indicate which databases should be searched.
4. You will also see a list of features in the bottom right, in the feature browser. These are the features used to generate these databases (reference column).
Example 1: Searching with a Pharmacophore
Human cathepsin L plays a major role in protein catabolism and is implicated in a number of pathological processes. As such, it is a common research target and serves as a good example for novel drug discovery. For this example, we will use a pharmacophore included with CrossMiner to explore the CSD and PDB for possible hits.
1. Load the cathepsin L pharmacophore by clicking File >> Load Pharmacophore and selecting: [CrossMiner installation]\example_pharmacophores\catl_s3.cm
2. This will load the pharmacophore into the viewing area. Take a moment to rotate the molecule and understand the different pharmacophore feature points:
P: Protein feature
S: Small molecule feature
Dashed line: intramolecular or intermolecular constraint. Constrained features must belong either to the same molecule as each other (intra, dashed green line) or to different molecules (inter, dashed red line).
Mesh sphere: the actual feature itself, where the sphere size represents the radius of tolerance.
Solid sphere: the projected virtual point representing the directionality of a hydrogen bond acceptor/donor. A feature can have more than one projected point. For example, an H-bond acceptor can have multiple preferred lone-pair projections.
Note that the color coding is defined in the feature browser. For example, hydrophobic features are green, hydrogen bond acceptors are red, and so on. The pharmacophore in your viewer has only one projected H-bond acceptor. These points correlate to the feature browser: B indicates the Base feature, and V indicates the accompanying Virtual point.
[Figure: annotated cathepsin L pharmacophore. Labels: Protein: H-bond acceptor feature (mesh) with projected directionality (solid); Protein: H-bond donor feature (mesh) with projected directionality (solid); Intramolecular constraint; Small molecule: heavy atoms constraint; Protein: hydrophobic feature constraint]
3. Sphere size is directly correlated to tolerance: the smaller the sphere, the lower the tolerance for geometric matching. You can see the radius of each feature (Å) in the feature browser. By default, the radius is 1 Å.
Note that the viewer can sometimes get quite crowded. An easy way to find your feature is to click its tick box in the feature browser off and on again.
4. Click the start button to begin searching across the databases for matches. As the search runs, you will see results populating the results browser window, as well as the progress bar with the total number of hits at top right.
5. This particular search returns a high number of hits, and may take a while. The sphere next to the start search button is green while the search is running. Pause the search by clicking the pause button when you feel you have enough hits (a few hundred will do). Do not stop the search, as that stops the entire process and removes all hits.
[Figure: toolbar icons for Start search and Pause search]
6. By default, all results are overlaid, which makes it easy to appreciate the common motifs matching the pharmacophore. However, this is messy. By clicking on each result in the browser, you can view them one at a time. Or, hold down the Shift or Ctrl key to select multiple results.
7. Locate the result with the lowest RMSD by clicking on the rmsd column in the browser to sort in ascending order. The lowest RMSD result in this example is 2HJ_001_1; yours may be different, depending on when you paused your search. These results all come back with a 2D diagram, with pharmacophore matches indicated. Select your lowest RMSD result by clicking on the row (it does not need to be ticked).
8. For ease of viewing, change the style in the upper left to Capped Sticks. Take a few moments to explore how the returned result matches the pharmacophore query. In particular, note:
• Feature matching is based on the size of the sphere (and thus the tolerance level). Tiny spheres with tight tolerance will result in hits with very close alignment to the centre of the sphere, while larger spheres have a wider area for alignment.
• There are no explicit hydrogens in the protein sites. Hydrogen bond donors and acceptors are defined in this database based on feature definitions, which come from expert knowledge about protonation and tautomer state. Thus, there is no way to display hydrogens, as they are inferred. See the reference paper listed in the introduction for more information.
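The link between sphere radius, matching, and RMSD can be made concrete with a small sketch. It assumes a hit matches when every matched point lies within its feature's tolerance sphere, and that RMSD is the root mean square of the centre-to-point distances. CrossMiner's actual matching and scoring may differ; the coordinates and function names are invented for the example.

```python
import math


def matches(centres, radii, hit_points):
    """True if every hit point lies inside its feature's tolerance sphere."""
    return all(
        math.dist(centre, point) <= radius
        for centre, radius, point in zip(centres, radii, hit_points)
    )


def rmsd(centres, hit_points):
    """Root mean square distance between feature centres and hit points."""
    squared = [math.dist(c, p) ** 2 for c, p in zip(centres, hit_points)]
    return math.sqrt(sum(squared) / len(squared))


# Two hypothetical feature centres and the matched points of one hit (in Å).
centres = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
hit = [(0.6, 0.0, 0.0), (3.0, 0.4, 0.0)]

loose = matches(centres, [1.0, 1.0], hit)   # 1 Å spheres
tight = matches(centres, [0.5, 0.5], hit)   # 0.5 Å spheres
print(loose, tight, round(rmsd(centres, hit), 3))
```

With 1 Å spheres the hit is accepted, but with 0.5 Å spheres the first point (0.6 Å from its centre) falls outside, so the same hit is rejected. This is why shrinking the radii reduces the number of hits and lowers the RMSD of those that remain.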
There are a lot of results, likely due to the large radius of the features. Lower the tolerance of the features to get a smaller result set with more precise alignment to the pharmacophore centres. To do this, you will need to edit the pharmacophore, which cannot be done when viewing results.
9. Click the stop button to clear the results and enable editing.
[Figure: toolbar icon for Stop search]
10. Ensure that Picking mode is set to Edit Pharmacophore.
11. In editing mode, you can double click on the radius sizes in the feature browser to change them. Change the radius of every feature to 0.5 to lower the tolerance. This will result in fewer hits, all of which are more closely aligned with the centres of the pharmacophore features.
Start the search on the new, tighter pharmacophore. It will take a few minutes to pick up hits, as there are far fewer. Let the search run to completion this time, resulting in 20 hits.
12. If cluster hits is selected, then the results are clustered based on the adjacent Tanimoto value. If clustered, you will see representatives of those similar groups in the results viewer (and thus fewer hits than were actually returned). Uncheck the box to see all hits, including those which are very similar to each other.
The lowest RMSD results here are 2XU1 and 2XU3, which are unsurprisingly cathepsin L PDB entries. If you explore the top result (2XU1_001), you'll see that the hit matches the pharmacophore almost exactly.
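To see why clustering shows fewer rows than the number of hits returned, the sketch below groups hits by Tanimoto similarity using a simple "leader" scheme. This is an assumption made for illustration, not CrossMiner's documented algorithm. The fingerprints are plain sets of made-up integers, and the hit names and 0.7 threshold are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints stored as sets of bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_hits(hits, threshold=0.7):
    """Group hits (best first) under the first sufficiently similar leader."""
    leaders = []   # (name, fingerprint) of each cluster representative
    clusters = {}  # representative name -> names of all cluster members
    for name, fingerprint in hits:
        for leader_name, leader_fp in leaders:
            if tanimoto(fingerprint, leader_fp) >= threshold:
                clusters[leader_name].append(name)
                break
        else:
            # No existing leader is similar enough: start a new cluster.
            leaders.append((name, fingerprint))
            clusters[name] = [name]
    return clusters


# Hypothetical hits, already sorted by RMSD.
hits = [
    ("hit_a", {1, 2, 3, 4}),
    ("hit_b", {1, 2, 3, 4, 5}),  # Tanimoto 0.8 to hit_a, so it joins hit_a
    ("hit_c", {7, 8, 9}),        # shares nothing with hit_a, new cluster
]
clusters = cluster_hits(hits)
print(clusters)
```

Three hits collapse to two displayed representatives here, which mirrors the way the results viewer lists fewer rows while the box is ticked.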
13. To save the two lowest RMSD hits for further work outside CrossMiner, click the tick boxes to the left of 2XU1 and 2XU3 to mark them. Marking them does not display them and, similarly, displaying them (via Shift or Ctrl) does not mark them.
14. Click File to display several save options, which will save the visible hits (in the display area), all hits, or the marked hits. For this step, click File >> Save Marked Hits. Save your hits as cathepsin_hits.mol2.
A mol2 file will contain protein residue information, whereas an SDF will not.
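The residue information in a mol2 file lives in the substructure columns of its @<TRIPOS>ATOM records. The minimal sketch below lists the substructure names found in a mol2 string. The sample record is invented for the example, and the exact content CrossMiner writes to cathepsin_hits.mol2 is not assumed here.

```python
def residue_names(mol2_text):
    """Return the distinct substructure names in the ATOM section, in order."""
    names = []
    in_atoms = False
    for line in mol2_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@<TRIPOS>"):
            # Only lines following the ATOM header are atom records.
            in_atoms = stripped == "@<TRIPOS>ATOM"
            continue
        if in_atoms and stripped:
            fields = stripped.split()
            # Columns: id, name, x, y, z, type, subst_id, subst_name, charge
            if len(fields) >= 8 and fields[7] not in names:
                names.append(fields[7])
    return names


# Made-up mol2 fragment with two protein residues and one ligand.
sample = """@<TRIPOS>MOLECULE
example
4 0 3
SMALL
NO_CHARGES

@<TRIPOS>ATOM
1 N 0.000 0.000 0.000 N.3 1 GLY25 0.0
2 CA 1.450 0.000 0.000 C.3 1 GLY25 0.0
3 SG 5.000 1.000 0.000 S.3 2 CYS26 0.0
4 C1 6.000 2.000 1.000 C.ar 3 LIG1 0.0
@<TRIPOS>BOND
"""
found = residue_names(sample)
print(found)
```

An SDF atom block has no equivalent columns, which is why the residue labels are lost when hits are saved in that format.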
Example 2: Editing a Pharmacophore
One of the powers of CrossMiner is the ability to manually interact with and edit a pharmacophore at any time (even while a search is running). For this example, you will be editing the cathepsin L pharmacophore. For ease, start with the default cathepsin L pharmacophore. Reload it as per step 1 in Example 1. If the search starts again, just click stop.
1. Make sure you are in Edit Pharmacophore mode. Right click on the hydrophobic feature to bring up the editing menu. Click Morph Into and then water. You will also need to click Any Molecule, as the water is technically not a part of the protein or small molecule.
2. You'll notice that the name of the feature does not change. You will need to change it manually by right clicking on the feature and selecting Change Description. Call it water.
The right click menu of features is where a number of things are defined. Use this menu to:
• Define where the feature belongs: a protein, a small molecule, or any.
• Add constraints (Constrain to).
• Change the feature type (Morph Into).
• Change the label of the feature (Change Description).
• Delete a feature (Delete Feature).
3. Change the radius of the virtual 'acceptor_projected' (V) from 1.20 to 1.00, and the water radius from 1.50 to 1.00. Run the search.
4. Reducing the feature size still returns a large number of hits. Select one of the low RMSD results and visualize it.
CrossMiner allows you to visually interact with the pharmacophore. You can edit the features numerically in the feature browser, and also manually in the viewer. While this is not a rigorous approach, it is helpful for exploring chemical space.
5. Mouse over the water feature, then click and drag using the middle mouse button. This will change the radius of the water sphere. A larger sphere allows for more flexibility in water placement around the ligand. If you have trouble getting this to work, ensure that the Interactive editing mode button is on (see step 7 below). Change the radius back to 1.00 in the feature browser.
Whenever you edit the pharmacophore, the current result will disappear (as this starts a new search). To edit a pharmacophore overlaid with a molecule in the viewing area, you will need to first set the molecule as a reference.
6. Right click on a hit in the results browser, and click Use as reference. This will load the molecule into the viewer, with all features defined. This means that projection lines between base and virtual features are shown, as well as centroids and projected aromatic virtual points.
7. Switch to interactive editing by clicking on the Interactive editing button. An open hand means editing is off, and a closed hand means editing is on.
8. Drag one of the heavy atom features to a nearby carbon atom. Note that the search starts as soon as you let it go. The reference molecule stays visible even while the search runs. To hide the reference molecule, click the tick box next to reference in the visibility bar. This bar controls the visibility of the reference molecule, hits, constraints, and labels.
Every CrossMiner user has accidentally grabbed a pharmacophore point and dragged it across the screen when they meant to rotate the molecule viewer. There are two major tricks to avoiding this:
• Be sure to turn off 'Interactive editing' mode when you are not using it.
• Set 'Picking mode' back to Pick atoms when you are done editing your pharmacophore.
Ctrl+Z (or Edit >> Undo) is very helpful. It will undo the last change made to the pharmacophore. However, the search will start over again, so form good CrossMiner habits early!
[Figure: Interactive editing button states (off and on), and a reference molecule showing projected virtual points and a centroid]
9. You can also add new features from the feature browser. Scroll to fluorine in the feature browser and right click, then click Create fluorine. This will drop the new feature into the canvas.
10. Pick a point you would like to align the fluorine to. With Interactive editing mode still on, drag the new fluorine near to the reference point. When the new fluorine feature is clearly closer to that reference point than to any other reference point, right-click the feature and select Snap To Atom. This will align the feature to the nearest atomic reference point.
This new search probably won't find any results. This step was just to demonstrate how to add and manoeuvre new features.
11. Save your edited pharmacophore by clicking File >> Save Pharmacophore. Name the file pharmacophore_edited.cm. There are two options for saving a pharmacophore, as defined below.
Save Pharmacophore: Will save the current pharmacophore in the CrossMiner file format (cm).
Save Pymol Pharmacophore: Will save the current pharmacophore in a (py) file format. This is a visual format with no scientific information beyond labels (no atom typing, etc.).
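The Snap To Atom action in step 10 amounts to a nearest-neighbour lookup, which is why the feature must be clearly closest to the intended atom before snapping. The sketch below illustrates that idea only; the atom labels and coordinates are made up, and the function is not part of CrossMiner.

```python
import math


def snap_to_nearest_atom(feature_xyz, atoms):
    """Return the (label, xyz) pair of the atom closest to the feature."""
    return min(atoms, key=lambda atom: math.dist(feature_xyz, atom[1]))


# Hypothetical reference atoms (label, coordinates in Å).
atoms = [
    ("C1", (0.0, 0.0, 0.0)),
    ("C2", (1.5, 0.0, 0.0)),
    ("O1", (1.5, 1.2, 0.0)),
]

# A fluorine feature dropped roughly on top of C2.
label, snapped_xyz = snap_to_nearest_atom((1.3, 0.2, 0.1), atoms)
print(label, snapped_xyz)
```

The dragged feature sits 0.3 Å from C2 and more than 1 Å from the other atoms, so it snaps to C2 and takes that atom's coordinates.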
Example 3: Creating a Pharmacophore from a Reference Molecule & Scaffold Hopping
A far more common use of CrossMiner is to create a pharmacophore from a reference molecule. This could mean building a pharmacophore from the ligand of a co-crystallized protein-ligand complex, or manually creating a CrossMiner pharmacophore from a set of atoms representing a pharmacophore created by another mechanism. If you already have work in your viewer, clear it by clicking Edit >> Clear Pharmacophore and Edit >> Clear Reference.
1. Click File >> Load Reference and select 2xu1_ligand.sdf from the workshop data folder. Note that the features are automatically assigned. This is based on the atom types and feature definitions that are part of the loaded database.
2. Right click on features of the reference molecule to define pharmacophore feature points, as you did above. Define the pharmacophore as shown in the image at right.
3. We want all of these features to be in the same molecule, and thus all to have intra constraints. Rather than manually setting these up, just click the Intra button on the toolbar to automatically constrain all the features.
Start the search. A lot of hits will immediately be found in the CSD (as indicated by the 6-letter refcodes in the results browser). But for this experiment, we are only interested in matching ligands from the PDB. Stop the search.
4. Click the tick box next to csd536_crossminer in the database browser to disable searching the CSD. Restart the search and let it run to completion. Scroll through the results, as sorted by RMSD. Clearly there are hits from the co-crystallized ligand, but there are also some interesting analogs.
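Telling CSD hits from PDB hits by their identifiers, as described above, can be sketched with two patterns: CSD refcodes are six letters (optionally followed by two digits), and PDB IDs are four characters starting with a digit. The assumption that a hit name begins with the identifier followed by an underscore is inferred from names such as 2XU1_001. The refcode ABCDEF is a made-up placeholder.

```python
import re

CSD_REFCODE = re.compile(r"^[A-Z]{6}(\d{2})?$")  # e.g. ABCDEF or ABCDEF01
PDB_ID = re.compile(r"^[0-9][A-Z0-9]{3}$")       # e.g. 2XU1


def source_of(hit_name):
    """Guess the source database from the leading identifier of a hit name."""
    identifier = hit_name.split("_")[0].upper()
    if CSD_REFCODE.match(identifier):
        return "CSD"
    if PDB_ID.match(identifier):
        return "PDB"
    return "unknown"


names = ["2XU1_001", "2XU3_001", "ABCDEF", "ABCDEF01"]
sources = [source_of(name) for name in names]
print(sources)
```

Unticking the CSD database in the database browser has the same effect at search time. A check like this is only useful afterwards, for example when sorting hits that have already been saved.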
[Figure: pharmacophore definition on the 2xu1 ligand. Labels: planar_ring_projected (projected away from molecule); hydrophobic; donor_projected; acceptor_projected (oxygen projecting away from molecule). Toolbar button: Set all features to intra constraints]
This demonstrates how easily scaffold hopping can be carried out with CrossMiner. As another example, you will experiment with scaffold hopping of PD180970, the co-crystallized ligand of the ABL kinase domain.
5. Load the file 2hzi_ligand.sdf as you did previously. Define the pharmacophore as indicated. For the acceptor_projected, two points are indicated; ensure that both projected virtual points are set to features. Ensure that the features are all set to intra constraints, as above.
For this search, leave both the CSD and PDB databases selected. You'll see that there is no shortage of great results from the CSD to match this query. Try experimenting with narrowing this result set down. Try restricting some of the radii, or adding in a new feature that your chemistry experience says might be important.
[Figure: pharmacophore definition on the 2hzi ligand. Labels: planar_ring_projected; hydrophobic; exit_vector; acceptor_projected (2)]
Exercise 4: Building your Own Feature Database
For this exercise you will be building your own feature database from a set of ligands. These ligands do not have pre-configured features, and will rely on CrossMiner to detect and add the features.
1. Select Feature Database >> Create. Select 'Add', navigate to the tutorial data folder, and load fviia_ligands01.sdf.
2. You will also need to load the feature definitions. This step allows for flexibility in how features are defined, if you want to add custom features. Click on Add Substructures, and navigate to [CrossMiner Installation]\feature_definitions. Select every entry starting with features_, using the Shift key. Click OK.
3. Click Create Feature Database, not OK, to create the database. It will ask you for a save location; indicate the tutorial data directory, and name it fviia_ligands01.feat. Once the indexing has completed, click OK to close the Create Feature Database window.
Try out the new database. Load it by clicking File >> Load Feature Database as you have before. Note that in the database browser, only your database is listed.
4. The simplest test is to see if you can pull back a molecule from the original dataset. Load a reference molecule from the workshop data directory (File >> Load Reference >> 4YT6_4JY.SDF). Define two 'planar_ring_projected' pharmacophore points as indicated in the diagram at right. Make sure they have intra constraints. Start the search.
5. In CrossMiner version 1.2 you should get back 29 hits, which when clustered are shown as 5 results.
To custom define features (which are based on SMARTS), refer to the CrossMiner documentation.
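If you want to check outside the GUI which files step 2 would select, the "every entry starting with features_" rule is a simple prefix match. The sketch below demonstrates it on a temporary folder with invented file names. The real contents and file extensions of the feature_definitions folder are not assumed.

```python
import tempfile
from pathlib import Path


def feature_definition_files(folder):
    """Return the sorted names of files in folder that start with 'features_'."""
    return sorted(p.name for p in Path(folder).glob("features_*") if p.is_file())


# Build a throwaway folder with made-up file names to demonstrate the rule.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("features_one.txt", "features_two.txt", "notes.txt"):
        (Path(tmp) / name).write_text("placeholder")
    selected = feature_definition_files(tmp)

print(selected)
```

Pointing the function at your own feature_definitions folder would list the entries to Shift-select in the Add Substructures dialog.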
[Figure: reference ligand with the two pharmacophore points marked. Label: planar_ring_projected]