Transcript
Page 1: Exclude blast hits - University at Buffaloubwp.buffalo.edu/.../uploads/sites/5/2017/02/Exclude_blast_hits.pdf · exclude such hits from your searches. I will use a gene from Clostridium

What do I do if my BLAST searches seem to have all the top hits from the same genus or species?

If the bacterial species you are using to annotate is clinically significant or of great research interest, you may find that when you perform BLAST searches (particularly in nr) you seemingly only get hits that are different strains or isolates of the same species. This doesn’t give you much information about how well conserved the protein on which you are working is compared to proteins in other genera. There is a method to modify BLAST to let you exclude such hits from your searches. I will use a gene from Clostridium botulinum as an example to illustrate this, using the protein sequence of the gene with the locus tag CLJ_B3418. Figure 1 shows the top nr BLAST hits for this protein. You can easily see that all of the hits but one are from Clostridium botulinum, with very high levels of coverage and identity. They are essentially all the same protein from different isolates of Clostridium botulinum.

Figure 1. The BLAST results using a non-filtered nr BLAST search for CLJ_B3418.

The BLAST search can be set up slightly differently to prevent this problem from occurring. As noted in Figure 2, we can set the search up to exclude, in this case, the taxid:1485 (Clostridium). The taxid number stands for the NCBI Taxonomy ID number. By excluding the taxid number 1485, all BLAST hits in that taxonomic classification will be left out of the results. To do this, we type the genus name Clostridium in the Organism text box below the sequence input box. As you type, a pull-down menu of options will appear, and you can simply click on the one you want to select (see the highlighted menu item in Figure 2). Then simply click the Exclude checkbox next to the organism name and select BLAST.
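
For readers who submit searches from a script instead of the web form, the same exclusion can usually be expressed as an Entrez query restriction. The following is only a sketch under assumptions: it relies on the Entrez term form `txid<number>[ORGN]` and the `all[filter] NOT ...` pattern, and the helper name `build_exclusion_query` is hypothetical, not part of any NCBI tool. It only builds the query string and does not contact NCBI.

```python
def build_exclusion_query(taxids):
    """Build an Entrez query string that excludes the given NCBI taxids.

    Assumes the Entrez term form 'txid<number>[ORGN]'; this helper is
    illustrative and is not an NCBI function.
    """
    terms = [f"txid{int(t)}[ORGN]" for t in taxids]
    if not terms:
        raise ValueError("at least one taxid is required")
    if len(terms) == 1:
        return f"all[filter] NOT {terms[0]}"
    # Several taxids: exclude anything that matches any one of them.
    return "all[filter] NOT (" + " OR ".join(terms) + ")"


# Exclude the genus Clostridium (taxid 1485), as in the web form example.
print(build_exclusion_query([1485]))
```

A string built this way could then be pasted into an Entrez query field or passed to a client that accepts one (for example, Biopython's `NCBIWWW.qblast` has an `entrez_query` argument), but check the documentation of whichever tool you use before relying on it.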

Figure 2. Setting up a BLAST search to exclude the Clostridium taxid:1485.

Figure 3 shows the results of the BLAST search for the same protein AFTER excluding the Clostridium taxid 1485. Note the different names appearing in the search results. However, also note that the top hit is no longer the one that matches the protein under investigation in the species you are working on. Thus you would take the FIRST nr hit as the top hit in this case instead of skipping over the first one. Note also that if you use this BLAST result to select sequences for the T-Coffee alignment that you will subsequently do in the Sequence Based Similarity Module, you will need to add the FASTA-formatted sequence of the protein under investigation to the top of the list before constructing the alignment. Students should also add a comment in their textbook noting which taxid number was excluded from their search.

Experiment with different levels of exclusion: exclude only one species, exclude a whole group (i.e., the genus), or choose something in between by excluding several specific species. To exclude several, use the + option to the right of the Exclude checkbox to add another organism box for each additional choice.
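
Because the query protein no longer appears among the hits after exclusion, it has to be put back at the top of the sequence list by hand. A minimal sketch of that step follows; the function name `prepend_query` and the short sequences are made-up placeholders (not the real CLJ_B3418 sequence or real hits), used only to show the order of records.

```python
def prepend_query(query_fasta, hits_fasta):
    """Return one FASTA text with the query record first, then the hits."""
    query = query_fasta.strip()
    hits = hits_fasta.strip()
    if not query.startswith(">"):
        raise ValueError("query must be FASTA formatted (header starts with '>')")
    if not hits:
        return query + "\n"
    return query + "\n" + hits + "\n"


# Placeholder records for illustration only.
query_record = ">CLJ_B3418 protein under investigation\nMKKLLAAA\n"
hit_records = ">hit_1 placeholder\nMKKLLGAA\n>hit_2 placeholder\nMKRLLAAA\n"

# The combined text is what would be pasted into the alignment tool.
print(prepend_query(query_record, hit_records))
```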

Figure 3. The nr BLAST results for CLJ_B3418 AFTER excluding the Clostridium taxid:1485. Note the different genus and species names of the top hits.

You can also use the NCBI Taxonomy Browser to find different levels of taxid to use in your exclusion searches, especially if it is not clear what you should choose from the pull-down menu in BLAST. The use of the Taxonomy Browser is described in general terms in the Horizontal Gene Transfer section of the project manual. Briefly, go to https://www.ncbi.nlm.nih.gov/taxonomy, enter the name of your organism’s genus in the search window (Clostridium botulinum is the example used below), and click on Search. A result similar to Figure 4 will display. Click on the organism hyperlink in blue, and you will be taken to the full lineage of the organism (next figure).

Figure 4. Results of searching for Clostridium botulinum in the NCBI Taxonomy Browser.

Figure 5 below shows a portion of the C. botulinum results. In the lineage line, the last entry is the genus (Clostridium), but you can hover the cursor over any of the levels of taxonomy and see the name of the level (i.e., family, order, etc.). Figure 6 shows what will display when the Clostridium hyperlink is selected.

Figure 5. The Clostridium botulinum results from the NCBI Taxonomy Browser (not complete).

Of interest in the results from clicking on the Clostridium hyperlink, displayed in Figure 6, is the Taxonomy ID of 1485 (exactly the one we found by limiting the BLAST results from within the BLAST tool). You could use this information to simply type “Clostridium (taxid:1485)” (do not, however, include the quotation marks) in the organism window of the BLAST search and click Exclude as before.
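
The text typed into the organism box follows a simple pattern, a name followed by the taxid in parentheses. As a small sketch (the function names `organism_entry` and `parse_organism_entry` are hypothetical helpers, not BLAST features), the pattern can be generated and checked like this:

```python
import re


def organism_entry(name, taxid):
    """Format a name and taxid the way the text above types them."""
    return f"{name} (taxid:{int(taxid)})"


# Matches text such as 'Clostridium (taxid:1485)'.
_ENTRY = re.compile(r"^\s*(?P<name>.+?)\s*\(taxid:(?P<taxid>\d+)\)\s*$")


def parse_organism_entry(text):
    """Split an entry back into (name, taxid); raise if the form is wrong."""
    match = _ENTRY.match(text)
    if match is None:
        raise ValueError(f"not in 'Name (taxid:N)' form: {text!r}")
    return match.group("name"), int(match.group("taxid"))


print(organism_entry("Clostridium", 1485))
print(parse_organism_entry("Clostridium (taxid:1485)"))
```

Recording the parsed taxid alongside your results is one way to keep track of which exclusion was used for a given search.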

Figure 6. The Clostridium genus taxid information.

We can also go further “up” in the taxonomic window to exclude more than one genus (though you should not have to do that routinely). For example, Figure 7 shows the display that would come up if we clicked on the Clostridiaceae hyperlink (i.e., the family to which the genus Clostridium belongs) instead of the Clostridium hyperlink. Different genera that are part of this family will appear. Clicking on the Clostridiaceae link from this page will result in the information shown in Figure 8.

Figure 7. The display resulting from selection of the Family Clostridiaceae in the Taxonomy Browser.

Here we see that the Clostridiaceae family has the Taxonomy ID of 31979. To exclude this family from the BLAST results, we would simply type “Clostridiaceae (taxid:31979)” into the organism box in the BLAST search and click Exclude. The next image shows how the autofill option will highlight once we paste in the taxid.
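
Choosing how far "up" to exclude amounts to picking one level from the organism's lineage. The sketch below illustrates that choice with a hand-entered lineage containing only the two levels and taxids quoted in this handout; the names `LINEAGE` and `taxon_at_rank` are hypothetical, and a real lineage would have more levels, which you would look up in the Taxonomy Browser.

```python
# Partial lineage for the example organism, broadest level first.
# Only the levels and taxids quoted in this handout are listed.
LINEAGE = [
    ("family", "Clostridiaceae", 31979),
    ("genus", "Clostridium", 1485),
]


def taxon_at_rank(lineage, rank):
    """Return (name, taxid) for the requested rank in the lineage."""
    for level, name, taxid in lineage:
        if level == rank:
            return name, taxid
    raise KeyError(f"rank {rank!r} is not in the lineage")


# Moving from genus to family widens the exclusion.
for rank in ("genus", "family"):
    name, taxid = taxon_at_rank(LINEAGE, rank)
    print(f"{rank}: {name} (taxid:{taxid})")
```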

Figure 8. The taxonomy identification number of the Family Clostridiaceae.

Figure 9. A BLAST search set up to exclude members of the Family Clostridiaceae from the search results.

Finally, Figure 10 shows the BLAST results from doing the exclusion at this level.

Figure 10. BLAST results after excluding the Family Clostridiaceae from the search.

