blast for windows
Standalone BLAST Setup for Windows PC
Tao Tao, [email protected]
Introduction
In addition to providing BLAST sequence alignment services on the web, NCBI also makes sequence alignment utilities available for download through FTP. This allows BLAST searches to be performed on local platforms against databases either downloaded from NCBI or created locally. These utilities are run through command (DOS-like) windows and accept text-based commands. There is no graphical user interface.
The following tutorial discusses the steps needed to install the NCBI C++ based BLAST programs (blast+) and a sample NCBI database on PCs running Windows operating systems. The installation of the soon-to-be-deprecated C-based BLAST package (legacy blast) is discussed only briefly at the end.
Downloading
The blast+ software package is available as two self-extracting archives. The ncbi-blast-#.#.#+-win32.exe archive is compatible with PCs running Windows XP, 32-bit Vista or Windows 7. The other archive, ncbi-blast-#.#.#+-win64.exe, is for PCs with 64-bit chipsets running Windows XP 2003, 64-bit Vista or Windows 7. In both cases, "#.#.#" denotes the current version number of the package. Archives with the same base name are equivalent.
Note: the archive with the ".tar.gz" file extension does not have the installer function. The discussion below focuses on archives with an ".exe" extension.
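The naming convention above can be captured in a few lines. The following is a minimal sketch and not part of the blast+ package: the function names archive_name and machine_is_64bit are hypothetical, and the 64-bit check is a heuristic assumption based on the machine string that Python reports.

```python
import platform


def archive_name(version: str, is_64bit: bool) -> str:
    """Build the installer file name for a blast+ version such as "2.2.23"."""
    suffix = "win64" if is_64bit else "win32"
    return f"ncbi-blast-{version}+-{suffix}.exe"


def machine_is_64bit() -> bool:
    """Heuristic (an assumption): 64-bit machine strings such as AMD64 end in "64"."""
    return platform.machine().endswith("64")


if __name__ == "__main__":
    # Print the archive name that matches the machine running this script.
    print(archive_name("2.2.23", machine_is_64bit()))
```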
Steps
Steps to download the package are described below.
• Point a browser to this ftp directory: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST/
• Right click on a desired archive and select "Save link as…" from the popup menu
• In the prompt, switch to a desired directory (folder) and click the "Save" button to save the archive to a desired location on the local disk
The above steps, as sample screenshots for the "ncbi-blast-2.2.23+-win32.exe" archive, are given in Figure 2, where the first two steps are demonstrated by the top panel and the last step is demonstrated by the bottom panel.
Examples
Example downloading steps are depicted by screen shots in Figure 2a and 2b.
Installation
The blast+ archive downloaded above contains a built-in installer. Double-click the archive to start it. After the license agreement is accepted, the installer prompts for an installation directory. The default installation directory in this test case is "C:\blast-2.2.23+". After the "Install" button is clicked, the installer creates a "doc" subdirectory containing a comprehensive user manual in pdf format, an "uninstaller" for future removal of the installation, and a "bin" subdirectory where the following BLAST programs and accessory utilities are kept.
blastdbcheck.exe      blastdbcmd.exe          blastdb_aliastool.exe
blastn.exe            blastp.exe              blastx.exe
blast_formatter.exe   convert2blastmask.exe   dustmasker.exe
legacy_blast.pl       makeblastdb.exe         makembindex.exe
psiblast.exe          rpsblast.exe            rpstblastn.exe
segmasker.exe         tblastn.exe             tblastx.exe
update_blastdb.pl     windowmasker.exe
For the function of each program or utility, refer to Table x in Standalone BLAST setup for Unix.
Test BLAST database
In addition to BLAST programs and accessory utilities, target databases are also a key component of BLAST searches. The common set of pre-formatted NCBI BLAST databases is available as compressed archives from the NCBI ftp site, or databases can be prepared de novo from custom FASTA sequences locally using the makeblastdb utility. For more effective database management, a "db" subdirectory should be created under the "C:\blast-2.2.23+\" directory.
Steps for downloading preformatted BLAST databases from NCBI are:
• point a browser to ftp://ftp.ncbi.nlm.nih.gov/blast/db/
• right-click on a desired file (refseq_rna.tar.gz in this example case)
• select "Save link as …" from the popup menu
• when prompted, use "Save in" to change the directory to "C:\blast-2.2.23+\db\"
Procedures similar to those in Figure 2 can be used to download the BLAST databases.
This downloaded database is blast-ready after inflation and extraction with a decompression utility. Figure 3 below shows an example inflation/extraction procedure using WinZip.
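As an alternative to a graphical decompression utility, the same inflation and extraction can be scripted. The sketch below uses Python's standard tarfile module; the function name extract_database is hypothetical, and using Python here is an assumption rather than something the tutorial prescribes.

```python
import tarfile
from pathlib import Path


def extract_database(archive: str, db_dir: str) -> list:
    """Inflate and extract a .tar.gz database archive into db_dir.

    Returns the sorted names of the archive members.
    """
    target = Path(db_dir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        names = sorted(tar.getnames())
        # Use the safer "data" extraction filter where this Python supports it.
        extra = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(target, **extra)
    return names
```

For the example in this tutorial, the call would be extract_database(r"C:\blast-2.2.23+\db\refseq_rna.tar.gz", r"C:\blast-2.2.23+\db").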
A utility included in the blast+ package, update_blastdb.pl, is designed for regular downloading of preformatted BLAST databases from NCBI. It requires the installation of the Perl package and execution from the command prompt under the "C:\blast-2.2.23+\db\" directory. The base command is:
perl update_blastdb.pl --passive base_database_name
where "base_database_name" is the name of the target database, without the "##.tar.gz" extension.
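To illustrate what "without the ##.tar.gz extension" means, the sketch below derives the base name from an archive file name and assembles the command shown above. The helper names are hypothetical, and the pattern assumes that a volume number, when present, appears as digits directly before ".tar.gz" (for example "refseq_rna.00.tar.gz").

```python
import re


def base_database_name(file_name: str) -> str:
    """Strip an optional ".##" volume number and the ".tar.gz" extension."""
    return re.sub(r"(\.\d+)?\.tar\.gz$", "", file_name)


def update_command(file_name: str) -> list:
    """Argument list for the update_blastdb.pl call shown above."""
    return ["perl", "update_blastdb.pl", "--passive", base_database_name(file_name)]
```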
Configuration
For smooth execution of blast+, the PC must be configured to recognize BLAST programs under the "C:\blast-2.2.23+\bin\" directory. To do this, a user environment variable named Path needs to be created with "C:\blast-2.2.23+\bin\" as its value. The blast+ installer automatically creates a BLASTDB environment variable, which points to the "C:\blast-2.2.23+\" directory. For this setup with a separate "db" directory created for BLAST databases, the value of the BLASTDB environment variable must be modified to "C:\blast-2.2.23+\db\".
Environment Variables
Steps to create or modify environment variables are summarized below:
• Click on "Start" >> "Settings" >> "Control Panel"
• In the "Control Panel" window, double click the "System" icon
• In the popup, click on the "Advanced" tab
• Click the "Environment Variables" button
• Click the "New" button under the "User variable for ..." panel
• Type the environment variable name and enter the absolute path
• Click "OK" to close the prompts
Example Screen Shots
Screen shots of these steps are given in Figure 4a, 4b, 4c, and 4d.
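Once the variables are set, a short script can confirm that they hold the intended values. This is a minimal sketch under stated assumptions: the function name check_environment is hypothetical, and the comparison ignores case and trailing backslashes, which matches how Windows treats paths.

```python
import ntpath


def _norm(path: str) -> str:
    """Normalize a Windows path: drop trailing slash, unify case and separators."""
    return ntpath.normcase(ntpath.normpath(path))


def check_environment(env: dict, bin_dir: str, db_dir: str) -> list:
    """Return a list of problems; an empty list means the setup looks right."""
    problems = []
    # Windows variable names are case-insensitive, so compare them upper-cased.
    upper = {key.upper(): value for key, value in env.items()}
    path_entries = [_norm(p) for p in upper.get("PATH", "").split(";") if p]
    if _norm(bin_dir) not in path_entries:
        problems.append("Path does not include " + bin_dir)
    if _norm(upper.get("BLASTDB", "")) != _norm(db_dir):
        problems.append("BLASTDB does not point to " + db_dir)
    return problems
```

In a new command prompt, the check could be run as check_environment(dict(os.environ), r"C:\blast-2.2.23+\bin", r"C:\blast-2.2.23+\db") after importing os.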
Execution and validation
Standalone blast+ programs do NOT have a graphical user interface (GUI) and must be executed from a command prompt window (CMD). This window can be opened by clicking on "Start ⇨ Program ⇨ Accessories ⇨ Command Prompt" or by clicking "Start ⇨ Run," typing "cmd" (minus quotes) and pressing enter. These processes are shown in the following screenshots (Figures 5.1a and 5.1b).
Example Execution
In the command prompt, the working directory can be changed to "C:\blast-2.2.23+" by typing "cd \" followed by "cd blast-2.2.23+". If the first prompt is a drive other than "C:\", "C:" instead of "cd \" should be used first. Figure 5.2 contains example commands and their console output for a work session that tests a blast-2.2.23+ installation.
Explanation of the Execution
The first "dir" listed the files and subdirectories under the blast-2.2.23+ directory. The error- and warning-free console outputs from the "blastn -version" and "blastdbcmd -db refseq_rna -info" command lines validate the installation.
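The two validation commands can also be run from a script. The following is a sketch only: the function names are hypothetical, and treating any text on the error stream as a failure is an assumption made to mirror the "error- and warning-free" criterion above.

```python
import shutil
import subprocess


def validation_commands(db: str = "refseq_rna") -> list:
    """The two command lines used above to validate the installation."""
    return [["blastn", "-version"], ["blastdbcmd", "-db", db, "-info"]]


def run_validation(db: str = "refseq_rna") -> bool:
    """Run each command; True only if every program is found and exits cleanly."""
    for args in validation_commands(db):
        if shutil.which(args[0]) is None:
            print(args[0] + " was not found on Path")
            return False
        result = subprocess.run(args, capture_output=True, text=True)
        # A non-zero exit code or any message on stderr counts as a problem.
        if result.returncode != 0 or result.stderr.strip():
            print("Problem running: " + " ".join(args))
            return False
    return True
```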
A true validation of this installation is an actual search, which requires an input query. The next blastdbcmd command line dumps out a sequence from the installed database for use as a query.
blastdbcmd -db refseq_rna -entry nm_000249 -outfmt "%f" -out test_query.txt
The exact meaning of the command line is (from left to right) to:
a execute blastdbcmd
b use refseq_rna as the target database
c get the database sequence with nm_000249 as its accession
d dump the sequence in FASTA format, and
e send the output to a file named test_query.txt
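The five parts listed above map directly onto an argument list. The sketch below assembles that list and adds a simple check that a dumped file looks like FASTA; both function names are hypothetical, and the FASTA check is a simplification (a ">" definition line followed by at least one sequence line). The blastdbcmd option for the output format is -outfmt.

```python
def dump_query_command(db: str, accession: str, out_file: str) -> list:
    """Argument list for dumping one database sequence in FASTA format."""
    return ["blastdbcmd", "-db", db, "-entry", accession,
            "-outfmt", "%f", "-out", out_file]


def looks_like_fasta(text: str) -> bool:
    """True if the text has a ">" definition line followed by a sequence line."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return (len(lines) >= 2
            and lines[0].startswith(">")
            and not lines[1].startswith(">"))
```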
This sequence is used subsequently as an input query in a test blastn search with the following command line:
blastn -query test_query.txt -db refseq_rna -out output.txt
which performs the following tasks:
• calling blastn to run a nucleotide query vs nucleotide database search
• using the sequence contained in the test_query.txt file as input query
• searching against the refseq_rna database, and
• saving the result in output.txt
Parameters not explicitly specified will assume default values. To further customize the search, other search parameters with customized input values should be added. Typing "program -help", where "program" is the name of a BLAST program such as blastn, followed by the enter key prints a complete list of program parameters and their accepted options to the console for quick reference. Further details are in the included user manual.
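Adding customized parameters amounts to appending more "-name value" pairs to the base command. The sketch below shows one way to assemble such a command line; the function name blastn_command is hypothetical, and -evalue and -outfmt are used only as examples of blastn parameters. The "-help" output remains the reference for the full list.

```python
def blastn_command(query: str, db: str, out_file: str, **options) -> list:
    """Build a blastn argument list; keyword options become "-name value" pairs."""
    args = ["blastn", "-query", query, "-db", db, "-out", out_file]
    for name, value in options.items():
        args += ["-" + name, str(value)]
    return args


if __name__ == "__main__":
    # Default search, then the same search with two customized parameters.
    print(" ".join(blastn_command("test_query.txt", "refseq_rna", "output.txt")))
    print(" ".join(blastn_command("test_query.txt", "refseq_rna", "output.txt",
                                  evalue=1e-10, outfmt=6)))
```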
The final "dir" examines the directory content again to show that new output files are indeed generated, as marked by red arrows.
Setup Steps For Legacy blast
The original standalone BLAST package based on NCBI C-toolkit (legacy blast) is deprecated. The installation of the legacy blast package for Windows differs from that for blast+ described above. The key differences are summarized below.
a The legacy blast packages are located under a different ftp directory: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST/
b The packages are named with this convention: blast-#.#.#-CHIP-win#.exe, where #.#.# is the version, CHIP is the chipset, and win# is the operating system (32 or 64 bits)
c The packages do not contain an installer function. It is recommended that the downloaded package first be placed in a folder named blast-#.#.# (#.#.# to indicate version) before extraction
d Double clicking the package will execute the self-extracting function to install the package by re-creating bin, doc and data subdirectories
e The programs under the bin subdirectory have names and functions different from those provided by blast+:
bl2seq.exe    blastall.exe    blastclust.exe    blastpgp.exe
copymat.exe   fastacmd.exe    formatrpsdb.exe   impala.exe
makemat.exe   megablast.exe   rpsblast.exe      seedtop.exe
f Configuring the legacy blast installation is similar to blast+. However, an additional DATA environment variable with the path to the "data" subdirectory as its value should be specified
g If the refseq_rna database mentioned in section 3 is installed, the following command lines can be used to test the installation
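A sketch of what such a legacy test might look like, mirroring the blast+ session above. The exact command lines are an assumption: they use the legacy fastacmd program to dump the query and blastall to run the search, and the function name legacy_test_commands is hypothetical.

```python
def legacy_test_commands(db: str = "refseq_rna",
                         accession: str = "nm_000249") -> list:
    """Assumed legacy equivalents of the blast+ test session (not from the source)."""
    return [
        # fastacmd: -d database, -s search string (accession), -o output file
        ["fastacmd", "-d", db, "-s", accession, "-o", "test_query.txt"],
        # blastall: -p program, -i input query, -d database, -o output file
        ["blastall", "-p", "blastn", "-i", "test_query.txt",
         "-d", db, "-o", "output.txt"],
    ]
```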
Technical Assistance
Questions, feedback, and technical assistance requests should be sent to blast-help at:
Questions on other NCBI resources should be addressed to NCBI Service Desk at:
Figure 2a. Downloading of a blast+ package from NCBI through a web browser: Log on to ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST/ and select "Save link as ..." after right-clicking on "ncbi-blast-2.2.23+-win32.exe".
Figure 2b. Downloading of a blast+ package from NCBI through a web browser: Change the location in the subsequent Windows prompt to "C:" before saving the archive to a desired location.
Figure 3. Extracting the downloaded refseq_rna.tar.gz archive using WinZip.
Figure 4a. Configure standalone BLAST using Windows' environment variables: The initial System popup displaying the system information.
Figure 4b. Configure standalone BLAST using Windows' environment variables: Clicking the "Advanced" tab to see the "Environment Variables" button, which provides access to the Environment Variables settings.
Figure 4c. Configure standalone BLAST using Windows' environment variables: Clicking the "Environment Variables" button in 4b leads to this popup, where existing environment variables can be accessed or new ones can be created using the "New" button. In this example, the BLASTDB environment variable in the "User variables …" panel is already specified.
Figure 4d. Create a new user variable using the "New" button: Clicking the "New" button in Figure 4c leads to a popup similar to this, where the new variable's name and path can be specified. Here a new Path variable is being specified with a value of "C:\blast-2.2.23+\bin\".
Figure 5.1a Open a command prompt under Windows (XP): using the "Start" menu.
Figure 5.1b Open a command prompt under Windows (XP): using the "Run …" command menu.
Figure 5.2 The output of a work session testing the blast+ installation. The command inputs are in red boxes. The test files produced by the blastdbcmd and blastn command inputs are marked by red arrows.