blast fro windows

11
Standalone BLAST Setup for Windows PC Tao Tao, Ph.D. NCBI [email protected] Introduction In addition to providing BLAST sequence alignment services on the web, NCBI also makes sequence alignment utilities available for download through FTP. This allows BLAST searches to be performed on local platforms against databases either downloaded from NCBI or created locally. These utilities are run through command (DOS-like) windows and accept text-based commands. There is no graphic user interface. The following tutorial discusses the steps needed to install NCBI C++ based BLAST programs (blast+) and a sample NCBI database on PCs running Windows Operating Systems. The installation of soon to be deprecated C-based BLAST package (legacy blast) will be discussed only briefly at the end. Downloading The blast+ software package is available as two self-extracting archives. The ncbi-blast-#.#.# +-win32.exe is compatible with PCs running Windows XP, 32-bit Vista or Windows 7. The other archive, ncbi-blast-#.#.#+-win64.exe, is for PCs with 64-bit chipsets running Windows XP 2003, 64-bit Vista or Windows 7. In both cases, "#.#.#" denotes the current version number of the package. Archives with the same base name are equivalent. Note: the archive with the ".tar.gz" file extension does not have the installer function. The discussion below focuses on archives with an ".exe" extension. Steps Steps to download the package are described below. Point a browser to this ftp directory: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST/ Right click on a desired archive and select "Save link as…" from the popup menu In the prompt, switch to a desired directory (folder) and click the "Save" button to save the archive to a desired location on the local disk The above steps, as sample screenshots for the "ncbi-blast-2.2.23-win32.exe" archive, are given in Figure 2, where the first two steps are demonstrated by the top panel and the last step is demonstrated by the bottom panel. Examples Example downloading steps are depicted by screen shots in Figure 2a and 2b. Installation The blast+ archive downloaded above contains a built-in installer. Accepting the license agreement after double-clicking, the installer will prompt for an installation directory. The default installation directory in this test case is "C:\blast-2.2.23+". Clicking the "Install" button, BLAST® Help BLAST® Help BLAST® Help BLAST® Help

Upload: kris-cahyo-mulyatno

Post on 24-Oct-2014

54 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: BLAST Fro Windows

Standalone BLAST Setup for Windows PCTao Tao, [email protected]

IntroductionIn addition to providing BLAST sequence alignment services on the web, NCBI also makessequence alignment utilities available for download through FTP. This allows BLAST searchesto be performed on local platforms against databases either downloaded from NCBI or createdlocally. These utilities are run through command (DOS-like) windows and accept text-basedcommands. There is no graphic user interface.

The following tutorial discusses the steps needed to install NCBI C++ based BLAST programs(blast+) and a sample NCBI database on PCs running Windows Operating Systems. Theinstallation of soon to be deprecated C-based BLAST package (legacy blast) will be discussedonly briefly at the end.

DownloadingThe blast+ software package is available as two self-extracting archives. The ncbi-blast-#.#.#+-win32.exe is compatible with PCs running Windows XP, 32-bit Vista or Windows 7. Theother archive, ncbi-blast-#.#.#+-win64.exe, is for PCs with 64-bit chipsets running WindowsXP 2003, 64-bit Vista or Windows 7. In both cases, "#.#.#" denotes the current version numberof the package. Archives with the same base name are equivalent.

Note: the archive with the ".tar.gz" file extension does not have the installer function. Thediscussion below focuses on archives with an ".exe" extension.

StepsSteps to download the package are described below.

• Point a browser to this ftp directory: ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST/

• Right click on a desired archive and select "Save link as…" from the popup menu• In the prompt, switch to a desired directory (folder) and click the "Save" button to save

the archive to a desired location on the local diskThe above steps, as sample screenshots for the "ncbi-blast-2.2.23-win32.exe" archive, aregiven in Figure 2, where the first two steps are demonstrated by the top panel and the last stepis demonstrated by the bottom panel.

ExamplesExample downloading steps are depicted by screen shots in Figure 2a and 2b.

InstallationThe blast+ archive downloaded above contains a built-in installer. Accepting the licenseagreement after double-clicking, the installer will prompt for an installation directory. Thedefault installation directory in this test case is "C:\blast-2.2.23+". Clicking the "Install" button,

BLAST® H

elpBLAST®

Help

BLAST® H

elpBLAST®

Help

Page 2: BLAST Fro Windows

the installer will create a "doc" subdirectory containing a comprehensive user manual in pdfformat, an "uninstaller" for future removal of the installation, and a "bin" subdirectory wherethe following BLAST programs and accessory utilities are kept.

blastdbcheck.exe blastdbcmd.exe blastdb_aliastool.exeblastn.exe blastp.exe blastx.exeblast_formatter.exe convert2blastmask.exe dustmasker.exelegacy_blast.pl makeblastdb.exe makembindex.exepsiblast.exe rpsblast.exe rpstblastn.exesegmasker.exe tblastn.exe tblastx.exe update_blastdb.plwindowmasker.exe

For functions of individual program or utility, refer to Table x in Standalone BLAST setup forUnix.

Test BLAST databaseIn addition to BLAST programs and accessory utilities, target database are also a keycomponent of BLAST searches. The common set of pre-formatted NCBI BLAST databases isavailable as compressed archives from NCBI ftp site, or databases can be prepared de novofrom custom FASTA sequences locally using the makeblastdb utility. For more effectivedatabase management, a "db" subdirectory should be created under the "C:\blast-2.2.23+\".

Steps for downloading preformatted BLAST databases from NCBI are:• point a browser to ftp://ftp.ncbi.nlm.nih.gov/blast/db/• right-click on a desired file (refseq_rna.tar.gz in this example case)• select "Save link as …" from the popup menu• prompted, use the "Save in" to change the directory to "C:\blast-2.2.23+\db\"

Similar procedures in Figure 2 can be used to download the BLAST databases.

This downloaded database is blast-ready after inflation and extraction with a decompressionutility. Figure 3 below shows an example inflation/extraction procedure using WinZip.

A utility included in the blast+ package, update_blastdb.pl, is designed for regular downloadingof preformatted BLAST databases from NCBI. It requires the installation of the Perl packageand execution from the command prompt under the "C:\blast-2.2.23\db\" directory. The basecommand is:

perl update_blastdb.pl --passive base_database_name

where "base_database_name" is the name of the target database, without the "##.tar.gz"extension.

ConfigurationFor smooth execution of blast+, the PC must be configured to recognize BLAST programsunder the "C:\blast-2.2.23+\bin\" directory. To do this, a user environment variable named Pathneeds to be created with "C:\blast-2.2.23\bin\" as its value. The blast+ installer automaticallycreates a BLASTDB environment variable, which points to the "C:\blast-2.2.23+\" directory.For this setup with a separate "db" directory created for BLAST databases, the value ofBLASTDB environment variable must be modified to "C:\blast-2.2.3+\db\".

Page 2

Standalone BLAST Setup for Windows PC

BLAST® H

elpBLAST®

Help

BLAST® H

elpBLAST®

Help

Page 3: BLAST Fro Windows

Environment VariablesSteps to create or modify environment variables are summarized below:

• Click on "Start" >> "Settings" >> "Control Panel"• In the "Control Panel" window, double click the "System" icon• In the popup, click on the "Advanced" tab• Click the "Environment Variables" button• Click the "New" button under the "User variable for ..." panel• Type the environment variable name and enter the absolute path• Click "OK" to close the prompts

Example Screen ShotsScreen shots of these steps are given in Figure 4a, 4b, 4c, and 4d.

Execution and validationStandalone blast+ programs do NOT have a graphical user interface (GUI) and must beexecuted from a command prompt window (CMD). This window can be opened by clickingon "Start ⇨ Program ⇨ Accessories ⇨ Command Prompt" or by clicking "Start ⇨Run," typing "cmd" (minus quotes) and pressing enter. These processes are shown in thefollowing screenshots (Figures 5.1a and 5.1b).

Example ExecutionIn the command prompt, the working directory can be changed to "C:\blast-2.2.23+" by typing"cd \" followed by "cd blast-2.2.23+". If the first prompt is a drive other than "C:\", "C:" insteadof "cd \" should be used first. Figure 5.2 contains example commands and their console outputfor a work session that tests a blast-2.3.23+ installation.

Explanation of the ExecutionThe first "dir" listed the files and subdirectories under the blast-2.2.23+. The error/warningfree console outputs from "blastn -version" and "blastdbcmd -db refseq_rna -info" commandlines validate the installation.

A true validation of this installation is an actual search, which requires an input query. Thenext blastdbcmd command line dumps out a sequence from the installed database for use as aquery.

blastcmd –db refseq_rna –entry nm_000249 –outform "%f" –out test_query.txt

The exact meaning of the command line is (from left to right) to:a execute blastdbcmdb use refseq_rna as the target databasec get the database sequence with nm_000249 as its accessiond dump the sequence in FASTA format, ande send the output to a file named test_query.txt

This sequence is used subsequently as an input query in a test blastn search with the followingcommand line:

Page 3

Standalone BLAST Setup for Windows PC

BLAST® H

elpBLAST®

Help

BLAST® H

elpBLAST®

Help

Page 4: BLAST Fro Windows

blastn –query text_query.txt –db refseq_rna –out output.txt

which performs the following tasks:• calling blastn to run a nucleotide query vs nucleotide database search• using sequence contained in test_query.txt file as input query• searching against refseq_rna database, and• saving the result in output.txt

Parameters not explicitly specified will assume default values. To further customize the search,other search parameters with customized input values should be added. Typing "program -help" followed by enter key stroke will print out a complete list of program parameters andtheir accepted options to the console for quick reference. Further details are in the includeduser manual.

The final "dir" examines the directory content again to show that new output files are indeedgenerated as marked by red arrows.

Setup Steps For Legacy blastThe original standalone BLAST package based on NCBI C-toolkit (legacy blast) is deprecated.The installation of legacy blast package for Windows differs from that for blast+ describedabove. The key differences are summarized below.

a The legacy blast packages are located under a different ftp directory:

" ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/LATEST/

b The packages are named with this convention: blast-#.#.#-CHIP-win#.exe, where#.#.# is the version, CHIP is the chipset, and win# is the operating system (32 or 64bits)

c The packages do not contain installer function. It is recommended that the downloadedpackage be placed in a folder named blast-#.#.# (#.#.# to indicate version) first beforeextraction

d Double clicking the package will execute the self-exacting function to install thepackage by re-creating bin, doc and data subdirectories

bl2seq.exe blastall.exe blastclust.exe blastpgp.execopymat.exe fastacmd.exe formatrpsdb.exe implala.exe makemat.exemegablast.exe rpsblast.exe seedtop.ext

e The programs under the bin subdirectory have names and functions different fromthat provided by blast+: Input: List of UIDs (&id); Source Entrez database (&dbfrom);Destination Entrez database (&db)

f Configuring the legacy blast installation is similar to blast+. However, a additionalDATA environment variable with the path to the "data" subdirectory as its valueshould be specified

g If the refseq_rna database mentioned in section 3 is installed, the following commandlines can be used to test the installationOutput: XML containing linked UIDs fromsource and destination databases

Page 4

Standalone BLAST Setup for Windows PC

BLAST® H

elpBLAST®

Help

BLAST® H

elpBLAST®

Help

Page 5: BLAST Fro Windows

Technical AssistanceQuestions, feedbacks, and technical assistance requests should be sent to blast-help at:

" [email protected]

Questions on other NCBI resources should be addressed to NCBI Service Desk at:

" [email protected]

Figure 2a. Downloading of a blast+ package from NCBI through a web browser: Log on to ftp://ftp.ncbi.nlm.nih.gov/blast/executables/LATEST/ and select "Save link as ..." after right-clicking on "ncbi-blast-2.2.23+-win32.exe".

Page 5

Standalone BLAST Setup for Windows PC

BLAST® H

elpBLAST®

Help

BLAST® H

elpBLAST®

Help

Page 6: BLAST Fro Windows

Figure 2b. Downloading of a blast+ package from NCBI through a web browser: Change the location in the subsequence Windowsprompt to "C:" before saving the archive to a desired location.

Figure 3. Exacting the downloaded refseq_rna.tar.gz archive using WinZip.

Page 6

Standalone BLAST Setup for Windows PC

BLAST® H

elpBLAST®

Help

BLAST® H

elpBLAST®

Help

Page 7: BLAST Fro Windows

Figure 4a. Configure standalone BLAST using Windows' environment variables: The initial System popup displaying the system information.

Page 7

Standalone BLAST Setup for Windows PC

BLAST® H

elpBLAST®

Help

BLAST® H

elpBLAST®

Help

Page 8: BLAST Fro Windows

Figure 4b. Configure standalone BLAST using Windows' environment variables: Clicking the "Advanced" tab to see the "Environment Variables" button, which provides access to the Environment Variablessettings.

Page 8

Standalone BLAST Setup for Windows PC

BLAST® H

elpBLAST®

Help

BLAST® H

elpBLAST®

Help

Page 9: BLAST Fro Windows

Figure 4c. Configure standalone BLAST using Windows' environment variables: Clicking "Environment Variables" button on 4bleads to this popup, where existing environment variables can be accessed or new ones can be created using the "New" button. Inthis example, the BLASTDB environment variable in the "User variables …" is already specified.

Figure 4d. Create new user variable using the "New" button: Clicking the "New" button in Figure 4c leads to a popup similar tothis where new variable's name and path can be specified. Here a new Path variable is being specified with a value of "C:\blast-2.2.23+\bin\".

Page 9

Standalone BLAST Setup for Windows PC

BLAST® H

elpBLAST®

Help

BLAST® H

elpBLAST®

Help

Page 10: BLAST Fro Windows

Figure 5.1a Open a command prompt under Windows (XP): using the "Start" menu.

Figure 5.1b Open a command prompt under Windows (XP): using the "Run …" command menu.

Page 10

Standalone BLAST Setup for Windows PC

BLAST® H

elpBLAST®

Help

BLAST® H

elpBLAST®

Help

Page 11: BLAST Fro Windows

Figure 5.2 The output of a work session testing the blast+ installation. The command inputs are in red boxes. The test files produced by the blastdbcmd and blastn command inputs are marked by redarrows.

Page 11

Standalone BLAST Setup for Windows PC

BLAST® H

elpBLAST®

Help

BLAST® H

elpBLAST®

Help