SMRT-Portal Exercises
J Fass
UCD Genome Center Bioinformatics Core
Thursday April 16, 2015
Running SMRT-Portal in AWS
see PacBio documentation
We’ll be running a virtual machine (VM) in the Amazon Web Services “Cloud” (a server farm somewhere in the region you’ve selected). On this VM is a web server, serving you pages created by the SMRT-Portal application.
UC Davis Genome Center | Bioinformatics Core | J Fass SMRT-Portal Exercises 2015-04-16
Running SMRT-Portal in AWS
Launch an m3.2xlarge instance using ami-953fddd1.
Generate or re-use a key pair - you will need it!
Once running, find the public IP address (#.#.#.#), and open a browser tab with the URL:
#.#.#.#:8080/smrtanalysis … or … #.#.#.#:8080/smrtportal
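The URL can also be assembled in a terminal. This is a minimal sketch: `portal_url` is a hypothetical helper, the `http://` scheme is an assumption (the slide gives no scheme), and 203.0.113.10 is a documentation-only placeholder address.

```shell
# Build a SMRT-Portal URL from the instance's public IP address.
# portal_url is a hypothetical helper; port 8080 and both page names come from the slide.
portal_url() {
  ip=$1
  page=${2:-smrtportal}   # or: smrtanalysis
  printf 'http://%s:8080/%s\n' "$ip" "$page"
}

# Example (substitute your instance's public IP for the placeholder):
portal_url 203.0.113.10
# To probe the server without a browser (assumes curl is installed):
#   curl -sI "$(portal_url 203.0.113.10)" | head -n 1
```

If the probe returns nothing, the instance may still be booting, or port 8080 may be closed in its security group.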
Running SMRT-Portal in AWS
On a “vanilla” PacBio SMRT-Portal instance (U.S. East / N. Virginia), you would need to create one administrator account. This AMI already has one, but feel free to change the password, add non-admin accounts, etc.
user: administrator
pwd: 5MRT-P0rtal
Note: a password needs at least one symbol, at least one number, and more than 8 characters.
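The password note can be turned into a quick local check before you type a new password into the portal. This sketch encodes one reading of the note (at least one non-alphanumeric character, at least one digit, length greater than 8). `pwd_ok` is a hypothetical helper, and the portal's own validation may differ.

```shell
# Return success if a candidate password satisfies the note's three conditions.
# pwd_ok is a hypothetical helper, not part of SMRT-Portal.
pwd_ok() {
  pw=$1
  [ "${#pw}" -gt 8 ] || return 1                             # more than 8 characters
  case $pw in *[0-9]*) ;; *) return 1 ;; esac                # at least one number
  case $pw in *[!A-Za-z0-9]*) ;; *) return 1 ;; esac         # at least one symbol
  return 0
}

# Example: the AMI's default password passes all three checks.
pwd_ok '5MRT-P0rtal' && echo "acceptable"
```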
Running SMRT-Portal in AWS
Log in as administrator (special user), then create separate accounts if desired.
How I imported 8 SMRT Cells (E. coli)
SSH to AWS instance
ssh -i ~/.ssh/yourKey.pem [email protected]
ssh command
option block (supplies private key in this case)
destination (username@computername)
… (or use PuTTY) …
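One common stumbling block with the ssh command above: OpenSSH refuses a private key file that other users can read. A minimal sketch of the fix follows; `lock_key` is a hypothetical helper name, and the key path is the slide's placeholder.

```shell
# Restrict a private key so that only its owner can read it.
# lock_key is a hypothetical helper; chmod 400 is the usual setting for AWS .pem keys.
lock_key() {
  chmod 400 "$1"
}

# Typical use, once, before the first ssh connection:
#   lock_key ~/.ssh/yourKey.pem
```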
PacBio Public Datasets
https://github.com/PacificBiosciences/DevNet/wiki/Datasets
look for “Data supporting publications” … look for the first MG1655 xml & bas.h5 files …
Enter “dropbox” directory
cd /opt/smrtanalysis/userdata/inputs_dropbox
cd command
destination directory
Pull in data
mkdir MG1655
cd MG1655
wget [xml file link]
mkdir Analysis_Results
cd Analysis_Results
wget [bas.h5 file link, + bax.h5’s if present]
command
directory / destination / source
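The commands above build a two-level layout: the xml file sits in the cell directory, and the h5 files sit in an Analysis_Results subdirectory. A minimal sketch that creates the same layout in one step; `make_cell_dir` is a hypothetical helper, and the download links remain placeholders as in the slide.

```shell
# Create the directory layout used above for one SMRT Cell:
#   <dropbox>/<cell>/                   xml file goes here
#   <dropbox>/<cell>/Analysis_Results/  bas.h5 / bax.h5 files go here
# make_cell_dir is a hypothetical helper.
make_cell_dir() {
  dropbox=$1
  cell=$2
  mkdir -p "$dropbox/$cell/Analysis_Results"
}

# On the instance this would be:
#   make_cell_dir /opt/smrtanalysis/userdata/inputs_dropbox MG1655
# followed by wget of the xml into MG1655/ and of the h5 files into MG1655/Analysis_Results/.
```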
Import SMRT Cell data
Back in SMRT Portal, click through “Home” (upper left), then “Import and Manage” (third image), then “Input SMRT Cells.”
Import another SMRT Cell (exercise)
SSH to AWS instance
ssh -i ~/.ssh/yourKey.pem [email protected]
ssh command
option block (supplies private key in this case)
destination (username@computername)
… (or use PuTTY) …
PacBio Public Datasets
https://github.com/PacificBiosciences/DevNet/wiki/Datasets
look for “E. coli size selected 20kb library” …
PacBio Public Datasets
Find the SMRT Cell data files “tarball,” and copy the link (don’t download; you’ll break our wireless!).
Feeding Data to the SMRT-Portal
Back in a shell (terminal) on your instance, navigate to SMRT-Portal’s input dropbox.
cd /opt/smrtanalysis/userdata/inputs_dropbox
wget [link]
mkdir Ecoli20kb
cd Ecoli20kb
mkdir Analysis_Results
tar -xzvf ecoliK12.tar.gz
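Before returning to the browser, it can help to confirm that the files landed where expected. This is a minimal sketch: `list_cell_files` is a hypothetical helper, and the file patterns are taken from the file types named earlier in these slides (xml, bas.h5, bax.h5).

```shell
# List the SMRT Cell files found under a directory, sorted by path.
# list_cell_files is a hypothetical helper; it only reports what is on disk.
list_cell_files() {
  find "$1" -type f \( -name '*.xml' -o -name '*.bas.h5' -o -name '*.bax.h5' \) | sort
}

# On the instance this would be:
#   list_cell_files /opt/smrtanalysis/userdata/inputs_dropbox/Ecoli20kb
```

If the h5 files do not appear under an Analysis_Results directory, move them there before scanning.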
Feeding Data to the SMRT-Portal
Back in SMRT Portal, click through “Home” (upper left), then “Import and Manage” (third image), then “Input SMRT Cells.”
Feeding Data to the SMRT-Portal
Via Home, then Import and Manage, then [Import] SMRT Cells, get to the import page. Select the directory, then click Scan.
HGAP Assembly
Running HGAP
Click Design Job, then Create New (and deal with the design wizard; I usually select “display all protocols”). You should see 9 SMRT Cells available (we just imported the 9th).
Running HGAP
Select the “RS_HGAP_Assembly.3” protocol from the drop-down menu and enter a name and (if desired) comments. Select the 20kb cell and click the right arrowhead to add it to the job you’re designing, then Save and Start!
Running HGAP
early results just assess reads, subreads ...
Running HGAP
Final results include pre-assembly, realigned reads, etc.
HGAP output
Find the Polished Assembly Fasta link, right-click, and choose Save link as … (to avoid a troublesome file name).
HGAP output
Notice the BAM and BAI links; these allow you to view the original reads aligned back to the assembly (e.g. in IGV).
Check assembly via homology
Using Mauve, we’ll align our assembled genome to the trusted E. coli K-12 MG1655 reference assembly, from GenBank (link).
Check assembly via homology
Launch Mauve, then select File → Align with progressiveMauve. Then Add Sequence (click to add GenBank reference, then our assembly), click Align (and add a place to save output).
Check assembly via homology
(see Mauve site for details on viewer, etc. … we’ll explore during Workshop)
Check for circularity (if appropriate)
Launch Gepard, use Select file to specify the polished genome assembly twice (once for horizontal, once for vertical), then create the dotplot.
Check for circularity (if appropriate)
Looks fine, right? But the overlaps will be on the size scale of the reads … not visible at this scale.
Check for circularity (if appropriate)
Use the Advanced mode, Plot tab, to specify the first ~20kb on the horizontal, and the last ~20kb on the vertical. Then Update dotplot.
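As an alternative to setting the plot ranges inside Gepard, the two ends can be cut out at the command line and loaded as two small files. This is a sketch only, not the workflow shown in the slides. It assumes the FASTA file holds a single contig (multiple contigs would be joined together), `fasta_end` is a hypothetical helper, and the file names in the usage comment are placeholders.

```shell
# Print the first ("head") or last ("tail") n bases of a single-sequence FASTA file.
# fasta_end is a hypothetical helper; usage: fasta_end FILE head|tail N
fasta_end() {
  awk -v which="$2" -v n="$3" '
    /^>/ { next }                 # skip header lines
    { seq = seq $0 }              # join wrapped sequence lines
    END {
      len = length(seq)
      if (n > len) n = len        # window cannot exceed the sequence
      start = (which == "tail") ? len - n + 1 : 1
      print ">" which "_" n
      print substr(seq, start, n)
    }' "$1"
}

# Example (placeholder file names):
#   fasta_end polished_assembly.fasta head 20000 > first20kb.fasta
#   fasta_end polished_assembly.fasta tail 20000 > last20kb.fasta
```

Loading the two output files into Gepard, one per axis, should show the same end-overlap diagonal as the zoomed plot if the contig overlaps itself.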
Alignment / Resequencing Protocols
Align to your own reference
In SMRT-Portal, go Home, then Import and Manage, then Reference Sequences. Select New to upload our downloaded reference (note there’s also a Scan option; to use it, upload the reference to /opt/smrtanalysis/userdata/references_dropbox/ first).
Align to your own reference
Design a job using the same reads and the RS_Resequencing.1 protocol. Specify your uploaded reference sequence, save, and start the job. (I’m using E. albertii in this case, RefSeq ID NZ_CP007025.1.)
Viewing Read Alignments with IGV