Post on 31-Mar-2021


Practical Bioinformatics for Life Scientists

Week 12, Lecture 24

István Albert

Bioinformatics Consulting Center, Penn State

Midterm project report: ReadSeq data format conversion

Note: not all conversions are valid!

Usually you can only go from a complex format to a simpler one.
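The one-way nature of conversion can be shown with a small sketch: FASTQ carries base qualities that FASTA has no place for, so going from FASTQ to FASTA only drops information, while the reverse would require inventing quality values. The function name `fastq_to_fasta` is hypothetical (it is not part of ReadSeq), and the sketch assumes unwrapped, four-line FASTQ records.

```python
from typing import Iterable, Iterator


def fastq_to_fasta(lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTA lines from four-line FASTQ records (assumed unwrapped)."""
    record = []
    for line in lines:
        line = line.rstrip("\n")
        if not line and not record:
            continue  # skip blank lines between records
        record.append(line)
        if len(record) == 4:
            header, sequence, plus, _quality = record
            if not header.startswith("@") or not plus.startswith("+"):
                raise ValueError(f"malformed FASTQ record: {record!r}")
            # The quality line is discarded: FASTA cannot store it.
            yield ">" + header[1:]
            yield sequence
            record = []
    if record:
        raise ValueError("truncated FASTQ record at end of input")


# Made-up reads, for illustration only.
example = ["@read1", "ACGT", "+", "IIII", "@read2", "GGCA", "+", "!!!!"]
fasta = list(fastq_to_fasta(example))
print("\n".join(fasta))
```

Nothing in the FASTA output allows the original quality strings to be recovered, which is why the conversion is valid in one direction only.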

BWA vs Bowtie 1 vs Bowtie 2

• Testing on simulated reads BWA and realistic ones bowtie

• For both sensitivity (mapping rate) and specificity (correct alignments)

• We’ll keep a score along the way

• We all need to remember: both tools are triumphs of human ingenuity!
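The two scores named above can be written down as simple ratios. This is a minimal sketch, assuming "sensitivity" means the fraction of all reads that were mapped and "specificity" means the fraction of mapped reads placed correctly; the function name and the read counts are hypothetical and are not results from the lecture.

```python
def mapping_scores(total_reads, mapped_reads, correct_reads):
    """Return (sensitivity, specificity) as defined in the lead-in above."""
    if not 0 <= correct_reads <= mapped_reads <= total_reads:
        raise ValueError("expected 0 <= correct <= mapped <= total")
    # Sensitivity: how many of the reads did the aligner manage to place?
    sensitivity = mapped_reads / total_reads if total_reads else 0.0
    # Specificity: of the reads it placed, how many landed where they belong?
    specificity = correct_reads / mapped_reads if mapped_reads else 0.0
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sensitivity, specificity = mapping_scores(1000, 950, 931)
print(f"sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```

With simulated reads the "correct" count is available because the simulator records where each read came from.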

[Figure panels: bowtie2, bwa]

Short Read Archive

Command-line toolkit to convert from the SRA format to the FASTQ, SFF, and csfasta formats

Clone the wgsim repository on GitHub

Compile wgsim

Get wgsim from GitHub

Install the bowties

Cannot install on Cygwin! Mac and Linux only.

Building indices for each tool

Running the aligners
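The exact commands from these slides are not preserved in the transcript. As a rough sketch, the commonly documented single-end invocations of the three tools can be assembled like this; the file names (`genome.fa`, `reads.fq`, the `*.sam` outputs) and both helper functions are hypothetical, and the script only prints the commands rather than running them.

```python
import shlex


def index_commands(reference, prefix):
    """Return one index-building command (argv list) per aligner."""
    return [
        ["bwa", "index", reference],
        ["bowtie-build", reference, prefix],
        ["bowtie2-build", reference, prefix],
    ]


def align_commands(reference, prefix, reads):
    """Return (argv, stdout_file) pairs for single-end alignment to SAM."""
    return [
        # bwa writes SAM to standard output, so it needs a redirect target.
        (["bwa", "mem", reference, reads], "bwa.sam"),
        # -S asks bowtie 1 for SAM output; the last argument is the output file.
        (["bowtie", "-S", prefix, reads, "bowtie1.sam"], None),
        # -x is the index prefix, -U the unpaired reads, -S the SAM output file.
        (["bowtie2", "-x", prefix, "-U", reads, "-S", "bowtie2.sam"], None),
    ]


# Hypothetical file names, for illustration only.
for argv in index_commands("genome.fa", "genome"):
    print(shlex.join(argv))
for argv, stdout_file in align_commands("genome.fa", "genome", "reads.fq"):
    suffix = f" > {stdout_file}" if stdout_file else ""
    print(shlex.join(argv) + suffix)
```

Each tool needs its own index because the index file formats differ, even when all three are built from the same reference.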

wgsim evaluation

Quality reports

Get alignment reports

Redirect standard error to output
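Aligners such as bowtie2 print their alignment summary to standard error, so the report is lost unless that stream is captured. In a shell this is the `2>&1` redirect; the sketch below shows the same idea from Python. The child process here is a stand-in that writes one line to standard error, not a real aligner, and the message text is made up.

```python
import subprocess
import sys

# Stand-in for an aligner: a child process that reports on standard error.
child = [
    sys.executable,
    "-c",
    "import sys; print('alignment summary', file=sys.stderr)",
]

# stderr=subprocess.STDOUT merges standard error into the captured output,
# which is what `2>&1` does in a shell.
result = subprocess.run(
    child,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    text=True,
    check=True,
)

report = result.stdout
print(report, end="")
```

Once the two streams are merged, the report can be saved to a file or parsed like any other program output.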

get alignment reports

An R program to plot it

[Plot panels: bwa, bowtie2, bowtie1]

Homework 24

• Install R (optionally RStudio); we will be using them during the next two lectures

• Install and then run the wgsim_eval.pl script on one of your SAM files that came from a wgsim simulation

• Inspect the CIGAR strings of reads that were classified incorrectly and see if there is a pattern to them
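For the CIGAR inspection, one possible starting point is to split each string into (length, operation) pairs and total the bases per operation, which makes an excess of insertions, deletions or soft clips easy to spot. This is a sketch only: the function names and the example CIGAR strings are hypothetical, and the operation letters are those defined by the SAM format.

```python
import re
from collections import Counter

# One CIGAR element: a length followed by a SAM operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into a list of (length, operation) pairs."""
    if cigar == "*":  # SAM uses '*' when no CIGAR is available
        return []
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    # Rebuilding the string catches anything the pattern skipped over.
    if "".join(f"{length}{op}" for length, op in ops) != cigar:
        raise ValueError(f"not a valid CIGAR string: {cigar!r}")
    return ops


def summarize(cigars):
    """Total the number of bases assigned to each operation."""
    counts = Counter()
    for cigar in cigars:
        for length, op in parse_cigar(cigar):
            counts[op] += length
    return counts


# Made-up CIGAR strings, for illustration only.
cigars = ["70M", "30M2I38M", "5S60M5S"]
summary = summarize(cigars)
print(dict(summary))
```

Running the summary separately on correctly and incorrectly classified reads, then comparing the two, is one way to look for the pattern the homework asks about.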
