perl and bioperl

Upload: s-b-mirza

Post on 09-Apr-2018


  • 8/8/2019 Perl and Bioperl


    PERL AND BIOPERL

    CONTROL STRUCTURES

    if statement - first style

    if ($porridge_temp < 40) {
        print "too cold.\n";
    }
    elsif ($porridge_temp > 150) {
        print "too hot.\n";
    }
    else {
        print "just right\n";
    }

    CONTROL STRUCTURES

    if statement - second style

    statement if condition;

    print "\$index is $index" if $DEBUG;

    Single statements only; simple expressions only

    unless is a reverse if

    statement unless condition;

    print "millennium is here!" unless $year < 2000;


    CONTROL STRUCTURES

    for loop - first style

    for (initial; condition; increment) { code }

    for ($i=0; $i

    THE FOR STATEMENT

    Syntax

    for (START; STOP; ACTION) { BODY }

    • Initially execute START statements once.
    • Repeatedly execute BODY until STOP is false.
    • Execute ACTION after each iteration.

    Example

    for ($i=0; $i
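    The example above is cut off in this transcript. As a stand-in, here is a minimal sketch of a C-style for loop; the helper name count_up and the bound 5 are illustrative assumptions, not taken from the slide.

```perl
use strict;
use warnings;

# Hypothetical helper: collect the values visited by a C-style for loop.
sub count_up {
    my ($limit) = @_;
    my @seen;
    # START: $i = 0   STOP test: $i < $limit   ACTION: $i++
    for (my $i = 0; $i < $limit; $i++) {
        push @seen, $i;          # BODY runs once per iteration
    }
    return @seen;
}

print join(" ", count_up(5)), "\n";   # prints "0 1 2 3 4"
```

    The loop variable is declared with my inside the parentheses so that it is only visible within the loop.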

    THE FOREACH STATEMENT

    Syntax

    foreach SCALAR ( ARRAY ) { BODY }

    • Assign ARRAY element to SCALAR.
    • Execute BODY.
    • Repeat for each element in ARRAY.

    Example

    @asTmp = qw(One Two Three);
    foreach $s (@asTmp) {
        $s .= "sy ";    # $s is an alias, so this changes the array element
    }
    print(@asTmp); # Onesy Twosy Threesy

    CONTROL STRUCTURES

    while loop

    while (condition) { code }

    $cars = 7;
    while ($cars > 0) {
        print "cars left: ", $cars--, "\n";
    }

    while ($game_not_over) { }


    CONTROL STRUCTURES

    until loop is opposite of while

    until (condition) { code }

    $cars = 7;

    until ($cars

    CONTROL STRUCTURES

    Bottom-check Loops

    do { code } while (condition);

    do { code } until (condition);

    $value = 0;
    do {
        print "Enter Value: ";
        $value = <STDIN>;
    } until ($value > 0);


    SUBROUTINES (FUNCTIONS)

    Defining a Subroutine

    sub name { code }

    Arguments passed in via @_ list

    sub multiply {

    my ($a, $b) = @_;

    return $a * $b;

    }

    Last value processed is the return value

    (could have left out word return, above)

    SUBROUTINES (FUNCTIONS)

    Calling a Subroutine

    subname; # no args, no return value
    subname (args);
    $retval = &subname (args);

    The & is optional so long as
    • subname is not a reserved word
    • subroutine was defined before being called


    SUBROUTINES (FUNCTIONS)

    Passing Arguments

    Passes the value

    Lists are expanded

    @a = (5,10,15);

    @b = (20,25);

    &mysub(@a,@b);

    this passes five arguments: 5,10,15,20,25

    mysub can receive them as 5 scalars, or one array
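    The list expansion described above can be sketched as follows. The subroutine names count_args and first_and_last are hypothetical; the array contents are the ones from the slide.

```perl
use strict;
use warnings;

# Hypothetical sub: receives everything in one array and counts it.
sub count_args {
    my @all = @_;
    return scalar @all;
}

# Hypothetical sub: receives the same flattened list as a scalar plus the rest.
sub first_and_last {
    my ($first, @rest) = @_;
    return ($first, $rest[-1]);
}

my @a = (5, 10, 15);
my @b = (20, 25);

# Both arrays are flattened into one list before the call.
print count_args(@a, @b), "\n";                  # prints "5"
print join(",", first_and_last(@a, @b)), "\n";   # prints "5,25"
```

    Because the two arrays arrive as one flat list, the subroutine cannot tell where @a ended and @b began.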


    SUBROUTINES (FUNCTIONS)

    Examples

    sub good1 {

    my($a,$b,$c) = @_;

    }

    &good1 (@triplet);

    sub good2 {

    my(@a) = @_;

    }

    &good2 ($one, $two, $three);

    DEALING WITH HASHES

    keys( ) - get an array of all keys

    foreach (keys (%hash)) { }

    values( ) - get an array of all values

    @array = values (%hash);

    each( ) - get key/value pairs

    while (@pair = each(%hash)) {
        print "element $pair[0] has $pair[1]\n";
    }


    DEALING WITH HASHES

    exists( ) - check if element exists

    if (exists $ARRAY{$key}) { }

    delete( ) - delete one element

    delete $ARRAY{$key};
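    A minimal sketch that exercises keys, each, exists and delete together. The helper names describe_hash and remove_key and the codon table are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: one line of text per key/value pair, in sorted key order.
sub describe_hash {
    my (%hash) = @_;
    my @lines;
    foreach my $key (sort keys %hash) {
        push @lines, "element $key has $hash{$key}";
    }
    return @lines;
}

# Hypothetical helper: delete a key only if it exists; report whether it did.
sub remove_key {
    my ($hash_ref, $key) = @_;
    return 0 unless exists $hash_ref->{$key};
    delete $hash_ref->{$key};
    return 1;
}

my %codon = (ATG => 'Met', TGG => 'Trp', TAA => 'Stop');   # made-up sample data

print "$_\n" for describe_hash(%codon);

# each() hands back one (key, value) pair per call until the hash is exhausted
my $pairs = 0;
while (my ($key, $value) = each %codon) {
    $pairs++;
}
print "$pairs pairs seen\n";                   # prints "3 pairs seen"

remove_key(\%codon, 'TAA');
print scalar(keys %codon), " codons left\n";   # prints "2 codons left"
```

    The keys are sorted before printing because a Perl hash does not keep its entries in any fixed order.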

    OTHER USEFUL FUNCTIONS

    push( ), pop( ) - stack operations on lists

    shift( ), unshift( ) - bottom-based ops

    split( ) - split a string by separator

    @parts = split(/:/,$passwd_line);
    while (split) # like: split (/\s+/, $_)

    splice( ) - remove/replace elements

    substr( ) - substrings of a string
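    A short sketch of the functions listed above, one call each. All sample values (the passwd-style line, the letters, the DNA string) are invented for illustration.

```perl
use strict;
use warnings;

my @stack = (1, 2, 3);
push @stack, 4;               # add to the end:        (1, 2, 3, 4)
my $top = pop @stack;         # remove from the end:   4, leaving (1, 2, 3)
unshift @stack, 0;            # add to the front:      (0, 1, 2, 3)
my $first = shift @stack;     # remove from the front: 0, leaving (1, 2, 3)

# split a colon-separated record (made-up passwd-style line)
my $passwd_line = "alice:x:1000:1000:Alice:/home/alice:/bin/sh";
my @parts = split(/:/, $passwd_line);

# splice: remove 2 elements starting at index 1 and put 'X' in their place
my @letters = qw(a b c d e);
my @removed = splice(@letters, 1, 2, 'X');   # @removed is (b, c); @letters is (a, X, d, e)

# substr: 3 characters starting at offset 2
my $piece = substr("ATGGGTA", 2, 3);         # "GGG"

print "top=$top first=$first user=$parts[0] shell=$parts[-1]\n";
print "letters=@letters removed=@removed piece=$piece\n";
```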

    STRING MANIPULATION

    chop

    chop(VARIABLE)

    chop(LIST)

    index(STR, SUBSTR, POSITION)

    index(STR, SUBSTR)

    length(EXPR)

    STRING MANIPULATION (CONT.)

    substr(EXPR, OFFSET, LENGTH)

    substr(EXPR, OFFSET)

    Example: string.pl
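    The string.pl file itself is not part of this transcript. As a stand-in, here is a small sketch of chop, length, index and substr applied to a made-up DNA string.

```perl
use strict;
use warnings;

my $line = "ATGGGTA\n";
chop($line);                          # removes the last character (here the newline)

my $len   = length($line);            # 7
my $pos   = index($line, "GG");       # 2: first match, counting from 0
my $next  = index($line, "GG", 3);    # 3: search again, starting at position 3
my $none  = index($line, "CC");       # -1 when the substring is absent
my $start = substr($line, 0, 3);      # "ATG": 3 characters from offset 0
my $rest  = substr($line, 3);         # "GGTA": from offset 3 to the end

print "len=$len pos=$pos next=$next none=$none start=$start rest=$rest\n";
```

    Note that chop removes the last character whatever it is; chomp removes only a trailing newline and is the safer choice when a line might not end in one.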

    PATTERN MATCHING

    See if strings match a certain pattern

    syntax: string =~ pattern

    Returns true if it matches, false if not.

    Example: match abc anywhere in string:

    if ($str =~ /abc/) { }

    But what about complex concepts like:
    • between 3 and 5 numeric digits
    • optional whitespace at beginning of line
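    The two concepts above can be written as regular expressions. This is one possible sketch; the helper names has_3_to_5_digits and is_comment_line are hypothetical, and pairing the optional whitespace with a # comment marker is an assumption chosen only to make the example concrete.

```perl
use strict;
use warnings;

# Between 3 and 5 numeric digits, and nothing else in the string.
sub has_3_to_5_digits {
    my ($text) = @_;
    return $text =~ /^\d{3,5}$/ ? 1 : 0;
}

# Optional whitespace at the beginning of the line, followed by '#'.
sub is_comment_line {
    my ($line) = @_;
    return $line =~ /^\s*#/ ? 1 : 0;
}

for my $candidate ('12', '1234', '123456') {
    printf "%-6s %s\n", $candidate,
        has_3_to_5_digits($candidate) ? "accepted" : "rejected";
}
print is_comment_line("   # indented comment"), "\n";   # prints "1"
print is_comment_line("code # trailing"), "\n";         # prints "0"
```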


    PATTERN MATCHING

    Regular Expressions are a way to describe

    character patterns in a string

    Example: match john or jon

    /joh?n/

    Example: match money values

    /\$\d+\.\d\d/

    Complex Example: match times of the day

    /\d?\d:\d\d(:\d\d)? (AM|PM)?/i

    PATTERN MATCHING

    Symbols with Special Meanings

    period . - any single character
    char set [0-9a-f] - one char matching these

    Abbreviations

    \d - a numeric digit [0-9]
    \w - a word character [A-Za-z0-9_]
    \s - whitespace char [ \t\n\r\f]
    \D, \W, \S - any character but \d, \w, \s
    \n, \r, \t - newline, carriage-return, tab
    \f, \e - formfeed, escape
    \b - word break

    PATTERN MATCHING

    Symbols with Special Meanings

    asterisk * - zero or more occurrences
    plus sign + - one or more occurrences
    question mark ? - zero or one occurrences
    caret ^ - anchor to beginning of line
    dollar sign $ - anchor to end of line
    quantity {n,m} - between n and m occurrences (inclusively)

    [A-Z]{2,4} means 2, 3, or 4 uppercase letters.
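    A small sketch of how the quantifier and the anchors interact. Without ^ and $ the pattern only has to match somewhere inside the string, so a five-letter word still contains a match. The helper names and test words are made up.

```perl
use strict;
use warnings;

# Unanchored: true if 2 to 4 uppercase letters appear anywhere in the string.
sub contains_code {
    my ($text) = @_;
    return $text =~ /[A-Z]{2,4}/ ? 1 : 0;
}

# Anchored: true only if the whole string is 2 to 4 uppercase letters.
sub is_code {
    my ($text) = @_;
    return $text =~ /^[A-Z]{2,4}$/ ? 1 : 0;
}

for my $word (qw(A AB ABCD ABCDE)) {
    print "$word contains=", contains_code($word), " whole=", is_code($word), "\n";
}

# ? makes the preceding character optional: matches "colour" and "color" only
my @hits = "colour color colr" =~ /\bcolou?r\b/g;
print "@hits\n";   # prints "colour color"
```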


    PATTERN MATCHING

    Ways of Using Patterns

    Matching

    if ($line =~ /pattern/) { }

    also written: m/pattern/

    Substitution

    $name =~ s/ASU/Arizona State University/;

    Translation

    $command =~ tr/A-Z/a-z/; # lowercase it
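    A sketch of substitution and translation on copies of a string. The sample strings and the helper names revcomp and count_g are assumptions for illustration; the reverse-complement helper is plain Perl and is not the BioPerl method of the same name.

```perl
use strict;
use warnings;

my $name = "ASU and ASU again";

# s/// replaces the first match only; add /g to replace every match
(my $first_only = $name) =~ s/ASU/Arizona State University/;
(my $every      = $name) =~ s/ASU/Arizona State University/g;

# tr/// maps characters one for one and returns how many it touched
sub count_g {
    my ($dna) = @_;
    return ($dna =~ tr/G//);   # empty replacement: count only, string unchanged
}

# Hypothetical helper: reverse the string, then swap each base for its complement
sub revcomp {
    my ($dna) = @_;
    my $rc = reverse $dna;
    $rc =~ tr/ACGTacgt/TGCAtgca/;
    return $rc;
}

print "$first_only\n";
print "$every\n";
print count_g("ATGGGTA"), "\n";   # prints "3"
print revcomp("ATGCC"), "\n";     # prints "GGCAT"
```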


    COMMAND LINE ARGS

    $0 = program name

    @ARGV array of arguments to program

    zero-based index (default for all arrays)

    Example yourprog -a somefile

    $0 is yourprog

    $ARGV[0] is -a

    $ARGV[1] is somefile
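    A minimal sketch that prints the program name and each argument in the form used above. The helper describe_args is hypothetical; it takes the argument list as a parameter so it can be tried without a real command line.

```perl
use strict;
use warnings;

# Hypothetical helper: label each argument with its zero-based index.
sub describe_args {
    my @args = @_;
    return map { "\$ARGV[$_] is $args[$_]" } 0 .. $#args;
}

print "\$0 is $0\n";
print "$_\n" for describe_args(@ARGV);
print "no arguments given\n" unless @ARGV;
```

    Run as "yourprog -a somefile", this would print one line for $0 followed by one line per argument.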

    BASIC FILE I/O

    Reading a File

    open (FILEHANDLE, $filename) || die "open of $filename failed: $!";
    while (<FILEHANDLE>) {
        chop $_; # or just: chop;
        print "$_\n";
    }
    close FILEHANDLE;

    BASIC FILE I/O

    Writing a File

    open (FILEHANDLE, ">$filename") || die "open of $filename failed: $!";
    foreach (@data) {
        print FILEHANDLE "$_\n"; # note, no comma!
    }
    close FILEHANDLE;
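    The same write-then-read cycle can be sketched in a more recent idiom: three-argument open with lexical filehandles, which is a swapped-in style and not the bareword style shown on the slides. The helper names write_lines and read_lines and the sample records are assumptions; File::Temp ships with Perl.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write each element of @data on its own line.
sub write_lines {
    my ($filename, @data) = @_;
    open(my $out, '>', $filename) or die "open of $filename failed: $!";
    foreach my $line (@data) {
        print $out "$line\n";      # note, no comma after the handle
    }
    close $out or die "close of $filename failed: $!";
}

# Hypothetical helper: read the file back, dropping the trailing newlines.
sub read_lines {
    my ($filename) = @_;
    open(my $in, '<', $filename) or die "open of $filename failed: $!";
    my @lines;
    while (my $line = <$in>) {
        chomp $line;
        push @lines, $line;
    }
    close $in;
    return @lines;
}

my ($tmp_fh, $filename) = tempfile(UNLINK => 1);   # scratch file, removed at exit
close $tmp_fh;

write_lines($filename, "first record", "second record");
print "$_\n" for read_lines($filename);
```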

    BASIC FILE I/O

    Predefined File Handles

    STDIN - input
    STDOUT - output
    STDERR - output
        print STDERR "big bad error occurred\n";
    <> - ARGV or STDIN

    READING WITH <>

    Reading from File
    • $input = <FILEHANDLE>;

    Reading from Command Line
    • $input = <>;

    Reading from Standard Input
    • $input = <STDIN>;
    • $input = <>;

    READING WITH <> (CONT.)

    Reading into Array Variable
    • @an_array = <FILEHANDLE>;
    • @an_array = <STDIN>;
    • @an_array = <>;


    PACKAGES

    Collect data & functions in a separate (private)

    namespace

    Reusable code

    PACKAGES

    Access packages by file name or path:

    require "getopts.pl";
    require "/usr/local/lib/perl/getopts.pl";
    require "../lib/mypkg.pl";

    PACKAGES

    Command: package pkgname;

    Stays in effect until next package or end of block { } or end of file.

    Default package is main

    PACKAGES

    Package name in variables

    $pkg::counter = 0;

    Package name in subroutines

    sub pkg::mysub ( ) { }
    &pkg::mysub($stuff);

    Old syntax in Perl 4

    sub pkg'mysub ( ) { }

    PACKAGES

    #
    # Get Day Of Month Package
    #
    package getDay;
    sub main::getDayOfMonth {
        my ($sec, $min, $hour, $mday) = localtime;
        return $mday;
    }
    1; # otherwise require or use would fail

    PACKAGES

    Calling the package

    require "/path/to/getDay.pl";
    $day = &getDayOfMonth;

    In Perl 5, you can leave off & for previously defined functions:

    $day = getDayOfMonth;

    WHAT ARE PERL MODULES?

    Modules are collections of subroutines

    Encapsulate code for a related set of processes

    File names end in .pm, so Foo.pm would be used as Foo

    Can form the basis for objects in object-oriented programming

    USING A SIMPLE MODULE

    List::Util is a set of list utility functions

    Read the perldoc to see what you can do

    Follow the synopsis or individual function examples

    LIST::UTIL

    # call a function by its fully qualified name
    use List::Util;
    my @list = 10..20;
    my $sum = List::Util::sum(@list);
    print "sum (@list) is $sum\n";

    # or, in a separate script, import the functions by name
    use List::Util qw(shuffle sum);
    my @list = (10,10,12,11,17,89);
    my $sum = sum(@list);
    print "sum (@list) is $sum\n";
    my @shuff = shuffle(@list);
    print "shuff is @shuff\n";

    MODULE NAMING

    Module names help identify the purpose of the module

    The symbol :: separates the parts of a module name; each part before the last maps directly to a directory

    List::Util is therefore a module called Util.pm located in a directory called List
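    The name-to-path mapping can be checked from Perl itself. In this sketch the helper module_to_path is hypothetical; %INC is Perl's built-in record of which files have been loaded.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: turn a module name into the relative path Perl searches for.
sub module_to_path {
    my ($module) = @_;
    (my $path = $module) =~ s{::}{/}g;   # each :: becomes a directory separator
    return "$path.pm";
}

my $path = module_to_path('List::Util');   # "List/Util.pm"

# %INC maps that relative path to the file that was actually loaded
print "List::Util maps to $path\n";
print "loaded from $INC{$path}\n";
```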


    (MORE) MODULE NAMING

    Does not require inheritance or specific

    relationship between modules that all start with

    the same directory name

    Case MaTTerS! List::util will not work

    Read more about a module by doing perldoc

    Modulename

    MODULES AS OBJECTS

    Modules are collections of subroutines

    Can also manage data (aka state)

    Multiple instances can be created (instantiated)

    Can access module routines directly on the object

    OBJECT CREATION

    To instantiate a module call new

    Sometimes there are initialization values

    Objects are registered for cleanup when they are set to undefined (or when they go out of scope)

    Methods are called using -> because we are dereferencing the object.

    SIMPLE MODULE AS OBJECT EXAMPLE

    #!/usr/bin/perl -w
    use strict;
    use MyAdder;

    my $adder = new MyAdder;
    $adder->add(10);
    print $adder->value, "\n";
    $adder->add(10);
    print $adder->value, "\n";

    my $adder2 = new MyAdder(12);
    $adder2->add(17);
    print $adder2->value, "\n";

    my $adder3 = MyAdder->new(75);
    $adder3->add(7);
    print $adder3->value, "\n";

    WRITING A MODULE: INSTANTIATION

    Starts with package to define the module name
    • multiple packages can be defined in a single module file - but this is not recommended at this stage

    The method name new is usually used for instantiation
    • bless is used to associate a data structure with a package, which turns it into an object

    WRITING A MODULE: SUBROUTINES

    The first argument to a subroutine from a module is always a reference to the object - we usually call it $self in the code.

    This is an implicit aspect of Object-Oriented Perl

    Write subroutines just like normal, but data associated with the object can be accessed through the $self reference.

    WRITING A MODULE

    package MyAdder;
    use strict;

    sub new {
        my ($package, $val) = @_;
        $val ||= 0;
        my $obj = bless { value => $val }, $package;
        return $obj;
    }

    sub add {
        my ($self, $val) = @_;
        $self->{value} += $val;
    }

    sub value {
        my $self = shift;
        return $self->{value};
    }

    1; # a module file must end with a true value

    WRITING A MODULE II (ARRAY)

    package MyAdder;
    use strict;

    sub new {
        my ($package, $val) = @_;
        $val ||= 0;
        my $obj = bless [$val], $package;
        return $obj;
    }

    sub add {
        my ($self, $val) = @_;
        $self->[0] += $val;
    }

    sub value {
        my $self = shift;
        return $self->[0];
    }

    1; # a module file must end with a true value

    USING THE MODULE

    Perl has to know where to find the module

    Uses a set of include paths
    • type perl -V and look at the @INC variable

    Can also add to this path with the PERL5LIB environment variable

    Can also specify an additional library path in the script:

    use lib "/path/to/lib";
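    A minimal sketch for inspecting the include path from inside a script. The directory /path/to/lib is the placeholder from the slide and does not need to exist for use lib to add it to @INC.

```perl
use strict;
use warnings;
use lib '/path/to/lib';    # placeholder directory; it is prepended to @INC

# Directories are searched in this order when a module is loaded.
# Entries from the PERL5LIB environment variable, if set, also appear here.
print "$_\n" for @INC;
```

    Because use lib runs at compile time, it takes effect before any later use statement in the same file.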

    USING A MODULE AS AN OBJECT

    LWP is a perl library for WWW processing

    Will initialize an agent to go out and retrieve web pages for you

    Can be used to process the content that it downloads

    LWP::USERAGENT

    #!/usr/bin/perl -w
    use strict;
    use LWP::UserAgent;

    my $url = 'http://us.expasy.org/uniprot/P42003.txt';
    my $ua = LWP::UserAgent->new(); # initialize an object
    $ua->timeout(10); # set the timeout value
    my $response = $ua->get($url);
    if ($response->is_success) {
        # print $response->content; # or whatever
        if( $response->content =~ /DE\s+(.+)\n/ ) {
            print "description is '$1'\n";
        }
        if( $response->content =~ /OS\s+(.+)\n/ ) {
            print "species is '$1'\n";
        }
    }
    else {
        die $response->status_line;
    }

    OVERVIEW OF BIOPERL TOOLKIT

    Bioperl is...
    • A set of Perl modules for manipulating genomic and other biological data
    • An Open Source Toolkit with many contributors
    • A flexible and extensible system for doing bioinformatics data manipulation


    SOME THINGS YOU CAN DO

    Read in sequence data from a file in standard

    formats (FASTA, GenBank, EMBL, SwissProt,...)

    Manipulate sequences, reverse complement,

    translate coding DNA sequence to protein.

    Parse a BLAST report, get access to every bit of

    data in the report

    Dr. Mikler will post some detailed tutorials

    MAJOR DOMAINS COVERED

    Sequences, Features, Annotations

    Pairwise alignment reports

    Multiple Sequence Alignments

    Bibliographic data

    Graphical Rendering of sequence tracks

    Database for features and sequences

    ADDITIONAL DOMAINS

    Gene prediction parsers

    Trees, Parsing Phylogenetic and Molecular Evolution software output

    Population Genetic data and summary statistics

    Taxonomy

    Protein Structure

    SEQUENCE FILE FORMATS

    Simple formats - without features
    • FASTA (Pearson), Raw, GCG

    Rich Formats - with features and annotations
    • GenBank, EMBL
    • Swissprot, GenPept
    • XML - BSML, GAME, AGAVE, TIGRXML, CHADO

    PARSING SEQUENCES

    Bio::SeqIO
    • multiple drivers: genbank, embl, fasta,...

    Sequence objects
    • Bio::PrimarySeq
    • Bio::Seq
    • Bio::Seq::RichSeq

    LOOK AT THE SEQUENCE OBJECT

    Common (Bio::PrimarySeq) methods
    • seq() - get the sequence as a string
    • length() - get the sequence length
    • subseq($s,$e) - get a subsequence
    • translate(...) - translate to protein [DNA]
    • revcom() - reverse complement [DNA]
    • display_id() - identifier string
    • description() - description string

    DETAILED LOOK AT SEQS WITH ANNOTATIONS

    Bio::Seq objects have the methods
    • add_SeqFeature($feature) - attach feature(s)
    • get_SeqFeatures() - get all the attached features.
    • species() - a Bio::Species object
    • annotation() - Bio::Annotation::Collection

    FEATURES

    Bio::SeqFeatureI - interface

    Bio::SeqFeature::Generic - basic implementation

    Bio::SeqFeature::Similarity - some score info

    Bio::SeqFeature::FeaturePair - pair of features

    SEQUENCE FEATURES

    Bio::SeqFeatureI - interface - GFF derived
    • start(), end(), strand() for location information
    • location() - Bio::LocationI object (to represent complex locations)
    • score, frame, primary_tag, source_tag - feature information
    • spliced_seq() - for attached sequence, get the sequence spliced.

    SEQUENCE FEATURE (CONT.)

    Bio::SeqFeature::Generic
    • add_tag_value($tag,$value) - add a tag/value pair
    • get_tag_values($tag) - get all the values for this tag
    • has_tag($tag) - test if a tag exists
    • get_all_tags() - get all the tags

    ANNOTATIONS

    Each Bio::Seq has a Bio::Annotation::Collection via $seq->annotation()

    Annotations are stored with keys like 'comment' and 'reference'

    @com = $annotation->get_Annotations('comment')

    $annotation->add_Annotation('comment', $an)

    ANNOTATIONS

    Annotation::Comment
    • comment field

    Annotation::Reference
    • author, journal, title, etc

    Annotation::DBLink
    • database, primary_id, optional_id, comment

    Annotation::SimpleValue

    CREATE A SEQUENCE OUT OF THIN AIR

    use Bio::Seq;
    my $seq = Bio::Seq->new(-seq => 'ATGGGTA',
                            -display_id => 'MySeq',
                            -description => 'a description');
    print "bases 4 to 5 are ", $seq->subseq(4,5), "\n";
    print "my whole sequence is ", $seq->seq(), "\n";
    print "reverse complement is ", $seq->revcom->seq(), "\n";

    READING IN A SEQUENCE

    use Bio::SeqIO;
    my $in = Bio::SeqIO->new(-format => 'genbank',
                             -file => 'file.gb');
    while( my $seq = $in->next_seq ) {
        print "sequence name is ", $seq->display_id,
              " length is ", $seq->length, "\n";
        print "there are ", (scalar $seq->get_SeqFeatures),
              " features attached to this sequence and ",
              scalar $seq->annotation->get_Annotations('reference'),
              " reference annotations\n";
    }

    WRITING A SEQUENCE

    use Bio::SeqIO;
    # Let's convert swissprot to fasta format
    my $in = Bio::SeqIO->new(-format => 'swiss',
                             -file => 'file.sp');
    my $out = Bio::SeqIO->new(-format => 'fasta',
                              -file => '>file.fa');
    while( my $seq = $in->next_seq ) {
        $out->write_seq($seq);
    }

    A DETAILED LOOK AT BLAST PARSING

    3 Components
    • Result: Bio::Search::Result::ResultI
    • Hit: Bio::Search::Hit::HitI
    • HSP: Bio::Search::HSP::HSPI

    BLAST PARSING SCRIPT

    use Bio::SearchIO;
    my $cutoff = 0.001;
    my $file = 'BOSS_Ce.BLASTP';
    my $in = new Bio::SearchIO(-format => 'blast',
                               -file => $file);
    while( my $r = $in->next_result ) {
        print "Query is: ", $r->query_name, " ",
              $r->query_description, " ", $r->query_length, " aa\n";
        print " Matrix was ", $r->get_parameter('matrix'), "\n";
        while( my $h = $r->next_hit ) {
            last if $h->significance > $cutoff;
            print "Hit is ", $h->name, "\n";
            while( my $hsp = $h->next_hsp ) {
                print " HSP Len is ", $hsp->length('total'), " ",
                      " E-value is ", $hsp->evalue, " Bit score ",
                      $hsp->score, " \n",
                      " Query loc: ", $hsp->query->start, " ",
                      $hsp->query->end, " ",
                      " Sbjct loc: ", $hsp->hit->start, " ",
                      $hsp->hit->end, "\n";
            }
        }
    }

    BLAST Report

    Copyright (C) 1996-2000 Washington University, Saint Louis, Missouri USA.
    All Rights Reserved.

    Reference: Gish, W. (1996-2000) http://blast.wustl.edu

    Query= BOSS_DROME Bride of sevenless protein precursor.
    (896 letters)

    Database: wormpep87
    20,881 sequences; 9,238,759 total letters.

    Searching....10....20....30....40....50....60....70....80....90....100% done

                                                                       Smallest
                                                                         Sum
                                                                High  Probability
    Sequences producing High-scoring Segment Pairs:            Score  P(N)      N

    F35H10.10 CE24945 status:Partially_confirmed TR:Q20073...    182  4.9e-11   1
    M02H5.2 CE25951 status:Predicted TR:Q966H5 protein_id:...     86  0.15      1
    ZC506.4 CE01682 locus:mgl-1 metatrophic glutamate recept...   91  0.18      1

    USING THE SEARCH::RESULT OBJECT

    use Bio::SearchIO;
    use strict;
    my $parser = new Bio::SearchIO(-format => 'blast', -file => 'file.bls');
    while( my $result = $parser->next_result ) {
        print "query name=", $result->query_name, " desc=",
              $result->query_description, ", len=", $result->query_length, "\n";
        print "algorithm=", $result->algorithm, "\n";
        print "db name=", $result->database_name, " #lets=",
              $result->database_letters, " #seqs=", $result->database_entries, "\n";
        print "available params ", join(',', $result->available_parameters), "\n";
        print "available stats ", join(',', $result->available_statistics), "\n";
        print "num of hits ", $result->num_hits, "\n";
    }

    USING THE SEARCH::HIT OBJECT

    use Bio::SearchIO;
    use strict;
    my $parser = new Bio::SearchIO(-format => 'blast', -file => 'file.bls');
    while( my $result = $parser->next_result ) {
        while( my $hit = $result->next_hit ) {
            print "hit name=", $hit->name, " desc=", $hit->description,
                  "\n len=", $hit->length, " acc=", $hit->accession, "\n";
            print "raw score ", $hit->raw_score, " bits ", $hit->bits,
                  " significance/evalue=", $hit->evalue, "\n";
        }
    }

    TURNING BLAST INTO HTML

    use Bio::SearchIO;
    use Bio::SearchIO::Writer::HTMLResultWriter;

    my $in = new Bio::SearchIO(-format => 'blast',
                               -file => shift @ARGV);
    my $writer = new Bio::SearchIO::Writer::HTMLResultWriter();
    my $out = new Bio::SearchIO(-writer => $writer,
                                -file => '>file.html');
    $out->write_result($in->next_result);

    TURNING BLAST INTO HTML

    # to filter your output
    my $MinLength = 100; # need a variable with scope outside the method
    sub hsp_filter {
        my $hsp = shift;
        return 1 if $hsp->length('total') > $MinLength;
    }
    sub result_filter {
        my $result = shift;
        return $result->num_hits > 0;
    }
    my $writer = new Bio::SearchIO::Writer::HTMLResultWriter(
                     -filters => { 'HSP' => \&hsp_filter } );
    my $out = new Bio::SearchIO(-writer => $writer);
    $out->write_result($in->next_result);

    # can also set the filter via the writer object
    $writer->filter('RESULT', \&result_filter);

    CUSTOM URL LINKS

    @args = ( -nucleotide_url => $gbrowsedblink,
              -protein_url => $gbrowsedblink
            );
    my $processor = new Bio::SearchIO::Writer::HTMLResultWriter(@args);
    $processor->introduction(\&intro_with_overview);
    $processor->hit_link_desc(\&gbrowse_link_desc);
    $processor->hit_link_align(\&gbrowse_link_desc);

    sub intro_with_overview {
        my ($result) = @_;
        my $f = &generate_overview($result, $result->{"_FILEBASE"});
        $result->rewind();
        return sprintf(
            qq{Hit Overview
            Score: Red= (>=200), Purple 200-80, Green 80-50, Blue 50-40, Black
    [...]

    SEQUENCE RETRIEVAL SCRIPT

    [...] 'fasta');
    my $acc = 'AB077698';
    my $seq = $db->get_Seq_by_acc($acc);
    if( $seq ) {
        $out->write_seq($seq);
    } else {
        print STDERR "cannot find seq for acc $acc\n";
    }
    $out->close();

    SEQUENCE RETRIEVAL FROM LOCAL DATABASE

    use Bio::DB::Flat;
    my $db = new Bio::DB::Flat(-directory => '/tmp/idx',
                               -dbname => 'swissprot',
                               -write_flag => 1,
                               -format => 'fasta',
                               -index => 'binarysearch');
    $db->make_index('/data/protein/swissprot');
    my $seq = $db->get_Seq_by_acc('BOSS_DROME');