Integration of Tools



Upload: haden-creswell

Post on 12-Dec-2015


TRANSCRIPT

  • Slide 1

Integration of Tools

Click to start. This is best viewed as a slide show. To view it, click Slide Show on the top tool bar, then View Show.

Summary: The tour "How to cope with overwhelming information" described how difficult it sometimes is to get tools of genome analysis to work together. The present tour shows that the task is certainly not impossible. PhAnToMe / BioBIKE offers a common interface in which the results of one tool may be used as input to the next. In this example, a set of proteins defined by the results of a Blast search is aligned, and the alignment is used to make a phylogenetic tree.

  • Slide 2

Integration of Tools

To navigate to a specific slide, type the slide number and press Enter (works only within a Slide Show).

- How to get to PhAnToMe / BioBIKE (Slides 4-8)
- Problem: Find, characterize rII-like proteins (Slides 9-64)
  - Examine bacteriophage T4 genome (Slides 10-18)
  - Define set of proteins similar to rII, per Blast (Slides 19-37)
  - Align rII-like proteins (Slides 38-51)
  - Make phylogenetic tree of rII-like proteins (Slides 52-64)
- Reflections and coming attractions (Slide 65)

  • Slide 3

Integration of Tools

There are more tools useful in studying genomes than anyone would care to learn. It is often advantageous to combine tools, but doing so is frequently difficult. This problem is illustrated in the tour "How to cope with overwhelming information?" PhAnToMe/BioBIKE attempts to remove logistical barriers in combining tools, as illustrated in this tour.

Tools shown: Blast, Phylip, Clustal.

  • Slide 4

PhAnToMe/BioBIKE can be accessed by going to the PhAnToMe web site at www.phantome.org and mousing over the Tools menu. Be sure you are using Firefox. BioBIKE will not function with other browsers.

  • Slide 5

Then click The Phage BioBIKE.

  • Slide 6

Enter your e-mail address and click New Login.

  • Slide 7

The first time you log in, you'll be asked for identifying information. This is so that any changes you make in the database are associated with you. After filling in the fields, click Register.
  • Slide 8

An alternate route is through the BioBIKE portal at http://biobike.csbc.vcu.edu

  • Slide 9

However you get to BioBIKE, this is what you'll see. Now suppose that your goal is to characterize proteins similar to the rII protein of bacteriophage T4 (if you've never heard of this protein, no matter). Specifically:

- Find such proteins
- Align them
- Make a phylogenetic tree

  • Slide 10

First, let's take a look at phage T4. To do that, mouse over the Genome button

  • Slide 11

and click SEQUENCE-OF.

  • Slide 12

The SEQUENCE-OF function appears in the workspace. This function displays/returns the sequence of a gene, protein, genome, contig, replicon, or any arbitrary sequence you provide. To tell the function which sequence you want to see, click the entity box, selecting it for entry.

  • Slide 13

The entity box turns white and a cursor appears. You can type in the box, but unless you know the exact name of the phage, it's easier to pull the name off a menu. We want an organism (which is how BioBIKE considers phages), so mouse over the Organisms button

  • Slide 14

and mouse over the bacteriophage menu.

  • Slide 15

Scroll through the menu until you find phage T4. Note that the phages are arranged alphabetically by their host. Click T4 to bring it into the SEQUENCE-OF function.

  • Slide 16

Now the function is complete (no open white boxes). Mouse over the function's action icon (the green wedge in the upper left corner)

  • Slide 17

and click Execute.

  • Slide 18

Colored gene sequences are presented within the context of the genome and its annotation. You can scroll through the genome, or search for specific genes or sequences, but for now, just X out of the sequence viewer (but first note or copy the name of the rIIA gene, T4p001).

  • Slide 19

That was interesting, but... What was the problem again? OK. First step: find proteins with sequences similar to T4p001.
To do this, mouse over the Strings-Sequences button

Problem:

- Find such proteins
- Align them
- Make a phylogenetic tree

  • Slide 20

and click SEQUENCE-SIMILAR-TO.

  • Slide 21

SEQUENCE-SIMILAR-TO allows a few ways of finding similar sequences, but the most common is BLAST (the default choice). Like BLAST, the function needs a query sequence. Click the query box, and type the name of the gene, T4p001 (don't worry about upper/lower case). Then press Enter to close the box.

  • Slide 22

If you executed the function as it stands, it would search (by default) for protein matches. But if you didn't know this, you could specify explicitly what kind of search you want. To do this, mouse over the Options icon,

  • Slide 23

click Protein-vs-Protein (equivalent to BlastP), and click Apply. It's possible to limit the search to different classes of proteins, but we'll just accept the default: all proteins from all organisms and phages within PhAnToMe.

  • Slide 24

The function is complete, so execute it. One way is to double-click the name of the function, SEQUENCE-SIMILAR-TO. But this time we'll do it the same way as before, through the action icon.

  • Slide 25

Click Execute on the action menu.

  • Slide 26

The function displays the results in a popup window for human consumption, but it also shows the result in the Result Pane (this shows what is available for future computation).

  • Slide 27

There are evidently a great many known proteins that are similar to p-T4p001 (the protein encoded by the gene T4p001). Let's use this result. First, X out of the pop-up display.

  • Slide 28

The list of proteins can be used directly (e.g. to make an alignment), but it is better practice to give the list a name so that you can recall later what you did. To give it a name, mouse over the Definition button

  • Slide 29

and click DEFINE.

  • Slide 30

The DEFINE function asks for two things from you: the values you want to name, and the name of the variable that will contain these values.
The name can be anything you'll remember (upper/lower case doesn't count). First, the name of the variable. Click var to open up the variable box.

  • Slide 31

Type a name that makes sense (I chose rII-like) and press Enter to close the box. (The function cannot be executed if any box is open for entry.)

  • Slide 32

Next, the values. They were given by the function I just executed. Drag that function by clicking and holding the name of the function, SEQUENCE-SIMILAR-TO,

  • Slide 33

and dragging it towards the value box.

  • Slide 34

When it reaches the value box, the box will become highlighted in red. At that point, release the mouse

  • Slide 35

and the function will now reside in the value box. Execute this function as you have the others,

  • Slide 36

by clicking Execute on the function's Action menu. Be careful not to use the action menu of the inner function, SEQUENCE-SIMILAR-TO. That will work, in the sense that it runs the sequence comparison, but no definition will take place.

  • Slide 37

Nothing drastic seems to have happened, but if you look carefully, you'll note two changes. First, a list of phages has appeared in the Result pane. Second, a new Variables button has appeared. We'll use it momentarily.

  • Slide 38

We wanted to use the Blast results, now stored in rII-like, for what? Ah yes! The time has come to align the protein sequences. To do that, mouse over the Strings-Sequences menu

Problem:

- Find such proteins
- Align them
- Make a phylogenetic tree

  • Slide 39

and mouse over Bioinformatic-Tools

  • Slide 40

and click ALIGNMENT-OF.

  • Slide 41

The ALIGNMENT-OF function asks for a sequence list. Fortunately, you now have one. Click the sequence-list box

  • Slide 42

and mouse over your new Variables button

  • Slide 43

and click your new variable, rII-like, to bring it into the box.

  • Slide 44

The function is now ready for execution, but there are two ways you can tweak the function settings to make the output more useful.
To make these changes, mouse over the Options icon

  • Slide 45

and click Colored to produce a graphical alignment rather than pure text

  • Slide 46

and click Label-with-organism to cause the alignment lines to be labeled with the names of the proteins' organisms rather than the proteins themselves.

  • Slide 47

Finally, click Apply

  • Slide 48

and go to the action icon

  • Slide 49

to execute the completed function.

  • Slide 50

The graphical output is produced by a Java applet called Jalview. Activate the applet. It might take several seconds to complete the alignment.

  • Slide 51

A useful alignment, perhaps. Now on to the phylogenetic tree. First, X out of the alignment.

  • Slide 52

Back to the Strings-Sequences menu.

  • Slide 53

Go to the Phylogenetic Tree submenu

  • Slide 54

and click TREE-OF.

  • Slide 55

Note that TREE-OF is asking for an alignment. Provide one by dragging the completed ALIGNMENT-OF function into the alignment box. Click and hold the ALIGNMENT-OF box

  • Slide 56

and drag it towards its target, the alignment box.

  • Slide 57

You'll know you've gotten there when it becomes highlighted. Release the function.

  • Slide 58

The Colored option is no longer useful (the output it provides is just for human consumption, not for TREE-OF). Get rid of it by clicking its Delete icon.

  • Slide 59

You may have noticed that the alignment you produced before had many columns that were mostly gapped. These are given too much weight by phylogeny programs. To remove those columns, modify the behavior of ALIGNMENT-OF by mousing over its Options icon,

  • Slide 60

clicking the No-gapped-columns option,

  • Slide 61

and finally clicking Apply.

  • Slide 62

Now you're ready to execute in the usual way.

  • Slide 63

(This will take longer than the alignment, perhaps a few dozen seconds.)

  • Slide 64

You should soon receive, in separate popup windows, a phylogenetic tree based on the no-gaps alignment of the rII-like sequences. As one might expect, the rII protein from phage T4 clusters with proteins from other enterobacteriophages.
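
The No-gapped-columns step on Slides 59 to 61 removes alignment columns that are mostly gaps before the tree is built. The tour does not say how BioBIKE decides which columns qualify, so the Python below is only a minimal sketch of the idea, assuming a column is dropped when more than half of its characters are gaps. The function name, the threshold parameter, and the toy sequences are all invented for illustration.

```python
def drop_gapped_columns(alignment, max_gap_fraction=0.5, gap="-"):
    """Return a copy of the alignment without its mostly-gapped columns.

    alignment: dict mapping a label to an aligned sequence (equal lengths).
    A column is dropped when its fraction of gap characters exceeds
    max_gap_fraction (an assumed rule, not BioBIKE's documented one).
    """
    rows = list(alignment.values())
    if not rows:
        return {}
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all aligned sequences must have the same length")

    # Indices of the columns whose gap fraction is within the allowed limit.
    keep = [
        col for col in range(width)
        if sum(row[col] == gap for row in rows) / len(rows) <= max_gap_fraction
    ]
    return {label: "".join(seq[c] for c in keep)
            for label, seq in alignment.items()}


# Invented toy alignment: columns 2, 3 and 6 are mostly or entirely gaps.
toy = {
    "phage_A": "MK--LV-A",
    "phage_B": "MKT-LV-A",
    "phage_C": "MK--LVGA",
}
trimmed = drop_gapped_columns(toy)
for label, seq in trimmed.items():
    print(label, seq)
```

With the default threshold, the three sparse columns are removed and only the well-populated columns remain for tree building. Raising `max_gap_fraction` to 1.0 keeps every column, which corresponds to running the alignment without the option.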
  • Slide 65

Integration of Tools: Reflections and Coming Attractions

This tour presented three of the most common bioinformatic tools employed by biological researchers: searching by local alignment (Blast), multiple sequence alignment, and construction of phylogenetic trees. There are, of course, many, many more tools a researcher may find valuable, and the collective burden can be overwhelming. The case was presented that much is gained by putting the tools within a single interface, BioBIKE. Granted, BioBIKE has its own idiosyncrasies to learn, but at least it's just one set. The interface that permits access to multiple tools and databases also permits the creation of new tools conceived by a researcher to address an immediate need, and this topic is explored in the tour "Creating New Tools".
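
The chaining performed in this tour, where one function is dragged into the input box of the next, amounts to ordinary function composition. The Python below is a toy sketch of that data flow only. Its three functions borrow the names of SEQUENCE-SIMILAR-TO, ALIGNMENT-OF, and TREE-OF but are hypothetical stand-ins, not the BioBIKE API and not Blast, Clustal, or Phylip: similarity is the shared 3-mer fraction, "alignment" merely pads with gaps, and the tree is average-linkage clustering on mismatch fraction. The sequences are invented.

```python
from itertools import combinations


def kmers(seq, k=3):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sequence_similar_to(query, database, threshold=0.3):
    """Toy search: names whose 3-mer Jaccard similarity to query >= threshold."""
    q = kmers(query)
    hits = []
    for name, seq in database.items():
        s = kmers(seq)
        union = q | s
        if union and len(q & s) / len(union) >= threshold:
            hits.append(name)
    return hits


def alignment_of(names, database):
    """Toy alignment: right-pad each sequence with gaps to a common width."""
    width = max(len(database[n]) for n in names)
    return {n: database[n].ljust(width, "-") for n in names}


def mismatch_fraction(a, b):
    """Fraction of aligned positions at which a and b differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def tree_of(alignment):
    """Toy tree: repeatedly join the two clusters with the smallest average
    pairwise distance; the result is a nested tuple of labels."""
    clusters = {name: [name] for name in alignment}  # subtree -> member labels

    def average_distance(pair):
        a, b = pair
        dists = [mismatch_fraction(alignment[x], alignment[y])
                 for x in clusters[a] for y in clusters[b]]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=average_distance)
        members = clusters.pop(a) + clusters.pop(b)
        clusters[(a, b)] = members
    return next(iter(clusters))


# Invented toy "database" of protein sequences.
db = {
    "query": "MKLVAGHTRE",
    "close": "MKLVAGHSRE",
    "unrelated": "QWPNDCYFIM",
    "distant": "MKLIAGHTRD",
}

rii_like = sequence_similar_to(db["query"], db)   # search, then name the result
tree = tree_of(alignment_of(rii_like, db))        # output of one feeds the next
print(rii_like)
print(tree)
```

Naming the intermediate result (`rii_like`) mirrors the DEFINE step, and the nested call on the following line mirrors dragging ALIGNMENT-OF into the alignment box of TREE-OF.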