persephone | persephone - features &...

Features & Functions


Post on 17-Oct-2020


Page 1: Persephone | Persephone - Features & Functionspersephone.net/wp-content/uploads/2015/01/Persephone... · 2017. 1. 24. · Persephone always enables BLAST functionality, so if there

Features & Functions


Table of Contents

Introduction
    A Note on Performance
    How to Use this Guide
Navigation Controls
Comparative Genomics
    Multi-species Physical Map Comparison
    Genetic & Physical Map Comparison
    Global Synteny
    Example - Scaffold Maps
    Example - Real-time BLAST
    Feature Cards and Properties
    Horizontal Viewing
Integration of polygenic traits (QTLs and Associations)
    QTL Study Selection
Integration of Genotype-by-Sequencing multi-sample comparison
    Sample SNP Selection
    Real-time Gene Translation
    Integrating with Sample SNP Data
    SNP Assay Design
Integration of Functional Genomics (expression data)
    Expression Sample Selection
    Multi-Dimensional Scaling & MA plot
Search and Export Functionality
    Search
    Export
Import functionality (simple excel)


Introduction

Persephone™ is a next-generation genomics visualization platform.

Just as Google Maps offers easy access to large amounts of geographic information, Persephone provides the same type of functionality for genomic information. Persephone makes visualizing and exploring diverse genomics data types easy and intuitive.

A Note on Performance

The Persephone CloudService uses Amazon Web Services (AWS) for data storage, so the physical distance between the storage location and the client running on your computer may affect the responsiveness of the software. You can check responsiveness by mousing over the latency symbol in the top-right corner of the screen. The latency shown here (483 ms) is an example of the client running in Los Angeles, CA, USA with the data stored at an AWS location in Ireland. On a local, internal server, an organization could expect latency of less than 30 ms, more than 16 times faster than in this example.

The Persephone CloudService data stored in AWS is a selected subset of publicly available information, and more will be added continually. The data was selected to show the majority of the features and functionality that exist in Persephone, and how easy it is to explore the different data types. The software has been optimized to handle very large, diverse datasets with minimal to no lag in performance.

How to Use this Guide

This tutorial guide shows examples of the different data that can be stored in the system, uses this data to illustrate common functions and best practices, and familiarizes you with the location and use of the exploration tools.


Navigation Controls

**It is highly recommended that a mouse with a scroll wheel be used with Persephone for increased performance and ease of use.

Persephone has several routine navigation controls, developed so that users can interact more fluidly with the data.

• Left Mouse – Single click
    o Normal select
• Left Mouse – Double-click
    o Double-clicking on a plate zooms in
• Left Mouse – Hold
    o In vertical view, holding down on a plate allows the user to move plates around
    o In horizontal view, holding down on a track allows the user to move tracks around
• Right Mouse – Click
    o Opens a menu of additional items for whatever the mouse pointer is hovering over (for certain content no additional menu items are available)
• Control + Left Mouse – Hold both simultaneously
    o Drag across a region to select it (the selection is marked by a green arrow) for viewing horizontally or for exporting the features in the region
• Shift + Left Mouse – Hold both simultaneously
    o Drag across a region to zoom into it (both vertical and horizontal views)
• Mouse wheel – Zoom
    o Zoom applies to the plates the mouse pointer is hovering over
    o With the pointer over a single vertical plate, the wheel zooms only that plate in and out. With the pointer between plates (not over a plate), the wheel zooms all plates in and out
    o The wheel works in both vertical and horizontal views


Comparative Genomics

A major feature that drove the development of Persephone was the ability to show synteny between species and between different map types. Synteny for physical maps is determined from homologous genes, while genetic and physical maps are aligned by DNA marker sequence.

Multi-species Physical Map Comparison

Visualize synteny between whole chromosomes and easily zoom into any region.

Select the Sorghum bicolor/Sorghum annotation Map Set and then select the Chr. 1 map. A map of sorghum chromosome 1 is displayed. The purple track holds the gene annotations and the other track holds marker information. Tracks can be turned on and off by clicking the yellow circle at the top, or removed using the Right Mouse – click menu over the track. Any number of tracks can be added; these two were selected for this example.

To bring up syntenic chromosomes in other species, Right Mouse – click on the sorghum track and select “Find Synteny” in the menu. Select Zea mays in the drop-down list. The results list the number of homologous genes available. Select Chr. 1. This displays corn Chr. 1 with a red gene track and a marker track. Connector lines link the homologous genes. At any time, both tracks can be zoomed simultaneously or each track separately.

Add yet another species: Right Mouse – click on the corn track and perform the same steps as above, selecting Oryza sativa and Chr. 3. This chromosome is displayed with a blue gene track, a green alternate gene annotation track, and a QTL track (the QTL track can be filtered using Right Mouse – click over the QTL track). Additional syntenic chromosomes can be brought up in the same way.

Any plate can be widened on the screen by hovering over the edge of the plate, then using Left Mouse – click and hold to pull the edge outward (for example, to widen the QTL track and see more of it).


Genetic & Physical Map Comparison

Visualize the relationship between genetic maps and physical maps (within the same species or between species).

Select the Map Set Zea mays/Maize IBMn 2008 and then select the chrom 1 map. This displays a map with a single track of markers. The markers are dense, so to see the actual marker names, zoom in on the plate. Next, select the Zea mays/Corn Annotation Map Set and then the Chr. 1 map. This displays a red gene annotation track and a marker track containing some of the markers from the IBM genetic map. Connectors are shown between these two map types. As before, any number of track types can be added to any type of map or map set.

Global Synteny

Global synteny is an important feature for understanding, at a high level, which chromosomes and chromosome regions are syntenic among different species. It gives the user a quick overview of the whole-genome comparison. Persephone also enables zooming into and selecting regions of interest.

On the main screen, select the Tools button from the top toolbar. In the Tools menu, select Synteny Matrix. In the example above, the user selected Oryza sativa/Rice annotation as the first Map Set and Sorghum bicolor/Sorghum annotation as the second. After selecting these two map sets, click the Display Synteny button at the top of the page. This displays the synteny of the 12 rice chromosomes vs. the 10 sorghum chromosomes in a matrix dot plot. The user can now easily see which chromosomes are syntenic, as well as chromosome breaks, inversions, duplications, etc.


Zoom and Additional Views - The user can zoom into a selected pair of syntenic chromosomes (in this example Rice 4 and Sorghum 6) using the same mouse controls (hold Shift + Left Mouse while dragging over the area to zoom; the mouse wheel works as well). Once a syntenic region has been identified for further scrutiny, Left Mouse – double click on the region of interest to launch the Maps tab vertical view with synteny connectors. The selected region is highlighted with a green arrow, so the user can move into the horizontal view with Left Mouse – click on the green arrow, or use Right Mouse – click on the green arrow to bring up the menu and Export any features available within this area (gene annotations and functions, markers, QTLs, etc.).
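The matrix dot plot can be thought of as one cell per chromosome pair, with one dot per homologous gene pair. The following is a minimal Python sketch of that idea only. The tuple layout, function names, and sample coordinates are hypothetical and are not Persephone's data format or API.

```python
from collections import defaultdict


def synteny_matrix(homolog_pairs):
    """Group homologous gene pairs into dot-plot cells.

    homolog_pairs: iterable of (chrom_a, pos_a, chrom_b, pos_b) tuples,
    one per homologous gene pair (hypothetical layout).
    Returns {(chrom_a, chrom_b): [(pos_a, pos_b), ...]}.
    """
    cells = defaultdict(list)
    for chrom_a, pos_a, chrom_b, pos_b in homolog_pairs:
        cells[(chrom_a, chrom_b)].append((pos_a, pos_b))
    return dict(cells)


def densest_partner(cells, chrom_a):
    """Return the chromosome sharing the most homologs with chrom_a, or None."""
    counts = {b: len(dots) for (a, b), dots in cells.items() if a == chrom_a}
    return max(counts, key=counts.get) if counts else None


# Made-up coordinates, for illustration only.
pairs = [
    ("rice4", 1_200_000, "sorghum6", 900_000),
    ("rice4", 1_450_000, "sorghum6", 1_100_000),
    ("rice4", 2_000_000, "sorghum2", 5_000_000),
]
cells = synteny_matrix(pairs)
print(densest_partner(cells, "rice4"))
```

Each cell's list of (pos_a, pos_b) points is what would be drawn as dots; a diagonal run of dots suggests a collinear block, and an anti-diagonal run suggests an inversion.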

Example - Scaffold Maps

Scaffold maps are an important part of sequencing new species (especially complicated species). These maps are usually small, and a species may have thousands of them. Persephone handles scaffold maps in a unique way. The 100 largest scaffolds are displayed in the maps field, and the user can search for additional scaffold maps by name. Because a given species may have thousands or tens of thousands of scaffolds, Persephone requires the user to have some knowledge of the naming scheme and of the scaffold they are interested in. Persephone always enables BLAST functionality, so if there is a sequence associated with a scaffold (or a user wants to see which scaffold holds a sequence, for example to align it with a marker on a genetic map), the user can BLAST this sequence and find the matching sequence within the scaffold maps.

Select the Map Set Glycine max/Glycine max annotations/Soybean scaffolds; the Maps view now lists the first 100 scaffolds from soybean. Note the search box at the bottom of the Maps view, which enables searching for additional scaffolds using the Map Name field. In the example above, the 4 largest soybean scaffolds were selected and displayed simultaneously. These scaffolds have been annotated and a few gene annotations are present. As with other plates and tracks, any number of additional tracks can be added if additional information is mapped to the scaffold sequences. All zooming, selection and export functionality is available as normal.
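The "show the largest 100, search the rest by name" behavior can be sketched in a few lines of Python. This is only an illustration of the idea. The dictionary of scaffold lengths, the function names, and the substring matching rule are assumptions, not a description of how Persephone implements the Map Name search.

```python
def largest_scaffolds(scaffolds, n=100):
    """Return the names of the n longest scaffolds, longest first.

    scaffolds: {scaffold_name: length_in_bp} (hypothetical layout).
    """
    ranked = sorted(scaffolds.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _length in ranked[:n]]


def search_scaffolds(scaffolds, query):
    """Case-insensitive substring search on scaffold names (assumed rule)."""
    q = query.lower()
    return sorted(name for name in scaffolds if q in name.lower())


# Made-up scaffold names and lengths.
scaffolds = {
    "scaffold_21": 480_000,
    "scaffold_22": 910_000,
    "scaffold_1035": 12_000,
    "scaffold_23": 650_000,
}
print(largest_scaffolds(scaffolds, n=2))
print(search_scaffolds(scaffolds, "1035"))
```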


A user can also use the Run BLAST functionality (next section) to evaluate how a sequence of interest aligns to scaffolds. In this example a protein sequence was the query for a tBLASTn search against the soybean genome, and multiple hits were seen across both the scaffolds and the genome. The BLAST results can be exported, or a hit can be selected for viewing. In this example, the best BLAST match came from a soybean scaffold sequence, which is displayed next to the BLAST results.

Example - Real-time BLAST

BLAST is widely used across many research methods in genomics and other disciplines. Persephone enables users to run BLAST against any species in the database, so that researchers can quickly locate features of interest and visualize them along with synteny and other available content (discussed in the next sections). In addition, several sequences can be BLASTed simultaneously and the results filtered by query sequence, so users working with multiple sequences do not have to run BLAST one sequence at a time.

Select Run Blast from the main toolbar (several BLAST parameters can be changed). Copy a protein sequence and paste it into the screen. Select Zea mays and run a tBLASTn. Results are shown graphically: the green line is the best hit, and other hits across the genome are shown in yellow.

Figure callouts: paste multi-FASTA formatted sequences into the query box; if there are multiple queries, the results can be sorted by the query column.
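For readers unfamiliar with the multi-FASTA format, the sketch below shows how several queries fit in one pasted block (each record starts with a ">" header line) and how a combined hit list could be narrowed to one query. It is a hedged illustration in Python: the hit dictionaries, their field names, and the example sequences are invented for this sketch and do not reflect Persephone's result format.

```python
def parse_multi_fasta(text):
    """Parse multi-FASTA text into {query_name: sequence}."""
    records = {}
    name = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            fields = line[1:].split()
            # Use the first word of the header as the name; fall back to a counter.
            name = fields[0] if fields else f"query_{len(records) + 1}"
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def hits_for_query(hits, query_name):
    """Keep only the hits for one query, highest bit score first."""
    selected = [h for h in hits if h["query"] == query_name]
    return sorted(selected, key=lambda h: h["bit_score"], reverse=True)


# Invented sequences and hits, for illustration only.
fasta = """>protA example protein
MKTAYIAKQR
QISFVKSHFS
>protB
MADEEKLPPG
"""
queries = parse_multi_fasta(fasta)
hits = [
    {"query": "protA", "map": "Chr. 1", "bit_score": 210.0},
    {"query": "protB", "map": "Chr. 5", "bit_score": 95.5},
    {"query": "protA", "map": "Chr. 9", "bit_score": 340.0},
]
print(sorted(queries))
print([h["map"] for h in hits_for_query(hits, "protA")])
```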


Select Display Map & Close, which brings up the selected chromosome and zooms into the selected gene. Clicking on the green bar next to the annotated gene information brings up additional information about the alignment (alignment-type data is only available when using tBLASTn).

On the alignment screen, clicking on different sections of the alignment shows detailed amino acid sequence information at the bottom of the screen.

Feature Cards and Properties

All features within the Persephone application have additional information associated with them. You can find this information by Left Mouse – clicking on the feature (gene annotation, marker, QTL, etc.). On these feature cards you can find sequence information, homolog/ortholog information for other species, and URLs that link out to external data sources (for example, phytozome.org for sorghum).

In addition, Right Mouse – click on features, maps, and tracks brings up a menu from which Properties can be selected to show additional information about the selected item.

All cards and property displays are highly configurable based on the data stored in the database.


Example of feature card for Gene Annotation, Marker Details, QTL details, and Property information for Brachypodium annotation.

Horizontal Viewing

Persephone allows vertical visualization (considered a macro view). Once a region is chosen for more detailed evaluation, it can be selected and viewed horizontally. In the horizontal view, zooming down to the single-nucleotide level is possible.

In the above example, a plate for Oryza sativa annotation chr. 1 was selected (with 3 tracks: 2 different gene annotation tracks and 1 QTL track). The plate was zoomed into a region, and this region was selected by dragging the mouse pointer while holding Control + Left Mouse (the selected region is marked by the green arrow). Left Mouse – click on this green arrow launches the horizontal viewer. All the same tracks are maintained from the vertical view (a track turned off in the vertical view will be off in the horizontal view, and vice versa).

Tracks can be turned on and off and rearranged as needed (to move a track, Left Mouse – click and hold on the left of the track and drag). In addition, motifs can be searched within this region, as seen above (“tataa” searched), and the results are then visualized. Additional sequence can be added by adjusting the coordinates at the top of the screen. When a region of more than 1 million nucleotides is displayed, the DNA sequence field at the bottom is not populated, but all other features remain available.
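A motif search such as the “tataa” example amounts to finding every occurrence of a short string within the region's sequence. The Python sketch below shows one simple way to do this. It is an assumption-laden illustration: it searches the forward strand only, matches case-insensitively, and reports 1-based start positions, none of which is confirmed as Persephone's behavior.

```python
def find_motif(sequence, motif):
    """Return 1-based start positions of motif in sequence.

    Matching is case-insensitive and overlapping hits are reported.
    Only the given (forward) strand is searched in this sketch.
    """
    seq, pat = sequence.upper(), motif.upper()
    if not pat:
        return []
    positions = []
    start = seq.find(pat)
    while start != -1:
        positions.append(start + 1)
        # Advance by one base so overlapping matches are not skipped.
        start = seq.find(pat, start + 1)
    return positions


# Made-up sequence containing three copies of the motif.
region = "ggTATAAccTATAAtataa"
print(find_motif(region, "tataa"))
```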

Any tracks that are seen on the vertical view can be viewed in this horizontal manner with selection of a targeted region. This includes QTL, GBS (genotype-by-sequencing) and Expression data as well.


Integration of polygenic traits (QTLs and Associations)

Persephone was developed to allow researchers to pull data from a variety of studies and different species, which can provide a more comprehensive interpretation of results and data. In addition, if researchers know which traits they are interested in but have no prior knowledge of genome coordinates, Persephone allows them to start their search at the trait level rather than at the genome coordinate level.

QTL Study Selection

The user launches the QTL Search screen by navigating to the QTL Search tab near the top toolbar. The QTL Search screen provides a variety of ways to filter the search criteria. Typing into the box above each column is also an easy way to filter for specific terms. As the user clicks filtering criteria on and off, all other columns are updated automatically to stay in sync with the selection. When the user has located a few studies of interest (in this example 3 studies were selected for Panicle Length in Rice), the user clicks on Show Selected Map Sets in the bottom-right corner of the screen.

This brings up all Rice chromosomes that have a QTL from the selected studies. Additional QTL filtering is possible by clicking on Show/Hide QTLs at the bottom left of the screen. In this example, only QTLs from these 3 studies that are identified as Panicle Length were selected. As the user selects QTLs in the Show/Hide QTLs screen, the QTLs on the maps behind the filtering screen change automatically.


After filtering, the user selects Chromosome 8 from Rice (it stays highlighted while all other chromosomes un-highlight) and clicks on Display Selected Maps at the bottom left of the QTL Search screen. This launches the Maps tab and displays the QTL on Rice chromosome 8 with all other available track types. The user can now perform many different functions. In this scenario, a region around the QTL was selected (green arrow) and the user chose to export the features within the region (genes, annotations, functions, protein sequence, etc.). The user could also select the green arrow region and move to the Horizontal view for a finer-resolution look at the information. In addition, the user has brought up a syntenic chromosome from sorghum, which may also help to interpret the data (remember, any number of tracks can be available in these maps). Export and synteny can both be reached with Right Mouse – click, which brings up the menu from which the appropriate function is selected.


Integration of Genotype-by-Sequencing multi-sample comparison

Whole Genome Sequencing (WGS) and Genotype-By-Sequencing (GBS) studies have become prevalent within the genomics fields. Persephone was designed to manage, store, and visualize this type of data for individual samples (genotypes). These datasets are important to capture, since comparing SNP data across sets of samples is a critical piece of the overall interpretation of results. In addition, trait data can be stored per sample, so simple single-marker association analyses can be done on the fly to suggest possible targets for further scrutiny.

The effects of these allele differences per genotype are also important to understand, especially if the mutation is within a gene. Real-time translation of genes makes it easier to evaluate, per sample, any amino acid changes that nucleotide mutations may cause.

Allowing researchers to start from a larger dataset (all sorghum GBS data, for example) rather than a single selected region of a genome provides significant new opportunities for interpreting results and finding new research ideas in the data.

The SNP Search tab can also contain data from SNP arrays (Rice and Arabidopsis data). The major difference between SNP array and GBS data is that SNP array data does not include coverage data.

Sample SNP Selection

Select Sorghum and then select all sorghum samples in the system. Each of the samples is displayed individually. By default all color coding of alleles is done using the reference sequence. In the above example, all blue is the same nucleotide as the reference, all red is the alternate nucleotide and green signifies a heterozygous position. Users can look at more than one chromosome at a time by selecting and deselecting what they wish to see.

Color coding can be adjusted by highlighting one of the samples that was selected and dragging this sample into the use as reference section. Now all samples will be re-colored and blue will now indicate a similar allele to this new reference, etc. Also, 2 parents can be selected in the same manner and all samples will be re-colored based on these selected 2 parents.
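The coloring rule described above (blue for the reference allele, red for the alternate allele, green for heterozygous) and the effect of choosing a sample as the new reference can be sketched as follows. This Python example is illustrative only: the two-letter genotype encoding, the function names, and the choice to leave a site uncolored when the chosen reference sample is heterozygous or missing are assumptions, not documented Persephone behavior.

```python
def classify_call(call, reference_allele):
    """Classify one diploid genotype call against a reference allele.

    call: two-character string such as "AA" or "AG" (hypothetical encoding).
    Returns "blue" (same as reference), "red" (alternate), "green"
    (heterozygous), or None when the call is missing.
    """
    if not call or len(call) != 2 or "N" in call:
        return None
    if call[0] != call[1]:
        return "green"
    return "blue" if call[0] == reference_allele else "red"


def recolor(samples, reference_sample):
    """Recolor every sample relative to one sample chosen as the reference."""
    ref_calls = samples[reference_sample]
    colored = {}
    for name, calls in samples.items():
        row = []
        for call, ref_call in zip(calls, ref_calls):
            # A heterozygous or missing reference call gives no single allele
            # to compare against, so the site is left uncolored (assumption).
            if not ref_call or len(ref_call) != 2 or ref_call[0] != ref_call[1]:
                row.append(None)
            else:
                row.append(classify_call(call, ref_call[0]))
        colored[name] = row
    return colored


# Three made-up samples genotyped at three sites.
samples = {
    "S1": ["AA", "GG", "CT"],
    "S2": ["AA", "AA", "CC"],
    "S3": ["GG", "AG", "CC"],
}
for name, colors in recolor(samples, "S2").items():
    print(name, colors)
```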

Different viewing styles can also be selected.

Once a selection is made, the data becomes useful only when it is mapped to the actual reference genome, so that zooming, layering, etc. are available. Left Mouse – click on the colored graphic moves this data to the Maps tab with the other available track data (the example below shows only the gene annotation track and the GBS sample track).

The user then zooms into a region on the chromosome and selects this region (Control + Left Mouse, hold and drag). The horizontal view is now displayed with all the same tracks that were seen in the vertical view. Coverage data is also displayed at the bottom of the screen and, when zoomed in, can be evaluated per nucleotide. All horizontal track functionality is available (zooming, moving, hiding tracks, etc.).

Real-time Gene Translation

If a gene is identified that has multiple alleles across the samples, a real-time translation can be done for each sample. When the pointer hovers over the GBS data under a gene, the area turns grey, which indicates that additional functionality is available. Right Mouse – click on this area and select “Generate Translation for Gene”; a new screen pops up showing a separate amino acid translation for each sample. If a SNP has caused an anomaly (a change in amino acid), a red flag is shown, and clicking on this flag opens a zoomed-in view of the actual amino acid sequence for further evaluation. This was developed as a fluid mechanism for understanding the effect a SNP may have on a gene in each sample.
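Conceptually, per-sample translation means translating each sample's version of the coding sequence and flagging positions where the amino acid differs from the reference. The Python sketch below illustrates this with the standard genetic code. It is a simplified assumption-based example: it expects an intron-free coding sequence on the forward strand with substitutions only, and the function names and sample data are hypothetical rather than Persephone's implementation.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence; unknown codons become 'X'."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))


def flag_changes(reference_cds, sample_cds_by_name):
    """Return {sample: [(aa_position, reference_aa, sample_aa), ...]}."""
    ref_protein = translate(reference_cds)
    flags = {}
    for name, cds in sample_cds_by_name.items():
        protein = translate(cds)
        flags[name] = [
            (pos + 1, ref_aa, aa)
            for pos, (ref_aa, aa) in enumerate(zip(ref_protein, protein))
            if ref_aa != aa
        ]
    return flags


# Made-up sequences: one synonymous SNP and one amino acid change.
reference = "ATGGCTGAATAA"
sample_seqs = {
    "sample_1": "ATGGCCGAATAA",  # GCT -> GCC, still alanine
    "sample_2": "ATGGCTGTATAA",  # GAA -> GTA, glutamate to valine
}
print(translate(reference))
print(flag_changes(reference, sample_seqs))
```

Only the second sample would receive a flag here, because the first sample's SNP does not change the encoded amino acid.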

Integrating with Sample SNP Data

In this next example all sorghum data was removed from the previous activities. The user now selects Rice and sorts the Rice list based on Panicle Length trait measurement. The user selects the 375 Rice samples and these are displayed. As before the color coding of samples can be changed, but default coloring is based on reference sequence. The user now clicks on the colored marker area and launches the Maps display tab.


Now all samples & markers are visualized on the physical chromosome sequence of Rice. This vertical view can be zoomed. An additional QTL track is now available for this rice chromosome; Right Mouse – click on the QTL track brings up the menu, from which the user can select Filter QTLs. In the above example, QTLs were filtered for Panicle Length to be consistent with the trait used to select samples.

After zooming into a region that contains a QTL, the user can select this region for further scrutiny. The green arrow indicates the selected region, and the user has decided to export/review the features available within it (for example, to view the functions of the genes within the selected region).


In addition, the user can Left Mouse – click this green highlighted region to launch the horizontal view (which maintains all tracks and filtering from the vertical view). The user can now identify genes within the selected region that have segregating alleles across the selected sample set. Once these genes have been identified, the user can Right Mouse – click on this region (over the marker data, where a grey shaded area appears) and select Generate Translation for Gene. This provides an amino acid sequence for each of the samples being viewed (375 in this example). If any of the SNPs cause an amino acid change in the sequence, these changes are flagged. The user can Left Mouse – click on a flag to zoom into the change and the affected sample.

SNP Assay Design

If this SNP is identified as something of interest, an assay can be designed from Persephone. The user uses Right Mouse – click on the SNP of interest and selects Design SNP assay from the menu. The sequence that is made available is created from all samples within the database, not just the samples that were selected. This is important because it lets the user see all mutations around the selected SNP and so design a more robust assay; using only the selected samples may miss nearby mutations, which could cause the assay to fail on a larger sample set. The Design SNP assay screen also enables the user to select the amount of sequence to evaluate upstream and downstream of the selected SNP.
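One common way to present a SNP assay template is to bracket the target SNP and mark neighboring variable positions with IUPAC ambiguity codes so that primers can avoid them. The Python sketch below illustrates that convention under stated assumptions: the input layout (a reference string plus a dictionary of alleles observed across all samples), the bracket notation, and the function name are hypothetical and are not taken from Persephone's Design SNP assay screen.

```python
# IUPAC ambiguity codes for two-allele sites; anything else falls back to "N".
IUPAC = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}


def assay_template(reference, snps, target_pos, flank=10):
    """Build a flanking-sequence template around a target SNP.

    reference: reference sequence string.
    snps: {1-based position: set of alleles seen across ALL samples}.
    target_pos: 1-based position of the SNP of interest.
    flank: number of bases to include on each side of the target.
    """
    start = max(1, target_pos - flank)
    end = min(len(reference), target_pos + flank)
    out = []
    for pos in range(start, end + 1):
        base = reference[pos - 1]
        alleles = snps.get(pos) or {base}
        if pos == target_pos:
            out.append("[" + "/".join(sorted(alleles)) + "]")
        elif len(alleles) > 1:
            # Neighboring variant: mark it so assay oligos can avoid it.
            out.append(IUPAC.get(frozenset(alleles), "N"))
        else:
            out.append(base)
    return "".join(out)


# Made-up reference and variant sites.
reference_seq = "ACGTACGTACGT"
known_snps = {4: {"T", "G"}, 6: {"C", "T"}, 9: {"A", "C", "G"}}
print(assay_template(reference_seq, known_snps, target_pos=6, flank=3))
```

Passing the variants from every sample in the database, rather than only the selected samples, is what exposes the nearby polymorphisms at positions 4 and 9 in this toy example.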


Integration of Functional Genomics (expression data)

Expression data is a key dataset in understanding and interpreting results from experiments. Overlaying expression data onto SNP and QTL data provides a more powerful set of information from which to interpret the data and draw new ideas and results.

The key importance with expression data is to understand the difference in gene expression between selected samples being analyzed. Using a static expression value is not as helpful as actually looking at samples, selecting what samples to analyze, and being able to identify a set of genes that can be further viewed and scrutinized.

Enabling researchers to begin research of data in this way allows each user to evaluate and integrate the data in unique ways as opposed to a more static view of data where tracks are loaded onto a selected region of a genome (which is also possible with Persephone).

The Expression functionality (like the GBS functionality) enables researchers to sift through data at a macro level (experiment or whole genome level) which leads to insights not seen if the data is only available at a micro level.

Expression Sample Selection

The user navigates to the Expression tab (found underneath the top toolbar). In the Select Map Set drop-down (near the top of the screen), the user selects Sorghum annotation (Sorghum bicolor). A Select Samples screen is launched, where the user can select the samples to investigate further. As on the SNP search page, a variety of information can be associated with samples, and this information can be used for selecting sample sets (for example, sample sets can be loaded by experiment and selected in this way). For this example, however, all the sorghum data in the system comes from a single experiment. The user selects the samples (all or a subset) from the Select Samples view and clicks Add. The user can now select Get Expression (underneath the chromosome column on the right of the screen), which provides a view of the expression on Chr. 1 for each sample (black shading above). Even in this macro expression view, differences can be seen among samples. The user then uses Left Mouse – click on the black shading, which launches the Maps tab view.

Now the user can see the expression on the physical chromosome, still at the macro level but in conjunction with any other track data that is available (in the example above, marker data was also brought in from the SNP Search tab for this chromosome and the selected samples). As the user zooms into the chromosome, the expression data becomes more meaningful. A red and blue scale is used: red represents a gene that is overexpressed in that sample, and blue represents a gene that is under-expressed in that sample. The width of the bars represents how strongly the gene is over- or under-expressed: the wider the bar, the more the gene is over- or under-expressed in that sample.
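The red/blue scale and bar width described above can be sketched as a small mapping function. This is only an illustration: the guide does not say what each sample is compared against or how widths are scaled, so the reference value, the log2 ratio, the pseudocount, and the names `bar_style`, `max_width`, and `max_abs_log2` are all assumptions.

```python
import math


def bar_style(sample_value, reference_value, max_width=10.0,
              max_abs_log2=4.0, pseudocount=1.0):
    """Return (color, width) for one gene in one sample.

    Assumption: over/under-expression is measured as a log2 ratio against
    a reference value (for example, the mean across the selected samples).
    """
    log_ratio = math.log2((sample_value + pseudocount) /
                          (reference_value + pseudocount))
    if log_ratio > 0:
        color = "red"    # overexpressed in this sample
    elif log_ratio < 0:
        color = "blue"   # under-expressed in this sample
    else:
        color = "none"   # no difference, nothing to draw
    # Wider bar for a stronger difference, capped at max_width.
    width = max_width * min(abs(log_ratio), max_abs_log2) / max_abs_log2
    return color, width


# Hypothetical values: 15 in the sample against a reference of 3.
print(bar_style(15, 3))
```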

The user then selects a region (identified by the green arrow) and Left Mouse – clicks this green region, which launches the Horizontal view. All tracks are maintained from the vertical view, and now, at the micro level, the user has gene annotation information plus marker, sequence coverage, and expression information for the selected samples. All zooming, track, feature card, and SNP functionality is available as normal.

Multi-Dimensional Scaling & MA plot

As an alternative to the above functionality, suppose the user wants to identify specifically which genes are differentially expressed among the selected samples. The user goes through the sample selection as described above and adds the samples to the Expression tab. Once the samples are added, the user clicks the MDS Plot button, which launches an MDS analysis of the selected samples. When the analysis is complete, the user clicks the Enable 3D button, which allows the MDS plot to be rotated to better differentiate clusters (Left Mouse – click and hold while spinning the 3D image). The user can also enable labels if desired.
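For readers unfamiliar with MDS, the following is a minimal sketch of classical (metric) MDS, which places samples in three dimensions so that the distances between points approximate the distances between their expression profiles. The guide does not state which MDS variant or distance measure Persephone uses; Euclidean distance, the power-iteration eigen-solver, and the function names here are assumptions chosen to keep the example dependency-free.

```python
import math
import random


def pairwise_distances(profiles):
    """Euclidean distance between every pair of sample profiles."""
    n = len(profiles)
    return [[math.dist(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]


def _matvec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def classical_mds(dist, dims=3, iters=500, seed=0):
    """Embed n samples in `dims` dimensions from an n x n distance matrix."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    rng = random.Random(seed)
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        # Power iteration finds the leading eigenvector of B.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = _matvec(b, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        eigval = sum(x * y for x, y in zip(v, _matvec(b, v)))
        scale = math.sqrt(max(eigval, 0.0))
        for i in range(n):
            coords[i][k] = v[i] * scale
        # Deflate so the next pass finds the next eigenvector.
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# Hypothetical expression profiles: one row per sample, one column per gene.
profiles = [(0, 0, 0), (10, 0, 0), (0, 3, 0), (0, 0, 1), (10, 3, 1)]
points_3d = classical_mds(pairwise_distances(profiles))
```

Samples with similar profiles end up close together in `points_3d`, which is what makes clusters visible when the plot is rotated.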

Once the 3D plot is oriented to the user's satisfaction, the user can lasso a cluster by holding Control and the Left Mouse button while drawing a circle around it. The default color is green. Once these samples are colored green, the user must select another cluster with a different color for differential analysis. Click on the screen to clear the circle that was drawn around the green samples, then select red (1) from the menu. Circle the next cluster in the same way to color-code it red.

Once two clusters are color-coded, Persephone automatically enables the Scatter Plot analysis. Click the Play button (arrow) to run the analysis. The result is an MA plot that can be viewed in a few different ways (display mode radio buttons). The user can now lasso the genes of interest (in the example above, those that are under-expressed). Once they are lassoed, information about each gene is provided, including coordinates, function, level of expression, etc. The user can select the blue link to view the gene feature card, or select the row and then Display Map (bottom right corner).
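As background, an MA plot places each gene at M (the log2 ratio between the two groups) against A (the average log2 expression). The sketch below computes both values for the green and red clusters and picks out under-expressed genes by a threshold, which stands in for the lasso. How Persephone normalizes the data is not described in this guide, so the group means, the pseudocount, the threshold, and the gene and sample names are assumptions.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def ma_values(expression, green_samples, red_samples, pseudocount=1.0):
    """Return {gene: (M, A)} comparing the red cluster to the green one.

    expression maps gene -> {sample: expression value}.
    M = log2(red) - log2(green); A = mean of the two log2 values.
    """
    result = {}
    for gene, by_sample in expression.items():
        green = math.log2(
            _mean([by_sample[s] for s in green_samples]) + pseudocount)
        red = math.log2(
            _mean([by_sample[s] for s in red_samples]) + pseudocount)
        result[gene] = (red - green, 0.5 * (red + green))
    return result


def under_expressed(ma, threshold=-1.0):
    """Genes whose M value is at or below the threshold (assumed cutoff)."""
    return sorted(gene for gene, (m, _a) in ma.items() if m <= threshold)


# Hypothetical genes and samples, for illustration only.
expression = {
    "geneA": {"s1": 7, "s2": 7, "s3": 1, "s4": 1},
    "geneB": {"s1": 3, "s2": 3, "s3": 3, "s4": 3},
}
ma = ma_values(expression, green_samples=["s1", "s2"],
               red_samples=["s3", "s4"])
print(under_expressed(ma))
```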

Once a gene is selected, the chromosome is automatically launched and zoomed in to that gene. All tracks and previous information can now be viewed along with the selected gene. In addition, if a gene has RNA-seq type expression data, the gene feature card has an RNA-seq values tab that holds the associated expression data. Selecting this tab and clicking Expression Analysis brings up the data from the expression study. The columns can be sorted, and the coloring of the results on the right-hand side adjusts according to sorting and column selection. In the example above, the Tissue column was sorted, and the user can clearly distinguish a major expression difference between leaf and stem tissue types.

Search and Export Functionality

Search

Persephone provides multiple search mechanisms, all found under the Search button on the top toolbar. In addition to single queries, Persephone supports searching with lists of names.

The Search function on the main toolbar provides a wealth of search options, including lists and wildcards for when the exact search term is not known. The search screen lets users search for markers, annotations, maps, and QTLs (each on a separate tab). In the above example, the user types Tar* into the Annotation Keyword field on the annotation tab and also selects Gene Function and Organism: Sorghum bicolor. The results can be copy-n-pasted into Excel for export, and are also shown as locations and matches mapped onto the sorghum genome. As you select rows in the chart, the corresponding location in the view is highlighted in blue. Once a selection is made, the user selects Display Map or Display Map & Close, and the view zooms automatically to the chromosome and selection.
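The wildcard and list searches described above can be illustrated with shell-style pattern matching from Python's standard library. This is a sketch of the idea rather than Persephone's search engine: case-insensitive matching, the `search` function, and the record layout are assumptions, and `fnmatch` also gives special meaning to `?` and `[...]`, which Persephone may not.

```python
from fnmatch import fnmatchcase


def search(records, patterns, field="name"):
    """Return records whose field matches any pattern (a list of names
    or wildcard patterns such as 'Tar*'), ignoring case."""
    lowered = [p.lower() for p in patterns]
    return [rec for rec in records
            if any(fnmatchcase(str(rec[field]).lower(), p) for p in lowered)]


# Hypothetical records, for illustration only.
records = [{"name": "Tarp1"}, {"name": "tar2"},
           {"name": "Star"}, {"name": "AQG1"}]

single = search(records, ["Tar*"])            # one wildcard query
listed = search(records, ["Tar*", "AQG*"])    # a list of queries
```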

As a second example, a QTL is searched for (on the QTL search tab). The user enters AQG* as the QTL Keyword with no further filtering criteria. A list of all matching QTLs is returned. Because no Organism was selected, the rows are not visualized on a map, but once a feature is selected, the map will launch and zoom in to the selected feature as before.

Click on the “…” to input a list

Export

Exporting data is a key need for researchers, who will want to move datasets and information into other applications for further study. Persephone offers a wealth of export functionality, from a whole genome down to a single nucleotide.

Export is possible from a variety of locations in Persephone, either by copy-n-paste or by a formal export into Excel (Right Mouse – click to bring up the menu; if export is available for that selection, the menu item Export will appear). Export can be done on a whole-genome annotation (with caution, as this will take some time), a whole chromosome, a selected region, etc. All available tracks appear as tabs in the export dialog box (as seen here: TIGR, mRNA, and QTL export).

The user has selected a region of a genome (green arrow), Right Mouse – clicked on the green arrow, and selected Export in the menu. Now the user can decide what data to export and which fields to export with it. The Fields list in the Export view is populated with all fields associated with that data type in the database. Depending on the data type, many more fields may be available for export.

SNP x sample data is also needed for a variety of analyses, so the ability to export it is important. A user has selected a set of samples in the SNP Search tab and then selected a region to scrutinize more thoroughly in the Maps view. The user can now Right Mouse – click on the SNP data and select Export SNPs from the menu.

This provides the above Excel-style chart, with samples in columns and SNP positions in rows. The matrix gives the allele for each combination where available (some GBS data will be missing for some samples). The whole chart can be copy-n-pasted into Excel for further manipulation.
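The layout of this matrix (SNP positions in rows, samples in columns, blanks where a call is missing) can be sketched as follows. The tab-delimited output, the `Position` header, and the function name are assumptions made for illustration; the actual export columns may differ.

```python
import csv
import io


def snp_matrix_tsv(calls, samples, missing=""):
    """Build a tab-delimited SNP x sample table.

    calls maps (position, sample) -> allele; absent pairs are left blank,
    as happens when GBS data is missing for a sample.
    """
    positions = sorted({position for position, _sample in calls})
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["Position"] + list(samples))
    for position in positions:
        writer.writerow(
            [position] + [calls.get((position, s), missing) for s in samples])
    return buffer.getvalue()


# Hypothetical genotype calls; sample S2 has no call at position 250.
calls = {(100, "S1"): "A", (100, "S2"): "G", (250, "S1"): "T"}
table = snp_matrix_tsv(calls, ["S1", "S2"])
```

Tab-delimited text of this kind pastes into a spreadsheet with one cell per value.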

Import functionality (simple Excel)

Simple Excel import is supported, a feature many researchers appreciate. A simple Excel file of features can be created and drag-n-dropped onto the Persephone application. The application deciphers the Excel file, and the data is visualized alongside the tracks already in the system. This is ideal for new QTL or SNP data that a researcher would like to evaluate against the other data in the system but is not ready to submit to the shared data repository for all to see. Any feature that has coordinates on a physical sequence can be loaded and viewed in this manner.

This Excel file is formatted into four columns: 1st column Chromosome, 2nd column Feature, 3rd column Start, and 4th column End (the 4th column is not necessary if the feature is a single nucleotide). The user saves this file (.xls) and drag-n-drops it onto the Persephone main Maps tab screen. This launches the above view, which displays what Persephone has deciphered from the Excel sheet. The user can load this data as a New Map Set by selecting that option and also selecting one of the chromosomes to view. After selecting a chromosome (or several, by holding down the Control key while selecting), the user selects Show Selected Maps, which brings up the map in the Maps view tab as normal.
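The four-column layout can be checked before the file is dropped onto Persephone. The sketch below validates rows that have already been read out of the sheet; it assumes data rows only (the guide does not say whether a header row is expected), and the function name, the dictionary keys, and the example features are hypothetical.

```python
def parse_feature_rows(rows):
    """Validate rows laid out as Chromosome, Feature, Start, End.

    End may be omitted or blank for a single-nucleotide feature, in which
    case it is taken to equal Start.
    """
    features = []
    for number, row in enumerate(rows, start=1):
        cells = list(row)
        if len(cells) < 3:
            raise ValueError(f"row {number}: need Chromosome, Feature, Start")
        start = int(cells[2])
        end_cell = cells[3] if len(cells) > 3 else None
        end = start if end_cell in (None, "") else int(end_cell)
        if end < start:
            raise ValueError(f"row {number}: End is before Start")
        features.append({"chromosome": str(cells[0]), "feature": str(cells[1]),
                         "start": start, "end": end})
    return features


# Hypothetical rows: one QTL interval and one single-nucleotide marker.
rows = [
    ("Chr.1", "qtl_example", 1200000, 1850000),
    ("Chr.1", "snp_example", 1500321),
]
features = parse_feature_rows(rows)
```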

The more useful approach is to show your Excel data integrated with all the other tracks available in the system. When the above screen pops up as before, this time select the species and annotation that the data belongs to (in this example, Sorghum bicolor/Sorghum annotation was selected). Once you select this, Persephone will match the nomenclature in your Excel file to what is available in the database (in this example, Chromosome 1 was matched). The user then selects Chr. 1 (or multiple chromosomes) and clicks Show Selected Maps, which displays the Excel data integrated with the database information.
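The guide does not describe how Persephone matches chromosome names in the file to those in the database. One plausible approach, shown purely as an assumption, is to reduce both sides to a common form before comparing; the prefixes handled and the function names below are hypothetical.

```python
import re


def normalize_chromosome_name(name):
    """Reduce 'Chromosome 1', 'Chr.1', 'chr01' and '1' to the same key."""
    text = str(name).strip().lower()
    text = re.sub(r"^(chromosome|chrom|chr)[\s._-]*", "", text)
    if text.isdigit():
        text = str(int(text))  # drop leading zeros
    return text


def match_to_database(file_names, database_names):
    """Map each name from the file to a database name, or None if unmatched."""
    by_key = {normalize_chromosome_name(n): n for n in database_names}
    return {n: by_key.get(normalize_chromosome_name(n)) for n in file_names}


matches = match_to_database(
    ["Chr.1", "chr02", "scaffold_9"],
    ["Chromosome 1", "Chromosome 2"],
)
```

Names that cannot be matched come back as `None`, so they can be reported to the user instead of being silently dropped.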

Persephone now displays a new track generated from the Excel file data (in this example, a marker track) alongside an existing public marker track and the gene annotations track. All zooming functionality still works, and the horizontal view also retains the Excel file data, so single-nucleotide resolution is possible.

New Track of marker data