

BioHealthBase: A Web-based Database and Analysis Resource for Francisella

Shubhada Godbole (1), Jyothi Noronha (1), Burke Squires (1), Victoria Hunt (1), Ed Klem (2), Aihui Wang (2), Chris Larsen (3), Barbara Mann (4) and Richard H. Scheuermann (1)

(1) Department of Pathology and Division of Biomedical Informatics, University of Texas Southwestern Medical Center, Dallas, TX 75390
(2) Northrop Grumman Information Technology, Rockville, MD 20850
(3) Vecna Technologies Inc., College Park, MD 20740
(4) University of Virginia, Charlottesville, VA 22904

Abstract

The BioHealthBase Bioinformatics Resource Center (www.biohealthbase.org) provides a comprehensive genomic and proteomic data repository for five groups of pathogens that pose a threat to public health. The bacterial pathogens in BioHealthBase include Mycobacterium tuberculosis, the causative agent of TB, and Francisella tularensis, the causative agent of tularemia. BioHealthBase includes genome sequences for seven Francisella strains: 3 type A, 3 type B, and F. novicida. New genomes will be added as they become available. Each genome can be searched for protein motifs and by predicted protein localization. Comprehensive protein functional annotations are available for each locus, including EC numbers and Gene Ontology (GO) annotations, protein structures, protein domains and motifs, orthologous groups, protein cellular localization, metabolic and signaling pathways, and immune epitopes. In addition to the integrated genomic data, BioHealthBase provides user-friendly interfaces for data retrieval, analysis and visualization to help biologists make the best use of the available information. The goal of BioHealthBase is to provide the scientific research community with a resource that facilitates bioinformatics analyses and the development of vaccines, diagnostics and therapeutics for these pathogens. An overview of the BioHealthBase database will be presented, with a focus on current resources for Francisella.

Supported by NIH N01AI40041

Bacterial Genome Annotation

Types of Data in BioHealthBase

Basic genome data from GenBank
• Genome sequence, gene predictions
• Protein functional annotations

Data enhancements in BioHealthBase
• Glimmer gene predictions
• EC numbers, GO annotations
• Operon predictions
• Orthologous groups
• Protein cellular localization
• Protein domains, motifs
• Protein secondary and 3-D structure
• Immune epitopes
• Mutant phenotype, mutation sites and links to mutant clone library resources

Tools in BioHealthBase
• Sequence similarity search (BLAST)
• Multiple sequence alignment (MUSCLE)
• Protein structure viewer (Jmol)
• Genome visualization (GBrowse)

Protein Structure Visualization

BioHealthBase Data Summary

Home Pages

Francisella genomes in BHB

New Developments

Query Interface

1. Select a strain(s): Schu 4
2. Select a data type: Gene Product Name
3. Select a search term: DNA Gyrase
4. Select data fields to view
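
The four query steps above amount to a filter followed by a projection over gene records. The sketch below illustrates that idea only: the poster does not describe BioHealthBase's schema or any programmatic interface, so the records, field names, locus identifiers and the run_query helper are all hypothetical.

```python
# Hypothetical in-memory gene records. Field names, loci and the second
# strain label are illustrative assumptions, not BioHealthBase data.
RECORDS = [
    {"strain": "Schu 4", "locus": "locus_0001",
     "gene_product_name": "DNA gyrase subunit A"},
    {"strain": "Schu 4", "locus": "locus_0002",
     "gene_product_name": "DNA gyrase subunit B"},
    {"strain": "Schu 4", "locus": "locus_0003",
     "gene_product_name": "shikimate kinase"},
    {"strain": "other strain", "locus": "locus_0004",
     "gene_product_name": "DNA gyrase subunit A"},
]


def run_query(records, strains, data_type, search_term, fields):
    """Mimic the four interface steps on a list of dict records.

    1. keep records whose strain is in `strains`
    2./3. keep records whose `data_type` field contains `search_term`
          (case-insensitive substring match, an assumed matching rule)
    4. return only the requested `fields`
    """
    term = search_term.lower()
    hits = []
    for rec in records:
        if rec["strain"] not in strains:
            continue
        if term not in str(rec.get(data_type, "")).lower():
            continue
        hits.append({name: rec[name] for name in fields})
    return hits


results = run_query(
    RECORDS,
    strains={"Schu 4"},
    data_type="gene_product_name",
    search_term="DNA Gyrase",
    fields=["locus", "gene_product_name"],
)
for row in results:
    print(row["locus"], row["gene_product_name"])
```

The same call with several strains and a different search term would correspond to a multi-strain query in the interface.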

Query Results Display

Users can:
1. Click on ‘Details’ to view the gene/protein information
2. Select genes for downloading information
3. Select genes for further analysis using Workbench

Analysis Download

Genome Browser

BioHealthBase Gene Details

Panels: Taxonomy, Gene Information, Operon, Protein information, EC number, Localization, Domains, Gene Ontology, Orthologs

Sequence Similarity Search Results

Future Development
• Francisella SNP data
• Protein molecular weight and isoelectric pH predictions
• Metabolic pathway data and visualization
• Enhanced query interface
• Comparative genomics tools (synteny viewer, whole genome alignment viewer)
• Immune epitope predictions
• Advanced functionalities for protein structure viewer (display epitopes, protein functional sites)
• Community annotation


Query Interface

1. Select a strain(s): CDC1551, H37Ra, H37Rv
2. Select a data type: Gene Product Name
3. Select a search term: Shikimate Kinase
4. Select data to view: All

Query Results Display

Users can:
1. Click on ‘Details’ to view the gene/protein information
2. Select genes for downloading information
3. Select genes for further analysis

Download Analysis

BioHealthBase Gene Details


Genome Browser