data retrieval

Post on 17-Jul-2015


Data Retrieval

Access to Distributed Data

Biological data is widely distributed over the WWW.

Data can be retrieved by:
1. Search engines
2. Data retrieval tools

Search Engines

Examples of search engines:
- Google
- Yahoo! Search
- LeapFish
- Bing

Using search engines:
1. Relevant web pages can be found.
2. It is difficult to find the desired information.
3. It is difficult to find specific information.

Data Retrieval Tools

Dedicated tools that give molecular biologists access to information. The most widely used are:
1. Entrez
2. DBGET
3. SRS

Each of these allows:
- Text-based searching of a number of linked databases (DBs)
- Sequence searching

They differ in:
- The DBs they cover
- How the retrieved information is accessed and presented

Entrez

- WWW-based data retrieval system.
- Developed by NCBI (National Center for Biotechnology Information).
- Integrates information held in different DBs.

Databases covered by Entrez:
- Nucleic acid sequences – GenBank, RefSeq, PDB
- Protein sequences – SWISS-PROT, PIR
- 3D structures – MMDB
- Genomes – Many sources
- PopSet – From GenBank
- OMIM – OMIM
- Taxonomy – NCBI taxonomy database
- Books – Bookshelf
- ProbeSet – GEO (Gene Expression Omnibus)
- Literature – PubMed
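A text-based search against one Entrez database can also be issued programmatically. The sketch below is an illustration only, not part of the original slides: it assumes NCBI's E-utilities ESearch endpoint and only builds the query URL (no network request is made). The function name `esearch_url` and the example query are hypothetical.

```python
from urllib.parse import urlencode

# Base address of NCBI's E-utilities, the programmatic interface to Entrez.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=5):
    """Build an ESearch URL for a text query against one Entrez database.

    db     -- Entrez database name, e.g. "pubmed", "protein", "nucleotide"
    term   -- text query, optionally using field tags such as [Organism]
    retmax -- maximum number of record IDs to return
    """
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Hypothetical example query: human hemoglobin entries in the protein database.
url = esearch_url("protein", "hemoglobin AND human[Organism]")
print(url)
```

Fetching the printed URL returns the matching record IDs; switching `db` to another covered database (for example `pubmed`) runs the same text search there.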

DBGET

An integrated data retrieval system developed and maintained by:
- The Institute for Chemical Research (Kyoto University)
- The Human Genome Center (University of Tokyo)

Databases covered:
- Nucleic acid sequences – GenBank, EMBL
- Protein sequences – SWISS-PROT, PIR
- 3D structures – PDB
- Sequence motifs – PROSITE
- Enzyme reactions – LIGAND
- Literature – LITDB, Medline, etc.

SRS

SRS (Sequence Retrieval System):
- Data retrieval tool developed by EBI (European Bioinformatics Institute)
- Integrates 80 molecular biology DBs
- Open-source software (can be installed locally)
- Has an associated scripting language called Icarus

Genomics

What is Genomics? The study of genomes.

In addition to the coding regions (genes), genomics covers:
- Control elements
- Introns and exons
- Gene clusters
- Elements common to all chromosomes
- Episomal elements

Benefits of Genomics

Genome sequencing helps in:
- Identifying new genes (gene discovery)
- Looking at chromosome organization and structure
- Finding gene regulatory sequences
- Comparative genomics

These in turn lead to advances in:
- Medicine
- Agriculture
- Animal husbandry
- Biotechnology
- Evolutionary biology

Branches of Genomics

1. Structural Genomics – Building genomic maps, 3D structures
2. Functional Genomics – Transcriptomics, Proteomics, Metabolomics, Enzymes
3. Comparative Genomics – Population distribution and phenotypic associations
4. Evolutionary Genomics – Phylogenetic relationships
5. Pharmacogenomics – Interaction of drugs with genomes, drug discovery

Tools Required for Genomics

- Robotics – Sequencing
- Statistics – Software
- High-throughput assays – Microarrays
- High-speed computing – Database work
- Bioinformatics – Algorithms, graphics

Proteomics

- The proteome is the protein complement of the genome.
- Proteomics is the study of proteomes.
- Human genome = an estimated 30,000 to 60,000 genes
- Human proteome = an estimated 300,000 to 1,200,000 proteins

Reasons the proteome is larger than the genome:
- Multiple ORFs (open reading frames)
- PTM (post-translational modification)
- Internal peptide products
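To make the "multiple ORFs" point concrete, here is a minimal sketch of how one stretch of DNA can hold more than one open reading frame. It is an illustration under simplifying assumptions, not a gene finder: only the forward strand is scanned, an ORF is taken to run from ATG to the first in-frame stop codon, and the toy sequence and the `min_codons` threshold are made up for the example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, frame) for each ATG...stop ORF on the forward strand.

    start/end are 0-based positions (end is exclusive and includes the stop
    codon); frame is 0, 1 or 2. min_codons counts codons before the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Walk the sequence codon by codon in this reading frame.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open an ORF at the first start codon
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next one
    return orfs


# Toy sequence (made up) that carries ORFs in two different frames.
orfs = find_orfs("ATGAAATGCCCTAGGTAA")
for start, end, frame in orfs:
    print(f"frame {frame}: positions {start}-{end}")
```

On this toy input the scan reports one ORF in frame 0 and a second, overlapping ORF in frame 2, which is the sense in which a single sequence can give rise to more than one protein product.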

Goal: Identify all the proteins expressed by a cell or tissue.

Why study proteomics?
- mRNA levels do not always correlate with the levels of expressed proteins.
- Some samples (serum, urine) cannot be used for mRNA studies.
- PTMs cannot be detected from mRNA.
- The location of proteins cannot be determined from mRNA.

Specialized Proteomics

1. Expression proteomics
2. Cell map proteomics
3. PTM
4. Protein-protein interactions
5. Protein-ligand interactions
6. Protein structure

Proteomics Approach

1. Separate proteins using 2D electrophoresis.
2. Stain the gel.
3. Excise spots of interest.
4. Digest with trypsin.
5. Characterize peptides by mass spectrometry (MS), e.g. MALDI-TOF.
6. Compare peptide sequences with a database of sequences.
7. Identify the class of proteins.
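The digestion and database-comparison steps can be mimicked in software. The sketch below is a simplified illustration, not the procedure used by any particular search tool: it assumes the common rule that trypsin cleaves after K or R unless the next residue is P, ignores missed cleavages, and compares peptide sequences directly rather than measured masses. The protein names and sequences in the toy database are made up.

```python
def trypsin_digest(protein):
    """Cut a protein after every K or R that is not followed by P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        next_aa = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and next_aa != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # C-terminal peptide with no K/R
    return peptides


def best_match(observed, database):
    """Score each database protein by how many observed peptides it shares."""
    scores = {
        name: len(set(observed) & set(trypsin_digest(seq)))
        for name, seq in database.items()
    }
    return max(scores, key=scores.get), scores


# Toy database and observed peptides (hypothetical sequences).
database = {
    "protein_A": "MKTAYIAKQRPLEVK",
    "protein_B": "MSLLKGGTRAAWK",
}
observed = ["TAYIAK", "QRPLEVK"]

match, scores = best_match(observed, database)
print(trypsin_digest(database["protein_A"]))
print(match, scores)
```

Here both observed peptides occur in the in silico digest of `protein_A` and none in `protein_B`, so `protein_A` is reported as the best match.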

Methods to Study Protein-Protein Interactions

1. Yeast two-hybrid (Y2H)
2. AP-MS (affinity purification-MS)

Protein microarrays can use immobilized
- Proteins
- Peptides
- Carbohydrates
- Antibodies
- Small molecules
to study other interactions.

Applications:
1. Protein mining
2. Differential expression profiling
3. Network mapping
4. Study of protein modifications
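As a rough illustration of network mapping, pairwise interactions (such as those reported by yeast two-hybrid or AP-MS experiments) can be collected into an undirected graph. The sketch below is one simple way to do this, assuming interactions arrive as pairs of protein names; the names `P1` to `P4` and the interaction list are invented for the example.

```python
from collections import defaultdict


def build_network(pairs):
    """Turn (protein, protein) interaction pairs into an adjacency map."""
    network = defaultdict(set)
    for a, b in pairs:
        network[a].add(b)  # interactions are treated as undirected
        network[b].add(a)
    return network


# Hypothetical interaction pairs.
pairs = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4")]
network = build_network(pairs)

# The protein with the most partners is a candidate hub of the network.
hub = max(network, key=lambda p: len(network[p]))
print(hub, sorted(network[hub]))
```

In this toy network `P3` interacts with all three other proteins, so it is picked as the hub.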
