Using The Gene Ontology: Gene Product Annotation

GO Project Goals

• Compile structured vocabularies describing aspects of molecular biology
• Describe gene products using vocabulary terms (annotation)
• Develop tools:
  • to query and modify the vocabularies and annotations
  • annotation tools for curators

GO Data

GO provides two bodies of data:

• Terms with definitions and cross-references
• Gene product annotations with supporting data

The Three Ontologies

• Molecular Function: elemental activity or task
  nuclease, DNA binding, transcription factor
• Biological Process: broad objective or goal
  mitosis, signal transduction, metabolism
• Cellular Component: location or complex
  nucleus, ribosome, origin recognition complex

DAG Structure

Directed acyclic graph: each child may have one or more parents
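A minimal Python sketch of this structure, assuming a plain mapping from each term to its set of parents. The term names and edges are hypothetical placeholders, not taken from the GO files; acyclicity is checked with the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical toy graph: child term -> set of parent terms.
# Names and edges are illustrative only.
PARENTS = {
    "root": set(),
    "term_a": {"root"},
    "term_b": {"root"},
    "term_c": {"term_a", "term_b"},  # one child with two parents
}

def is_acyclic(parents):
    """Return True if the child -> parents mapping contains no cycle."""
    try:
        # Consuming static_order() raises CycleError when a cycle exists.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False

print(is_acyclic(PARENTS))
```

Storing parents rather than children keeps the "one or more parents" property visible in the data itself; a tree would allow only one entry per child.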

The True Path Rule

Every path from a node back to the root must be biologically accurate
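Code cannot judge biological accuracy, but it can list the paths a curator would need to review. The sketch below enumerates every path from a term up to the root; the graph and term names are hypothetical placeholders.

```python
# Hypothetical toy graph: child term -> list of parent terms.
PARENTS = {
    "root": [],
    "term_a": ["root"],
    "term_b": ["root"],
    "term_c": ["term_a", "term_b"],
}

def paths_to_root(term, parents):
    """Return every path from `term` up to a root, as lists of term names."""
    if not parents[term]:
        return [[term]]
    return [
        [term] + rest
        for parent in parents[term]
        for rest in paths_to_root(parent, parents)
    ]

for path in paths_to_root("term_c", PARENTS):
    print(" -> ".join(path))
```

A term with two parents yields at least two paths, and the rule applies to each of them, not just the most obvious one.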

GO Annotation

• Association between gene product and applicable GO terms
• Provided by member databases
• Made by manual or automated methods

DAG Structure

Annotate to any level within DAG

• mitosis: S.c. NNF1
• mitotic chromosome condensation: S.c. BRN1, D.m. barren
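Because annotations can sit at any level, a query for a general term usually also wants the products annotated to its more specific descendants. The sketch below assumes a direct parent link from "mitotic chromosome condensation" to "mitosis", as the slide's example suggests; the real ontology may place intermediate terms between them.

```python
from collections import defaultdict

# Edge assumed from the slide's example; not read from the ontology itself.
PARENTS = {
    "mitosis": [],
    "mitotic chromosome condensation": ["mitosis"],
}

# Direct annotations as shown on the slide.
DIRECT = {
    "mitosis": {"S.c. NNF1"},
    "mitotic chromosome condensation": {"S.c. BRN1", "D.m. barren"},
}

def ancestors(term, parents):
    """Return all terms reachable by following parent links from `term`."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen

def propagate(direct, parents):
    """Map each term to products annotated to it or to any descendant."""
    result = defaultdict(set)
    for term, products in direct.items():
        for target in {term} | ancestors(term, parents):
            result[target] |= products
    return dict(result)

implied = propagate(DIRECT, PARENTS)
print(sorted(implied["mitosis"]))
```

Under that assumption, a query for "mitosis" returns all three gene products, while a query for the more specific term returns only the two annotated directly to it.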

GO Annotation: Data

• Database object: gene or gene product
• GO term ID
• Reference
  • publication or computational method
• Evidence supporting annotation
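One way to picture these four fields is as a small immutable record. This is a sketch only: the class name, field names, and all values are hypothetical, and it does not reproduce any official annotation file format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Annotation:
    """One gene product annotation, with the four fields listed on the slide."""
    db_object: str   # gene or gene product identifier
    go_term_id: str  # GO term ID
    reference: str   # publication or computational method
    evidence: str    # evidence code supporting the annotation

# Placeholder values only; none of these are real identifiers.
example = Annotation(
    db_object="EXAMPLE_GENE",
    go_term_id="GO:0000000",
    reference="EXAMPLE_REF",
    evidence="IDA",
)
print(example)
```

Making the record frozen reflects that an annotation is a statement tied to a specific reference and evidence; a change in any field is a different annotation.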

GO Evidence Codes

IDA - Inferred from Direct Assay
IMP - Inferred from Mutant Phenotype
IGI - Inferred from Genetic Interaction
IPI - Inferred from Physical Interaction
IEP - Inferred from Expression Pattern
TAS - Traceable Author Statement
NAS - Non-traceable Author Statement
IC - Inferred by Curator
ISS - Inferred from Sequence or structural Similarity
IEA - Inferred from Electronic Annotation
ND - Not Determined
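The codes listed on this slide can be held in a simple lookup table for validating annotations. The helper name `describe` is a hypothetical choice for illustration.

```python
# Evidence codes and expansions exactly as listed on the slide.
EVIDENCE_CODES = {
    "IDA": "Inferred from Direct Assay",
    "IMP": "Inferred from Mutant Phenotype",
    "IGI": "Inferred from Genetic Interaction",
    "IPI": "Inferred from Physical Interaction",
    "IEP": "Inferred from Expression Pattern",
    "TAS": "Traceable Author Statement",
    "NAS": "Non-traceable Author Statement",
    "IC": "Inferred by Curator",
    "ISS": "Inferred from Sequence or structural Similarity",
    "IEA": "Inferred from Electronic Annotation",
    "ND": "Not Determined",
}

def describe(code):
    """Return the expansion of an evidence code, ignoring letter case."""
    try:
        return EVIDENCE_CODES[code.upper()]
    except KeyError:
        raise ValueError(f"unknown evidence code: {code!r}") from None

print(describe("imp"))
```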

GO Evidence Codes

From primary literature: IDA, IMP, IGI, IPI, IEP
From reviews or introductions: TAS, NAS
Automated: IEA

GO Annotation: Methods

• Manual
• Automated
  • sequence similarity
  • transitive annotation
  • nomenclature, other text matching

Literature-Based Manual Annotation: Experimental Evidence Codes

Lecoq, K., et al. (2001) YLR209C encodes Saccharomyces cerevisiae purine nucleoside phosphorylase. J. Bacteriol. 183(16): 4910–4913.

Experiment 1 - Purification and enzyme assay
Purified His-tagged Ylr209cp; can convert various nucleoside substrates to bases + Pi; inosine and guanosine are substrates
FUNCTION: purine nucleoside phosphorylase (IDA)

Experiment 2 - Knockout of YLR209C
Null mutant excretes inosine and guanosine into medium (compounds in medium separated by chromatography and identified by HPLC separation profiles)
PROCESS: purine nucleoside catabolism (IMP)

This paper has no data for cellular component.
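The outcome of this curation exercise can be written down as two records, one per experiment. The sketch below uses the term names from the slide instead of GO IDs, which the slide does not give; the record type and field names are hypothetical.

```python
from typing import NamedTuple

class PaperAnnotation(NamedTuple):
    gene: str
    aspect: str     # FUNCTION, PROCESS, or COMPONENT
    term: str       # GO term name as given on the slide (IDs omitted)
    evidence: str   # evidence code
    reference: str

REF = "Lecoq et al. (2001) J. Bacteriol. 183(16): 4910-4913"

annotations = [
    # Experiment 1: enzyme assay on the purified protein -> direct assay.
    PaperAnnotation("YLR209C", "FUNCTION",
                    "purine nucleoside phosphorylase", "IDA", REF),
    # Experiment 2: phenotype of the null mutant -> mutant phenotype.
    PaperAnnotation("YLR209C", "PROCESS",
                    "purine nucleoside catabolism", "IMP", REF),
]

# Report which of the three ontologies this paper leaves unannotated.
ASPECTS = ("FUNCTION", "PROCESS", "COMPONENT")
covered = {a.aspect for a in annotations}
missing = [aspect for aspect in ASPECTS if aspect not in covered]
print(missing)
```

The same paper supports two annotations with different evidence codes because each code describes the kind of experiment, not the paper as a whole.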

Automated Annotation: InterPro Example

• InterPro2go links InterPro entries and GO terms
• Run InterProScan to link YFP and InterPro entry
• Infer GO term from the other two links

[Diagram: YFP, InterPro entry, GO entry]
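The inference amounts to composing two mappings. In the sketch below both mappings are hard-coded stand-ins: one for InterProScan results, one for the InterPro2go table. Every identifier is a placeholder, not a real InterPro or GO accession, and the function name is hypothetical. Inferred terms are tagged IEA on the assumption that such automated mappings count as electronic annotation.

```python
# Stand-in for InterProScan output: protein -> matched InterPro entries.
protein_to_interpro = {
    "YFP": ["IPR_EXAMPLE_1", "IPR_EXAMPLE_2"],
}

# Stand-in for the InterPro2go mapping: InterPro entry -> GO terms.
interpro_to_go = {
    "IPR_EXAMPLE_1": ["GO_EXAMPLE_A"],
    "IPR_EXAMPLE_2": ["GO_EXAMPLE_A", "GO_EXAMPLE_B"],
}

def infer_go_terms(protein, protein_to_interpro, interpro_to_go):
    """Return (go_term, interpro_entry, evidence_code) tuples for `protein`.

    Each GO term is reported once, with the first InterPro entry that led to it.
    """
    inferred = []
    seen = set()
    for entry in protein_to_interpro.get(protein, []):
        for term in interpro_to_go.get(entry, []):
            if term not in seen:
                seen.add(term)
                inferred.append((term, entry, "IEA"))
    return inferred

for term, entry, code in infer_go_terms("YFP", protein_to_interpro, interpro_to_go):
    print(term, "via", entry, code)
```

Keeping the InterPro entry alongside each inferred term preserves the chain of reasoning, so the annotation can be traced back to the match that produced it.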

AmiGO Browser: detailed view of term

AmiGO Browser: gene products annotated to term

GO Annotation: Contributors

• FlyBase
• WormBase
• Saccharomyces Genome Database
• DictyBase
• Mouse Genome Informatics
• Gramene
• The Arabidopsis Information Resource
• Compugen, Inc.
• Swiss-Prot/TrEMBL/InterPro
• Pathogen Sequencing Unit (Sanger Institute)
• PomBase (Sanger Institute)
• Rat Genome Database
• The Institute for Genomic Research

GO Annotation: Organisms

• Fruit fly (Drosophila melanogaster)
• Budding yeast (Saccharomyces cerevisiae)
• Fission yeast (Schizosaccharomyces pombe)
• Human (Homo sapiens)
• Mouse (Mus musculus)
• Rice (Oryza sativa)
• Rat (Rattus norvegicus)
• Tsetse fly (G. morsitans)
• Caenorhabditis elegans
• Arabidopsis thaliana
• Vibrio cholerae
• Dictyostelium discoideum

Current GO Annotations

• FlyBase & Berkeley Drosophila Genome Project
• WormBase
• Saccharomyces Genome Database
• DictyBase
• Mouse Genome Informatics
• Gramene
• The Arabidopsis Information Resource
• Compugen, Inc.
• Swiss-Prot/TrEMBL/InterPro
• Pathogen Sequencing Unit (Sanger Institute)
• PomBase (Sanger Institute)
• Rat Genome Database
• Genome Knowledge Base (CSHL)
• The Institute for Genomic Research

www.geneontology.org

The Gene Ontology Consortium is supported by NHGRI grant HG02273 (R01). The Gene Ontology project thanks AstraZeneca for financial support. The Stanford group acknowledges a gift from Incyte Genomics.

Conference:

Standards and Ontologies for Functional Genomics (SOFG)

Towards unified ontologies for describing biology and biomedicine

17 – 20 November 2002

Hinxton Hall Conference Centre, Hinxton, Cambridge, UK

www.wellcome.ac.uk/hinxton/sofg