Gene Ontology John Pinney [email protected]

Author: harriet-day

Post on 18-Dec-2015

TRANSCRIPT

  • Slide 1
  • Gene Ontology John Pinney [email protected]
  • Slide 2
  • Gene annotation. Goal: transfer knowledge about the function of gene products from model organisms to other genomes.
  • Slide 3
  • Gene annotation. Problem: keyword systems are different between research communities.
  • Slide 4
  • Gene annotation. Solution: controlled vocabulary.
  • Slide 5
  • Ontology: structured controlled vocabulary
  • Slide 6
  • Ontology: a collection of terms and their definitions and the logical relationships between them
  • Slide 7
  • Gene Ontology (GO): a collection of terms and their definitions and the logical relationships between them describing gene products
  • Slide 8
  • nucleus (GO:0005634): A membrane-bounded organelle of eukaryotic cells in which chromosomes are housed and replicated. In most cells, the nucleus contains all of the cell's chromosomes except the organellar chromosomes, and is the site of RNA synthesis and processing. In some species, or in specialized cell types, RNA metabolism or DNA replication may be absent.
  • Slide 9
  • "part of" relationships:
      • nucleus is part of cell
      • nuclear membrane, nucleoplasm and nucleolus are each part of nucleus
  • Slide 10
  • "is a" relationships:
      • pronucleus is a nucleus
      • nucleus is an intracellular membrane-bounded organelle
      • intracellular membrane-bounded organelle is an intracellular organelle and is a membrane-bounded organelle
  • Slide 11
  • A term may have more than one parent term and more than one child term. => The gene ontology is not a tree
  • Slide 12
  • The gene ontology has a structure known as a Directed Acyclic Graph (DAG).
      • Directed: relationships are not symmetrical
      • Acyclic: there are no directed loops
      • Graph: mathematical term for a network
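
As a rough illustration of why GO is a DAG rather than a tree, the terms from the nucleus slides can be stored as child-to-parent edges. This is a minimal sketch: the `PARENTS` dictionary, the `is_a`/`part_of` labels as written here, and both function names are illustrative choices, not part of any GO software.

```python
# Toy fragment of the cellular component ontology, using terms from the slides.
# Each term maps to a list of (relationship, parent) edges. Edges point from
# child to parent only, which is what makes the graph directed.
PARENTS = {
    "nucleolus": [("part_of", "nucleus")],
    "nucleoplasm": [("part_of", "nucleus")],
    "nuclear membrane": [("part_of", "nucleus")],
    "pronucleus": [("is_a", "nucleus")],
    "nucleus": [
        ("is_a", "intracellular membrane-bounded organelle"),
        ("part_of", "cell"),
    ],
    "intracellular membrane-bounded organelle": [
        ("is_a", "intracellular organelle"),
        ("is_a", "membrane-bounded organelle"),
    ],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following parent edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relationship, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def has_multiple_parents(term, parents=PARENTS):
    """True when a term has more than one parent, which a tree would forbid."""
    return len(parents.get(term, [])) > 1


print(sorted(ancestors("nucleolus")))
print(has_multiple_parents("nucleus"))
```

Because the edges are one-way and never loop back, `ancestors("nucleus")` cannot contain `nucleolus`, even though `ancestors("nucleolus")` contains `nucleus`.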
  • Slide 13
  • GO is actually made up of 3 different ontologies:
      • cellular component
      • molecular function
      • biological process
  • Slide 14
  • cellular component: The part of a cell or its extracellular environment in which a gene product is located. A gene product may be located in one or more parts of a cell.
  • Slide 15
  • cellular component examples:
      • cohesin core heterodimer
      • extracellular region
      • laminin-1 complex
      • replication fork
      • transcription factor complex
  • Slide 16
  • molecular function: Elemental activities, such as catalysis or binding, describing the actions of a gene product at the molecular level. A given gene product may exhibit one or more molecular functions.
  • Slide 17
  • molecular function examples:
      • transcription factor binding
      • enzyme activator activity
      • 3'-nucleotidase activity
      • metallopeptidase activity
      • hexokinase activity
  • Slide 18
  • biological process: Those processes specifically pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms. A process is a collection of molecular events with a defined beginning and end.
  • Slide 19
  • biological process examples:
      • para-aminobenzoic acid biosynthetic process
      • protein localization
      • establishment of blood-nerve barrier
      • circadian rhythm
      • posterior midgut development
  • Slide 20
  • geneontology.org
  • Slide 21
  • search and browse the ontologies
  • Slide 22
  • geneontology.org search and browse the ontologies
  • Slide 23
  • geneontology.org download ontologies
  • Slide 24
  • geneontology.org: download mappings from other databases
      • enzyme functions (EC, KEGG, MetaCyc)
      • protein domains (Pfam, SMART, PRINTS)
      • other controlled vocabularies of functions (E. coli functions, MIPS FunCat)
  • Slide 25
  • geneontology.org download annotations for various genomes
  • Slide 26
  • geneontology.org: download annotations for various genomes
      • Example annotation: NCBI_NP NP_354299.2 lolD GO:0043190 ISS "ABC transporter, nucleotide binding/ATPase protein (lipoprotein)" taxon:176299 20070612 PAMGO_GAT
      • database: NCBI_NP
      • gene product ID: NP_354299.2
      • gene symbol: lolD
      • GO term ID: GO:0043190
      • evidence code: ISS
  • Slide 27
  • evidence codes: Allow curators to indicate the type of evidence for each gene-term annotation.
      • experimental, e.g. IMP (Inferred from mutant phenotype), IDA (Inferred from direct assay)
      • computational, e.g. ISS (Inferred from sequence similarity), IGC (Inferred from genome context)
      • author statement, e.g. TAS (Traceable author statement)
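
One way to put evidence codes to work is to group them by category and filter annotations on that basis. The sketch below covers only the five codes named on the slide; the function names are illustrative, and every record except the lolD one (taken from the annotation slide) is a made-up placeholder.

```python
# Evidence codes from the slide, grouped by the kind of support they indicate.
# Only the codes shown on the slide are listed here.
EVIDENCE_CATEGORIES = {
    "IMP": "experimental",      # Inferred from mutant phenotype
    "IDA": "experimental",      # Inferred from direct assay
    "ISS": "computational",     # Inferred from sequence similarity
    "IGC": "computational",     # Inferred from genome context
    "TAS": "author statement",  # Traceable author statement
}


def evidence_category(code):
    """Return the category for an evidence code, or 'unknown' if not listed."""
    return EVIDENCE_CATEGORIES.get(code.upper(), "unknown")


def filter_by_category(annotations, category):
    """Keep (gene_symbol, go_term_id, evidence_code) tuples in one category."""
    return [a for a in annotations if evidence_category(a[2]) == category]


annotations = [
    ("lolD", "GO:0043190", "ISS"),   # from the annotation slide
    ("geneA", "GO:0005634", "IDA"),  # hypothetical record
    ("geneB", "GO:0005634", "TAS"),  # hypothetical record
]
print(filter_by_category(annotations, "experimental"))
```

Restricting an analysis to experimental codes trades coverage for confidence, since many annotations in less-studied genomes are computational.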
  • Slide 28
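  • geneontology.org: download annotations for various genomes
      • Example annotation: NCBI_NP NP_354299.2 lolD GO:0043190 ISS "ABC transporter, nucleotide binding/ATPase protein (lipoprotein)" taxon:176299 20070612 PAMGO_GAT
      • database: NCBI_NP
      • gene product ID: NP_354299.2
      • gene symbol: lolD
      • GO term ID: GO:0043190
      • evidence code: ISS
      • description: "ABC transporter, nucleotide binding/ATPase protein (lipoprotein)"
      • organism (taxon) ID: taxon:176299
      • date: 20070612
      • annotation project ID: PAMGO_GAT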
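
The labelled fields above can be read into a small record type. This is a sketch under two assumptions that the slide does not confirm: that the fields are tab-separated, and that a line contains exactly the nine fields shown, in this order. Real downloaded annotation files may carry additional columns, so the `Annotation` type and `parse_annotation` function are illustrative only.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """The nine fields labelled on the slide, in the order shown there."""
    database: str
    gene_product_id: str
    gene_symbol: str
    go_term_id: str
    evidence_code: str
    description: str
    taxon_id: str
    date: str
    project_id: str


def parse_annotation(line):
    """Split one tab-separated line into an Annotation (assumed layout)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(Annotation._fields):
        raise ValueError(
            f"expected {len(Annotation._fields)} fields, got {len(fields)}"
        )
    # Drop surrounding whitespace and the quotes around the description.
    return Annotation(*(field.strip().strip('"') for field in fields))


# The example record from the slide, joined with tabs for this sketch.
EXAMPLE_LINE = "\t".join([
    "NCBI_NP",
    "NP_354299.2",
    "lolD",
    "GO:0043190",
    "ISS",
    '"ABC transporter, nucleotide binding/ATPase protein (lipoprotein)"',
    "taxon:176299",
    "20070612",
    "PAMGO_GAT",
])

record = parse_annotation(EXAMPLE_LINE)
print(record.gene_symbol, record.go_term_id, record.evidence_code)
```

Splitting on tabs rather than whitespace matters here because the description itself contains spaces and a comma.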
  • Slide 29
  • geneontology.org: repository of analysis tools that use GO
      • search, edit and browse ontologies / annotations
      • software libraries
      • statistical analysis
      • text mining
      • protein interactions
      • enrichment analysis
  • Slide 30
  • Enrichment analysis
  • Slide 31
  • A gene set is compared against the whole (annotated) genome. The gene set may come from:
      • significant expression change in a microarray experiment
      • cluster from a protein interaction network
      • some other experiment / analysis
  • Which GO terms occur significantly more often than expected in this gene set?
  • Tools: BiNGO, GOstat, ArrayTrack
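
The question "more often than expected" can be made concrete with a one-sided hypergeometric test, a common choice for this kind of over-representation analysis. The slides do not say which test each tool uses, so treat this as a sketch of the general idea and not as the method of BiNGO, GOstat or ArrayTrack. The function name and all counts in the usage example are hypothetical.

```python
from math import comb


def enrichment_p_value(genome_size, genome_with_term, set_size, set_with_term):
    """Probability of seeing at least `set_with_term` annotated genes.

    Models the gene set as `set_size` genes drawn without replacement from
    `genome_size` annotated genes, of which `genome_with_term` carry the GO
    term (the upper tail of a hypergeometric distribution).
    """
    total = comb(genome_size, set_size)
    upper = min(genome_with_term, set_size)
    tail = sum(
        comb(genome_with_term, k)
        * comb(genome_size - genome_with_term, set_size - k)
        for k in range(set_with_term, upper + 1)
    )
    return tail / total


# Hypothetical counts: 6000 annotated genes, 60 carry the term;
# the gene set has 100 genes, 8 of which carry the term.
p = enrichment_p_value(6000, 60, 100, 8)
print(f"p = {p:.3g}")
```

In practice this test is repeated for every GO term, so the resulting p-values need a multiple-testing correction before any term is called significant.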
  • Slide 32
  • Advantages of GO
      • single set of terms to describe the function of gene products from all organisms.
      • DAG structure provides a logical framework to represent knowledge at whatever level of detail is available.
      • continually revised to reflect the state of current knowledge.
      • can quantify strength of relationships between terms (semantic similarity).
      • many statistical analysis tools available.
  • Slide 33
  • Limitations of GO. GO is limited in scope: it does not cover
      • processes that are not normal functions of gene products (e.g. oncogenesis)
      • sequence attributes (e.g. introns/exons)
      • protein structures or interactions
      • evolution
      • gene expression
  • Slide 34
  • Summary (1) The gene ontology (GO) is a structured, controlled vocabulary to describe the function of gene products. Terms in GO have logical relationships (is a, part of) with one another. Together these form a structure called a Directed Acyclic Graph (DAG). GO is formed of 3 separate ontologies describing different aspects of gene function: cellular component, molecular function and biological process.
  • Slide 35
  • Summary (2)
      • geneontology.org is the central resource for downloading ontology, annotation and mapping files.
      • Evidence codes are used in annotations to show the experimental, computational or literature support for each function.
  • Slide 36
  • Summary (3) Many software tools are available to support GO analysis of experimental data, including enrichment analysis by:
      • ArrayTrack (microarray expression data)
      • BiNGO (protein interaction clusters)
      • GOstat (any data in the form of gene sets)