Gene Ontology, John Pinney [email protected]


Author: druce

Post on 24-Feb-2016



Gene annotation

Goal: transfer knowledge about the function of gene products from model organisms to other genomes.

Problem: keyword systems differ between research communities.

Solution: a controlled vocabulary.

Ontology

A structured controlled vocabulary: a collection of terms, their definitions, and the logical relationships between them.

Gene Ontology (GO)

A collection of terms, their definitions, and the logical relationships between them, describing gene products.

Example term: nucleus (GO:0005634)

A membrane-bounded organelle of eukaryotic cells in which chromosomes are housed and replicated. In most cells, the nucleus contains all of the cell's chromosomes except the organellar chromosomes, and is the site of RNA synthesis and processing. In some species, or in specialized cell types, RNA metabolism or DNA replication may be absent.

part of: the nuclear membrane, nucleoplasm and nucleolus are each part of the nucleus, and the nucleus is part of the cell.

is a: a pronucleus is a nucleus; a nucleus is an intracellular membrane-bounded organelle, which in turn is both an intracellular organelle and a membrane-bounded organelle.

A term may have more than one parent term and more than one child term, so the gene ontology is not a tree.

The gene ontology has a structure known as a Directed Acyclic Graph (DAG).

Directed: relationships are not symmetrical.
Acyclic: there are no directed loops.
Graph: the mathematical term for a network.

GO is actually made up of 3 different ontologies:

cellular component
molecular function
biological process

cellular component

The part of a cell or its extracellular environment in which a gene product is located. A gene product may be located in one or more parts of a cell.

cellular component examples:

cohesin core heterodimer
extracellular region
laminin-1 complex
replication fork
transcription factor complex

molecular function

Elemental activities, such as catalysis or binding, describing the actions of a gene product at the molecular level. A given gene product may exhibit one or more molecular functions.

molecular function examples:

transcription factor binding
enzyme activator activity
3'-nucleotidase activity
metallopeptidase activity
hexokinase activity

biological process

Those processes specifically pertinent to the functioning of integrated living units: cells, tissues, organs, and organisms. A process is a collection of molecular events with a defined beginning and end.

biological process examples:

para-aminobenzoic acid biosynthetic process
protein localization
establishment of blood-nerve barrier
circadian rhythm
posterior midgut development

geneontology.org

search and browse the ontologies
download ontologies
download mappings from other databases:
  enzyme functions (EC, KEGG, MetaCyc)
  protein domains (Pfam, SMART, PRINTS, ...)
  other controlled vocabularies of functions (E. coli functions, MIPS FunCat)
download annotations for various genomes

Example annotation record:

NCBI_NP  NP_354299.2  lolD  GO:0043190  ISS  "ABC transporter, nucleotide binding/ATPase protein (lipoprotein)"  taxon:176299  20070612  PAMGO_GAT

The first five fields are the database, gene product ID, gene symbol, GO term ID and evidence code.

evidence codes

Allow curators to indicate the type of evidence for each gene-term annotation.

experimental, e.g. IMP (Inferred from Mutant Phenotype), IDA (Inferred from Direct Assay)
computational, e.g. ISS (Inferred from Sequence Similarity), IGC (Inferred from Genome Context)
author statement, e.g. TAS (Traceable Author Statement)

geneontology.org: download annotations for various genomes

NCBI_NP  NP_354299.2  lolD  GO:0043190  ISS  "ABC transporter, nucleotide binding/ATPase protein (lipoprotein)"  taxon:176299  20070612  PAMGO_GAT

The fields are, in order: database, gene product ID, gene symbol, GO term ID, evidence code, description, organism (taxon) ID, date, annotation project ID.

geneontology.org: repository of analysis tools that use GO

search, edit and browse ontologies / annotations
software libraries
statistical analysis
text mining
protein interactions
enrichment analysis

Enrichment analysis

A gene set may come from:

significant expression change in a microarray experiment
cluster from a protein interaction network
some other experiment / analysis

The gene set is compared with the whole (annotated) genome: which GO terms occur significantly more often than expected in this gene set?

Tools: BiNGO, GOstat, ArrayTrack

Advantages of GO

Single set of terms to describe the function of gene products from all organisms.
DAG structure provides a logical framework to represent knowledge at whatever level of detail is available.
Continually revised to reflect the state of current knowledge.
Can quantify the strength of relationships between terms (semantic similarity).
Many statistical analysis tools available.

Limitations of GO

GO is limited in scope. It does not cover:

processes that are not normal functions of gene products (e.g. oncogenesis)
sequence attributes (e.g. introns/exons)
protein structures or interactions
evolution
gene expression

Summary (1)

The gene ontology (GO) is a structured, controlled vocabulary to describe the function of gene products.

Terms in GO have logical relationships (is a, part of) with one another. Together these form a structure called a Directed Acyclic Graph (DAG).

GO is formed of 3 separate ontologies describing different aspects of gene function: cellular component, molecular function and biological process.
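The DAG structure can be made concrete with a small sketch. It is only an illustration, not GO's own data model: the term names come from the nucleus example earlier in the slides, the edge directions are an assumed reading of that diagram, and `PARENTS`, `ancestors` and `is_acyclic` are hypothetical names introduced here.

```python
# Toy fragment of GO based on the nucleus example in the slides.
# Each term maps to its parents as (relationship, parent term) pairs.
# The edge directions are an assumed reading of the slide diagram.
PARENTS = {
    "pronucleus": [("is_a", "nucleus")],
    "nucleolus": [("part_of", "nucleus")],
    "nucleoplasm": [("part_of", "nucleus")],
    "nuclear membrane": [("part_of", "nucleus")],
    "nucleus": [
        ("is_a", "intracellular membrane-bounded organelle"),
        ("part_of", "cell"),
    ],
    "intracellular membrane-bounded organelle": [
        ("is_a", "intracellular organelle"),
        ("is_a", "membrane-bounded organelle"),
    ],
}


def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent edges from `term`."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for _relation, parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_acyclic(parents=PARENTS):
    """A graph has no directed loops if no term is its own ancestor."""
    return all(term not in ancestors(term, parents) for term in parents)


print(sorted(ancestors("nucleolus")))
print(is_acyclic())
```

Because "nucleus" has two parents here, the structure is not a tree, yet following parent edges never leads back to the starting term, which is what "acyclic" means.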

Summary (2)

geneontology.org is the central resource for downloading ontology, annotation and mapping files.

Evidence codes are used in annotations to show the experimental, computational or literature support for each function.
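As a minimal sketch of how such an annotation might be read, the code below parses the lolD record from the slides and looks up the category of its evidence code. It assumes a tab-separated line with exactly the nine fields named on the slide; the real annotation files carry additional columns, so the column positions here are an assumption. `Annotation`, `parse_record` and `evidence_category` are hypothetical names.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    """The nine fields named on the slide (assumed order, simplified)."""
    database: str
    gene_product_id: str
    gene_symbol: str
    go_term_id: str
    evidence_code: str
    description: str
    taxon_id: str
    date: str
    project_id: str


# Only the evidence codes listed in the slides, grouped as shown there.
EVIDENCE_CATEGORY = {
    "IMP": "experimental",
    "IDA": "experimental",
    "ISS": "computational",
    "IGC": "computational",
    "TAS": "author statement",
}


def parse_record(line):
    """Split one tab-separated line into an Annotation."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != len(Annotation._fields):
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    return Annotation(*fields)


def evidence_category(code):
    """Return the category of an evidence code, or 'unknown'."""
    return EVIDENCE_CATEGORY.get(code, "unknown")


# The example record from the slides, joined with tabs for this sketch.
RECORD = "\t".join([
    "NCBI_NP", "NP_354299.2", "lolD", "GO:0043190", "ISS",
    "ABC transporter, nucleotide binding/ATPase protein (lipoprotein)",
    "taxon:176299", "20070612", "PAMGO_GAT",
])

annotation = parse_record(RECORD)
print(annotation.gene_symbol, annotation.go_term_id,
      evidence_category(annotation.evidence_code))
```

Filtering on the category is one way to keep, for example, only experimentally supported annotations before further analysis.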

Summary (3)

Many software tools are available to support GO analysis of experimental data, including enrichment analysis by:

ArrayTrack (microarray expression data)
BiNGO (protein interaction clusters)
GOstat (any data in the form of gene sets)
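The question behind enrichment analysis, which GO terms occur significantly more often than expected in a gene set, can be sketched with a one-sided hypergeometric test. This is one common choice and is not claimed to be what the tools named above do. The gene names, term names and the functions `hypergeom_pvalue` and `enrich` are hypothetical, and the toy genome is invented for illustration.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) when drawing n genes from N, of which K carry the term."""
    total = comb(N, n)
    upper = min(n, K)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)
    )
    return favourable / total


def enrich(gene_set, annotations):
    """Return (term, p-value) pairs for the gene set, smallest p first.

    `annotations` maps every annotated gene in the genome to its GO terms.
    """
    genes = set(gene_set) & set(annotations)
    N, n = len(annotations), len(genes)
    terms = set().union(*(annotations[g] for g in genes))
    pvalues = {}
    for term in terms:
        K = sum(term in t for t in annotations.values())  # genome count
        k = sum(term in annotations[g] for g in genes)    # gene-set count
        pvalues[term] = hypergeom_pvalue(k, n, K, N)
    return sorted(pvalues.items(), key=lambda item: item[1])


# Invented toy genome: ten genes all carry "term B", three also "term A".
ANNOTATIONS = {f"g{i}": {"term B"} for i in range(1, 11)}
for gene in ("g1", "g2", "g3"):
    ANNOTATIONS[gene] = {"term A", "term B"}

RESULTS = enrich({"g1", "g2", "g3"}, ANNOTATIONS)
for term, p in RESULTS:
    print(term, p)
```

In this toy case "term A" is carried only by the three genes in the set, so it gets a small p-value, while "term B" is carried by every gene and gets a p-value of 1. A real analysis would also correct for testing many terms at once, which this sketch leaves out.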