+ Multi-organism GO annotation David Osumi-Sutherland Gene Ontology

Upload: corey-mitchell

Post on 02-Jan-2016


TRANSCRIPT

+

Multi-organism GO annotation

David Osumi-Sutherland

Gene Ontology

+

+

“rif... [is] expressed on the infected red [blood] cell surface.”

Plasmodium falciparum

GP of one organism in CC of another

+

Plasmodium falciparum | Homo sapiens

• The species in the interaction are recorded... by entering two taxon IDs in the Taxon column:
• First taxon ID: species encoding the gene product
• Second taxon ID: other species in the interaction.
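A minimal sketch of how a tool might read such a two-taxon column, assuming the GAF-style `taxon:ID|taxon:ID` encoding. The names `TaxonPair` and `parse_taxon_column` are hypothetical, and 5833 and 9606 are the NCBI Taxonomy IDs for Plasmodium falciparum and Homo sapiens.

```python
from typing import NamedTuple, Optional


class TaxonPair(NamedTuple):
    encoding: int          # species encoding the gene product (first ID)
    other: Optional[int]   # other species in the interaction (second ID), if present


def _taxon_id(part: str) -> int:
    """Convert 'taxon:9606' to 9606, rejecting anything without the prefix."""
    prefix = "taxon:"
    if not part.startswith(prefix):
        raise ValueError(f"not a taxon reference: {part!r}")
    return int(part[len(prefix):])


def parse_taxon_column(value: str) -> TaxonPair:
    """Split a taxon column holding one ID or two pipe-separated IDs."""
    ids = [_taxon_id(part) for part in value.split("|")]
    if len(ids) > 2:
        raise ValueError(f"expected one or two taxon IDs, got {value!r}")
    return TaxonPair(ids[0], ids[1] if len(ids) == 2 else None)


pair = parse_taxon_column("taxon:5833|taxon:9606")
print(pair.encoding, pair.other)
```

The order of the two IDs carries the meaning here, which is why the sketch returns named fields rather than a bare list.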


Why not just plasma membrane?

GP of one organism in CC of another

+A duplicate cell component branch is growing under host-cell

Why? It keeps annotations of gene products located in CCs of the host organism separate from regular GO CC annotations.

Many terms duplicate the regular cell component hierarchy. THIS IS NOT SUSTAINABLE.

Some terms refer to components that are formed from the interaction of two organisms. Add a new parent class for these: multi-organism cell component.

+Proposed solution

Plasmodium falciparum parasite_of Homo sapiens

located_in

GO:0005886 plasma membrane

+ annotation extension: part_of Homo sapiens

+Proposed solution

New qualifier / GP to CC relation: located_in

• When used, replaces part_of as the default GP to CC relation in GPAD
• Tools used for grouping annotations need to group these annotations separately. Precedent: NOT

+Proposed solution

Replace the host CC term with the regular one, recording the taxon of the organism it is part of in the annotation extension.
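The replacement step could be sketched roughly as follows. Everything here is illustrative: `HOST_TO_REGULAR`, `CCAnnotation` and `retrofit` are hypothetical names, the mapping holds a single example pair (host cell plasma membrane to plasma membrane) that should be checked against the current ontology, and "RIF" is a placeholder gene product identifier. The extension is written in the `relation(ID)` form used for annotation extensions.

```python
from dataclasses import dataclass

# Illustrative mapping from a host-cell CC term to its regular counterpart.
# Assumption: one example pair only; a real table would come from the ontology.
HOST_TO_REGULAR = {
    "GO:0020002": "GO:0005886",  # host cell plasma membrane -> plasma membrane
}


@dataclass(frozen=True)
class CCAnnotation:
    gene_product: str  # placeholder identifier for the annotated gene product
    relation: str      # GP to CC relation (qualifier)
    go_id: str         # regular cell component term
    extension: str     # annotation extension, written as relation(ID)


def retrofit(gene_product: str, host_go_id: str, host_taxon: int) -> CCAnnotation:
    """Swap a host-cell term for the regular term plus a part_of taxon extension."""
    regular = HOST_TO_REGULAR[host_go_id]  # KeyError if the term is not mapped
    return CCAnnotation(
        gene_product=gene_product,
        relation="located_in",
        go_id=regular,
        extension=f"part_of(NCBITaxon:{host_taxon})",
    )


print(retrofit("RIF", "GO:0020002", 9606))
```

Keeping the host taxon in the extension rather than in the term is what lets the duplicated host-cell branch be dropped.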

+Proposed solution

Taxon extension, using standard relations between organisms: host_of, symbiont_of, parasite_of, vector_for…

Relations defined with the Population and Community Ontology (PCO).

+Proposed solution – retrofit when host not recorded

Plasmodium falciparum symbiont_of host organism *

located_in

GO:0005886 plasma membrane

+ annotation extension: part_of host organism *

* Generic host organism class from PCO

+Multi-organism process annotation

Homo sapiens | Human rhinovirus 3

+Proposed solution

Homo sapiens host_of Human rhinovirus 3

+ annotation extension: occurs_in Homo sapiens

Optionally record where the process takes place.

+

[Figure: organism pairs labelled with relations: mate_of; host_of, parasite_of, vector_for; host_of / parasite_of; bitten_by ?; pregnant_mother_of ? / fetus_of; ?]

+What does annotation mean?

Too strong:

• All instances of the RIF gene product are located in some plasma membrane that is part of some Homo sapiens.
• All instances of the ICAM-1 gene product are involved in some HRV3 virion attachment to (human) host cell.

+Capabilities are not necessarily realised

All sperm capable_of (some) fertilization. But not all sperm do. In fact, most do not.

All ICAM-1 gene product capable_of_part_of some 'virion attachment to host cell'. But most don’t.

+LEGO-ish class-level formalisation of 'lego' brick

Gask3 SubClassOf:
    capable_of some ('protein kinase activity'
        that (part_of some 'canonical wnt signalling')
        and (occurs_in some cytosol))

[Diagram labels: protein kinase activity; canonical Wnt signaling; P; Gask3b; cytosol]

+The same pattern works for three separate annotations

Gask3 SubClassOf:
    capable_of some 'protein kinase activity',
    capable_of some ('molecular_function' that (part_of some 'canonical wnt signalling')),
    capable_of some ('molecular_function' that (occurs_in some cytosol))
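As a rough illustration of how separate MF, BP and CC annotations could each be rendered with this one pattern, the sketch below emits Manchester-syntax strings. The function name `separate_annotations` is hypothetical, all labels are quoted for uniformity, and the output is plain text rather than a parsed OWL axiom.

```python
from typing import List, Optional

ROOT_MF = "molecular_function"  # generic function used when only BP or CC is known


def separate_annotations(
    gp: str,
    mf: Optional[str] = None,
    bp: Optional[str] = None,
    cc: Optional[str] = None,
) -> List[str]:
    """Return one class-level axiom string per annotation aspect supplied."""
    axioms = []
    if mf:
        axioms.append(f"{gp} SubClassOf: capable_of some '{mf}'")
    if bp:
        axioms.append(
            f"{gp} SubClassOf: capable_of some "
            f"('{ROOT_MF}' that (part_of some '{bp}'))"
        )
    if cc:
        axioms.append(
            f"{gp} SubClassOf: capable_of some "
            f"('{ROOT_MF}' that (occurs_in some '{cc}'))"
        )
    return axioms


for axiom in separate_annotations(
    "Gask3",
    mf="protein kinase activity",
    bp="canonical wnt signalling",
    cc="cytosol",
):
    print(axiom)
```

Each aspect becomes its own axiom, so an annotation that records only a process or only a location still fits the same template.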

+Using this pattern for localisation to CC of an interacting organism

RIF SubClassOf:
    capable_of some ('molecular function'
        that occurs_in_location some ('plasma membrane'
            that part_of some ('Homo sapiens'
                that host_of some 'Plasmodium falciparum 3D7')))

occurs_in_location = more general version of occurs_in that entails location rather than parthood

+Using this pattern for multi-organism process

ICAM1 SubClassOf:
    capable_of some ('molecular function'
        that part_of some ('receptor-mediated virion attachment to host cell'
            that occurs_in some ('Homo sapiens'
                that host_of some 'Human rhinovirus 3')))

ICAM1 encoded_by some 'Homo sapiens'

+Using this pattern for multi-organism process

Ba71V-98 SubClassOf:
    capable_of some ('molecular function'
        that part_of some ('virion attachment to host cell'
            that occurs_in some ('Cercopithecus aethiops'
                that host_of some 'African swine fever virus')))

Ba71V-98 encoded_by some 'African swine fever virus'

[Image: ASFV virion]

+Multi-organism lego brick

ICAM1 SubClassOf:
    capable_of some ('protein binding'
        that part_of some ('receptor-mediated virion attachment to host cell'
            that occurs_in some ('plasma membrane'
                that part_of some ('Homo sapiens'
                    that host_of some 'Human rhinovirus 3'))))

ICAM1 encoded_by some 'Homo sapiens'
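The nesting in the multi-organism brick is a chain of "that relation some class" restrictions, which can be folded together from the innermost class outwards. The helper `nest` below is a hypothetical sketch that only builds the expression string; it does not validate the relations or labels.

```python
from typing import List, Tuple


def nest(head: str, links: List[Tuple[str, str]]) -> str:
    """Fold (relation, class) links, listed outermost first, into one expression."""
    classes = [head] + [cls for _, cls in links]
    relations = [rel for rel, _ in links]
    expr = classes[-1]  # start from the innermost class
    for cls, rel in zip(reversed(classes[:-1]), reversed(relations)):
        expr = f"({cls} that {rel} some {expr})"
    return expr


function = nest(
    "'protein binding'",
    [
        ("part_of", "'receptor-mediated virion attachment to host cell'"),
        ("occurs_in", "'plasma membrane'"),
        ("part_of", "'Homo sapiens'"),
        ("host_of", "'Human rhinovirus 3'"),
    ],
)
axiom = f"ICAM1 SubClassOf: capable_of some {function}"
print(axiom)
```

Dropping the 'plasma membrane' link from the list gives the shorter process-only form, so both variants come from the same chain.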

+

OWL DEMO

+But this pattern can’t work directly for Lego construction

[Visualization labels: protein kinase activity; canonical Wnt signaling; P; Gask3b; cytosol; protein binding; Btrc; cytosol; required_for]

Individual: gobp-0000000001
    Types: 'canonical wnt signaling'

Individual: gomf-0000012345
    Types: 'protein kinase activity', occurs_in some cytosol, enabled_by some PR:P87654
    Facts: required_for gomf-0000012346

Individual: gomf-0000012346
    Types: 'protein binding', occurs_in some cytosol, enabled_by some Btrc

+Pattern is incompatible with current Lego pattern

Note: CC & GP are classes

+Possible Solutions:

Parallel translations?

• Same basic pattern used in each case
• Lego bricks capture a set of related findings in a single paper. Class-level assertion makes sense for this.
• Lego construction captures a model of how these fit together under some circumstances…

Change LEGO pattern

• CC & GP both become individuals

+Acknowledgments

Jane Lomax

Rebecca Foulger

Chris Mungall

Ramona Walls (PCO – relation defs)

Jie Zheng (PCO – relation defs)