Sharing Microarray Experiment Knowledge Chips to Hits Oct. 28, 2002 Chris Stoeckert, Ph.D. Dept. of Genetics & Center for Bioinformatics University of Pennsylvania

Page 1

Sharing Microarray Experiment Knowledge

Chips to Hits Oct. 28, 2002

Chris Stoeckert, Ph.D.

Dept. of Genetics & Center for Bioinformatics

University of Pennsylvania

Page 2

Nature, October 3, 2002

http://plasmodb.org/

David Roos, Jessie Kissinger, Bindu Gajria, Martin Fraunholz, Jules Milgram, Phil Labo, Amit Bahl, Dave Pearson, Dinesh Gupta, Hagai Ginsburg

Jonathan Crabtree, Jonathan Schug, Brian Brunk, Greg Grant, Trish Whetzel, Matt Mailman, Li Li

Page 3

Page 4

Desirable Microarray Queries

• Return all experiments using developmental stage X.
  – Sort by platform type
  – Which are untreated? Treated?
    • Treated by what?
• How comparable are these?
• What can these experiments tell me?
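
As a rough illustration of these queries, the sketch below filters a handful of made-up experiment records by developmental stage, sorts them by platform type, and separates treated from untreated samples. The record fields (`stage`, `platform`, `treatment`), the helper names, and the sample values are assumptions for illustration, not part of any MGED or RAD schema.

```python
from collections import defaultdict

# Hypothetical experiment annotations; field names and values are illustrative only.
EXPERIMENTS = [
    {"id": "exp1", "stage": "Stage 28", "platform": "spotted cDNA", "treatment": None},
    {"id": "exp2", "stage": "Stage 28", "platform": "oligonucleotide", "treatment": "Fenofibrate"},
    {"id": "exp3", "stage": "Stage 12", "platform": "spotted cDNA", "treatment": None},
    {"id": "exp4", "stage": "Stage 28", "platform": "spotted cDNA", "treatment": "Fenofibrate"},
]


def experiments_at_stage(experiments, stage):
    """Return experiments annotated with the given stage, sorted by platform, then id."""
    hits = [e for e in experiments if e["stage"] == stage]
    return sorted(hits, key=lambda e: (e["platform"], e["id"]))


def split_by_treatment(experiments):
    """Group experiment ids under 'untreated' or the name of the treatment applied."""
    groups = defaultdict(list)
    for e in experiments:
        groups[e["treatment"] or "untreated"].append(e["id"])
    return dict(groups)


stage_hits = experiments_at_stage(EXPERIMENTS, "Stage 28")
treatment_groups = split_by_treatment(stage_hits)
```

The exact-match filter on `stage` only finds every relevant experiment when all submitters use the same term for the same stage, which is the case for controlled vocabulary that the following slides make.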

Page 5

Microarray Information to be Shared

Figure from: David J. Duggan et al. (1999) Expression Profiling using cDNA microarrays. Nature Genetics 21: 10-14

Page 6

The Computational View of Microarray Information

Need an ontology to unambiguously represent this information.

Page 7

What is an Ontology?

• In philosophy, an ontology is a systematic account of Existence.
• In AI, an ontology is a systematic account of what can be represented.
• The knowledge of a domain is represented in a declarative formalism.
  – Classes, relations, functions, or other objects are defined with human-readable text describing what the names mean, and formal axioms that constrain the interpretation.
• A common ontology defines the vocabulary with which queries and assertions are exchanged.

Excerpted and adapted from: http://www-ksl.stanford.edu/kst/what-is-an-ontology.html
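
To make the definition concrete, here is a minimal sketch of an ontology as data: each class carries a human-readable definition, and one axiom (every non-root class must name a parent that is itself defined) constrains what counts as valid. The class names are taken from later slides; the definitions, the `OntologyClass` type, and the `check_axioms` and `is_a` helpers are invented for illustration and are not MGED text or tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OntologyClass:
    name: str
    definition: str          # human-readable text describing what the name means
    parent: Optional[str]    # None marks a root class


# Class names come from the slides; the definitions are placeholders, not MGED text.
CLASSES = {
    c.name: c
    for c in [
        OntologyClass("BioMaterialDescription", "Description of a biological sample.", None),
        OntologyClass("BiosourceProperty", "A property of the sample source.", "BioMaterialDescription"),
        OntologyClass("Age", "Time elapsed since an initial time point.", "BiosourceProperty"),
    ]
}


def check_axioms(classes):
    """Return names of classes that violate the axiom: a non-root parent must be defined."""
    return [c.name for c in classes.values() if c.parent is not None and c.parent not in classes]


def is_a(classes, name, ancestor):
    """Follow parent links upward to decide whether `name` is a kind of `ancestor`."""
    current = classes.get(name)
    while current is not None:
        if current.name == ancestor:
            return True
        current = classes.get(current.parent) if current.parent else None
    return False
```

Because the definitions and the axiom are plain data, two groups that share the same `CLASSES` table can exchange assertions such as "Age is a BiosourceProperty" without negotiating vocabulary first.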

Page 8

An Experimental Ontology

• An ontology for microarray experiments
  – Not an ontology of life but of experiments
  – Parts are applicable to describing experiments in general
• Our approach to interfacing with other ontologies is “experimental”
  – Not mapping terms from related ontologies
  – Provide a framework to hang other ontologies off of
    • Know where to find different types of annotation
    • How to interpret that annotation

Page 9

http://www.mged.org

Page 10

Relationship of MGED Efforts

[Diagram labels: MAGE; MIAME; DB; External Ontologies/CVs; MGED Ontology; Software and database developers; Investigators annotating experiments]

Page 11

The MGED Ontology Home Page

http://www.cbil.upenn.edu/Ontology

Page 12

The MGED Ontology Home Page

http://mged.sourceforge.net/ontologies/

Page 13

The MGED Ontology Provides a Listing of Resources for Many Species

Page 14

The MGED Ontology Organizes the Resources According to Concepts

Page 15

The MGED Ontology is Structured in DAML+OIL using OILed 3.4

Page 16

MGED Ontology: BiomaterialDescription: BiosourceProperty: Age

Page 17

MGED Ontology: BiosourceOntologyEntry: DiseaseState

Page 18

An example of microarray sample annotation using the MGED ontology
Susanna A. Sansone, Helen Parkinson, Philippe Rocca-Serra, Chris Stoeckert and Alvis Brazma

Layout: MGED Ontology class: MGED Ontology instance [External reference]

BioMaterialDescription
  BiosourceProperty
    Organism: Mus musculus musculus id: 39442 [NCBI Taxonomy]
    Age: 7 weeks after birth
    DevelopmentStage: Stage 28 [Mouse Anatomical Dictionary]
    Sex: Female
    StrainOrLine: C57BL/6N [International Committee on Standardized Genetic Nomenclature for Mice]
    BiosourceProvider: Charles River, Japan
    OrganismPart: Liver [Mouse Anatomical Dictionary]
  BioMaterialManipulation
    EnvironmentalHistory
      CultureCondition
        Temperature: 22 ± 2°C
        Humidity: 55 ± 5%
        Light: 12 hours light/dark cycle
        PathogenTests: Specified pathogen free conditions
        Water: ad libitum
        Nutrients: MF, Oriental Yeast, Tokyo, Japan
    Treatment
      CompoundBasedTreatment
        (Compound): Fenofibrate, CAS 49562-28-9 [ChemIDplus]
        (Treatment_application): in vivo, oral gavage
        (Measurement): 100mg/kg body weight
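
The annotation on this slide can be thought of as a tree whose leaves pair a value with an optional external reference. The sketch below encodes a few of those leaves and flattens them into ontology paths written in the same colon-separated style as the earlier slide titles. The `Leaf` type and `flatten` helper are hypothetical, only a subset of the values is included, and the pairing of values with external references assumes they correspond in the order they appear on the slide.

```python
from typing import NamedTuple, Optional


class Leaf(NamedTuple):
    value: str
    source: Optional[str] = None   # external reference, when the slide lists one


# A subset of the annotation from the slide, nested by MGED Ontology class.
ANNOTATION = {
    "BioMaterialDescription": {
        "BiosourceProperty": {
            "Organism": Leaf("Mus musculus musculus id: 39442", "NCBI Taxonomy"),
            "Sex": Leaf("Female"),
            "OrganismPart": Leaf("Liver", "Mouse Anatomical Dictionary"),
        },
        "BioMaterialManipulation": {
            "Treatment": {
                "CompoundBasedTreatment": {
                    "Compound": Leaf("Fenofibrate, CAS 49562-28-9", "ChemIDplus"),
                },
            },
        },
    },
}


def flatten(tree, prefix=()):
    """Yield (path, Leaf) pairs, where path joins class names with ': '."""
    for name, node in tree.items():
        path = prefix + (name,)
        if isinstance(node, Leaf):
            yield ": ".join(path), node
        else:
            yield from flatten(node, path)


flat = dict(flatten(ANNOTATION))
for path, leaf in flat.items():
    print(path, "=", leaf.value, f"[{leaf.source}]" if leaf.source else "")
```

Keeping the external source next to each value is what lets a reader of the annotation know where to look a term up and how to interpret it.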

Page 19

The MGED Ontology in Action: MIAMExpress

Page 20

Journals are Adopting the MGED Standards

Use of Minimum Information About a Microarray Experiment (MIAME)

Page 21

The MGED Ontology in Action: RAD

Page 22

Page 23

Generating Forms from the MGED Ontology

[Diagram labels: MGED Ontology; Anatomy; DevelopmentalStage; Disease; Lineage; PATOAttribute; Phenotype; Taxon; ExternalDatabases; SRES; OntologyEntry; RAD3; PHP/SQL; WWW; RAD Forms]
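
One way to picture form generation is a function that turns the controlled terms for an ontology class into an HTML drop-down. The slide shows PHP/SQL doing this for the RAD forms; the sketch below uses Python and an in-memory term list instead, and the function name `select_for` and the terms shown are illustrative assumptions.

```python
from html import escape


def select_for(class_name, terms):
    """Render an HTML <select> whose options are the controlled terms of a class."""
    options = "\n".join(
        f'  <option value="{escape(t, quote=True)}">{escape(t)}</option>' for t in terms
    )
    return f'<select name="{escape(class_name, quote=True)}">\n{options}\n</select>'


# Illustrative terms; a real form would read them from the ontology tables.
sex_field = select_for("Sex", ["Female", "Male", "Unknown"])
print(sex_field)
```

Generating the field from the term list means the form cannot offer a value that the ontology does not define, so annotations entered through it stay queryable by controlled term.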

Page 24

Using the MGED standards in RAD

• RAD: RNA Abundance Database
  – Stoeckert et al. (2000) Bioinformatics
• RAD 3.0
  – MIAME compliant and MAGE supportive
  – Building importers and exporters for MAGE
• Incorporates MGED ontology
  – Uses OntologyEntry to point to internal tables and external resources
• Expand processing and analysis information storage
  – Driven by experience and new approaches
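
The OntologyEntry idea can be sketched with two tables: one row per controlled term, recording where the term comes from, and annotation rows that store only a reference to it. The table and column names below are simplified assumptions for illustration and are not the actual RAD 3.0 schema; SQLite stands in for the production database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical, simplified tables; not the real RAD 3.0 definitions.
    CREATE TABLE ontology_entry (
        id INTEGER PRIMARY KEY,
        category TEXT NOT NULL,        -- ontology class, e.g. Sex
        value TEXT NOT NULL,           -- controlled term
        external_source TEXT           -- NULL when the term is held internally
    );
    CREATE TABLE biomaterial_characteristic (
        biomaterial TEXT NOT NULL,
        ontology_entry_id INTEGER NOT NULL REFERENCES ontology_entry(id)
    );
    """
)
conn.executemany(
    "INSERT INTO ontology_entry VALUES (?, ?, ?, ?)",
    [
        (1, "Sex", "Female", None),
        (2, "OrganismPart", "Liver", "Mouse Anatomical Dictionary"),
    ],
)
conn.executemany(
    "INSERT INTO biomaterial_characteristic VALUES (?, ?)",
    [("sample_1", 1), ("sample_1", 2)],
)

# Query by controlled term rather than by free text.
rows = conn.execute(
    """
    SELECT b.biomaterial, o.category, o.value, o.external_source
    FROM biomaterial_characteristic AS b
    JOIN ontology_entry AS o ON o.id = b.ontology_entry_id
    WHERE o.category = ? AND o.value = ?
    """,
    ("OrganismPart", "Liver"),
).fetchall()
```

Because samples reference term rows instead of storing their own strings, correcting or re-sourcing a term is a single-row change that every annotation picks up.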

Page 25

RAD schema uses MAGE/MIAME

[Schema diagram: RAD tables connected by "1 to 0..*" and "0..1 to 0..*" relationships. Tables shown: Study, StudyDesign, StudyDesignDescription, StudyDesignAssay, StudyAssay, StudyFactor, StudyFactorValue, Assay, AssayLabeledExtract, LabelMethod, Channel, Array, ArrayAnnotation, ElementImp, ElementAnnotation, CompositeElementImp, CompositeElementAnnotation, Control, BioMaterialImp, BioMaterialCharacteristic, BioMaterialMeasurement, Treatment, Protocol, ProtocolParam, Acquisition, AcquisitionParam, RelatedAcquisition, Quantification, QuantificationParam, RelatedQuantification, ElementResultImp, CompositeElementResultImp, ProcessImplementation, ProcessImplementationParam, ProcessInvocation, ProcessInvocationParam, ProcessIO, ProcessResult, Analysis, AnalysisImplementation, AnalysisImplementationParam, AnalysisInvocation, AnalysisInvocationParam, AnalysisInput, AnalysisOutput, MAGEDocumentation, MAGE_ML, OntologyEntry]

MAGE: Experiment; Array; BioMaterial; BioAssay; BioAssayData; Protocol, Descr.; HigherLevelAnalysis

MIAME: Experimental Design; Array design; Samples; Hybridization, Measure; Normalization

Page 26

RAD is now part of GUS-3.0

GUS has 5 namespaces compartmentalizing different types of information.

Namespace   Domain                    Features
Core        Data Provenance           Workflows
SRes        Shared resources          Ontologies
DoTS        Sequence and annotation   Central dogma
RAD         Gene expression           MIAME/MAGE
TESS        Gene regulation           Grammars

Page 27

Data Integration

SRes: Ontologies
• GO
• Species
• Tissue
• Dev. Stage
Query fragment: acute myeloid leukemia

Core: Data Provenance
• Ownership
• Protection
• Algorithms
• Similarity
• Versioning
• Workflow
Query fragment: with sequence similarity to c-fos

DoTS: GenomicSequence
• Genes, gene models
• STSs, repeats, etc
• Cross-species analysis

DoTS: TranscribedSequence
• Characterize transcripts
• RH mapping
• Library analysis
• Cross-species analysis
• DOTS

DoTS: ProteinSequence
• Domains
• Function
• Structure
• Cross-species analysis
Query fragment: Transcription factors

RAD: TranscriptExpression
• Arrays
• SAGE
• Conditions
Query fragment: up-regulated in

TESS: Gene Regulation
• Binding Sites
• Patterns
• Grammars
Query fragment: and common promoter motifs

Page 28

GUS Supports Multiple Projects

[Diagram labels: AllGenes; PlasmoDB; EPConDB; Core, SRES, TESS, RAD, DoTS; Oracle RDBMS; Object Layer for Data Loading; Java Servlets; Other sites, other projects, e.g. GeneDB]

Available at http://www.gusdb.org

Page 29

Summary

• The MGED ontology is being developed within the microarray community to provide consistent terminology for experiments.
  – Make it easier and more accurate to annotate a microarray experiment.
  – Use structured fields and controlled terms to query databases.
• This community effort has resulted in a list of multiple resources for many species and a machine-readable document of microarray concepts, definitions, and values.
  – The MGED Ontology is a work in progress but can be used now to build forms for databases
• RAD has incorporated the MGED ontology for forms
  – Can export data from RAD into MAGE
  – RAD as part of GUS provides integration of gene expression, annotation, and sequence.

Page 30

Acknowledgements

• MGED Ontology
  – Helen Parkinson (EBI)
  – Trish Whetzel
  – The MGED Ontology Working Group
  – MAGE working group
• RAD/GUS
  – Brian Brunk
  – Jonathan Crabtree
  – Steve Fischer
  – Yongchang Gan
  – Greg Grant
  – Hongxian He
  – Li Li
  – Junmin Liu
  – Matt Mailman
  – Elizabetta Manduchi
  – Joan Mazzarelli
  – Shannon McWeeney (OHSU)
  – Debbie Pinney
  – Angel Pizarro
  – Jonathan Schug
  – Trish Whetzel

www.mged.org www.cbil.upenn.edu

Page 31

http://www.ebi.ac.uk/SOFG