10-Jun-15 Anatoly Sorokin: Existing Standards in Systems Biology


Apr 18, 2023 Anatoly Sorokin

Existing Standards in Systems Biology

Anatoly Sorokin

Computational Systems Biology Group

University of Edinburgh

Standard

• 2000-2010 is the decade of standards in biology
– 31 MIBBI standards
– 56 OBO ontologies
– About 80 exchange formats

• Scope of interest

• Language

• Controlled vocabulary

Standards and Languages

• CML – description of chemical structure

• MathML – representation of mathematical formulas

• PSI – standard description of protein interaction data

• AnatML – language to describe interaction at organ level

• GeneOntology – standard and ontology to describe gene function and regulation

Standards for Computational Systems Biology

• BioPAX – language for exchanging biological network data between databases

• SBML – language for exchanging biochemical models

• CellML – language to describe mathematical models

• SBGN – visual language for biological model description

MI standards

• Reporting guidelines specify the minimum amount of metadata (information) and data required to meet a specific aim

• The aim is to provide enough metadata and data to enable the unambiguous reproduction and interpretation of an experiment.

• Normally informal, human-readable specifications that inform the development of formal data models (e.g. XML or UML) and data exchange formats


Exchange format

• Strict structure for exchanging data or models

• Mainly XML

• Well defined meta-model, often supported by software API


Ontologies

• “ontology deals with questions concerning what entities exist or can be said to exist, and how such entities can be grouped, related within a hierarchy, and subdivided according to similarities and differences” Wikipedia

• Often used as controlled vocabulary and description support framework

• GeneOntology

BioPAX

• “Biological PAthway eXchange - A data exchange ontology and format for biological pathway integration, aggregation and inference”

BioPAX Goals

• BioPAX = Biological PAthway eXchange

• Data exchange format for pathway data

• Include support for these pathway types:
– Metabolic pathways
– Signaling pathways
– Protein-protein, molecular interactions
– Gene regulatory pathways
– Genetic interactions

• Accommodate representations used in existing databases such as BioCyc, BIND, WIT, aMAZE, KEGG, Reactome, etc.

• PathwayCommons – collection of pathways in BioPAX
– http://www.pathwaycommons.org

BioPAX

• BioPAX ontology and format in OWL (XML)

• Ontology built using GKB Editor and Protégé

• Semantic mapping still an issue

• Level 1 represents metabolic pathway data

• Level 2 adds support for molecular interactions, post-translational modifications, and experimental description from the PSI-MI model (backwards compatible)

• Level 3 adds support for generics and protein states, and rearranges the reaction representation

BioPAX Ontology: Top Level

• Pathway
– A set of interactions
– E.g. Glycolysis, MAPK, Apoptosis

• Interaction
– A set of entities and some relationship between them
– E.g. Reaction, Molecular Association, Catalysis

• Physical Entity
– A building block of simple interactions
– E.g. Small molecule, Protein, DNA, RNA

[Figure: top-level class diagram with Entity, Pathway, Interaction and Physical Entity. Legend: Subclass (is a), Contains (has a).]
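The top-level classes described above can be sketched as plain Python classes. This is only an illustration of the "is a" and "has a" relationships; the class layout and field names are assumptions made for the example and are not the BioPAX OWL definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entity:
    """Root of the sketch hierarchy; every object here is an Entity."""
    name: str


@dataclass
class PhysicalEntity(Entity):
    """Building block of interactions, e.g. a protein or a small molecule."""


@dataclass
class Interaction(Entity):
    """A set of entities and some relationship between them."""
    participants: List[Entity] = field(default_factory=list)


@dataclass
class Pathway(Entity):
    """A set of interactions."""
    interactions: List[Interaction] = field(default_factory=list)


# Hypothetical instance data: the first step of glycolysis.
glucose = PhysicalEntity("glucose")
hexokinase = PhysicalEntity("hexokinase")
g6p = PhysicalEntity("glucose 6-phosphate")
step1 = Interaction("glucose phosphorylation", [glucose, hexokinase, g6p])
glycolysis = Pathway("Glycolysis", [step1])

# "is a": a Pathway is an Entity; "has a": a Pathway contains Interactions.
print(isinstance(glycolysis, Entity), len(glycolysis.interactions))
```

Subclassing models the "is a" arrows and list-valued fields model the "has a" arrows of the diagram.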

BioPAX Ontology: Interactions

[Figure: class hierarchy rooted at Interaction, showing Control, Conversion, Physical Interaction, Catalysis, Modulation, BiochemicalReaction, ComplexAssembly, Transport and TransportWithBiochemicalReaction.]

BioPAX Ontology: Physical Entities

[Figure: PhysicalEntity and its subclasses Complex, RNA, Protein, Small Molecule and DNA.]

BioPAX and other standards

[Figure: scope of BioPAX, PSI-MI 2 and SBML/CellML. Areas shown: Genetic Interactions; Molecular Interactions (Pro:Pro to All:All); Interaction Networks (molecular and non-molecular; Pro:Pro, TF:Gene, Genetic); Regulatory Pathways (low to high detail); Metabolic Pathways (low to high detail); Biochemical Reactions; Small Molecules (low to high detail); Rate Formulas. The formats are grouped into Database Exchange Formats and Simulation Model Exchange Formats.]

Simulation-related standards


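[Figure: overview grid of simulation-related standards. Columns: Model, Simulation, Result. Rows: Minimal Requirements, Exchange format, Ontology. Visible labels: "?", SED-ML, SBRML. Arrows: "implements" (twice) and "Makes sense of" (twice).]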

SBML

• “The Systems Biology Markup Language (SBML) is a computer-readable format for representing models of biochemical reaction networks. SBML is applicable to metabolic networks, cell-signaling pathways, regulatory networks, and many others. ”

SBML

• Reaction
– container for rate law

• Species
– reactants, products, or modifiers of reaction

• Compartment
– container for species

• Parameter, Rule, Event
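How these components fit together can be seen in a small hand-written model. The sketch below reads a hypothetical one-reaction SBML Level 2 Version 4 document with Python's standard XML parser; the model content (ids, values) is invented for illustration and has not been validated against the SBML schema. Real tools would normally use a dedicated SBML library instead.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2/version4"

# Hypothetical toy model: A -> B inside a single compartment.
SBML_TEXT = """
<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cell" initialConcentration="1"/>
      <species id="B" compartment="cell" initialConcentration="0"/>
    </listOfSpecies>
    <listOfParameters>
      <parameter id="k1" value="0.5"/>
    </listOfParameters>
    <listOfReactions>
      <reaction id="conversion" reversible="false">
        <listOfReactants><speciesReference species="A"/></listOfReactants>
        <listOfProducts><speciesReference species="B"/></listOfProducts>
        <kineticLaw>
          <math xmlns="http://www.w3.org/1998/Math/MathML">
            <apply><times/><ci>cell</ci><ci>k1</ci><ci>A</ci></apply>
          </math>
        </kineticLaw>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
""".strip()


def q(tag: str) -> str:
    """Qualify a tag name with the SBML namespace."""
    return f"{{{SBML_NS}}}{tag}"


model = ET.fromstring(SBML_TEXT).find(q("model"))

# Compartments contain species; each species names its compartment.
compartments = [c.get("id") for c in model.iter(q("compartment"))]
species = {s.get("id"): s.get("compartment") for s in model.iter(q("species"))}

# Reactions list their reactants and products and carry the rate law.
reactions = {}
for r in model.iter(q("reaction")):
    reactants = [ref.get("species") for ref in r.find(q("listOfReactants"))]
    products = [ref.get("species") for ref in r.find(q("listOfProducts"))]
    has_rate_law = r.find(q("kineticLaw")) is not None
    reactions[r.get("id")] = (reactants, products, has_rate_law)

print(compartments, species, reactions)
```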

Characteristics of SBML

• Many top-level types, little nesting
– Units, Compartment, Species, Parameter, Reaction, Rule, Function, Event

• Non-modular structure
– Next SBML ‘Level’ (3) will introduce modularity

• Emphasis on reactions

• Some math implicit
– Explicit rate equations; implicit integration
– Implicit concentration conversion between compartments

• Compartments are physical containers for species
– Spatial dimensions (volume, surface)
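"Explicit rate equations; implicit integration" means the model file states only how fast each reaction runs, and the simulator is expected to turn those rates into trajectories. A minimal sketch of that division of labour for a reaction A -> B follows; the rate constant, step size and the choice of forward Euler are assumptions for illustration (real simulators typically use more accurate adaptive solvers).

```python
def rate(k: float, a: float) -> float:
    """Explicit rate law, the part a model would store: v = k * [A]."""
    return k * a


def simulate(a0: float, b0: float, k: float, dt: float, steps: int):
    """Integration that the model leaves implicit (forward Euler here)."""
    a, b = a0, b0
    trajectory = [(0.0, a, b)]
    for i in range(1, steps + 1):
        v = rate(k, a)
        a -= v * dt  # A is consumed by the reaction
        b += v * dt  # B is produced at the same rate
        trajectory.append((i * dt, a, b))
    return trajectory


traj = simulate(a0=1.0, b0=0.0, k=0.5, dt=0.01, steps=1000)
t_end, a_end, b_end = traj[-1]
print(t_end, a_end, b_end)
```

Because both species live in one compartment of unit volume here, no concentration conversion between compartments is needed; with several compartments of different sizes the simulator would also have to handle that conversion.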

Structure of SBML

• The notes field of SBase is intended to store information for humans to read

• The annotation field of SBase provides a container for software-generated annotations that are not intended to be seen by humans

• The id field is required for most structures and is used to identify a component within the model definition.

• The name field is optional and provides a human-readable label for the component.

MIRIAM

• Model description requires extra information
– Biological: description of elements of model
– Mathematical: definition of math concepts
– Referential: author name, paper reference etc.

• http://www.ebi.ac.uk/compneur-srv/miriam/

Reference correspondence

• The model must be encoded in a public, standardized, machine-readable format (SBML, CellML, GENESIS ...)

• The model must comply with the standard in which it is encoded!

• The model must be clearly related to a single reference description. If a model is composed from different parts, there should still be a description of the derived/combined model.

• The encoded model structure must reflect the biological processes listed in the reference description.

• The model must be instantiated in a simulation: All quantitative attributes have to be defined, including initial conditions.

• When instantiated, the model must be able to reproduce all results given in the reference description within an epsilon (differences in algorithms, rounding errors)
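The last rule, reproducing the reference results "within an epsilon", can be checked numerically. The following is a minimal sketch; the function name, the tolerance values and the sample numbers are assumptions, since MIRIAM itself does not prescribe a particular epsilon.

```python
import math
from typing import Sequence


def reproduces(reference: Sequence[float],
               simulated: Sequence[float],
               rel_tol: float = 1e-3,
               abs_tol: float = 1e-9) -> bool:
    """Return True if every simulated value matches its reference value
    within the given relative or absolute tolerance."""
    if len(reference) != len(simulated):
        return False
    return all(
        math.isclose(r, s, rel_tol=rel_tol, abs_tol=abs_tol)
        for r, s in zip(reference, simulated)
    )


# Hypothetical values read from a paper versus values from a new simulation.
published = [1.0, 0.5, 0.25]
recomputed = [1.0004, 0.5001, 0.2501]
print(reproduces(published, recomputed))
```

The tolerance should be chosen to absorb solver and rounding differences while still flagging a genuinely different result.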


Attribution annotation

• The model has to be named.

• A citation of the reference description must be attached (complete citation, unique identifier, unambiguous URL). The citation should make it possible to identify the authors of the model.

• The name and contact details of the model creators must be attached.

• The date and time of creation and last modification should be specified. A history is useful but not required.

• The model should be linked to a precise statement about the terms of distribution. MIRIAM does not require “freedom of use” or “no cost”.


External resource annotation

• The annotation must make it possible to relate a piece of knowledge to a model constituent unambiguously.

• The referenced information should be described using a triplet {data-type, identifier, qualifier}
– The data-type should be written as a Uniform Resource Identifier (URI)
– The identifier is analysed within the framework of the data-type.
– Data-type and identifier can be combined in a single URI, e.g. http://www.myResource.org/#myIdentifier or urn:lsid:myResource.org:myIdentifier
– Qualifiers (optional) should refine the link between the model constituent and the piece of knowledge: “has a”, “is version of”, “is homolog to” etc.
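The triplet can be illustrated by splitting the two combined URI forms shown above back into data-type and identifier. This is a sketch only: the `Annotation` type and `parse_resource` function are hypothetical names, and the splitting rules (last ":" for URNs, "#" for URLs) are assumptions that fit these two examples rather than a general resolver.

```python
from typing import NamedTuple, Optional


class Annotation(NamedTuple):
    data_type: str                    # URI naming the resource
    identifier: str                   # interpreted within that data-type
    qualifier: Optional[str] = None   # optional, e.g. "is version of"


def parse_resource(uri: str, qualifier: Optional[str] = None) -> Annotation:
    """Split a combined URI into its data-type and identifier parts."""
    if uri.startswith("urn:"):
        data_type, _, identifier = uri.rpartition(":")
    elif "#" in uri:
        data_type, _, identifier = uri.partition("#")
    else:
        raise ValueError(f"cannot split {uri!r} into data-type and identifier")
    if not data_type or not identifier:
        raise ValueError(f"incomplete annotation URI: {uri!r}")
    return Annotation(data_type, identifier, qualifier)


print(parse_resource("http://www.myResource.org/#myIdentifier", "is version of"))
print(parse_resource("urn:lsid:myResource.org:myIdentifier"))
```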

SBO

• Part of OBO Foundry

• Assigns meanings to mathematical elements of SBML

• Allows automatic validation of semantic consistency of math part of model

• http://www.ebi.ac.uk/sbo


SBO

• Types and roles of reaction participants, including terms like “substrate”, “catalyst” etc., but also “macromolecule”, or “channel”.

• Parameters used in quantitative models. This vocabulary includes terms like “Michaelis constant”, “forward unimolecular rate constant” etc. A term may contain a precise mathematical expression stored as a MathML lambda function. The variables refer to other parameters.

• Mathematical expressions. Examples of terms are “mass action kinetics”, “Henri-Michaelis-Menten equation” etc. A term may contain a precise mathematical expression stored as a MathML lambda function. The variables refer to the other vocabularies.

• Modelling framework, specifying how to interpret the rate law. E.g. “continuous modelling”, “discrete modelling” etc.

• Event type, such as “catalysis” or “addition of a chemical group”.
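The idea of a term that carries its own mathematical expression as a lambda function can be mimicked in Python. The sketch below pairs two of the term names mentioned above with functions; the particular formulas (the common Vmax form of the Henri-Michaelis-Menten equation, and a product of concentrations for mass action), the argument names and the `TERMS` table are assumptions for illustration, not the definitions stored in SBO.

```python
import inspect
from typing import Callable, Dict, List


def mass_action(k: float, *concentrations: float) -> float:
    """Rate constant multiplied by the concentration of every reactant."""
    v = k
    for c in concentrations:
        v *= c
    return v


def henri_michaelis_menten(vmax: float, km: float, s: float) -> float:
    """v = Vmax * S / (Km + S), with Km playing the 'Michaelis constant' role."""
    return vmax * s / (km + s)


# Hypothetical lookup table from a term name to its expression.
TERMS: Dict[str, Callable[..., float]] = {
    "mass action kinetics": mass_action,
    "Henri-Michaelis-Menten equation": henri_michaelis_menten,
}


def variables(term: str) -> List[str]:
    """Names of the variables bound by the expression attached to a term."""
    return list(inspect.signature(TERMS[term]).parameters)


print(variables("Henri-Michaelis-Menten equation"))
print(TERMS["Henri-Michaelis-Menten equation"](vmax=2.0, km=0.5, s=0.5))
```

Knowing which variables an expression expects is what lets a tool check that a rate law tagged with a given term is supplied with parameters of the matching roles.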

MIASE

• Minimum Information About a Simulation Experiment
– What base model to use & which modifications to apply
– What simulation task to run on those models (algorithms, see KiSAO; simulation parameters)
– How to post-process the numerical results and to present them

• http://www.ebi.ac.uk/compneur-srv/miase/

• A subset of MIASE could be encoded in SED-ML
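The three MIASE items above (models and modifications, simulation tasks, post-processing and presentation) can be pictured as a small data structure with a completeness check. All class names, field names and example values here are hypothetical; this is not the SED-ML schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ModelSpec:
    source: str                                    # where the base model lives
    changes: Dict[str, float] = field(default_factory=dict)  # modifications


@dataclass
class Task:
    model: str                                     # key into Experiment.models
    algorithm: str                                 # ideally a KiSAO term
    parameters: Dict[str, float] = field(default_factory=dict)


@dataclass
class Output:
    task: str                                      # key into Experiment.tasks
    post_processing: str
    presentation: str


@dataclass
class Experiment:
    models: Dict[str, ModelSpec]
    tasks: Dict[str, Task]
    outputs: List[Output]

    def missing_items(self) -> List[str]:
        """List what is absent or dangling in the experiment description."""
        problems = []
        if not self.models:
            problems.append("no model specified")
        if not self.tasks:
            problems.append("no simulation task specified")
        if not self.outputs:
            problems.append("no output specified")
        for task_id, task in self.tasks.items():
            if task.model not in self.models:
                problems.append(f"task {task_id!r} uses unknown model {task.model!r}")
        for out in self.outputs:
            if out.task not in self.tasks:
                problems.append(f"output refers to unknown task {out.task!r}")
        return problems


experiment = Experiment(
    models={"m1": ModelSpec("toy_model.xml", changes={"k1": 0.2})},
    tasks={"t1": Task("m1", "deterministic time course",
                      {"end_time": 100.0, "steps": 1000})},
    outputs=[Output("t1", "normalise to initial value", "plot of A against time")],
)
print(experiment.missing_items())
```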

Description of models

Simulations

Simulation task

Data generation

Production of results

KiSAO

• Kinetic Simulation Algorithm Ontology
– Classification of simulation algorithms & methods
– Definition, literature references
– Relations between different simulation algorithms & methods

• http://www.ebi.ac.uk/compneur-srv/kisao/index.html

http://bioportal.bioontology.org/visualize/40844

SBRML

• Systems Biology Results Markup Language

• A new markup language for specifying the results from operations on SBML models

• http://www.comp-sys-bio.org/tiki-index.php?page=SBRML

Dimension example

TEDDY

• The TErminology for the Description of DYnamics (TEDDY) project aims to provide an ontology for dynamical behaviours, observable dynamical phenomena, and control elements of bio-models and biological systems in Systems Biology and Synthetic Biology.

• http://www.ebi.ac.uk/compneur-srv/teddy/


TEDDY top-level structure

• Temporal Behaviour (concrete behaviours of a model, more or less the same as trajectories):
– Oscillation, Steady State, Fixed Point, Cycle, ...

• Behaviour Characteristic (properties to characterise concrete behaviours):
– Period, Amplitude, ...

• Behaviour Diversification (system properties describing the ability of systems to exhibit different behaviours):
– Bifurcation, Bi-Stability

• Functional Motif (structural features of a system necessary for a specific function):
– Negative Feedback, FFL (feed-forward loop), ...
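To make the distinction between a temporal behaviour and its characteristics concrete, the sketch below takes a sampled oscillation (the behaviour) and estimates its period and amplitude (the characteristics). The signal, the sampling step and the estimation method (upward crossings of the mean, half the peak-to-peak range) are assumptions chosen for illustration; TEDDY only names these concepts and does not prescribe how to compute them.

```python
import math
from typing import Sequence


def amplitude(values: Sequence[float]) -> float:
    """Half the peak-to-peak range of the trajectory."""
    return (max(values) - min(values)) / 2.0


def period(times: Sequence[float], values: Sequence[float]) -> float:
    """Mean spacing between upward crossings of the trajectory's mean."""
    mean = sum(values) / len(values)
    crossings = []
    for i in range(1, len(values)):
        lo, hi = values[i - 1] - mean, values[i] - mean
        if lo < 0.0 <= hi:
            # Linear interpolation of the crossing time between two samples.
            frac = -lo / (hi - lo)
            crossings.append(times[i - 1] + frac * (times[i] - times[i - 1]))
    if len(crossings) < 2:
        raise ValueError("fewer than two upward crossings: no period found")
    return (crossings[-1] - crossings[0]) / (len(crossings) - 1)


# Hypothetical trajectory: offset sine wave with period 2 and amplitude 3.
times = [i * 0.01 for i in range(1000)]
values = [1.0 + 3.0 * math.sin(2.0 * math.pi * t / 2.0) for t in times]

print(period(times, values), amplitude(values))
```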

Questions
