sbml (the systems biology markup language), model databases, and other resources

93
SBML (the Systems Biology Markup Language), model databases, and other resources Michael Hucka, Ph.D. Department of Computing + Mathematical Sciences California Institute of Technology Pasadena, CA, USA CCB 2012, August 2012, Cold Spring Harbor Laboratory, NY, USA Email: [email protected] Twitter: @mhucka

Upload: mike-hucka

Post on 11-May-2015

662 views

Category:

Technology


3 download

DESCRIPTION

Tutorial given at the 2012 Computational Cell Biology Summer School at Cold Spring Harbor Laboratory, New York, USA, in August, 2012.

TRANSCRIPT

Page 2: SBML (the Systems Biology Markup Language), model databases, and other resources

Outli

ne

General background and motivations

Brief summary of SBML features

A selection of resources for the SBML-oriented modeler

Annotations, connections and semantics

Current and upcoming developments in community standards

Closing

Page 3: SBML (the Systems Biology Markup Language), model databases, and other resources

Outli

ne

General background and motivations

Brief summary of SBML features

A selection of resources for the SBML-oriented modeler

Annotations, connections and semantics

Current and upcoming developments in community standards

Closing

Page 4: SBML (the Systems Biology Markup Language), model databases, and other resources

Research today: experimentation, computation, cogitation

Page 5: SBML (the Systems Biology Markup Language), model databases, and other resources

The many roles of computation in biological researchInstrument/device control, data management, data processing, database applications, statistical analysis, pattern matching, image processing, text mining, chemical structure prediction, genomic sequence analysis, proteomics, other *omics, molecular modeling, molecular dynamics, kinetic simulation, simulated evolution, phylogenetics, ... (to name only a subset)!

Focus here: modeling and simulation

Page 6: SBML (the Systems Biology Markup Language), model databases, and other resources

Usually, there are at least two scientific outcomes:

• One or more models (+ associated claims about their behaviors)

• Publication of the results (in some form)

What are the outcomes of modeling and simulation?

Models comein many forms

Page 7: SBML (the Systems Biology Markup Language), model databases, and other resources

Models are resultsModels serve as statements of our current understanding of the phenomena being studied*

• A computational model documents your theory in a concrete form

Model can—

• Reduce ambiguity in communication

• Offer a concrete framework for adding new data and theories

• Support direct evaluation of relationships between theories

Bower & Bolouri, Computational modeling of genetic and biochemical networks, MIT Press, 2001

Page 8: SBML (the Systems Biology Markup Language), model databases, and other resources

But only if the modeling results are reproducible

Page 9: SBML (the Systems Biology Markup Language), model databases, and other resources

Many models have traditionally been published this way

Problems:

• Errors in printing

• Missing information

• Dependencies onimplementation

• Outright errors

• Can be a hugeeffort to recreate

Is it enough to describe the model & equations in a paper?

Page 10: SBML (the Systems Biology Markup Language), model databases, and other resources

Is it enough to make your (software X) script available?It’s vital for good science:

• Someone with access to the same software can try to run it, understand it, verify the computational results, build on them, etc.

• Opinion: you should always do this in any case

Page 11: SBML (the Systems Biology Markup Language), model databases, and other resources

Is it enough to make your (software X) code available?It’s vital for good science—

• Someone with access to the same software can try to run it, understand it, build on it, etc.

• Opinion: you should always do this in any case

But it’s still not ideal for communication of scientific results:

• What if they don’t have access to that software?

• And anyway, how will people find the model?

• And how will people be able to relate the model to other work?

Page 12: SBML (the Systems Biology Markup Language), model databases, and other resources

Different tools ⇒ different interfaces & languages

Page 13: SBML (the Systems Biology Markup Language), model databases, and other resources

Communication is better with interoperable data formats

Page 14: SBML (the Systems Biology Markup Language), model databases, and other resources

Outli

ne

General background and motivations

Brief summary of SBML features

A selection of resources for the SBML-oriented modeler

Annotations, connections and semantics

Current and upcoming developments in community standards

Closing

Page 15: SBML (the Systems Biology Markup Language), model databases, and other resources

SBML: a lingua fra

nca

for software

Page 16: SBML (the Systems Biology Markup Language), model databases, and other resources

Format for representing computational models of biological processes

• Data structures + usage principles + serialization to XML

Neutral with respect to modeling framework

• E.g., ODE, stochastic systems, etc.

Development started in 2000, with first specification distributed in 2001

SBML = Systems Biology Markup Language

Page 17: SBML (the Systems Biology Markup Language), model databases, and other resources

The process is central

• Called a “reaction” in SBML

• Participants are pools of entities (species)

Models can further include:

• Other constants & variables

• Compartments

• Explicit math

• Discontinuous events

Basic SBML concepts are fairly simple

• Unit definitions

• Annotations

Page 18: SBML (the Systems Biology Markup Language), model databases, and other resources

Well-stirred compartments

c

n

Page 19: SBML (the Systems Biology Markup Language), model databases, and other resources

Species pools are located in compartments

c

n

protein A protein B

gene mRNAn mRNAc

Page 20: SBML (the Systems Biology Markup Language), model databases, and other resources

Reactions can involve any species anywhere

c

n

protein A protein B

gene mRNAn mRNAc

Page 21: SBML (the Systems Biology Markup Language), model databases, and other resources

Reactions can cross compartment boundaries

c

n

protein A protein B

gene mRNAn mRNAc

Page 22: SBML (the Systems Biology Markup Language), model databases, and other resources

Reaction/process rates can be (almost) arbitrary formulas

c

n

protein A protein B

gene mRNAn mRNAc

f1(x)

f2(x)

f3(x)f4(x)

f5(x)

Page 23: SBML (the Systems Biology Markup Language), model databases, and other resources

“Rules”: equations expressing relationships in addition to reaction sys.

c

n

protein A protein B

gene mRNAn mRNAc

f1(x)

f2(x)

f3(x)

g1(x)g2(x)

.

.

.

f4(x)

f5(x)

Page 24: SBML (the Systems Biology Markup Language), model databases, and other resources

“Events”: discontinuous actions triggered by system conditions

c

n

protein A protein B

gene mRNAn mRNAc

f1(x)

f2(x)

f3(x)

g1(x)g2(x)

.

.

.

Event1: when (...condition...), do (...assignments...)

Event2: when (...condition...), do (...assignments...)

...

f4(x)

f5(x)

Page 25: SBML (the Systems Biology Markup Language), model databases, and other resources

Annotations: machine-readable semantics and links to other resources

Event1: when (...condition...), do (...assignments...)

Event2: when (...condition...), do (...assignments...)

...

c

n

protein A protein B

gene mRNAn mRNAc

f1(x)

f2(x)

f3(x)

g1(x)g2(x)

.

.

.

f4(x)

f5(x)

“This event represents ...”

“This is identified by GO id # ...”

“This is an enzymatic reaction with EC # ...”

“This is a transport into the nucleus ...” “This compartment

represents the nucleus ...”

Page 26: SBML (the Systems Biology Markup Language), model databases, and other resources

Today: spatially homogeneous models

• Metabolic network models

• Signaling pathway models

• Conductance-based models

• Neural models

• Pharmacokinetic/dynamics models

• Infectious diseases

Coming: SBML Level 3 packages to support other types

• E.g.: Spatially inhomogeneous models, also qualitative/logical

Scope of SBML encompasses many types of models

Find examples inBioModels Databasehttp://biomodels.net/biomodels

Page 27: SBML (the Systems Biology Markup Language), model databases, and other resources

2342 reactions

NATURE BIOTECHNOLOGY VOLUME 26 NUMBER 10 OCTOBER 2008 1155

of their parameters. Armed with such information, it is then possible to provide a stochastic or ordinary differential equation model of the entire metabolic network of interest. An attractive feature of metabolism, for the purposes of modeling, is that, in contrast to signaling pathways, metabo-lism is subject to direct thermodynamic and (in particular) stoichiometric constraints3. Our focus here is on the first two stages of the reconstruction process, especially as it pertains to the mapping of experimental metabo-lomics data onto metabolic network reconstructions.

Besides being an industrial workhorse for a variety of biotechnological products, S. cerevisiae is a highly developed model organism for biochemi-cal, genetic, pharmacological and post-genomic studies5. It is especially attractive because of the availability of its genome sequence6, a whole series of bar-coded deletion7,8 and other9 strains, extensive experimental ’omics data10–14 and the ability to grow it for extended periods under highly con-trolled conditions15. The very active scientific community that works on S. cerevisiae has a history of collaborative research projects that have led to substantial advances in our understanding of eukaryotic biology6,8,13,16,17. Furthermore, yeast metabolic physiology has been the subject of inten-sive study and most of the components of the yeast metabolic network are relatively well characterized. Taken together, these factors make yeast metabolism an attractive topic to test a community approach to build models for systems biology.

Several groups18–21 have reconstructed the metabolic network of yeast from genomic and literature data and made the reconstructions freely available. However, due to different approaches used to create them, as well as different interpretations of the literature, the existing reconstruc-tions have many differences. Additionally, the naming of metabolites and enzymes in the existing reconstructions was, at best, inconsistent, and there were no systematic annotations of the chemical species in the form of links to external databases that store chemical compound informa-tion. This lack of model annotation complicated the use of the models for data analysis and integration. Members of the yeast systems biology community therefore recognized that a single ‘consensus’ reconstruction and annotation of the metabolic network was highly desirable as a starting point for further investigations.

A crucial factor that enabled the building of a consensus network recon-struction is the ability to describe and exchange biochemical network

Genomic data allow the large-scale manual or semi-automated assembly of metabolic network reconstructions, which provide highly curated organism-specific knowledge bases. Although several genome-scale network reconstructions describe Saccharomyces cerevisiae metabolism, they differ in scope and content, and use different terminologies to describe the same chemical entities. This makes comparisons between them difficult and underscores the desirability of a consolidated metabolic network that collects and formalizes the ‘community knowledge’ of yeast metabolism. We describe how we have produced a consensus metabolic network reconstruction for S. cerevisiae. In drafting it, we placed special emphasis on referencing molecules to persistent databases or using database-independent forms, such as SMILES or InChI strings, as this permits their chemical structure to be represented unambiguously and in a manner that permits automated reasoning. The reconstruction is readily available via a publicly accessible database and in the Systems Biology Markup Language (http://www.comp-sys-bio.org/yeastnet). It can be maintained as a resource that serves as a common denominator for studying the systems biology of yeast. Similar strategies should benefit communities studying genome-scale metabolic networks of other organisms.

Accurate representation of biochemical, metabolic and signaling net-works by mathematical models is a central goal of integrative systems biology. This undertaking can be divided into four stages1. The first is a qualitative stage in which are listed all the reactions that are known to occur in the system or organism of interest; in the modern era, and especially for metabolic networks, these reaction lists are often derived in part from genomic annotations2,3 with curation based on literature (‘bibliomic’) data4. A second stage, again qualitative, adds known effectors, whereas the third and fourth stages—essentially amounting to molecular enzymology—include the known kinetic rate equations and the values

A consensus yeast metabolic network reconstruction obtained from a community approach to systems biologyMarkus J Herrgård1,19,20, Neil Swainston2,3,20, Paul Dobson3,4, Warwick B Dunn3,4, K Yalçin Arga5, Mikko Arvas6, Nils Blüthgen3,7, Simon Borger8, Roeland Costenoble9, Matthias Heinemann9, Michael Hucka10, Nicolas Le Novère11, Peter Li2,3, Wolfram Liebermeister8, Monica L Mo1, Ana Paula Oliveira12, Dina Petranovic12,19, Stephen Pettifer2,3, Evangelos Simeonidis3,7, Kieran Smallbone3,13, Irena Spasi!2,3, Dieter Weichart3,4, Roger Brent14, David S Broomhead3,13, Hans V Westerhoff3,7,15, Betül Kırdar5, Merja Penttilä6, Edda Klipp8, Bernhard Ø Palsson1, Uwe Sauer9, Stephen G Oliver3,16, Pedro Mendes2,3,17, Jens Nielsen12,18 & Douglas B Kell*3,4

*A list of affiliations appears at the end of the paper.

Published online 9 October 2008; doi:10.1038/nbt1492

P E R S P E C T I V E

©20

08 N

atur

e Pu

blis

hing

Gro

up h

ttp://

ww

w.n

atur

e.co

m/n

atur

ebio

tech

nolo

gyHerrgård et al., Nature Biotech., 26:10, 2008

Model scale & complexity have been increasingMany significant and popular models are in SBML form

Page 28: SBML (the Systems Biology Markup Language), model databases, and other resources

SBML Level 1 SBML Level 2 SBML Level 3

predefined math functions user-defined functions user-defined functions

text-string math notation MathML subset MathML subset

reserved namespaces for annotations

no reserved namespaces for annotations

no reserved namespaces for annotations

no controlled annotation scheme

RDF-based controlled annotation scheme

RDF-based controlled annotation scheme

no discrete events discrete events discrete events

default values defined default values defined no default values

monolithic monolithic modular

Page 29: SBML (the Systems Biology Markup Language), model databases, and other resources

General background and motivations

Brief summary of SBML features

A selection of resources for the SBML-oriented modeler

Annotations, connections and semantics

Current and upcoming developments in community standards

Closing

Outli

ne

Page 30: SBML (the Systems Biology Markup Language), model databases, and other resources

You want models? We got models.

Page 31: SBML (the Systems Biology Markup Language), model databases, and other resources

Stores & serves quantitative models of biological interest

• Free, public resource

• Models must be described in peer-reviewed publication(s)

Hundreds of models are curated by hand

Imports & exports models in several formats

BioModels Database

Figure courtesy of Camille Laibe

Page 33: SBML (the Systems Biology Markup Language), model databases, and other resources

Contents of BioModels DatabaseContents today:

• 142,000+ pathway models (converted from KEGG)

• 400+ hand-curated quantitative models

• 400+ non-curated quantitative models

9%3%

3%5%

6%

8%

9%

9%

23%

25%

signal transductionmetabolic processmulticelullar organismal processrhythmic processcell cyclehomeostatic processresponse to stimuluscell deathlocalizationothers (e.g., developmental process)

Database data from 2012-08-10

Page 34: SBML (the Systems Biology Markup Language), model databases, and other resources

How can you check that a given SBML file is valid?

Page 35: SBML (the Systems Biology Markup Language), model databases, and other resources

The Online SBML Validator

Page 37: SBML (the Systems Biology Markup Language), model databases, and other resources

Where can you find more software?

Page 38: SBML (the Systems Biology Markup Language), model databases, and other resources

Find software in the SBML Software Guide

Page 39: SBML (the Systems Biology Markup Language), model databases, and other resources

Find SBML software

Find software in the SBML Software Guide

Page 41: SBML (the Systems Biology Markup Language), model databases, and other resources

Question: Which of the following categories best describe your software? (Check all that apply.)

Results of 2011 survey of SBML-compatible software

Out of 81 responses

Simulation software

Analysis s/w (in addition, or instead of, simulation)

Creation/model development software

Visualization/display/formatting software

Utility software (e.g., format conversion)

Data integration and management software

Repository or database

Framework or library (for use in developing s/w)

S/w for interactive env. (e.g., MATLAB, R, ...)

Annotation software0 20 40 60 80

11

13

13

14

16

23

31

31

40

42

Page 42: SBML (the Systems Biology Markup Language), model databases, and other resources

What about libraries for writing SBML-compatible software?

Page 43: SBML (the Systems Biology Markup Language), model databases, and other resources

libSBMLReads, writes, validates SBML

Can check & convert units

Written in portable C++

Runs on Linux, Mac, Windows

APIs for C, C++, C#, Java, Octave, Perl, Python, R, Ruby, MATLAB

Well documented API

Open-source (LGPL)

http://sbml.org/Software/libSBML

Page 44: SBML (the Systems Biology Markup Language), model databases, and other resources

JSBMLPure Java implementation

API is compatible with libSBML but more Java-like

Functionality is subset of libSBML

Open source (LGPL)

http://sbml.org/Software/JSBML

Page 45: SBML (the Systems Biology Markup Language), model databases, and other resources

How can you stay informed of new developments?

Page 46: SBML (the Systems Biology Markup Language), model databases, and other resources

Resources for news, questions and discussions

Page 47: SBML (the Systems Biology Markup Language), model databases, and other resources

Front-page news

Resources for news, questions and discussions

Page 48: SBML (the Systems Biology Markup Language), model databases, and other resources

Twitter & RSS feeds

Resources for news, questions and discussions

Page 49: SBML (the Systems Biology Markup Language), model databases, and other resources

Mailing lists/forums

Resources for news, questions and discussions

Page 50: SBML (the Systems Biology Markup Language), model databases, and other resources

General background and motivations

Brief summary of SBML features

A selection of resources for the SBML-oriented modeler

Annotations, connections and semantics

Current and upcoming developments in community standards

Closing

Outli

ne

Page 51: SBML (the Systems Biology Markup Language), model databases, and other resources

SBML itself provides syntax and only limited semantics

Page 52: SBML (the Systems Biology Markup Language), model databases, and other resources

SBML itself provides syntax and only limited semantics

No standard identifiers

Page 53: SBML (the Systems Biology Markup Language), model databases, and other resources

SBML itself provides syntax and only limited semantics

Low info content

No standard identifiers

Page 54: SBML (the Systems Biology Markup Language), model databases, and other resources

Raw models alone are insufficient

Need standard schemes for machine-readable annotations

• Identify entities

• Mathematical semantics

• Links to other data resources

• Authorship & pub. info

SBML itself provides syntax and only limited semantics

Low info content

No standard identifiers

Page 55: SBML (the Systems Biology Markup Language), model databases, and other resources

Element in the model

Entity elsewhere (e.g., in a database)

relationship qualifier(optional)

Annotations at their simplest

Page 56: SBML (the Systems Biology Markup Language), model databases, and other resources

Annotations can answer questions:

• “What exactly is the process represented by equation ‘r17’?”

• “What other identities (synonyms) does this entity have?”

• “What role does constant ‘k3’ play in equation ‘r17’?”

• “What organism are we talking about?”

• ... etc. ...

Multiple annotations on same entity are common

Annotations add meaning and connections

Page 57: SBML (the Systems Biology Markup Language), model databases, and other resources

SBML supports two annotation schemesSBO (Systems Biology Ontology)

• For mathematical semantics

• One SBML object ← one SBO term

• Short, compact, tightly coupled but limited scope

MIRIAM (Minimum Information Requested In the Annotation of Models)

• For any kind of annotation

• One SBML object ← multiple MIRIAM annotations

• Larger, more free-form, wider scope

Both are externalized and independent of SBML

Page 58: SBML (the Systems Biology Markup Language), model databases, and other resources

Systems Biology Ontology (SBO)

http://biomodels.net/sbo

Page 59: SBML (the Systems Biology Markup Language), model databases, and other resources

<sbml ...> ... <listOfCompartments> <compartment id="cell" size="1e-15" /> </listOfCompartments> <listOfSpecies> <species compartment="cell" id="S1" initialAmount="1000" /> <species compartment="cell" id="S2" initialAmount="0" /> <listOfSpecies> <listOfParameters> <parameter id="k" value="0.005" sboTerm="SBO:0000339" /> <listOfParameters> <listOfReactions> <reaction id="r1" reversible="false"> <listOfReactants> <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010" /> </listOfReactants> <listOfProducts> <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000011" /> </listOfProducts> <kineticLaw sboTerm="SBO:0000052"> <math> ... <math> ...</sbml>

Page 60: SBML (the Systems Biology Markup Language), model databases, and other resources

<sbml ...> ... <listOfCompartments> <compartment id="cell" size="1e-15" /> </listOfCompartments> <listOfSpecies> <species compartment="cell" id="S1" initialAmount="1000" /> <species compartment="cell" id="S2" initialAmount="0" /> <listOfSpecies> <listOfParameters> <parameter id="k" value="0.005" sboTerm="SBO:0000339" /> <listOfParameters> <listOfReactions> <reaction id="r1" reversible="false"> <listOfReactants> <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010" /> </listOfReactants> <listOfProducts> <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000011" /> </listOfProducts> <kineticLaw sboTerm="SBO:0000052"> <math> ... <math> ...</sbml>

SBO:0000339

Page 61: SBML (the Systems Biology Markup Language), model databases, and other resources

<sbml ...> ... <listOfCompartments> <compartment id="cell" size="1e-15" /> </listOfCompartments> <listOfSpecies> <species compartment="cell" id="S1" initialAmount="1000" /> <species compartment="cell" id="S2" initialAmount="0" /> <listOfSpecies> <listOfParameters> <parameter id="k" value="0.005" sboTerm="SBO:0000339" /> <listOfParameters> <listOfReactions> <reaction id="r1" reversible="false"> <listOfReactants> <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010" /> </listOfReactants> <listOfProducts> <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000011" /> </listOfProducts> <kineticLaw sboTerm="SBO:0000052"> <math> ... <math> ...</sbml>

SBO:0000339

“forward bimolecular rate constant, continuous case”

Page 62: SBML (the Systems Biology Markup Language), model databases, and other resources

semanticSBML

SBMLsqueezer

Software can use SBO terms to help you work with models

Page 63: SBML (the Systems Biology Markup Language), model databases, and other resources

Addresses 2 general areas of annotation needs:

MIRIAM is not specific to SBML

MIRIAM (Minimum Information Requested In the Annotation of Models)

Requirements for reference correspondence

Scheme for encoding annotations

Annotations for attributing model creators & sources

Annotations for referring to external

data resources

Page 64: SBML (the Systems Biology Markup Language), model databases, and other resources

Addresses 2 general areas of annotation needs:

MIRIAM is not specific to SBML

MIRIAM (Minimum Information Requested In the Annotation of Models)

Requirements for reference correspondence

Scheme for encoding annotations

Annotations for attributing model creators & sources

Annotations for referring to external

data resources

Requirements for reference correspondence

Page 65: SBML (the Systems Biology Markup Language), model databases, and other resources

Annotations for attributing model creators and sources

Goal: permit tracing model’s origins & people involved in its creation

Minimal info required:

• Name for the model

• Citation for a description of what is being modeled & its author

• Contact info for the model creator(s)

• Creation date & time

• Last modification date & time

• Statement of the model’s terms of distribution

- Specific terms not mandated, just a statement of the terms

Page 66: SBML (the Systems Biology Markup Language), model databases, and other resources

Addresses 2 general areas of annotation needs:

MIRIAM is not specific to SBML

MIRIAM (Minimum Information Requested In the Annotation of Models)

Requirements for reference correspondence

Scheme for encoding annotations

Annotations for attributing model creators & sources

Annotations for referring to external

data resources

Page 67: SBML (the Systems Biology Markup Language), model databases, and other resources

Addresses 2 general areas of annotation needs:

MIRIAM is not specific to SBML

MIRIAM (Minimum Information Requested In the Annotation of Models)

Requirements for reference correspondence

Scheme for encoding annotations

Annotations for attributing model creators & sources

Annotations for referring to external

data resources

Annotations for referring to external

data resources

Page 68: SBML (the Systems Biology Markup Language), model databases, and other resources

Annotations for external referencesGoal: link model constituents to corresponding entities in bioinformatics resources (e.g., databases, controlled vocabularies)

• Supports:

- Precise identification of model constituents

- Discovery of models that concern the same thing

- Comparison of model constituents between different models

MIRIAM approach avoids putting data content directly in the model; instead, it points at external resources that contain the knowledge.

Page 69: SBML (the Systems Biology Markup Language), model databases, and other resources

Why might you care?

http://www.ebi.ac.uk/chebi

Low info content

Page 70: SBML (the Systems Biology Markup Language), model databases, and other resources

Why might you care?

http://www.ebi.ac.uk/chebi

Low info content

Known by different names – do you want to write all of

them into your model?

salicylic acid

Page 71: SBML (the Systems Biology Markup Language), model databases, and other resources

Identifying resources has its own challengesFor linking to data, need:

• Globally unique, unambiguous identifiers

• ... that are persistent despite resource changes (e.g., changed URLs)

• ... that are maintained by the community

Problem: different resources have different identification schemes

• E.g.: entity “16480”

- In ChEBI: entry 16480 is nitrous oxide

- In PubMed: entry 16480 is the 1977 paper “Effect of gallstone-dissolution therapy on human liver structure”

- In PubChem: entry 16480 is 1-chloro-4-isothiocyanatobenzene

Page 72: SBML (the Systems Biology Markup Language), model databases, and other resources

How do we create globally unique identifiers consistently?Long story short:

• Create unique resource identifiers (URIs) by combining 2 parts:

• Create registry for namespaces

- Allows people & software to use same namespace identifiers

• Create service for URI resolution

- Allows people & software to take a given resource identifier and figure out what it points to

namespace entity identifier{ {

Identifies a dataset Identifies a datumwithin the dataset

Page 73: SBML (the Systems Biology Markup Language), model databases, and other resources

Resolving resource identifiersMIRIAM Registry supports the creation of globally unique identifiers

• Example MIRIAM identifier:urn:miriam:ec-code:1.1.1.1

• Provides various data about theresource, including alternate servers

• Provides web services

identifiers.org is layered on top of that and provides resolvable URIs

• Can type it in a web browser!

• Example identifiers.org URI:http://identifiers.org/ec-code/1.1.1.1

Page 74: SBML (the Systems Biology Markup Language), model databases, and other resources

BioModels Database: example of using the annotations

Page 75: SBML (the Systems Biology Markup Language), model databases, and other resources

Annotations enable many interesting possibilitiesAnnotations enable many interesting possibilities

Figure courtesy of Wolfram Leibermeister

semanticSBML

Page 76: SBML (the Systems Biology Markup Language), model databases, and other resources

Summary: why care about standard ways of writing annotations?

Structured, machine-readable annotations increase your model’s utility

• Allow more precise identification of model components

- Understand model structure

- Search/discover models

- Compare models

• Adds a semantic layer—integrates knowledge into the model

- Helps recipients understand the underlying biology

- Allows for better reuse of models

- Supports conversion of models from one form to another

Page 77: SBML (the Systems Biology Markup Language), model databases, and other resources

General background and motivations

Brief summary of SBML features

A selection of resources for the SBML-oriented modeler

Annotations, connections and semantics

Current and upcoming developments in community standards

Closing

Outli

ne

Page 78: SBML (the Systems Biology Markup Language), model databases, and other resources

Mathematical semantics

Biological semantics

Visual interpretation

Discrete stochastic entities

Continuous lumped parameter

State transition

Mean field approximation

Model type

Model creation

Model annotation

Model analysis

Numerical results

Model life-cycle

Model representation level

Conc

ept d

ue to

Nic

olas

Le N

ovèr

e

Major dimensions of a computational model

Page 79: SBML (the Systems Biology Markup Language), model databases, and other resources

What about other kinds of models?

Page 80: SBML (the Systems Biology Markup Language), model databases, and other resources

SBML Level 3: Supporting more categories of models

An SBML Level 3 package adds constructs & capabilities

Models declare which packages they use

• Applications tell users which packages they support

Package development can be decoupled

SBML Level 3 Core

Package X Package Y Package Z

Package W

(dependencies)

Page 81: SBML (the Systems Biology Markup Language), model databases, and other resources

Level 3 package What it enablesHierarchical composition Models containing submodels

Flux balance constraints Flux balance analysis models

Qualitative models Petri net models, Boolean models

Spatial Nonhomogeneous spatial models

Multicomponent species Entities with structure & state; rule-based models

Graph layout Diagrams of models

Graph rendering Diagrams of models

Distribution & ranges Nonscalar values

Annotations Richer annotation syntax

Groups Arbitrary grouping of model components

Dynamic structures Creation & destruction of model components

Arrays & sets Arrays or sets of entities

Page 82: SBML (the Systems Biology Markup Language), model databases, and other resources

How can we capture the simulation/analysis procedures?

Page 83: SBML (the Systems Biology Markup Language), model databases, and other resources

Software can’t read figure legends

?

BIOMD0000000319 in BioModels Database

Decroly & Goldbeter, PNAS, 1982

Page 84: SBML (the Systems Biology Markup Language), model databases, and other resources

Application-independent format to capture procedures, algorithms, parameter values

• Neutral format for encoding the steps to go from model to output

Can be used for

• Simulation experiments encoding parametrizations & perturbations

• Simulations using more than one model

• Simulations using more than one method

• Data manipulations to produce plot(s)

libSedML project developing API library

SED-ML = Simulation Experiment Description ML

http://www.biomodels.net/sed­ml

Page 85: SBML (the Systems Biology Markup Language), model databases, and other resources

What about visual diagrams?

Page 86: SBML (the Systems Biology Markup Language), model databases, and other resources

Graphical representation of modelsToday: broad variation in graphical notation used in biological diagrams

• Between authors, between journals, even people in same group

However, standard notations would offer benefits:

• Consistency = easier to read diagrams with less ambiguity

• Software support: verification of correctness, translation to math

Page 87: SBML (the Systems Biology Markup Language), model databases, and other resources

SBGN = Systems Biology Graphical NotationGoal: standardize the graphical notation in diagrams of biological processes

• Community-based development, à la SBML

Many groups participating

3 sublanguages to describe different facets of a model

http://sbgn.org

Page 88: SBML (the Systems Biology Markup Language), model databases, and other resources

Outli

ne

General background and motivations

Brief summary of SBML features

A selection of resources for the SBML-oriented modeler

Annotations, connections and semantics

Current and upcoming developments in community standards

Closing

Page 89: SBML (the Systems Biology Markup Language), model databases, and other resources

Such standards are the work of a great communityAttendees at SBML 10th Anniversary Symposium, Edinburgh, 2010

Page 90: SBML (the Systems Biology Markup Language), model databases, and other resources

COMBINE (Computational Modeling in Biology Network)

• SBML, SBGN, BioPAX, SED-ML, CellML, NeuroML

Upcoming meeting: August 15–19 in Toronto, Canada

• Right before ICSB (International Conference on Systems Biology)

Get involved and make things better!

http://co.mbine.org

Page 91: SBML (the Systems Biology Markup Language), model databases, and other resources

SBML http://sbml.org

BioModels Database http://biomodels.net/biomodels

COMBINE http://co.mbine.org

identifiers.org http://identifiers.org

MIRIAM http://biomodels.net/miriam

SED-ML http://biomodels.net/sed-ml

SBO http://biomodels.net/sbo

SBGN http://sbgn.org

URLs

Page 92: SBML (the Systems Biology Markup Language), model databases, and other resources

I’d like your feedback!You can use this anonymous form:

http://tinyurl.com/mhuckafeedback

Page 93: SBML (the Systems Biology Markup Language), model databases, and other resources

National Institute of General Medical Sciences (USA) European Molecular Biology Laboratory (EMBL)JST ERATO Kitano Symbiotic Systems Project (Japan) (to 2003)JST ERATO-SORST Program (Japan)ELIXIR (UK)Beckman Institute, Caltech (USA)Keio University (Japan)International Joint Research Program of NEDO (Japan)Japanese Ministry of AgricultureJapanese Ministry of Educ., Culture, Sports, Science and Tech.BBSRC (UK)National Science Foundation (USA)DARPA IPTO Bio-SPICE Bio-Computation Program (USA)Air Force Office of Scientific Research (USA)STRI, University of Hertfordshire (UK)Molecular Sciences Institute (USA)

SBML was made possible thanks to funding from: