Finding common ground between modelers and simulation software in systems biology
Slides from a presentation given at the Merging Knowledge workshop in Trento, Italy, December 2010.
Michael Hucka (on behalf of many people)
Senior Research Fellow, California Institute of Technology
Pasadena, California, USA
So much is known, and yet, not nearly enough...
Must weave solutions using different methods & tools
Common side-effect: compatibility problems
Models represent knowledge to be exchanged
SBML
SBML = Systems Biology Markup Language
Format for representing computational models
• Defines object model + rules for its use
  - Serialized to XML
Neutral with respect to modeling framework
• ODE vs. stochastic vs. ...
A lingua franca for software
• Not procedural
Basic SBML concepts are simple
The reaction is central: a process occurring at a given rate
• Participants are pools of entities (species)
• Species can be anything conceptually compatible
$$n_a A + n_b B \xrightarrow{f([A],[B],[P],\ldots)} n_p P$$
$$n_c C \xrightarrow{f(\ldots)} n_d D + n_e E + n_f F$$
...
Models can further include:
• Other constants & variables
• Compartments
• Explicit math
• Discontinuous events
• Unit definitions
• Annotations
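As a minimal sketch of the idea that a reaction is a process occurring at a given rate, the Python below stores each reaction as reactant and product stoichiometries plus a rate function f, and advances the species amounts with a forward Euler step. The `Reaction` class, the `euler_step` helper, the mass-action rate law, and all numeric values are assumptions made for illustration; none of them come from SBML or from any SBML library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Reaction:
    reactants: Dict[str, float]  # species id -> stoichiometry consumed
    products: Dict[str, float]   # species id -> stoichiometry produced
    rate: Callable[[Dict[str, float]], float]  # f(state) -> reaction rate


def euler_step(state: Dict[str, float], reactions: List[Reaction], dt: float) -> Dict[str, float]:
    """Advance every species by one forward Euler step of size dt."""
    new_state = dict(state)
    for rxn in reactions:
        v = rxn.rate(state)  # rate evaluated at the current state
        for species, n in rxn.reactants.items():
            new_state[species] -= n * v * dt
        for species, n in rxn.products.items():
            new_state[species] += n * v * dt
    return new_state


# Hypothetical reaction A + B -> P with an assumed mass-action rate k*[A]*[B].
k = 0.1
binding = Reaction(
    reactants={"A": 1.0, "B": 1.0},
    products={"P": 1.0},
    rate=lambda s: k * s["A"] * s["B"],
)

state = {"A": 1.0, "B": 2.0, "P": 0.0}
for _ in range(100):
    state = euler_step(state, [binding], dt=0.01)
```

Keeping the rate law as an arbitrary function mirrors the slide's f([A],[B],[P],...): the same loop works whether f is mass-action, Michaelis-Menten, or something else.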
Scope of SBML is not limited to metabolic models
Signaling pathway models
• Example: Fernandez et al. (2006), DARPP-32 Is a Robust Integrator of Dopamine and Glutamate Signals, PLoS Computational Biology. BioModels Database model #BIOMD0000000153
Conductance-based models
• “Rate rules” for temporal evolution of quantitative parameters
• Example: Hodgkin & Huxley (1952), A quantitative description of membrane current and its application to conduction and excitation in nerve, J. Physiology 117:500–544. BioModels Database model #BIOMD0000000020
Neural models
• “Events” for discontinuous changes in quantitative parameters
• Example: Izhikevich EM. (2003), Simple model of spiking neurons, IEEE Trans Neural Net. BioModels Database model #BIOMD0000000127
Pharmacokinetic/dynamics models
• “Species” is not required to be a biochemical entity
• Example: Tham et al. (2008), A pharmacodynamic model for the time course of tumor shrinkage by gemcitabine + carboplatin in non-small cell lung cancer patients, Clin. Cancer Res. 14. BioModels Database model #BIOMD0000000234
Infectious diseases
• Example: Munz et al. (2009), When zombies attack!: Mathematical modelling of an outbreak of zombie infection, Infectious Disease Modelling Research Progress, eds. Tchuenche et al., p. 133–150. BioModels Database model #MODEL1008060001
Number of software systems supporting SBML
[Chart: number of systems per year, 2001 to 2010, counted in middle of each year; vertical axis 0 to 200; 205 as of Nov. 28]
Model scale & complexity have been increasing
2342 reactions
[Figure: first page of "A consensus yeast metabolic network reconstruction obtained from a community approach to systems biology", published online 9 October 2008; doi:10.1038/nbt1492]
Herrgård et al., Nature Biotech., 26:10, 2008
Today’s largest models are over 10x bigger!
30,965 reactions!
Aho et al., PLoS One, May 14;5(5), 2010.
SBML continues to evolve
SBML Level 3—A modular SBML
SBML Level 3 Core
Package X Package Y
Package Z
A package adds constructs & capabilities
Models declare which packages they use
• Applications tell users which packages they support
Package development can be decoupled
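One way a tool might check which packages a model declares is sketched below in Python. It assumes that a Level 3 document declares each package as an XML namespace on the root `sbml` element together with a namespaced `required` attribute. The layout namespace URI, the sample document, and the helper names `declared_packages` and `unsupported_required` are illustrative assumptions, not text from the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical Level 3 document declaring one package (assumed namespace URI).
DOC = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core"
      xmlns:layout="http://www.sbml.org/sbml/level3/version1/layout/version1"
      level="3" version="1" layout:required="false">
  <model id="example"/>
</sbml>"""


def declared_packages(sbml_text):
    """Map each package namespace URI to its 'required' flag."""
    root = ET.fromstring(sbml_text)
    packages = {}
    for key, value in root.attrib.items():
        # ElementTree exposes namespaced attributes as "{uri}name".
        if key.startswith("{") and key.endswith("}required"):
            uri = key[1:key.index("}")]
            packages[uri] = (value == "true")
    return packages


def unsupported_required(sbml_text, supported_uris):
    """List required packages that the application does not support."""
    return [
        uri
        for uri, required in declared_packages(sbml_text).items()
        if required and uri not in supported_uris
    ]
```

An application could call `unsupported_required` before simulating and warn the user when the list is non-empty, which is one reading of "applications tell users which packages they support".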
Package                     Specification status
Graph layout                Level 3 version defined; in review
Multicomponent species      Level 3 version defined; in review
Hierarchical composition    Level 3 specification under discussion
Groups                      Level 3 specification under discussion
Qualitative models          Level 3 specification under discussion
Spatial geometry            Level 3 specification under discussion
Arrays & sets               Specification proposed
Distribution & ranges       Specification proposed
Steady-state models         Specification proposed
Graph rendering             Specification proposed
Spatial diffusion           Specification needed
Dynamic structures          Specification needed
Multicomponent species: extends SBML species to represent
• Entities that can exist under different states affecting their behaviors
• Entities that are complexes of other entities
Hierarchical composition: models composed of submodels
Groups: grouping model entities together, for conceptual and annotation purposes
Qualitative models: models in which entity variables are not quantities; e.g., boolean models
Spatial geometry: 2-D and 3-D geometry of physical objects (compartments & species)
Is SBML enough?
Growing community, greater challenges
[Diagram: matrix with columns Model, Procedures, Results and rows Representation format, Minimal info requirements, Semantics (Mathematical), Semantics (Other); legible cell labels: SBRML, ?, and "annotations" under each column]
Annotations add semantics and connections
Annotations can answer questions:
• “What exactly is this entity you call X?”
• “What other identities does this entity have?”
• “What exactly is the process represented by equation ‘e17’?”
• “What role does constant ‘k3’ play in equation ‘e17’?”
• “What mathematical framework is being assumed?”
• “What organism is this in?”
• ... etc. ...
Multiple annotations on the same entity are common
Systems Biology Ontology (SBO)
For semantics of a model’s math
Human- & program-accessible
• Browser interface
• Web services
Math formulas in MathML
<sbml ...>
  ...
  <listOfCompartments>
    <compartment id="cell" size="1e-15" />
  </listOfCompartments>
  <listOfSpecies>
    <species compartment="cell" id="S1" initialAmount="1000" />
    <species compartment="cell" id="S2" initialAmount="0" />
  </listOfSpecies>
  <listOfParameters>
    <parameter id="k" value="0.005" sboTerm="SBO:0000339" />
  </listOfParameters>
  <listOfReactions>
    <reaction id="r1" reversible="false">
      <listOfReactants>
        <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010" />
      </listOfReactants>
      <listOfProducts>
        <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000011" />
      </listOfProducts>
      <kineticLaw sboTerm="SBO:0000052">
        <math> ... </math>
        ...
</sbml>
SBO:0000339 = “forward bimolecular rate constant, continuous case”
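To illustrate how software can read these SBO tags, here is a minimal Python sketch that lists every `sboTerm` attribute in a model. `FRAGMENT` is a trimmed, well-formed stand-in for the slide's fragment: the elided parts and the math are left out and the kinetic law is left empty, which is an assumption for the sake of a parsable example. The `sbo_terms` helper and the `SBO_LABELS` table are hypothetical, and the only label included is the one given on the slide.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for the slide's fragment (no namespace, no math content).
FRAGMENT = """<sbml>
  <listOfParameters>
    <parameter id="k" value="0.005" sboTerm="SBO:0000339" />
  </listOfParameters>
  <listOfReactions>
    <reaction id="r1" reversible="false">
      <listOfReactants>
        <speciesReference species="S1" stoichiometry="2" sboTerm="SBO:0000010" />
      </listOfReactants>
      <kineticLaw sboTerm="SBO:0000052" />
    </reaction>
  </listOfReactions>
</sbml>"""

# Only the label stated on the slide; other terms would be looked up in SBO.
SBO_LABELS = {
    "SBO:0000339": "forward bimolecular rate constant, continuous case",
}


def sbo_terms(xml_text):
    """Return (element label, SBO term) pairs in document order."""
    root = ET.fromstring(xml_text)
    found = []
    for elem in root.iter():
        term = elem.get("sboTerm")
        if term is not None:
            # Prefer an id, then a species reference, then the tag name.
            label = elem.get("id") or elem.get("species") or elem.tag
            found.append((label, term))
    return found


for label, term in sbo_terms(FRAGMENT):
    print(label, term, SBO_LABELS.get(term, "(label not listed here)"))
```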
Le Novère et al., Nature Biotech., 23(12), 2005.
MIRIAM cross-references are simple triples
Model element → relationship qualifier (optional) → Entity referenced
Format: { Data type identifier, Data item identifier, Annotation qualifier }
• Data type identifier (required): URI chosen from agreed-upon list
• Data item identifier (required): syntax & value space depend on data type
• Annotation qualifier (optional): controlled vocabulary term
“Term #1.1.1.1 (alcohol dehydrogenase) in the Enzyme Commission’s Enzyme Nomenclature database”
⇒ urn:miriam:ec-code:1.1.1.1
• "urn:miriam" = URN scheme established by the MIRIAM project
• "ec-code" = chosen by the creator of the entry in MIRIAM Resources
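The URN layout above can be sketched as a pair of Python helpers that build and split such identifiers. The function names are hypothetical. Percent-encoding the item identifier is an assumption inferred from URNs such as `urn:miriam:obo.chebi:CHEBI%3A15996`, where the colon inside the ChEBI identifier appears as `%3A`.

```python
from urllib.parse import quote, unquote

PREFIX = "urn:miriam:"


def make_miriam_urn(data_type, item_id):
    """Build a MIRIAM-style URN from a data type and an item identifier."""
    # Percent-encode reserved characters (such as ':') in the item identifier.
    return PREFIX + data_type + ":" + quote(item_id, safe="")


def parse_miriam_urn(urn):
    """Split a MIRIAM-style URN into (data type, decoded item identifier)."""
    if not urn.startswith(PREFIX):
        raise ValueError("not a urn:miriam identifier: " + urn)
    data_type, sep, encoded = urn[len(PREFIX):].partition(":")
    if not sep or not data_type or not encoded:
        raise ValueError("missing data type or item identifier: " + urn)
    return data_type, unquote(encoded)


print(make_miriam_urn("ec-code", "1.1.1.1"))
print(parse_miriam_urn("urn:miriam:obo.chebi:CHEBI%3A15996"))
```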
MIRIAM Resources provides URI dictionary & resolver
Community-maintained
http://www.ebi.ac.uk/miriam
SBML defines a syntax for annotations
<species metaid="metaid_0000009" id="species_3" compartment="c_1">
  <annotation>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
      <rdf:Description rdf:about="#metaid_0000009">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="urn:miriam:obo.chebi:CHEBI%3A15996"/>
            <rdf:li rdf:resource="urn:miriam:kegg.compound:C00044"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>
Data references: the <rdf:li rdf:resource="..."/> entries inside <rdf:Bag>
Relationship qualifier: <bqbiol:is> ... </bqbiol:is>
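As a sketch of how a program might pull the cross-references out of such an annotation, the Python below walks the RDF with the standard library and returns (model element, qualifier, resource) triples. The `cross_references` helper is hypothetical. The XML is the species element from the slide, and the code assumes the qualifier elements sit directly under `rdf:Description`, as they do there.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQBIOL = "http://biomodels.net/biology-qualifiers/"

SPECIES = """<species metaid="metaid_0000009" id="species_3" compartment="c_1">
  <annotation>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
      <rdf:Description rdf:about="#metaid_0000009">
        <bqbiol:is>
          <rdf:Bag>
            <rdf:li rdf:resource="urn:miriam:obo.chebi:CHEBI%3A15996"/>
            <rdf:li rdf:resource="urn:miriam:kegg.compound:C00044"/>
          </rdf:Bag>
        </bqbiol:is>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>"""


def cross_references(species_xml):
    """Return (metaid, qualifier, resource URI) triples found in the RDF."""
    root = ET.fromstring(species_xml)
    triples = []
    for desc in root.iter("{%s}Description" % RDF):
        # rdf:about points back at the annotated element's metaid.
        about = desc.get("{%s}about" % RDF, "").lstrip("#")
        for qualifier in desc:
            if not qualifier.tag.startswith("{%s}" % BQBIOL):
                continue  # skip anything that is not a biology qualifier
            name = qualifier.tag.split("}", 1)[1]
            for li in qualifier.iter("{%s}li" % RDF):
                triples.append((about, name, li.get("{%s}resource" % RDF)))
    return triples


for triple in cross_references(SPECIES):
    print(triple)
```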
Annotations permit inter-database linking
Even more interesting capabilities are possible
http://www.semanticsbml.org
MIRIAM identifiers now in use by many other projects
Data resources
• BioModels Database (kinetic models)
• PSI Consortium (protein interaction)
• Reactome (pathways)
• Pathway Commons (pathways)
• SABIO-RK (reaction kinetics)
• Yeast consensus model database
• E-MeP (structural genomics)
Application software
• ARCADIA
• BioUML
• COPASI
• libAnnotationSBML
• libSBML
• Saint
• SBML2BioPAX
• SBML2LaTeX
• SBMLeditor
• semanticSBML
• Snazer
• SBW
• The Virtual Cell
SED-ML = Simulation Experiment Description ML
Application-independent format
Captures procedures, algorithms, parameter values
• Steps to go from model to output
libSedML project developing API library
Getting closer to the ideal
People on SBML Team & BioModels Team
SBML Team: Michael Hucka, Sarah Keating, Frank Bergmann, Lucian Smith, Nicolas Rodriguez, Linda Taddeo, Akiya Joukarou, Akira Funahashi, Kimberley Begley, Bruce Shapiro, Andrew Finney, Ben Bornstein, Ben Kovitz, Hamid Bolouri, Herbert Sauro, Jo Matthews, Maria Schilstra
BioModels.net Team: Nicolas Le Novère, Camille Laibe, Nicolas Rodriguez, Nick Juty, Lukas Endler, Vijayalakshmi Chelliah, Chen Li, Harish Dharuri, Lu Li, Enuo He, Mélanie Courtot, Alexander Broicher, Arnaud Henry, Marco Donizelli
Visionaries: Hiroaki Kitano, John Doyle
Agencies to thank for supporting SBML & BioModels.net
National Institute of General Medical Sciences (USA)
European Molecular Biology Laboratory (EMBL)
ELIXIR (UK)
Beckman Institute, Caltech (USA)
Keio University (Japan)
JST ERATO Kitano Symbiotic Systems Project (Japan) (to 2003)
National Science Foundation (USA)
International Joint Research Program of NEDO (Japan)
JST ERATO-SORST Program (Japan)
Japanese Ministry of Agriculture
Japanese Ministry of Educ., Culture, Sports, Science and Tech.
BBSRC (UK)
DARPA IPTO Bio-SPICE Bio-Computation Program (USA)
Air Force Office of Scientific Research (USA)
STRI, University of Hertfordshire (UK)
Molecular Sciences Institute (USA)
Where to find out more
Thank you for listening!
SBML http://sbml.org
BioModels Database http://biomodels.net/biomodels
MIRIAM http://biomodels.net/miriam
MIASE http://biomodels.net/miase
SED-ML http://biomodels.net/sed-ml
SBO http://biomodels.net/sbo
KiSAO http://www.ebi.ac.uk/compneur-srv/kisao/
TEDDY http://www.ebi.ac.uk/compneur-srv/teddy/
SBRML http://tinyurl.com/sbrml