Lecture: Computational Systems Biology Universität des Saarlandes, SS 2012 04 Standards, software, databases Dr. Jürgen Pahle 22.5.2012

Recap

● Equilibrium constant Keq (remember: enzymes don't change Keq)

● Enzyme kinetic laws
  ● Michaelis-Menten (irreversible, reversible)
  ● Competitive, uncompetitive, noncompetitive, substrate inhibition, and others
  ● Hill equation, Monod-Wyman-Changeux model
  ● Other kinetic frameworks (generalised mass action, lin-log, convenience kinetics)

● Dynamical systems
  ● Steady states, stability (local/global), Jacobian matrix, how to find steady states (integration, Newton method), attractors, bifurcations, limit cycles

Standards

● Systems Biology Markup Language (SBML)
● Systems Biology Graphical Notation (SBGN)
● Minimum information requested in the annotation of biochemical models (MIRIAM)

SBML - Systems Biology Markup Language

● Common file format for exchanging models

● Community-driven development

● Many software applications support SBML

● The user does not have to re-enter the same model when switching between software tools

● Often libSBML is used to parse SBML files

Hucka et al. (2003) Bioinformatics 19:524-531

see http://sbml.org

Structure of an SBML model

<?xml version="1.0" encoding="UTF-8"?>
<sbml xmlns="http://www.sbml.org/sbml/level2/version3" metaid="_000000" level="2" version="3">
  <model metaid="_000001" id="NovakTyson1997CellModel" name="Novak1997_CellCycle">
    <listOfUnitDefinitions> ... </listOfUnitDefinitions>
    <listOfCompartments> ... </listOfCompartments>
    <listOfSpecies> ... </listOfSpecies>
    <listOfParameters> ... </listOfParameters>
    <listOfRules> ... </listOfRules>
    <listOfReactions> ... </listOfReactions>
    <listOfEvents> ... </listOfEvents>
  </model>
</sbml>
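As a minimal sketch of how a program might inspect this structure, the snippet below lists the component lists of a skeleton model using only Python's standard library. The skeleton string mirrors the slide with the elided "..." contents left empty (an assumption made so that the example parses), and the helper names are illustrative. Real applications would typically rely on libSBML instead.

```python
import xml.etree.ElementTree as ET

# Skeleton of the model from the slide; the elided list contents are left empty.
SBML_SKELETON = """<?xml version="1.0" encoding="UTF-8"?>
<sbml xmlns="http://www.sbml.org/sbml/level2/version3" metaid="_000000" level="2" version="3">
  <model metaid="_000001" id="NovakTyson1997CellModel" name="Novak1997_CellCycle">
    <listOfUnitDefinitions/>
    <listOfCompartments/>
    <listOfSpecies/>
    <listOfParameters/>
    <listOfRules/>
    <listOfReactions/>
    <listOfEvents/>
  </model>
</sbml>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def list_components(sbml_text):
    """Return the names of the child elements of <model>, in document order."""
    root = ET.fromstring(sbml_text.encode("utf-8"))
    model = next(child for child in root if local_name(child.tag) == "model")
    return [local_name(child.tag) for child in model]


components = list_components(SBML_SKELETON)
print(components)
```

Matching on the local tag name keeps the sketch independent of the exact SBML level and version encoded in the namespace URI.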

SBML (cont.)

● XML file format (Extensible Markup Language, http://www.w3.org/XML)

● SBML defines a kinetic model without specifying the mathematical analysis methods applied to the model or their parameters. These are software dependent, which is why tools like COPASI have their own native file format

● SBML is organised in levels and versions (e.g. L3 V1: Level 3, Version 1)

● Many extensions have been proposed recently (e.g. specification of layouts, spatial processes, multistate species...)

Example: Biomodels 10

<?xml version="1.0" encoding="UTF-8"?>

<!-- This model was downloaded from BioModels Database -->
<!-- Tue May 22 10:18:58 BST 2012 -->
<!-- http://www.ebi.ac.uk/biomodels/ -->

<sbml xmlns="http://www.sbml.org/sbml/level2/version4" metaid="_492719" level="2" version="4">
  <model metaid="_000001" id="Kholodenko2000_MAPK_feedback">
    <notes>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p> This a model from the article: <br/>
          <strong> Negative feedback and ultrasensitivity can bring about oscillations in the mitogen-activated protein kinase cascades. </strong> <br/>
          Kholodenko BN <em>Eur. J. Biochem.</em> 2000:267(6):1583-8 <a href="http://www.ncbi.nlm.nih.gov/pubmed/10712587">10712587</a>, <br/>
          <strong>Abstract:</strong> <br/>
          Functional organization of signal transduction into protein phosphorylation cascades, such as the [..]
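The header of such a file already carries machine-readable information. The following sketch, assuming a shortened and closed version of the excerpt above (the excerpt itself is truncated and would not parse), pulls out the SBML level and version, the model id and the literature link from the notes. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed stand-in for the truncated excerpt above.
HEADER = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" metaid="_492719" level="2" version="4">
  <model metaid="_000001" id="Kholodenko2000_MAPK_feedback">
    <notes>
      <body xmlns="http://www.w3.org/1999/xhtml">
        <p>Kholodenko BN <em>Eur. J. Biochem.</em> 2000:267(6):1583-8
          <a href="http://www.ncbi.nlm.nih.gov/pubmed/10712587">10712587</a>
        </p>
      </body>
    </notes>
  </model>
</sbml>"""

SBML_NS = "{http://www.sbml.org/sbml/level2/version4}"
XHTML_NS = "{http://www.w3.org/1999/xhtml}"

root = ET.fromstring(HEADER)
model = root.find(SBML_NS + "model")

summary = {
    "level": int(root.get("level")),
    "version": int(root.get("version")),
    "model_id": model.get("id"),
    # Every XHTML <a href="..."> found anywhere below <model>.
    "links": [a.get("href") for a in model.iter(XHTML_NS + "a")],
}
print(summary)
```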

Other biological file formats

BioPAX, http://www.biopax.org

● main focus on encoding biological knowledge and not kinetic modelling

● supported by some online databases (export of data)

● little software support

CellML, http://cellml.org

● similar functionality to SBML, but more complex/abstract

● formula-based

● little software support

Topological / qualitative models

● Are often represented as graphs
● Symbols, arrows etc. are not standardized
● Graphical representations are sometimes hard to read or even ambiguous

Ambiguity in graphical notations

source: Le Novère et al. (2009) The Systems Biology Graphical Notation. Nature Biotechnology 27(8):735-741

SBGN

Systems Biology Graphical Notation

http://www.sbgn.org

● Community effort to standardize the way models are represented graphically

● Provides defined ways of representing models using standardized symbols and graphical elements

Different ways of representing models in SBGN

● Activity flow diagram → information flow between entities

● Entity relationship diagram → all relationships in which a given entity participates

● Process description diagram → temporal course of biochemical interactions in a network

Process description diagrams are similar to the diagrams normally found in biochemistry textbooks

Example: Activity flow diagram

● Nodes in the graph represent biological activities, e.g. gene expression

● Edges (arrows) represent the influence the activities have on each other

● Omits state changes, convenient for representing the effects of perturbations
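One way to see why this representation suits perturbation studies is to treat it as a signed directed graph. The sketch below is an illustration only: the activity names and signs are hypothetical placeholders (not taken from the lecture), and the propagation rule simply multiplies signs along the first path found.

```python
from collections import deque

# Hypothetical activity flow: (source activity, target activity) -> sign.
# +1 means stimulation, -1 means inhibition.
INFLUENCES = {
    ("receptor activity", "kinase activity"): +1,
    ("kinase activity", "gene expression"): +1,
    ("gene expression", "downstream process"): -1,
}


def downstream_effects(influences, perturbed, direction=+1):
    """Propagate the sign of a perturbation breadth-first along the edges.

    Each activity keeps the sign from the first path that reaches it.
    """
    effects = {perturbed: direction}
    queue = deque([perturbed])
    while queue:
        source = queue.popleft()
        for (src, tgt), sign in influences.items():
            if src == source and tgt not in effects:
                effects[tgt] = effects[source] * sign
                queue.append(tgt)
    return effects


# Knock-down of the first activity (direction = -1).
effects = downstream_effects(INFLUENCES, "receptor activity", direction=-1)
print(effects)
```

No state changes of molecules are modelled here, which is exactly the simplification an activity flow diagram makes.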

Example: Entity relationship diagram

Represents the interactions between entities and the rules that control them. Neglects temporal aspects.

Example: Process description diagram

● Represents processes that convert physical entities into other entities, change their states, or change their location

Symbols for Process Description Diagrams

SBGN example: MAPK

MIRIAM

● Minimum information requested in the annotation of biochemical models

● Defines what features should be reported with a biochemical model

● Ensures quality standard with a view to re-usability of models

Le Novère et al. (2005) Nature Biotechnol. 23:1509-15

http://www.biomodels.net/miriam

MIRIAM

For a model to be MIRIAM-compliant:
● the model must be encoded electronically
● the model must be described in a publication
● the encoding must agree with the publication
● the encoded model must be instantiated in a simulation
● this simulation must reproduce the results of the publication

MIRIAM: required information

● Model name
● Authors, affiliation, contact, date of creation
● Citation with complete description of the model
● Definition of terms of distribution
● Annotations for:

Annotations

● Annotations are important because they link (arbitrarily named) model entities to entities in the real world (uniquely described in databases, such as Gene Ontology, ChEBI, UniProt, KEGG, and many more)

● Essential, e.g. when merging models
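To illustrate the merging case, the sketch below compares two species that carry different ids in two models but point to the same database entry. It assumes the RDF annotation layout commonly used in SBML files (a `bqbiol:is` qualifier with `rdf:li` resources). The species ids, metaids and the ChEBI identifiers are illustrative examples and should be checked against the database before reuse.

```python
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"


def species_xml(species_id, metaid, resource):
    """Build a minimal annotated species element (illustrative, not a full SBML file)."""
    return f"""<species id="{species_id}" metaid="{metaid}">
  <annotation>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
      <rdf:Description rdf:about="#{metaid}">
        <bqbiol:is><rdf:Bag><rdf:li rdf:resource="{resource}"/></rdf:Bag></bqbiol:is>
      </rdf:Description>
    </rdf:RDF>
  </annotation>
</species>"""


def annotation_resources(xml_text):
    """Collect all rdf:resource URIs found in the annotation of one element."""
    element = ET.fromstring(xml_text)
    return {li.get(RDF + "resource") for li in element.iter(RDF + "li")}


# Assumed example identifiers (glucose and ATP in ChEBI).
GLUCOSE = "urn:miriam:obo.chebi:CHEBI%3A17234"
ATP = "urn:miriam:obo.chebi:CHEBI%3A15422"

species_a = species_xml("glc", "meta_a1", GLUCOSE)          # name used in model A
species_b = species_xml("Glucose_ext", "meta_b7", GLUCOSE)  # name used in model B
species_c = species_xml("x3", "meta_b9", ATP)

same_ab = bool(annotation_resources(species_a) & annotation_resources(species_b))
same_ac = bool(annotation_resources(species_a) & annotation_resources(species_c))
print(same_ab, same_ac)
```

The ids `glc` and `Glucose_ext` share no naming convention, yet the common annotation shows that they describe the same real-world entity.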

Software

● GNU Octave / MATLAB / R
● Berkeley Madonna
● CellDesigner
● COPASI
● VCell
● ...

Mendes group

Kummer group

● COPASI (COmplex PAthway SImulator)
● Software for the simulation and analysis of biochemical networks
● “Tool kit” with a variety of different methods:
  ● Deterministic, stochastic and hybrid simulation methods
  ● Metabolic Control Analysis, Elementary Flux Mode Analysis, Sensitivity Analysis
  ● Parameter Scanning, Optimization, Parameter Fitting
● User-friendly GUI, runs under Mac, Linux, Windows and Solaris; a command line version is also available
● Artistic license/open-source
● reads and writes SBML, etc.

http://www.copasi.org

S. Hoops, S. Sahle, R. Gauges, C. Lee, J. Pahle, N. Simus, M. Singhal, L. Xu, P. Mendes, U. Kummer (2006) “COPASI – a COmplex PAthway SImulator” Bioinformatics 22(24):3067-3074

Download COPASI

● navigate to www.copasi.org
● follow Download → Free version
● select latest stable version
● select your operating system: Linux, Mac OS or Windows
● install

Please also check the documentation and user forum at www.copasi.org!

www.copasi.org

● Frequent releases

● User Forum (recommended!)

● Documentation
  ● User Manual
  ● FAQ
  ● Technical documentation
    ● File format specification
    ● Documentation of API
    ● etc.

● Issue Tracker

● Please send bug reports to [email protected], specify problem and, if possible, attach .cps file

Command line version

● CopasiSE

All model-relevant information (including the tasks to run) is contained in the .cps file (CopasiML, an XML schema)

Usage: CopasiSE [options] [file]
  --SBMLSchema schema           The Schema of the SBML file to export.
  --configdir dir               The configuration directory for copasi. The default is .copasi in the home directory.
  --configfile file             The configuration file for copasi. The default is copasi in the ConfigDir.
  --exportBerkeleyMadonna file  The Berkeley Madonna file to export.
  --exportC file                The C code file to export.
  --exportXPPAUT file           The XPPAUT file to export.
  --home dir                    Your home directory.
  --license                     Display the license.
  --nologo                      Surpresses the startup message.
  --validate                    Only validate the given input file (COPASI, Gepasi, or SBML) without performing any calculations.
  --verbose                     Enable output of messages during runtime to std::error.
  -c, --copasidir dir           The COPASI installation directory.
  -e, --exportSBML file         The SBML file to export.
  -i, --importSBML file         A SBML file to import.
  -s, --save file               The file the model is saved to after work.
  -t, --tmp dir                 The temp directory used for autosave.
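As a hedged sketch of how the command line version could be scripted, the snippet below builds a CopasiSE call that imports an SBML file and saves it as a .cps file. It uses only options from the usage text above (`--nologo`, `--importSBML`, `--save`). The file names are hypothetical, and the tool is only started if an executable named `CopasiSE` is found on the PATH.

```python
import shutil
import subprocess


def copasi_import_command(sbml_file, cps_file, executable="CopasiSE"):
    """Build the argument list: import an SBML file and save it as a .cps file."""
    return [executable, "--nologo", "--importSBML", sbml_file, "--save", cps_file]


# Hypothetical file names, for illustration only.
command = copasi_import_command("model.xml", "model.cps")
print(" ".join(command))

# Run the tool only if it is actually installed; otherwise just show the command.
if shutil.which(command[0]) is not None:
    subprocess.run(command, check=False)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.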

CellDesigner

http://www.celldesigner.org

● free, Java-based visual modelling tool

● qualitative model is created by drawing a reaction network

● available for Windows, Linux and Mac OS X

● graphical display similar to (but not exactly the same as) SBGN

● some analysis capabilities (simulation)

● models can be exported to SBML format

Databases

● Pathway databases
● Databases that provide kinetic information
● Interaction databases
● Model databases
● and many more...

Pathway databases

Pathway databases mostly provide information on the reaction network topology of one or more organisms.

Examples:

● KEGG, http://www.genome.jp/kegg

● REACTOME, http://www.reactome.org

● WikiPathways, http://www.wikipathways.org

● BioCyc, http://biocyc.org

● MetaCyc, http://metacyc.org

● Pathway Commons, http://www.pathwaycommons.org

● Biocarta, http://www.biocarta.com

See also http://www.pathguide.org for additional resources.

Kinetic data

There are very few databases that contain information on kinetic parameters and rate laws for different enzymes.

Examples:

● BRENDA, http://www.brenda-enzymes.org

● SABIO-RK, http://sabiork.h-its.org

Interaction databases

Especially for regulatory models, interaction data is more important than information on metabolic reactions.

Large number of different databases available.

Examples:

● IntAct, http://www.ebi.ac.uk/intact (general physical interactions)

● BioGRID, http://thebiogrid.org (physical and functional interactions)

● STRING, http://string-db.org (physical and functional interactions)

● STITCH, http://stitch.embl.de (protein-chemical interactions)

● ...

Model databases

● Biomodels, http://www.ebi.ac.uk/biomodels-main

● JWS Online, http://jjj.biochem.sun.ac.za

● CellML Model Repository, http://models.cellml.org

Biomodels database is the largest and most important.

Some publishers require models to be available in a database when the corresponding article is published.

Large overlap between models in different databases.

Biomodels database

Biomodels database is part of Biomodels.net, a website for different standards, formats and models.

Biomodels database (cont.)

http://www.ebi.ac.uk/biomodels-main

● Curated and non-curated branch
● Allows advanced searches, e.g. according to Gene Ontology terms, authors, and many more
● Models can be downloaded in SBML and other formats

Biomodels.net

www.ebi.ac.uk/biomodels-main

JWS Online

http://jjj.biochem.sun.ac.za

● Online model repository
● Simulation and analysis methods (Java based)

JWS Online

Bionumbers database

http://bionumbers.hms.harvard.edu

Bionumbers database (cont.)

Data sources

● Usually data from different sources have to be combined to build a model

● Overlap between different databases but also inconsistencies and errors (CAUTION)

● Most often, important data is still missing →
  ● additional data has to be extracted from literature (manually or using text mining techniques)
  ● need for measuring quantities in lab experiments

Exercise

● In the exercise on Thursday, 24.5.2012 we will cover Worksheet 3