PhD Dissertation
International Doctorate School in Information and Communication Technologies
DIT - University of Trento
Modeling and simulating system biology
with stochasticity
Paola Lecca
Advisor:
Prof. Corrado Priami
Microsoft Research - University of Trento
Centre for Computational and Systems Biology
December 2006
-
Acknowledgements
To thank means to acknowledge a debt, and I am thoroughly indebted
to Corrado Priami. Corrado was an exceptional advisor, supporting
my independence while providing scientific guidance, direction and intel-
lectual stimulation. In this thesis and in the papers that we wrote together
as well, he helped me turn my ideas into concrete scientific results.
I am also very grateful to Paola Quaglia because during my Ph. D.
studies I could benefit from her guidance and because she was a good
mentor. In particular I am grateful for her openness to the field of scientific
investigation presented in this thesis, where her expertise in concurrency
theory and functional languages and programming met my background in
Physics.
I was extremely fortunate to work with Carlo Laudanna, Gabriela Con-
stantin, Ines Mancini, Rita Frassanito and their co-workers. I owe my
biological motivation and thinking to all of them. In particular this the-
sis is the result of a profitable collaboration started three years ago with
Carlo and Gabriela for modeling and simulating autoreactive lymphocyte
recruitment in inflamed brain micro-vessels. I am very grateful to Carlo
and Gabriela, who devoted time, ideas and patience. Many thanks to Ines
for the stimulating scientific discussions about chemistry, biochemistry and
biology.
I am personally thankful to Andrew Phillips, Luca Cardelli, Adelinde
Uhrmacher, Mathew Palakal, Ralf Blossey and John J. Tyson for their
interest in my work and their constant encouragement. I learned from
them several new ways to look into biology by means of the ideas and the
tools offered by Computer Science. Andrew gave me useful suggestions for
the implementation of an efficient stochastic algorithm for the prototype of
the BioBeta simulator. I also thank Aviv Regev and William Silverman, who
introduced me to the use of the ASPIC simulator.
I wish also to thank all the technical staff of the Dipartimento di In-
formatica e Telecomunicazioni of my University for their kindness and the
promptness of their intervention. Many thanks to the secretariat of the
Doctorate School for the attention and kindness in helping me in the jun-
gle of bureaucracy.
Special thanks also go to the researchers and students of the Microsoft
Research - University of Trento Centre for Computational and Systems
Biology. I appreciated them not only as scientists but also as wonderful
people to work with.
A pleasant surprise of my Ph. D. course was acquiring new friends working
in disciplines distant from mine and coming from different European and
non-European countries. I am grateful to them for their friendship, encouragement,
scientific discussion and moments of leisure. I also found exceptional
colleagues, whom I wish to thank for their collaboration and for the questions
that helped me to improve my work.
Finally I would like to thank my family, to whom I dedicate this thesis
as a sign of my gratitude for their support in the difficult moments and for
sharing my happiness in the beautiful moments of my Ph. D. studies.
I am extremely grateful to my sister, Michela, who believed in me
and encouraged my intellectual path.
-
Abstract
Modeling and simulation of pathways as networks of biochemical reac-
tions have received increased interest in the context of system biology. The
central dogma of this re-emerging area states that it is system dynamics
and organizing principles of complex biological phenomena that give rise to
functioning and function of cells. Cell functions, such as growth, division,
differentiation and apoptosis are temporal processes, that can be under-
stood if they are treated as dynamic systems. System biology focuses on an
understanding of functional activity from a system-wide perspective and,
consequently, it is defined by two key questions: (i) how do the components
within a cell interact, so as to bring about its structure and functioning?
(ii) How do cells interact, so as to develop and maintain higher levels of
organization and functions? In recent years, wet-lab biologists embraced
mathematical modeling and simulation as two essential means toward answering
the above questions. The credo of dynamic systems theory is that
the behavior of a biological system is given by the temporal evolution of its
state. Our understanding of the time behavior of a biological system can be
measured by the extent to which a simulation mimics the real behavior of
that system. Deviations of a simulation indicate either limitations or errors
in our knowledge. The biochemical approach to understanding biological
processes is essentially one of simulation. A biochemist typically prepares
a cell-free extract that can mediate a well-described physiological process.
Once the extract is fractionated to purify the components that catalyze
individual reactions, the physiological process is reconstructed in vitro. The
validity of this approach is measured by how closely the in vitro reconstructed
process matches physiological observations. In this thesis we show that by
carefully representing the principles and logic of the wet-lab reasoning, we
can simulate faithfully on a computer a model of a biochemical pathway.
We also show how simulation can be used as an interactive modeling tool for
reasoning about biochemical interactions in the design of experiments, in
discovery, and in prediction. This work highlights that realistic simulations
of the evolution of biological systems require a mathematical model of
the stochasticity of the involved processes and a formalism for specifying
the concurrent nature of the biochemical interactions. The Gillespie simulation
algorithm, suitably generalized to simulate biochemical interactions
rather than chemical reactions, satisfies the first requirement. The second
requirement can be satisfied both by the stochastic π-calculus and by the
stochastic Beta binders formalisms. These languages shift the focus from a
Newtonian vision of the molecular dynamics to a detailed specification of
the structure and functions of the components. The thesis thoroughly examines
and critically discusses the main concepts of stochastic chemical kinetics and
develops the necessary re-formulations to adapt them to a biological context.
The work shows models and simulation results of low-level systems, such as
simple chemical reactions, and of higher-level processes, such as the cell cycle,
lymphocyte recruitment, pathogenesis of familial Parkinson’s disease and
voltage gating of ion channels in intercellular communications. The studies
reveal the inefficiency of the usual ordinary differential equation approach
to describe these phenomena. The dynamic modeling of molecular or cell
biological systems is different from the classical application of differential
equations in mechanics, because mathematical models of biochemical pathways
are phenomenological constructions in that the interactions among system
variables are defined in an operational rather than a mechanistic manner.
Finally, we present a prototype simulator for the stochastic Beta binders,
which joins the expressive power of this new formalism in specifying biological
interactions with a re-formulation of the principles of the stochastic
chemical kinetics closer to the world of biological phenomena.
Keywords: system biology, stochastic simulation, chemical kinetics, π-
calculus, Beta binders
-
Contents
1 Introduction 1
1.1 What is biological modeling . . . . . . . . . . . . . . . . . 1
1.2 System Biology . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.1 What future for System Biology . . . . . . . . . . . 10
1.3 Complexity of a biological system . . . . . . . . . . . . . . 11
1.4 Stochastic modeling approach . . . . . . . . . . . . . . . . 13
1.4.1 Stochastic simulation algorithms . . . . . . . . . . . 21
1.5 Formalizing the complexity . . . . . . . . . . . . . . . . . . 23
1.5.1 Disadvantages in using O.D.E. for system biology: two
alternatives . . . . . . . . . . . . . . . . . . . . . . 26
1.5.2 The π-calculus . . . . . . . . . . . . . . . . . . . . 32
1.5.3 The Beta binders formalism . . . . . . . . . . . . . 41
1.6 Summarizing . . . . . . . . . . . . . . . . . . . . . . . . . 64
2 Chemical kinetics 67
2.1 The mathematical structure of biological models . . . . . . 67
2.2 Chemical reactions . . . . . . . . . . . . . . . . . . . . . . 70
2.3 Kinetics of chemical reactions . . . . . . . . . . . . . . . . 72
2.3.1 Mass-action kinetics . . . . . . . . . . . . . . . . . 78
2.3.2 Example 1: the Lotka-Volterra system . . . . . . . 80
2.3.3 Example 2: the Michaelis-Menten kinetics . . . . . 87
2.4 The structure of kinetic models . . . . . . . . . . . . . . . 89
2.4.1 Properties of process-time . . . . . . . . . . . . . . 90
2.4.2 Properties of state-space . . . . . . . . . . . . . . . 92
2.4.3 Nature of determination . . . . . . . . . . . . . . . 94
2.4.4 XYZ models . . . . . . . . . . . . . . . . . . . . . . 96
2.5 Markov processes . . . . . . . . . . . . . . . . . . . . . . . 97
2.6 The master equation . . . . . . . . . . . . . . . . . . . . . 100
2.6.1 The chemical master equation . . . . . . . . . . . . 101
2.7 Molecular approach to chemical kinetics . . . . . . . . . . 104
2.7.1 Reactions are collisions . . . . . . . . . . . . . . . . 104
2.7.2 Reaction rates . . . . . . . . . . . . . . . . . . . . . 110
2.7.3 The need for stochastic rates in stochastic simulations . . 114
2.7.4 Zeroth-order reactions . . . . . . . . . . . . . . . . 117
2.7.5 First-order reactions . . . . . . . . . . . . . . . . . 118
2.7.6 Second-order reactions . . . . . . . . . . . . . . . . 119
2.7.7 Higher-order reactions . . . . . . . . . . . . . . . . 120
2.8 Fundamental hypothesis of stochastic chemical kinetics . . 121
2.9 General derivation of the stochastic rate constant . . . . . 124
2.10 The reaction probability density function . . . . . . . . . . 129
2.11 The stochastic simulation algorithms . . . . . . . . . . . . 131
2.11.1 Direct Method . . . . . . . . . . . . . . . . . . . . 132
2.11.2 First Reaction Method . . . . . . . . . . . . . . . . 134
2.11.3 Next Reaction Method . . . . . . . . . . . . . . . . 135
2.12 Time-dependent extension of First Reaction
Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
2.12.1 A case study: the passive transport of glucose . . . 141
2.12.2 Simulation results . . . . . . . . . . . . . . . . . . . 147
2.13 StochSim algorithm . . . . . . . . . . . . . . . . . . . . . . 153
2.14 Advantages and drawbacks of Gillespie algorithm . . . . . 155
2.15 Spatio-temporal algorithms . . . . . . . . . . . . . . . . . 156
2.16 The Langevin equation . . . . . . . . . . . . . . . . . . . . 158
2.16.1 Use and abuse of Langevin equation . . . . . . . . . 159
2.17 Hybrid algorithms . . . . . . . . . . . . . . . . . . . . . . . 161
2.17.1 Hybrid modeling in intercellular communication . . 163
3 Biochemical stochastic π-calculus models and simulations 185
3.1 Models and simulation of higher levels biological systems . 185
3.2 The biochemical stochastic π-calculus . . . . . . . . . . . . 188
3.2.1 The stochastic engine of π-calculus . . . . . . . . . 193
3.3 Eukaryotic cell cycle . . . . . . . . . . . . . . . . . . . . . 195
3.3.1 Molecular machinery of the cell cycle . . . . . . . . 196
3.3.2 A simple model of Start and Finish . . . . . . . . . 198
3.3.3 Specification . . . . . . . . . . . . . . . . . . . . . . 201
3.3.4 Simulation . . . . . . . . . . . . . . . . . . . . . . . 205
3.3.5 Remarks . . . . . . . . . . . . . . . . . . . . . . . . 206
3.4 Autoreactive lymphocyte recruitment . . . . . . . . . . . . 208
3.4.1 Quantitative models of lymphocyte recruitment . . 210
3.4.2 Dembo adhesion model . . . . . . . . . . . . . . . . 211
3.4.3 Stochastic π-calculus adhesion and rolling model . . 215
3.4.4 Main results . . . . . . . . . . . . . . . . . . . . . . 220
3.4.5 BioSpi prediction of rolling cells percentage as a func-
tion of vessel diameter . . . . . . . . . . . . . . . . 221
3.4.6 Remarks . . . . . . . . . . . . . . . . . . . . . . . . 225
3.5 A faulty mechanism of protein folding and degradation in
Parkinson’s disease . . . . . . . . . . . . . . . . . . . . . . 226
3.5.1 Mechanisms of neurodegeneration . . . . . . . . . . 227
3.5.2 First model: misfolded protein accumulation induced
by mutant α-synuclein . . . . . . . . . . . . . . . . 230
3.5.3 Second model: misfolded protein accumulation in-
duced by mutant parkin . . . . . . . . . . . . . . . 234
3.5.4 Third model: misfolded protein accumulation as func-
tion of chaperones number . . . . . . . . . . . . . . 236
3.5.5 Remarks and future directions . . . . . . . . . . . . 237
4 Stochastic Beta binders and a prototype of simulator 243
4.1 A new language for system biology . . . . . . . . . . . . . 243
4.2 Syntax and semantics of Beta binders . . . . . . . . . . . . 245
4.3 Stochastic Beta-binders . . . . . . . . . . . . . . . . . . . . 248
4.4 BioBeta simulator . . . . . . . . . . . . . . . . . . . . . . . 254
4.4.1 Implementation . . . . . . . . . . . . . . . . . . . . 255
4.5 Simulation algorithm . . . . . . . . . . . . . . . . . . . . . 269
4.5.1 Propensity functions for boxes interactions . . . . . 269
4.5.2 Execution . . . . . . . . . . . . . . . . . . . . . . . 273
4.5.3 Simulations . . . . . . . . . . . . . . . . . . . . . . 276
5 Wet experiments for kinetics studies in system biology 287
5.1 Monitoring a chemical reaction . . . . . . . . . . . . . . . 287
5.2 Mass spectrometry . . . . . . . . . . . . . . . . . . . . . . 288
5.3 NMR spectroscopy . . . . . . . . . . . . . . . . . . . . . . 294
5.4 Model testing and confirmation . . . . . . . . . . . . . . . 297
A Simulation in system biology: the state of the art 303
A.0.1 Language-based approaches . . . . . . . . . . . . . 309
A.0.2 Databases of quantitative and qualitative information 310
B Analysis of a two-component signal transduction: a model
for the feedback loops on protein translation and degrada-
tion 313
B.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 314
B.2 Reaction equation . . . . . . . . . . . . . . . . . . . . . . . 317
B.2.1 Model reduction . . . . . . . . . . . . . . . . . . . . 320
B.2.2 The organization of the regulatory network . . . . . 322
B.3 Feedback loops . . . . . . . . . . . . . . . . . . . . . . . . 324
B.3.1 Negative feedback loop on translation . . . . . . . . 325
B.3.2 Auto-activation of protein degradation . . . . . . . 329
B.3.3 A linear model . . . . . . . . . . . . . . . . . . . . 331
B.4 Experimental detection of feedback loops . . . . . . . . . . 333
B.5 Some remarks . . . . . . . . . . . . . . . . . . . . . . . . . 339
C Inference for rate constants 341
C.1 Complete data observations . . . . . . . . . . . . . . . . . 342
C.2 Incomplete data observations . . . . . . . . . . . . . . . . 345
Bibliography 348
-
List of Tables
2.1 Classes of biological phenomena and most used formalisms to describe them. 69
2.2 Reactions of the chemical model displayed in Fig. 2.7. No. corresponds to
the number in the figure. . . . . . . . . . . . . . . . . . . . . . . 102
2.3 Reactions of the chemical model depicted in Fig. 2.7, their propensity and
corresponding "jump" of state vector ~nTR. V is the volume in which the
reactions occur. . . . . . . . . . . . . . . . . . . . . . . . . . . 103
2.4 Set of chemical master equations describing the metabolite interactions shown
in Fig. 2.7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
2.5 Rates expression and O.D.E. model for GLUT transporter
[44]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
2.6 Passive glucose transport model in stochastic π-calculus.
$a_{bind}$, $a_{facing}$, and $a_{unbinding}$ are the propensities for the
reaction of binding of the glucose molecule with the GLUT
transporter, for the facing of the glucose molecule to the
interior of the cell membrane and for the breakage of the
complex glucose-GLUT transporter, respectively. . . . . . . 144
3.1 Reduction rules of stochastic π-calculus . . . . . . . . . . . . . . . . 189
3.2 Heterodimer complex formation and breakage. . . . . . . . . . . . . . 190
3.3 Parameters values. See [133, 186]. . . . . . . . . . . . . . . . . . . 200
3.4 Biochemical stochastic π-calculus specification of cell cycle control mecha-
nisms in eukaryotic cell . . . . . . . . . . . . . . . . . . . . . . . 202
3.5 Biochemical stochastic π-calculus specification of the 4-phase lymphocyte
recruitment process. . . . . . . . . . . . . . . . . . . . . . . . . 217
3.6 Deterministic rates for the 4 phases of lymphocyte recruitment. . . . . . 219
3.7 Space parameters and densities. . . . . . . . . . . . . . . . . . . . 221
3.8 Three sets of experimental values of rolling cell percentage (RCP) for different
values of the vessel diameter Dv. The estimated experimental error on the
rolling cells percentage is ±3. . . . . . . . . . . . . . . . . . . . . 225
3.9 Stochastic π-calculus specification of a model of a faulty mechanism of protein
folding and degradation. . . . . . . . . . . . . . . . . . . . . . . 233
3.10 Channels rates and number of processes used in the models. The symbol “*”
means that the value is a variable parameter (see Fig. 3.17). . . . . . . . 235
4.1 Laws for structural congruence in Beta binders. This table is taken from [147]. 247
4.2 Axioms and rules for stochastic reduction relations in Beta binders. This
table is taken from [34]. . . . . . . . . . . . . . . . . . . . . . . 249
4.3 Auxiliary functions. This table is taken from [34]. . . . . . 253
4.4 States for join reactions of the elements of the system Sys. . . . . . . . 259
4.5 Example of states and functions defining the bio-processes enzyme and substrate 263
4.6 Examples of BioBeta syntax for processes and their corresponding π-calculus
syntax. The indication of the deadlock process Nil can be omitted. . . . 269
4.7 Simulation parameters of the Beta binders model of ionic bonding between
Sodium and Chlorine. . . . . . . . . . . . . . . . . . . . . . . . 276
4.8 Simulation parameters for the APC/CDK antagonism. The rates associated
to the channels involved in inter reactions are those k’s used in Chapter 3
Section 3.3.2 Table 3.3 for the same reactions. The rates of all the hide and
unhide reductions not included in this table are infinite. . . . . . . . . . 284
B.1 Simulation parameters [94]. . . . . . . . . . . . . . . . . . . . . . 328
-
List of Figures
1.1 The cell biology research cycle. . . . . . . . . . . . . . . . . . . . 4
1.2 (A) Kinetics of the changes of the enzyme E and the complex enzyme-
substrate ES as function of time (in seconds). (B) Kinetics of the changes of
the product (P) and the substrate S as function of time (in seconds). Both
(A) and (B) simulations have been performed with an initial number of en-
zyme and substrate particles E0=10 and P0 = 10, respectively. (C) Kinetics
of the changes of the enzyme E and the complex enzyme-substrate ES as
function of time (in seconds). (D) Kinetics of the changes of the product P
and the substrate S as function of time (in seconds). Both (C) and (D) simulations
have been performed with an initial number of enzyme and substrate
particles E0=1000 and P0 = 1000, respectively. The simulations have been
obtained with the Direct Gillespie algorithm [49] implementation of Dizzy
simulator [155]. . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.3 Deterministic simulations of the kinetics of the changes of E and ES (A) and
of P and S (B) as function of time (in seconds). The initial concentrations
of enzyme and product have been set to E0=1000 and P0 = 1000, respectively. 18
1.4 Ionic bonding between Na and Cl atoms. Na sends a message e on channel c
to Cl, which receives it on the same channel c. After this communication Na
becomes Na+ and Cl becomes Cl−. . . . . . . . . . . . . . . . . . 36
1.5 Visualization of the specification of the "physical binding" reaction in π-calculus. 38
1.6 Competitive inhibition: substrate and inhibitor interact with enzyme in a
mutually exclusive way. . . . . . . . . . . . . . . . . . . . . . . . 42
1.7 Pictorial representation of the bound states of enzyme and substrate in the
molecular complex enzyme-substrate. . . . . . . . . . . . . . . . . . 42
1.8 A system of two parallel bio-processes B1 and B2 (left and right box, respectively)
in (A). Bio-process intra (B) and inter (C) reductions [147]. . . . 46
1.9 Graphical representation of the evolution of a bio-process due to expose (A),
hide (B), and unhide reductions (C). The expose rule assumes that $z \notin \Delta$ and
$z \neq x$. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
1.10 (A) The execution of the join reduction defined in (1.8). (B) The execution
of the split reduction defined in (1.9). As for the join rule, note that, unlike
BioAmbients, the Beta binders formalism forbids the nesting of boxes. . . 48
2.1 The state space of a binary mixture. . . . . . . . . . . . . . . . . . 73
2.2 Accessible states for the reactions 2A ⇋ 2B with C = 7. . . . . . . . . 78
2.3 Lotka-Volterra dynamics for [Y1]t=0, [Y2]t=0, k1 = 1, and k2 = k3 = 0.1. . . 82
2.4 Lotka-Volterra dynamics for [Y1]t=0, [Y2]t=0, k1 = 1, and k2 = k3 = 0.1. The
equilibrium solution for this combination of parameters is [Y1] = 1 and [Y2] =
10. These values correspond to the coordinates of the nullclines intersection
points. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
2.5 Dimerization dynamics for [P ]t=0 = 1, [P2]t=0 = 0, k1 = 0 and k2 = 0.5. Thus
the equilibrium constant is Keq = 2, and the equilibrium concentrations are
[P ]eq = 0.39 and [P2]eq = 0.30. c = 1. . . . . . . . . . . . . . . . . . 86
2.6 A. Experimental rate of loss of optical activity of sucrose for three initial
concentrations of sucrose and fixed concentrations of the enzyme. Data of
Michaelis and Menten replotted from Wong [195]. B. The initial rate V (0)
in A. of the invertase catalyzed reaction plotted as a function of sucrose
concentration. . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
2.7 Two metabolites A and B coupled by a bimolecular reaction. Adapted from
[78] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
2.8 Since the curve shape is not symmetric, the average kinetic energy will always
be greater than the most probable. For the reaction to occur, the particles
involved need a minimum amount of energy - the activation energy. . . . . 106
2.9 Maxwell-Boltzmann speed distributions at different temperatures. As temperature
increases, the curve will spread to the right and the peak value at the
most probable kinetic energy will decrease. As temperature increases, the
probability of finding molecules at higher energy increases. Note also that
the area under the curve is constant since total probability must be one. . 106
2.10 The collision volume δVcoll which molecule 1 will sweep out relative to molecule
2 in the next small time interval δt. . . . . . . . . . . . . . . . . . 108
2.11 Five deterministic solutions of the birth-death process given by Eq. (2.40)
for values of λ− µ given in the legend and for x0 = 50. . . . . . . . . . 115
2.12 Five stochastic realizations of the birth-death process together with the deterministic
solution (x0 = 50, λ = 3, µ = 4). . . . . . . . . . . . . . . 116
2.13 A. Cartoon of the four states of the GLUT transporter. B.
Kinetic diagram. . . . . . . . . . . . . . . . . . . . . . . . 142
2.14 Time behavior of the fraction of GLUT transporters in states S1 and S2 (x1
and x2 respectively). Simulation obtained from O. D. E. model. . . . . . 149
2.15 Time behavior of glucose concentration in and out of cell. The system is in
equilibrium at t ≈ 4.9 min. Simulation obtained from O. D. E. . . . . . 149
2.16 Time behavior of the fraction of GLUT transporters in states S1 and S2.
Simulation obtained from time-dependent First Reaction Method algorithm.
This figure shows the stochastic simulation result corresponding to the deterministic
model shown in Fig. 2.14. . . . . . . . . . . . . . . . . . 150
2.17 Time behavior of glucose concentration in the cell. Simulation obtained from
time-dependent First Reaction Method algorithm. This figure shows the
stochastic simulation result corresponding to the deterministic model shown
in Fig. 2.15. . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
2.18 Probability density function of the binding reaction between GLUT trans-
porter and GLUCOSE molecule versus temperature. . . . . . . . . . . 152
2.19 Time behavior of glucose concentration in the cell. Simulation obtained with
the Direct. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
2.20 Time behavior of the GLUT transporter concentrations in states S1 and S2. 153
2.21 Time behavior of the reverse potential, number of open channels and rates of
gating. The initial number of open channels is 100. . . . . . . . . . . . 166
2.22 Time behavior of the reverse potential, number of open channels and rates of
gating. The initial number of open channels is 200. . . . . . . . . . . . 167
2.23 Time behavior of the reverse potential, number of open channels and rates of
gating. The initial number of open channels is 300. . . . . . . . . . . . 168
2.24 Time behavior of the reverse potential, number of open channels and rates of
gating. The initial number of open channels is 400. . . . . . . . . . . . 169
2.25 Time behavior of the reverse potential, number of open channels and rates of
gating. The initial number of open channels is 500. . . . . . . . . . . . 170
2.26 Time behavior of the reverse potential, number of open channels and rates of
gating. The initial number of open channels is 600. . . . . . . . . . . . 171
2.27 Time behavior of the reverse potential, number of open channels and rates of
gating. The initial number of open channels is 700. . . . . . . . . . . . 172
2.28 Time behavior of the reverse potential, number of open channels and rates of
gating. The initial number of open channels is 800. . . . . . . . . . . . 173
2.29 Time behavior of the reverse potential, number of open channels and rates of
gating. The initial number of open channels is 900. . . . . . . . . . . . 174
2.30 Time behavior of the reverse potential, number of open channels and rates of
gating. The initial number of open channels is 1000. . . . . . . . . . . . 175
2.31 Time behavior of the reverse potential, number of open channels and rates of
gating. The initial number of open channels is 5000. . . . . . . . . . . . 176
2.32 Time behavior of the reverse potential, number of open channels and rates of
gating. C = 0.1 µF/cm$^2$ . . . . . . . . . . . . . . . . . . . . . . 177
2.33 Time behavior of the reverse potential, number of open channels and rates of
gating. C = 0.5 µF/cm$^2$ . . . . . . . . . . . . . . . . . . . . . . 178
2.34 Time behavior of the reverse potential, number of open channels and rates of
gating. C = 1 µF/cm$^2$ . . . . . . . . . . . . . . . . . . . . . . . 179
2.35 Time behavior of the reverse potential, number of open channels and rates of
gating. C = 20 µF/cm$^2$ . . . . . . . . . . . . . . . . . . . . . . . 180
2.36 Time behavior of the reverse potential, number of open channels and rates of
gating. C = 30 µF/cm$^2$ . . . . . . . . . . . . . . . . . . . . . . 181
2.37 Time behavior of the reverse potential, number of open channels and rates of
gating. C = 40 µF/cm$^2$ . . . . . . . . . . . . . . . . . . . . . . . 182
2.38 Time behavior of the reverse potential, number of open channels and rates of
gating. C = 50 µF/cm$^2$ . . . . . . . . . . . . . . . . . . . . . . . 183
3.1 The phases of the cell cycle . . . . . . . . . . . . . . . . . . . . . 197
3.2 Cyclin sub-units are synthesized on ribosomes in the cytoplasm and bind
rapidly and irreversibly to CDK kinases to form active dimers cyclin/CDK.
The cyclin sub-units are degraded periodically by the APC, releasing in-
active CDK monomers. The APC is inactivated by cyclin/CDK and re-
activated by an “activator”. The k’s are the chemical reaction rates, that
for the most part are functions of the dynamics variables. For example,
$k_2 = k_2'[\mathrm{inactiveAPC}] + k_2''[\mathrm{activeAPC}]$, where $k_2'$ and $k_2''$ are the enzymatic
respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
3.3 The sequence of events in the cell cycle can be represented as a negative
feedback loop: the cyclin/CDK dimers (X) turn on the activator (Cdc20),
which indirectly activates Cdh1, which destroys cyclin sub-units. . . . . . 201
3.4 Simulation of cyclin/CDK concentration variation in time from equations
(3.2) - (3.7) with the parameters given in Tab. 3.3. (See [186, 133]) . . . 203
3.5 Simulation of CDH1 and CDC14 concentration variations in time from equations
(3.2) - (3.7) with the parameters given in Tab. 3.3. (See [186, 133]) . . 204
3.6 BioSpi simulation output for the two state Nasmyth model of cell cycle con-
trol. Time evolution of absolute number of proteins involved in the process:
Cdh1, Cdc14 and cyclin/CDK. . . . . . . . . . . . . . . . . . . . 206
3.7 The 4-phase model of lymphocyte recruitment. . . . . . . . . . . . . 208
3.8 Time evolution of bond density in the Dembo adhesion model. . . . . . 212
3.9 Representative trajectory of lymphocyte tethering at a mean velocity v equal
to one half of the hydrodynamic velocity vh, with parameters: γ = 0.001 nm,
$k_{on} = 84\ \mathrm{s}^{-1}$, $k^{0}_{off} = 1\ \mathrm{s}^{-1}$. . . . . . . . . . . . . . . . . . . . . 212
3.10 Representative trajectory of rolling motion of a lymphocyte, with a mean velocity
v < 0.5vh, that experiences durable arrests. . . . . . . . . . . . . 213
3.11 Representative trajectory of lymphocyte for firm adhesion with parameters:
γ = 0.001 nm, $k_{on} = 84\ \mathrm{s}^{-1}$, $k^{0}_{off} = 20\ \mathrm{s}^{-1}$. . . . . . . . . . . . . . . 214
3.12 BioSpi simulation of the 4-phase model of lymphocyte recruitment. . . . . . 222
3.13 Time evolution of the number of bound molecules for three different sets of vessel
diameter values. . . . . . . . . . . . . . . . . . . . . . . . . . 223
3.14 Experimental measurements of the variation of rolling cells percentage at
varying vessel diameter. . . . . . . . . . . . . . . . . . . . . . . 224
3.15 Rolling cells percentage versus vessel diameter in the BioSpi model. . . . . 224
3.16 Pathogenesis of PD induced by mutant α-synuclein: 1. the interaction of a
nascent protein with a chaperone can result in a right-folded protein or in
a misfolded protein; 2. the chaperone attempts to re-fold the faulty protein
and the result can be again a right-folded protein or a misfolded protein; 3.
therefore, the misfolded protein is draped by the ubiquitin transported by
the parkin protein. A mutant variant of the parkin is not able to transport
the ubiquitin on the misfolded protein. The mutant α-synuclein inhibits the
activation of the proteasome by the ubiquitin. The mutant α-synuclein seems
to be proteasome-proof, but the model presented in this paper takes into account
a possible attempt of the proteasome to attack the faulty α-synuclein.
The outcomes of the interactions between the nascent linear protein and the
chaperone, as well as of the interaction between the mutant α-synuclein and
the proteasome are stochastically determined by the reaction probabilities
derived from the kinetic reaction rates according to the Direct Gillespie
algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
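3.17 (A) $r_s = 10\ \mu\mathrm{s}^{-1}$, $N_s = 10$; (B) $r_s = 0.01\ \mu\mathrm{s}^{-1}$, $N_s = 100$;
(C) $r_s = 10\ \mu\mathrm{s}^{-1}$, $N_s = 100$; (D) $r_s = 1.0\ \mu\mathrm{s}^{-1}$, $N_s = 100$;
(E) $r_s = 10\ \mu\mathrm{s}^{-1}$, $N_s = 200$; (F) $r_s = 100.0\ \mu\mathrm{s}^{-1}$, $N_s = 100$;
(G) $r_s = 10\ \mu\mathrm{s}^{-1}$, $N_s = 1000$; (H) $r_s = 1000.0\ \mu\mathrm{s}^{-1}$, $N_s = 100$.
The rates used in these simulations have been taken from [36]. . . . . . . . . . 239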
3.18 Number of non correctly refolded proteins in PD induced by mutant parkin.
The curve of MISFOLDED’ zeros before 5 µs, indicating that the production
of MISFOLDED” starts since the beginning of the simulation and increases
as the square root of the time, without giving to the proteosomal mechanism
of the cell any chance to react. . . . . . . . . . . . . . . . . . . . 240
3.19 Variation of number of chaperones and wrongly refolded proteins in PD in-
duced by mutant α-synuclein. The initial number of chaperones is 10. This
simulation shows that this number is not adequate to defend the cell from
the increasing of faulty proteins. . . . . . . . . . . . . . . . . . . . 240
xv
-
3.20 Variation of number of chaperones and wrongly refolded proteins
in PD induced by mutant α-synuclein. The initial number
of chaperones is 100 in the plot (A) and 1000 in the plot
(C). The plots (B) and (D) are a zoom of the plots (A) and
(C) to better visualize the time behavior of the processes
in the first $0.2\ \mu\mathrm{s}^{-1}$ and $0.005\ \mu\mathrm{s}^{-1}$, respectively. A sufficiently
large number of chaperones seems to give the cell
the possibility of activating the proteasomes and consequently
of decreasing the number of faulty proteins.
4.1 Pictorial view of the dimerization process. The affine binders indicated with
the same polygon are hidden and the other binders are added to the interface
of the new box.
4.2 Scheme of the base model of the ligand-induced endocytosis. The mechanism
is described in the text.
4.3 Welcome page of BioBeta simulator.
4.4 The BioBeta help page.
4.5 The BioBeta form for the insertion of the specification of a bio-process. Here
the specification of the bio-process B1 ::= β(x,Γ)[x(y).Nil|x̄z.Nil] is shown.
4.6 Selected fields for states and function of the bio-process E as indicated in
Table 4.5.
4.7 Message confirming that there are no syntax errors. Therefore, the user can
proceed to insert a new bio-process, go back to modify the specification
of the previous one, run the simulation or quit the simulator.
4.8 In order to run a simulation the user must insert the duration of the simulation
and the names of the bio-processes whose time-behavior has to be recorded
in the output table.
4.9 This page is “used” as a repository of downloadable papers about Beta binders
and their application in modeling biological phenomena.
4.10 Flow diagram of the simulation algorithm.
4.11 Stochastic simulations of the time-course of Na, Cl and the ions Na+ and Cl−.
4.12 Stochastic fluctuations in Michaelis-Menten catalysis. Enzyme and substrate
initial concentration: E0 = S0 = 10 particles. The fluctuations of the curve
of the enzyme totally cover the curve of the substrate.
4.13 Stochastic fluctuations in Michaelis-Menten catalysis. Enzyme and substrate
initial concentration: E0 = S0 = 100 particles. The width of the stochastic
fluctuations is smaller: the curve of the enzyme partially covers the curve of
the substrate.
4.14 Bio-processes of the main components of the system for the control of cell
cycle in eukaryotes.
4.15 Time course of the number of active cyclin/CDK complexes and active APC
complexes. The increase of the number of active cyclin/CDK complexes
corresponds to the decrease of active APC complexes.
4.16 The maxima of the time course of active cyclin/CDK dimers correspond to
the minima of the active CDC14.
4.17 Time course of the number of active cyclin/CDK complexes and inactive APC
complexes.
4.18 The time course of inactive CDC14 shows a stepwise decrease, while the trend
of the time course of active cyclin/CDK dimer shows a decrease during the first
50 min and an increase in the following 50 min.
5.1 Scheme of a mass spectrometer. Adapted from [96].
5.2 Conventional LC. It is most commonly used to purify and isolate some
components of a mixture. A liquid chromatograph separates analyte molecules in
solution by flowing the solution through a column that is packed with
particles 3 to 5 µm in diameter and is between 10 and 30 cm long. The diameter
of the column depends on the application and determines the liquid flow rate.
Preparatory columns are > 10 mm in diameter, analytical columns are
between 4 and 10 mm in diameter, micro-bore columns between 1 and 2 mm in
diameter and capillary columns < 1 mm in diameter.
5.3 Scheme of a LC-MS.
5.4 Experimental equipment of the Bio-organic Chemistry Laboratory of the
University of Trento: ion trap mass spectrometer with electrospray, APCI
(Atmospheric Pressure Chemical Ionization) and nanospray ionization sources
integrated with Hewlett Packard 1100 liquid chromatograph system [163].
5.5 The main components of an NMR instrument. Adapted from [96].
5.6 The experiments in vitro significantly eliminate the context of interactions
where the system component C under investigation was placed in the context
in vivo.
B.1 Scheme of a two-component signal transduction system [94].
B.2 A simplified schema of the reaction mechanism for the KdpD/KdpE two-component
system.
B.3 Schema of the kdpFABCDE regulon [94].
B.4 First variant: the protein inhibits its own translation (e.g. by blocking
the ribosome binding site). Second variant: the protein activates its own
degradation (e.g. by the activation of a protease).
B.5 Simulation of the time behavior of protein (A) and of mRNA (B). Parameters:
A = 1, B = 1. Initial conditions RNA(0) = 0 and prot(0) = 0.
B.6 Simulation of the time behavior of protein (A) and of mRNA (B). Parameters:
A = 1, B = 0. Initial conditions RNA(0) = 0 and prot(0) = 0.
B.7 Simulation of the time behavior of protein (A) and of mRNA (B). Parameters:
A = 0, B = 1. Initial conditions RNA(0) = 0 and prot(0) = 0.
B.8 Simulation of the time behavior of protein (A) and of mRNA (B). Parameters:
A = −1, B = 1. Initial conditions RNA(0) = 1 and prot(0) = 1.
Chapter 1
Introduction
1.1 What is biological modeling
Modeling is an attempt to describe an understanding of the elements of a
system of interest, their states, and their interactions with other elements.
The model should be sufficiently detailed and precise so that it can in prin-
ciple be used to simulate the behavior of the system on a computer. In the
context of molecular cell biology, a model may describe the mechanisms in-
volved in transcription, translation, cell regulation, cellular signaling, DNA
damage and repair processes, the cell cycle or apoptosis. At a higher level,
modeling may be used to describe the functioning of a tissue, organ, or even
an entire organism. At a still higher level, models can be used to describe
the behavior and time evolution of populations of individual organisms. At
the beginning of a modeling project, the first issue to confront is to decide
which features to include in the model and the level of detail the model
is intended to capture. So, for example, a model of an entire organism is
unlikely to describe the detailed functioning of every individual cell, but a
model of a cell is likely to include a variety of very detailed descriptions of
key cellular processes. Even then, however, a model of a cell is unlikely to
contain details of every single gene and protein. In order to show how it is
possible to think about a biological process at different scales and different
levels of detail, let us consider the photosynthesis process. It can be
summarized by a single chemical reaction combining water with carbon dioxide
to give glucose and oxygen. The reaction is driven by sunlight. This
could be written as
\[ \text{Water} + \text{Carbon dioxide} \xrightarrow{\ \text{sunlight}\ } \text{Glucose} + \text{Oxygen} \]
This single reaction is a summary of the overall effect of the process.
Although photosynthesis consists of many reactions, the above equation
is not really wrong. It represents the process globally, at a
higher level than the more detailed description that biologists often prefer
to work with. Whether a single overall equation or a full breakdown
into component reactions is necessary depends on whether the intermediate
reagents are of interest to the modeler. In general, we can state that
the “art” of building a good model consists in capturing the
essential features of the biology without burdening the model with
non-essential details. However, precisely because details are omitted, every
model is to some extent a simplification of the biology. Nevertheless, models
are valuable because they take ideas that might have been expressed
verbally or diagrammatically, and make them more explicit, so that they
can begin to be understood in a quantitative rather than purely qualitative
way.
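As a small illustration of the two levels of description discussed above, the word equation can be refined one step by writing the overall reaction with its stoichiometry. The coefficients below are the standard textbook values for the overall reaction and are added here for illustration; they are not taken from the thesis. A finer-grained model would go further and split this reaction into its component steps.

```latex
\[
6\,\mathrm{CO_2} + 6\,\mathrm{H_2O}
  \xrightarrow{\ \text{sunlight}\ }
  \mathrm{C_6H_{12}O_6} + 6\,\mathrm{O_2}
\]
```

Both forms describe the same process; the choice between them depends only on which species the modeler needs to track.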
The features of a model depend very much on the aims of the modeling.
Modeling and simulation appeared on the scientific horizon long before
the emergence of molecular and cellular biology. Their origin is
in the physical sciences and engineering. In the physical sciences, besides
theoretical and experimental studies, modeling and simulation are considered
the third indispensable approach because not all hypotheses are
amenable to confirmation or rejection by experimental observations. In
biology, researchers are facing the same or maybe an even worse situation. On
one hand, experimental studies are unable to produce a sufficient amount
of data to support theoretical interpretations; on the other hand, due to
data insufficiency, theoretical research cannot provide substantial guidance
and insights for experimentation. Therefore computational modeling
takes a more important role in biology by integrating experimental data,
facilitating theoretical hypotheses, and addressing "what if" questions.
Another important aim of modeling is to make clear the current state
of knowledge regarding a particular system, by attempting to be precise about
the elements involved and the interactions between them. Doing this can
be an effective way to highlight gaps in understanding. Our understanding
of the experimental observations of any system can be measured by
the extent to which a simulation we create mimics the real behavior of
that system. The behaviors of computer-executable models are first compared
with experimental values. If inconsistency is found at this stage, it
means that the assumptions that represent our knowledge of the system
are at best incomplete, or that the interpretation of the experimental data
is wrong. Models that survive this initial validation can then be used to
make predictions to be tested by experiments, as well as to explore
configurations of the system that are not easy to investigate by in vitro or in
vivo experiments. The creation of predictive models can give opportunities for
unprecedented control over the system. In contrast to physics, biology still
lacks the fundamental laws on which it is based. Modeling can provide
valuable insights into the workings and general principles of organization
of biological systems.
Modeling, simulation, and analysis of the simulation outcomes are there-
fore perfectly positioned for integration into the experimental cycle of cell
biology (Fig. 1.1). Although we will always need real experiments to
advance our understanding of biological processes, conducting in silico,
or computer-simulated, experiments can help guide the wet-lab process by
narrowing the experimental search space.
[Figure 1.1: The cell biology research cycle. Diagram labels: Qualitative
modeling; Quantitative modeling; Cellular data; Experiments; Cell
programming; Analysis and interpretation.]
1.2 System Biology
More than fifty years ago, Watson and Crick [192] identified the structure
of DNA, thus paving the way for molecular biology and genetics.
Grounding biological phenomena on a molecular basis made it possible
to describe the different aspects of biology, such as heredity, diseases and
development, as the result of the coherent interactions between sets of
elements that are either functionally different or, most often, multifunctional.
It also made it possible
to include biology in a consistent framework of knowledge based on the
fundamental laws of physics. Since then, the field of molecular biology has
emerged and enormous progress has been made. Molecular biology enables
us to understand biological systems as molecular machines. Large
numbers of genes and the functions of transcriptional products have been
identified. DNA sequences have been fully identified for various organisms
such as mycoplasma, Escherichia coli (E. coli), Caenorhabditis elegans (C.
elegans), Drosophila melanogaster, and Homo sapiens. The measurement of
protein levels and their interactions is also making progress [77, 167]. In
parallel with such efforts, new methods have been invented to disrupt the
transcription of genes, such as loss-of-function knockout of specific genes
and RNA interference, which is particularly effective for C. elegans and is
now being applied to other species. Nevertheless, such knowledge is not
sufficient to provide us with a complete understanding of biological systems as
systems [89]. Cells, tissues and organs, and organisms as well as ecological
webs are systems of components whose specific interactions have been
defined by evolution; so a system-level understanding should be the prime
goal of biology.
System-level understanding requires a set of principles and methodologies
that link the behaviors of molecules to system characteristics and
function. These principles and methodologies should be developed in the
following four areas of investigation.
1. System structures. These include the network of gene interactions
and biochemical pathways, as well as the mechanisms by which such
interactions modulate the physical properties of intracellular and mul-
ticellular structures.
2. System dynamics¹. How a system behaves over time under various
conditions can be understood through metabolic analysis, sensitivity
analysis, and dynamic analysis methods such as phase portrait and
bifurcation analysis. Specifically, system behavior analysis aims
at addressing the following questions: how does a system respond to
changes in the environment? How does it maintain robustness against
potential damage, such as DNA damage and mutations? How do specific
interaction pathways exhibit the observed functions? Understanding the
behaviors of complex biological networks is not a trivial task.
Computer simulation and a set of theoretical analyses are essential to
provide an in-depth understanding of the mechanisms behind the pathways.
3. Control methods. Identifying the mechanisms that systematically
control the state of the cell is necessary for two reasons: 1.
understanding them makes it possible to modulate them so as to minimize
malfunctions, and 2. they involve potential therapeutic targets for
the treatment of diseases.
4. The design method. Strategies to modify and construct biological
systems with desired properties can be developed on the basis of definite
design principles and simulations, instead of blind trial and error.
¹ With the term “dynamics”, we simply mean “time-evolution”. In this book the term is not used with
the meaning it has in mechanics, where it is different from “kinetics” or “kinematics” and is concerned
with the effects of forces on the motion of a particle or system of particles, especially of forces that do not
originate within the system itself. In chemistry, by contrast, “dynamics” is synonymous with “kinetics”,
which is concerned with the rates of change in the concentration of reactants in a chemical reaction, and
thus with the time-behavior of the system.
Progress in each of the above areas requires breakthroughs in our
understanding not only of molecular biology, but also of measurement
technologies and computational sciences. Although advances in accurate,
quantitative experimental approaches will doubtless continue, insights into the
functioning of biological systems will not result from purely intuitive
assaults. The reason lies in the intrinsic complexity of biological
systems, a problem that a combination of experimental and computational
simulation approaches is expected to solve.
At present, the identification of gene-regulatory logic and biochemical
networks is a major goal. Nowadays, biological modeling aims at
uncovering mechanisms at the fine-grained level that are internally consistent
with molecule-level biological programs and at reproducing observed phe-
nomena. Since it is hard to continuously and systematically monitor the
parallel activities in molecular networks, molecule-level modeling [68, 131]
has become an indispensable tool to bridge experimental and theoretical
studies and to link system behaviors with molecular reactions.
Due to the distinctive differences between biological and physical systems,
modeling a network of interacting molecules comes with additional
challenges and calls for new strategies and tools. The early objective of
modeling was to explore the features of complex biological systems treated
as black boxes. In such a scenario, the goal was to understand and predict
the behavior of a system without knowing the microscopic details. The
strategy was to reproduce observed phenomena at a high level with a
simplified description of the internal structures. Two methodological features
emerged at this stage. First, since biological systems were approximated
as structure-less entities, many methods and tools were directly borrowed
from engineering fields, such as the Finite Element Method² and the Boundary
Element Method [7]. The second was a high-level abstraction
based on the inverse approach to modeling. As a consequence, numerical
techniques for the solution of ordinary differential equations (ODEs) and
partial differential equations (PDEs) were applied. Both the black box
assumption and inverse modeling, though suitable for modeling mechanical
systems, suffer from major problems when applied to biological systems. The
black box conjecture assumes that the internal structure of the system is
static, and thus it cannot hold when the system evolves in time as, for
instance, in growth processes. Complex internal structure and evolution are
key features that differentiate biological systems from mechanical systems.
Inverse modeling suffers from a loss of generality, and many inverse problems
are mathematically ill-posed. Even if adequate and precise data are
available, a unique solution is not always guaranteed, and special techniques
tailored to the problem at hand are employed [79].
² Many references about the Finite Element Method can be found at
http://www.solid.ikp.liu.se/fe/tit.html
The dynamic context in which genes operate is much more complex than
the static composition of genes and genomes. Though sequence alignment
can help us find homologues, the exact functions of genes still need to
be confirmed experimentally. For example, during embryonic development,
different ectopic [41] and failed gene expression events can lead to different
phenotypes. The problem is tackled by creating various knock-ins
and knock-outs. The semantics of the genetic program cannot be
modeled by using the black box conjecture. More generally, however, how
the interaction among molecules produces the complexity of a biological
system has no clear answer. Knowledge of biological complexity can lead
to the design of better or more efficient systems, and to an understanding of
pharmacological effects for drug discovery. For these reasons too,
a set of simulations, each obtained by perturbing the
parameters of an original model, is helpful in understanding the dynamic
context in which genes, gene products and molecules operate.
A perturbed system is one whose behavior is forced out
of its ’normal’ state by disturbances coming, for example, from external
influences. This definition applies both to systems in theoretical physics and
to biological systems, and in both cases perturbation offers a means to study
and understand a system. Furthermore, applying perturbation theory to biology
may eventually allow the prediction and treatment of pathological perturbations
(diseases) such as those that exist in the clinical setting.
Perturbation analysis studies the behavior of systems forced out of their
normal state. It is often the case that the behavior of a system under such
perturbations is much more amenable to theoretical analysis than the gen-
eral (i.e., normal) behavior of the system. The main reason is that, math-
ematically, the behavior of a system close to its ’normal’ state can often be
described by linear equations, whose theory is very well developed. In
addition, beyond linear perturbation theory, there is a well-developed theory
for describing the behavior of systems as one moves away from a reference
state. This theory aims, for instance, to predict under what perturba-
tions a system will return to its reference state and which perturbations
will destabilize the system. This way of thinking also applies to biology.
The perturbation of a biological system by means of genetic mutation or
small molecules (chemical genetics) greatly aids the understanding of the
fundamental principles underlying such a system or process. Through ge-
netic dissection biologists learned that basic cellular processes such as cell
growth and cell division (as well as developmental processes depending on
the interaction of groups of cells and tissues) have been highly conserved
throughout evolution. Therefore, perturbations by small molecules or by
targeted or random mutations in individual genes in simple model organ-
isms such as yeast, Drosophila, C. elegans, Arabidopsis, and the mouse
have provided, and will in the future provide important insight into the
function of complex systems. Perturbation theory can also be applied bi-
ologically in a more controlled, reiterative manner. One can imagine tak-
ing some biological system of interest, defining its normal behavior, and
then investigating in a general and methodological way which perturba-
tions destabilize the system (in the sense that it will no longer return to
its normal state). Examples could be regulatory systems of various kinds,
such as those that keep the concentrations of different metabolites within
the cell at fixed levels and restore these levels after a perturbation. One
would then aim to identify what kind of perturbations would destabilize
these regulatory systems. One would go back and forth between perform-
ing perturbation experiments to see how the system behaves in response
to various perturbations, and building theoretical and computational mod-
els. One would start with ’small’ perturbations that can be described with
linear models, and would use those to predict, and subsequently test, the
behavior in response to larger perturbations.
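The perturbation procedure described above can be sketched on a toy system. The following is a minimal illustration, not a model from this thesis: the one-variable dynamics dx/dt = x(1 - x)(x - A), the threshold A = 0.3 and all function names are hypothetical choices. The 'normal' state is x = 0; linearizing around it gives d(delta)/dt = -A delta, which describes small perturbations well, while a sufficiently large perturbation pushes the system past the threshold so that it no longer returns.

```python
import math

A = 0.3  # hypothetical threshold separating the two basins of attraction


def rate(x):
    """Toy nonlinear dynamics dx/dt = x (1 - x) (x - A); stable states at 0 and 1."""
    return x * (1.0 - x) * (x - A)


def simulate(x0, t_end=60.0, dt=0.01):
    """Integrate dx/dt = rate(x) from x0 with the classical 4th-order Runge-Kutta scheme."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        k1 = rate(x)
        k2 = rate(x + 0.5 * dt * k1)
        k3 = rate(x + 0.5 * dt * k2)
        k4 = rate(x + dt * k3)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


def linear_prediction(delta0, t):
    """Linearization around the normal state x = 0: delta(t) = delta0 * exp(-A t)."""
    return delta0 * math.exp(-A * t)


if __name__ == "__main__":
    # Small perturbation: the linear model is a good approximation.
    print("nonlinear:", simulate(0.01, t_end=5.0))
    print("linear:   ", linear_prediction(0.01, 5.0))
    # Perturbations below and above the threshold A.
    for delta0 in (0.2, 0.4):
        print("perturbation", delta0, "-> final state", simulate(delta0))
```

In this sketch a perturbation of 0.2 decays back to the normal state, whereas a perturbation of 0.4 drives the system to the other stable state, which is the kind of destabilizing perturbation the text refers to.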
1.2.1 What future for System Biology
Biologists are getting enthusiastic about mathematical modeling, as modelers
are getting excited by biology. The complexity of molecular and cellular
biological systems makes it necessary to consider dynamic systems theory
for modeling and simulation of intra- and inter-cellular processes. To
describe a system as “complex” has become a common way to either motivate
new approaches or to describe the difficulties in making progress.
Currently, before we can fully explain and understand the functioning and the
functions of cells, organs or organisms from the molecular level upwards,
the major difficulties to overcome are technological and methodological.
Nevertheless, whatever time is required, the complexity of these systems
ensures that there is no way around mathematical modeling in this
endeavor. A mathematical pathway model does not represent an objective
reality outside the modeler’s mind. The model is no more, and no less, than a
complement of the biologist’s reasoning. Mathematics is the handicraft of the
natural sciences.
The risk in this exciting endeavor is that the following thoughts from
the beginnings of System Biology will remain true in the years to come:
“In spite of the considerable interest and efforts, the application of systems
theory has not quite lived up to expectations. One of the main reasons for
the existing lag is that systems theory has not been directly concerned with
some of the problems of vital importance in biology.”[121]
The challenge is for both the theoreticians and experimentalists to change
their ways:
“The real advance in the application of systems theory to biology will come
about only when the biologists start asking questions which are based on
the system-theoretic concepts rather than using these concepts to repre-
sent in still another way the phenomena which are already explained in
terms of biophysical or biochemical principles. Then we will not have ’the
application of engineering principles to biological problems’ but rather a
field of System Biology with its own identity and in its own right.”[121].
System biology will have succeeded when it is widely accepted that there is
nothing more practical than a good theory [194].
It is now necessary to clarify what complexity means in the context of
system biology. A complete definition of complexity should be given with
respect to
• the model: the large number of variables that can determine the behavior
• the natural system: the connectivity and non-linearity of relationships
• the technology: the limited precision and accuracy of measurements
• the methodology: the uncertainty arising from the conceptual framework
chosen (e.g. the choice of automata instead of differential
equations).
However, in the next section we will focus on the natural system and on the
methodology to model and simulate it, which are the central issues of
this thesis. We will try to explore the relationships between
the inherent characteristics of a biological system and the mathematical
framework and formalism best suited to describe it.
1.3 Complexity of a biological system
It is often said that biological systems, such as cells, are complex systems,
and that the grand challenge of the 21st century is to understand and model the
complexity of biological systems. Though complexity has been extensively
discussed at different levels [112, 196, 201, 168], there is no operational
definition for biological systems [4]. The common notion of complex systems
is of very large numbers of simple and identical elements interacting to
produce ’complex’ behaviors. However, the reality in biology is somewhat
different. In biological systems, large numbers of functionally different, and
often multifunctional, sets of elements interact selectively and non-linearly
to produce coherent rather than complex behaviors. A biological system
is not equal to the sum of its parts [136]. Nor is it a system in which
functions emerge from the properties of the network rather than from any
specific element: in biological systems, functions rely on a combination of
the network and the specific elements involved [90]. A typical example is
the p53 interaction pathway. This protein, known as ’the guardian’ of the
genome, acts as a tumor suppressor. It is activated, inhibited and degraded
by reactions such as phosphorylation, dephosphorylation, and proteolytic
degradation, while its targets are selected by the different modification
patterns that exist; these are properties that reflect the complexity of the
element itself. Considering this example, Kitano [90] highlighted that
biological systems are better characterized as symbiotic systems.
Besides the inherent complexity, some hallmarks of complexity, such as
linearity and non-linearity, number of parameters, order of equations and
evolution of the network, come out only when a system is formalized in
specific ways (see Appendix B for a linear formalization of a two-component
signal transduction model). Moreover, we can distinguish two types of
complexity, both encountered in modeling biological systems: functional
and structural, or dynamic and static. Defining and identifying
complexity in biological systems operationally is not the only hard
task: measuring it quantitatively is also a major challenge for experimental
biologists. The popular measure of complexity for dynamical systems is the
computational complexity. For instance, the complexity of a sequence can
be inferred from which finite state machine can produce it. Although this
measure characterizes the amount of information necessary to predict the
future state of the machine, it fails to address its meaning in the world of
molecular and modular cell biology [4].
Since the topological structure of a molecular network undergoes significant
evolution within cells during biological development, measuring both
static and dynamic complexity with respect to such evolution may be a
practical approach, because it is easier to identify and abstract information
from it [18, 91]. Furthermore, features of the topological structure, such as
the existence of organized biological compartments, are also helpful in
identifying the modularity of molecular interactions. We will return to this
point in Section 1.5.
Finally, there are two other important indicators of complexity in biological
systems. The first is non-linearity, including sensitivity to parameters and
to initial values. The second, on which we will focus in this thesis,
is the existence of stochasticity. Noise increases the complexity of the
systems even further by introducing issues of robustness, noise resonance
and bi-modal behavior.
1.4 Stochastic modeling approach
An important aspect of modeling of biological networks is the handling of
stochastic or random events that occur inside a cell. A more detailed and
formal discussion of this issue will have to be deferred until much later
in this thesis, once the appropriate concepts and terminology have been
established. In the meantime we highlight the issue citing some examples
that illustrate the importance of stochastic modeling both for simulation
and inference.
Arguments for the application of stochastic models for chemical and
bio-chemical reactions come from at least three directions, since the model
1. takes into account
• the discrete character of the quantity of the components
• the inherently random character of the phenomena
2. is in accordance with the theories of
• thermodynamics
• stochastic processes
3. is appropriate to describe
• small systems
• instability phenomena
Many studies have reported the occurrence of stochastic fluctuations and
noise in living systems. Observations of gene expression in individual cells
clearly illustrate the stochastic nature of transcription [1, 117]. Other
studies of eukaryotic gene expression show that messenger RNA (mRNA)
production is quantal [75] and occurs in random pulses [162, 191].
It has been proposed that proteins are produced in short 'bursts' at
random time intervals rather than in a continuous manner [25]. Further
clear evidence of the stochasticity of biological phenomena at the
molecular level is the existence of qualitatively and quantitatively
different outcomes in the temporal behavior of a system starting from the
same initial conditions. A classic example is the lysis/lysogeny switch of
E. coli infected by bacteriophage λ. Due to noise, the network may randomly
evolve into one of its two stable regimes [70, 69]. A role for noise has
also been observed in bacterial chemotaxis [107] and cellular selection
[182]. At the level of the cell population, the most important implication
of noise in critical cellular processes is that, in spite of identical
initial conditions, different cells may evolve over time along distinct
pathways. Population measurements typically show that the level of
expression of the same gene varies significantly across cells with the same
genetic material. The origin of such variability within isogenic
populations is largely attributed to stochastic phenomena [67].
At the microscopic level of cellular processes, the interactions between
molecules (DNA, mRNA, proteins, small molecules) follow the laws of
statistical physics. A fundamental result of this branch of physics is the
$\sqrt{n}$ law [165], which states that the relative size of the
fluctuations in a system is inversely proportional to the square root of
the number $n$ of particles present in the system. This number can be
considered as an index of the system size. As a result, low numbers of
particles or low concentrations lead to large fluctuations, whose origin
is largely thermal. Biochemical species participating in processes such as
gene transcription, regulation and signaling often occur in low copy
numbers: for example, a single DNA template with a small number of
promoter sites, a few tens of mRNA molecules, and transcription factors
numbering around a few hundred. Consequently, elementary reactions, such
as polymerase binding or complex formation, take place with widely
distributed reaction times. Such stochastic effects, arising from the
inherent nature of biochemical interactions, are often termed intrinsic
noise. As the concentrations of the reacting species increase, the
stochasticity becomes less prominent and the behavior of the system tends
to the deterministic solution. We illustrate this fact through the
stochastic simulation of Michaelis-Menten enzyme catalysis, whose
mechanism is given by the following set of reactions (the formalism of
chemical notation will be presented in detail in Chapter 2):
\[
\begin{aligned}
E + S &\xrightarrow{k_{+1}} ES \\
ES &\xrightarrow{k_{-1}} E + S \\
ES &\xrightarrow{k_{2}} E + P
\end{aligned}
\]
where E is the enzyme, S is its substrate, and P is the product. The
reaction rates are $k_{+1} = 1.0\ \mathrm{M^{-1}s^{-1}}$,
$k_{-1} = 0.1\ \mathrm{s^{-1}}$, and $k_{2} = 0.01\ \mathrm{s^{-1}}$.
Figs. 1.2 (A) and (B) show the results of the stochastic simulation with
the initial numbers of enzyme and substrate molecules both equal to 10,
while Figs. 1.2 (C) and (D) show the results for 10000 particles. The plots
cover a simulated time of 400 s. As the number of particles increases, the
curves of the time evolution of the reactants and products become less
noisy.
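The $\sqrt{n}$ scaling invoked above can be checked numerically. The following is a minimal sketch under a toy assumption that is not taken from the thesis: each of n molecules is independently in one of two states with probability 1/2, so that the count of molecules in one state is binomial and its relative fluctuation (standard deviation divided by mean) equals $1/\sqrt{n}$. The function name `relative_fluctuation` and its parameters are hypothetical.

```python
import math
import random
import statistics


def relative_fluctuation(n, trials=1000, p=0.5, rng=None):
    """Estimate std/mean of the number of 'active' molecules out of n.

    Toy model (an assumption for illustration): every molecule is
    independently active with probability p.
    """
    rng = rng or random.Random(0)
    counts = [
        sum(1 for _ in range(n) if rng.random() < p)
        for _ in range(trials)
    ]
    return statistics.pstdev(counts) / statistics.fmean(counts)


if __name__ == "__main__":
    # Compare the sampled relative fluctuation with 1/sqrt(n).
    for n in (10, 100, 1000):
        print(n, relative_fluctuation(n), 1.0 / math.sqrt(n))
```

With p = 1/2 the estimate should shrink roughly tenfold for every hundredfold increase in n, which is the qualitative behavior described for the simulated curves.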
In a network of molecular interactions there exists an extrinsic com-
ponent of noise too. The extrinsic component of randomness is due to
the external environmental conditions. For example, a transcription factor
for a given gene is often the protein product of another gene and thus its
production is also random. Such situations, where a protein product of
a stochastic triggering of a gene leads to the switching of another gene,
are characterized by a cascade of stochastic events. The timings of such
triggers can result in different outcomes [116].
We have to make clear that the formulation of the theory of stochastic
kinetics does not reduce the importance of deterministic kinetics, because
there exists a class of phenomena for which the stochastic model is only
slightly “better” than the deterministic approach, while the mathematics
of the stochastic model is much more complicated. Descriptions based on
ordinary differential equations (ODEs) have been used in many quantitative
models. The general form of an ODE model can be written as
Figure 1.2: (A) Kinetics of the enzyme E and the enzyme-substrate complex
ES as a function of time (in seconds). (B) Kinetics of the product P and
the substrate S as a function of time (in seconds). Simulations (A) and (B)
have been performed with an initial number of enzyme and substrate
particles E0 = 10 and P0 = 10, respectively. (C) Kinetics of the enzyme E
and the enzyme-substrate complex ES as a function of time (in seconds).
(D) Kinetics of the product P and the substrate S as a function of time
(in seconds). Simulations (C) and (D) have been performed with an initial
number of enzyme and substrate particles E0 = 1000 and P0 = 1000,
respectively. The simulations have been obtained with the implementation
of the Gillespie Direct Method [49] in the Dizzy simulator [155].
\[
\frac{d[X_i]}{dt} = f_i(x) \tag{1.1}
\]
where $i = 1, 2, \ldots, N$ and $[X_i(t)]$ is a continuous single-valued
function describing the time behavior of the concentration of the i-th
species. The specific forms of the functions $f_i$, which are usually
nonlinear in the $[X_i]$'s, are determined by the structures and rate
constants of the chemical reactions of the system. The equations (1.1) are
called reaction rate equations; solving them for the functions
$[X_1(t)], \ldots, [X_N(t)]$, subject to the prescribed initial
conditions, is tantamount to determining the time evolution of the amount
of each species.

Figure 1.3: Deterministic simulations of the kinetics of E and ES (A) and
of P and S (B) as a function of time (in seconds). The initial
concentrations of enzyme and product have been set to E0 = 1000 and
P0 = 1000, respectively.
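The set of ODEs governing the deterministic dynamics of the
Michaelis-Menten kinetics, written with the rate constants of the reaction
scheme given above, is
\[
\begin{aligned}
\frac{d[S]}{dt} &= k_{-1}[ES] - k_{+1}[S][E] \\
\frac{d[E]}{dt} &= (k_{-1} + k_{2})[ES] - k_{+1}[S][E] \\
\frac{d[ES]}{dt} &= k_{+1}[S][E] - (k_{-1} + k_{2})[ES] \\
\frac{d[P]}{dt} &= k_{2}[ES]
\end{aligned}
\]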
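As an illustration, the Michaelis-Menten rate equations can be integrated numerically. The sketch below is not the method used for Figure 1.3; it assumes a hand-written fourth-order Runge-Kutta step, uses the numerical values 1.0, 0.1 and 0.01 quoted earlier for the binding, unbinding and catalytic rate constants (named `k_on`, `k_off` and `k_cat` here, which are hypothetical names), and starts from small illustrative concentrations so that a fixed step of 0.01 s stays stable.

```python
def mm_rhs(state, k_on=1.0, k_off=0.1, k_cat=0.01):
    """Right-hand side of the Michaelis-Menten rate equations.

    state = ([S], [E], [ES], [P]); rate constants are the values
    quoted in the text, under assumed names.
    """
    s, e, es, p = state
    bind = k_on * s * e
    return (
        k_off * es - bind,            # d[S]/dt
        (k_off + k_cat) * es - bind,  # d[E]/dt
        bind - (k_off + k_cat) * es,  # d[ES]/dt
        k_cat * es,                   # d[P]/dt
    )


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, k1)))
    k3 = f(tuple(x + 0.5 * dt * k for x, k in zip(state, k2)))
    k4 = f(tuple(x + dt * k for x, k in zip(state, k3)))
    return tuple(
        x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for x, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def integrate(state, t_end=400.0, dt=0.01):
    """Integrate from t = 0 to t_end with a fixed step dt."""
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(mm_rhs, state, dt)
    return state


if __name__ == "__main__":
    # Illustrative initial condition (an assumption): [S] = [E] = 10.
    print(integrate((10.0, 10.0, 0.0, 0.0)))
```

Two conserved quantities provide a quick sanity check on any such integration: [E] + [ES] stays equal to the initial enzyme concentration, and [S] + [ES] + [P] stays equal to the initial substrate concentration.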
Several platforms for ODE-based modeling have been developed. Among
them, the best known are Gepasi [119] and E-CELL [183], which share
a number of features, e.g. for the simulation of chemical reactions.
Tools of mathematical analysis, such as metabolic control analysis,
linear stability analysis of steady states, and parameter fitting, have
also been implemented. We refer the reader to Appendix 5.4 for a review
of the currently available simulators based on the differential equation
formalism. However, although metabolic reactions can be simulated by these
tools, signaling activities may not be well supported [54]. Furthermore,
signaling networks are not static and undergo evolution [18, 200]. Thus,
modeling context-dependent cellular processes merits a different
approach. A typical example is Presenilin, a protein responsible for
cleaving the Notch/Delta complex. It can selectively cleave a large group
of membrane proteins in different contexts [98, 171]. Describing its
behavior with ODEs is therefore infeasible, because
• the biochemical equations would be very complex
• with the addition of a new gene or protein to the model, many equations
must be re-written, an arduous task that greatly slows the modeling
process itself.
Another example of a gene with a complex function is the Notch gene
itself, which takes part in intercellular communication. The semantics or
function of its interaction with other proteins depends on its partners
and on the timing of the interaction [154, 83]. In addition, in any
practical model it is extremely difficult to obtain complete quantitative
data on gene and protein activity, such as the rates of transcription,
translation and protein degradation. Thus, only small or medium sized
models have been reported.
This brief introduction to reaction rate equations allows us to
understand more deeply the meaning of the expression “intrinsically
stochastic”, which in this section we have used to define the character of
a biological phenomenon at the molecular scale. Although the great
importance and usefulness of the differential reaction rate equations
approach to chemical kinetics cannot be denied, we should not lose sight
of the fact that the physical basis of this approach is only approximate.
This approach assumes that the time evolution of a chemically reacting
system is both continuous and deterministic. However, since the molecular
population levels can change only by discrete integer amounts, the time
evolution of a chemically reacting system is not a continuous process. The
time evolution is not a deterministic process either. Even ignoring
quantum mechanical effects and regarding the molecular motions as governed
by the equations of classical mechanics, it is impossible even in
principle to predict the dynamics of the system unless we have complete
knowledge of its state. Knowledge of the state of the system includes the
position, the orientation, and the momentum of every single molecule under
consideration, together with a complete knowledge of the chemistry of the
interacting molecules. If we leave out such details of the state of the
system in favor of a higher level view, the dynamics of the system is not
deterministic but intrinsically stochastic. In other words, although the
temporal behavior of a chemically reacting system of classical molecules
is deterministic in the full position-momentum phase space, it is
stochastic in the N-dimensional subspace of the molecular population
levels, and not deterministic as Eqs. (1.1) would imply.
To conclude this section, we point out some of the roles played by
stochasticity in biological phenomena. Since noise can be a nuisance,
living systems have developed noise-suppressing mechanisms; an example
is genetic redundancy [134]. According to the theory of feedback loop
control, noise can also act as a stabilizer and as a driver of molecular
motors. Moreover, noise is responsible for the phenomenon of stochastic
resonance, in which noise enhances the detection of weak signals and helps
improve biological information processing [71].
Noise is also involved in so-called stochastic focusing, in which cells
exploit it to reduce the random variation in regulated processes by tuning
a mechanism to a threshold [139]. Finally, stochasticity plays a crucial
role in differentiation by establishing initial asymmetries that lead
different parts of a system to different evolutionary categories. Examples
of a role of noise in differentiation can be found in many processes
regarding the immune system, such as the clonal amplification of cells
expressing an antigen, but also in many processes driving the rhythm of
biological oscillators, such as those involved in the circadian rhythm
mechanism.
1.4.1 Stochastic simulation algorithms
Models with a small number of molecules can realistically be simulated
stochastically, that is, allowing the results to contain an element of
probability, unlike a deterministic solution. Stochastic simulation
algorithms provide a practical method for simulating reactions that are
stochastic in nature. Different approaches to modeling the stochastic
character of biological phenomena use different mathematical formalisms
and simulation techniques. Although we will treat this topic in detail in
the next chapter, here we give a brief preview that allows the reader to
understand the solution proposed in this thesis for both the specification
and the simulation of stochastic biological systems. The most widely used
stochastic models are the “continuous time - discrete state space -
stochastic” (CDS) models, in which the intrinsic noise is described by the
Chemical Master Equation. As we will see in great detail in Chapter 2, the
Chemical Master Equation cannot be solved analytically for most practical
problems. Gillespie proposed two exact stochastic simulation algorithms
that generate trajectories of the process described by the Chemical Master
Equation, based on the assumption that the system is homogeneous and well
mixed. These algorithms simulate one reaction at a time, based on the
propensity function of each reaction. Multiplied by the length of an
infinitesimal time interval, this function gives the probability that the
reaction occurs in that interval.
At each time step, the chemical system is in exactly one state. The
idea is to simulate the time evolution of the system directly. Basically,
the algorithm determines the nature and the time of occurrence of the next
reaction, given that the system is in state X at time t. Given a system
with a total number of reaction channels M and a total number of species
N, we define the following notation. The state of the system X is defined
by the state vector whose components are the numbers of molecules of each
chemical species involved, thus $X = (X_1, X_2, \ldots, X_N)$.
• $P(\tau, \mu)\,d\tau$ = probability that, given the state at time t, the
next reaction in volume V will occur in the infinitesimal time interval
$(t + \tau, t + \tau + d\tau)$, and will be an $R_\mu$ reaction
• $c_\mu$ = stochastic rate constant for reaction $\mu$. As we will prove
in Chapter 2, it can be derived from the deterministic rate constant k.
• $h_\mu$ = number of distinct $R_\mu$ molecular reactant combinations
available in the state X
• $a_\mu = h_\mu c_\mu$ = propensity function of reaction $\mu$
• $a_\mu\,dt = h_\mu c_\mu\,dt$ = probability that an $R_\mu$ reaction
will occur in volume V in $(t, t + dt)$, given that the system is in
state X at time t.
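The quantities defined above can be made concrete on the Michaelis-Menten scheme. The following is a minimal sketch, assuming that the number of reactant combinations is a product of binomial coefficients over the reactant species, and treating the numerical values 1.0, 0.1 and 0.01 directly as stochastic rate constants (an assumption that ignores the volume-dependent conversion deferred to Chapter 2). The names `combinations_available` and `propensity` are hypothetical.

```python
from math import comb


def combinations_available(reactants, state):
    """Number of distinct reactant combinations in `state`.

    `reactants` maps a species name to the number of molecules of that
    species consumed by the reaction.
    """
    h = 1
    for species, needed in reactants.items():
        h *= comb(state[species], needed)
    return h


def propensity(c, reactants, state):
    """Propensity: stochastic rate constant times combination count."""
    return c * combinations_available(reactants, state)


# Michaelis-Menten reactions as (assumed stochastic constant, reactants).
reactions = [
    (1.0, {"E": 1, "S": 1}),   # E + S -> ES
    (0.1, {"ES": 1}),          # ES -> E + S
    (0.01, {"ES": 1}),         # ES -> E + P
]

state = {"E": 10, "S": 10, "ES": 0, "P": 0}
propensities = [propensity(c, r, state) for c, r in reactions]
```

In the initial state only the binding reaction can fire, so only its propensity is non-zero; a reaction that consumes two molecules of the same species would instead use the count of unordered pairs.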
The algorithm, known as the Direct Method, can be summarized as follows:
1. Initialize the system at t = 0 with stochastic rate constants
$c_1, c_2, \ldots, c_M$ and the initial numbers of molecules of each
species $x_1, x_2, \ldots, x_N$.
2. For each $i = 1, 2, \ldots, M$, calculate the propensity
$a_i(X_{current})$, based on the current state $X_{current}$.
3. Calculate $a_0 = \sum_{i=1}^{M} a_i(X_{current})$, the combined
reaction hazard.
4. Simulate the time to the next event, $t'$, as an $\mathrm{Exp}(a_0)$
random quantity.
5. Put $t \leftarrow t + t'$.
6. Simulate the reaction index, $\mu$, as a discrete random quantity with
probabilities $a_i(X_{current})/a_0$, $i = 1, 2, \ldots, M$.
7. Update the state X of the system according to the reaction $\mu$, that
is, put $X_{current} \leftarrow X_{current} + S^{(\mu)}$, where
$S^{(\mu)}$ denotes the $\mu$-th column of the stoichiometry matrix S.
8. Return to step 2 until the desired simulation time is reached.
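The steps of the Direct Method can be sketched in a few lines for the Michaelis-Menten scheme. This is an illustrative implementation, not the Dizzy implementation used for the figures; it assumes that the values 1.0, 0.1 and 0.01 are used directly as stochastic rate constants, and the names `STOICH`, `propensities` and `direct_method` are hypothetical. Sampling the waiting time from an exponential distribution with rate $a_0$ and the reaction index with probabilities $a_i/a_0$ is equivalent to sampling the pair from the density $a_\mu e^{-a_0 \tau}$.

```python
import random

# Change in (E, S, ES, P) caused by each reaction (stoichiometry columns).
STOICH = [
    (-1, -1, +1, 0),   # E + S -> ES
    (+1, +1, -1, 0),   # ES -> E + S
    (+1, 0, -1, +1),   # ES -> E + P
]


def propensities(x, c):
    """Propensities of the three reactions in state x = (E, S, ES, P)."""
    e, s, es, _ = x
    return [c[0] * e * s, c[1] * es, c[2] * es]


def direct_method(x0, c, t_end, rng):
    """Simulate one trajectory; returns a list of (time, state) pairs."""
    t, x = 0.0, tuple(x0)
    trajectory = [(t, x)]
    while True:
        a = propensities(x, c)
        a0 = sum(a)                      # combined reaction hazard
        if a0 == 0.0:                    # no reaction can fire any more
            break
        tau = rng.expovariate(a0)        # time to next event ~ Exp(a0)
        if t + tau > t_end:
            break
        t += tau
        # Reaction index chosen with probabilities a_i / a0.
        mu = rng.choices(range(len(a)), weights=a)[0]
        x = tuple(xi + d for xi, d in zip(x, STOICH[mu]))
        trajectory.append((t, x))
    return trajectory


if __name__ == "__main__":
    run = direct_method((10, 10, 0, 0), (1.0, 0.1, 0.01), 400.0,
                        random.Random(42))
    print(len(run), run[-1])
```

Passing the random generator explicitly keeps runs reproducible; repeating the call with different seeds gives an ensemble of trajectories whose spread reflects the intrinsic noise.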
A variant of the Direct Method is the First Reaction Method. This
variant differs from the standard approach in steps 3 to 6. It does not
calculate $a_0$; instead, for each reaction $R_i$ it draws a putative time
$t_i$ from an $\mathrm{Exp}(a_i)$ distribution. The simulated time to the
next reaction is the smallest $t_i$, and the reaction index is that of the
corresponding reaction $R_i$.
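A single step of the First Reaction Method can be sketched as follows. This is a minimal illustration under the assumption that reactions with zero propensity are simply skipped (their putative time would be infinite); the function name `first_reaction_step` is hypothetical.

```python
import random


def first_reaction_step(a, rng):
    """Return (tau, mu) for the next event, or None if nothing can fire.

    `a` is the list of propensities in the current state. One putative
    time is drawn per reaction with positive propensity, and the
    smallest one wins.
    """
    candidates = [
        (rng.expovariate(a_i), i)
        for i, a_i in enumerate(a)
        if a_i > 0.0
    ]
    if not candidates:
        return None
    return min(candidates)


if __name__ == "__main__":
    rng = random.Random(0)
    print(first_reaction_step([100.0, 0.0, 0.0], rng))
```

Because the minimum of independent exponential variables with rates $a_i$ is itself exponential with rate $\sum_i a_i$, and reaction i attains the minimum with probability $a_i / \sum_j a_j$, this step has the same statistics as one step of the Direct Method, at the cost of one random number per reaction.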
The Gillespie algorithm has recently been applied to many in silico
biological simulations. Kastner et al. [85] applied the algorithm to the
simulation of Hox cis-regulatory mechanisms. The simulation was successful
in reproducing key features of the wild-type pattern of gene expression,
and in silico experiments yielded results similar to those of in vivo
experiments. Kierzek et al. [87] applied the algorithm to model lacZ gene
expression and discovered the influence of the frequencies of
transcription and translation initiation on random fluctuations in gene
expression. McAdams and Arkin [116] also studied the transcription
initiation and translation mechanisms in the cellular regulatory network
using Gillespie's algorithms and found several stochastic phenomena, such
as fluctuations in protein production and switching delays for genetically
coupled links.
1.5 Formalizing the complexity
The invention of conceptual and technological tools is the building block
of any scientific revolution and paradigm shift. Such conceptual and
technological tools are now emerging at the intersection of computer
science, mathematics, biology, chemistry and engineering.
The three main concepts that have revolutionized researchers' approach to
systems biology can be summarized as follows:
• A living cell is an information processing device. Cells naturally
process internal and environmental information in complex fashions and
interact with neighboring cells to achieve coordinated behavior.
• Cellular information processing and passing are carried out by
networks of interacting molecules.
• A better understanding of the cell requires an information processing
model.
Computers have similar characteristics to the cell. Like software, cells
affect, prescribe, cau