
PhD Dissertation

International Doctorate School in Information and Communication Technologies

DIT - University of Trento

Modeling and simulating system biology

with stochasticity

Paola Lecca

Advisor:

Prof. Corrado Priami

Microsoft Research - University of Trento

Centre for Computational and Systems Biology

December 2006

Acknowledgements

To thank means to recognize that one is indebted, and I am thoroughly indebted to Corrado Priami. Corrado was an exceptional advisor, supporting my independence while providing scientific guidance, direction and intellectual stimulation. In this thesis, as well as in the papers that we wrote together, he helped me turn my ideas into concrete scientific results.

I am also very grateful to Paola Quaglia, because during my Ph.D. studies I benefited from her guidance and because she was a good mentor. In particular, I am grateful for her openness to the field of scientific investigation presented in this thesis, where her expertise in concurrency theory, functional languages and programming met my background in Physics.

I was extremely fortunate to work with Carlo Laudanna, Gabriela Constantin, Ines Mancini, Rita Frassanito and their co-workers. I owe my biological motivation and thinking to all of them. In particular, this thesis is the result of a fruitful collaboration, started three years ago with Carlo and Gabriela, on modeling and simulating autoreactive lymphocyte recruitment in inflamed brain micro-vessels. I am very grateful to Carlo and Gabriela, who devoted time, ideas and patience. Many thanks to Ines for the stimulating scientific discussions about chemistry, biochemistry and biology.

I am personally thankful to Andrew Phillips, Luca Cardelli, Adelinde Uhrmacher, Mathew Palakal, Ralf Blossey and John J. Tyson for their interest in my work and their constant encouragement. I learned from them several new ways to look into biology by means of the ideas and the tools offered by Computer Science. Andrew gave me useful suggestions for the implementation of an efficient stochastic algorithm for the prototype of the BioBeta simulator. I also thank Aviv Regev and William Silverman, who introduced me to the use of the ASPIC simulator.

I also wish to thank all the technical staff of the Dipartimento di Informatica e Telecomunicazioni of my University for their kindness and the promptness of their interventions. Many thanks to the secretariat of the Doctorate School for their attention and kindness in helping me through the jungle of bureaucracy.

Special thanks also go to the researchers and students of the Microsoft Research - University of Trento Centre for Computational and Systems Biology. I appreciated them not only as scientists but also as wonderful people to work with.

A pleasant surprise of my Ph.D. course was making new friends working in disciplines distant from mine and coming from different European and non-European countries. I am grateful to them for their friendship, encouragement, scientific discussion and moments of leisure. I also found exceptional colleagues, whom I wish to thank for their collaboration and for the questions that helped me to improve my work.

Finally, I would like to thank my family, to whom I dedicate this thesis as a sign of my gratitude for their support in the difficult moments and for sharing my happiness in the beautiful moments of my Ph.D. studies. I am extremely grateful to my sister, Michela, who believed in me and encouraged my intellectual path.


Abstract

Modeling and simulation of pathways as networks of biochemical reactions have received increased interest in the context of system biology. The central dogma of this re-emerging area states that it is the system dynamics and organizing principles of complex biological phenomena that give rise to the functioning and function of cells. Cell functions, such as growth, division, differentiation and apoptosis, are temporal processes that can be understood if they are treated as dynamic systems. System biology focuses on an understanding of functional activity from a system-wide perspective and, consequently, it is defined by two key questions: (i) how do the components within a cell interact, so as to bring about its structure and functioning? (ii) How do cells interact, so as to develop and maintain higher levels of organization and function? In recent years, wet-lab biologists have embraced mathematical modeling and simulation as two essential means toward answering the above questions. The credo of dynamic systems theory is that the behavior of a biological system is given by the temporal evolution of its state. Our understanding of the time behavior of a biological system can be measured by the extent to which a simulation mimics the real behavior of that system. Deviations of a simulation indicate either limitations or errors in our knowledge. The biochemical approach to understanding biological processes is essentially one of simulation. A biochemist typically prepares a cell-free extract that can mediate a well-described physiological process. Once the extract is fractionated to purify the components that catalyze individual reactions, the physiological process is reconstructed in vitro. The validity of this approach is measured by how closely the in vitro reconstructed process matches physiological observations. In this thesis we show that by carefully representing the principles and logic of the wet-lab reasoning, we can faithfully simulate on a computer a model of a biochemical pathway.

We also show how simulation can be used as an interactive modeling tool for reasoning about biochemical interactions in the design of experiments, in discovery, and in prediction. This work highlights that realistic simulations of the evolution of biological systems require a mathematical model of the stochasticity of the involved processes and a formalism for specifying the concurrent nature of the biochemical interactions. The Gillespie simulation algorithm, suitably generalized to simulate biochemical interactions rather than chemical reactions, satisfies the first requirement. The second requirement can be satisfied both by the stochastic π-calculus and by the stochastic Beta binders formalisms. These languages shift the focus from a Newtonian vision of the molecular dynamics to a detailed specification of the structure and functions of the components. The thesis thoroughly examines and critically discusses the main concepts of stochastic chemical kinetics and develops the re-formulations necessary to adapt them to a biological context.

The work shows models and simulation results of low-level systems, such as simple chemical reactions, and of higher-level processes, such as the cell cycle, lymphocyte recruitment, the pathogenesis of familial Parkinson’s disease and the voltage gating of ion channels in intercellular communication. The studies reveal the inefficiency of the usual ordinary differential equation approach in describing these phenomena. The dynamic modeling of molecular or cell biological systems is different from the classical application of differential equations in mechanics, because mathematical models of biochemical pathways are phenomenological constructions, in that the interactions among system variables are defined in an operational rather than a mechanistic manner.


Finally, we present a prototype of a simulator for the stochastic Beta binders, which joins the expressive power of this new formalism in specifying biological interactions with a re-formulation of the principles of stochastic chemical kinetics that is closer to the world of biological phenomena.

Keywords: system biology, stochastic simulation, chemical kinetics, π-calculus, Beta binders


Contents

1 Introduction
1.1 What is biological modeling
1.2 System Biology
1.2.1 What future for System Biology
1.3 Complexity of a biological system
1.4 Stochastic modeling approach
1.4.1 Stochastic simulation algorithms
1.5 Formalizing the complexity
1.5.1 Disadvantages in using O.D.E. for system biology: two alternatives
1.5.2 The π-calculus
1.5.3 The Beta binders formalism
1.6 Summarizing

2 Chemical kinetics
2.1 The mathematical structure of biological models
2.2 Chemical reactions
2.3 Kinetics of chemical reactions
2.3.1 Mass-action kinetics
2.3.2 Example 1: the Lotka-Volterra system
2.3.3 Example 2: the Michaelis-Menten kinetics
2.4 The structure of kinetic models


2.4.1 Properties of process-time
2.4.2 Properties of state-space
2.4.3 Nature of determination
2.4.4 XYZ models
2.5 Markov processes
2.6 The master equation
2.6.1 The chemical master equation
2.7 Molecular approach to chemical kinetics
2.7.1 Reactions are collisions
2.7.2 Reaction rates
2.7.3 The need for stochastic rates in stochastic simulations
2.7.4 Zeroth-order reactions
2.7.5 First-order reactions
2.7.6 Second-order reactions
2.7.7 Higher-order reactions
2.8 Fundamental hypothesis of stochastic chemical kinetics
2.9 General derivation of the stochastic rate constant
2.10 The reaction probability density function
2.11 The stochastic simulation algorithms
2.11.1 Direct Method
2.11.2 First Reaction Method
2.11.3 Next Reaction Method
2.12 Time-dependent extension of First Reaction Method
2.12.1 A case study: the passive transport of glucose
2.12.2 Simulation results
2.13 StochSim algorithm
2.14 Advantages and drawbacks of Gillespie algorithm
2.15 Spatio-temporal algorithms


2.16 The Langevin equation
2.16.1 Use and abuse of Langevin equation
2.17 Hybrid algorithms
2.17.1 Hybrid modeling in intercellular communication

3 Biochemical stochastic π-calculus models and simulations
3.1 Models and simulation of higher levels biological systems
3.2 The biochemical stochastic π-calculus
3.2.1 The stochastic engine of π-calculus
3.3 Eukaryotic cell cycle
3.3.1 Molecular machinery of the cell cycle
3.3.2 A simple model of Start and Finish
3.3.3 Specification
3.3.4 Simulation
3.3.5 Remarks
3.4 Autoreactive lymphocyte recruitment
3.4.1 Quantitative models of lymphocyte recruitment
3.4.2 Dembo adhesion model
3.4.3 Stochastic π-calculus adhesion and rolling model
3.4.4 Main results
3.4.5 BioSpi prediction of rolling cells percentage as a function of vessel diameter
3.4.6 Remarks
3.5 A faulty mechanism of protein folding and degradation in Parkinson’s disease
3.5.1 Mechanisms of neurodegeneration
3.5.2 First model: misfolded protein accumulation induced by mutant α-synuclein


3.5.3 Second model: misfolded protein accumulation induced by mutant parkin
3.5.4 Third model: misfolded protein accumulation as function of chaperones number
3.5.5 Remarks and future directions

4 Stochastic Beta binders and a prototype of simulator
4.1 A new language for system biology
4.2 Syntax and semantics of Beta binders
4.3 Stochastic Beta-binders
4.4 BioBeta simulator
4.4.1 Implementation
4.5 Simulation algorithm
4.5.1 Propensity functions for boxes interactions
4.5.2 Execution
4.5.3 Simulations

5 Wet experiments for kinetics studies in system biology
5.1 Monitoring a chemical reaction
5.2 Mass spectrometry
5.3 NMR spectroscopy
5.4 Model testing and confirmation

A Simulation in system biology: the state of the art
A.0.1 Language-based approaches
A.0.2 Databases of quantitative and qualitative information

B Analysis of a two-component signal transduction: a model for the feedback loops on protein translation and degradation


B.1 Introduction
B.2 Reaction equation
B.2.1 Model reduction
B.2.2 The organization of the regulatory network
B.3 Feedback loops
B.3.1 Negative feedback loop on translation
B.3.2 Auto-activation of protein degradation
B.3.3 A linear model
B.4 Experimental detection of feedback loops
B.5 Some remarks

C Inference for rate constants
C.1 Complete data observations
C.2 Incomplete data observations

Bibliography


List of Tables

2.1 Classes of biological phenomena and the formalisms most used to describe them.

2.2 Reactions of the chemical model displayed in Fig. 2.7. No. corresponds to the number in the figure.

2.3 Reactions of the chemical model depicted in Fig. 2.7, their propensity and the corresponding "jump" of the state vector ~nTR. V is the volume in which the reactions occur.

2.4 Set of chemical master equations describing the metabolite interactions shown in Fig. 2.7.

2.5 Rate expressions and O.D.E. model for the GLUT transporter [44].

2.6 Passive glucose transport model in stochastic π-calculus. a_bind, a_facing and a_unbinding are the propensities for the binding of the glucose molecule to the GLUT transporter, for the facing of the glucose molecule to the interior of the cell membrane, and for the breakage of the glucose-GLUT transporter complex, respectively.

3.1 Reduction rules of stochastic π-calculus.

3.2 Heterodimer complex formation and breakage.

3.3 Parameter values. See [133, 186].

3.4 Biochemical stochastic π-calculus specification of cell cycle control mechanisms in the eukaryotic cell.


3.5 Biochemical stochastic π-calculus specification of the 4-phase lymphocyte recruitment process.

3.6 Deterministic rates for the 4 phases of lymphocyte recruitment.

3.7 Space parameters and densities.

3.8 Three sets of experimental values of the rolling cell percentage (RCP) for different values of the vessel diameter Dv. The estimated experimental error on the rolling cell percentage is ±3.

3.9 Stochastic π-calculus specification of a model of a faulty mechanism of protein folding and degradation.

3.10 Channel rates and number of processes used in the models. The symbol “*” means that the value is a variable parameter (see Fig. 3.17).

4.1 Laws for structural congruence in Beta binders. This table is taken from [147].

4.2 Axioms and rules for stochastic reduction relations in Beta binders. This table is taken from [34].

4.3 Auxiliary functions. This table is taken from [34].

4.4 States for join reactions of the elements of the system Sys.

4.5 Example of states and functions defining the bio-processes enzyme and substrate.

4.6 Examples of BioBeta syntax for processes and their corresponding π-calculus syntax. The indication of the deadlock process Nil can be omitted.

4.7 Simulation parameters of the Beta binders model of ionic bonding between sodium and chlorine.

4.8 Simulation parameters for the APC/CDK antagonism. The rates associated with the channels involved in inter reactions are the k’s used in Chapter 3, Section 3.3.2, Table 3.3 for the same reactions. The rates of all the hide and unhide reductions not included in this table are infinite.

B.1 Simulation parameters [94].


List of Figures

1.1 The cell biology research cycle.

1.2 (A) Kinetics of the changes of the enzyme E and the enzyme-substrate complex ES as a function of time (in seconds). (B) Kinetics of the changes of the product P and the substrate S as a function of time (in seconds). Both (A) and (B) simulations have been performed with an initial number of enzyme and substrate particles E0 = 10 and P0 = 10, respectively. (C) Kinetics of the changes of the enzyme E and the enzyme-substrate complex ES as a function of time (in seconds). (D) Kinetics of the changes of the product P and the substrate S as a function of time (in seconds). Both (C) and (D) simulations have been performed with an initial number of enzyme and substrate particles E0 = 1000 and P0 = 1000, respectively. The simulations have been obtained with the Direct Gillespie algorithm [49] implementation of the Dizzy simulator [155].

1.3 Deterministic simulations of the kinetics of the changes of E and ES (A) and of P and S (B) as a function of time (in seconds). The initial concentrations of enzyme and product have been set to E0 = 1000 and P0 = 1000, respectively.

1.4 Ionic bonding between Na and Cl atoms. Na sends a message e on channel c to Cl, which receives it on the same channel c. After this communication Na becomes Na+ and Cl becomes Cl−.

1.5 Visualization of the specification of the "physical binding" reaction in π-calculus.

1.6 Competitive inhibition: substrate and inhibitor interact with the enzyme in a mutually exclusive way.


1.7 Pictorial representation of the bound states of enzyme and substrate in the enzyme-substrate molecular complex.

1.8 A system of two parallel bio-processes B1 and B2 (left and right box, respectively) in (A). Bio-process intra (B) and inter (C) reductions [147].

1.9 Graphical representation of the evolution of a bio-process due to expose (A), hide (B), and unhide (C) reductions. The expose rule assumes that z ∉ ∆ and z ≠ x.

1.10 (A) The execution of the join reduction defined in (1.8). (B) The execution of the split reduction defined in (1.9). As for the join rule, note that, unlike BioAmbients, the Beta binders formalism forbids the nesting of boxes.

2.1 The state space of a binary mixture.

2.2 Accessible states for the reaction 2A ⇋ 2B with C = 7.

2.3 Lotka-Volterra dynamics for [Y1]t=0, [Y2]t=0, k1 = 1, and k2 = k3 = 0.1.

2.4 Lotka-Volterra dynamics for [Y1]t=0, [Y2]t=0, k1 = 1, and k2 = k3 = 0.1. The equilibrium solution for this combination of parameters is [Y1] = 1 and [Y2] = 10. These values correspond to the coordinates of the intersection point of the nullclines.

2.5 Dimerization dynamics for [P]t=0 = 1, [P2]t=0 = 0, k1 = 0 and k2 = 0.5. Thus the equilibrium constant is Keq = 2, and the equilibrium concentrations are [P]eq = 0.39 and [P2]eq = 0.30. c = 1.

2.6 A. Experimental rate of loss of optical activity of sucrose for three initial concentrations of sucrose and a fixed concentration of the enzyme. Data of Michaelis and Menten replotted from Wong [195]. B. The initial rate V(0) in A. of the invertase-catalyzed reaction plotted as a function of sucrose concentration.

2.7 Two metabolites A and B coupled by a bimolecular reaction. Adapted from [78].


2.8 Since the curve shape is not symmetric, the average kinetic energy will always be greater than the most probable. For the reaction to occur, the particles involved need a minimum amount of energy - the activation energy.

2.9 Maxwell-Boltzmann speed distributions at different temperatures. As temperature increases, the curve spreads to the right and the peak at the most probable kinetic energy becomes lower. As temperature increases, the probability of finding molecules at higher energy increases. Note also that the area under the curve is constant, since the total probability must be one.

2.10 The collision volume δVcoll which molecule 1 will sweep out relative to molecule 2 in the next small time interval δt.

2.11 Five deterministic solutions of the birth-death process given by Eq. (2.40) for values of λ − µ given in the legend and for x0 = 50.

2.12 Five stochastic realizations of the birth-death process together with the deterministic solution (x0 = 50, λ = 3, µ = 4).

2.13 A. Cartoon of the four states of the GLUT transporter. B. Kinetic diagram.

2.14 Time behavior of the fraction of GLUT transporters in states S1 and S2 (x1 and x2, respectively). Simulation obtained from the O.D.E. model.

2.15 Time behavior of glucose concentration inside and outside the cell. The system is in equilibrium at t ≈ 4.9 min. Simulation obtained from the O.D.E. model.

2.16 Time behavior of the fraction of GLUT transporters in states S1 and S2. Simulation obtained from the time-dependent First Reaction Method algorithm. This figure shows the stochastic simulation result corresponding to the deterministic model shown in Fig. 2.14.

2.17 Time behavior of glucose concentration in the cell. Simulation obtained from the time-dependent First Reaction Method algorithm. This figure shows the stochastic simulation result corresponding to the deterministic model shown in Fig. 2.15.


2.18 Probability density function of the binding reaction between the GLUT transporter and the GLUCOSE molecule versus temperature.

2.19 Time behavior of glucose concentration in the cell. Simulation obtained with the Direct.

2.20 Time behavior of the GLUT transporter concentrations in states S1 and S2.

2.21 Time behavior of the reverse potential, number of open channels and rates of gating. The initial number of open channels is 100.
[...] gating. C = 0.1 µF/cm²

[...] gating. C = 0.5 µF/cm²

[...] gating. C = 1 µF/cm²

[...] gating. C = 20 µF/cm²

[...] gating. C = 30 µF/cm²

[...] gating. C = 40 µF/cm²

[...] gating. C = 50 µF/cm²

3.1 The phases of the cell cycle.

3.2 Cyclin sub-units are synthesized on ribosomes in the cytoplasm and bind rapidly and irreversibly to CDK kinases to form active cyclin/CDK dimers. The cyclin sub-units are degraded periodically by the APC, releasing inactive CDK monomers. The APC is inactivated by cyclin/CDK and re-activated by an “activator”. The k’s are the chemical reaction rates, which for the most part are functions of the dynamic variables. For example, $k_2 = k'_2\,[\mathrm{inactiveAPC}] + k''_2\,[\mathrm{activeAPC}]$, where $k'_2$ and $k''_2$ are the enzymatic turnover numbers characterizing the less- and more-active forms of APC, respectively.

3.3 The sequence of events in the cell cycle can be represented as a negative feedback loop: the cyclin/CDK dimers (X) turn on the activator (Cdc20), which indirectly activates Cdh1, which destroys cyclin sub-units.

3.4 Simulation of the time variation of cyclin/CDK concentration from equations (3.2) - (3.7) with the parameters given in Tab. 3.3 (see [186, 133]).


3.5 Simulation of the time variation of CDH1 and CDC14 concentrations from equations (3.2) - (3.7) with the parameters given in Tab. 3.3 (see [186, 133]).

3.6 BioSpi simulation output for the two-state Nasmyth model of cell cycle control. Time evolution of the absolute number of proteins involved in the process: Cdh1, Cdc14 and cyclin/CDK.

3.7 The 4-phase model of lymphocyte recruitment.

3.8 Time evolution of bond density in the Dembo adhesion model.

3.9 Representative trajectory of lymphocyte tethering at a mean velocity $v$ equal to one half of the hydrodynamic velocity $v_h$, with parameters: $\gamma = 0.001$ nm, $k_{on} = 84\ \mathrm{s}^{-1}$, $k^{0}_{off} = 1\ \mathrm{s}^{-1}$.

3.10 Representative trajectory of the rolling motion of a lymphocyte with a mean velocity $v < 0.5\,v_h$ that experiences durable arrests.

3.11 Representative trajectory of a lymphocyte for firm adhesion, with parameters: $\gamma = 0.001$ nm, $k_{on} = 84\ \mathrm{s}^{-1}$, $k^{0}_{off} = 20\ \mathrm{s}^{-1}$.

3.12 BioSpi simulation of the 4-phase model of lymphocyte recruitment.

3.13 Time evolution of the number of bound molecules for three different sets of vessel diameter values.

3.14 Experimental measurements of the variation of the rolling cell percentage with varying vessel diameter.

3.15 Rolling cell percentage versus vessel diameter in the BioSpi model.


3.16 Pathogenesis of PD induced by mutant α-synuclein: 1. the interaction of a nascent protein with a chaperone can result in a right-folded protein or in a misfolded protein; 2. the chaperone attempts to re-fold the faulty protein and the result can again be a right-folded protein or a misfolded protein; 3. therefore, the misfolded protein is draped by the ubiquitin transported by the parkin protein. A mutant variant of parkin is not able to transport the ubiquitin onto the misfolded protein. The mutant α-synuclein inhibits the activation of the proteasome by the ubiquitin. The mutant α-synuclein seems to be proteasome-proof, but the model presented in this paper takes into account a possible attempt of the proteasome to attack the faulty α-synuclein. The outcomes of the interaction between the nascent linear protein and the chaperone, as well as of the interaction between the mutant α-synuclein and the proteasome, are stochastically determined by the reaction probabilities derived from the kinetic reaction rates according to the Direct Gillespie algorithm.

3.17 (A) $r_s = 10\ \mu\mathrm{s}^{-1}$, $N_s = 10$; (B) $r_s = 0.01\ \mu\mathrm{s}^{-1}$, $N_s = 100$; (C) $r_s = 10\ \mu\mathrm{s}^{-1}$, $N_s = 100$; (D) $r_s = 1.0\ \mu\mathrm{s}^{-1}$, $N_s = 100$; (E) $r_s = 10\ \mu\mathrm{s}^{-1}$, $N_s = 200$; (F) $r_s = 100.0\ \mu\mathrm{s}^{-1}$, $N_s = 100$; (G) $r_s = 10\ \mu\mathrm{s}^{-1}$, $N_s = 1000$; (H) $r_s = 1000.0\ \mu\mathrm{s}^{-1}$, $N_s = 100$. The rates used in these simulations have been taken from [36].

3.18 Number of incorrectly refolded proteins in PD induced by mutant parkin. The curve of MISFOLDED’ goes to zero before 5 µs, indicating that the production of MISFOLDED” starts from the beginning of the simulation and increases as the square root of time, without giving the proteasomal mechanism of the cell any chance to react.

3.19 Variation of the number of chaperones and wrongly refolded proteins in PD induced by mutant α-synuclein. The initial number of chaperones is 10. This simulation shows that this number is not adequate to defend the cell from the increase in faulty proteins.


3.20 Variation of number of chaperons and wrongly refolded pro-

teins in PD induced by mutant α-synuclein. The initial num-

ber of chaperones is 100 in the plot (A) and 1000 in the plot

(C). The plots (B) and (D) are a zoom of the plots (A) and

(C) to better visualize the time behavior of the processes

in the first 0.2 µs−1 and 0.005 µs−1, respectively. A suffi-

ciently large number of chaperones seem to ensure the cell

the possibility to activate the proteasomes and consequently

to decrease the number of faulty proteins. . . . . . . . . . 241

4.1 Pictorial view of the dimerization process. The affine binders indicated with

the same polygon are hidden and the other binders are added to the interface

of the new box. . . . . . . . . . . . . . . . . . . . . . . . . . . 260

4.2 Scheme of the base model of the ligand-induced endocytosis. The mechanism

is described in the text. . . . . . . . . . . . . . . . . . . . . . . 261

4.3 Welcome page of BioBeta simulator . . . . . . . . . . . . . . . . . 264

4.4 The BioBeta help page. . . . . . . . . . . . . . . . . . . . . . . 265

4.5 The BioBeta form for the insertion of the specification of a bio-process. Here

it is shown the specification of the bio-process B1 ::= β(x,Γ)[x(y).Nil|x̄z.Nil]. 266

4.6 Selected fields for states and function of the bio-process E as indicated in

Table 4.5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267

4.7 Message confirming that there are no syntax errors. The user can therefore
proceed to insert a new bio-process, go back to modify the specification
of the previous one, run the simulation, or quit the simulator. . . . . 267

4.8 In order to run a simulation the user must insert the duration of the simulation

and the names of the bio-processes whose time-behavior has to be recorded

in the output table. . . . . . . . . . . . . . . . . . . . . . . . . 268

4.9 This page is “used” as a repository of downloadable papers about Beta binders
and their application in modeling bio-local phenomena. . . . . . . . . . 268


4.10 Flow diagram of the simulation algorithm. . . . . . . . . . . . . . . . 275

4.11 Stochastic simulations of the time-course of Na, Cl and the ions Na+ and Cl− . . . . 277

4.12 Stochastic fluctuations in Michaelis-Menten catalysis. Enzyme and substrate

initial concentration: E0 = S0 = 10 particles. The fluctuations of the curve

of the enzyme totally cover the curve of the substrate. . . . . . . . . . 278

4.13 Stochastic fluctuations in Michaelis-Menten catalysis. Enzyme and substrate

initial concentration: E0 = S0 = 100 particles. The width of stochastic

fluctuation is smaller: the curve of the enzyme partially covers the curve of

the substrate. . . . . . . . . . . . . . . . . . . . . . . . . . . . 278

4.14 Bio-processes of the main components of the system for the control of cell

cycle in eukaryotes. . . . . . . . . . . . . . . . . . . . . . . . . 279

4.15 Time course of the number of active cyclin/CDK complexes and active APC

complexes. The increase of the number of active cyclin/CDK complexes

corresponds to the decrease of active APC complexes. . . . . . . . . . 285

4.16 The maxima of the time course of active cyclin/CDK dimers correspond to

the minima of the active CDC14. . . . . . . . . . . . . . . . . . . 285

4.17 Time course of the number of active cyclin/CDK complexes and inactive APC
complexes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286

4.18 The time course of inactive CDC14 shows a stepwise decrease, while the trend

of time course of active cyclin/CDK dimer shows a decrease during the first

50 min and an increase in the following 50 min. . . . . . . . . . . . . 286

5.1 Scheme of a mass spectrometer. Adapted from [96]. . . . . . . . . . . 290


5.2 Conventional LC. It is most commonly used to purify and isolate some com-

ponents of a mixture. A liquid chromatograph separates analyte molecules in

solution by flowing the solution through a column that is packed with parti-

cles 3 to 5 µm in diameter and is between 10 and 30 cm long. The diameter

of the column depends on the application and determines the liquid flow rate.

Preparatory columns are > 10 mm in diameter, analytical columns are be-

tween 4 and 10 mm in diameter, micro-bore columns between 1 and 2 mm in

diameter and capillary columns < 1 mm in diameter. . . . . . . . . . 293

5.3 Scheme of a LC-MS. . . . . . . . . . . . . . . . . . . . . . . . . 293

5.4 Experimental equipment by the Bio-organic Chemistry Laboratory of Uni-

versity of Trento: ion trap mass spectrometer with electrospray, APCI( =

Atmospheric Pressure Chemical Ionization) and nanospray ionization sources

integrated with Hewlett Packard 1100 liquid chromatograph system [163]. . 295

5.5 The main components of an NMR instrument. Adapted from [96]. . . . . 297

5.6 The experiments in vitro significantly eliminate the context of interactions

where the system component C under investigation was placed in the context

in vivo. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301

B.1 Scheme of a two-component signal transduction system [94]. . . . . . . 315

B.2 A simplified scheme of the reaction mechanism for the KdpD/KdpE two-component
system. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316

B.3 Schema of the kdpFABCDE regulon [94]. . . . . . . . . . . . . . . . 316

B.4 First variant: the protein inhibits its own translation (e.g. by blocking
the ribosome binding site). Second variant: the protein activates its own
degradation (e.g. by the activation of a protease). . . . . . . . . . . . 325

B.5 Simulation of the time behavior of protein (A) and of mRNA (B). Parameters:

A = 1, B = 1. Initial conditions RNA(0) = 0 and prot(0) = 0. . . . . . 329


A = 1, B = 0. Initial conditions RNA(0) = 0 and prot(0) = 0. . . . . . . 329



A = 0, B = 1. Initial conditions RNA(0) = 0 and prot(0) = 0. . . . . . . 329


A = −1, B = 1. Initial conditions RNA(0) = 1 and prot(0) = 1. . . . . . 337


Chapter 1

Introduction

1.1 What is biological modeling

Modeling is an attempt to describe an understanding of the elements of a

system of interest, their states, and their interactions with other elements.

The model should be sufficiently detailed and precise so that it can in prin-

ciple be used to simulate the behavior of the system on a computer. In the

context of molecular cell biology, a model may describe the mechanisms in-

volved in transcription, translation, cell regulation, cellular signaling, DNA

damage and repair processes, the cell cycle or apoptosis. At a higher level,

modeling may be used to describe the functioning of a tissue, organ, or even

an entire organism. At a still higher level, models can be used to describe
the behavior and time evolution of populations of individual organisms. At
the beginning of a modeling project, the first issue to confront is to decide
which features to include in the model and the level of detail the model
is intended to capture. So, for example, a model of an entire organism is
unlikely to describe the detailed functioning of every individual cell, but a
model of a cell is likely to include a variety of very detailed descriptions of

key cellular processes. Even then, however, a model of a cell is unlikely to

contain details of every single gene and protein. In order to show how it is

possible to think about a biological process at different scales and different


levels of detail, let us consider the photosynthesis process. It can be
summarized by a single chemical reaction that combines water with carbon
dioxide to give glucose and oxygen. The reaction is driven by sunlight. This
could be written as

$$\text{Water} + \text{Carbon dioxide} \xrightarrow{\text{sunlight}} \text{Glucose} + \text{Oxygen}$$

This single reaction is a summary of the overall effect of the process.

Although photosynthesis consists of many reactions, the above equation
is not really wrong. It represents the process globally, at a higher level
than the more detailed description that biologists often prefer to work
with. Whether a single overall equation or a full breakdown into component
reactions is necessary depends on whether the intermediate reagents are of
interest to the modeler. In general, we can state that the “art” of building
a good model consists in capturing the essential features of the biology
without burdening the model with non-essential details. However, precisely
because details are omitted, every model is to some extent a simplification
of the biology. Nevertheless, models are valuable because they take ideas
that might have been expressed

verbally or diagrammatically, and make them more explicit, so that they

can begin to be understood in a quantitative rather than purely qualitative

way.
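As a point of reference, the word equation for photosynthesis given earlier in this section corresponds to the standard balanced overall reaction of oxygenic photosynthesis, which the text itself does not spell out. It is still only a summary: the coefficients record the net bookkeeping of atoms, not the many intermediate steps.

```latex
6\,\mathrm{CO_2} + 6\,\mathrm{H_2O} \;\xrightarrow{\text{sunlight}}\; \mathrm{C_6H_{12}O_6} + 6\,\mathrm{O_2}
```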

The features of a model depend very much on the aims of the modeling.

Modeling and simulation appeared on the scientific horizon long before
the emergence of molecular and cellular biology. Their genesis is in the
physical sciences and engineering. In the physical sciences, besides
theoretical and experimental studies, modeling and simulation are considered
the third indispensable approach, because not all hypotheses are
amenable to confirmation or rejection by experimental observations. In

biology, researchers are facing the same or maybe even worse situation. On


one hand experimental studies are unable to produce a sufficient amount

of data to support theoretical interpretations; on the other hand, due to

data insufficiency, theoretical research cannot provide substantial guidance
and insights for experimentation. Therefore computational modeling
takes on a more important role in biology by integrating experimental data,
facilitating theoretical hypotheses, and addressing “what if” questions.

Another important aim of modeling is to make clear the current state
of knowledge regarding a particular system, by attempting to be precise about
the elements involved and the interactions between them. Doing this can
be an effective way to highlight gaps in understanding. Our understanding
of the experimental observations of any system can be measured by
the extent to which a simulation we create mimics the real behavior of
that system. The behaviors of computer-executable models are first compared
with experimental values. If an inconsistency is found at this stage, it
means that the assumptions that represent our knowledge of the system
are at best incomplete, or that the interpretation of the experimental data
is wrong. Models that survive this initial validation can then be used to
make predictions to be tested by experiments, as well as to explore
configurations of the system that are not easy to investigate by in vitro or in

vivo experiments. Creation of predictive models can give opportunities for

unprecedented control over the system. In contrast to physics, biology still

lacks the fundamental laws on which it is based. Modeling can provide

valuable insights into the workings and general principles of organization

of biological systems.

Modeling, simulation, and analysis of the simulation outcomes are there-

fore perfectly positioned for integration into the experimental cycle of cell

biology (Fig. 1.1). Although we will always need real experiments to

advance our understanding of biological processes, conducting in silico,
or computer-simulated, experiments can help guide the wet-lab process by
narrowing the experimental search space.

Figure 1.1: The cell biology research cycle. [Diagram labels: Qualitative modeling, Quantitative modeling, Cellular data, Experiments, Cell programming, Analysis and interpretation]

1.2 System Biology

More than fifty years ago, Watson and Crick [192] identified the structure
of DNA, thus paving the way for molecular biology and genetics.
Grounding biological phenomena on a molecular basis made it possible
to describe the different aspects of biology, such as heredity, diseases and
development, as the result of coherent interactions between sets of elements
that are either functionally different or, most often, multifunctional.
It also made it possible to include biology in a consistent framework of
knowledge based on the fundamental laws of physics. Since then, the field
of molecular biology has

emerged and enormous progress has been made. Molecular biology en-

ables us to understand biological systems as molecular machines. Large

numbers of genes and the function of transcriptional products have been

identified. DNA sequences have been fully identified for various organisms

such as mycoplasma, Escherichia coli (E. coli), Caenorhabditis elegans (C.
elegans), Drosophila melanogaster, and Homo sapiens. The measurement of
protein levels and their interactions is also making progress [77, 167]. In
parallel with such efforts, new methods have been invented to disrupt the
transcription of genes, such as loss-of-function knockout of specific genes
and RNA interference, which is particularly effective for C. elegans and is
now being applied to other species. Nevertheless, such knowledge is not
sufficient to provide us with a complete understanding of biological systems as

systems [89]. Cells, tissues and organs, and organisms as well as ecologi-

cal webs are systems of components whose specific interactions have been

defined by evolution; so a system-level understanding should be the prime

goal of biology.

System-level understanding requires a set of principles and methodologies
that link the behaviors of molecules to system characteristics and

function. These principles and methodologies should be developed in the

following four areas of investigation.

1. System structures. These include the network of gene interactions

and biochemical pathways, as well as the mechanisms by which such

interactions modulate the physical properties of intracellular and mul-

ticellular structures.

2. System dynamics¹. How a system behaves over time under various
conditions can be understood through metabolic analysis, sensitivity
analysis, and dynamic analysis methods such as phase portrait and
bifurcation analysis. Specifically, system behavior analysis aims
at addressing the following questions: how does a system respond to
changes in the environment? How does it maintain robustness against
potential damage, such as DNA damage and mutations? How do specific
interaction pathways exhibit the observed functions? Understanding
the behavior of complex biological networks is not a trivial task.
Computer simulation and a set of theoretical analyses are essential to
provide an in-depth understanding of the mechanisms behind the pathways.

3. Control methods. Identifying the mechanisms that systematically
control the state of the cell is necessary for two reasons: (1) understanding
them makes it possible to modulate them so as to minimize
malfunctions, and (2) they involve potential therapeutic targets for
the treatment of diseases.

4. The design method. Strategies to modify and construct biological
systems with desired properties can be developed on definite design
principles and simulations, instead of blind trial and error.

¹ With the term ”dynamics” we simply mean ”time-evolution”. In this book the term is not used with
the meaning it has in mechanics, where it is distinct from ”kinetics” or ”kinematics” and is concerned
with the effects of forces on the motion of a particle or system of particles, especially forces that do not
originate within the system itself. In chemistry, by contrast, ”dynamics” is synonymous with ”kinetics”,
which is concerned with the rates of change in the concentration of reactants in a chemical reaction, and
thus with the time-behavior of the system.

Any progress in each of the above areas requires breakthroughs in our

understanding not only of molecular biology, but also of measurement tech-

nologies and computational sciences. Although advances in accurate, quan-

titative experimental approaches will doubtless continue, insights into the

functioning of biological systems will not result from purely intuitive
assaults. The reason for this lies in the intrinsic complexity of biological
systems, which a combination of experimental and computational simulation
approaches is expected to tackle.

At present, the identification of gene-regulatory logic and biochemical
networks is a major goal. Nowadays, biological modeling aims at uncovering
mechanisms at the fine-grained level that are internally consistent
with molecule-level biological programs, and at reproducing observed
phenomena. Since it is hard to continuously and systematically monitor the
parallel activities in molecular networks, molecule-level modeling [68, 131]
has become an indispensable tool to bridge experimental and theoretical
studies and to link system behaviors with molecular reactions.

Due to the distinctive differences between biological and physical systems,
modeling a network of interacting molecules comes with additional
challenges and calls for new strategies and tools. The early objective of
modeling was to explore the features of complex biological systems treated
as black boxes. In such a scenario, the goal was to understand and predict
the behavior of a system without knowing the microscopic details. The
strategy was to reproduce observed phenomena at a high level with a
simplified description of the internal structures. Two methodological features
emerged at this stage. First, since biological systems were approximated
as structure-less entities, many methods and tools were directly borrowed
from engineering fields, such as the Finite Element Method² and the Boundary
Element Method [7]. The second was a high-level abstraction
based on the inverse approach to modeling. As a consequence, numerical
techniques for the solution of ordinary differential equations (ODEs) and
partial differential equations (PDEs) were applied. Both the black box
assumption and inverse modeling, though suitable for modeling mechanical
systems, suffer from major problems when applied to biological systems. The
black box conjecture assumes that the internal structure of the system is
static, and thus it cannot hold when the system evolves in time as, for
instance, in a growth process. Complex internal structure and evolution are
key features that differentiate biological systems from mechanical systems.
Inverse modeling suffers from a loss of generality, and many inverse
problems are mathematically ill-posed. Even if adequate and precise data are
available, a unique solution is not always guaranteed, and special techniques
tailored to the problem at hand are employed [79].

² Many references on the Finite Element Method can be found at
http://www.solid.ikp.liu.se/fe/tit.html

The dynamic context in which genes operate is much more complex than

the static composition of genes and genomes. Though sequence alignment

can help us find homologues, the exact functions of genes still need to

be confirmed experimentally. For example during embryonic development,

different ectopic [41] and failed gene expression events can lead to different

phenotypes. The problem is tackled by creating various knock-ins
and knock-outs. The semantics of the genetic program cannot be
modeled using the black box conjecture. More generally, however, the
question of how the interactions among molecules produce the complexity
of a biological system has no clear answer. Knowledge of biological
complexity can lead to the design of better or more efficient systems, and
to an understanding of pharmacological effects for drug discovery. For these
reasons too, a set of simulations, each obtained by perturbing the
parameters of an original model, is helpful in understanding the dynamic
context in which genes, their products and molecules operate.

A perturbed system is one in which the system’s behavior is forced out

of its ’normal’ state by disturbances coming, for example, from external

influences. This definition applies equally to a system in theoretical physics
and to a biological system, and in both cases perturbation offers a means to
study and understand the system. Furthermore, applying perturbation theory
to biology may eventually allow the prediction and treatment of pathological
perturbations (diseases) such as those encountered in the clinical setting.

Perturbation analysis studies the behavior of systems forced out of their

normal state. It is often the case that the behavior of a system under such

perturbations is much more amenable to theoretical analysis than the gen-

eral (i.e., normal) behavior of the system. The main reason is that, math-

ematically, the behavior of a system close to its ’normal’ state can often be

described by linear equations, whose theory is very well developed. In
addition, beyond linear perturbation theory, there is a well-developed theory

for describing the behavior of systems as one moves away from a reference

state. This theory aims, for instance, to predict under what perturba-

tions a system will return to its reference state and which perturbations

will destabilize the system. This way of thinking also applies to biology.

The perturbation of a biological system by means of genetic mutation or

small molecules (chemical genetics) greatly aids the understanding of the

fundamental principles underlying such a system or process. Through ge-

netic dissection biologists learned that basic cellular processes such as cell

growth and cell division (as well as developmental processes depending on

the interaction of groups of cells and tissues) have been highly conserved

throughout evolution. Therefore, perturbations by small molecules or by

targeted or random mutations in individual genes in simple model organ-

isms such as yeast, Drosophila, C. elegans, Arabidopsis, and the mouse

have provided, and will in the future provide important insight into the

function of complex systems. Perturbation theory can also be applied bi-

ologically in a more controlled, reiterative manner. One can imagine tak-

ing some biological system of interest, defining its normal behavior, and

then investigating in a general and methodological way which perturba-

tions destabilize the system (in the sense that it will no longer return to

its normal state). Examples could be regulatory systems of various kinds,

such as those that keep the concentrations of different metabolites within

the cell at fixed levels and restore these levels after a perturbation. One

would then aim to identify what kind of perturbations would destabilize

these regulatory systems. One would go back and forth between perform-

ing perturbation experiments to see how the system behaves in response

to various perturbations, and building theoretical and computational mod-

els. One would start with ’small’ perturbations that can be described with

linear models, and would use those to predict, and subsequently test, the
behavior in response to larger perturbations.
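The workflow described above (describe the response to small perturbations with a linear model, then check how far that prediction holds for larger ones) can be sketched on a toy system. The model below is purely illustrative and is not taken from this thesis: a single metabolite with constant production and saturable removal, with hypothetical parameter values `S_IN`, `V_MAX` and `K_M`, integrated with a simple forward Euler scheme. The linear model is the first-order expansion of the rate law around the steady state, so a small displacement is predicted to decay exponentially at a rate given by the slope of the rate law there.

```python
import math

# Hypothetical parameters of a toy regulatory system (not from the thesis).
S_IN = 1.0   # constant production rate
V_MAX = 2.0  # maximum removal rate
K_M = 1.0    # half-saturation constant of the removal step


def f(x):
    """Rate of change dx/dt: constant production minus saturable removal."""
    return S_IN - V_MAX * x / (K_M + x)


def steady_state():
    """Reference ('normal') state x* where f(x*) = 0; requires V_MAX > S_IN."""
    return S_IN * K_M / (V_MAX - S_IN)


def slope_at(x, h=1e-6):
    """Central-difference estimate of f'(x), the linearized relaxation rate."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def integrate(x0, t_end, dt=1e-3):
    """Forward-Euler integration of the full nonlinear model from x0."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * f(x)
    return x


x_star = steady_state()
lam = slope_at(x_star)  # negative slope: perturbations decay, x* is stable


def linear_prediction(delta0, t):
    """Linear perturbation model: the displacement decays as exp(lam * t)."""
    return x_star + delta0 * math.exp(lam * t)


if __name__ == "__main__":
    for delta0 in (0.01, 5.0):
        full = integrate(x_star + delta0, 2.0)
        lin = linear_prediction(delta0, 2.0)
        print(f"delta0={delta0}: nonlinear={full:.4f}, linear={lin:.4f}")
```

For the small displacement the two numbers should agree closely, while for the large one the saturating removal term makes the real relaxation noticeably slower than the linear model predicts; that gap is what a subsequent perturbation experiment would be designed to test.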

1.2.1 What future for System Biology

Biologists are getting enthusiastic about mathematical modeling, just as
modelers are getting excited by biology. The complexity of molecular and cellular

biological systems makes it necessary to consider dynamic systems theory

for modeling and simulation of intra- and inter-cellular processes. To de-

scribe a system as ”complex” has become a common way to either motivate

new approaches or to describe the difficulties in making progress. Cur-

rently, before we can fully explain and understand the functioning and the

functions of cells, organs or organisms from the molecular level upwards,

the major difficulties to overcome are technological and methodological.

Nevertheless, whatever time is required, the complexity of these systems

ensures that there is no way around mathematical modeling in this en-

deavor. A mathematical pathway model does not represent an objective

reality outside the modeler’s mind. The model is no more, and no less, than a
complement to the biologist’s reasoning. Mathematics is the handicraft of the

natural sciences.

The risk in this exciting endeavor is that the following thoughts from

the beginnings of System Biology will remain true in the years to come:

”In spite of the considerable interest and efforts, the application of systems

theory has not quite lived up to expectations. One of the main reasons for

the existing lag is that systems theory has not been directly concerned with

some of the problems of vital importance in biology.”[121]

The challenge is for both the theoreticians and experimentalists to change

their ways:

”The real advance in the application of systems theory to biology will come


about only when the biologists start asking questions which are based on

the system-theoretic concepts rather than using these concepts to repre-

sent in still another way the phenomena which are already explained in

terms of biophysical or biochemical principles. Then we will not have ’the

application of engineering principles to biological problems’ but rather a

field of System Biology with its own identity and in its own right.”[121].

System biology will have succeeded when it is widely accepted that there is

nothing more practical than a good theory [194].

It is now necessary to clarify what complexity means in the context of
system biology. A complete definition of complexity should be given with
respect to

• the model: the large number of variables that can determine the behavior

• the natural system: the connectivity and non-linearity of relationships

• the technology: the limited precision and accuracy of measurements

• the methodology: the uncertainty arising from the conceptual framework
chosen (e.g. the choice of automata instead of differential equations).

However, in the next section we will focus on the natural system and on
the methodology used to model and simulate it, which are the central issues
of this thesis. We will try to explore the relationships between
the inherent characteristics of a biological system and the mathematical
framework and formalism best suited to describing it.

1.3 Complexity of a biological system

It is often said that biological systems, such as cells, are complex systems,
and that the grand challenge of the 21st century is to understand and model
the complexity of biological systems. Though complexity has been extensively
discussed at different levels [112, 196, 201, 168], there is no operational
definition of it for biological systems [4]. The common notion of a complex
system is that of a very large number of simple and identical elements
interacting to produce ’complex’ behaviors. However, the reality in biology
is somewhat different. In a biological system, large numbers of functionally
different, and often multifunctional, sets of elements interact selectively
and non-linearly to produce coherent rather than complex behaviors. A
biological system is not equal to the sum of its parts [136], nor is it a
system in which functions emerge from the properties of the networks rather
than from any specific element. On the contrary, in biological systems
functions rely on a combination of the network and the specific elements
involved [90]. A typical example is the p53 interaction pathway. This protein,
known as ’the guardian’ of the genome, acts as a tumor suppressor. It is
activated, inhibited and degraded by reactions such as phosphorylation,
de-phosphorylation, and proteolytic degradation, while its targets are
selected by the different modification patterns that exist; these are
properties that reflect the complexity of the element itself. Considering
just this example, Kitano [90] highlighted that biological systems are
better characterized as symbiotic systems.

Besides this inherent complexity, some hallmarks of complexity, such as
linearity and non-linearity, the number of parameters, the order of the
equations and the evolution of the network, emerge only when a system is
formalized in specific ways (see Appendix B for a linear formalization of a
two-component signal transduction model). Moreover, we can distinguish two
types of complexity, both encountered in modeling biological systems:
functional and structural, or dynamic and static. The operational definition
and identification of complexity in biological systems are not the only hard
tasks: its quantitative measurement is also a major challenge for experimental
biologists. The popular measure of complexity for dynamical systems is
computational complexity. For instance, the complexity of a sequence can
be inferred from the kind of finite state machine that can produce it.
Although this measure characterizes the amount of information necessary to
predict the future state of the machine, it fails to address its meaning in
the world of molecular and modular cell biology [4].

Since the topological structure of a molecular network undergoes significant
evolution within cells during biological development, measuring both
static and dynamic complexity according to such evolution may be a practical
approach, because it is easier to identify and abstract information from it
[18, 91]. Furthermore, features of the topological structure, such as the
existence of organized biological compartments, are also helpful in identifying
the modularity of molecular interactions. We will return to this point in
section 1.5.

Finally, there are two other important indicators of complexity in biological
systems. The first is non-linearity, including sensitivity to parameters and
to initial values. The second, on which we will focus in this thesis,
is the existence of stochasticity. Noise increases the complexity of the
systems even further by introducing issues of robustness, noise resonance
and bi-modal behavior.

1.4 Stochastic modeling approach

An important aspect of modeling of biological networks is the handling of

stochastic or random events that occur inside a cell. A more detailed and

formal discussion of this issue will have to be deferred until much later

in this thesis, once the appropriate concepts and terminology have been

established. In the meantime we highlight the issue by citing some examples

that illustrate the importance of stochastic modeling both for simulation

and inference.


Arguments for the application of stochastic models to chemical and
bio-chemical reactions come from at least three directions, since the model

1. takes into account

• the discrete character of the quantity of the components

• the inherently random character of the phenomena

2. is in accordance with the theories of

• thermodynamics

• stochastic processes

3. is appropriate to describe

• small systems

• instability phenomena

Many studies have reported the occurrence of stochastic fluctuations and
noise in living systems. Observations of gene expression in individual cells
clearly illustrate the stochastic nature of transcription [1, 117]. Other
studies of eukaryotic gene expression show that messenger RNA (mRNA)
production is quantal [75] and occurs in random pulses [162, 191].
It has been proposed that proteins are produced in short ’bursts’ at random
time intervals rather than in a continuous manner [25]. Furthermore,
another clear piece of evidence of the stochasticity of biological phenomena
at the molecular level is the existence of qualitatively and quantitatively
different outcomes in the temporal behavior of a system starting from the
same initial conditions. A classic example is the lysis/lysogeny switch of
bacteriophage λ-infected E. coli. Due to noise, the network may randomly
evolve into one of these two stable states [70, 69]. The role of noise has
also been observed in bacterial chemotaxis [107] and cellular selection [182].
At the level of the cellular population, the most important implication of
noise in critical cellular processes is that, in spite of identical initial
conditions, different cells may evolve along distinct pathways over time.
Population measurements typically show that the level of expression of the
same gene varies significantly across cells with the same genetic material.
The origin of such variability among isogenic populations is largely
attributed to stochastic phenomena [67].

At the microscopic level of functioning of cellular processes, the interactions between the molecules (DNA, mRNA, proteins, small molecules) follow the laws of statistical physics. A fundamental result of this branch of physics is the √n law [165], which says that the relative size of the fluctuations in a system is inversely proportional to the square root of the number n of particles present in the system. This number can be considered an index of the system size. As a result, a low number of particles or a low concentration results in high fluctuations, which originate largely from thermal fluctuations. Biochemical species participating in processes such as gene transcription, regulation and signaling often occur in low copy number: for example, a single DNA template with a small number of promoter sites, a few tens of mRNA molecules, and transcription factors numbering a few hundred. Consequently, elementary reactions, such as polymerase binding or complex formation, take place with widely distributed reaction times. Such stochastic effects, which arise from the inherent nature of biochemical interactions, are often termed intrinsic noise. As the concentrations of the reacting species increase, the stochasticity becomes less prominent and the behavior of the system tends to the deterministic solution. We illustrate this fact through the stochastic simulation of Michaelis-Menten enzyme catalysis, whose mechanism is given by the following set of reactions³

³ The formalism of chemical notation will be presented in detail in Chapter 2.


\begin{align*}
E + S &\xrightarrow{k_{+1}} ES \\
ES &\xrightarrow{k_{-1}} E + S \\
ES &\xrightarrow{k_{2}} E + P
\end{align*}

where E is the enzyme, S is its substrate, and P is the product. The rate constants are $k_{+1} = 1.0\ \mathrm{M^{-1}s^{-1}}$, $k_{-1} = 0.1\ \mathrm{s^{-1}}$, and $k_{2} = 0.01\ \mathrm{s^{-1}}$. Figs. 1.2 (A) and (B) show the results of a stochastic simulation in which the initial numbers of enzyme and substrate molecules are both 10, while Figs. 1.2 (C) and (D) show the results for 1000 particles. The plots cover a simulated time of 400 s. As the number of particles increases, the curves of the time evolution of the reactants and products become less noisy.
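As a rough numerical reading of the √n law for the two system sizes just compared, one can assume that the copy number n of a species is Poisson distributed with mean ⟨n⟩. This distributional assumption is ours, introduced only for illustration. For a Poisson variable the variance equals the mean, so the relative fluctuation is

```latex
\[
\frac{\sigma_n}{\langle n \rangle}
  = \frac{\sqrt{\langle n \rangle}}{\langle n \rangle}
  = \frac{1}{\sqrt{\langle n \rangle}},
\qquad
\langle n \rangle = 10 \;\Rightarrow\; \frac{\sigma_n}{\langle n \rangle} \approx 0.32,
\qquad
\langle n \rangle = 1000 \;\Rightarrow\; \frac{\sigma_n}{\langle n \rangle} \approx 0.032 .
\]
```

Under this assumption, a hundredfold increase in the number of particles reduces the relative noise by a factor of ten, which is in line with the smoother curves obtained for the larger system.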

In a network of molecular interactions there is also an extrinsic component of noise, which is due to the external environmental conditions. For example, a transcription factor for a given gene is often the protein product of another gene, and thus its production is also random. Such situations, in which the protein product of a stochastically triggered gene switches another gene, are characterized by a cascade of stochastic events. The timing of such triggers can result in different outcomes [116].

We must make clear that the formulation of the theory of stochastic kinetics does not reduce the importance of deterministic kinetics, because there exists a class of phenomena for which the stochastic model is only slightly “better” than the deterministic approach, while the mathematics of the stochastic model is much more complicated. ODE descriptions have been used in practice in many quantitative models. The general form of an ODE model can be written as

Figure 1.2: (A) Time course (in seconds) of the enzyme E and the enzyme-substrate complex ES. (B) Time course (in seconds) of the product P and the substrate S. Simulations (A) and (B) were performed with initial numbers of enzyme and substrate particles E0 = 10 and S0 = 10, respectively. (C) Time course (in seconds) of the enzyme E and the enzyme-substrate complex ES. (D) Time course (in seconds) of the product P and the substrate S. Simulations (C) and (D) were performed with initial numbers of enzyme and substrate particles E0 = 1000 and S0 = 1000, respectively. The simulations were obtained with the implementation of the Gillespie Direct Method [49] in the Dizzy simulator [155].

\[
\frac{d[X_i]}{dt} = f_i(\mathbf{x}) \qquad (1.1)
\]

where i = 1, 2, . . . , N, x = ([X1], . . . , [XN]) is the vector of concentrations, and [Xi(t)] is a continuous single-valued function describing the time behavior of the concentration of the i-th species. The specific forms of the functions fi, which are usually nonlinear in the [Xi]’s, are determined by the structures and rate constants of the chemical reactions of the system. The equations (1.1) are called reaction rate equations; solving them for the functions [X1(t)], . . . , [XN(t)], subject to the prescribed initial conditions, amounts to determining the time evolution of the amount of each species.

Figure 1.3: Deterministic simulations of the time course (in seconds) of E and ES (A) and of P and S (B). The initial concentrations of enzyme and substrate were set to E0 = 1000 and S0 = 1000, respectively.

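The set of ODEs governing the deterministic dynamics of the Michaelis-Menten kinetics, written with the rate constants of the reaction scheme given above, is

\begin{align*}
\frac{d[S]}{dt} &= k_{-1}[ES] - k_{+1}[S][E] \\
\frac{d[E]}{dt} &= (k_{-1} + k_{2})[ES] - k_{+1}[S][E] \\
\frac{d[ES]}{dt} &= k_{+1}[S][E] - (k_{-1} + k_{2})[ES] \\
\frac{d[P]}{dt} &= k_{2}[ES]
\end{align*}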
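The rate equations above can be integrated numerically with a few lines of code. The following is a minimal sketch, not the procedure used to produce Figure 1.3: it uses a hand-written classical Runge-Kutta (RK4) step, and the names `mm_rhs`, `integrate_mm`, `k_bind`, `k_diss` and `k_cat` are hypothetical labels for the binding, dissociation and catalytic rate constants. The numerical values of the constants are those quoted earlier; the initial amounts (10 and 10) and the step size are illustrative assumptions, and quantities are treated as plain numbers without units.

```python
def mm_rhs(state, k_bind, k_diss, k_cat):
    """Right-hand side of the Michaelis-Menten rate equations.

    state = ([S], [E], [ES], [P]); returns the four time derivatives.
    """
    s, e, es, p = state
    bind = k_bind * s * e   # E + S -> ES
    diss = k_diss * es      # ES -> E + S
    cat = k_cat * es        # ES -> E + P
    return (diss - bind, diss + cat - bind, bind - diss - cat, cat)


def integrate_mm(state, dt, n_steps, k_bind, k_diss, k_cat):
    """Advance the state by n_steps classical Runge-Kutta (RK4) steps."""
    rates = (k_bind, k_diss, k_cat)

    def shifted(base, slope, h):
        # base + h * slope, component by component
        return tuple(x + h * d for x, d in zip(base, slope))

    for _ in range(n_steps):
        k1 = mm_rhs(state, *rates)
        k2 = mm_rhs(shifted(state, k1, 0.5 * dt), *rates)
        k3 = mm_rhs(shifted(state, k2, 0.5 * dt), *rates)
        k4 = mm_rhs(shifted(state, k3, dt), *rates)
        state = tuple(
            x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for x, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state


# 40,000 steps of 0.01 s cover the 400 s window mentioned in the text.
s, e, es, p = integrate_mm((10.0, 10.0, 0.0, 0.0), 0.01, 40_000, 1.0, 0.1, 0.01)
print(f"S={s:.3f}  E={e:.3f}  ES={es:.3f}  P={p:.3f}")
```

A useful sanity check on any such integration is that the conserved totals [E] + [ES] and [S] + [ES] + [P] stay at their initial values, since the four equations sum to zero in exactly these combinations.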

Several platforms exist for ODE-based modeling. The best known are Gepasi [119] and E-CELL [183], which share a number of features, e.g. for the simulation of chemical reactions. Mathematical analysis tools such as metabolic control analysis, linear stability analysis of steady states, and parameter fitting have also been implemented. We refer the reader to Appendix 5.4 for a review of the currently available simulators based on the differential equation formalism. However, although metabolic reactions can be simulated with these tools, signaling activities may not be well supported [54]. Furthermore, signaling networks are not static and undergo evolution [18, 200]. Thus, the modeling of context-dependent cellular processes merits a different approach. A typical example is Presenilin, a protein responsible for cleaving the Notch/Delta complex. It can selectively cleave a large group of membrane proteins in different contexts [98, 171]. Describing its behavior with ODEs is therefore infeasible, because

• the biochemical equations would be very complex

• with the addition of a new gene or protein to the model, many equations must be rewritten, an arduous task that greatly slows the modeling process itself.

Another example of a gene with a complex function is the Notch gene itself, which takes part in intercellular communication. The semantics, or function, of its interaction with other proteins depends on its partners and on the timing of the interaction [154, 83]. In addition, in any practical model it is extremely difficult to obtain complete quantitative data on gene and protein activity, such as the rates of transcription, translation and degradation of proteins. Thus, only small or medium-sized models have been reported.

This brief introduction to reaction rate equations allows us to understand more deeply the meaning of the expression “intrinsically stochastic”, which we have used in this section to characterize biological phenomena at the molecular scale. Although the great importance and usefulness of the differential reaction rate equation approach to chemical kinetics cannot be denied, we should not lose sight of the fact that the physical basis for this approach is questionable. The approach assumes that the time evolution of a chemically reacting system is both continuous and deterministic. However, since molecular population levels can change only by discrete integer amounts, the time evolution of a chemically reacting system is not a continuous process. Nor is it a deterministic process. Even ignoring quantum mechanical effects and regarding the molecular motions as governed by the equations of classical mechanics, it is impossible even in principle to predict the dynamics of the system unless we have complete knowledge of its state. This includes the position, the orientation, and the momentum of every single molecule under consideration, together with complete knowledge of the chemistry of the interacting molecules. If we leave out such details of the state of the system in favor of a higher-level view, the dynamics of the system is not deterministic but intrinsically stochastic. In other words, although the temporal behavior of a chemically reacting system of classical molecules is deterministic in the full position-momentum phase space, it is stochastic in the N-dimensional subspace of the molecular population levels, contrary to what Eqs. 1.1 imply.

To conclude this section, we point out some of the roles played by stochasticity in biological phenomena. Since noise is a nuisance, living systems have developed noise-suppressing mechanisms, an example being genetic redundancy [134]. According to the theory of feedback loop control, noise can also act as a stabilizer and as a driver of molecular motors. Moreover, noise is responsible for stochastic resonance, the phenomenon in which noise enhances the detection of weak signals and helps improve biological information processing [71].

Noise is also involved in so-called stochastic focusing, in which cells exploit it to reduce the random variation in regulated processes by tuning a mechanism to a threshold [139]. Finally, stochasticity plays a crucial role in differentiation by establishing initial asymmetries that lead different parts of a system to evolve into different categories. Examples of the role of noise in differentiation can be found in many processes of the immune system, such as the clonal amplification of cells expressing an antigen, and also in many processes driving the rhythm of biological oscillators, such as those involved in the circadian rhythm mechanism.

1.4.1 Stochastic simulation algorithms

Models with a small number of molecules can realistically be simulated stochastically, that is, allowing the results to contain an element of probability, unlike a deterministic solution. Stochastic simulation algorithms provide a practical method for simulating reactions that are stochastic in nature. Different approaches to modeling the stochastic character of biological phenomena use different mathematical formalisms and simulation techniques. Although we will treat this topic in detail in the next chapter, here we give a brief preview that allows the reader to understand the solution proposed in this thesis for both the specification and the simulation of stochastic biological systems. The most widely used stochastic models are the “continuous time - discrete state space - stochastic” (CDS) models, in which the intrinsic noise is described by the Chemical Master Equation. As we will see in detail in Chapter 2, the Chemical Master Equation is impossible to solve for most practical problems. Gillespie proposed two exact stochastic simulation algorithms that generate realizations of the process described by the Chemical Master Equation, based on the assumption that the system is homogeneous and well mixed. These algorithms simulate one reaction at a time, based on the propensity function of each reaction. Multiplied by an infinitesimal time interval, this function gives the probability that the reaction occurs in that interval.

At each instant, the chemical system is in exactly one state. The idea is to simulate the time evolution of the system directly. Basically, the algorithm determines which reaction occurs next and when it occurs, given that the system is in state X at time t. Given a system with M reaction channels and N species, we define the following notation. The state of the system is defined by the state vector X = (X1, X2, . . . , XN), whose components are the numbers of molecules of each chemical species involved.

• P(τ, µ)dτ = probability that, given the state X at time t, the next reaction in volume V will occur in the infinitesimal time interval (t + τ, t + τ + dτ) and will be an Rµ reaction

• cµ = stochastic rate constant of reaction Rµ. As we will prove in Chapter 2, it can be derived from the deterministic rate constant k.

• hµ = number of distinct combinations of Rµ reactant molecules available in the state X

• aµ = hµcµ = propensity function of reaction Rµ

• aµdt = hµcµdt = probability that an Rµ reaction will occur in volume V in (t, t + dt), given that the system is in state X at time t.
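To make the relation aµ = hµcµ concrete, the sketch below evaluates the three propensities of the Michaelis-Menten scheme introduced earlier. It is an illustration under stated assumptions: the function name `propensities` and the dictionary layout of the state are hypothetical, and the numerical values passed as stochastic rate constants are placeholders, since the conversion from the deterministic constants k to the constants cµ is deferred to Chapter 2.

```python
def propensities(x, c):
    """Return [a_1, a_2, a_3] for the Michaelis-Menten scheme.

    x : dict with molecule counts for "E", "S", "ES", "P"
    c : sequence (c_1, c_2, c_3) of stochastic rate constants
        (placeholder values; their derivation from k is not shown here)
    """
    # h_mu counts the distinct reactant combinations for each reaction:
    #   E + S -> ES      : every (E, S) pair      -> E * S
    #   ES    -> E + S   : every ES molecule      -> ES
    #   ES    -> E + P   : every ES molecule      -> ES
    h = (x["E"] * x["S"], x["ES"], x["ES"])
    return [h_mu * c_mu for h_mu, c_mu in zip(h, c)]


state = {"E": 10, "S": 10, "ES": 0, "P": 0}
print(propensities(state, (1.0, 0.1, 0.01)))
```

With no complex present, only the binding reaction has a nonzero propensity, so it is the only reaction that can fire first from this state.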

The algorithm, known as the Direct Method, can be summarized as follows:

1. Initialize the system at t = 0 with the stochastic rate constants c1, c2, . . . , cM and the initial numbers of molecules of each species x1, x2, . . . , xN.

2. For each i = 1, 2, . . . , M, calculate ai(Xcurrent), based on the current state Xcurrent.

3. Calculate a0 = Σi ai(Xcurrent), the combined reaction hazard.

4. Simulate the time to the next event, t′, as an Exp(a0) random quantity.

5. Put t ← t + t′.

6. Simulate the reaction index, µ, as a discrete random quantity with probabilities ai(Xcurrent)/a0, i = 1, 2, . . . , M.

7. Update the state X of the system according to the reaction µ, that is, put Xcurrent ← Xcurrent + S(µ), where S(µ) denotes the µth column of the stoichiometry matrix S.

8. If t is less than the desired end time, return to step 2.
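The steps of the Direct Method can be sketched in a few lines of Python for the Michaelis-Menten scheme. This is a minimal illustration using only the standard library, not the Dizzy implementation cited in the text. The names `STOICH`, `mm_propensities` and `direct_method` are hypothetical, the species order (E, S, ES, P) is our choice, the stochastic rate constants are placeholder values, and the stopping rule (end time reached or no reaction possible) is an assumption of this sketch.

```python
import random

# Columns of the stoichiometry matrix S, one tuple per reaction,
# in the assumed species order (E, S, ES, P).
STOICH = [
    (-1, -1, +1, 0),  # E + S -> ES
    (+1, +1, -1, 0),  # ES -> E + S
    (+1, 0, -1, +1),  # ES -> E + P
]


def mm_propensities(x, c):
    """a_i = h_i * c_i for the three Michaelis-Menten reactions."""
    e, s, es, _ = x
    return [c[0] * e * s, c[1] * es, c[2] * es]


def direct_method(x0, c, t_end, rng):
    """Simulate one trajectory; returns a list of (time, state) pairs."""
    t, x = 0.0, tuple(x0)
    trajectory = [(t, x)]
    while True:
        a = mm_propensities(x, c)        # propensities in the current state
        a0 = sum(a)                      # combined reaction hazard
        if a0 == 0.0:
            break                        # no reaction can fire any more
        t += rng.expovariate(a0)         # waiting time ~ Exp(a0)
        if t > t_end:
            break
        # reaction index mu drawn with probabilities a[mu] / a0
        mu = rng.choices(range(len(a)), weights=a)[0]
        x = tuple(xi + d for xi, d in zip(x, STOICH[mu]))
        trajectory.append((t, x))
    return trajectory


traj = direct_method((10, 10, 0, 0), (1.0, 0.1, 0.01), 400.0, random.Random(0))
print(len(traj), "recorded states; final state:", traj[-1][1])
```

Passing an explicit `random.Random` instance keeps a run reproducible for a given seed; repeating the call with different seeds gives the spread of trajectories that the deterministic equations cannot show.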

A variant of the Direct Method is the First Reaction Method, which differs from the standard approach in steps 3, 4 and 6. It does not calculate a0; instead, it draws a putative time ti from an Exp(ai) random quantity for each reaction Ri. The simulated time to the next reaction is the smallest ti, and the reaction index is that of the corresponding reaction Ri.
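The selection step of the First Reaction Method can be sketched as follows. The function name `first_reaction_step` is hypothetical, and the handling of reactions with zero propensity (they are skipped, as they can never fire) is an assumption of this sketch rather than something stated in the text.

```python
import random


def first_reaction_step(a, rng):
    """Pick the next reaction by the First Reaction Method.

    a   : list of propensities a_i in the current state
    rng : a random.Random instance
    Returns (tau, mu): the smallest putative time and its reaction index,
    or (inf, None) if no reaction can fire.
    """
    best_tau, best_mu = float("inf"), None
    for i, a_i in enumerate(a):
        if a_i > 0.0:
            tau_i = rng.expovariate(a_i)  # putative time ~ Exp(a_i)
            if tau_i < best_tau:
                best_tau, best_mu = tau_i, i
    return best_tau, best_mu


tau, mu = first_reaction_step([100.0, 0.5, 0.05], random.Random(0))
print(f"next reaction: index {mu} after {tau:.4f} time units")
```

This variant draws one random number per reaction at every step, whereas the Direct Method needs only two per step, which is why the latter is usually cheaper when the number of reactions is large.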

The Gillespie algorithm has recently been applied to many in silico biological simulations. Kastner et al. [85] applied the algorithm to the simulation of Hox cis-regulatory mechanisms. The simulation was successful in reproducing key features of the wild-type pattern of gene expression, and in silico experiments yielded results similar to those of in vivo experiments. Kierzek et al. [87] applied the algorithm to model lacZ gene expression and discovered the influence of the frequencies of transcription and translation initiation on random fluctuations in gene expression. McAdams and Arkin [116] also studied the transcription initiation and translation mechanisms in the cellular regulatory network using Gillespie’s algorithms and found several stochastic phenomena, such as fluctuations in protein production and switching delays for genetically coupled links.

1.5 Formalizing the complexity

New conceptual and technological tools are the building blocks of any scientific revolution and paradigm shift. Such tools are now emerging at the intersection of computer science, mathematics, biology, chemistry and engineering.

The three main concepts that have revolutionized researchers' approach to systems biology can be summarized as follows:

• A living cell is an information processing device. Cells naturally process internal and environmental information in complex ways and interact with neighboring cells to achieve coordinated behavior.

• Cellular information processing and passing are carried out by networks of interacting molecules.

• A better understanding of the cell requires an information processing model.

Computers have similar characteristics to the cell. Like software, cells

affect, prescribe, cau
