SIMULATION OF LAC OPERON REGULATION
IN E. COLI USING THE SOFTWARE COPASI
ABSTRACT
Through its applications, bioinformatics makes it possible to simulate metabolic
pathways and to test hypotheses about the biochemical mechanisms that drive cell behaviour.
Knowing or estimating the parameters of some processes, it is possible to simulate and analyse
process trends in a virtual environment and to evaluate whether the simulation model is
consistent with the theoretical models. In this work, we simulate the mechanisms of lac
operon regulation in the model organism Escherichia coli using the biochemical system
simulator COPASI. The simulation focuses on the concentrations of the signalling molecules
and of the effectors of the cell response when lactose is added to the extracellular
environment. The evaluated components are cAMP, the lac enzymes, mRNA, the lac
operators and the concentrations of glucose and lactose. This thesis describes how the model is
structured, the mathematical components of the model and the results of simulations under
different conditions. The model does not include all the variables of the real system, but it
tries to consider parallel pathways of lactose degradation. The provided model, even if not
complete, appears consistent with the basic theoretical model in almost all the
simulations. The model will also be deposited in a public repository for future use.
CONTENTS
CHAPTER I – INTRODUCTION AND MODEL’S ORGANISM
Introduction
Model’s Organism: Escherichia coli
CHAPTER II – LAC OPERON STRUCTURE AND REGULATION
The lac Operon structure
What is an Operon
What is the lac Operon
Regulation pathway of lac Operon
Principles of transcriptional regulation
Regulation of transcriptional initiation: the lac operon in Prokaryotes
CHAPTER III – COPASI
What is COPASI
What is a Simulation Model
COPASI modeling elements
COPASI biological modeling elements
Global Quantities
Events and Tasks
Simulation Time settings
Output specifications
CHAPTER IV – MODELING THE LAC OPERON PATHWAY IN COPASI
Structure of the lac Operon model
Glucose and Lactose transport
Catabolite repression
Induction and repression of the lac operon
Lac proteins production
Degradation of lactose
Mathematical model
Assumptions and limitations of the model
CHAPTER V – SIMULATION, RESULTS AND CONCLUSION
Time parameters and simulation resolution
Initial conditions
Scheduled events
Simulations graphs
Results
Conclusions
References
CHAPTER I: INTRODUCTION AND MODEL’S ORGANISM
INTRODUCTION
The cell is the dynamic unit that makes up every living organism. Inside this microscopic
structure, everything is subject to fine regulation that keeps the cell from wasting energy
and resources and ensures the best possible adaptation to the environment.
Further, through a network of interacting metabolic pathways, the cell
can manage the anabolic and catabolic processes that control gene expression levels,
and hence the concentration of each fundamental component inside its structure. These
complex networks of chemical and biochemical signals give the cell the capacity to
express the information encoded in its DNA only
when that information becomes necessary to metabolize a specific substrate or to mount a
specific response. In this context, this thesis tries to demonstrate that a well-studied
regulatory pathway, such as the lac operon, can be studied in a virtual
environment, namely the COPASI software, letting us observe the biochemical mechanisms
at work under different (virtual) experimental settings of nutrient sources (glucose and lactose).
To reach this goal, we apply bioinformatics tools. There are multiple definitions of
bioinformatics, but the most suitable, in my opinion, is the one given by Xiong, who defines
bioinformatics as “an interdisciplinary research area at the interface between computer
sciences and biological sciences” [10]. More broadly, bioinformatics involves
technology that uses computers for the storage, retrieval, manipulation, and distribution of
information related to biological data, such as DNA, RNA and proteins.
Figure 1 - DNA data trends between 1989-2013
Given that the amount of available biological data has grown exponentially in recent decades
(see Figure 1), and that genomic data analysis is highly repetitive and mathematically
complex, computational technology is indispensable for mining genomes,
gathering information and building knowledge. Bioinformatics is part of a related
field known as computational biology. Bioinformatics is limited to the sequence, structural and
functional analysis of genes and genomes and their corresponding products, and is also referred
to as computational molecular biology. Computational biology, in contrast, includes
all the biological areas that use computational tools, for example the mathematical
modelling of ecosystems or population dynamics.
The ultimate goal of bioinformatics is a better understanding of the living cell and of how it
works at the molecular level. At the same time, the analysis of biological data often
generates new problems and challenges that in turn push forward the development of new
and better computational tools. This work belongs to the branch of bioinformatics called
functional analysis, which includes gene expression prediction, metabolic pathway
reconstruction, and metabolism simulation. The other two areas of analysis are
structure analysis, which covers the prediction of nucleic acid and protein structures and
their classification and comparison, and sequence analysis, which includes
sequence alignment, sequence database searching, genome comparison, gene and promoter
prediction, and phylogeny.
In [10], Xiong claims that bioinformatics is not only essential for basic genomic and
molecular biology research, but also has a major impact on many areas of biotechnology and
biomedical sciences. Applications include, for example, knowledge-based drug design,
forensic DNA analysis, and agricultural biotechnology. In drug design, an
important feature is that the informatics-based approach significantly reduces the time and
cost necessary to develop drugs with higher potency and lower toxicity. In forensics, results
from molecular phylogenetic analysis have been accepted as evidence in criminal
courts. In agriculture, genome analysis can help the development of new plant varieties,
such as crops with improved genetic traits. All these aspects mean that bioinformatics is
integrated into many areas, enabling us, through the analysis of available data, to
simulate the functions of the system under consideration, to predict its future behaviour
and to make new interpretations of the currently available data.
Bioinformatics, at the same time, has a number of inherent limitations.
Bioinformatics and experimental biology are independent, but complementary, activities.
Bioinformatics depends on experimental science to produce raw data for analysis. In turn,
it provides useful interpretations of experimental data and can guide experimental
research. Bioinformatics predictions are not formal proof of any concept and cannot
replace traditional experimental research methods. The factors that affect the quality
of bioinformatics predictions are the quality of the data, the sophistication of the algorithms
used, and the computing power available. These factors give us a measure of the errors
to expect from bioinformatics programs. Xiong states in his book that it is good practice
to use multiple programs and perform multiple evaluations whenever possible.
In this thesis, we build the first metabolic regulation model of the lac operon in COPASI. As
a first step, we adapt the theoretical model in [5,7] to the COPASI modelling approach.
There are different ways to model the lac operon in COPASI, each with its strengths
and weaknesses. We use a reaction-based approach because a pathway is a set of
reactions. A biochemical model can usually be described by differential
equations and reactions, and the choice depends on the purposes of the model and the outcomes
we are looking for. To obtain a consistent model, it is essential to balance
the mathematical part and the biochemical part of the model; otherwise the
simulation will be inconsistent with the experimental evidence and the theoretical model.
In the COPASI model we limit the mathematical equations to the rate laws of the enzyme-kinetic
reactions, on the following assumption: in the real world, a reaction can happen
only if the substrates and all the required factors are present. In the absence of a single
factor, a given reaction cannot happen. If we build the model only with mathematical rules, an
equation does not always capture the fact that the reaction cannot happen when the substrate
is absent, and under some conditions the simulation easily yields a negative concentration,
which is impossible to observe in reality. For this reason we adapt the model described in [7],
dropping some equations and adding their equivalents written as reactions. Once the
model is complete, we run seven different simulations.
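The point about negative concentrations can be illustrated with a minimal sketch written outside COPASI. It assumes a simple explicit Euler integration and an arbitrary rate constant (both are illustrative choices, not taken from the model in [7]). A substrate consumed by a rule that ignores how much substrate is left is compared with the same consumption written as a mass-action reaction, whose rate vanishes together with the substrate.

```python
def simulate(rate_fn, s0=1.0, dt=0.1, steps=50):
    """Integrate dS/dt = -rate_fn(S) with explicit Euler and return the trajectory."""
    s = s0
    trajectory = [s]
    for _ in range(steps):
        s = s - rate_fn(s) * dt
        trajectory.append(s)
    return trajectory


# Hypothetical rate constant, chosen only for illustration.
K = 0.5

# Purely mathematical rule: consumption continues even when no substrate is left.
fixed_rate = simulate(lambda s: K)

# Reaction-style rule (mass action): the rate is proportional to the substrate.
mass_action = simulate(lambda s: K * s)

print("fixed-rate rule goes negative:", min(fixed_rate) < 0)
print("mass-action rule stays non-negative:", min(mass_action) >= 0)
```

In this sketch the first rule drives the concentration below zero, while the reaction-style rule cannot, which is the behaviour that motivates writing the model as reactions wherever possible.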
This thesis is divided into five chapters. Chapter I contains the
introduction to the thesis and the presentation of the chosen model organism.
Chapter II describes the molecular knowledge and mechanisms of lac operon
regulation in bacteria. Chapter III presents the COPASI software, and
Chapter IV the simulation model and how we structured it in COPASI. In
the last chapter (Chapter V) we present and discuss the simulation results.
MODEL’S ORGANISM: ESCHERICHIA COLI
Escherichia coli is a Gram-negative, non-sporulating, facultatively anaerobic rod. It is
about 2.0 micrometres (µm) in length and its diameter is between 0.25 and 1.0 µm (see
Figure 2 for an example). The optimal temperature for growth is 37°C, but it can
multiply at temperatures up to 49°C (120°F). Growth is not strictly dependent on glucose,
given that the bacterium can use many substrates to obtain energy, such as fumarate,
trimethylamine N-oxide, dimethyl sulfoxide, amino acids and other compounds. E. coli can only
survive outside the body for a limited period of time, so it is considered an ideal indicator
organism for testing environmental samples for fecal contamination, although some
studies have shown that it can also survive for long periods. It is normally present in the gut
of animals. Escherichia coli includes a vast population of bacteria that show a very
high degree of both phenotypic and genetic diversity [9].
Figure 2 - E. coli, coloured electron microscope image - https://www.flickr.com/photos/niaid/16578744517
In a laboratory setting, E. coli can be grown inexpensively and easily. It has been
widely studied for about 60 years. It is the most extensively investigated prokaryotic model
organism and is considered a very important species in biotechnology and microbiology.
E. coli holds an important position in industrial microbiology and modern biological
engineering because of its ease of manipulation and the long history of its laboratory
cultures. The work of Herbert Boyer and Stanley Norman Cohen on the use of
restriction enzymes and plasmids to create recombinant DNA in E. coli became
the foundation of biotechnology.
E. coli is considered a very flexible host for heterologous protein production.
Recombinant protein production relies on the expression of various proteins in E. coli.
Researchers have used plasmids to introduce genes into the microbes, which has led to
high levels of protein expression. Such proteins can be produced on an industrial scale by
fermentation. A very useful and important application of
recombinant DNA technology was the production of human insulin in engineered E. coli.
Modified E. coli cells have been used in vaccine development,
bioremediation, biofuel production and the production of immobilized enzymes [9].
CHAPTER II: LAC OPERON STRUCTURE AND REGULATION
This chapter presents the definition of an operon, the structure of the lac operon and its
transcriptional regulation in prokaryotes.
LAC OPERON STRUCTURE
WHAT IS AN OPERON
In genetics, an operon is a functional unit of genomic DNA containing a cluster
of genes under the control of a single promoter. The genes are transcribed together into a
single mRNA strand and are either translated together in the cytoplasm, or undergo
trans-splicing to create monocistronic mRNAs that are translated separately (in eukaryotes) [6].
Originally, operons were thought to exist solely in prokaryotes, but the first operons in
eukaryotes were discovered in the early 1990s.
WHAT IS THE LAC OPERON
The lac operon of the model bacterium Escherichia coli was the first operon to be
discovered and provides a typical example of operon function. It consists of three
adjacent structural genes together with a promoter, a terminator, and an operator. The lac
operon is regulated by several factors, including the availability of glucose and lactose. It
can be activated by allolactose, which binds to the repressor protein and prevents it from
repressing transcription of the lac operon genes [6].
The three lac genes (lacZ, lacY, and lacA) are arranged adjacently on the E. coli genome
and together are called the lac operon. The lacZ gene encodes the enzyme β-galactosidase,
which cleaves the sugar lactose into galactose and glucose, both of which
are used by the cell as energy sources. The lacY gene encodes the lactose permease, a
protein that inserts into the cell membrane and transports lactose into the cell. The
lacA gene encodes thiogalactoside transacetylase, which rids the cell of toxic
thiogalactosides that are also transported in by the permease.
REGULATION PATHWAY OF LAC OPERON
PRINCIPLES OF TRANSCRIPTIONAL REGULATION
Not all genes are expressed in all cells all the time. Indeed, much of the adaptability of
living organisms depends on the ability of cells to express specific genes in different
combinations at different times and in different places. Even a lowly bacterium expresses
only some of its genes at any given time. This capability ensures, for example, that it can
produce the enzymes needed to metabolize the nutrients it encounters while blocking the
production of enzymes for other nutrients that are not available at that time. The
development of multicellular organisms offers an even more striking example of this so-called
“differential gene expression.” Essentially all the cells in a human contain the same
genes, but the set of genes expressed in forming one cell type is different from the set
expressed in other cell types. Thus, a muscle cell expresses a set of genes
different (at least in part) from those expressed by a neuron, a skin cell, and so on.
These differences arise mainly at the level of transcription, most commonly at the initiation
of transcription.
Gene expression is very often controlled by extracellular signals. In the case of bacteria, this
typically means molecules present in the growth medium. These signals are communicated to
genes by regulatory proteins, which come in two types: positive regulators, or activators,
and negative regulators, or repressors. Typically, these regulators are DNA-binding
proteins that recognize specific sites at or near the genes they control. An activator
increases transcription of the regulated gene, and a repressor decreases or
eliminates it.
Although there are cases where gene expression is regulated at essentially every step, from
the gene to its products, the most common step at which regulation acts is the
initiation of transcription. There are two reasons why this makes sense. First,
transcription initiation is the most energetically efficient step to regulate, ensuring that no
energy or resources are wasted. Second, regulation at this first step is easier to achieve
because fewer factors are involved.
RNA polymerase binds many promoters only weakly in the absence of regulatory proteins.
This is because one or more of the promoter elements is absent or imperfect. When
polymerase does occasionally bind, however, it spontaneously undergoes a transition to the
open complex and initiates transcription. This gives a low level of constitutive expression
called the basal level. RNA polymerase binding to the promoter is the rate-limiting step in
this case. To control expression from such a promoter, a repressor needs only to bind to a site
overlapping the region bound by polymerase. In that way, the repressor blocks polymerase from
binding the promoter, thereby preventing transcription. The site on the DNA where a
repressor binds is called an operator.
The lac genes of Escherichia coli are transcribed from a promoter that is regulated by an
activator and a repressor working as described above [5].
REGULATION OF TRANSCRIPTION INITIATION: THE LAC OPERON IN
PROKARYOTES
The lac promoter, located at the 5’ end of lacZ (see Figure 3), directs the transcription of
all three genes as a single mRNA; this mRNA is then translated to give the three
protein products.
Figure 3 - lac Operon structure [5]
These genes are expressed at high levels only when lactose is available and glucose, the
preferred energy source, is absent or present at low concentration. Two regulatory proteins
are involved: one is an activator called CAP, and the other is a repressor called the Lac
repressor. The Lac repressor is encoded by the lacI gene, which is close to the other lac
genes but transcribed from its own (constitutively expressed) promoter. The name CAP
stands for catabolite activator protein, but this activator is also known as CRP (for
cAMP receptor protein).
Figure 4 - Lac Operon Regulation Diagram. Web address: https://sbi4u2013.files.wordpress.com/2013/02/lacoperon.jpg
The gene encoding CAP is located elsewhere on the bacterial chromosome, not linked to
the lac genes. Both CAP and the Lac repressor are DNA-binding proteins and each binds to
a specific site on DNA at or near the lac promoter (the CAP site and the operator,
respectively; see Figure 3).
Each of these regulatory proteins responds to one environmental signal and communicates
it to the lac genes. Thus, CAP mediates the effect of glucose, whereas the Lac repressor
mediates the lactose signal. This regulatory system works in the following way (see
Figure 4). The Lac repressor can bind DNA and repress transcription only in the absence of
lactose. In the presence of that sugar, the repressor is inactive and the genes are de-repressed
(expressed). CAP can bind DNA and activate the lac genes only in the absence of glucose.
Thus, the combined effect of these two regulators ensures that the genes are expressed at
significant levels only when lactose is present and glucose is absent [5].
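The combined action of the two regulators can be summarised as a small truth table. The following sketch is a purely qualitative illustration of the logic described above; the function name and the three labels ("repressed", "basal", "high") are our own simplification and are not quantities of the COPASI model.

```python
def lac_expression(lactose_present: bool, glucose_present: bool) -> str:
    """Qualitative lac operon expression level for a given nutrient condition."""
    # The Lac repressor binds the operator only in the absence of lactose.
    repressor_bound = not lactose_present
    # CAP binds its site and activates transcription only in the absence of glucose.
    cap_bound = not glucose_present

    if repressor_bound:
        return "repressed"  # polymerase is blocked at the promoter
    if cap_bound:
        return "high"       # de-repressed and activated by CAP
    return "basal"          # de-repressed, but not activated


for lactose in (False, True):
    for glucose in (False, True):
        print(f"lactose={lactose!s:5} glucose={glucose!s:5} ->",
              lac_expression(lactose, glucose))
```

Only the condition with lactose present and glucose absent gives the "high" label, in agreement with the description in [5].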
CHAPTER III: COPASI
COPASI
WHAT IS COPASI
COPASI is a software application for the simulation and analysis of biochemical networks and
their dynamics. COPASI is a stand-alone program that supports models in the SBML (Systems
Biology Markup Language) [2] standard and can simulate their behaviour using
ODEs (ordinary differential equations) or Gillespie’s stochastic simulation algorithm [1].
At the time of writing, the application is at version 4.19 (build 140).
It is available for Linux, Windows and Mac.
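To make the difference between the two simulation approaches concrete, the following minimal sketch compares a deterministic (ODE) solution with Gillespie's direct method for a single decay reaction A -> B. It is a stand-alone illustration and not COPASI's implementation; the rate constant, the initial number of molecules and the random seed are arbitrary assumptions.

```python
import math
import random


def deterministic_decay(a0, k, t):
    """Closed-form solution of the ODE dA/dt = -k*A."""
    return a0 * math.exp(-k * t)


def gillespie_decay(a0, k, t_end, rng):
    """Gillespie direct method for the single reaction A -> B."""
    t, a = 0.0, a0
    while a > 0:
        propensity = k * a
        # Waiting time to the next reaction event is exponentially distributed.
        t += rng.expovariate(propensity)
        if t > t_end:
            break
        a -= 1  # one molecule of A is converted
    return a


# Hypothetical values, chosen only for illustration.
A0, K, T_END = 1000, 0.1, 10.0

rng = random.Random(1)
runs = [gillespie_decay(A0, K, T_END, rng) for _ in range(200)]
mean_stochastic = sum(runs) / len(runs)
expected = deterministic_decay(A0, K, T_END)

print("ODE result:", expected)
print("mean of stochastic runs:", mean_stochastic)
```

Each stochastic run gives a slightly different integer number of molecules, while the average over many runs approaches the deterministic value.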
COPASI is part of de.NBI, the ‘German Network for Bioinformatics Infrastructure’. The
network provides comprehensive first-class bioinformatics services to users in life sciences
research, industry and medicine. The de.NBI program coordinates bioinformatics training,
education and the cooperation of the German bioinformatics community with international
bioinformatics network structures including ELIXIR [3].
COPASI is used in research for the following aims:
- modelling biological, biochemical, and chemical systems;
- development of the theory of computational methods;
- development of “wet” laboratory methods.
It is also incorporated into other software tools. A list of many scientific articles, as well
as a basic user manual, is available online [1].
In this thesis, we use COPASI to model the lac operon pathway with the final aim of
simulating its dynamics under several conditions. Before
describing COPASI and its modelling elements, we define, in the following section, what a
simulation model is.
WHAT IS A SIMULATION MODEL
In science and engineering, “to simulate” means to represent, through models, a
phenomenon, system and/or process with the final aim of studying its components, states and
reactions under artificially designed conditions.
The “model” is then a real or virtual construct that reproduces the essential structural
components of the studied system, understood as a functional unit of interconnected parts.
A model is a representation of a phenomenon from one point of view; hence it cannot represent
the phenomenon in its totality, but it is a reduction or abstraction containing
all the information needed to observe or simulate the characteristics of the phenomenon
under study. Representation through models is not limited to
experimental research but, more generally, belongs to all the observational sciences.
Simulation through models works by isolating a system and treating as irrelevant the variables
that are not essential to what we are investigating. No model can or
should copy all the components of the real system it represents. Instead, the model has to
reduce the complexity of the system, its variables and its degrees of freedom.
The advantage of simulation is the artificial acceleration of the time needed to obtain
results, which highlights trends and helps predict the dynamic development of the model. The
effectiveness of a simulation depends on the quality of the model, that is, on its ability to
include the essential variables and exclude those that are not essential to the specific aims.
COPASI MODELING ELEMENTS
Since we are dealing with biochemical simulation programs, we briefly describe
the internal structure of COPASI. The first step in working with COPASI is the
creation of the model we are interested in. The program has a set of elements that allow us to
define the units of measurement (concentration, time and volume units), the compartments, the
chemical or biological species that belong to our model, the reactions that make our
model dynamic, and the events that can be used to change a given parameter during the
simulation of the model.
Once the model has been created, it is necessary to decide how the results are
shown. If we want to plot the results, then, once the plots are suitably configured in COPASI’s
Output section, the program generates graphs of the selected quantities as soon as the
simulation starts. We can modify the axis scales directly on the
graphs and save the work in different file formats (PDF, images and others).
Returning to the simulation features, COPASI includes the Tasks element, which contains all
the functions related to simulating the model. The program is able to determine whether the
model reaches a “steady state”, that is, a state where the variables reach stable values and
do not change appreciably during the simulation time if no external interference occurs.
COPASI offers several types of analysis, such as Steady-State
Analysis, Metabolic Control Analysis, Parameter Estimation, Stoichiometric
Analysis, Time Course Simulation and others, depending on the objectives of the simulation.
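The idea of a steady state can be illustrated with a minimal sketch that integrates a single variable until its rate of change becomes negligible. This is only an illustration of the concept and not the algorithm used by COPASI's Steady-State task; the production and degradation constants, the step size and the tolerance are hypothetical values.

```python
def find_steady_state(derivative, x0, dt=0.01, tol=1e-6, max_steps=1_000_000):
    """Euler-integrate dx/dt = derivative(x) until |dx/dt| drops below tol.

    Returns (value, time) at the steady state, or (None, elapsed time)
    if no steady state is reached within max_steps.
    """
    x = x0
    for step in range(max_steps):
        dx = derivative(x)
        if abs(dx) < tol:
            return x, step * dt
        x += dx * dt
    return None, max_steps * dt


# Hypothetical constants: constant production and first-order degradation.
K_PROD = 2.0
K_DEG = 0.5

value, time_reached = find_steady_state(lambda x: K_PROD - K_DEG * x, x0=0.0)
print("steady-state value:", value)
print("analytical value k_prod / k_deg:", K_PROD / K_DEG)
print("reached at simulation time:", time_reached)
```

For this simple system the steady state can also be computed by hand as the ratio of the production and degradation constants, which gives a check on the numerical result.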
COPASI already contains mathematical models for mass fluxes and physical units of measurement,
and it also lets us add new mathematical models, which makes it a flexible platform for
simulating many kinds of models.
COPASI BIOLOGICAL MODELING ELEMENTS
In the previous section we briefly described the structure of COPASI and its main
functions. Here we explore the parts that are most important for chemical,
biological and biotechnological use. In particular, we focus on two elements:
reactions and species [7,6].
Every biological model has a chemical and/or biochemical part
that has to be managed in the simulation environment. In COPASI, we can insert
into the model every species that plays a role in the model itself. A species can be a
small molecule or a protein; such a distinction makes no difference to the
simulator. It is possible to create a new species and to define several parameters for it,
such as:
- Initial concentration – specifies how much of that component is present at the
beginning of the simulation;
- Initial expression – a mathematical expression used to determine the initial concentration
of the species;
- Compartment – defines in which compartment the component is located,
depending on how the model is organised;
- Type – defines how the species interacts with the model. It is possible
to choose between reactions, fixed, assignment and ode. The reactions type means
that the concentration of the species is determined by the reactions, while, in the ode type,
it is necessary to insert the differential equation that specifies how the simulator
will calculate and update the concentration of the species during the simulation. Fixed
indicates that the concentration of the species remains the same throughout the simulation,
regardless of the reactions in which it is involved, while assignment calculates the value
from a mathematical expression, without adding it to the previously calculated value.
The concentration of one species can depend on other species or
parameters, which can be included directly in the formula using the correct COPASI syntax;
- Expression – if the type is ode, the differential equation
for the species has to be inserted here. [1]
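The four species types listed above differ in how the value of a species is updated at each simulation step. The following sketch shows that difference with a single explicit Euler step; it is our own illustration, the function and argument names are hypothetical, and it does not reproduce COPASI's internal solver.

```python
def update_species(value, sim_type, dt, *, net_flux=0.0, ode_rhs=0.0, assigned=None):
    """Return the value of a species after one time step of length dt.

    net_flux : net rate at which reactions produce (+) or consume (-) the species
    ode_rhs  : right-hand side of a user-supplied differential equation
    assigned : value computed by an assignment expression
    """
    if sim_type == "fixed":
        return value                   # never changes, whatever the reactions do
    if sim_type == "reactions":
        return value + net_flux * dt   # driven by the reactions it takes part in
    if sim_type == "ode":
        return value + ode_rhs * dt    # driven by the explicit differential equation
    if sim_type == "assignment":
        return assigned                # recomputed, not added to the previous value
    raise ValueError(f"unknown simulation type: {sim_type}")


print(update_species(1.0, "fixed", 0.1, net_flux=-5.0))
print(update_species(1.0, "reactions", 0.1, net_flux=-5.0))
print(update_species(1.0, "ode", 0.1, ode_rhs=2.0))
print(update_species(1.0, "assignment", 0.1, assigned=3.0))
```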
Figures 5a, 5b and 5c - The chemical/biochemical species involved in the model (COPASI screenshots)
The other modelling element related to the biological aspect that we consider is the
reaction. In the Reactions branch it is possible to insert specific reaction mechanisms
using COPASI’s own syntax. Like any chemical reaction, a reaction can have one or
more substrates and one or more products. Reactions follow the laws of thermodynamics and,
because an enzymatically catalysed reaction can follow different kinetics, the software allows
us to choose between mass action kinetics and the more complex kinetic models available in the
functions menu. Furthermore, it is possible to insert additional kinetic laws in the dedicated
section.
The main distinction is between the two major classes of reactions: some are
irreversible and the others are reversible. The syntax is simple: the symbol ‘=’ denotes
reversible reactions and the symbol ‘->’ denotes irreversible reactions.
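The two symbols can be illustrated with a toy parser that splits a reaction string into substrates, products and a reversibility flag. This is only a sketch of the notation described above: it is not COPASI's parser and it ignores any other element that the real syntax may allow. The species names in the example are placeholders.

```python
def parse_reaction(text):
    """Split a reaction string into (substrates, products, reversible)."""
    if "->" in text:
        left, right = text.split("->", 1)
        reversible = False
    elif "=" in text:
        left, right = text.split("=", 1)
        reversible = True
    else:
        raise ValueError("no '=' or '->' found in reaction string")

    # Splitting on ' + ' (with spaces) keeps names that end in '+' intact.
    substrates = [s.strip() for s in left.split(" + ") if s.strip()]
    products = [p.strip() for p in right.split(" + ") if p.strip()]
    return substrates, products, reversible


print(parse_reaction("A + B = C"))  # reversible reaction
print(parse_reaction("S -> P"))     # irreversible reaction
```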
The reaction kinetic model can also be applied to more complex biological functions,
such as the activity of the enzyme hexokinase (a phosphotransferase), which is here taken to
carry glucose from the extracellular environment into the cell’s cytoplasm and to catalyse the
addition of a phosphate group to the same substrate, releasing glucose-6-phosphate (G6P). The
reaction can be written in COPASI as follows:
eG + ATP = G6P + ADP + H+ [4]
where eG is extracellular glucose, ATP is adenosine triphosphate, ADP is adenosine
diphosphate and H+ is the released proton.
The formation of macromolecular complexes can also be written as a reaction, as we have done
in our model (Figures 7a and 7b). Given the kinetic parameters of the equilibrium reaction,
we can write, as we have done in the model:
cAMP + CRP = [cAMP:CRP].
This reaction has been taken from Appendix 2 of reference [7]. Once the
reaction is created and the kinetic law is chosen, we have to set the parameters of the
reaction, such as the association and dissociation constants of the complex. Figure 6 shows
what we did. The parameters involved in the reaction depend on the formula of
the rate law. This is a very useful feature: once we have created a reaction and have
the formula that defines its behaviour during the simulation, we can apply the same
rate law to other reactions simply by selecting it from the list in the rate law combo box.
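How the association and dissociation constants determine the behaviour of such a binding reaction can be seen in a minimal sketch that integrates reversible mass-action kinetics for cAMP + CRP = [cAMP:CRP]. The constants and initial concentrations below are arbitrary placeholders and not the values of reference [7]; the explicit Euler integration is our own simplification.

```python
def simulate_binding(camp, crp, k_on, k_off, dt=0.001, steps=20000):
    """Euler integration of reversible mass-action binding.

    Net rate = k_on * [cAMP] * [CRP] - k_off * [cAMP:CRP]
    """
    complex_ = 0.0
    for _ in range(steps):
        rate = k_on * camp * crp - k_off * complex_
        camp -= rate * dt
        crp -= rate * dt
        complex_ += rate * dt
    return camp, crp, complex_


# Hypothetical values, chosen only for illustration.
K_ON, K_OFF = 2.0, 1.0
free_camp, free_crp, bound = simulate_binding(camp=1.0, crp=0.5, k_on=K_ON, k_off=K_OFF)

# At equilibrium the ratio [complex] / ([cAMP][CRP]) equals k_on / k_off.
print("equilibrium ratio:", bound / (free_camp * free_crp))
print("k_on / k_off:", K_ON / K_OFF)
```

At equilibrium the forward and backward rates cancel, so the simulated ratio approaches the ratio of the two constants, and the total amount of cAMP (free plus bound) is conserved throughout.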
Figure 6 – A reaction settings screen in COPASI. The Reaction field contains the stoichiometry and the full syntax of the overall
reaction. The text box at the top allows a name to be assigned to identify the reaction. Below the reaction is the rate law that
defines the kinetics, and the central part lists all the parameters of the reaction, where values can be assigned to the kinetic
parameters.
COPASI
GLOBAL QUANTITIES
Another feature of COPASI is the possibility to set “Global Quantities”; our model has
69 global quantities in total. A Global Quantity is a variable that can be called and used
in other branches of the program, for example in the kinetic parameters of the reactions
shown previously (Figure 6). In that figure, the value of k1 under the
“Mapping” column is “K_ns”. “K_ns” is a global quantity that we created with the
equilibrium constant taken from the constants tables of ref. [7]. In this way it is not
necessary to type the same value each time; instead it can be selected directly from the
drop-down menu in the Mapping column, at the corresponding row. One of the first
steps to make the work easier and faster was therefore to create a global quantity for each of the
kinetic values reported in the “List of parameters used in the model” [7]. After this step all
the parameters were loaded in the program and could be called quickly in a
reaction or a formula.
COPASI
EVENTS AND TASKS
An event in COPASI is a modification of any model parameter or species concentration.
The program permits scheduling an event depending on whether some conditions are
satisfied. In the Events branch, when adding a new event, we insert the condition under which the
event will happen in the field “Trigger Expression”. In the field called “Target” it is possible to
choose which parameter or species will be modified. In the text box “Expression” we
insert the new value for the chosen target.
Figure 7 – Sample event of lactose injection at simulation time 500000
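The three fields of an event (trigger expression, target and expression) can be mimicked in a few lines of Python. This is a sketch of the idea only and not COPASI's implementation: the `Event` class, the `apply_events` helper and the injected value 4e-6 are hypothetical, and the event is simply made to fire once, the first time its trigger is true.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]


@dataclass
class Event:
    trigger: Callable[[float, State], bool]      # condition ("Trigger Expression")
    target: str                                  # species or parameter to modify
    expression: Callable[[float, State], float]  # new value for the target
    fired: bool = False


def apply_events(time: float, state: State, events: List[Event]) -> State:
    """Assign the expression to the target the first time the trigger holds."""
    for event in events:
        if not event.fired and event.trigger(time, state):
            state[event.target] = event.expression(time, state)
            event.fired = True
    return state


# Lactose injection at simulation time 500000 (the amount is hypothetical).
state = {"Lac_ext": 0.0}
injection = Event(
    trigger=lambda t, s: t >= 500000,
    target="Lac_ext",
    expression=lambda t, s: 4e-6,
)
for time in (499999, 500000, 500001):
    apply_events(time, state, [injection])
print(state["Lac_ext"])
```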
Depending on the aim of the simulation, the Tasks branch offers different approaches to the
model: parameter estimation, parameter analysis over the whole simulation, and
verification of the existence of a steady state for the model. Many of these functions require
an advanced knowledge of the program and we did not use most of them. We used only
the “Time Course” function, which sets the time resolution parameters of the simulation.
The time parameters are explained below.
COPASI
SIMULATION PARAMETERS
COPASI is a simulator, and to run a simulation session we need to set some parameters.
The COPASI section used to set the parameters and run the simulation is called the Time
Course section. The parameters available are:
- the Duration (D) of the simulation. This value is the difference between the final time
and the start time: D = Tfinal – T0.
- the Interval Size, which determines the resolution of the simulation. It fixes the
number of intervals at which the simulator reports the calculated values during the
simulation. The number of intervals is obtained by dividing the
duration by the Interval Size:
$$ \text{Number of Intervals} = \frac{\text{Duration}}{\text{Interval Size}} $$
A large number of intervals gives a higher resolution of the simulation output and of the
plots: the more intervals, and hence sampling points, the finer the reported time
course. Of course, with a high number of intervals the
simulation requires more time to complete than with a smaller
number of intervals.
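The relation between duration, interval size and number of intervals can be checked with a short calculation. The sketch below uses the duration of 501000 minutes adopted later in this work; the interval size of 100 minutes is a hypothetical value chosen only for illustration, and the function name is not part of COPASI.

```python
def number_of_intervals(duration, interval_size):
    """Number of Intervals = Duration / Interval Size, as a whole count."""
    return round(duration / interval_size)


duration = 501000.0     # minutes, the duration used for the runs in this work
interval_size = 100.0   # minutes, hypothetical value for illustration
intervals = number_of_intervals(duration, interval_size)
print(intervals)
```

Halving the interval size doubles the number of sampled points, which is why finer resolutions make the runs and the plots heavier.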
COPASI
OUTPUT SPECIFICATIONS
The simulation is a means to obtain information about the behaviour of specific components
in a model, or to predict some parameters under given conditions. To extract
the desired data correctly, COPASI can save the report of a simulation
directly to a file, so that the data can be processed with other software (e.g. MATLAB), and
can create two-dimensional plots of the chosen parameters. To choose the data to be
shown it is necessary to use the plot class. In the Output Specifications branch there is a
further branch called “Plots”. Inside Plots we can create a new plot and afterwards create
a new “curve”. During the creation of the curve it is possible to select which parameter will
be shown on the X axis and which on the Y axis. The classical example is to put Time
on the X axis and the concentration of a specific species on the Y axis. In this way, once
the simulation has started, it is possible to observe the concentration of the desired species
during the simulation. Another advantage of the software is the possibility to insert
multiple curves in the same plot. This feature permits observing, in the case previously
described, the concentration of multiple species over time in the same graph.
Figure 8 – A screen of Plot branch in Output Specifications
Figure 9 – A screen of the “Free Operators” plot structure with its curves. On the right, all the parameters of the lines.
CHAPTER IV: MODELING THE LAC OPERON PATHWAY IN COPASI
In this section we describe the simulation model we defined in COPASI for the lac
operon pathway. The model was inspired by [7]. In the following, we first
describe the structure of the simulation model, then the 46 reactions of which our COPASI
model is composed, and finally the mathematical model of the dynamics we referred to.
MODELING THE LAC OPERON PATHWAY IN COPASI
STRUCTURE OF THE LAC OPERON MODEL
Starting from the basic definitions of the model, the measurement units that we chose are
the mole (mol) for the quantity of a given species, the litre (l) for volume and the minute (min)
for time.
The first element that we defined is the compartment. In the COPASI model we built
there is only one compartment, which we simply called “cell”, with a volume of 1*10^-6 l [8].
We can make this assumption because bacteria are prokaryotes. Prokaryotes
are usually simpler cells than eukaryotes, mainly in the internal organization of their elements,
so we can consider the whole cell as a single large compartment.
Figure 10a – The list of all the 46 reactions of the model – COPASI screenshots
Inside this compartment, not considering vesicles, all the metabolic reactions take place.
The model is defined by 46 reactions and 46 molecular species.
Figure 10b – The list of all the 46 reactions of the model – COPASI screenshots
In Figures 10a and 10b, we show the COPASI screenshots of the section where
reactions are specified, together with the syntax COPASI requires for reaction specification. In
the reported table, each reaction has a name, the reaction equation, a rate law and a
flux.
Normally a generic cell, defined as the smallest living unit of every organism on this
planet, takes nutrients from the outer environment and “digests” them to extract
energy. This energy, which in a cell is stored in a chemical gradient (see Mitochondria) or in
high-energy bonds (ATP), is fundamental to many essential processes that permit the cell
to carry out its living functions. Thanks to its semipermeable membrane, the cell can
select most of the substances that it absorbs.
Figure 11 - Schematic functional representation of the simulated model [7]
MODELING THE LAC OPERON PATHWAY IN COPASI
GLUCOSE AND LACTOSE TRANSPORT
In designing the model we should reproduce as closely as possible the chain of events that
normally happens in a given organism. We start from the plasma membrane, where
we find all the transport and recognition components. In our model the
membrane carries two enzymatic complexes of interest: 1 – the Phosphotransferase system
(PTS) and 2 – Permease. The PTS is responsible for the transport of Glucose from
the outer to the inner side of the membrane, and during this passage the complex adds a
phosphate group to Glucose to produce Glucose-6-phosphate (G6P). This reaction is
written as:
Glu_ext -> G6P (1)
where Glu_ext is the external Glucose.
The second reaction, or process, that happens at the membrane level is the Permease transport
of lactose from the extracellular environment to the intracellular space. We
describe this reaction with the syntax:
Lac_ext -> Lac_int (2)
where Lac_ext is the external lactose and Lac_int is the internal lactose.
MODELING THE LAC OPERON PATHWAY IN COPASI
CATABOLITE REPRESSION
The metabolism of other sugars in the cell is inhibited if glucose is present and plentiful, a
phenomenon known as catabolite repression. The signalling molecule of this pathway is cAMP
(cyclic AMP). Its total concentration is given by the rates of production (under given
conditions, by adenylate cyclase), degradation and secretion. When external and internal
glucose are absent or present at low concentration, the quantity of intracellular cAMP increases.
This signal tells the cell that a primary source of energy is running out. In response to
this increase of intracellular cAMP, the cell activates genes that permit it to use different,
previously unused sources of energy, as in the case of diauxic growth [7].
ATP is then transformed into cAMP in the following reaction:
ATP -> cAMP (3)
In our model the syntax is: -> cAMP, because we assumed that the ATP concentration is not a
limiting factor in cAMP production.
Furthermore, in the current model we assumed that cAMP production is fast and
depends on the external Glucose concentration.
MODELING THE LAC OPERON PATHWAY IN COPASI
INDUCTION AND REPRESSION OF THE LAC OPERON
The lac operon transcription is controlled by the binding of the tetrameric lac repressor to
one or more of the three operator regions and by the binding of the CRP-cAMP complex to its
specific DNA binding site. Analysing the chain of equilibrium associations of the regulation
complexes, we wrote the series of interactions shown in the article [7] (reactions 1 to 28
in Figures 10a and 10b) directly into COPASI, modifying
some reactions to obtain a functional mechanism. In the complex equilibrium reactions the
species D indicates a “non-productive site or promoter”.
The first factor that modulates the transcription of the operon is the basal binding, or affinity,
of the RNA-Polymerase for its sigma (σ) subunit and then for the promoter. This complex is
written as the [RNAP:σ:P] complex, where RNAP is the RNA-Polymerase and P is the
promoter. Sigma is an essential protein of the transcription initiation complex that binds a
specific promoter. In Escherichia coli this factor is identified with the name σ70.
It has been shown that different σ factors are directed to different promoters with different
activity; for example, σ54 increases the transcription of genes responsible for
nitrogen metabolism [5-7].
The second factor that controls lac expression is the interaction between the signalling
molecule cAMP and the cAMP Receptor Protein (CRP), identified by the [CRP:cAMP]
complex syntax. In the presence of cAMP, CRP binds cAMP, forming this complex.
The complex can migrate along the DNA and bind to a specific site of a promoter. A
promoter, when activated, enhances the transcription of the associated genes. The
[CRP:cAMP] complex binds the CAP site, positioned about 60 nucleotides upstream of the
transcription start site, and stabilizes the RNA-Polymerase, which has poor affinity for the promoter
without the [CRP:cAMP:CAP] complex [5-7].
The third factor that controls the expression of the lac Operon is the system formed by the
repressors. The repressor proteins are produced by a gene far from the lac Operon and
exert their repression on the lac operator, which partly overlaps the sequence where the
RNA-Polymerase binds to start transcription [5]. In their default state the repressors are
bound to the DNA lac operator, but when allolactose, a derivative of lactose,
binds them, the repressors change conformation and release the DNA. This event
permits transcription to start and enhances lac mRNA production. This model has
three operator sites that can be repressed, and the total repression is the product of the
repression at these three sites. [O1], [O2] and [O3] in the model are the three operator sites
free of the repressor, while [Rep:O1], for example, indicates the
Repressor-Operator site complex. Each of the operators can be bound by the active
form of the repressor [7].
The two reactions that describe the transcription of the lac Operon and of the lac Repressor
are the following:
-> mRNA_ZYA (4)
-> mRNA_Rep (5)
where mRNA_ZYA is the messenger RNA of the lac Operon, while mRNA_Rep is the
messenger RNA of the repressor. The following reactions, conversely, describe the
degradation of the mRNA:
mRNA_ZYA -> (6)
mRNA_Rep -> (7)
Reactions (4) and (5) represent RNA anabolism (transcription) and reactions
(6) and (7) RNA catabolism (degradation).
Reactions (4) and (5) are written without a substrate because we assume that none of the
components for RNA transcription is present in limiting quantities. The same holds for the
products of the degradation described in (6) and (7), because we assumed that accounting for
ribonucleotides is not essential for the current model. We consider them non-essential here, although
they certainly are essential in a real cell: without ribonucleotides the cell cannot synthesize new
RNA (mRNA, tRNA, rRNA, snRNA, etc.). In our model, however, we explore the
dynamics of the lac operon in isolation from the rest of normal metabolism. The model
assumes that all the basal components for every fundamental need of the cell are
present at non-limiting concentrations. Accounting for these components could be an
improvement of the model introduced here, at the cost of increased complexity.
MODELING THE LAC OPERON PATHWAY IN COPASI
LAC PROTEINS PRODUCTION
Translation of the lac operon mRNA mainly increases the concentration of the Permease
transporter on the plasma membrane and raises the cytosolic level of β-Galactosidase.
The production, and hence the concentration, of these enzymes in the cell
depends mainly on two pathways: the anabolic pathway, which is proportional to the
mRNA-ZYA in the cell, and, on the opposite side, the catabolic pathway of proteins, which
involves ubiquitin-Kinase and proteases that continuously destroy proteins.
To indicate the synthesis of all the proteins of the model, in COPASI we insert the following
reactions:
-> Permease (8)
-> β-Galactosidase (9)
-> Rep (10)
We did not insert a substrate because they are enzymes derived from translation. Given
that in a cell one mRNA can be read by more than one ribosome, there is no
direct stoichiometric relationship between one mRNA molecule and one protein. Instead,
we know the constant with which the mRNA is translated, that is, how many moles of protein the
cell produces per mole of mRNA-ZYA; this is the translation rate constant
[7]. In this model we treat this process, in terms of reactions, as two separate
reactions, linked internally by an ODE (Ordinary Differential Equation).
The proteins can be destroyed by the Ubiquitin System or decay after a certain time. We
grouped all these processes under a single value called the protein decay rate constant. The
reactions involved are:
Permease -> (11)
β-Galactosidase -> (12)
Rep -> (13)
When the proteins are destroyed, all their amino acids are released. Here it is not useful to
consider amino acids as limiting components in synthesis or in degradation,
because it complicates the model and does not serve the aims of this work. However, as
we said for the ribonucleotides above, this could be a suggestion for future improvements of
the model.
MODELING THE LAC OPERON PATHWAY IN COPASI
DEGRADATION OF LACTOSE
In the proposed model we assumed that the intracellular lactose follows three different
routes at the same time. The first reaction that involves the intracellular lactose is the
action of the enzyme β-Galactosidase, which converts lactose to allolactose.
The following reaction describes this transformation:
Lac_int -> Allo (14)
where Allo is Allolactose.
The second is the hydrolysis of allolactose to Glucose and Galactose. We assume that
the Glucose resulting from this reaction is immediately phosphorylated to G6P. The reaction
is:
Allo -> G6P + Gal (15)
where Gal means Galactose, an epimer of Glucose. The enzyme that catalyses this reaction is
again β-Galactosidase.
The third possibility is that lactose is directly converted into Glucose and Galactose. We
described the reaction as:
Lac_int -> G6P + Gal (16)
The hydrolysis of lactose to glucose and galactose by β-galactosidase is also assumed to
follow Michaelis-Menten kinetics.
MODELING THE LAC OPERON PATHWAY IN COPASI
MATHEMATICAL MODEL
In this section we report the mathematical model that describes the assumed fluxes
and kinetics. Reaction (1) describes the Glucose flux towards the cytoplasm.
The equation that describes the flux is
$$ V_{t,Glu} = k_{t,Glu} \left( \frac{[Glu_{ext}]}{[Glu_{ext}] + K_{t,Glu}} \right) $$
where kt,Glu is the glucose transport rate constant and Kt,Glu is the saturation constant for
glucose transport [7 – equation (1)].
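The saturation behaviour of the glucose transport flux can be illustrated with a short Python sketch. The function name and the numerical values are hypothetical and are not taken from the parameter tables of [7].

```python
def glucose_transport_rate(glu_ext, k_t_glu, K_t_glu):
    """V_t,Glu = k_t,Glu * [Glu_ext] / ([Glu_ext] + K_t,Glu)."""
    return k_t_glu * glu_ext / (glu_ext + K_t_glu)


# Hypothetical values: when [Glu_ext] equals K_t,Glu the flux is half of k_t,Glu.
half_max = glucose_transport_rate(glu_ext=2.0, k_t_glu=10.0, K_t_glu=2.0)
no_glucose = glucose_transport_rate(glu_ext=0.0, k_t_glu=10.0, K_t_glu=2.0)
print(half_max, no_glucose)
```

The flux vanishes without external glucose and approaches kt,Glu when the external glucose concentration is much larger than Kt,Glu.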
Reaction (2) describes the flux of Lactose into the cell, which depends on the
concentration of permease and on the external glucose, which inhibits lactose transport. The
mathematical expression is
$$ V_{t,Lac} = k_{lac,in} \left\{ \left( \frac{[Lac_{ext}]}{[Lac_{ext}] + K_{t,Lac}} \right) \left( \frac{K_{i,Glu}}{K_{i,Glu} + [Glu_{ext}]} \right) - \left( \frac{[Lac_{int}]}{[Lac_{int}] + K_{t,Lac}/p} \right) \right\} [Perm] $$
[7 – equation (11)] where Ki,Glu is the constant for inhibition of lactose transport by glucose, p is the
cellular density, [Lacext] the extracellular lactose concentration, [Lacint] the intracellular lactose
concentration and [Perm] the lac permease concentration.
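The lactose transport expression combines an influx term, a glucose inhibition term and an efflux term. The Python sketch below evaluates it term by term; the function and argument names are hypothetical and the values are chosen only to show how external glucose reduces the net flux.

```python
def lactose_transport_rate(lac_ext, lac_int, glu_ext, perm,
                           k_lac_in, K_t_lac, K_i_glu, p):
    """Net lactose flux into the cell, following the expression for V_t,Lac."""
    influx = lac_ext / (lac_ext + K_t_lac)       # saturation by external lactose
    inhibition = K_i_glu / (K_i_glu + glu_ext)   # inhibition by external glucose
    efflux = lac_int / (lac_int + K_t_lac / p)   # back-flux of internal lactose
    return k_lac_in * (influx * inhibition - efflux) * perm


# Hypothetical values: same conditions without and with external glucose.
without_glucose = lactose_transport_rate(1.0, 0.0, 0.0, 2.0, 4.0, 1.0, 1.0, 1.0)
with_glucose = lactose_transport_rate(1.0, 0.0, 1.0, 2.0, 4.0, 1.0, 1.0, 1.0)
print(without_glucose, with_glucose)
```

With these numbers, adding external glucose at a concentration equal to Ki,Glu halves the net lactose uptake, which is the inducer exclusion effect that the equation encodes.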
The production of cAMP (3) is defined by the following equation
$$ V_{cAMP} = \frac{k_{cAMP}}{p} \left( \frac{K_{a,cAMP}}{[Glu_{ext}] + K_{a,cAMP}} \right) $$
[7 – equation (2a)] where Ka,cAMP is the inhibition constant for the effect of glucose on
cAMP synthesis, kcAMP is the cAMP synthesis rate constant and p is the cellular density.
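The inhibition of cAMP synthesis by external glucose can be checked numerically. In this Python sketch the function name and all values are hypothetical; only the form of the expression comes from the equation above.

```python
def camp_synthesis_rate(glu_ext, k_camp, K_a_camp, p):
    """V_cAMP = (k_cAMP / p) * K_a,cAMP / ([Glu_ext] + K_a,cAMP)."""
    return (k_camp / p) * K_a_camp / (glu_ext + K_a_camp)


# Hypothetical values: the rate is maximal without glucose and is halved
# when [Glu_ext] equals the inhibition constant K_a,cAMP.
max_rate = camp_synthesis_rate(glu_ext=0.0, k_camp=6.0, K_a_camp=2.0, p=3.0)
half_rate = camp_synthesis_rate(glu_ext=2.0, k_camp=6.0, K_a_camp=2.0, p=3.0)
print(max_rate, half_rate)
```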
The rate of synthesis of the lac mRNA-ZYA, referring to
reaction (4), is described by
$$ V_{mRNA-ZYA} = k_{mRNA-ZYA} \, \eta_1 \eta_2 \eta_3 \, [G] $$
[7 – equation (3)] where kmRNA-ZYA is the transcription rate constant including the non-productive
promoter, [G] is the concentration of the lac operon gene, and η1, η2 and η3 are the
transcription efficiency factors that describe
the transcriptional control by RNA-Polymerase, catabolite repression and the repressor,
respectively.
The first efficiency factor acts at the level of the initiation complex of RNA-Polymerase with the
specific sigma subunit and the productive promoter. The factor is equal to the fraction of
total promoters occupied by the RNA-Polymerase holoenzyme
$$ \eta_1 = \frac{[RNAP:\sigma:P]}{[P]} $$
[7 – equation (4)] where [RNAP:σ:P] is the concentration of the complex of the
RNA-Polymerase holoenzyme (RNAP:σ) with the promoter and [P] is the total concentration of
promoters.
The second efficiency factor describes the enhancement of transcription initiation by the
binding of the CRP-cAMP complex to its binding site near the promoter and is equal to the
fraction of total binding sites occupied by the CRP-cAMP complex
$$ \eta_2 = \frac{[CRP:cAMP:E]}{[E]} $$
[7 – equation (5)] where [CRP:cAMP:E] is the concentration of CRP:cAMP bound to its
binding site in the lac operon (E) and [E] is the total concentration of these sites.
The third efficiency factor describes the inhibition of transcription by the binding of the lac
repressor protein to one of the three operator sites near the lac promoters and the
derepression of transcription by the binding of allolactose to the repressor. It is considered
that the binding of the repressor to two of the operator sites is necessary for tight repression
of transcription and that DNA looping between the two operators increases the local
repressor concentration and stimulates binding of the repressor to multiple operators.
$$ \eta_3 = \left( \frac{[O1_f]}{[O1]} \right) \left( \frac{[O2_f]}{[O2]} \right) \left( \frac{[O3_f]}{[O3]} \right) $$
[7 – equation (6)] where [O1f] is the concentration of free operator 1 and [O1] is the total
concentration of operator 1.
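The way the three efficiency factors combine into the transcription rate of the lac mRNA-ZYA can be sketched in Python as follows. The helper names and the numerical values are hypothetical; the structure (three fractions multiplied by a rate constant and a gene concentration) follows the equations above.

```python
def fraction(part, total):
    """Ratio part/total, the common form of eta1, eta2 and each term of eta3."""
    return part / total


def eta3(free_operators, total_operators):
    """Product of the free fractions of the three operator sites."""
    result = 1.0
    for free, total in zip(free_operators, total_operators):
        result *= fraction(free, total)
    return result


def transcription_rate(k_mrna_zya, eta1, eta2, eta3_value, gene):
    """V_mRNA-ZYA = k_mRNA-ZYA * eta1 * eta2 * eta3 * [G]."""
    return k_mrna_zya * eta1 * eta2 * eta3_value * gene


# Hypothetical concentrations, chosen to give simple fractions.
e1 = fraction(1.0, 2.0)                       # promoters occupied by RNAP
e2 = fraction(1.0, 4.0)                       # CRP:cAMP sites occupied
e3 = eta3([1.0, 1.0, 2.0], [2.0, 2.0, 2.0])   # free operator fractions
rate = transcription_rate(16.0, e1, e2, e3, 1.0)
print(e1, e2, e3, rate)
```

Because the factors are multiplied, transcription is strongly reduced as soon as any one of them is small, for instance when the repressor occupies the operators.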
The efficiency factors are entered in the “Global Quantities” section of the
program and their values are set by “assignment”, in contrast with all the V terms,
which were entered as functions that determine the kinetics of the reactions.
The production of the enzyme β-galactosidase is assumed to be proportional to the
concentration of lac-ZYA mRNA, referring to reaction (9)
$$ V_{\beta gal} = k_{\beta gal} [mRNA_{ZYA}] $$
[7 – equation (7)] where kβgal is the translation rate constant. The equation for the synthesis
of permease is quite similar, referring to reaction (8)
$$ V_{Perm} = k_{Perm} [mRNA_{ZYA}] $$
[7 – equation (8)] where kPerm is also a translation rate constant.
In contrast to the lacZYA promoter, the repressor gene is constitutively expressed. The
rate expression of reaction (5) is
$$ V_{mRNA-Rep} = k_{mRNA-Rep} \, \eta_1 [G_R] $$
[7 – equation (9)] where kmRNA-Rep is the transcription rate constant and [GR] is the repressor
gene concentration. The rate of translation depends on the repressor mRNA
concentration [mRNARep], referring to reaction (10)
$$ V_{Rep} = k_{Rep} [mRNA_{Rep}] $$
[7 – equation (10)] where kRep is the translation rate constant.
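The balance between translation and protein degradation can be illustrated with a small Python sketch. It assumes that degradation is first order in the protein concentration with a single decay rate constant; that form, the function name and the values are assumptions made for illustration, and the explicit Euler integration is not the solver used by COPASI.

```python
def simulate_protein(k_translation, k_decay, mrna, protein0, dt, steps):
    """Explicit Euler integration of d[P]/dt = k_translation*[mRNA] - k_decay*[P]."""
    protein = protein0
    for _ in range(steps):
        synthesis = k_translation * mrna   # e.g. V_Rep = k_Rep * [mRNA_Rep]
        decay = k_decay * protein          # assumed first-order decay
        protein += dt * (synthesis - decay)
    return protein


# Hypothetical values with a constant mRNA level.
k_translation, k_decay, mrna = 2.0, 0.5, 1.0
steady_state = k_translation * mrna / k_decay
final = simulate_protein(k_translation, k_decay, mrna,
                         protein0=0.0, dt=0.1, steps=1000)
print(steady_state, final)
```

With a constant mRNA level the protein concentration relaxes towards the ratio of synthesis to decay, which is the kind of plateau looked for in the long stabilization phase of the simulations.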
Regarding the degradation of lactose once inside the cell, there are three kinetic
expressions, for reactions (14), (15) and (16). In reaction (14) lactose is
converted to Allolactose, described by
$$ V_{Lac-Allo} = k_{Lac-Allo} \left( \frac{[Lac_{int}]}{[Lac_{int}] + K_{m,Lac}/p} \right) [\beta gal] $$
[7 – equation (12)] where [βgal] is the β-galactosidase concentration, Km,Lac is the saturation
constant for lactose transformation, and kLac-Allo is the rate constant for transformation of
lactose to allolactose per mole of enzyme.
Reaction (15) describes the conversion of Allolactose to Glucose and Galactose.
It is assumed that the Glucose is immediately phosphorylated to Glucose-6-phosphate. It is
not known exactly which pathway the cell uses to consume the glucose formed from
lactose and allolactose; in our model we therefore consider only the possibility of
instantaneous phosphorylation, but the model could be extended with more pathways.
$$ V_{cat,Allo} = k_{cat,Allo} \left( \frac{[Allo]}{[Allo] + K_{m,Allo}/p} \right) [\beta gal] $$
[7 – equation (14)] where [Allo] is the intracellular allolactose concentration, Km,Allo is the
saturation constant for allolactose degradation, and kcat,Allo is the rate constant for hydrolysis of
allolactose per mole of enzyme.
Reaction (16) considers the possibility that lactose is directly converted to glucose
and galactose by β-galactosidase. The equation is
$$ V_{cat,Lac} = k_{cat,Lac} \left( \frac{[Lac_{int}]}{[Lac_{int}] + K_{m,Lac}/p} \right) [\beta gal] $$
[7 – equation (13)] where Km,Lac is the saturation constant for lactose degradation and kcat,Lac is the
rate constant for transformation of lactose to glucose and galactose.
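The three β-galactosidase reactions (14), (15) and (16) share the same functional form and differ only in substrate and constants. A minimal Python sketch of that shared form is given below; the function name and the numerical values are hypothetical.

```python
def beta_gal_rate(k, substrate, K_m, p, beta_gal):
    """Shared form: V = k * [S] / ([S] + K_m / p) * [beta-gal]."""
    return k * substrate / (substrate + K_m / p) * beta_gal


# Hypothetical values: one call per reaction, changing substrate and constants.
lac_int, allo, enzyme, p = 1.0, 1.0, 3.0, 2.0
v_lac_allo = beta_gal_rate(6.0, lac_int, 2.0, p, enzyme)   # reaction (14)
v_cat_allo = beta_gal_rate(4.0, allo, 2.0, p, enzyme)      # reaction (15)
v_cat_lac = beta_gal_rate(2.0, lac_int, 2.0, p, enzyme)    # reaction (16)

# Total consumption of internal lactose through the two parallel routes.
lactose_consumption = v_lac_allo + v_cat_lac
print(v_lac_allo, v_cat_allo, v_cat_lac, lactose_consumption)
```

Writing the rate once and reusing it mirrors the way a single rate law can be selected for several reactions in COPASI.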
MODELING THE LAC OPERON PATHWAY IN COPASI
ASSUMPTIONS AND LIMITATIONS OF THE MODEL
To run the simulation, we had to make some assumptions that could limit the correctness
of the model. The first limiting factor is the uncertainty about all the reactions involved in the
regulation; the exclusion of some important variables from the model could be another.
The lac operon is one of the best-studied models of gene expression regulation, and we built
the model by integrating the information from [5] and [7]. Other mechanisms, such as the
competition for lactose transport by the non-metabolizable compound thiomethyl galactoside [11],
are not considered in this model, because we assumed that there are no compounds in the
extracellular environment other than Glucose and lactose. Furthermore, it is possible that we did
not include some recent work demonstrating that other compounds are involved in
catabolite repression and inducer exclusion [12]; this could be a hint for
future revisions and improvements of the model.
The simulations made by the authors of [7] consider multiple parallel pathways for the
same compound in different models. In our model, parallel pathways are considered only
for the reactions of the internal lactose, which can be converted to allolactose and then
hydrolysed to glucose and galactose (two-step reaction), or hydrolysed directly to glucose and
galactose without passing through the intermediate state of allolactose (one-step reaction).
Another assumption that could make the simulation model partially incomplete is that “all
the glucose produced from lactose is directly converted into glucose-6-phosphate by
hexokinase and made available for glycolysis”. However, this is not the focus of this
work, and we consider this aspect irrelevant because we are not interested in the way
glucose is metabolized. The authors of [7] combine the model of direct phosphorylation
with the secretion of the produced glucose into the extracellular matrix by the cell, to be
taken up again afterwards.
In our model we did not consider exponential bacterial growth, so we do not account for
the exponentially growing demand for, and oxidation of, glucose and lactose by the total number
of cells over time. We focus only on the molecular regulation processes,
so we can assume that the totality of cells present in the culture equals 1 unit of
biomass, expressed in grams of Dry Cell Weight (g DCW). Some constants in [7] are
expressed as a function of this parameter; in our simulations we assumed that the
total biomass is equal to 1 and that the culture medium contains a factor
that prevents cell proliferation. In this way we can simulate the behaviour of a single
group of cells that does not proliferate, as if the medium contained an antimitotic agent such as
5’-deoxy-nucleotides, which prevent the synthesis of new DNA for cell division.
Considering the experimental cellular density parameter p from reference [7], the
total cell number in the simulation is 300 = 1 g DCW.
In our model we did not make glucose transport dependent on the quantity of
glucose transporter proteins (GLUTs) present on the plasma membrane, because it would make
the model more complex in a way that does not serve the objectives of this thesis. Moreover,
we assume a constant oxidation of G6P by the cell (G6P DECAY reaction in Figure 6),
whereas in a real system the rate of oxidation is variable and depends on the needs of the cell. The
oxidation of G6P is equal to 5*10^-5 mol/min, almost equal to the efflux rate constant
for glucose in the constants tables reported in [7].
Furthermore, as we said above, we did not consider limiting concentrations of the basic compounds
that are fundamental for cell metabolism and common functions, such as amino acids,
ribonucleotides, ATP and other energy-carrying molecules.
CHAPTER V: SIMULATIONS, RESULTS AND CONCLUSION
In this chapter we present the simulation time parameters and the initial conditions. We
report the list of injection events for each of the simulations and ideas for future
simulations. We then show the main result graphs for each of our simulation
conditions, discuss them and give the conclusions of this thesis.
SIMULATIONS
TIME PARAMETERS AND SIMULATION RESOLUTION
In this section we describe how to set the simulation parameters in COPASI. In Figure 12, we
report the COPASI screenshot of the form to complete before starting the simulation.
Figure 12 – COPASI Time Course branch screenshot
For our runs we use a duration of 501000 minutes. The program automatically
calculates the number of intervals. We chose a long duration so that
the system can reach a relatively stable state during the simulation; we assumed that
500000 minutes are enough to reach relatively stable concentrations of the lac Operon
basal metabolism proteins. Looking at the free operator, mRNA and protein levels in
Graphs group I, at minute 500000 we have a “good” degree of stability.
We therefore consider minute 500000 as time 0 (zero) of the experiment.
SIMULATIONS
INITIAL CONDITIONS
We set the initial conditions (concentrations) of the simulations to those reported in [7].
The only values we assumed are the initial extracellular glucose concentration, set to 4*10^-6,
and the intracellular glucose (G6P) concentration, set to the same value. In Figure 13, we report
the COPASI screenshot of the initial conditions of our model.
Figure 13 - A screenshot of COPASI species initial conditions of the simulations
SIMULATIONS
SCHEDULED EVENTS
We specify an event to inject lactose into our model during the simulation. We ran 7
simulations, each with different concentrations of injected glucose and lactose. In Figure
14 we report the glucose and lactose concentrations injected in each run. In the
first simulation neither glucose nor lactose was injected. From simulation 2 to simulation
5, only lactose was injected, at concentrations ranging from 0.000000004 to
0.04 moles. In simulation 6 an injection of a mix of glucose and lactose is scheduled, while
the last one (Simulation 7) considers only the injection of glucose, to check for any
inconsistent reaction in the absence of lactose. With these events, we can observe
the behaviour of the model and verify its consistency with the theoretical model.
Figure 14 – The injections are listed in the scheme
SIMULATIONS
INTEGRATION - IDEAS FOR FUTURE SIMULATIONS
Other events could be scheduled, for example multiple injections of the same
quantity of lactose, to verify whether the system (the cells) increases or decreases the
efficiency of lactose metabolism under periodic injections.
SIMULATIONS (GRAPHS GROUP I)
SIMULATION 1
Graph I-1 shows the cAMP levels along the simulation. Graph I-2 shows the external
glucose (blue line), external lactose (green line) and G6P (red line) concentrations. Graph I-3
shows the allolactose concentration (red line) and the internal lactose concentration (blue
line). Graph I-4 shows the levels of Repressor mRNA (red line) and ZYA mRNA (blue line).
Graph I-5 shows the free DNA operator binding sites along the simulation. Graph I-6 shows the
protein levels: Repressor (green line), β-galactosidase (red line) and Permease (blue line).
Graph I-7 shows the three transcription efficiency factors η1 (red line), η2 (blue
line) and η3 (green line).
SIMULATIONS (GRAPHS GROUP II)
SIMULATION 2
In simulation number 2, graph II-1 shows the external glucose (blue line), external
lactose (green line) and G6P (red line). Graph II-2 shows the allolactose level (red line)
and the internal lactose concentration (blue line). Graph II-3 shows the free DNA operator
binding sites and graph II-4 the levels of the proteins Repressor (green line),
β-galactosidase (red line) and Permease (blue line).
SIMULATIONS (GRAPHS GROUP III)
SIMULATION 3
In simulation number 3, the first graph (III-1) shows the external glucose (blue
line), external lactose (green line) and G6P (red line). Graph III-2 shows the allolactose
concentration (red line) and the internal lactose concentration (blue line). Graph III-3 shows the
free DNA operator binding sites and graph III-4 the levels of Repressor protein (green line),
β-galactosidase (red line) and Permease (blue line).
SIMULATIONS (GRAPHS GROUP IV)
SIMULATION 4
In simulation 4 we again have the molecule graphs (IV-1 and IV-2), the free
operator trend (IV-3) and the protein concentrations along the simulation (IV-4). In
IV-4 the Repressor is the green line, β-galactosidase the red line and Permease the blue
line.
SIMULATIONS (GRAPHS GROUP V)
SIMULATION 5
In simulation number 5, graph V-1 shows the external glucose (blue line), the
external lactose (green line) and G6P (red line). Graph V-2 shows the allolactose (red line) and
the internal lactose (blue line). Graph V-3 shows the levels of the three operator sites
and graph V-4 the protein concentrations: Repressor (green line), β-galactosidase (red line) and
Permease (blue line).
SIMULATIONS (GRAPHS GROUP VI)
SIMULATION 6
In simulation number 6, graph VI-1 shows the external glucose (blue line),
external lactose (green line) and G6P (red line). Graph VI-2 shows the external lactose trend
(green line), which cannot be appreciated in graph VI-1 because of its low concentration relative
to the external glucose. Graph VI-3 shows the allolactose (red line) and the internal lactose (blue
line). Graph VI-4 describes the protein trends: Repressor (green line), β-galactosidase (red
line) and Permease (blue line). The last graph, VI-5, shows the levels of the free operator
binding sites.
SIMULATIONS (GRAPHS GROUP VII)
SIMULATION 7
In this series of graphs, VII-1 shows the cAMP level. VII-2 shows the external glucose
(blue line), external lactose (green line) and G6P (red line). VII-3 shows the levels of
allolactose (red line) and internal lactose (blue line). VII-4 shows the protein levels:
Repressor (green line), β-galactosidase (red line) and Permease (blue line). Graph VII-5
shows the levels of the free operator DNA binding sites, and VII-6 shows the Transcription
Efficiency Factors: η1 (red line), η2 (blue line) and η3 (green line).
RESULTS
The first graph group (Graphs group I) shows all the parameters of the simulation during
the stabilization phase, which we assume to last 500000 minutes. Running the simulation
over a long time is an alternative way to find the steady state of the system: we let the
values stabilize over a long duration, and the results that we observe lie between minute
500000 and minute 501000, an interval of 1000 minutes. We chose this way of setting the
steady state because the steady-state option found a stable state with some negative
concentrations, which are unrealistic from the outset. After the initial phase
(100000-200000 minutes) the levels of the proteins, mRNA and molecules reach their minimum
values. We therefore treat everything before minute 500000 as initial conditions.
After this, in
simulation 2 (Graphs group II) the first quantity of lactose was injected (see
Figure 11). The cell response is delayed by more than 900 minutes before the free operator
levels increase and protein synthesis rises significantly. The delay seems to be caused by
the very slow transport of external lactose into the cell at such a small concentration.
In graph group III the injected quantity was larger and the cell response was
significantly faster: the levels of intracellular lactose and allolactose, and consequently
of the free operators, increase in less than 100 minutes thanks to the larger lactose
gradient and the low level of external glucose, which does not interfere with lactose
transport (inducer exclusion). Simulation
4 shows a shorter response time than simulation 3, yet its protein production peak is
only about twice that of the system with a lactose concentration 100 times smaller. The
difference arises because in simulation 3 the quantity of allolactose does not saturate
all the repressor bound to the DNA: the free levels of Operator1 (O1) and Operator2 (O2)
do not reach their maximum, so a small amount of repression remains. In simulation 4,
instead, the repressors are completely bound by the larger quantity of allolactose, so the
free operator DNA binding sites reach their maximum concentrations. This describes the
situation in which all the DNA binding sites are free of repressors, binding of the
[RNAP:σ] complex to the promoter is enhanced and transcription proceeds at its maximum
rate. In simulation 5 the concentration of injected lactose was enough to saturate the cell
regulation mechanisms for a longer time, until all the lactose is converted into
glucose-6-phosphate (G6P) and galactose. For as long as allolactose is plentiful, the
cells continue to produce the enzymes needed for its catabolism and the free operator
levels remain at their highest. A different approach is taken in simulation 6,
where we reproduce the lactose injection of simulation 4 but add 0.4 moles of glucose in
the same injection. As we can observe (Simulation 6, Graphs group VI), the cells respond
with a longer delay than in simulation 4, caused by the inducer exclusion exerted by
glucose on lactose transport. The theoretical model suggests that when glucose is present
lactose metabolism should be inhibited, but in this case the simulation model indicates
lower gene activity and a delayed cell response rather than complete repression. Indeed,
the lac proteins start to increase later and at a slower rate than in simulation 4, while
the energy substrate glucose is consumed. The
theoretical model does not specify the threshold glucose concentration at which lac
repression starts to weaken. Some works, such as [12], indicate that another compound
(glucose-6-phosphate) is involved in the inducer exclusion phenomenon. In our model the
only compound that exerts inducer exclusion is the external glucose, not glucose in all
its forms. This means that the program distinguishes between external glucose, internal
glucose and G6P: if the external glucose is low but the internal G6P is high, inducer
exclusion is less effective and the lac enzyme concentrations rise earlier than we
expected. To test whether there are fundamental functional errors in the model, we ran
another simulation in which only 0.04 moles of glucose are injected, to see whether the
system starts to produce lac proteins. Simulation 7 (Graphs group VII) shows the results.
The cAMP level falls when the external glucose is added and rises again once the external
glucose is consumed, which is consistent with the theoretical model. The lac protein
levels remain stable at the basal expression concentration and the operators are tightly
repressed. In the TEFs graph (VII-6) the effects of this tight repression appear as the
low values of η2 and η3.
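
The stabilization run described at the start of this section can be illustrated with a
minimal Python sketch. It is not the COPASI model: it uses a hypothetical one-species
system, dX/dt = k_syn - k_loss * X^2, with invented constants, chosen only because its
steady-state equation has one positive and one negative root. A generic root finder
(Newton's method is used here purely for illustration, not as a description of how COPASI
works internally) can land on the negative root, whereas a long time course started from a
positive concentration settles on the positive one.

```python
# Toy illustration only (assumed system, not the thesis model):
# one species X with constant synthesis and second-order loss.

K_SYN = 4.0    # hypothetical synthesis rate
K_LOSS = 1.0   # hypothetical second-order loss constant


def rate(x):
    """Net rate of change of X: dX/dt = K_SYN - K_LOSS * X**2."""
    return K_SYN - K_LOSS * x * x


def newton_steady_state(x0, steps=50):
    """Solve rate(x) = 0 with Newton's method, with no sign constraint on x."""
    x = x0
    for _ in range(steps):
        slope = -2.0 * K_LOSS * x      # derivative of rate with respect to x
        x = x - rate(x) / slope
    return x


def long_run_steady_state(x0, dt=0.01, n_steps=5000):
    """Integrate forward in time (explicit Euler) and return the final value."""
    x = x0
    for _ in range(n_steps):
        x = x + dt * rate(x)
    return x


# A poor initial guess sends the root finder to the unphysical negative root.
root_from_bad_guess = newton_steady_state(-1.0)

# A long time course from a positive start settles on the positive root.
root_from_time_course = long_run_steady_state(0.1)
```

With these assumed constants the two roots are -2 and +2: the Newton iteration started at
-1 converges to -2, while the time course started at 0.1 converges to +2.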
In each of the simulations, as soon as the allolactose disappears the free operator
levels fall, and the protein levels fall too, because degradation then becomes faster
than synthesis.
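
The balance between synthesis and degradation can be sketched with a toy model. The
following Python example is an illustration under stated assumptions, not the thesis
model: a single protein is synthesised at a hypothetical basal rate plus an induced rate
while an inducer is present, and is degraded with first-order kinetics throughout. All
constants and the inducer cut-off time are invented for the example.

```python
# Toy model (assumed, not the thesis model): induced synthesis stops when the
# inducer runs out, while first-order degradation continues.

K_BASAL = 0.01       # hypothetical basal synthesis rate
K_INDUCED = 1.0      # hypothetical extra synthesis rate while inducer is present
K_DEG = 0.05         # hypothetical first-order degradation constant
INDUCER_END = 100.0  # hypothetical time at which the inducer is used up


def simulate(t_end=300.0, dt=0.1):
    """Return (times, protein levels) from explicit Euler integration."""
    n_steps = round(t_end / dt)
    p = K_BASAL / K_DEG            # start at the basal steady state
    times, levels = [0.0], [p]
    for i in range(n_steps):
        t = i * dt
        synthesis = K_BASAL + (K_INDUCED if t < INDUCER_END else 0.0)
        degradation = K_DEG * p
        p += dt * (synthesis - degradation)
        times.append((i + 1) * dt)
        levels.append(p)
    return times, levels


times, levels = simulate()

# The protein level peaks when the inducer disappears, then decays
# back towards the basal steady state K_BASAL / K_DEG.
peak_index = max(range(len(levels)), key=levels.__getitem__)
peak_time = times[peak_index]
```

In this sketch the protein level rises while the inducer is present, peaks at about the
moment the inducer is exhausted, and then relaxes back towards the basal level because
degradation outweighs the remaining basal synthesis.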
We recognise that a small number of simulations is not enough to evaluate the complex
behaviour of the model: the larger the number of simulations, the richer the results. A
parameter sensitivity analysis is beyond the purpose of this work, which is limited to
reproducing the metabolic regulation of the lac operon with the software COPASI and does
not extend to a deeper functional analysis.
CONCLUSIONS
Bioinformatics offers the possibility of performing virtual experiments in many fields,
reducing the costs of experimental science; however, it cannot replace it. The two fields
should be integrated. Experimental science produces the raw data for the computing
sciences, which can analyse them, build simulation models and test those models in a
variety of artificial conditions where the experimental sciences might face difficulties.
The advantages of computing science are the speed with which it obtains predictions and
tests models, and the help it gives experimental science in orienting research. The
quality of a model reflects the scientific knowledge of the specific system or pathway
analysed, the quality of the data, and the correct selection of essential and
non-essential variables in the isolated model. Through the
COPASI software, used by the German Network for Bioinformatics Infrastructure, we tried
to reproduce lac operon regulation by combining the mathematical model developed with the
reaction network dynamics of a real cell. The model consists of 46 reactions, 46
biochemical species and 69 global quantities. We applied the kinetic equations and
parameters assumed by [7] in our model, integrating them with a reaction-oriented model.
The simulations performed with different injected quantities of lactose, and then of
glucose, show how the cells respond to the presence of a new external food source. The
simulation with glucose only shows that, in the absence of lactose, the system responds
consistently with the theoretical model, without expression of lac proteins or lac mRNA
above the basal level, which indicates that there are no fundamental functional errors.
However, the model does
not consider all the effects of the inducer exclusion exerted by G6P on lactose transport
(Permease), but only the inducer exclusion by external glucose. As a result, when the
internal glucose-6-phosphate (G6P) concentration is high and the external glucose is low
while lactose is present in the environment, the cell starts to absorb lactose and to
produce the enzymes earlier than we expected. Different quantities of injected lactose
lead the system to different trends in lac gene translation and protein production. Many
other tests could be run on the model and its parameters, but they were not among the
objectives of this work. The model does not consider interactions with related cell
pathways and makes some generic assumptions to avoid overcomplicating it. We recall that
the values assumed for the kinetic models were taken from [7]. Creating and simulating the
model raised a specific question about the regulation dynamics: “Is there a threshold
glucose concentration, in the presence of lactose, at which the cell starts to
significantly increase the transcription of the lac genes?” In future, the model could be
improved and shared on the university platform to help students understand the dynamics of
lac operon regulation, run simulations under any desired virtual condition, and
complement their academic knowledge with an integrative tool.
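
One way to explore a threshold question of this kind numerically is a parameter scan. The
Python sketch below illustrates the idea on the lactose transport step alone. It assumes
a Michaelis-Menten uptake term multiplied by a simple inhibition factor for external
glucose; this rate law, the function names and all constants are hypothetical and are not
taken from [7] or from the COPASI model. The threshold is defined here, arbitrarily, as the
glucose concentration at which uptake falls to half of its uninhibited value.

```python
# Hypothetical transport law, for illustration only:
# uptake = v_max * L / (k_lac + L) * k_inh / (k_inh + G)

def lactose_uptake(lactose_ext, glucose_ext, v_max=1.0, k_lac=0.5, k_inh=0.1):
    """Lactose uptake rate reduced by external glucose (assumed inducer exclusion term)."""
    saturation = lactose_ext / (k_lac + lactose_ext)
    exclusion = k_inh / (k_inh + glucose_ext)
    return v_max * saturation * exclusion


def glucose_threshold(lactose_ext, fraction=0.5, g_max=10.0, tol=1e-9):
    """Bisect for the glucose level at which uptake drops to `fraction` of its
    glucose-free value. Assumes uptake decreases monotonically with glucose and
    that the threshold lies between 0 and g_max."""
    target = fraction * lactose_uptake(lactose_ext, 0.0)
    low, high = 0.0, g_max
    while high - low > tol:
        mid = 0.5 * (low + high)
        if lactose_uptake(lactose_ext, mid) > target:
            low = mid      # still above target: threshold is at higher glucose
        else:
            high = mid     # at or below target: threshold is at lower glucose
    return 0.5 * (low + high)


threshold = glucose_threshold(lactose_ext=1.0)
```

For this assumed rate law the half-inhibition threshold equals k_inh, whatever the lactose
concentration. In the full model the answer would also depend on cAMP, the repressor and
the other regulated steps, so the same scan would have to be run on the complete
simulation rather than on an isolated rate expression.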
REFERENCES
[1] – Biochemical system Simulator. Web address: http://copasi.org/
[2] – The Systems Biology Markup Language. Web address: http://sbml.org/
[3] – German Network for Bioinformatics Infrastructure. Web address:
http://www.denbi.de/
[4] – Glycolysis. Hexokinase. Web address: http://glycolysis.co.uk/hexokinase.php
[5] – James D. Watson, Tania A. Baker, Alexander Gann, Michael Levine, Richard Losick |
Molecular Biology of the Gene, 7th edition | ISBN: 9780321762436
[6] – Operon. Web address: https://en.wikipedia.org/wiki/Operon
[7] – Patrick Wong, Stephanie Gladney, J.D. Keasling. Mathematical Model of the lac
Operon: Inducer Exclusion, Catabolite Repression, and Diauxic Growth on Glucose and
Lactose. Biotechnol. Prog. 1997, 13, 132-143.
[8] – Cell Biology by the Numbers. Web address:
http://book.bionumbers.org/how-big-is-an-e-coli-cell-and-what-is-its-mass/
[9] – Muhammad Kamran Taj, Zohra Samreen, Ji Xiu Ling, Imran Taj, Taj Muhammad
Hassani and Wei Yunlin. Escherichia Coli as a model organism. Int. J. Engg. Res. Sci.
Tech. 2014| ISSN 2319-5991| Vol.3, No. 2, May 2014 IJERST
[10] – Jin Xiong. Essential Bioinformatics. Cambridge University Press, 2006 | ISBN-10:
0-521-60082-0
[11] – Orlando Diaz-Hernandez and Moisés Santillan. Bistable behaviour of the lac operon
in E. coli when induced with a mixture of lactose and TMG. Raina Robeva, Sweet Briar
College, USA – July 2010, Volume I, Article 22. Doi: 10.3389/fphys.2010.00022
[12] – Boris M. Hogema, Jos C. Arents, Rechien Bader, Kevin Eijkemans, Toshifumi
Inada, Hiroji Aiba and Pieter W. Postma. Inducer exclusion by glucose 6-phosphate in
Escherichia coli. Blackwell Science Ltd, Molecular Microbiology (1998), 28(4), 755-765.
Doi: 10.1046/j.1365-2958.1998.00833