neutral graph of regulatory boolean networks using evolutionary computation

8
Neutral graph of regulatory Boolean networks using evolutionary computation Gonzalo A. Ruz Facultad de Ingenier´ ıa y Ciencias Universidad Adolfo Ib´ nez Av. Diagonal Las Torres 2640, Santiago, Chile [email protected] Eric Goles Facultad de Ingenier´ ıa y Ciencias Universidad Adolfo Ib´ nez Av. Diagonal Las Torres 2640, Santiago, Chile [email protected] Abstract—An evolution strategy is proposed to construct neu- tral graphs. The proposed method is applied to the construction of the neutral graph of Boolean regulatory networks that share the same state sequences of the cell cycle of the fission yeast. The regulatory networks in the neutral graph are analyzed, identifying characteristics of the networks which belong to the connected component of the fission yeast cell cycle network and the regulatory networks that are not in the connected component. Results show not only topological differences, but also differences in the state space between networks in the connected component and the rest of the networks in the neutral graph. It was found that regulatory networks in the fission yeast cell cycle network connected component can be mutated (change in their interaction matrices) no more than three times, if more mutations occur, then the networks leave the connected component. Comparisons with a standard genetic algorithm shows the effectiveness of the proposed evolution strategy. I. I NTRODUCTION A neutral graph (also known as a neutral network) [1]– [3] is an undirected graph where each node represents a regulatory network, each one of these share a certain sequence in the state space. If two nodes are connected in the neutral graph this means that the Hamming distance (number of entries in which two adjacency matrices differ) between the interaction (adjacency) matrix of one network and the other is one. The connectivity of a neutral graph can give an insight of the topological robustness of the regulatory networks. A neutral graph with large connected components (nodes) can be considered of having high robustness whereas a neutral graph with many small connected components (or disconnected) can be considered of having low robustness [1]. Amongst the many mathematical models for gene regu- latory networks, such as: probabilistic Boolean networks [4], petri nets [5], Bayesian networks [6], recurrent neural networks [7], differential equations [8], linear regression [9], etc., one of the simplest is the Boolean network introduced by Stuart Kauffman [10]. In this model, nodes represent genes that can be either active (value 1) or inactive (value 0), the edges of the network represent regulatory relations, and the dynamics of the network (how the nodes change their values) is governed by local Boolean functions. Using a discrete time clock, the network can update its values using different schemes [11]. The most common is the synchronous, also known as the parallel: where in each time step, every node is updated at the same time, the sequential: where in every time step, every node is updated in a defined sequence, and a mixture of the previous two, the block-sequential: the set of nodes, for a given sequence, is partitioned into blocks, the nodes in a same block are updated in parallel, but blocks follow each other sequentially. These are all deterministic updating schemes, and in general it is known that there are an exponential number of updating schemes [12]. Of course we can have non-deterministic updating schemes, like the asynchronous, where in each time step nodes are selected randomly to update their values. An interesting characteristic of Boolean networks are the attractors, which are the states of the network where each of the possible 2 n states, where n is the number of nodes in the network, converge after some updates of the network. Kauffman associated the attractors to different cell types [10]. For example, in [13], a Boolean network model of the floral morphogenesis in Arabidopsis thaliana is proposed. The Boolean network consists of 12 interacting chemical species, designated by EMF1, TFL1, LFY, AP1, CAL, LUG, UFO, BFU, AG, AP3, PI, and SUP. The BFU species, is a dimer of the AP3 and PI proteins, all the rest are proteins as well. This model contains 6 fixed points attractors : 1) (000100000000), 2) (000100010110), 3) (000000001000), 4) (000000011110), 5) (110000000000), 6) (110000010110). Each one has associated a cell type: 1) sepal, 2) petal, 3) carpel, 4) stamen, 5) inflorescence, 6) mutant (unobserved cell). It has also been reported in other works, that attractors have been considered as cancer cells [14]. In this paper, we will analyze the neutral graph of Boolean regulatory networks that have in common a portion of the dynamics of a Boolean model of the cell cycle of the yeast species Schizosaccharomyces pombe (fission yeast) [15]. The cell-cycle involves four phases, G 1 (Gap 1), S (Synthesis), G 2 (Gap 2) and M (Mitosis). In the G 1 phase, cells increase in size. Furthermore, inside this phase there is an important checkpoint, named ”Start point” in the yeast. G 1 checkpoint makes the key decision whether the cell should divide (enter to S phase), delay division, or enter a resting stage. This decision will depend on the environmental conditions. In the S phase, DNA replication occurs, in order to duplicate the genetic material. During this phase, synthesis is completed as quickly as possible due to the exposed base pairs being sensitive to external factors such as any drugs taken or any mutagens. During the gap between DNA synthesis and mitosis, G 2 phase, the cell will continue to grow. The G 2 checkpoint control mechanism ensures that everything is ready to enter the M (Mitosis) phase and divide. Finally, when the cell enters

Upload: independent

Post on 15-Nov-2023

0 views

Category:

Documents


0 download

TRANSCRIPT

Neutral graph of regulatory Boolean networks usingevolutionary computation

Gonzalo A. RuzFacultad de Ingenierıa y Ciencias

Universidad Adolfo IbanezAv. Diagonal Las Torres 2640, Santiago, Chile

[email protected]

Eric GolesFacultad de Ingenierıa y Ciencias

Universidad Adolfo IbanezAv. Diagonal Las Torres 2640, Santiago, Chile

[email protected]

Abstract—An evolution strategy is proposed to construct neu-tral graphs. The proposed method is applied to the constructionof the neutral graph of Boolean regulatory networks that sharethe same state sequences of the cell cycle of the fission yeast.The regulatory networks in the neutral graph are analyzed,identifying characteristics of the networks which belong to theconnected component of the fission yeast cell cycle network andthe regulatory networks that are not in the connected component.Results show not only topological differences, but also differencesin the state space between networks in the connected componentand the rest of the networks in the neutral graph. It was foundthat regulatory networks in the fission yeast cell cycle networkconnected component can be mutated (change in their interactionmatrices) no more than three times, if more mutations occur,then the networks leave the connected component. Comparisonswith a standard genetic algorithm shows the effectiveness of theproposed evolution strategy.

I. INTRODUCTION

A neutral graph (also known as a neutral network) [1]–[3] is an undirected graph where each node represents aregulatory network, each one of these share a certain sequencein the state space. If two nodes are connected in the neutralgraph this means that the Hamming distance (number ofentries in which two adjacency matrices differ) between theinteraction (adjacency) matrix of one network and the other isone. The connectivity of a neutral graph can give an insightof the topological robustness of the regulatory networks. Aneutral graph with large connected components (nodes) can beconsidered of having high robustness whereas a neutral graphwith many small connected components (or disconnected) canbe considered of having low robustness [1].

Amongst the many mathematical models for gene regu-latory networks, such as: probabilistic Boolean networks [4],petri nets [5], Bayesian networks [6], recurrent neural networks[7], differential equations [8], linear regression [9], etc., oneof the simplest is the Boolean network introduced by StuartKauffman [10]. In this model, nodes represent genes that canbe either active (value 1) or inactive (value 0), the edges ofthe network represent regulatory relations, and the dynamics ofthe network (how the nodes change their values) is governedby local Boolean functions. Using a discrete time clock, thenetwork can update its values using different schemes [11].The most common is the synchronous, also known as theparallel: where in each time step, every node is updated atthe same time, the sequential: where in every time step, every

node is updated in a defined sequence, and a mixture of theprevious two, the block-sequential: the set of nodes, for agiven sequence, is partitioned into blocks, the nodes in a sameblock are updated in parallel, but blocks follow each othersequentially. These are all deterministic updating schemes,and in general it is known that there are an exponentialnumber of updating schemes [12]. Of course we can havenon-deterministic updating schemes, like the asynchronous,where in each time step nodes are selected randomly toupdate their values. An interesting characteristic of Booleannetworks are the attractors, which are the states of the networkwhere each of the possible 2n states, where n is the numberof nodes in the network, converge after some updates ofthe network. Kauffman associated the attractors to differentcell types [10]. For example, in [13], a Boolean networkmodel of the floral morphogenesis in Arabidopsis thalianais proposed. The Boolean network consists of 12 interactingchemical species, designated by EMF1, TFL1, LFY, AP1,CAL, LUG, UFO, BFU, AG, AP3, PI, and SUP. The BFUspecies, is a dimer of the AP3 and PI proteins, all the rest areproteins as well. This model contains 6 fixed points attractors: 1) (000100000000), 2) (000100010110), 3) (000000001000),4) (000000011110), 5) (110000000000), 6) (110000010110).Each one has associated a cell type: 1) sepal, 2) petal, 3) carpel,4) stamen, 5) inflorescence, 6) mutant (unobserved cell). It hasalso been reported in other works, that attractors have beenconsidered as cancer cells [14].

In this paper, we will analyze the neutral graph of Booleanregulatory networks that have in common a portion of thedynamics of a Boolean model of the cell cycle of the yeastspecies Schizosaccharomyces pombe (fission yeast) [15]. Thecell-cycle involves four phases, G1 (Gap 1), S (Synthesis),G2 (Gap 2) and M (Mitosis). In the G1 phase, cells increasein size. Furthermore, inside this phase there is an importantcheckpoint, named ”Start point” in the yeast. G1 checkpointmakes the key decision whether the cell should divide (enterto S phase), delay division, or enter a resting stage. Thisdecision will depend on the environmental conditions. In theS phase, DNA replication occurs, in order to duplicate thegenetic material. During this phase, synthesis is completedas quickly as possible due to the exposed base pairs beingsensitive to external factors such as any drugs taken or anymutagens. During the gap between DNA synthesis and mitosis,G2 phase, the cell will continue to grow. The G2 checkpointcontrol mechanism ensures that everything is ready to enter theM (Mitosis) phase and divide. Finally, when the cell enters

into the M phase, the growth of the cell stops and cellularenergy is focused on the orderly division into two daughtercells. A checkpoint in the middle of the mitosis (Metaphasecheckpoint) ensures that the cell is ready to complete the celldivision. After the M phase, the cell goes back to the stationaryG1 phase, waiting for a signal for another round of division.

The study of the dynamics of the fission yeast cell cy-cle network for all the deterministic updating schemes wasconducted in [16]. It was found that there are 15350 differentdynamics, from which 5513 contain only one limit cycle each,with cycle lengths in the range of 2 to 4. Also, the fixed pointwith the largest basin of attraction using the parallel updatingmode, maintains this property for all the different updatingschemes. Therefore, although the network has different limitcycles for different updates, it could be considered robust inthe sense that the basin of attraction for that fixed point for allthe deterministic dynamics does not vary that much.

In order to construct a neutral graph, we need to constructBoolean regulatory networks that have a certain property incommon, this opens the opportunity to use intelligent com-putational techniques, in particular evolutionary computationand related techniques to construct networks with predefinedproperties. Following this line of research, in [17] simulatedannealing was used to learn Boolean networks with prede-fined attractors for sequential updating mode only. The resultsshowed the existence of a power law between the frequency ofthe networks and the number of different updating sequencesthey can have without loosing the attractor. It was found thatthere is a decrease of the network’s robustness as the cyclelength grows. The swarm intelligence technique called the beesalgorithm has been formulated to construct Boolean networkswith predefined attractors [18]. It has been used to buildsynthetic networks of the budding yeast cell-cycle in [19], thatpromote cell proliferation for biotechnological applications.In [20], a reverse engineering technique was applied to thereconstruction of the mammalian cell cycle network using thebinary gene expression data generated by the logical modelin [21]. This reverse engineering method used an informationtheoretic approach combined with a modified version of theoriginal bees algorithm. More recently, extensions to that workhave been presented in [22] where synthetic networks areconstructed that share the same limit cycles (of length seven)and the fixed point of the mammalian cell cycle network.

In this paper we propose an evolution strategy to generatea neutral graph, in particular, we will generate a neutral graphof Boolean regulatory networks that share the same statesequences of the cell cycle of the yeast species Schizosaccha-romyces pombe (fission yeast). We analyze the resulting neutralgraph and compare characteristics of the regulatory networksthat appear in the connected component of the original yeastcell cycle network with the networks that are not in theconnected component. Also we compare the results of theproposes evolution strategy with a standard genetic algorithm.

The paper is organized as follows. Section II gives a briefintroduction to Boolean nets and the fission yeast cell cyclenetwork. The problem addressed in this paper is defined insection III. The proposed evolution strategy is described insection IV, and simulations using the evolution strategy insection V. Conclusions of the work are in section VI.

II. BACKGROUND

A. Boolean networks

Let x be a finite set of n variables, x = {x1, . . . , xn},with xi ∈ {0, 1} for i = 1, . . . , n. A Boolean network isa pair (G,F ), where G = (V,E) is a finite directed graph;V being the set of n nodes and E the set of edges. F isa Boolean function, F : {0, 1}n → {0, 1}n composed ofn local functions fi : {0, 1}n → {0, 1}. Furthermore, eachlocal function fi depends only on variables belonging to theneighborhood Vi = {j ∈ V|(j, i) ∈ E}. The indegree of vertexi is |Vi|. The updating schemes are repeated periodically, andsince the hypercube is a finite set, the dynamics of the networkconverges to attractors which are fixed points or limit cycles,defined by

• Fixed point: xi(t+ 1) = xi(t) for i = {1, . . . , n}.• Limit cycle: xi(t+ p) = xi(t) for i = {1, . . . , n}.

where p > 1 is a positive integer called the limit cycle length.The set of states that can lead the network to a specific attractoris called the basin of attraction. An alternative to working withlogical gates in each node, is to consider a threshold Booleannetwork, where updates of each node are computed by

xi(t+ 1) = fi(x) = u

(n∑

j=1

ωijxj(t)− θi

)(1)

u(z) =

{1, if z ≥ 00, if z < 0

with ωij the weight of the edge coming from node j into thenode i, and θi the activation threshold of node i. The weightsand thresholds are the network’s parameters.

B. Fission yeast cell cycle network

Let us consider the Boolean network model for the fissionyeast cell cycle studied in [15]. It is composed by ten nodesas shown in Fig. 1. Using a similar representation as in[15], the green/solid edges are positive weights (activations),the red/dashed edges are negative weights (inhibitory). Theorange/dashed loops are self-degradation, which are modeledmathematically by a negative weight. The gene updates arecomputed by a variant of (1), as defined in [15] :

xi(t+ 1) = u

(n∑

j=1

wijxj − θi

)(2)

=

0, if

∑nj=1 wijxj − θi < 0

1, if∑n

j=1 wijxj − θi > 0xi(t), if

∑nj=1 wijxj − θi = 0

The weight matrix and the threshold vector used in (2) togenerate the same dynamics exhibited in [15] can be found in[16].

Using the parallel updating scheme, the cell cycle ismodeled by starting from an initial state vector at time t = 1and then the dynamics of the network produces sequences ofstate vectors until t = 10, where the network converges toa fixed point which represent the G1 phase. Details of theprevious sequences are shown in Table I.

TABLE I: Temporal evolution of sate vectors defining the fission yeast cell cycle

Time Start SK Cdc2/Cdc13 Ste9 Rum1 Slp1 Cdc2/Cd13∗ Wee1/Mik1 Cdc25 PP Phase1 1 0 0 1 1 0 0 1 0 0 START2 0 1 0 1 1 0 0 1 0 0 G1

3 0 0 0 0 0 0 0 1 0 0 G1/S4 0 0 1 0 0 0 0 1 0 0 G2

5 0 0 1 0 0 0 0 0 1 0 G2

6 0 0 1 0 0 0 1 0 1 0 G2/M7 0 0 1 0 0 1 1 0 1 0 G2/M8 0 0 0 0 0 1 0 0 1 1 M9 0 0 0 1 1 0 0 1 0 1 M10 0 0 0 1 1 0 0 1 0 0 G1

Cdc2/Cd13*

SK

Start

Ste9Cdc2/Cd13 Rum1

Cdc25PP

Slp1Wee1/Mik1

Fig. 1: The fission yeast cell-cycle threshold Boolean net-work. Using a similar configuration as [15], (color online) thegreen/solid edges represent positive weights (activations), thered/dashed edges represent negative weights (inhibitory). Theorange/dashed loops represent self-degradation.

III. PROBLEM DESCRIPTION

As mentioned in the introduction, a neutral graph is ametagraph (network of networks) where each node represents aregulatory network that produces the same temporal evolutionfor a set of states out of the 2n possible configurations. In thispaper, we considered the ten state sequences shown in Table I.Following the terminology used in [3], regulatory networksthat reproduce this temporal evolution will be called functionalnetworks, while the original fission yeast cell cycle networkwill be called wildtype network, which is a functional networkas well given the previous definition. Although the wildtypenetwork and a functional network will share the same sate se-quences of Table I, they do not necessarily share the dynamicsproduced by the remaining 210 − 10 states. The connectivityin a neutral graph is given by the Hamming distance of theinteraction matrices (adjacency matrices) of the functionalnetworks. Two nodes are connected if the Hamming distanceof the respective functional network’s interaction matrices areequal to one, i.e., both matrices differ in only one element. Thesearch space for functional networks is huge, which makes ita difficult problem. The search consists in finding the weightmatrix elements wij and the threshold vector elements θithat can replicate the desired state sequences. Therefore, an

Initial random candidate networks

Found solution?

Random candidate networks

Fitness evaluation

Finish

Rank networks

Select top m%

Mutation

New candidate networks

yes

no

Fig. 2: Evolution strategy proposed to search for functionalnetworks

opportunity for intelligent search strategies arises, in particularthe use of evolutionary computation.

IV. EVOLUTIONARY COMPUTATION FOR NEUTRAL GRAPHCONSTRUCTION

To carry out the search for functional networks, we proposean evolution strategy (ES) illustrated in the flow chart in Fig. 2.In what follows we will describe each stage.

A. Initial random candidate networks

A user defined parameter popSize indicates the size ofthe initial population. These are generated in the followingway. Using as a base the wildtype adjacency matrix andthreshed vector, a new candidate network (solution) is obtainedby changing ngh times the wildtype adjacency matrix andthreshed vector. The parameter ngh is selected randomly in therange of [1, 30], for every new candidate network generated.The wildtype adjacency matrix is changed using the followingrule:Rule 1

1) Select randomly a position (i, j) in the matrix.2) If the position contains a non-zero number, then

replace by a zero.3) Else, replace with a value selected randomly from the

following set {−2,−1, 1, 2}.

The wildtype threshold vector is changed using the followingrule:Rule 2

1) Select randomly a position i in the vector.2) Replace with a value selected randomly from the

following set {−2,−1,−1/2, 0, 1/2, 1, 2}.

both rules are repeated ngh times.

B. Fitness evaluation

Each candidate network is evaluated in a fitness functiondefined as follows. The fitness function for the Boolean regula-tory network B, is computed by the deviation of the network’soutput, defined by oi for each node i, and the target value si(sequence of the cell cycle) for each node i:

fitness(B) =1

10n

10∑t=1

n∑i=1

(oi(t)− si(t))2 (3)

where n is the number of nodes in the network, and 10 isthe number of state vector sequences (from Table I) that thenetwork must contain.

C. Rank networks

Given that the fitness function defines the deviation of thenetwork’s output with respect to the desired target, the ESis formulated to solve a minimization problem, therefore, thecandidate networks are ranked from the less deviated to themore deviated.

D. Select top m%

A user defined parameter m indicates the percentage of theranked top solutions to be selected.

E. Mutation

Using the top m% solutions, (popSize−popSize×m%)/2new candidate networks are generated using the following rule:Rule 3

1) Select randomly one of the top m% solutions.2) Mutate the selected solution. This is done by applying

Rule1 and Rule2 with ngh = 1.

this is repeated until completing the (popSize − popSize ×m%)/2 new candidate networks.

F. Random candidate networks

To complete the popSize, the remaining (popSize −popSize ×m%)/2 is filled with random candidate networksgenerated using Rule1 and Rule2 with ngh selected randomlyin the range of [1, 30], for every new candidate networkgenerated.

G. New candidate networks

The new population to be evaluated in the fitness function,thus, completing the loop, is composed by the top m%solutions + the networks generates by the mutation stage +the networks generated randomly.

V. SIMULATIONS

The proposed ES is used to construct a neutral graph withfunctional networks that contain the fission yeast cell cyclestate sequences of Table I. In order to bound the search space,the elements of the weight matrices were constrained to thefollowing integer values: {−2,−1, 0, 1, 2}, and the thresholdvectors to: {−2,−1,−1/2, 0, 1/2, 1, 2}. Also, popSize = 20,m% = 20% and max iterations =100. Three simulations werecarried out, the first two using the ES, and the third using astandard genetic algorithm (GA) for comparison reasons.

A. Simulation 1

In this simulation, ES was set to search for 10000functional networks. Histograms of the resulting functionalnetworks topologies are shown in Fig. 3, where Fig. 3(a)shows the distribution of the total number of edges (non-zeroelements in the weight matrices) in the functional networks,Fig. 3(b) shows the distribution of the number of positivesedges and Fig. 3(c) the distribution of the negative edges.If we consider that the wildtype network has a total of 27edges composed of 8 positive edges and 19 negative edges,we can see from the histograms that in general the functionalnetworks are mostly concentrated in these values but theycan also have less or more edges then the wildtype. Thedistribution of the edges changes if we separate the functionalnetworks that belong to the wildtype connected componentand the functional networks that are not in the connectedcomponent. As we can see in Fig. 4, histograms on theleft side are from the wildtype connected component whichare more concentrated around the wildtype topology whereashistograms on the right are from functional networks that arenot in the connected component. For visualization purposes wesampled 100 and 1000 functional networks from the 10000found by the ES, to generate the neutral graph in Fig.5(a)and Fig. 5(b) respectively. The wildtype network appears inred (Color online). We notice that for functional networksnot in the wildtype connected component, it is rare to seother components, in particular, in Fig. 5(b) we see only oneadditional component besides the wildtype one, formed by twofunctional networks. From the wildtype connected componentwe notice that functional networks reach the wildtype in nomore than 3 steps.

The density of the basin of attraction of the G1 fixedpoint of the functional networks in the wildtype connectedcomponent (blue/dashed line) and the rest of the networks(green/solid line) appears in Fig. 6. We can appreciate thatboth densities are quite different, while the functional net-works in the connected component have a basin of attractionmostly concentrated between 700∼800, nevertheless, the restof the networks shows a density more spreaded out throughoutthe 400∼1000, with slightly more concentrated in the range900∼1000. Finally, two functional networks are shown inFig. 7, where Fig. 7(a) belongs to the connected component ofthe wildtype and Fig. 7(b) not in the connected component.

B. Simulation 2

We now consider the case where we search for the con-nected component of the wildtype network, i.e., functionalnetworks that are connected to the wildtype, or to other

22 24 26 28 30 32 34 36 38 40

Total number of edges

His

togra

m

0500

1500

2500

3500

(a)

7 8 9 10 11 12 13 14 15 16

Number of positive edgesH

isto

gra

m

01000

3000

5000

(b)

14 16 18 20 22 24 26 28

Number of negative edges

His

togra

m

0500

1500

250

0

(c)

Fig. 3: Histograms of the functional networks topologies of the neutral graph. (a) Total number of edges (b) Positive edges and(c) Negative edges.

(a) (b)

Fig. 5: Neutral graph obtained using the proposed evolution strategy. (Color online) the red node represents the wildtype network.(a) Neutral graph using 100 functional networks (b) Neutral graph using 1000 functional networks

26 27 28

Total number of edges

His

tog

ram

05

00

15

00

25

00

(a)

22 24 26 28 30 32 34 36 38 40

Total number of edges

His

tog

ram

02

00

60

01

00

01

40

0

(b)

8 9

Number of positive edges

His

tog

ram

05

00

15

00

25

00

35

00

(c)

7 8 9 10 11 12 13 14 15 16

Number of positive edges

His

tog

ram

05

00

10

00

15

00

(d)

17 18 19 20

Number of negative edges

His

tog

ram

05

00

10

00

15

00

20

00

(e)

14 16 18 20 22 24 26 28

Number of negative edges

His

tog

ram

02

00

60

01

00

0

(f)

Fig. 4: Histograms of the functional networks topologies of theneutral graph separated by the functional networks that are partof the connected component of the wildtype network (a)(c)(e),and the functional networks that are not in the connectedcomponent (b)(d)(f).

functional networks that have a path to the wildtype, in theneutral graph. To do this, we run the ES but using ngh = 1for the initial random candidate network stage. This shouldbias the solutions to functional network that are close in theneutral space. The simulation was set to find 200 solutions.

The resulting neutral graph is shown in Fig. 8(a). We cansee, as expected, that most of the 200 functional networks arepart of the wildtype connected component, only one functionalnetwork does not belong to this component. An interestingphenomena occurs if we eliminate the wildtype network fromthe graph, as shown in Fig. 8(b). Surprisingly, although allof the functional networks, except one, are very close to thewildtype network in the neutral space, we can see that byeliminating the wildtype network, the functional networks arerather disconnected showing only small connected compo-

400 500 600 700 800 900 1000

0.0

00

0.0

02

0.0

04

0.0

06

0.0

08

Basin size of G1 fixed point

De

nsity

Fig. 6: Density of the basin of attraction for the G1 fixedpoints of the functional networks in the wildtype component(blue/dashed line) and the rest of the networks (green/solidline).

Cdc2/Cd13*

SK

Start

Ste9Cdc2/Cd13 Rum1

Cdc25PP

Slp1Wee1/Mik1

(a)

Cdc2/Cd13*

SK

Start

Ste9Cdc2/Cd13 Rum1

Cdc25PP

Slp1Wee1/Mik1

(b)

Fig. 7: (a) Functional network in the connected componentof the wildtype. (b) Functional network not in the connectedcomponent of the wildtype.

Fig. 9: Neutral graph generated using a genetic algorithm

nents, meaning that functional networks in the same wildtypeconnected component, are not necessary close, in the neutralspace, amongst them.

C. Simulation 3

To see the effectiveness of the proposed ES, we used astandard real-valued GA with population size=30, crossoverprobability=0.8, and mutation probability=0.2, max genera-tions=1000. Solutions were represented by a vector of size110, where the first 100 positions contained the elements of thestretched out weight matrix, and the remaining ten positionscontained the threshold values. The solutions were bounded bythe real interval [-2,2]. Solutions were transformed to integervalues using the round() to evaluate in the fitness function. Weused the same initial population and fitness function as the oneused by the ES. Given that the GA took more time in findingsolutions we only limited to find 100 functional networks,these are shown in Fig. 9. We notice that no connectedcomponents appear and the closest functional network to thewildtype network is at distance 22.

VI. CONCLUSION

An evolution strategy was developed to construct a neutralgraph of Boolean regulatory networks. The algorithm was usedto construct a neutral graph of regulatory network that sharethe same state trajectory of the cell cycle of the yeast speciesSchizosaccharomyces pombe (fission yeast) [15]. Through sim-ulations it was found that in the generated neutral graph,about 40% of the nodes are part of the connected componentof the wildtype network, the remaining 60% are functionalnetworks that share the trajectory of the cell cycle but theirinteraction matrices have in general a Hamming distance morethan 3 with the wildtype, and more than 10 between the otherdisconnected functional networks. We found that there aresignificant differences between the functional networks in theconnected component of the wildtype network and the rest ofthe network, not only at a topological level, but also at the state

space level, where significant differences in the distribution ofthe basin of attraction for the G1 fixed point was found. Fromthe results one can see that in general functional networks inthe wildtype connected component, can mutate up to no morethan 3 times, then they reach a point of no return where thenetworks leave the connected component of the wildtype. Wenoticed that although a standard genetic algorithm was capableof finding functional networks, it was not able to obtain aneutral graph with connected components.

Finally, although the proposed evolution strategy was usedfor the fission yeast cell cycle model, it can be used to constructa neutral graph of other biological network models.

ACKNOWLEDGMENT

The authors would like to thank Conicyt-Chile under grantFondecyt 11110088 (G.A.R.), Fondecyt 1100003 (E.G.), andBasal(Conicyt)-CMM, for financially supporting this research.

REFERENCES

[1] S. Ciliberti, O. C. Martin, and A. Wagner, “Innovation and robustness incomplex regulatory gene networks,” PNAS, vol. 104, pp. 13 591–13 596,2007.

[2] ——, “Robustness can evolve gradually in complex regulatory genenetworks with varying topology,” PLoS Computational Biology, vol. 3,p. e15, 2007.

[3] G. Boldhaus and K. Klemm, “Regulatory networks and connectedcomponents of the neutral space,” Eur. Phys. J. B., vol. 77, pp. 233–237,2010.

[4] I. Shmulevich and E. R. Dougherty, Probabilistic Boolean Networks:The Modeling and Control of Gene Regulatory Networks. Philadelphia,PA: SIAM-Society for Industrial and Applied Mathematics, 2009.

[5] L. J. Steggles, R. Banks, and A. Wipat, “Modelling and analysinggenetic networks: From Boolean networks to petri nets,” ComputationalMethods in Systems Biology, vol. 4210, pp. 127–141, 2006.

[6] J. Yu, V. A. Smith, P. P. Wang, A. J. Hartemink, and E. D. Jarvis, “Ad-vances to Bayesian network inference for generating causal networksfrom observational biological data,” Bioinformatics, vol. 20, pp. 3594–3603, 2004.

[7] W. Lee and K. Yang, “Applying intelligent computing techniquesto modeling biological networks from expression data,” Genomics,Proteomics & Bioinformatics, vol. 6, pp. 111–120, 2008.

[8] M. D. Hoon, S. Imoto, and S. Miyano, “Inferring gene regulatorynetworks from time-ordered gene expression data of bacillus subtilisusing differential equations,” in Pac. Symp. Biocomput, 2003, pp. 17–28.

[9] S. Zhang, W. Ching, N. Tsing, H. Leung, and D. Guo, “A newmultiple regression approach for the construction of genetic regulatorynetworks,” Artificial Intelligence in Medicine, vol. 48, pp. 153–160,2010.

[10] S. A. Kauffman, “Metabolic stability and epigenesis in randomlyconstructed genetic nets,” Journal of Theoretical Biology, vol. 22, pp.437–467, 1969.

[11] J. Aracena, E. Goles, A. Moreira, and L. Salinas, “On the robustness ofupdate schedules in Boolean networks,” Biosystems, vol. 97, pp. 1–8,2009.

[12] G. A. Ruz, M. Montalva, and E. Goles, “On the preservation oflimit cycles in boolean networks under different updating schemes,”in Advances in Artificial Life, ECAL 2013, 12th European Conferenceon Artificial Life, 2013, pp. 1085–1090.

[13] L. Mendoza and E. R. Alvarez-Buylla, “Dynamics of the geneticregulatory network for arabidopsis thaliana flower morphogenesis,” JTheor Biol., vol. 193, pp. 307–319, 1998.

[14] S. Huang, “Cell state dynamics and tumorigenesis in Boolean regulatorynetworks,” in Unifying Themes in Complex Systems, A. A. Minai andY. Bar-Yam, Eds. Springer Berlin Heidelberg, 2006, pp. 293–305.

(a) (b)

Fig. 8: (a) Neutral graph of the wildtype component. The wildtype network is in the center in red (Color online). (b) The sameneutral graph of (a) but without the wildtype network.

[15] M. I. Davidich and S. Bornholdt, “Boolean network model predicts cellcycle sequence of fission yeast,” PLoS ONE, vol. 3(2), p. e1672, 2008.

[16] E. Goles, M. Montalva, and G. A. Ruz, “Deconstruction and dynamicalrobustness of regulatory networks: Application to the yeast cell cyclenetworks,” Bulletin of Mathematical Biology, vol. 75, pp. 939–966,2013.

[17] G. A. Ruz and E. Goles, “Learning gene regulatory networks withpredefined attractors for sequential updating schemes using simulatedannealing,” in Proc. of IEEE the Ninth International Conference onMachine Learning and Applications (ICMLA 2010), 2010, pp. 889–894.

[18] ——, “Learning gene regulatory networks using the bees algorithm,”Neural Computing and Applications, vol. 22, pp. 63–70, 2013.

[19] G. A. Ruz, T. Timmermann, and E. Goles, “Building synthetic networksof the budding yeast cell-cycle using swarm intelligence,” in Proceed-ings - 2012 11th International Conference on Machine Learning andApplications, ICMLA 2012, vol. 1, 2012, pp. 120–125.

[20] G. A. Ruz and E. Goles, “Reconstruction and update robustness ofthe mammalian cell cycle network,” in 2012 IEEE Symposium onComputational Intelligence and Computational Biology, CIBCB 2012,2012, pp. 397–403.

[21] A. Faure, A. Naldi, C. Chaouiya, and D. Thieffry, “Dynamical analysisof a generic Boolean model for the control of the mammalian cellcycle,” Bioinformatics, vol. 22, pp. e124–e131, 2006.

[22] G. A. Ruz, E. Goles, M. Montalva, and G. B. Fogel, “Dynamical andtopological robustness of the mammalian cell cycle network: A reverseengineering approach,” Biosystems, vol. 115, pp. 23–32, 2014.