A Pontryagin Maximum Principle for Multi–Input Boolean Control Networks⋆
Dmitriy Laschov and Michael Margaliot⋆⋆
School of Electrical Engineering–Systems, Tel Aviv University, Israel 69978.
Summary. A Boolean network consists of a set of Boolean variables whose state is deter-
mined by other variables in the network. Boolean networks have been studied extensively
as models for simple artificial neural networks. Recently, Boolean networks gained con-
siderable interest as models for biological systems composed of elements that can be in
one of two possible states. Examples include genetic regulation networks, where the ON
(OFF) state corresponds to the transcribed (quiescent) state of a gene, and cellular net-
works where the two possible logic states may represent the open/closed state of an ion
channel, basal/high activity of an enzyme, two possible conformational states of a pro-
tein, etc. Daizhan Cheng developed an algebraic state-space representation for Boolean
control networks using the semi–tensor product of matrices. This representation proved
quite useful for studying Boolean control networks in a control-theoretic framework. Using
this representation, we consider a Mayer-type optimal control problem for Boolean control
networks. Our main result is a necessary condition for optimality. This provides a parallel
of Pontryagin’s maximum principle for Boolean control networks.
1 Introduction
A Boolean network consists of a set of Boolean variables whose state is determined
by other variables in the network. Cellular automata, with two possible states per
⋆ Research supported in part by the Israel Science Foundation (ISF).
⋆⋆ Corresponding author: Prof. Michael Margaliot, School of Electrical Engineering–Systems, Tel Aviv University, Israel 69978. Homepage: www.eng.tau.ac.il/~michaelm. Email: [email protected]
cell, are a particular case of Boolean networks. Here the state of each variable at
time k+1 is determined by the state of its spatial neighbors at time k [1]. A Boolean
network with $n$ variables has $2^n$ possible states, and therefore the dynamics for any initial condition must eventually enter an attractor.
Boolean networks have been studied extensively as models for simple artificial
neural networks (see, e.g. [2]). Here each neuron realizes a threshold function that
attains the values zero or one. More recently, Boolean networks gained renewed
interest as models for biological systems composed of elements that can be in one
of two possible states (i.e., ON or OFF). S. A. Kauffman [3] modeled a gene as a
binary device, and studied the behavior of large, randomly constructed nets of these
binary genes. Kauffman’s simulations indicate that if each network node has two
or three inputs, then the dynamical behavior of the network demonstrates order
and stability. Kauffman also related the behavior of the random nets to various
cellular control processes including cell differentiation. The key idea is to view each stable attractor as representing one possible cell type.
Kauffman’s pioneering ideas stimulated research in several directions. One di-
rection is the theoretical analysis of the dynamics of Boolean networks, espe-
cially using tools from the theory of complex systems and statistical physics (see,
e.g. [4, 5, 6, 7, 8, 9, 10]).
Another research direction is modeling various biological processes using Boolean
networks. Analyzing the behavior of the Boolean network may provide considerable insight into the original biological process. This is a vast area of research and we
review here only a few examples.
1.1 Boolean network modeling in biology
Boolean networks seem especially suitable for modeling genetic regulation networks
where the ON (OFF) state corresponds to the transcribed (quiescent) state of the
gene. There are several other motivations [11] for using Boolean networks in this con-
text, including the fact that many metabolic and genetic networks demonstrate some
form of bi-stability. Important examples are epigenetic switches (see, e.g. [12, 13]).
Specific examples of genetic regulation networks modeled using Boolean networks
include: the cell–cycle regulatory network of the budding yeast [14]; the yeast tran-
scriptional network [15]; the network controlling the segment polarity genes in the
fly Drosophila melanogaster [16, 17]; the ABC network determining floral organ cell
fate in Arabidopsis [18] (see also [19]).
Boolean networks were also used for modeling various cellular processes. In this
context the two possible logic states may represent the open/closed state of an ion
channel, basal/high activity of an enzyme, two possible conformational states of
a protein, etc. Specific examples include: a detailed model for the highly complex
cellular signaling network controlling stomatal closure in plants [20]; and a model of
the molecular pathway between two neurotransmitter systems, the dopamine and
glutamate receptors [21].
Szallasi and Liang [22] discuss the use of Boolean networks in modeling carcino-
genesis and for analyzing the effect of therapeutic intervention (see also [23]).
These studies suggest that Boolean networks provide a highly efficient modeling
tool for large–scale biological networks. These models are able to reproduce the
main characteristics of the biological network dynamics: attractors of the Boolean
network correspond to stationary biological states; large attraction basins indicate
robustness of the biological state, and so on.
Modeling using Boolean networks requires only coarse–grained qualitative infor-
mation (e.g., an interaction between two genes is either activating or inhibiting).
This is in sharp contrast to other models, for example, those based on differential
equations, that require knowledge of numerous parameter values (e.g., rate con-
stants). For a general exposition on various approaches for modeling gene regulation
networks, see [24].
Modeling a biological system involves considerable uncertainty. This is due to
the noise and perturbations that affect the biological system, and to the inaccuracies
of the measuring equipment. One approach for tackling this uncertainty is by using
Probabilistic Boolean Networks (PBNs) [25, 26]. These may be viewed as a collection
of (deterministic) Boolean networks combined with a probabilistic switching rule
determining which network is active at each time instant.
It is natural to extend the idea of Boolean networks to include input variables.
For example, an input may represent the dosage that is administered to a patient.
Boolean networks with (binary) input variables are referred to as Boolean Control
Networks (BCNs). PBNs with inputs were used to design and analyze therapeutic
intervention strategies. The idea here is to find a control that shifts the network from
an undesirable state (representing a “diseased” condition) to a desirable one. Such
problems can be cast as stochastic optimal control problems, and solved numerically
using dynamic programming [27, 28].
Daizhan Cheng and his colleagues developed an algebraic state–space representa-
tion of BCNs using the semi–tensor product of matrices. This representation proved
quite useful for studying BCNs in a control–theoretic framework. Examples include
the analysis of disturbance decoupling [29], controllability and observability [30],
realization theory [31], and more [32, 33, 34].
Here we make use of this state–space representation to analyze a Mayer–type
optimal control problem for BCNs. Our main result is a necessary condition for a
control to be optimal. This provides a parallel of the celebrated Pontryagin max-
imum principle (PMP) (see, e.g., [35, 36, 37]) for BCNs. The proof of our main
result is motivated by the simple proof of a special case of the PMP used in the
variational analysis of switched systems [38] (see also [39, 40, 41]). The first result
in this direction appeared in our recent paper [42] describing a maximum principle
for the special case of single–input BCNs.
The remainder of this chapter is organized as follows. Section 2 reviews BCNs.
Section 3 describes Cheng’s algebraic state–space representation of BCNs using
the semi–tensor product of matrices. Section 4 details our main result which is a
Fig. 1 Graphical representation of the BCN in Example 1.
new maximum principle (MP) for BCNs. Section 5 includes the proof of our main
result. In Section 6 we consider the so–called singular case where the MP does not
provide any direct information on the optimal control. Several synthetic examples
demonstrate the application of the new MP.
2 Boolean control networks
A Boolean control network is a discrete–time logical dynamic control system in the
form
$$x_1(k+1) = f_1(x_1(k), \dots, x_n(k), u_1(k), \dots, u_m(k)),$$
$$\vdots$$
$$x_n(k+1) = f_n(x_1(k), \dots, x_n(k), u_1(k), \dots, u_m(k)), \qquad (1)$$
where $x_i, u_i \in \{\text{True}, \text{False}\}$, and each $f_i$ is a Boolean function.
A BCN may be represented graphically as a network with $n$ nodes, representing the $x_i$'s, and $m$ inputs. A directed edge from node $i$ (input $u_i$) to node $j$ implies that $x_j(k+1)$ depends on $x_i(k)$ ($u_i(k)$).
Example 1. Consider the two–state, two–input BCN
$$x_1(k+1) = x_1(k) \vee [x_2(k) \wedge u_1(k)], \qquad x_2(k+1) = x_2(k) \wedge u_2(k). \qquad (2)$$
Fig. 1 depicts the graphical representation of this BCN.
It is worth noting that a BCN with $m$ inputs is a Boolean switched system switching between $2^m$ possible subsystems, with the value of the control determining which subsystem is active at every time step. To demonstrate this, note that in (2) the control may attain one of four values: $(u_1(k), u_2(k)) \in \{TT, TF, FT, FF\}$, where $T$ ($F$) is shorthand for True (False). With each of these four possible values we can associate a corresponding dynamics, i.e. a subsystem. For example, when $u_1(k) = u_2(k) = T$ the corresponding subsystem is given by
1 A Pontryagin Maximum Principle for Multi–Input Boolean Control Networks 5
$$x_1(k+1) = x_1(k) \vee x_2(k), \qquad x_2(k+1) = x_2(k).$$
3 Algebraic state–space representation of BCNs
Control–theoretic problems for BCNs are best addressed in the algebraic state–space
representation for BCNs derived by Daizhan Cheng and his colleagues [43, 32, 34,
29, 31]. This is based on the semi–tensor product of matrices.
3.1 Semi–tensor product
Recall that the Kronecker product (see, e.g. [44, Chapter 7]) of two matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times q}$ is
$$A \otimes B = \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{bmatrix}.$$
Note that $(A \otimes B) \in \mathbb{R}^{(mp) \times (nq)}$.
Given two positive integers a, b, let lcm(a, b) denote the least common multiple
of a and b. For example, lcm(6, 8) = 24. Let In denote the n × n identity matrix.
Definition 1. The semi–tensor product of two matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times q}$ is
$$A \ltimes B = (A \otimes I_{\alpha/n})(B \otimes I_{\alpha/p}),$$
where $\alpha = \operatorname{lcm}(n, p)$.
Remark 1. Note that $(A \otimes I_{\alpha/n}) \in \mathbb{R}^{(m\alpha/n) \times \alpha}$ and $(B \otimes I_{\alpha/p}) \in \mathbb{R}^{\alpha \times (q\alpha/p)}$, so $(A \ltimes B) \in \mathbb{R}^{(m\alpha/n) \times (q\alpha/p)}$.
Remark 2. If $n = p$, then $A \ltimes B = (A \otimes I_1)(B \otimes I_1) = AB$, i.e. in this case we recover the standard matrix product. Thus, we may view the semi–tensor product as a generalization of the standard matrix product that provides a way to multiply two matrices of arbitrary dimensions. Intuitively, this is based on first modifying $A$, $B$ into two matrices $(A \otimes I_{\alpha/n})$, $(B \otimes I_{\alpha/p})$ of compatible dimensions and then calculating their standard matrix product. The following examples demonstrate this idea.
Example 2. Consider $a \ltimes b$, where $a, b \in \mathbb{R}^2$ are column vectors. Here $m = p = 2$ and $n = q = 1$, so $\alpha = \operatorname{lcm}(n, p) = 2$, and
$$a \ltimes b = (a \otimes I_2)(b \otimes I_1) = \begin{bmatrix} a_1 & 0 \\ 0 & a_1 \\ a_2 & 0 \\ 0 & a_2 \end{bmatrix} b = \begin{bmatrix} a_1 b_1 & a_1 b_2 & a_2 b_1 & a_2 b_2 \end{bmatrix}^T.$$
Example 3. Consider the semi–tensor product of a row vector $a^T = \begin{bmatrix} a_1 & \dots & a_n \end{bmatrix}$ and a column vector $b = \begin{bmatrix} b_1 & \dots & b_p \end{bmatrix}^T$. Suppose that $p$ divides $n$, i.e. $s = n/p$ is an integer. Then $\alpha = \operatorname{lcm}(n, p) = n$, so
$$a^T \ltimes b = (a^T \otimes I_1)(b \otimes I_s) = a^T \begin{bmatrix} b_1 I_s \\ \vdots \\ b_p I_s \end{bmatrix}.$$
Various properties of the semi–tensor product are analyzed in [43]. For our purposes, it is sufficient to note that this product is associative,
$$A \ltimes (B \ltimes C) = (A \ltimes B) \ltimes C,$$
and distributive,
$$(A + B) \ltimes C = (A \ltimes C) + (B \ltimes C).$$
3.2 Algebraic representation of Boolean functions
The semi–tensor product allows representing Boolean functions in an algebraic form. Let $e_n^i$ denote the $i$th column of the identity matrix $I_n$. Represent the Boolean values True and False by $e_2^1 = \begin{bmatrix} 1 & 0 \end{bmatrix}^T$ and $e_2^2 = \begin{bmatrix} 0 & 1 \end{bmatrix}^T$, respectively. Then any Boolean function of $n$ variables $f : \{\text{True}, \text{False}\}^n \to \{\text{True}, \text{False}\}$ can be equivalently represented as a mapping $\tilde{f} : \{e_2^1, e_2^2\}^n \to \{e_2^1, e_2^2\}$. With some abuse of notation, we identify $f$ with $\tilde{f}$. In other words, from here on a Boolean variable $x_i$ is always a vector in $\{e_2^1, e_2^2\}$.
The next result shows that any Boolean function may be represented in an
algebraic form.
Theorem 1. [32] Let $f : \{e_2^1, e_2^2\}^n \to \{e_2^1, e_2^2\}$ be a Boolean function. There exists a unique binary matrix $M_f$ of dimensions $2 \times 2^n$ such that
$$f(x_1, \dots, x_n) = M_f \ltimes x_1 \ltimes \dots \ltimes x_n.$$
$M_f$ is called the structure matrix of $f$.
Remark 3. To provide some intuition on this representation, consider the case $n = 2$, i.e. $f = f(x_1, x_2)$. Recall that $x_i \in \{e_2^1, e_2^2\}$, so $x_1 = \begin{bmatrix} v & \bar{v} \end{bmatrix}^T$ and $x_2 = \begin{bmatrix} w & \bar{w} \end{bmatrix}^T$, with $v, w \in \{0, 1\}$. Then
$$x_1 \ltimes x_2 = \begin{bmatrix} vw & v\bar{w} & \bar{v}w & \bar{v}\bar{w} \end{bmatrix}^T, \qquad (3)$$
i.e. $x_1 \ltimes x_2$ contains all the possible minterms of $v$ and $w$. Recall that any Boolean function may be represented as a sum of some minterms of its variables (see, e.g. [45]). This is known as the sum of products (SOP) representation. The multiplication $M_f \ltimes x_1 \ltimes x_2$ provides such a representation. Note that (3) implies that $x_1 \ltimes x_2 \in \{e_4^1, \dots, e_4^4\}$. Indeed, one and only one minterm has the value 1 and all the others must be 0.
Example 4. Consider the function $f(x) = \bar{x}$, i.e. $f$ is defined by $f(e_2^1) = e_2^2$ and $f(e_2^2) = e_2^1$. It is easy to verify that $f(x) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \ltimes x$. Consider the function $g(x_1, x_2) = x_1 \wedge x_2$. It is straightforward to verify that
$$g(x_1, x_2) = M_g \ltimes x_1 \ltimes x_2,$$
with $M_g = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix}$. For example,
$$M_g \ltimes e_2^1 \ltimes e_2^2 = M_g \ltimes \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix}^T = M_g \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix}^T = \begin{bmatrix} 0 & 1 \end{bmatrix}^T = e_2^2,$$
corresponding to $(\text{True} \wedge \text{False}) = \text{False}$.
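A structure matrix can be built mechanically by enumerating minterm columns: column $j$ (counting from 1) corresponds to the Boolean assignment encoded by $x_1 \ltimes \dots \ltimes x_n = e_{2^n}^j$, with the first minterm being the all-True assignment. A small sketch (helper names `bits` and `structure_matrix` are ours) that recovers $M_g$ from Example 4:

```python
import numpy as np

def bits(j, n):
    """Boolean assignment for minterm column j (0-based); minterm 0 is all-True."""
    return [not ((j >> (n - 1 - t)) & 1) for t in range(n)]

def structure_matrix(f, n):
    """2 x 2^n structure matrix M_f: column j is e_2^1 if f evaluates True, else e_2^2."""
    M = np.zeros((2, 2 ** n))
    for j in range(2 ** n):
        M[0 if f(*bits(j, n)) else 1, j] = 1.0
    return M

print(structure_matrix(lambda x1, x2: x1 and x2, 2))
# [[1. 0. 0. 0.]
#  [0. 1. 1. 1.]]
```

The same routine reproduces the negation matrix of Example 4: `structure_matrix(lambda x: not x, 1)` yields the antidiagonal matrix.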
3.3 Algebraic representation of BCNs
Since the dynamics of BCNs is described by a set of Boolean functions, it is clear
from the discussion above that the semi–tensor product can be used to provide an
algebraic state–space representation of BCNs.
Theorem 2. [33] Consider a BCN with state variables $x_1, \dots, x_n$ and inputs $u_1, \dots, u_m$, where $x_i, u_i \in \{e_2^1, e_2^2\}$. Denote $x(k) = x_1(k) \ltimes \dots \ltimes x_n(k)$ and $u(k) = u_1(k) \ltimes \dots \ltimes u_m(k)$. There exists a unique matrix $L \in \mathbb{R}^{2^n \times 2^{n+m}}$ such that
$$x(k+1) = L \ltimes u(k) \ltimes x(k). \qquad (4)$$
The matrix $L$ is called the transition matrix of the BCN.
Algorithms for converting a BCN in the form (1) to its algebraic representation (4),
and vice versa, may be found in [32, 30].
Remark 4. The intuition behind this representation is very similar to the algebraic representation of a single Boolean function using the semi–tensor product. To demonstrate this, consider a BCN with $n = 2$ and $m = 1$. Then (4) becomes $x(k+1) = L \ltimes u_1(k) \ltimes x_1(k) \ltimes x_2(k)$. To simplify the notation, we omit from here on the dependence on $k$. Denote $x_1 = \begin{bmatrix} p & \bar{p} \end{bmatrix}^T$, $x_2 = \begin{bmatrix} q & \bar{q} \end{bmatrix}^T$, and $u_1 = \begin{bmatrix} v & \bar{v} \end{bmatrix}^T$. Then
$$u_1 \ltimes x_1 \ltimes x_2 = \begin{bmatrix} vpq & vp\bar{q} & v\bar{p}q & v\bar{p}\bar{q} & \bar{v}pq & \bar{v}p\bar{q} & \bar{v}\bar{p}q & \bar{v}\bar{p}\bar{q} \end{bmatrix}^T.$$
Thus, $u \ltimes x$ includes all the possible minterms of the input and state variables. The equation $x(k+1) = L \ltimes u(k) \ltimes x(k)$ amounts to a description of (every minterm of) the next state in terms of the current state and inputs.
Remark 5. Note that since $u(k) = u_1(k) \ltimes \dots \ltimes u_m(k)$, with $u_i(k) \in \{e_2^1, e_2^2\}$, we have $u(k) \in \{e_{2^m}^1, \dots, e_{2^m}^{2^m}\}$. For example, if $m = 3$, $u_1(k) = e_2^1$, $u_2(k) = e_2^2$, and $u_3(k) = e_2^2$, then $u(k) = e_8^4$.
Example 5. Consider the BCN in Example 1. Here $n = 2$ and $m = 2$, so $x(k) = x_1(k) \ltimes x_2(k)$ and $u(k) = u_1(k) \ltimes u_2(k)$. Applying the algorithm described in [30] yields the transition matrix
$$L = \begin{bmatrix}
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1
\end{bmatrix}.$$
To demonstrate the equivalence of the original dynamics and (4), consider for example the case where $x_1(k) = \text{False}$, $x_2(k) = \text{True}$, $u_1(k) = \text{True}$, and $u_2(k) = \text{False}$. Then (2) yields
$$x_1(k+1) = \text{True}, \qquad x_2(k+1) = \text{False}. \qquad (5)$$
In the algebraic framework, this corresponds to $x_1(k) = u_2(k) = e_2^2$, $x_2(k) = u_1(k) = e_2^1$. Then
$$x(k+1) = L \ltimes u(k) \ltimes x(k) = L \ltimes \begin{bmatrix} 1 \\ 0 \end{bmatrix} \ltimes \begin{bmatrix} 0 \\ 1 \end{bmatrix} \ltimes \begin{bmatrix} 0 \\ 1 \end{bmatrix} \ltimes \begin{bmatrix} 1 \\ 0 \end{bmatrix} = L \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}^T = \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix}^T.$$
Writing $x_1(k+1) = \begin{bmatrix} v & \bar{v} \end{bmatrix}^T$ and $x_2(k+1) = \begin{bmatrix} w & \bar{w} \end{bmatrix}^T$ yields $x(k+1) = \begin{bmatrix} vw & v\bar{w} & \bar{v}w & \bar{v}\bar{w} \end{bmatrix}^T$, so $v = \bar{w} = 1$. Thus, $x_1(k+1) = e_2^1$, $x_2(k+1) = e_2^2$, and this agrees, of course, with (5).
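A transition matrix such as the one in Example 5 can be generated by evaluating the update rules on every (input, state) minterm pair. The sketch below is a naive stand-in for the conversion algorithms of [32, 30] (the helper names `bits`, `idx`, and `build_L` are ours); it rebuilds the matrix $L$ of Example 5 from the rules (2) and checks the column that Example 5 works out by hand:

```python
import numpy as np

def bits(j, n):   # minterm j (0-based) -> Boolean assignment; minterm 0 = all True
    return [not ((j >> (n - 1 - t)) & 1) for t in range(n)]

def idx(vals):    # Boolean assignment -> minterm index (0-based)
    return sum((0 if b else 1) << (len(vals) - 1 - t) for t, b in enumerate(vals))

def build_L(step, n, m):
    """Transition matrix of x(k+1) = L ⋉ u(k) ⋉ x(k); column uj*2^n + xj holds
    the next-state indicator for input minterm uj and state minterm xj."""
    N, M = 2 ** n, 2 ** m
    L = np.zeros((N, M * N))
    for uj in range(M):
        for xj in range(N):
            L[idx(step(bits(uj, m), bits(xj, n))), uj * N + xj] = 1.0
    return L

# BCN (2): x1' = x1 ∨ (x2 ∧ u1),  x2' = x2 ∧ u2
step = lambda u, x: [x[0] or (x[1] and u[0]), x[1] and u[1]]
L = build_L(step, n=2, m=2)
# x1 = False, x2 = True, u1 = True, u2 = False: u⋉x = e_16^7, next state e_4^2
print(L[:, 6])   # [0. 1. 0. 0.]
```

The column extracted at the end matches the hand computation in Example 5: the state moves to $x_1(k+1) = \text{True}$, $x_2(k+1) = \text{False}$.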
4 Main result
A fundamental problem for all dynamical control systems is to determine a con-
trol that is optimal in some sense. In other words, a control that maximizes (or
minimizes) a given cost–functional. Our main result is a necessary condition for op-
timality stated in the form of a maximum principle. This provides a parallel of the
PMP for BCNs. We begin by defining an optimal control problem for BCNs. Consider a BCN in the algebraic state–space representation (4). Fix some (arbitrary) initial condition $x(0) = x_0 \in \{e_{2^n}^1, \dots, e_{2^n}^{2^n}\}$.
4.1 Optimal control problem
Fix a final time $N > 0$. Let $\mathcal{U}$ denote the set of admissible controls, i.e. the set of all the sequences $\{u(0), \dots, u(N-1)\}$, with $u(i) \in \{e_{2^m}^1, \dots, e_{2^m}^{2^m}\}$. For a control $u \in \mathcal{U}$, let $x(k; u)$ denote the solution of (4), with $x(0) = x_0$, at time $k$. Fix a vector $r \in \mathbb{R}^{2^n}$, and consider the cost–functional
$$J(u) = r^T x(N; u). \qquad (6)$$
We now pose a Mayer–type optimal control problem.
Problem 1. Find a control u∗ ∈ U that maximizes J .
This problem clearly admits a solution, as $\mathcal{U}$ is a finite set. We refer to a control that maximizes $J$ as an optimal control. In principle, Problem 1 may be solved numerically by simply calculating $x(N; u)$ for every $u \in \mathcal{U}$. However, this is clearly not practical for large values of $N$, as the number of admissible controls is $2^{mN}$.
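For small $N$ and $m$, the exhaustive approach is nevertheless easy to implement: since $L \ltimes e_{2^m}^{j} \ltimes x$ simply applies the $j$th $2^n$-column block of $L$ to $x$, enumerating all $2^{mN}$ control sequences takes a few lines. The sketch below (function name `brute_force` is ours) illustrates this on the toy single-input system $x(k+1) = x(k) \wedge u(k)$ started from $x(0) = \text{True}$:

```python
import numpy as np
from itertools import product

def brute_force(L, x0, r, N):
    """Exhaustively maximize J(u) = r^T x(N; u) over all control sequences.
    Choosing u(k) = e_{2^m}^{j+1} selects the j-th 2^n-column block of L."""
    nN = len(x0)                  # 2^n
    M = L.shape[1] // nN          # 2^m
    best_J, best_u = -np.inf, None
    for seq in product(range(M), repeat=N):
        x = x0
        for j in seq:
            x = L[:, j * nN:(j + 1) * nN] @ x
        J = float(r @ x)
        if J > best_J:
            best_J, best_u = J, seq
    return best_J, best_u

L = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 1.0]])        # x(k+1) = x(k) ∧ u(k)
J, u = brute_force(L, np.array([1.0, 0.0]), np.array([1.0, 0.0]), N=3)
print(J, u)   # 1.0 (0, 0, 0)  -- keep u(k) = e_2^1 (True) at every step
```

The exponential loop over `product(range(M), repeat=N)` is exactly the source of the impracticality noted above.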
Example 6. Suppose that $n = 3$, so that
$$x(N) = x_1(N) \ltimes x_2(N) \ltimes x_3(N).$$
Denote $x_1(N) = \begin{bmatrix} v & \bar{v} \end{bmatrix}^T$, $x_2(N) = \begin{bmatrix} w & \bar{w} \end{bmatrix}^T$, and $x_3(N) = \begin{bmatrix} q & \bar{q} \end{bmatrix}^T$. Then
$$x(N) = \begin{bmatrix} vwq & vw\bar{q} & v\bar{w}q & v\bar{w}\bar{q} & \bar{v}wq & \bar{v}w\bar{q} & \bar{v}\bar{w}q & \bar{v}\bar{w}\bar{q} \end{bmatrix}^T.$$
Suppose that we take $r = \begin{bmatrix} 1 & 1 & 0 & 0 & \dots & 0 \end{bmatrix}^T$. Then $r^T x(N) = vwq + vw\bar{q} = vw$. Thus, maximizing (6) corresponds to trying to find a control $u$ steering the BCN to $x_1(N) = x_2(N) = e_2^1$, if it exists.
Remark 6. Recall that x(N) consists of all the minterms of the Boolean state vari-
ables at time N . Hence any Boolean function f of the state at time N may be
represented in the form (6), i.e. as $f = r_f^T x(N; u)$, where $r_f$ is a binary vector. In
this particular case, J(u) can attain only two values, namely, zero and one. This
yields a reachability problem that is quite relevant for BCNs that model biological
networks, as here states can usually be divided into desirable and non–desirable
states. For example, in a model of cell differentiation a non–desirable state corre-
sponds to uncontrolled cell proliferation (see, e.g. [27, 28]). For a different approach
for analyzing reachability in BCNs, see [30].
We are interested in developing an analytical characterization of optimal con-
trols. By iterating (4) we find that for any two integers k ≥ j ≥ 0,
x(k;u) = C(k, j;u) ⋉ x(j;u), (7)
where
C(k, j;u) = L ⋉ u(k − 1) ⋉ L ⋉ u(k − 2) ⋉ · · · ⋉ L ⋉ u(j), (8)
with $C(k, k; u) = I_{2^n}$. We refer to the $2^n \times 2^n$ matrix $C(k, j; u)$ as the transition matrix from time $j$ to time $k$ corresponding to the control $u$. Note that (8) implies
that for any k ≥ l ≥ j,
C(k, j;u) = C(k, l;u) ⋉ C(l, j;u).
We can now state our main result.
Theorem 3. Consider the BCN (4). Suppose that $u^* = \{u^*(0), \dots, u^*(N-1)\} \in \mathcal{U}$ is an optimal control for Problem 1, and let $x^*$ denote the corresponding trajectory of (4). Let the adjoint $\lambda : \{1, \dots, N\} \to \mathbb{R}^{2^n}$ be the solution of
$$\lambda(k) = (L \ltimes u^*(k))^T \ltimes \lambda(k+1), \qquad \lambda(N) = r, \qquad (9)$$
and define $2^m$ switching functions $\alpha_i : \{0, 1, \dots, N-1\} \to \mathbb{R}$, $i = 1, \dots, 2^m$, by
$$\alpha_i(s) = \lambda^T(s+1) \ltimes L \ltimes e_{2^m}^i \ltimes x^*(s). \qquad (10)$$
For any time $s$, if for some index $i$
$$\alpha_i(s) > \alpha_j(s) \quad \text{for all } j \neq i,$$
then
$$u^*(s) = e_{2^m}^i. \qquad (11)$$
Theorem 3 provides a necessary condition for optimality in terms of the switching
functions αi. Note that this is somewhat similar to the PMP for discrete–time
dynamical systems (see, e.g., [46, Ch. 8], [47]).
Remark 7. It is instructive to verify that $\alpha_i(\cdot)$ is indeed a scalar function. Since the dimensions of $\lambda^T(\cdot)$ are $1 \times 2^n$ and those of $L$ are $2^n \times 2^{n+m}$ (recall that we consider a BCN with $m$ inputs), it follows from Remark 2 that
$$\lambda^T(s+1) \ltimes L = \lambda^T(s+1) L \in \mathbb{R}^{1 \times 2^{n+m}}.$$
Since the dimensions of $x^*(\cdot)$ are $2^n \times 1$, Remark 1 implies that
$$e_{2^m}^i \ltimes x^*(s) \in \mathbb{R}^{2^{n+m} \times 1}.$$
Thus,
$$\alpha_i(s) = \lambda^T(s+1) \ltimes L \ltimes e_{2^m}^i \ltimes x^*(s) = \lambda^T(s+1) L \left( e_{2^m}^i \ltimes x^*(s) \right)$$
is indeed a scalar.
It is possible to state our main result in a “Hamiltonian form”. To do so, define $H : \mathbb{R}^{2^n} \times \mathbb{R}^{2^n} \times \mathbb{R}^{2^m} \to \mathbb{R}$ by $H(x, \lambda, u) = \lambda^T L \ltimes u \ltimes x$. By Remark 1, $(\lambda^T L \ltimes u) \in \mathbb{R}^{1 \times 2^n}$, so we may also write $H$ as
$$H(x, \lambda, u) = (\lambda^T L \ltimes u) x = \lambda^T (L \ltimes u) x.$$
Then (9) can be written as
$$\lambda(k) = \frac{\partial}{\partial x} H(x^*(k), \lambda(k+1), u^*(k)), \qquad (12)$$
the system dynamics (4) as
$$x(k+1) = \frac{\partial}{\partial \lambda} H(x^*(k), \lambda(k+1), u^*(k)), \qquad (13)$$
and (11) may be written as
$$u^*(s) = \arg\max_{v \in \{e_{2^m}^1, \dots, e_{2^m}^{2^m}\}} H(x^*(s), \lambda(s+1), v). \qquad (14)$$
Furthermore, the function
$$H^*(s) = H(x^*(s), \lambda(s+1), u^*(s))$$
is constant. Indeed,
$$H^*(s) = \lambda^T(s+1)(L \ltimes u^*(s)) x^*(s) = \lambda^T(s) x^*(s) = \lambda^T(s)(L \ltimes u^*(s-1)) x^*(s-1) = H^*(s-1). \qquad (15)$$
Specializing Theorem 3 to the case $m = 1$, i.e., to BCNs with a single input, yields the following result.

Corollary 1. [42] Consider the BCN (4) with $m = 1$. Suppose that $u_1^* = \{u_1^*(0), \dots, u_1^*(N-1)\} \in \mathcal{U}$ is an optimal control for Problem 1. Let the adjoint $\lambda : \{1, \dots, N\} \to \mathbb{R}^{2^n}$ be the solution of
$$\lambda(k) = (L \ltimes u_1^*(k))^T \ltimes \lambda(k+1), \qquad \lambda(N) = r, \qquad (16)$$
and let $\beta(s) = \lambda^T(s+1) \ltimes L \ltimes \begin{bmatrix} 1 \\ -1 \end{bmatrix} \ltimes x^*(s)$. Then
$$u_1^*(s) = \begin{cases} e_2^1, & \text{if } \beta(s) > 0, \\ e_2^2, & \text{if } \beta(s) < 0. \end{cases} \qquad (17)$$

Proof. In this case, the switching functions are
$$\alpha_1(s) = \lambda^T(s+1) \ltimes L \ltimes e_2^1 \ltimes x^*(s), \qquad \alpha_2(s) = \lambda^T(s+1) \ltimes L \ltimes e_2^2 \ltimes x^*(s),$$
so $\alpha_1(s) - \alpha_2(s) = \beta(s)$. Hence, the condition $\alpha_1(s) > \alpha_2(s)$ ($\alpha_2(s) > \alpha_1(s)$) is equivalent to $\beta(s) > 0$ ($\beta(s) < 0$). ⊓⊔
The next simple example demonstrates an application of Corollary 1.
Example 7. Consider the single–input BCN
$$x(k+1) = x(k) \wedge u(k), \qquad x(0) = \text{True}. \qquad (18)$$
Here $n = m = 1$, and the algebraic state–space form is
$$x(k+1) = L \ltimes u(k) \ltimes x(k), \qquad x(0) = e_2^1,$$
with $L = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix}$. Fix some final time $N > 0$ and consider Problem 1 for $r = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Letting $x^*(N) = \begin{bmatrix} w & \bar{w} \end{bmatrix}^T$, this implies that we are trying to maximize $w$, i.e. to find a control $u^*$ steering the system to $x^*(N) = e_2^1$, if it exists.
In this case,
$$\beta(N-1) = \lambda^T(N) \ltimes L \ltimes \begin{bmatrix} 1 \\ -1 \end{bmatrix} \ltimes x^*(N-1) = r^T \ltimes L \ltimes \begin{bmatrix} 1 \\ -1 \end{bmatrix} \ltimes x^*(N-1) = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix} \ltimes \begin{bmatrix} 1 \\ -1 \end{bmatrix} \ltimes x^*(N-1) = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} \ltimes \begin{bmatrix} 1 \\ -1 \end{bmatrix} \ltimes x^*(N-1) = \begin{bmatrix} 1 & 0 \end{bmatrix} \ltimes x^*(N-1). \qquad (19)$$
We consider two cases.

Case 1. Suppose that $x^*(N-1) = e_2^2$. Then (4) yields
$$x^*(N) = L \ltimes u^*(N-1) \ltimes e_2^2 = L \ltimes \begin{bmatrix} v \\ \bar{v} \end{bmatrix} \ltimes e_2^2 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 0 & v & 0 & \bar{v} \end{bmatrix}^T = e_2^2,$$
so $r^T x^*(N) = 0$.

Case 2. Suppose that $x^*(N-1) = e_2^1$. Then (19) yields $\beta(N-1) = 1$, so by the MP, $u^*(N-1) = e_2^1$ and, therefore, $x^*(N) = e_2^1$. Using (16) yields
$$\lambda(N-1) = (L \ltimes u^*(N-1))^T \ltimes \lambda(N) = (L \ltimes e_2^1)^T \ltimes e_2^1 = e_2^1.$$
Hence,
$$\beta(N-2) = \lambda^T(N-1) \ltimes L \ltimes \begin{bmatrix} 1 \\ -1 \end{bmatrix} \ltimes x^*(N-2) = \begin{bmatrix} 1 & 0 \end{bmatrix} \ltimes x^*(N-2).$$
Comparing this with (19), we conclude that there are two possibilities. Either $x^*(N) = e_2^2$ (and then any control is optimal) or $x^*(N) = e_2^1$ and then the (unique) optimal control is $u^*(k) = e_2^1$ for all $k \in \{0, 1, \dots, N-1\}$. Thus, in this example the MP provides a complete characterization of the optimal control.
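The computations in Example 7 are easy to reproduce numerically. The sketch below (the helper name `stp` for the semi-tensor product is ours) runs the adjoint recursion (16) backward from $\lambda(N) = r$ and evaluates $\beta(s)$ along the candidate trajectory $u^*(k) = e_2^1$, for $N = 3$:

```python
import numpy as np

def stp(A, B):  # semi-tensor product A ⋉ B
    n, p = A.shape[1], B.shape[0]
    a = int(np.lcm(n, p))
    return np.kron(A, np.eye(a // n)) @ np.kron(B, np.eye(a // p))

L = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 1.0]])        # x(k+1) = x(k) ∧ u(k), Example 7
e1, e2 = np.array([[1.0], [0.0]]), np.array([[0.0], [1.0]])
r, N = e1, 3

# candidate optimal control u*(k) = e_2^1 and its trajectory from x(0) = True
us = [e1] * N
xs = [e1]
for k in range(N):
    xs.append(stp(stp(L, us[k]), xs[k]))

# adjoint (16): λ(N) = r, λ(k) = (L ⋉ u*(k))^T ⋉ λ(k+1)
lam = [None] * (N + 1)
lam[N] = r
for k in range(N - 1, -1, -1):
    lam[k] = stp(L, us[k]).T @ lam[k + 1]

# switching function β(s) = λ^T(s+1) ⋉ L ⋉ [1, -1]^T ⋉ x*(s)
d = np.array([[1.0], [-1.0]])
beta = [stp(stp(stp(lam[s + 1].T, L), d), xs[s]).item() for s in range(N)]
print(beta)   # [1.0, 1.0, 1.0] -> β(s) > 0, so u*(s) = e_2^1 at every step
```

The result agrees with the analysis above: $\beta(s) > 0$ at every step, so by (17) the unique optimal control keeps $u^*(k) = e_2^1$ throughout.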
The next section is devoted to the proof of Theorem 3.
5 Proof of main result
Fix an arbitrary time $p \in \{0, \dots, N-1\}$ and an arbitrary vector $v \in \{e_{2^m}^1, \dots, e_{2^m}^{2^m}\}$. Define a new control $u \in \mathcal{U}$ by a perturbation of $u^*$:
$$u(j) = \begin{cases} v, & \text{if } j = p, \\ u^*(j), & \text{otherwise}. \end{cases} \qquad (20)$$
In other words, $u$ is identical to the optimal control $u^*$ except, perhaps, at time $p$. This is a parallel of the needle variation used in the proof of the PMP.
Let $x^*(\cdot) = x(\cdot; u^*)$ denote the solution corresponding to $u^*$. It follows from the definition of the transition matrix that
$$x^*(N) = C(N, p+1; u^*) \ltimes C(p+1, p; u^*) \ltimes C(p, 0; u^*) \ltimes x_0 = C(N, p+1; u^*) \ltimes L \ltimes u^*(p) \ltimes x^*(p).$$
Similarly, for $x(\cdot) = x(\cdot; u)$, we have
$$x(N) = C(N, p+1; u) \ltimes L \ltimes u(p) \ltimes x(p; u) = C(N, p+1; u^*) \ltimes L \ltimes v \ltimes x^*(p),$$
where the second equation follows from the definition of $u$ in (20). Thus,
$$x^*(N) - x(N) = C(N, p+1; u^*) \ltimes L \ltimes (u^*(p) - v) \ltimes x^*(p),$$
and
$$J(u^*) - J(u) = r^T (x^*(N) - x(N)) = r^T \left( C(N, p+1; u^*) \ltimes L \ltimes (u^*(p) - v) \ltimes x^*(p) \right). \qquad (21)$$
To simplify this expression, let $w^T(p+1) = r^T C(N, p+1; u^*)$. Then $w^T(N) = r^T C(N, N; u^*) = r^T$, and
$$w(p) = C^T(N, p; u^*) r = \left( C(N, p+1; u^*) \ltimes C(p+1, p; u^*) \right)^T r = \left( C(N, p+1; u^*) C(p+1, p; u^*) \right)^T r = C^T(p+1, p; u^*) C^T(N, p+1; u^*) r = C^T(p+1, p; u^*) \ltimes w(p+1) = (L \ltimes u^*(p))^T \ltimes w(p+1).$$
Comparing this with (9) shows that $w(p) = \lambda(p)$ for all $p$, and thus (21) yields
$$J(u^*) - J(u) = \lambda^T(p+1) \ltimes L \ltimes (u^*(p) - v) \ltimes x^*(p). \qquad (22)$$
Now suppose that there exists an index $i$ such that $\alpha_i(p) > \alpha_j(p)$ for all $j \neq i$. We need to show that $u^*(p) = e_{2^m}^i$. Seeking a contradiction, assume that $u^*(p) = e_{2^m}^j$ for some $j \neq i$. Then for $v = e_{2^m}^i$ the right–hand side of (22) is
$$\lambda^T(p+1) \ltimes L \ltimes (e_{2^m}^j - e_{2^m}^i) \ltimes x^*(p) = \alpha_j(p) - \alpha_i(p) < 0,$$
so $J(u^*) - J(u) < 0$. But this contradicts the optimality of $u^*$. Thus, $u^*(p) = e_{2^m}^i$. Since $p$ is arbitrary, this completes the proof of Theorem 3. ⊓⊔
Summarizing, if there exists an index $i$ such that $\alpha_i(s) > \alpha_j(s)$ for all $j \neq i$, then the MP uniquely determines $u^*(s)$. In the next section, we consider the complementary situation, referred to as the singular case.
6 The Singular Case
The next result shows that the singular case is actually easy to handle.
Theorem 4. Suppose that $u^*$ is an optimal control. Assume that for some time $s$ there exists a subset of indexes $I = \{i_1, \dots, i_l\}$ such that $\alpha_{i_1}(s) = \dots = \alpha_{i_l}(s)$ and $\alpha_{i_1}(s) > \alpha_j(s)$ for all $j \notin I$. Then $u^*(s) \in \{e_{2^m}^{i_1}, \dots, e_{2^m}^{i_l}\}$. Furthermore, any control of the form
$$w(j) = \begin{cases} z, & \text{if } j = s, \\ u^*(j), & \text{otherwise}, \end{cases}$$
with $z \in \{e_{2^m}^{i_1}, \dots, e_{2^m}^{i_l}\}$, is also an optimal control.

Proof. For simplicity, assume that $I = \{i_1, i_2\}$ (the proof for the general case is similar). In this case, the conditions in the theorem are
$$\alpha_{i_1}(s) > \alpha_j(s) \quad \text{for all } j \notin \{i_1, i_2\}, \qquad (23)$$
and
$$0 = \alpha_{i_1}(s) - \alpha_{i_2}(s) = \lambda^T(s+1) \ltimes L \ltimes (e_{2^m}^{i_1} - e_{2^m}^{i_2}) \ltimes x^*(s).$$
Let $u^* \in \mathcal{U}$ be an optimal control and denote $z = u^*(s)$. Arguing as in the proof of Theorem 3, it follows from (23) that either $z = e_{2^m}^{i_1}$ or $z = e_{2^m}^{i_2}$. Assume that $z = e_{2^m}^{i_2}$. Define a new control $u$ by
$$u(j) = \begin{cases} e_{2^m}^{i_1}, & \text{if } j = s, \\ u^*(j), & \text{otherwise}. \end{cases} \qquad (24)$$
Then (22) yields
$$J(u^*) - J(u) = \lambda^T(s+1) \ltimes L \ltimes (z - e_{2^m}^{i_1}) \ltimes x^*(s) = \lambda^T(s+1) \ltimes L \ltimes (e_{2^m}^{i_2} - e_{2^m}^{i_1}) \ltimes x^*(s) = \alpha_{i_2}(s) - \alpha_{i_1}(s) = 0.$$
In other words, the new control $u$ is also an optimal control. Since $u(s) = e_{2^m}^{i_1}$, this completes the proof of Theorem 4. ⊓⊔
The next example demonstrates an application of Theorems 3 and 4.
Example 8. Consider the two–state, two–input BCN
$$x_1(k+1) = [x_1(k) \wedge x_2(k)] \vee [\bar{u}_1(k) \wedge u_2(k) \wedge x_2(k)] \vee [\bar{u}_2(k) \wedge x_1(k)],$$
$$x_2(k+1) = [\bar{u}_1(k) \wedge \bar{u}_2(k) \wedge \bar{x}_1(k) \wedge \bar{x}_2(k)] \vee [\bar{u}_1(k) \wedge u_2(k) \wedge x_1(k)] \vee [u_1(k) \wedge (x_1(k) \oplus x_2(k))]. \qquad (25)$$
Here $n = m = 2$. Suppose that the initial condition is $x_1(0) = x_2(0) = \text{False}$, and consider Problem 1 with $N = 3$ and $r = e_4^1$. In other words, the problem is to determine a control that maximizes $J(u) = (e_4^1)^T x(3)$. Intuitively, this amounts to finding a control steering the state to $x_1(3) = x_2(3) = \text{True}$, if it exists.
The algebraic state–space form is given by (4) with $x(0) = e_4^4$, and
$$L = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0
\end{bmatrix}.$$
To analyze this problem using the MP, consider the functions
$$\alpha_i(2) = \lambda^T(3) \ltimes L \ltimes e_4^i \ltimes x^*(2) = r^T \ltimes L \ltimes e_4^i \ltimes x^*(2) = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} \ltimes L \ltimes e_4^i \ltimes x^*(2) = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \ltimes e_4^i \ltimes x^*(2). \qquad (26)$$
Using the definition of the semi–tensor product yields
$$\alpha_1(2) = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \ltimes e_4^1 \ltimes x^*(2) = \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix} \ltimes x^*(2) = 0,$$
and similarly
$$\alpha_2(2) = \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix} \ltimes x^*(2), \qquad \alpha_3(2) = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} \ltimes x^*(2), \qquad \alpha_4(2) = 0.$$
We consider several cases.

Case 1. Assume that $x^*(2) = e_4^2$. Then $\alpha_2(2) = 1$, and $\alpha_i(2) = 0$ for any $i \neq 2$. Theorem 3 implies that $u^*(2) = e_4^2$. Using (9) yields
$$\lambda(2) = (L \ltimes u^*(2))^T \lambda(3) = (L \ltimes e_4^2)^T r = e_4^2, \qquad (27)$$
so
$$\alpha_i(1) = \lambda^T(2) \ltimes L \ltimes e_4^i \ltimes x^*(1) = (e_4^2)^T \ltimes L \ltimes e_4^i \ltimes x^*(1),$$
and this yields
$$\alpha_1(1) = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} \ltimes x^*(1), \qquad \alpha_2(1) = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} \ltimes x^*(1), \qquad \alpha_3(1) = \begin{bmatrix} 0 & 0 & 1 & 0 \end{bmatrix} \ltimes x^*(1), \qquad \alpha_4(1) = \begin{bmatrix} 1 & 1 & 0 & 0 \end{bmatrix} \ltimes x^*(1). \qquad (28)$$
We consider two subcases.

Case 1.1. Suppose that $x^*(1) = e_4^3$. Then $\alpha_3(1) > \alpha_j(1)$ for any $j \neq 3$, so Theorem 3 implies that $u^*(1) = e_4^3$. This yields
$$\lambda(1) = (L \ltimes u^*(1))^T \lambda(2) = (L \ltimes e_4^3)^T e_4^2 = e_4^3,$$
so
$$\alpha_i(0) = \lambda^T(1) \ltimes L \ltimes e_4^i \ltimes x^*(0) = (e_4^3)^T \ltimes L \ltimes e_4^i \ltimes e_4^4.$$
A calculation yields $\alpha_1(0) = \alpha_2(0) = \alpha_3(0) = 0$, and $\alpha_4(0) = 1$. By Theorem 3, $u^*(0) = e_4^4$. Summarizing, in this case we conclude that $\{u^*(0), u^*(1), u^*(2)\} = \{e_4^4, e_4^3, e_4^2\}$ is the only control that satisfies the necessary condition for optimality. A calculation shows that the corresponding trajectory is $x^*(0) = e_4^4$, $x^*(1) = e_4^3$, $x^*(2) = e_4^2$, and $x^*(3) = e_4^1$, so this control indeed steers the system to the desired location. If we
are interested in finding one control that steers the system to the desired location,
then we may stop the calculations at this point. Otherwise, we continue to con-
sider the remaining cases. It turns out that for this particular example, Theorems 3
and 4 provide enough information to explicitly determine the optimal controls. To
demonstrate this, we consider one more subcase.
Case 1.2. Suppose that
$$x^*(1) = e_4^1. \qquad (29)$$
Then (28) yields $\alpha_1(1) = \alpha_2(1) = \alpha_4(1) = 1$, and $\alpha_3(1) = 0$. Theorem 4 implies that if $u^*$ is an optimal control, we may assume that $u^*(1) = e_4^1$. Then
$$\lambda(1) = (L \ltimes u^*(1))^T \lambda(2) = (L \ltimes e_4^1)^T e_4^2 = e_4^1,$$
so
$$\alpha_i(0) = \lambda^T(1) \ltimes L \ltimes e_4^i \ltimes x^*(0) = (e_4^1)^T \ltimes L \ltimes e_4^i \ltimes e_4^4.$$
A calculation yields $\alpha_i(0) = 0$ for all $i$. Theorem 4 now implies that there exists an optimal control satisfying $u^*(0) = e_4^1$. Then
$$x^*(1) = L \ltimes u^*(0) \ltimes x^*(0) = L \ltimes e_4^1 \ltimes e_4^4 = e_4^4,$$
but this contradicts (29), so we conclude that Case 1.2 is not possible.
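The control singled out in Case 1.1 can be checked directly against the transition matrix of Example 8: applying $u = e_4^4, e_4^3, e_4^2$ from $x(0) = e_4^4$ should land the state at $e_4^1$. A minimal sketch (helper names `e` and `step` are ours), exploiting the fact that $L \ltimes e_4^{j} \ltimes x$ is the $j$th column block of $L$ applied to $x$:

```python
import numpy as np

L = np.array([
    [0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0]], dtype=float)

def e(i, n):
    v = np.zeros(n); v[i - 1] = 1.0; return v

def step(L, uj, x):        # x(k+1) = L ⋉ e_4^{uj} ⋉ x(k): pick the uj-th column block
    return L[:, (uj - 1) * 4 : uj * 4] @ x

x = e(4, 4)                # x(0): x1 = x2 = False
for uj in (4, 3, 2):       # the control {e_4^4, e_4^3, e_4^2} singled out by the MP
    x = step(L, uj, x)
print(x)                   # [1. 0. 0. 0.] = e_4^1, i.e. x1(3) = x2(3) = True, J = 1
```

The final state is $e_4^1$, so $J = (e_4^1)^T x(3) = 1$, confirming that the control found in Case 1.1 attains the maximal cost.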
7 Conclusions
We considered a Mayer–type optimal control problem for BCNs. Using the alge-
braic state–space formulation developed by Daizhan Cheng, we derived a necessary
condition for optimality in the form of a maximum principle. We also analyzed the
singular case where the MP itself does not provide direct information on the optimal
control.
Several synthetic examples were used to demonstrate the application of the MP.
A natural direction for further research is the analysis of optimal controls in BCNs
that model real biological systems using the MP derived here. The main difficulty is
that an MP provides implicit information on the optimal control, as the necessary
condition for optimality is stated in terms of the switching functions that depend on
the (unknown) optimal control. Nevertheless, there are many important cases where
the PMP, combined with geometric tools, provides a complete characterization of
optimal controls (see, e.g., [48, 49, 50, 51, 52]). It might be interesting to search
for such special cases in the context of BCNs. Control problems for BCNs are in
general NP–hard [53], yet this does not preclude the existence of important special
cases that can be solved analytically.
We believe that further developments in optimal control theory for BCNs may
lead to important insights on the design of suitable controls for various real–world
systems modeled using BCNs.
References
1. J. L. Schiff, Cellular Automata: A Discrete View of the World. Wiley-Interscience,
2008.
2. M. H. Hassoun, Fundamentals of Artificial Neural Networks. MIT Press, 1995.
3. S. A. Kauffman, “Metabolic stability and epigenesis in randomly constructed genetic
nets,” J. Theoretical Biology, vol. 22, pp. 437–467, 1969.
4. R. Albert and A.-L. Barabási, “Dynamics of complex systems: scaling laws for the
period of Boolean networks,” Phys. Rev. Lett., vol. 84, pp. 5660–5663, 2000.
5. M. Aldana, “Boolean dynamics of networks with scale–free topology,” Physica D, vol.
185, pp. 45–66, 2003.
6. B. Derrida and Y. Pomeau, “Random networks of automata: a simple annealed ap-
proximation,” Europhys. Lett., vol. 1, pp. 45–49, 1986.
7. B. Drossel, T. Mihaljev, and F. Greil, “Number and length of attractors in a critical
Kauffman model with connectivity one,” Phys. Rev. Lett., vol. 94, 2005, 088701.
8. S. A. Kauffman, Origins of Order: Self–Organization and Selection in Evolution. Ox-
ford University Press, 1993.
9. B. Luque and R. V. Sole, “Lyapunov exponents in random Boolean networks,” Physica
A: Statistical Mechanics and its Applications, vol. 284, pp. 33–45, 2000.
10. B. Samuelsson and C. Troein, “Superpolynomial growth in the number of attractors
in Kauffman networks,” Phys. Rev. Lett., vol. 90, 2003, 098701.
11. S. Huang, “Regulation of cellular states in mammalian cells from a genomewide view,”
in Gene Regulation and Metabolism, J. Collado-Vides and R. Hofestadt, Eds. MIT
Press, 2002, pp. 181–220.
12. M. Ptashne, A Genetic Switch, 3rd ed. Cold Spring Harbor, 2004.
13. D. Laschov and M. Margaliot, “Mathematical modeling of the λ switch: a fuzzy logic
approach,” J. Theoretical Biology, vol. 260, pp. 475–489, 2009.
14. F. Li, T. Long, Y. Lu, Q. Ouyang, and C. Tang, “The yeast cell–cycle network is
robustly designed,” Proc. Natl. Acad. Sci. U.S.A., vol. 101, pp. 4781–4786, 2004.
15. S. Kauffman, C. Peterson, B. Samuelsson, and C. Troein, “Random Boolean network
models and the yeast transcriptional network,” Proc. Natl. Acad. Sci. U.S.A., vol. 100,
pp. 14796–14799, 2003.
16. R. Albert and H. G. Othmer, “The topology of the regulatory interactions predicts
the expression pattern of the segment polarity genes in Drosophila melanogaster,” J.
Theoretical Biology, vol. 223, pp. 1–18, 2003.
17. M. Chaves, R. Albert, and E. D. Sontag, “Robustness and fragility of Boolean models
for genetic regulatory networks,” J. Theoretical Biology, vol. 235, pp. 431–449, 2005.
18. C. Espinosa-Soto, P. Padilla-Longoria, and E. R. Alvarez-Buylla, “A gene regulatory
network model for cell–fate determination during Arabidopsis thaliana flower devel-
opment that is robust and recovers experimental gene expression profiles,” Plant Cell,
vol. 16, pp. 2923–2939, 2004.
19. A. Chaos, M. Aldana, C. Espinosa-Soto, B. G. P. de Leon, A. G. Arroyo, and E. R.
Alvarez-Buylla, “From genes to flower patterns and evolution: dynamic models of gene
regulatory networks,” J. Plant Growth Regul., vol. 25, pp. 278–289, 2006.
20. S. Li, S. M. Assmann, and R. Albert, “Predicting essential components of signal trans-
duction networks: a dynamic model of guard cell abscisic acid signaling,” PLoS Biol.,
vol. 4, pp. 1732–1748, 2006.
21. S. Gupta, S. S. Bisht, R. Kukreti, S. Jain, and S. K. Brahmachari, “Boolean network
analysis of a neurotransmitter signaling pathway,” J. Theoretical Biology, vol. 244, pp.
463–469, 2007.
22. Z. Szallasi and S. Liang, “Modeling the normal and neoplastic cell cycle with “realistic
Boolean genetic networks”: their application for understanding carcinogenesis and
assessing therapeutic strategies,” Pac. Symp. Biocomput., vol. 3, pp. 66–76, 1998.
23. S. Kauffman, “Differentiation of malignant to benign cells,” J. Theoretical Biology,
vol. 31, pp. 429–451, 1971.
24. H. Bolouri, Computational Modelling of Gene Regulatory Networks–A Primer. Im-
perial College Press, 2008.
25. I. Shmulevich, E. R. Dougherty, S. Kim, and W. Zhang, “Probabilistic Boolean net-
works: a rule-based uncertainty model for gene regulatory networks,” Bioinformatics,
vol. 18, pp. 261–274, 2002.
26. I. Shmulevich, E. R. Dougherty, and W. Zhang, “From Boolean to probabilistic
Boolean networks as models of genetic regulatory networks,” Proc. of the IEEE, vol. 90,
pp. 1778–1792, 2002.
27. A. Datta, R. Pal, A. Choudhary, and E. R. Dougherty, “Control approaches for prob-
abilistic gene regulatory networks,” IEEE Signal Processing Magazine, vol. 24, pp.
54–63, 2007.
28. Q. Liu, X. Guo, and T. Zhou, “Optimal control for probabilistic Boolean networks,”
IET Systems Biology, vol. 4, pp. 99–107, 2010.
29. D. Cheng, “Disturbance decoupling of Boolean control networks,” IEEE Trans. Auto-
matic Control, vol. 56, pp. 2–10, 2011.
30. D. Cheng and H. Qi, “Controllability and observability of Boolean control networks,”
Automatica, vol. 45, pp. 1659–1667, 2009.
31. D. Cheng, Z. Li, and H. Qi, “Realization of Boolean control networks,” Automatica,
vol. 46, pp. 62–69, 2010.
32. D. Cheng and H. Qi, “A linear representation of dynamics of Boolean networks,” IEEE
Trans. Automatic Control, vol. 55, pp. 2251–2258, 2010.
33. ——, “State-space analysis of Boolean networks,” IEEE Trans. Neural Networks,
vol. 21, pp. 584–594, 2010.
34. D. Cheng, “Input-state approach to Boolean networks,” IEEE Trans. Neural Networks,
vol. 20, pp. 512–521, 2009.
35. A. A. Agrachev and Y. L. Sachkov, Control Theory From The Geometric Viewpoint,
ser. Encyclopedia of Mathematical Sciences. Springer-Verlag, 2004, vol. 87.
36. H. J. Sussmann and J. C. Willems, “300 years of optimal control: from the brachys-
tochrone to the maximum principle,” IEEE Control Systems Magazine, vol. 17, pp.
32–44, 1997.
37. B. Bonnard and M. Chyba, Singular Trajectories and their Role in Control Theory.
Springer, 2003.
38. M. Margaliot, “Stability analysis of switched systems using variational principles: An
introduction,” Automatica, vol. 42, pp. 2059–2077, 2006.
39. M. Margaliot and M. S. Branicky, “Nice reachability for planar bilinear control systems
with applications to planar linear switched systems,” IEEE Trans. Automatic Control,
vol. 54, pp. 1430–1435, 2009.
40. Y. Sharon and M. Margaliot, “Third-order nilpotency, finite switchings and asymptotic
stability,” J. Diff. Eqns., vol. 233, pp. 136–150, 2007.
41. M. Margaliot and D. Liberzon, “Lie–algebraic stability conditions for nonlinear
switched systems and differential inclusions,” Systems Control Lett., vol. 55, pp. 8–16,
2006.
42. D. Laschov and M. Margaliot, “A maximum principle for single-input Boolean control
networks,” IEEE Trans. Automatic Control, vol. 56, pp. 913–917, 2011.
43. D. Cheng and Y. Dong, “Semi-tensor product of matrices and its some applications
to physics,” Methods Appl. Anal., vol. 10, pp. 565–588, 2003.
44. D. S. Bernstein, Matrix Mathematics. Princeton University Press, 2005.
45. G. Langholz, J. L. Mott, and A. Kandel, Foundations of Digital Logic Design. World
Scientific, 1998.
46. S. P. Sethi and G. L. Thompson, Optimal Control Theory: Applications to Management
Science and Economics. Kluwer Academic Publishers, 2000.
47. T. Monovich and M. Margaliot, “Analysis of discrete–time linear switched systems: A
variational approach,” SIAM J. Control Optim., vol. 49, pp. 808–829, 2011.
48. H. J. Sussmann, “The structure of time-optimal trajectories for single-input systems
in the plane: The general real analytic case,” SIAM J. Control Optim., vol. 25, pp.
868–904, 1987.
49. M. Athans and E. Tse, “A direct derivation of the optimal linear filter using the
maximum principle,” IEEE Trans. Automatic Control, vol. 12, pp. 690–698, 1967.
50. A. E. Bryson and Y.-C. Ho, Applied Optimal Control: Optimization, Estimation and
Control. Taylor & Francis, 1975.
51. U. Ledzewicz and H. Schattler, “Antiangiogenic therapy in cancer treatment as an
optimal control problem,” SIAM J. Control Optim., vol. 46, pp. 1052–1079, 2007.
52. H. J. Sussmann and G. Tang, “Shortest paths for the Reeds-Shepp car: A worked
out example of the use of geometric techniques in nonlinear optimal control,” 1991.
[Online]. Available: http://www.math.rutgers.edu/~sussmann
53. T. Akutsu, M. Hayashida, W.-K. Ching, and M. K. Ng, “Control of Boolean networks:
Hardness results and algorithms for tree structured networks,” J. Theoretical Biology,
vol. 244, pp. 670–679, 2007.