A Pontryagin Maximum Principle for Multi–Input Boolean Control Networks⋆
Dmitriy Laschov and Michael Margaliot⋆⋆
School of Electrical Engineering–Systems, Tel Aviv University, Israel 69978.
Summary. A Boolean network consists of a set of Boolean variables whose state is deter-
mined by other variables in the network. Boolean networks have been studied extensively
as models for simple artificial neural networks. Recently, Boolean networks gained con-
siderable interest as models for biological systems composed of elements that can be in
one of two possible states. Examples include genetic regulation networks, where the ON
(OFF) state corresponds to the transcribed (quiescent) state of a gene, and cellular net-
works where the two possible logic states may represent the open/closed state of an ion
channel, basal/high activity of an enzyme, two possible conformational states of a pro-
tein, etc. Daizhan Cheng developed an algebraic state-space representation for Boolean
control networks using the semi–tensor product of matrices. This representation proved
quite useful for studying Boolean control networks in a control-theoretic framework. Using
this representation, we consider a Mayer-type optimal control problem for Boolean control
networks. Our main result is a necessary condition for optimality. This provides a parallel
of Pontryagin’s maximum principle for Boolean control networks.
1 Introduction
A Boolean network consists of a set of Boolean variables whose state is determined
by other variables in the network. Cellular automata, with two possible states per
⋆ Research supported in part by the Israel Science Foundation (ISF).
⋆⋆ Corresponding author: Prof. Michael Margaliot, School of Electrical Engineering–Systems, Tel Aviv University, Israel 69978. Homepage: www.eng.tau.ac.il/~michaelm. Email: [email protected]
cell, are a particular case of Boolean networks. Here the state of each variable at
time k+1 is determined by the state of its spatial neighbors at time k [1]. A Boolean
network with $n$ variables has $2^n$ possible states, and therefore the dynamics for any initial condition must eventually enter an attractor.
Boolean networks have been studied extensively as models for simple artificial
neural networks (see, e.g. [2]). Here each neuron realizes a threshold function that
attains the values zero or one. More recently, Boolean networks gained renewed
interest as models for biological systems composed of elements that can be in one
of two possible states (i.e., ON or OFF). S. A. Kauffman [3] modeled a gene as a
binary device, and studied the behavior of large, randomly constructed nets of these
binary genes. Kauffman’s simulations indicate that if each network node has two
or three inputs, then the dynamical behavior of the network demonstrates order
and stability. Kauffman also related the behavior of the random nets to various
cellular control processes including cell differentiation. The key idea is to view each stable attractor as representing one possible cell type.
Kauffman’s pioneering ideas stimulated research in several directions. One di-
rection is the theoretical analysis of the dynamics of Boolean networks, espe-
cially using tools from the theory of complex systems and statistical physics (see,
e.g. [4, 5, 6, 7, 8, 9, 10]).
Another research direction is modeling various biological processes using Boolean
networks. Analyzing the behavior of the Boolean network may provide considerable insight into the original biological process. This is a vast area of research and we
review here only a few examples.
1.1 Boolean network modeling in biology
Boolean networks seem especially suitable for modeling genetic regulation networks
where the ON (OFF) state corresponds to the transcribed (quiescent) state of the
gene. There are several other motivations [11] for using Boolean networks in this con-
text, including the fact that many metabolic and genetic networks demonstrate some
form of bi-stability. Important examples are epigenetic switches (see, e.g. [12, 13]).
Specific examples of genetic regulation networks modeled using Boolean networks
include: the cell–cycle regulatory network of the budding yeast [14]; the yeast tran-
scriptional network [15]; the network controlling the segment polarity genes in the
fly Drosophila melanogaster [16, 17]; the ABC network determining floral organ cell
fate in Arabidopsis [18] (see also [19]).
Boolean networks were also used for modeling various cellular processes. In this
context the two possible logic states may represent the open/closed state of an ion
channel, basal/high activity of an enzyme, two possible conformational states of
a protein, etc. Specific examples include: a detailed model for the highly complex
cellular signaling network controlling stomatal closure in plants [20]; and a model of
the molecular pathway between two neurotransmitter systems, the dopamine and
glutamate receptors [21].
Szallasi and Liang [22] discuss the use of Boolean networks in modeling carcino-
genesis and for analyzing the effect of therapeutic intervention (see also [23]).
These studies suggest that Boolean networks provide a highly efficient modeling
tool for large–scale biological networks. These models are able to reproduce the
main characteristics of the biological network dynamics: attractors of the Boolean
network correspond to stationary biological states; large attraction basins indicate
robustness of the biological state, and so on.
Modeling using Boolean networks requires only coarse–grained qualitative infor-
mation (e.g., an interaction between two genes is either activating or inhibiting).
This is in sharp contrast to other models, for example, those based on differential
equations, that require knowledge of numerous parameter values (e.g., rate con-
stants). For a general exposition on various approaches for modeling gene regulation
networks, see [24].
Modeling a biological system involves considerable uncertainty. This is due to
the noise and perturbations that affect the biological system, and to the inaccuracies
of the measuring equipment. One approach for tackling this uncertainty is by using
Probabilistic Boolean Networks (PBNs) [25, 26]. These may be viewed as a collection
of (deterministic) Boolean networks combined with a probabilistic switching rule
determining which network is active at each time instant.
It is natural to extend the idea of Boolean networks to include input variables.
For example, an input may represent the dosage that is administered to a patient.
Boolean networks with (binary) input variables are referred to as Boolean Control
Networks (BCNs). PBNs with inputs were used to design and analyze therapeutic
intervention strategies. The idea here is to find a control that shifts the network from
an undesirable state (representing a “diseased” condition) to a desirable one. Such
problems can be cast as stochastic optimal control problems, and solved numerically
using dynamic programming [27, 28].
Daizhan Cheng and his colleagues developed an algebraic state–space representa-
tion of BCNs using the semi–tensor product of matrices. This representation proved
quite useful for studying BCNs in a control–theoretic framework. Examples include
the analysis of disturbance decoupling [29], controllability and observability [30],
realization theory [31], and more [32, 33, 34].
Here we make use of this state–space representation to analyze a Mayer–type
optimal control problem for BCNs. Our main result is a necessary condition for a
control to be optimal. This provides a parallel of the celebrated Pontryagin max-
imum principle (PMP) (see, e.g., [35, 36, 37]) for BCNs. The proof of our main
result is motivated by the simple proof of a special case of the PMP used in the
variational analysis of switched systems [38] (see also [39, 40, 41]). The first result
in this direction appeared in our recent paper [42] describing a maximum principle
for the special case of single–input BCNs.
The remainder of this chapter is organized as follows. Section 2 reviews BCNs.
Section 3 describes Cheng’s algebraic state–space representation of BCNs using
the semi–tensor product of matrices. Section 4 details our main result which is a
Fig. 1 Graphical representation of the BCN in Example 1.
new maximum principle (MP) for BCNs. Section 5 includes the proof of our main
result. In Section 6 we consider the so–called singular case where the MP does not
provide any direct information on the optimal control. Several synthetic examples
demonstrate the application of the new MP.
2 Boolean control networks
A Boolean control network is a discrete–time logical dynamic control system in the
form
$$x_1(k+1) = f_1(x_1(k), \dots, x_n(k), u_1(k), \dots, u_m(k)),$$
$$\vdots$$
$$x_n(k+1) = f_n(x_1(k), \dots, x_n(k), u_1(k), \dots, u_m(k)), \qquad (1)$$
where $x_i, u_i \in \{\text{True}, \text{False}\}$, and each $f_i$ is a Boolean function.
A BCN may be represented graphically as a network with $n$ nodes, representing the $x_i$'s, and $m$ inputs. A directed edge from node $i$ (input $u_i$) to node $j$ implies that $x_j(k+1)$ depends on $x_i(k)$ ($u_i(k)$).
Example 1. Consider the two–state, two–input BCN
$$x_1(k+1) = x_1(k) \vee [x_2(k) \wedge u_1(k)], \qquad x_2(k+1) = x_2(k) \wedge u_2(k). \qquad (2)$$
Fig. 1 depicts the graphical representation of this BCN.
It is worth noting that a BCN with $m$ inputs is a Boolean switched system switching between $2^m$ possible subsystems, with the value of the control determining which subsystem is active at every time step. To demonstrate this, note that in (2) the control may attain one of four values: $(u_1(k), u_2(k)) \in \{TT, TF, FT, FF\}$, where $T$ ($F$) is shorthand for True (False). With each of these four possible values we can associate a corresponding dynamics, i.e. a subsystem. For example, when $u_1(k) = u_2(k) = T$ the corresponding subsystem is given by
1 A Pontryagin Maximum Principle for Multi–Input Boolean Control Networks 5
$$x_1(k+1) = x_1(k) \vee x_2(k), \qquad x_2(k+1) = x_2(k).$$
3 Algebraic state–space representation of BCNs
Control–theoretic problems for BCNs are best addressed in the algebraic state–space
representation for BCNs derived by Daizhan Cheng and his colleagues [43, 32, 34,
29, 31]. This is based on the semi–tensor product of matrices.
3.1 Semi–tensor product
Recall that the Kronecker product (see, e.g. [44, Chapter 7]) of two matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times q}$ is
$$A \otimes B = \begin{bmatrix} a_{11}B & \cdots & a_{1n}B \\ \vdots & \ddots & \vdots \\ a_{m1}B & \cdots & a_{mn}B \end{bmatrix}.$$
Note that $(A \otimes B) \in \mathbb{R}^{(mp) \times (nq)}$.
Given two positive integers a, b, let lcm(a, b) denote the least common multiple
of a and b. For example, lcm(6, 8) = 24. Let In denote the n × n identity matrix.
Definition 1. The semi–tensor product of two matrices $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{p \times q}$ is
$$A \ltimes B = (A \otimes I_{\alpha/n})(B \otimes I_{\alpha/p}),$$
where $\alpha = \operatorname{lcm}(n, p)$.
Remark 1. Note that $(A \otimes I_{\alpha/n}) \in \mathbb{R}^{(m\alpha/n) \times \alpha}$ and $(B \otimes I_{\alpha/p}) \in \mathbb{R}^{\alpha \times (q\alpha/p)}$, so $(A \ltimes B) \in \mathbb{R}^{(m\alpha/n) \times (q\alpha/p)}$.
Remark 2. If $n = p$, then $A \ltimes B = (A \otimes I_1)(B \otimes I_1) = AB$, i.e. in this case we recover the standard matrix product. Thus, we may view the semi–tensor product as a generalization of the standard matrix product that provides a way to multiply two matrices of arbitrary dimensions. Intuitively, this is based on first modifying $A$, $B$ into two matrices $(A \otimes I_{\alpha/n})$, $(B \otimes I_{\alpha/p})$ of compatible dimensions and then calculating their standard matrix product. The following examples demonstrate this idea.
Example 2. Consider $a \ltimes b$, where $a, b \in \mathbb{R}^2$ are column vectors. Here $m = p = 2$ and $n = q = 1$, so $\alpha = \operatorname{lcm}(n, p) = 2$, and
$$a \ltimes b = (a \otimes I_2)(b \otimes I_1) = \begin{bmatrix} a_1 & 0 \\ 0 & a_1 \\ a_2 & 0 \\ 0 & a_2 \end{bmatrix} b = \begin{bmatrix} a_1 b_1 & a_1 b_2 & a_2 b_1 & a_2 b_2 \end{bmatrix}^T.$$
Example 3. Consider the semi–tensor product of a row vector $a^T = \begin{bmatrix} a_1 & \dots & a_n \end{bmatrix}$ and a column vector $b = \begin{bmatrix} b_1 & \dots & b_p \end{bmatrix}^T$. Suppose that $p$ divides $n$, i.e. $s = n/p$ is an integer. Then $\alpha = \operatorname{lcm}(n, p) = n$, so
$$a^T \ltimes b = (a^T \otimes I_1)(b \otimes I_s) = a^T \begin{bmatrix} b_1 I_s \\ \vdots \\ b_p I_s \end{bmatrix}.$$
Various properties of the semi–tensor product are analyzed in [43]. For our purposes, it is sufficient to note that this product is associative,
$$A \ltimes (B \ltimes C) = (A \ltimes B) \ltimes C,$$
and distributive,
$$(A + B) \ltimes C = (A \ltimes C) + (B \ltimes C).$$
3.2 Algebraic representation of Boolean functions
The semi–tensor product allows representing Boolean functions in an algebraic form. Let $e_n^i$ denote the $i$th column of the identity matrix $I_n$. Represent the Boolean values True and False by $e_2^1 = \begin{bmatrix} 1 & 0 \end{bmatrix}^T$ and $e_2^2 = \begin{bmatrix} 0 & 1 \end{bmatrix}^T$, respectively. Then any Boolean function of $n$ variables $f : \{\text{True}, \text{False}\}^n \to \{\text{True}, \text{False}\}$ can be equivalently represented as a mapping $\tilde{f} : \{e_2^1, e_2^2\}^n \to \{e_2^1, e_2^2\}$. With some abuse of notation, we identify $f$ with $\tilde{f}$. In other words, from here on a Boolean variable $x_i$ is always a vector in $\{e_2^1, e_2^2\}$.
The next result shows that any Boolean function may be represented in an
algebraic form.
Theorem 1. [32] Let $f : \{e_2^1, e_2^2\}^n \to \{e_2^1, e_2^2\}$ be a Boolean function. There exists a unique binary matrix $M_f$ of dimensions $2 \times 2^n$ such that
$$f(x_1, \dots, x_n) = M_f \ltimes x_1 \ltimes \dots \ltimes x_n.$$
$M_f$ is called the structure matrix of $f$.
Remark 3. To provide some intuition on this representation, consider the case $n = 2$, i.e. $f = f(x_1, x_2)$. Recall that $x_i \in \{e_2^1, e_2^2\}$, so $x_1 = \begin{bmatrix} v & \bar{v} \end{bmatrix}^T$ and $x_2 = \begin{bmatrix} w & \bar{w} \end{bmatrix}^T$, with $v, w \in \{0, 1\}$. Then
$$x_1 \ltimes x_2 = \begin{bmatrix} vw & v\bar{w} & \bar{v}w & \bar{v}\bar{w} \end{bmatrix}^T, \qquad (3)$$
i.e. $x_1 \ltimes x_2$ contains all the possible minterms of $v$ and $w$. Recall that any Boolean function may be represented as a sum of some minterms of its variables (see, e.g. [45]). This is known as the sum of products (SOP) representation. The multiplication $M_f \ltimes x_1 \ltimes x_2$ provides such a representation. Note that (3) implies that $x_1 \ltimes x_2 \in \{e_4^1, \dots, e_4^4\}$. Indeed, one and only one minterm has the value 1 and all the others must be 0.
Example 4. Consider the function $f(x) = \bar{x}$, i.e. $f$ is defined by $f(e_2^1) = e_2^2$ and $f(e_2^2) = e_2^1$. It is easy to verify that $f(x) = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \ltimes x$. Consider the function $g(x_1, x_2) = x_1 \wedge x_2$. It is straightforward to verify that
$$g(x_1, x_2) = M_g \ltimes x_1 \ltimes x_2,$$
with $M_g = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix}$. For example,
$$M_g \ltimes e_2^1 \ltimes e_2^2 = M_g \ltimes \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix}^T = M_g \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix}^T = \begin{bmatrix} 0 & 1 \end{bmatrix}^T = e_2^2,$$
corresponding to $(\text{True} \wedge \text{False}) = \text{False}$.
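A structure matrix can be built mechanically by enumerating minterm columns: column $j$ (counting from 1) corresponds to the Boolean assignment encoded by $x_1 \ltimes \dots \ltimes x_n = e_{2^n}^j$, with the first minterm being the all-True assignment. A small sketch (helper names `bits` and `structure_matrix` are ours) that recovers $M_g$ from Example 4:

```python
import numpy as np

def bits(j, n):
    """Boolean assignment for minterm column j (0-based); minterm 0 is all-True."""
    return [not ((j >> (n - 1 - t)) & 1) for t in range(n)]

def structure_matrix(f, n):
    """2 x 2^n structure matrix M_f: column j is e_2^1 if f evaluates True, else e_2^2."""
    M = np.zeros((2, 2 ** n))
    for j in range(2 ** n):
        M[0 if f(*bits(j, n)) else 1, j] = 1.0
    return M

print(structure_matrix(lambda x1, x2: x1 and x2, 2))
# [[1. 0. 0. 0.]
#  [0. 1. 1. 1.]]
```

The same routine reproduces the negation matrix of Example 4: `structure_matrix(lambda x: not x, 1)` yields the antidiagonal matrix.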
3.3 Algebraic representation of BCNs
Since the dynamics of BCNs is described by a set of Boolean functions, it is clear
from the discussion above that the semi–tensor product can be used to provide an
algebraic state–space representation of BCNs.
Theorem 2. [33] Consider a BCN with state variables $x_1, \dots, x_n$ and inputs $u_1, \dots, u_m$, where $x_i, u_i \in \{e_2^1, e_2^2\}$. Denote $x(k) = x_1(k) \ltimes \dots \ltimes x_n(k)$ and $u(k) = u_1(k) \ltimes \dots \ltimes u_m(k)$. There exists a unique matrix $L \in \mathbb{R}^{2^n \times 2^{n+m}}$ such that
$$x(k+1) = L \ltimes u(k) \ltimes x(k). \qquad (4)$$
The matrix $L$ is called the transition matrix of the BCN.
Algorithms for converting a BCN in the form (1) to its algebraic representation (4),
and vice versa, may be found in [32, 30].
Remark 4. The intuition behind this representation is very similar to the algebraic representation of a single Boolean function using the semi–tensor product. To demonstrate this, consider a BCN with $n = 2$ and $m = 1$. Then (4) becomes $x(k+1) = L \ltimes u_1(k) \ltimes x_1(k) \ltimes x_2(k)$. To simplify the notation, we omit from here on the dependence on $k$. Denote $x_1 = \begin{bmatrix} p & \bar{p} \end{bmatrix}^T$, $x_2 = \begin{bmatrix} q & \bar{q} \end{bmatrix}^T$, and $u_1 = \begin{bmatrix} v & \bar{v} \end{bmatrix}^T$. Then
$$u_1 \ltimes x_1 \ltimes x_2 = \begin{bmatrix} vpq & vp\bar{q} & v\bar{p}q & v\bar{p}\bar{q} & \bar{v}pq & \bar{v}p\bar{q} & \bar{v}\bar{p}q & \bar{v}\bar{p}\bar{q} \end{bmatrix}^T.$$
Thus, $u \ltimes x$ includes all the possible minterms of the input and state variables. The equation $x(k+1) = L \ltimes u(k) \ltimes x(k)$ amounts to a description of (every minterm of) the next state in terms of the current state and inputs.
Remark 5. Note that since $u(k) = u_1(k) \ltimes \dots \ltimes u_m(k)$, with $u_i(k) \in \{e_2^1, e_2^2\}$, we have $u(k) \in \{e_{2^m}^1, \dots, e_{2^m}^{2^m}\}$. For example, if $m = 3$, $u_1(k) = e_2^1$, $u_2(k) = e_2^2$, and $u_3(k) = e_2^2$, then $u(k) = e_8^4$.
Example 5. Consider the BCN in Example 1. Here $n = 2$ and $m = 2$, so $x(k) = x_1(k) \ltimes x_2(k)$ and $u(k) = u_1(k) \ltimes u_2(k)$. Applying the algorithm described in [30] yields the transition matrix
$$L = \begin{bmatrix}
1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 1
\end{bmatrix}.$$
To demonstrate the equivalence of the original dynamics and (4), consider for example the case where $x_1(k) = \text{False}$, $x_2(k) = \text{True}$, $u_1(k) = \text{True}$, and $u_2(k) = \text{False}$. Then (2) yields
$$x_1(k+1) = \text{True}, \qquad x_2(k+1) = \text{False}. \qquad (5)$$
In the algebraic framework, this corresponds to $x_1(k) = u_2(k) = e_2^2$, $x_2(k) = u_1(k) = e_2^1$. Then
$$x(k+1) = L \ltimes u(k) \ltimes x(k) = L \ltimes \begin{bmatrix} 1 \\ 0 \end{bmatrix} \ltimes \begin{bmatrix} 0 \\ 1 \end{bmatrix} \ltimes \begin{bmatrix} 0 \\ 1 \end{bmatrix} \ltimes \begin{bmatrix} 1 \\ 0 \end{bmatrix} = L \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}^T = \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix}^T.$$
Writing $x_1(k+1) = \begin{bmatrix} v & \bar{v} \end{bmatrix}^T$ and $x_2(k+1) = \begin{bmatrix} w & \bar{w} \end{bmatrix}^T$ yields $x(k+1) = \begin{bmatrix} vw & v\bar{w} & \bar{v}w & \bar{v}\bar{w} \end{bmatrix}^T$, so $v = \bar{w} = 1$. Thus, $x_1(k+1) = e_2^1$, $x_2(k+1) = e_2^2$, and this agrees, of course, with (5).
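A transition matrix such as the one in Example 5 can be generated by evaluating the update rules on every (input, state) minterm pair. The sketch below is a naive stand-in for the conversion algorithms of [32, 30] (the helper names `bits`, `idx`, and `build_L` are ours); it rebuilds the matrix $L$ of Example 5 from the rules (2) and checks the column that Example 5 works out by hand:

```python
import numpy as np

def bits(j, n):   # minterm j (0-based) -> Boolean assignment; minterm 0 = all True
    return [not ((j >> (n - 1 - t)) & 1) for t in range(n)]

def idx(vals):    # Boolean assignment -> minterm index (0-based)
    return sum((0 if b else 1) << (len(vals) - 1 - t) for t, b in enumerate(vals))

def build_L(step, n, m):
    """Transition matrix of x(k+1) = L ⋉ u(k) ⋉ x(k); column uj*2^n + xj holds
    the next-state indicator for input minterm uj and state minterm xj."""
    N, M = 2 ** n, 2 ** m
    L = np.zeros((N, M * N))
    for uj in range(M):
        for xj in range(N):
            L[idx(step(bits(uj, m), bits(xj, n))), uj * N + xj] = 1.0
    return L

# BCN (2): x1' = x1 ∨ (x2 ∧ u1),  x2' = x2 ∧ u2
step = lambda u, x: [x[0] or (x[1] and u[0]), x[1] and u[1]]
L = build_L(step, n=2, m=2)
# x1 = False, x2 = True, u1 = True, u2 = False: u⋉x = e_16^7, next state e_4^2
print(L[:, 6])   # [0. 1. 0. 0.]
```

The column extracted at the end matches the hand computation in Example 5: the state moves to $x_1(k+1) = \text{True}$, $x_2(k+1) = \text{False}$.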
4 Main result
A fundamental problem for all dynamical control systems is to determine a con-
trol that is optimal in some sense. In other words, a control that maximizes (or
minimizes) a given cost–functional. Our main result is a necessary condition for op-
timality stated in the form of a maximum principle. This provides a parallel of the
PMP for BCNs. We begin by defining an optimal control problem for BCNs. Consider a BCN in the algebraic state–space representation (4). Fix some (arbitrary) initial condition $x(0) = x_0 \in \{e_{2^n}^1, \dots, e_{2^n}^{2^n}\}$.
4.1 Optimal control problem
Fix a final time $N > 0$. Let $\mathcal{U}$ denote the set of admissible controls, i.e. the set of all the sequences $\{u(0), \dots, u(N-1)\}$, with $u(i) \in \{e_{2^m}^1, \dots, e_{2^m}^{2^m}\}$. For a control $u \in \mathcal{U}$, let $x(k; u)$ denote the solution of (4), with $x(0) = x_0$, at time $k$. Fix a vector $r \in \mathbb{R}^{2^n}$, and consider the cost–functional
$$J(u) = r^T x(N; u). \qquad (6)$$
We now pose a Mayer–type optimal control problem.
Problem 1. Find a control u∗ ∈ U that maximizes J .
This problem clearly admits a solution, as $\mathcal{U}$ is a finite set. We refer to a control that maximizes $J$ as an optimal control. In principle, Problem 1 may be solved numerically by simply calculating $x(N; u)$ for every $u \in \mathcal{U}$. However, this is clearly not practical for large values of $N$, as the number of admissible controls is $2^{mN}$.
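For small $N$ and $m$, the exhaustive approach is nevertheless easy to implement: since $L \ltimes e_{2^m}^{j} \ltimes x$ simply applies the $j$th $2^n$-column block of $L$ to $x$, enumerating all $2^{mN}$ control sequences takes a few lines. The sketch below (function name `brute_force` is ours) illustrates this on the toy single-input system $x(k+1) = x(k) \wedge u(k)$ started from $x(0) = \text{True}$:

```python
import numpy as np
from itertools import product

def brute_force(L, x0, r, N):
    """Exhaustively maximize J(u) = r^T x(N; u) over all control sequences.
    Choosing u(k) = e_{2^m}^{j+1} selects the j-th 2^n-column block of L."""
    nN = len(x0)                  # 2^n
    M = L.shape[1] // nN          # 2^m
    best_J, best_u = -np.inf, None
    for seq in product(range(M), repeat=N):
        x = x0
        for j in seq:
            x = L[:, j * nN:(j + 1) * nN] @ x
        J = float(r @ x)
        if J > best_J:
            best_J, best_u = J, seq
    return best_J, best_u

L = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 1.0]])        # x(k+1) = x(k) ∧ u(k)
J, u = brute_force(L, np.array([1.0, 0.0]), np.array([1.0, 0.0]), N=3)
print(J, u)   # 1.0 (0, 0, 0)  -- keep u(k) = e_2^1 (True) at every step
```

The exponential loop over `product(range(M), repeat=N)` is exactly the source of the impracticality noted above.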
Example 6. Suppose that $n = 3$, so that
$$x(N) = x_1(N) \ltimes x_2(N) \ltimes x_3(N).$$
Denote $x_1(N) = \begin{bmatrix} v & \bar{v} \end{bmatrix}^T$, $x_2(N) = \begin{bmatrix} w & \bar{w} \end{bmatrix}^T$, and $x_3(N) = \begin{bmatrix} q & \bar{q} \end{bmatrix}^T$. Then
$$x(N) = \begin{bmatrix} vwq & vw\bar{q} & v\bar{w}q & v\bar{w}\bar{q} & \bar{v}wq & \bar{v}w\bar{q} & \bar{v}\bar{w}q & \bar{v}\bar{w}\bar{q} \end{bmatrix}^T.$$
Suppose that we take $r = \begin{bmatrix} 1 & 1 & 0 & 0 & \dots & 0 \end{bmatrix}^T$. Then $r^T x(N) = vwq + vw\bar{q} = vw$. Thus, maximizing (6) corresponds to trying to find a control $u$ steering the BCN to $x_1(N) = x_2(N) = e_2^1$, if it exists.
Remark 6. Recall that x(N) consists of all the minterms of the Boolean state vari-
ables at time N . Hence any Boolean function f of the state at time N may be
represented in the form (6), i.e. as $f = r_f^T x(N; u)$, where $r_f$ is a binary vector. In
this particular case, J(u) can attain only two values, namely, zero and one. This
yields a reachability problem that is quite relevant for BCNs that model biological
networks, as here states can usually be divided into desirable and non–desirable
states. For example, in a model of cell differentiation a non–desirable state corre-
sponds to uncontrolled cell proliferation (see, e.g. [27, 28]). For a different approach
for analyzing reachability in BCNs, see [30].
We are interested in developing an analytical characterization of optimal con-
trols. By iterating (4) we find that for any two integers k ≥ j ≥ 0,
x(k;u) = C(k, j;u) ⋉ x(j;u), (7)
where
C(k, j;u) = L ⋉ u(k − 1) ⋉ L ⋉ u(k − 2) ⋉ · · · ⋉ L ⋉ u(j), (8)
with $C(k, k; u) = I_{2^n}$. We refer to the $2^n \times 2^n$ matrix $C(k, j; u)$ as the transition matrix from time $j$ to time $k$ corresponding to the control $u$. Note that (8) implies
that for any k ≥ l ≥ j,
C(k, j;u) = C(k, l;u) ⋉ C(l, j;u).
We can now state our main result.
Theorem 3. Consider the BCN (4). Suppose that $u^* = \{u^*(0), \dots, u^*(N-1)\} \in \mathcal{U}$ is an optimal control for Problem 1, and let $x^*$ denote the corresponding trajectory of (4). Let the adjoint $\lambda : \{1, \dots, N\} \to \mathbb{R}^{2^n}$ be the solution of
$$\lambda(k) = (L \ltimes u^*(k))^T \ltimes \lambda(k+1), \qquad \lambda(N) = r, \qquad (9)$$
and define $2^m$ switching functions $\alpha_i : \{0, 1, \dots, N-1\} \to \mathbb{R}$, $i = 1, \dots, 2^m$, by
$$\alpha_i(s) = \lambda^T(s+1) \ltimes L \ltimes e_{2^m}^i \ltimes x^*(s). \qquad (10)$$
For any time $s$, if for some index $i$
$$\alpha_i(s) > \alpha_j(s) \quad \text{for all } j \neq i,$$
then
$$u^*(s) = e_{2^m}^i. \qquad (11)$$
Theorem 3 provides a necessary condition for optimality in terms of the switching
functions αi. Note that this is somewhat similar to the PMP for discrete–time
dynamical systems (see, e.g., [46, Ch. 8], [47]).
Remark 7. It is instructive to verify that $\alpha_i(\cdot)$ is indeed a scalar function. Since the dimensions of $\lambda^T(\cdot)$ are $1 \times 2^n$ and those of $L$ are $2^n \times 2^{n+m}$ (recall that we consider a BCN with $m$ inputs), it follows from Remark 2 that
$$\lambda^T(s+1) \ltimes L = \lambda^T(s+1) L \in \mathbb{R}^{1 \times 2^{n+m}}.$$
Since the dimensions of $x^*(\cdot)$ are $2^n \times 1$, Remark 1 implies that
$$e_{2^m}^i \ltimes x^*(s) \in \mathbb{R}^{2^{n+m} \times 1}.$$
Thus,
$$\alpha_i(s) = \lambda^T(s+1) \ltimes L \ltimes e_{2^m}^i \ltimes x^*(s) = \lambda^T(s+1) L \left( e_{2^m}^i \ltimes x^*(s) \right)$$
is indeed a scalar.
It is possible to state our main result in a “Hamiltonian form”. To do so, define $H : \mathbb{R}^{2^n} \times \mathbb{R}^{2^n} \times \mathbb{R}^{2^m} \to \mathbb{R}$ by $H(x, \lambda, u) = \lambda^T L \ltimes u \ltimes x$. By Remark 1, $(\lambda^T L \ltimes u) \in \mathbb{R}^{1 \times 2^n}$, so we may also write $H$ as
$$H(x, \lambda, u) = (\lambda^T L \ltimes u) x = \lambda^T (L \ltimes u) x.$$
Then (9) can be written as
$$\lambda(k) = \frac{\partial}{\partial x} H(x^*(k), \lambda(k+1), u^*(k)), \qquad (12)$$
the system dynamics (4) as
$$x(k+1) = \frac{\partial}{\partial \lambda} H(x^*(k), \lambda(k+1), u^*(k)), \qquad (13)$$
and (11) may be written as
$$u^*(s) = \arg\max_{v \in \{e_{2^m}^1, \dots, e_{2^m}^{2^m}\}} H(x^*(s), \lambda(s+1), v). \qquad (14)$$
Furthermore, the function
$$H^*(s) = H(x^*(s), \lambda(s+1), u^*(s))$$
is constant. Indeed,
$$H^*(s) = \lambda^T(s+1)(L \ltimes u^*(s)) x^*(s) = \lambda^T(s) x^*(s) = \lambda^T(s)(L \ltimes u^*(s-1)) x^*(s-1) = H^*(s-1). \qquad (15)$$
Specializing Theorem 3 to the case $m = 1$, i.e., to BCNs with a single input, yields the following result.

Corollary 1. [42] Consider the BCN (4) with $m = 1$. Suppose that $u_1^* = \{u_1^*(0), \dots, u_1^*(N-1)\} \in \mathcal{U}$ is an optimal control for Problem 1. Let the adjoint $\lambda : \{1, \dots, N\} \to \mathbb{R}^{2^n}$ be the solution of
$$\lambda(k) = (L \ltimes u_1^*(k))^T \ltimes \lambda(k+1), \qquad \lambda(N) = r, \qquad (16)$$
and let $\beta(s) = \lambda^T(s+1) \ltimes L \ltimes \begin{bmatrix} 1 \\ -1 \end{bmatrix} \ltimes x^*(s)$. Then
$$u_1^*(s) = \begin{cases} e_2^1, & \text{if } \beta(s) > 0, \\ e_2^2, & \text{if } \beta(s) < 0. \end{cases} \qquad (17)$$

Proof. In this case, the switching functions are
$$\alpha_1(s) = \lambda^T(s+1) \ltimes L \ltimes e_2^1 \ltimes x^*(s), \qquad \alpha_2(s) = \lambda^T(s+1) \ltimes L \ltimes e_2^2 \ltimes x^*(s),$$
so $\alpha_1(s) - \alpha_2(s) = \beta(s)$. Hence, the condition $\alpha_1(s) > \alpha_2(s)$ ($\alpha_2(s) > \alpha_1(s)$) is equivalent to $\beta(s) > 0$ ($\beta(s) < 0$). ⊓⊔
The next simple example demonstrates an application of Corollary 1.
Example 7. Consider the single–input BCN
$$x(k+1) = x(k) \wedge u(k), \qquad x(0) = \text{True}. \qquad (18)$$
Here $n = m = 1$, and the algebraic state–space form is
$$x(k+1) = L \ltimes u(k) \ltimes x(k), \qquad x(0) = e_2^1,$$
with $L = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix}$. Fix some final time $N > 0$ and consider Problem 1 for $r = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Letting $x^*(N) = \begin{bmatrix} w & \bar{w} \end{bmatrix}^T$, this implies that we are trying to maximize $w$, i.e. to find a control $u^*$ steering the system to $x^*(N) = e_2^1$, if it exists.
In this case,
$$\beta(N-1) = \lambda^T(N) \ltimes L \ltimes \begin{bmatrix} 1 \\ -1 \end{bmatrix} \ltimes x^*(N-1) = r^T \ltimes L \ltimes \begin{bmatrix} 1 \\ -1 \end{bmatrix} \ltimes x^*(N-1) = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix} \ltimes \begin{bmatrix} 1 \\ -1 \end{bmatrix} \ltimes x^*(N-1) = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} \ltimes \begin{bmatrix} 1 \\ -1 \end{bmatrix} \ltimes x^*(N-1) = \begin{bmatrix} 1 & 0 \end{bmatrix} \ltimes x^*(N-1). \qquad (19)$$
We consider two cases.

Case 1. Suppose that $x^*(N-1) = e_2^2$. Then (4) yields
$$x^*(N) = L \ltimes u^*(N-1) \ltimes e_2^2 = L \ltimes \begin{bmatrix} v \\ \bar{v} \end{bmatrix} \ltimes e_2^2 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 0 & v & 0 & \bar{v} \end{bmatrix}^T = e_2^2,$$
so $r^T x^*(N) = 0$.

Case 2. Suppose that $x^*(N-1) = e_2^1$. Then (19) yields $\beta(N-1) = 1$, so by the MP, $u^*(N-1) = e_2^1$ and, therefore, $x^*(N) = e_2^1$. Using (16) yields
$$\lambda(N-1) = (L \ltimes u^*(N-1))^T \ltimes \lambda(N) = (L \ltimes e_2^1)^T \ltimes e_2^1 = e_2^1.$$
Hence,
$$\beta(N-2) = \lambda^T(N-1) \ltimes L \ltimes \begin{bmatrix} 1 \\ -1 \end{bmatrix} \ltimes x^*(N-2) = \begin{bmatrix} 1 & 0 \end{bmatrix} \ltimes x^*(N-2).$$
Comparing this with (19), we conclude that there are two possibilities. Either $x^*(N) = e_2^2$ (and then any control is optimal) or $x^*(N) = e_2^1$ and then the (unique) optimal control is $u^*(k) = e_2^1$ for all $k \in \{0, 1, \dots, N-1\}$. Thus, in this example the MP provides a complete characterization of the optimal control.
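The computations in Example 7 are easy to reproduce numerically. The sketch below (the helper name `stp` for the semi-tensor product is ours) runs the adjoint recursion (16) backward from $\lambda(N) = r$ and evaluates $\beta(s)$ along the candidate trajectory $u^*(k) = e_2^1$, for $N = 3$:

```python
import numpy as np

def stp(A, B):  # semi-tensor product A ⋉ B
    n, p = A.shape[1], B.shape[0]
    a = int(np.lcm(n, p))
    return np.kron(A, np.eye(a // n)) @ np.kron(B, np.eye(a // p))

L = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 1.0]])        # x(k+1) = x(k) ∧ u(k), Example 7
e1, e2 = np.array([[1.0], [0.0]]), np.array([[0.0], [1.0]])
r, N = e1, 3

# candidate optimal control u*(k) = e_2^1 and its trajectory from x(0) = True
us = [e1] * N
xs = [e1]
for k in range(N):
    xs.append(stp(stp(L, us[k]), xs[k]))

# adjoint (16): λ(N) = r, λ(k) = (L ⋉ u*(k))^T ⋉ λ(k+1)
lam = [None] * (N + 1)
lam[N] = r
for k in range(N - 1, -1, -1):
    lam[k] = stp(L, us[k]).T @ lam[k + 1]

# switching function β(s) = λ^T(s+1) ⋉ L ⋉ [1, -1]^T ⋉ x*(s)
d = np.array([[1.0], [-1.0]])
beta = [stp(stp(stp(lam[s + 1].T, L), d), xs[s]).item() for s in range(N)]
print(beta)   # [1.0, 1.0, 1.0] -> β(s) > 0, so u*(s) = e_2^1 at every step
```

The result agrees with the analysis above: $\beta(s) > 0$ at every step, so by (17) the unique optimal control keeps $u^*(k) = e_2^1$ throughout.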
The next section is devoted to the proof of Theorem 3.
5 Proof of main result
Fix an arbitrary time $p \in \{0, \dots, N-1\}$ and an arbitrary vector $v \in \{e_{2^m}^1, \dots, e_{2^m}^{2^m}\}$. Define a new control $u \in \mathcal{U}$ by a perturbation of $u^*$:
$$u(j) = \begin{cases} v, & \text{if } j = p, \\ u^*(j), & \text{otherwise}. \end{cases} \qquad (20)$$
In other words, $u$ is identical to the optimal control $u^*$ except, perhaps, at time $p$. This is a parallel of the needle variation used in the proof of the PMP.
Let $x^*(\cdot) = x(\cdot; u^*)$ denote the solution corresponding to $u^*$. It follows from the definition of the transition matrix that
$$x^*(N) = C(N, p+1; u^*) \ltimes C(p+1, p; u^*) \ltimes C(p, 0; u^*) \ltimes x_0 = C(N, p+1; u^*) \ltimes L \ltimes u^*(p) \ltimes x^*(p).$$
Similarly, for $x(\cdot) = x(\cdot; u)$, we have
$$x(N) = C(N, p+1; u) \ltimes L \ltimes u(p) \ltimes x(p; u) = C(N, p+1; u^*) \ltimes L \ltimes v \ltimes x^*(p),$$
where the second equation follows from the definition of $u$ in (20). Thus,
$$x^*(N) - x(N) = C(N, p+1; u^*) \ltimes L \ltimes (u^*(p) - v) \ltimes x^*(p),$$
and
$$J(u^*) - J(u) = r^T (x^*(N) - x(N)) = r^T \left( C(N, p+1; u^*) \ltimes L \ltimes (u^*(p) - v) \ltimes x^*(p) \right). \qquad (21)$$
To simplify this expression, let $w^T(p+1) = r^T C(N, p+1; u^*)$. Then $w^T(N) = r^T C(N, N; u^*) = r^T$, and
$$w(p) = C^T(N, p; u^*) r = \left( C(N, p+1; u^*) \ltimes C(p+1, p; u^*) \right)^T r = \left( C(N, p+1; u^*) C(p+1, p; u^*) \right)^T r = C^T(p+1, p; u^*) C^T(N, p+1; u^*) r = C^T(p+1, p; u^*) \ltimes w(p+1) = (L \ltimes u^*(p))^T \ltimes w(p+1).$$
Comparing this with (9) shows that $w(p) = \lambda(p)$ for all $p$, and thus (21) yields
$$J(u^*) - J(u) = \lambda^T(p+1) \ltimes L \ltimes (u^*(p) - v) \ltimes x^*(p). \qquad (22)$$
Now suppose that there exists an index $i$ such that $\alpha_i(p) > \alpha_j(p)$ for all $j \neq i$. We need to show that $u^*(p) = e_{2^m}^i$. Seeking a contradiction, assume that $u^*(p) = e_{2^m}^j$ for some $j \neq i$. Then for $v = e_{2^m}^i$ the right–hand side of (22) is
$$\lambda^T(p+1) \ltimes L \ltimes (e_{2^m}^j - e_{2^m}^i) \ltimes x^*(p) = \alpha_j(p) - \alpha_i(p) < 0,$$
so $J(u^*) - J(u) < 0$. But this contradicts the optimality of $u^*$. Thus, $u^*(p) = e_{2^m}^i$. Since $p$ is arbitrary, this completes the proof of Theorem 3. ⊓⊔
Summarizing, if there exists an index $i$ such that $\alpha_i(s) > \alpha_j(s)$ for all $j \neq i$, then the MP uniquely determines $u^*(s)$. In the next section, we consider the complementary situation, referred to as the singular case.
6 The Singular Case
The next result shows that the singular case is actually easy to handle.
Theorem 4. Suppose that $u^*$ is an optimal control. Assume that for some time $s$ there exists a subset of indexes $I = \{i_1, \dots, i_l\}$ such that $\alpha_{i_1}(s) = \dots = \alpha_{i_l}(s)$ and $\alpha_{i_1}(s) > \alpha_j(s)$ for all $j \notin I$. Then $u^*(s) \in \{e_{2^m}^{i_1}, \dots, e_{2^m}^{i_l}\}$. Furthermore, any control of the form
$$w(j) = \begin{cases} z, & \text{if } j = s, \\ u^*(j), & \text{otherwise}, \end{cases}$$
with $z \in \{e_{2^m}^{i_1}, \dots, e_{2^m}^{i_l}\}$, is also an optimal control.

Proof. For simplicity, assume that $I = \{i_1, i_2\}$ (the proof for the general case is similar). In this case, the conditions in the theorem are
$$\alpha_{i_1}(s) > \alpha_j(s) \quad \text{for all } j \notin \{i_1, i_2\}, \qquad (23)$$
and
$$0 = \alpha_{i_1}(s) - \alpha_{i_2}(s) = \lambda^T(s+1) \ltimes L \ltimes (e_{2^m}^{i_1} - e_{2^m}^{i_2}) \ltimes x^*(s).$$
Let $u^* \in \mathcal{U}$ be an optimal control and denote $z = u^*(s)$. Arguing as in the proof of Theorem 3, it follows from (23) that either $z = e_{2^m}^{i_1}$ or $z = e_{2^m}^{i_2}$. Assume that $z = e_{2^m}^{i_2}$. Define a new control $u$ by
$$u(j) = \begin{cases} e_{2^m}^{i_1}, & \text{if } j = s, \\ u^*(j), & \text{otherwise}. \end{cases} \qquad (24)$$
Then (22) yields
$$J(u^*) - J(u) = \lambda^T(s+1) \ltimes L \ltimes (z - e_{2^m}^{i_1}) \ltimes x^*(s) = \lambda^T(s+1) \ltimes L \ltimes (e_{2^m}^{i_2} - e_{2^m}^{i_1}) \ltimes x^*(s) = \alpha_{i_2}(s) - \alpha_{i_1}(s) = 0.$$
In other words, the new control $u$ is also an optimal control. Since $u(s) = e_{2^m}^{i_1}$, this completes the proof of Theorem 4. ⊓⊔
The next example demonstrates an application of Theorems 3 and 4.
Example 8. Consider the two–state, two–input BCN
$$x_1(k+1) = [x_1(k) \wedge x_2(k)] \vee [\bar{u}_1(k) \wedge u_2(k) \wedge x_2(k)] \vee [\bar{u}_2(k) \wedge x_1(k)],$$
$$x_2(k+1) = [\bar{u}_1(k) \wedge \bar{u}_2(k) \wedge \bar{x}_1(k) \wedge \bar{x}_2(k)] \vee [\bar{u}_1(k) \wedge u_2(k) \wedge x_1(k)] \vee [u_1(k) \wedge (x_1(k) \oplus x_2(k))]. \qquad (25)$$
Here $n = m = 2$. Suppose that the initial condition is $x_1(0) = x_2(0) = \text{False}$, and consider Problem 1 with $N = 3$ and $r = e_4^1$. In other words, the problem is to determine a control that maximizes $J(u) = (e_4^1)^T x(3)$. Intuitively, this amounts to finding a control steering the state to $x_1(3) = x_2(3) = \text{True}$, if it exists.
The algebraic state–space form is given by (4) with $x(0) = e_4^4$, and
$$L = \begin{bmatrix}
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0
\end{bmatrix}.$$
To analyze this problem using the MP, consider the functions
$$\alpha_i(2) = \lambda^T(3) \ltimes L \ltimes e_4^i \ltimes x^*(2) = r^T \ltimes L \ltimes e_4^i \ltimes x^*(2) = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} \ltimes L \ltimes e_4^i \ltimes x^*(2) = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \ltimes e_4^i \ltimes x^*(2). \qquad (26)$$
Using the definition of the semi–tensor product yields
$$\alpha_1(2) = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \ltimes e_4^1 \ltimes x^*(2) = \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix} \ltimes x^*(2) = 0,$$
and similarly
$$\alpha_2(2) = \begin{bmatrix} 0 & 1 & 0 & 0 \end{bmatrix} \ltimes x^*(2), \qquad \alpha_3(2) = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} \ltimes x^*(2), \qquad \alpha_4(2) = 0.$$
We consider several cases.

Case 1. Assume that $x^*(2) = e_4^2$. Then $\alpha_2(2) = 1$, and $\alpha_i(2) = 0$ for any $i \neq 2$. Theorem 3 implies that $u^*(2) = e_4^2$. Using (9) yields
$$\lambda(2) = (L \ltimes u^*(2))^T \lambda(3) = (L \ltimes e_4^2)^T r = e_4^2, \qquad (27)$$
so
$$\alpha_i(1) = \lambda^T(2) \ltimes L \ltimes e_4^i \ltimes x^*(1) = (e_4^2)^T \ltimes L \ltimes e_4^i \ltimes x^*(1),$$
and this yields
$$\alpha_1(1) = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} \ltimes x^*(1), \qquad \alpha_2(1) = \begin{bmatrix} 1 & 0 & 0 & 0 \end{bmatrix} \ltimes x^*(1), \qquad \alpha_3(1) = \begin{bmatrix} 0 & 0 & 1 & 0 \end{bmatrix} \ltimes x^*(1), \qquad \alpha_4(1) = \begin{bmatrix} 1 & 1 & 0 & 0 \end{bmatrix} \ltimes x^*(1). \qquad (28)$$
We consider two subcases.

Case 1.1. Suppose that $x^*(1) = e_4^3$. Then $\alpha_3(1) > \alpha_j(1)$ for any $j \neq 3$, so Theorem 3 implies that $u^*(1) = e_4^3$. This yields
$$\lambda(1) = (L \ltimes u^*(1))^T \lambda(2) = (L \ltimes e_4^3)^T e_4^2 = e_4^3,$$
so
$$\alpha_i(0) = \lambda^T(1) \ltimes L \ltimes e_4^i \ltimes x^*(0) = (e_4^3)^T \ltimes L \ltimes e_4^i \ltimes e_4^4.$$
A calculation yields $\alpha_1(0) = \alpha_2(0) = \alpha_3(0) = 0$, and $\alpha_4(0) = 1$. By Theorem 3, $u^*(0) = e_4^4$. Summarizing, in this case we conclude that $\{u^*(0), u^*(1), u^*(2)\} = \{e_4^4, e_4^3, e_4^2\}$ is the only control that satisfies the necessary condition for optimality. A calculation shows that the corresponding trajectory is $x^*(0) = e_4^4$, $x^*(1) = e_4^3$, $x^*(2) = e_4^2$, and $x^*(3) = e_4^1$, so this control indeed steers the system to the desired location. If we
are interested in finding one control that steers the system to the desired location,
then we may stop the calculations at this point. Otherwise, we continue to con-
sider the remaining cases. It turns out that for this particular example, Theorems 3
and 4 provide enough information to explicitly determine the optimal controls. To
demonstrate this, we consider one more subcase.
Case 1.2. Suppose that
$$x^*(1) = e_4^1. \qquad (29)$$
Then (28) yields $\alpha_1(1) = \alpha_2(1) = \alpha_4(1) = 1$, and $\alpha_3(1) = 0$. Theorem 4 implies that if $u^*$ is an optimal control, we may assume that $u^*(1) = e_4^1$. Then
$$\lambda(1) = (L \ltimes u^*(1))^T \lambda(2) = (L \ltimes e_4^1)^T e_4^2 = e_4^1,$$
so
$$\alpha_i(0) = \lambda^T(1) \ltimes L \ltimes e_4^i \ltimes x^*(0) = (e_4^1)^T \ltimes L \ltimes e_4^i \ltimes e_4^4.$$
A calculation yields $\alpha_i(0) = 0$ for all $i$. Theorem 4 now implies that there exists an optimal control satisfying $u^*(0) = e_4^1$. Then
$$x^*(1) = L \ltimes u^*(0) \ltimes x^*(0) = L \ltimes e_4^1 \ltimes e_4^4 = e_4^4,$$
but this contradicts (29), so we conclude that Case 1.2 is not possible.
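The control singled out in Case 1.1 can be checked directly against the transition matrix of Example 8: applying $u = e_4^4, e_4^3, e_4^2$ from $x(0) = e_4^4$ should land the state at $e_4^1$. A minimal sketch (helper names `e` and `step` are ours), exploiting the fact that $L \ltimes e_4^{j} \ltimes x$ is the $j$th column block of $L$ applied to $x$:

```python
import numpy as np

L = np.array([
    [0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0]], dtype=float)

def e(i, n):
    v = np.zeros(n); v[i - 1] = 1.0; return v

def step(L, uj, x):        # x(k+1) = L ⋉ e_4^{uj} ⋉ x(k): pick the uj-th column block
    return L[:, (uj - 1) * 4 : uj * 4] @ x

x = e(4, 4)                # x(0): x1 = x2 = False
for uj in (4, 3, 2):       # the control {e_4^4, e_4^3, e_4^2} singled out by the MP
    x = step(L, uj, x)
print(x)                   # [1. 0. 0. 0.] = e_4^1, i.e. x1(3) = x2(3) = True, J = 1
```

The final state is $e_4^1$, so $J = (e_4^1)^T x(3) = 1$, confirming that the control found in Case 1.1 attains the maximal cost.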
7 Conclusions
We considered a Mayer–type optimal control problem for BCNs. Using the alge-
braic state–space formulation developed by Daizhan Cheng, we derived a necessary
condition for optimality in the form of a maximum principle. We also analyzed the
singular case where the MP itself does not provide direct information on the optimal
control.
Several synthetic examples were used to demonstrate the application of the MP.
A natural direction for further research is the analysis of optimal controls in BCNs
that model real biological systems using the MP derived here. The main difficulty is
that an MP provides implicit information on the optimal control, as the necessary
condition for optimality is stated in terms of the switching functions that depend on
the (unknown) optimal control. Nevertheless, there are many important cases where
the PMP, combined with geometric tools, provides a complete characterization of
optimal controls (see, e.g., [48, 49, 50, 51, 52]). It might be interesting to search
for such special cases in the context of BCNs. Control problems for BCNs are in
general NP–hard [53], yet this does not preclude the existence of important special
cases that can be solved analytically.
We believe that further developments in optimal control theory for BCNs may
lead to important insights on the design of suitable controls for various real–world
systems modeled using BCNs.
References
1. J. L. Schiff, Cellular Automata: A Discrete View of the World. Wiley-Interscience,
2008.
2. M. H. Hassoun, Fundamentals of Artificial Neural Networks. MIT Press, 1995.
3. S. A. Kauffman, “Metabolic stability and epigenesis in randomly constructed genetic
nets,” J. Theoretical Biology, vol. 22, pp. 437–467, 1969.
4. R. Albert and A.-L. Barabási, “Dynamics of complex systems: scaling laws for the
period of Boolean networks,” Phys. Rev. Lett., vol. 84, pp. 5660–5663, 2000.
5. M. Aldana, “Boolean dynamics of networks with scale–free topology,” Physica D, vol.
185, pp. 45–66, 2003.
6. B. Derrida and Y. Pomeau, “Random networks of automata: a simple annealed ap-
proximation,” Europhys. Lett., vol. 1, pp. 45–49, 1986.
7. B. Drossel, T. Mihaljev, and F. Greil, “Number and length of attractors in a critical
Kauffman model with connectivity one,” Phys. Rev. Lett., vol. 94, 2005, 088701.
8. S. A. Kauffman, Origins of Order: Self–Organization and Selection in Evolution. Ox-
ford University Press, 1993.
9. B. Luque and R. V. Sole, “Lyapunov exponents in random Boolean networks,” Physica
A: Statistical Mechanics and its Applications, vol. 284, pp. 33–45, 2000.
10. B. Samuelsson and C. Troein, “Superpolynomial growth in the number of attractors
in Kauffman networks,” Phys. Rev. Lett., vol. 90, 2003, 098701.
11. S. Huang, “Regulation of cellular states in mammalian cells from a genomewide view,”
in Gene Regulation and Metabolism, J. Collado-Vides and R. Hofestadt, Eds. MIT
Press, 2002, pp. 181–220.
12. M. Ptashne, A Genetic Switch, 3rd ed. Cold Spring Harbor, 2004.
13. D. Laschov and M. Margaliot, “Mathematical modeling of the λ switch: a fuzzy logic
approach,” J. Theoretical Biology, vol. 260, pp. 475–489, 2009.
14. F. Li, T. Long, Y. Lu, Q. Ouyang, and C. Tang, “The yeast cell–cycle network is
robustly designed,” Proc. Natl. Acad. Sci. U.S.A., vol. 101, pp. 4781–4786, 2004.
15. S. Kauffman, C. Peterson, B. Samuelsson, and C. Troein, “Random Boolean network
models and the yeast transcriptional network,” Proc. Natl. Acad. Sci. U.S.A., vol. 100,
pp. 14796–14799, 2003.
16. R. Albert and H. G. Othmer, “The topology of the regulatory interactions predicts
the expression pattern of the segment polarity genes in Drosophila melanogaster,” J.
Theoretical Biology, vol. 223, pp. 1–18, 2003.
17. M. Chaves, R. Albert, and E. D. Sontag, “Robustness and fragility of Boolean models
for genetic regulatory networks,” J. Theoretical Biology, vol. 235, pp. 431–449, 2005.
18. C. Espinosa-Soto, P. Padilla-Longoria, and E. R. Alvarez-Buylla, “A gene regulatory
network model for cell–fate determination during Arabidopsis thaliana flower devel-
opment that is robust and recovers experimental gene expression profiles,” Plant Cell,
vol. 16, pp. 2923–2939, 2004.
19. A. Chaos, M. Aldana, C. Espinosa-Soto, B. G. P. de Leon, A. G. Arroyo, and E. R.
Alvarez-Buylla, “From genes to flower patterns and evolution: dynamic models of gene
regulatory networks,” J. Plant Growth Regul., vol. 25, pp. 278–289, 2006.
20. S. Li, S. M. Assmann, and R. Albert, “Predicting essential components of signal trans-
duction networks: a dynamic model of guard cell abscisic acid signaling,” PLoS Biol.,
vol. 4, pp. 1732–1748, 2006.
21. S. Gupta, S. S. Bisht, R. Kukreti, S. Jain, and S. K. Brahmachari, “Boolean network
analysis of a neurotransmitter signaling pathway,” J. Theoretical Biology, vol. 244, pp.
463–469, 2007.
22. Z. Szallasi and S. Liang, “Modeling the normal and neoplastic cell cycle with “realistic
Boolean genetic networks”: their application for understanding carcinogenesis and
assessing therapeutic strategies,” Pac. Symp. Biocomput., vol. 3, pp. 66–76, 1998.
23. S. Kauffman, “Differentiation of malignant to benign cells,” J. Theoretical Biology,
vol. 31, pp. 429–451, 1971.
24. H. Bolouri, Computational Modelling of Gene Regulatory Networks–A Primer. Im-
perial College Press, 2008.
25. I. Shmulevich, E. R. Dougherty, S. Kim, and W. Zhang, “Probabilistic Boolean net-
works: a rule-based uncertainty model for gene regulatory networks,” Bioinformatics,
vol. 18, pp. 261–274, 2002.
26. I. Shmulevich, E. R. Dougherty, and W. Zhang, “From Boolean to probabilistic
Boolean networks as models of genetic regulatory networks,” Proc. of the IEEE, vol. 90,
pp. 1778–1792, 2002.
27. A. Datta, R. Pal, A. Choudhary, and E. R. Dougherty, “Control approaches for prob-
abilistic gene regulatory networks,” IEEE Signal Processing Magazine, vol. 24, pp.
54–63, 2007.
28. Q. Liu, X. Guo, and T. Zhou, “Optimal control for probabilistic Boolean networks,”
IET Systems Biology, vol. 4, pp. 99–107, 2010.
29. D. Cheng, “Disturbance decoupling of Boolean control networks,” IEEE Trans. Auto-
matic Control, vol. 56, pp. 2–10, 2011.
30. D. Cheng and H. Qi, “Controllability and observability of Boolean control networks,”
Automatica, vol. 45, pp. 1659–1667, 2009.
31. D. Cheng, Z. Li, and H. Qi, “Realization of Boolean control networks,” Automatica,
vol. 46, pp. 62–69, 2010.
32. D. Cheng and H. Qi, “A linear representation of dynamics of Boolean networks,” IEEE
Trans. Automatic Control, vol. 55, pp. 2251–2258, 2010.
33. ——, “State-space analysis of Boolean networks,” IEEE Trans. Neural Networks,
vol. 21, pp. 584–594, 2010.
34. D. Cheng, “Input-state approach to Boolean networks,” IEEE Trans. Neural Networks,
vol. 20, pp. 512–521, 2009.
35. A. A. Agrachev and Y. L. Sachkov, Control Theory From The Geometric Viewpoint,
ser. Encyclopedia of Mathematical Sciences. Springer-Verlag, 2004, vol. 87.
36. H. J. Sussmann and J. C. Willems, “300 years of optimal control: from the brachys-
tochrone to the maximum principle,” IEEE Control Systems Magazine, vol. 17, pp.
32–44, 1997.
37. B. Bonnard and M. Chyba, Singular Trajectories and their Role in Control Theory.
Springer, 2003.
38. M. Margaliot, “Stability analysis of switched systems using variational principles: An
introduction,” Automatica, vol. 42, pp. 2059–2077, 2006.
39. M. Margaliot and M. S. Branicky, “Nice reachability for planar bilinear control systems
with applications to planar linear switched systems,” IEEE Trans. Automatic Control,
vol. 54, pp. 1430–1435, 2009.
40. Y. Sharon and M. Margaliot, “Third-order nilpotency, finite switchings and asymptotic
stability,” J. Diff. Eqns., vol. 233, pp. 136–150, 2007.
41. M. Margaliot and D. Liberzon, “Lie–algebraic stability conditions for nonlinear
switched systems and differential inclusions,” Systems Control Lett., vol. 55, pp. 8–16,
2006.
42. D. Laschov and M. Margaliot, “A maximum principle for single-input Boolean control
networks,” IEEE Trans. Automatic Control, vol. 56, pp. 913–917, 2011.
43. D. Cheng and Y. Dong, “Semi-tensor product of matrices and its some applications
to physics,” Methods Appl. Anal., vol. 10, pp. 565–588, 2003.
44. D. S. Bernstein, Matrix Mathematics. Princeton University Press, 2005.
45. G. Langholz, J. L. Mott, and A. Kandel, Foundations of Digital Logic Design. World
Scientific, 1998.
46. S. P. Sethi and G. L. Thompson, Optimal Control Theory: Applications to Management
Science and Economics. Kluwer Academic Publishers, 2000.
47. T. Monovich and M. Margaliot, “Analysis of discrete–time linear switched systems: A
variational approach,” SIAM J. Control Optim., vol. 49, pp. 808–829, 2011.
48. H. J. Sussmann, “The structure of time-optimal trajectories for single-input systems
in the plane: The general real analytic case,” SIAM J. Control Optim., vol. 25, pp.
868–904, 1987.
49. M. Athans and E. Tse, “A direct derivation of the optimal linear filter using the
maximum principle,” IEEE Trans. Automatic Control, vol. 12, pp. 690–698, 1967.
50. A. E. Bryson and Y.-C. Ho, Applied Optimal Control: Optimization, Estimation and
Control. Taylor & Francis, 1975.
51. U. Ledzewicz and H. Schattler, “Antiangiogenic therapy in cancer treatment as an
optimal control problem,” SIAM J. Control Optim., vol. 46, pp. 1052–1079, 2007.
52. H. J. Sussmann and G. Tang, “Shortest paths for the Reeds-Shepp car: A worked
out example of the use of geometric techniques in nonlinear optimal control,” 1991.
[Online]. Available: http://www.math.rutgers.edu/~sussmann
53. T. Akutsu, M. Hayashida, W.-K. Ching, and M. K. Ng, “Control of Boolean networks:
Hardness results and algorithms for tree structured networks,” J. Theoretical Biology,
vol. 244, pp. 670–679, 2007.