Optimizing Information Flow in Small Genetic Networks. II. Feed-forward Interactions

University of Pennsylvania ScholarlyCommons, Department of Physics Papers, 4-6-2010

Aleksandra M. Walczak (Princeton University), Gašper Tkačik (University of Pennsylvania), William Bialek (Princeton University)

Suggested Citation: Walczak, A. M., G. Tkačik, and W. Bialek (2010). "Optimizing information flow in small genetic networks. II. Feed-forward interactions." Physical Review E 81, 041905. © The American Physical Society. http://dx.doi.org/10.1103/PhysRevE.81.041905

This paper is posted at ScholarlyCommons: https://repository.upenn.edu/physics_papers/26. For more information, please contact [email protected].




Optimizing information flow in small genetic networks. II. Feed-forward interactions

Aleksandra M. Walczak,1,* Gašper Tkačik,2,† and William Bialek1,‡

1Joseph Henry Laboratories of Physics, Lewis-Sigler Institute for Integrative Genomics, and Princeton Center for Theoretical Science, Princeton University, Princeton, New Jersey 08544, USA

2Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania 19104-6396, USA

(Received 31 December 2009; published 6 April 2010)

Central to the functioning of a living cell is its ability to control the readout or expression of information encoded in the genome. In many cases, a single transcription factor protein activates or represses the expression of many genes. As the concentration of the transcription factor varies, the target genes thus undergo correlated changes, and this redundancy limits the ability of the cell to transmit information about input signals. We explore how interactions among the target genes can reduce this redundancy and optimize information transmission. Our discussion builds on recent work [Tkačik et al., Phys. Rev. E 80, 031920 (2009)], and there are connections to much earlier work on the role of lateral inhibition in enhancing the efficiency of information transmission in neural circuits; for simplicity we consider here the case where the interactions have a feed forward structure, with no loops. Even with this limitation, the networks that optimize information transmission have a structure reminiscent of the networks found in real biological systems.

DOI: 10.1103/PhysRevE.81.041905    PACS number(s): 87.10.Vg, 87.15.R-, 87.18.Tt, 87.85.Ng

I. INTRODUCTION

The genomes of even the smallest bacteria encode the structure of hundreds of proteins; for complicated organisms like us, the number of different proteins reaches into the tens of thousands [1]. During the course of its life, each cell has to control how many copies of each protein molecule are synthesized [2]. These decisions about the expression of genetic information occur on many scales, from (more or less) irreversible decisions during differentiation into different tissue types down to continuous modulations of the number of enzyme molecules in response to varying metabolic needs and resources [3]. As with many biological processes, the control of gene expression depends critically on the transmission of information. Because the relevant information is represented inside the cell by signaling molecules at low concentrations, the irreducibly stochastic behavior of individual molecules sets a physical limit to information transmission [4-7]. A more precise discussion is motivated by the emergence of experiments which measure the noise in gene expression [8-15]. In this paper, building on our previous work [16], we explore how cells can structure their genetic control circuitry to optimize information transmission in the presence of these physical constraints, thus maximizing the control power that they can achieve while using only a limited number of molecules.

Although the problems of information transmission in genetic control are general, it is useful to have a concrete example in mind. In developing embryos, information about position, and hence ultimately about fate in the developed organism, is encoded by spatially varying concentrations of "morphogen" molecules [17,18]. These molecules often are transcription factors, and positional information then is transmitted to the expression levels of the target genes [18-20]. In this scheme, maximizing information transmission maximizes the richness of the spatial patterns which the embryo can construct.

In recent work we have considered the problem of maximizing information transmission from a single transcription factor to multiple target genes, in the case where the targets are noninteracting [16]. Already, this problem generates some structures that remind us of real genetic control networks. But, when a single transcription factor controls many genes independently, the signals carried by those genes necessarily are redundant [21]. To fully optimize the information which can be carried by a given number of molecules, there must be interactions among the target genes which reduce this redundancy. Our goal in this paper is to derive the form of these interactions. We emphasize that this is an ambitious project: we are trying to find the topology and parameters of a genetic network from first principles, constrained only by the limited concentration of all the relevant molecules. To simplify our task, we start here with the case in which the network of interactions has no closed loops, and return to the full problem in subsequent papers.

When we search possible networks for the ones that optimize information transmission, it is convenient to break this search into two parts. First, we enumerate network topologies, and then we search parameters within each topology. It is conventional to encode in the network topology the sign of the interactions, so that changing the 'shape of an arrow' from activation to repression is counted as a qualitatively different network (see Fig. 1). Even if we assume that each protein can act either as activator or repressor for all its targets [22], with one transcription factor controlling K interacting genes in a feed forward network, there are 2^K possible network topologies. For each of these possibilities we find a single well defined optimum for all the continuously adjustable parameters in the system, and these optimal parameter values are determined only by the number of available molecules. Thus, the structures of these networks really are derivable, quantitatively, from first principles. For related approaches to information flow in biochemical and genetic networks, see Refs. [24-28].

*[email protected]; †[email protected]; ‡[email protected]

Among the possible network topologies, there are some in which interaction strengths are driven to zero by the optimization. Among the nontrivial solutions, however, we find that the different topologies achieve very similar information capacities. Thus, it seems that there is not a single optimal network, but rather an exponentially large number of nearly degenerate local optima. All of these local optima are networks with competing interactions, so that a single target gene is activated and repressed by different inputs. Phenomenologically, if we view the expression level of each target gene as a function of the input transcription factor concentration, these competing interactions lead to nonmonotonic input/output relations. It is this nonmonotonicity that allows the collection of target genes to explore more fully the space of expression levels, thus enhancing the capacity to transmit information. Although we consider only networks without feedback, these nonmonotonic responses already are reminiscent of the patterns seen in real genetic networks.

II. SETTING UP THE PROBLEM

The optimization problem that we are addressing in this paper is a generalization of the problem discussed in our previous work [16]. To make this paper self-contained, however, we begin by reviewing the general formulation of the problem. We are interested in situations where a single transcription factor controls the expression level of several genes, labeled i = 1, 2, ..., K. The expression levels of these genes, g_i, carry information about the concentration c of the input transcription factor. It is this information I(c;{g_i}) that we suggest is optimized by real networks.

To derive the consequences of our optimization principle, we require several ingredients: we need to describe the space of possible networks, we need to relate I(c;{g_i}) to some calculable properties of these networks, we need to put these pieces together to form the objective function for our optimization problem, and finally we need to perform the optimization itself. In the interests of proceeding analytically as far as possible down this path, before resorting to numerics, we adopt several approximations: we consider the case where expression levels have reached steady state values consistent with the input transcription factor concentration, we assume that the noise in the system is small, and we restrict our attention to networks that have a feed forward structure. We have discussed the first two approximations in our previous work [16], and our focus here on feed forward networks is intended as a useful intermediate step between the case of noninteracting target genes (as in Ref. [16]) and the case of arbitrary interactions, to which we will return in a subsequent paper.

A. Describing the regulatory interactions

The expression levels of genes are determined by a complex sequence of events, especially in eukaryotic cells where even the transcription of DNA into messenger RNA involves a complex of more than one hundred proteins [23]. As in much previous work, we abstract from this complexity to say that expression levels are set by a balance of synthesis and degradation reactions; in the simplest case only synthesis is regulated, and degradation is a first order process. Then the deterministic kinetic equations that govern the expression levels are

$$\frac{dG_i}{dt} = r_{\rm max}\, f_i(c;\{G_j\}) - \frac{1}{\tau} G_i, \qquad (1)$$

where τ is the lifetime of the gene products against degradation, r_max is the maximum number of molecules per second that can be synthesized, and f_i(c;{G_j}) is the "regulation function" that describes how the synthesis rate is modulated by the other molecules in the system. The regulation functions, which are positive and range between zero and one, will be different for every gene, but we assume that the lifetimes and maximum synthesis rates are the same for all of the genes [29]. In this formulation, G_i counts the number of molecules of the protein coded by gene i, while it is conventional to define c as a concentration. Before we are done, it will be useful to normalize these quantities.

The steady state expression levels, Ḡ_i(c), are the solutions of the simultaneous nonlinear equations

$$\bar G_i(c) = r_{\rm max}\tau\, f_i(c;\{\bar G_j(c)\}). \qquad (2)$$

In general there can be multiple solutions, corresponding to a network that has more than one stationary state; as noted above, we will confine our attention here to networks with a feed forward architecture, in which this cannot happen. In Fig. 1(b), we present an example of a feed forward network, where an input c regulates the expression of genes g_1 and g_2. The product proteins of g_1 also regulate gene g_2, but g_2 does not regulate g_1. Formally, the "no loops" condition defining feed forward networks means that there is some assignment of the labels i so that f_i only depends upon G_j for j < i, and hence, as will be important below, the matrix ∂f_i/∂G_j is lower triangular.
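Because the matrix ∂f_i/∂G_j is lower triangular, the steady state of Eq. (2) can be found in a single forward pass, with no root finding. A minimal sketch in the normalized units g_i = G_i/(r_max τ), where the steady state obeys g_i = f_i(c;{g_j, j < i}); the Hill-type regulation functions and parameter values below are illustrative assumptions, not the optimized networks derived in the paper:

```python
# Sketch: steady state of a feed-forward network (Eq. (2)), normalized
# so that g_i = f_i(c; {g_j, j < i}).

def hill_act(x, K, n):
    """Activating Hill function, Eq. (15): x^n / (K^n + x^n)."""
    return x**n / (K**n + x**n)

def steady_state(c, regulation):
    """Solve g_i = f_i(c; g_1..g_{i-1}) gene by gene; because the network
    has no loops, each f_i only sees the already-computed earlier genes."""
    g = []
    for f in regulation:
        g.append(f(c, g))
    return g

# Example in the spirit of Fig. 1(b): c activates g1 and g2, g1 represses g2.
f1 = lambda c, g: hill_act(c, 0.5, 2)
f2 = lambda c, g: hill_act(c, 0.5, 2) * (1.0 - hill_act(g[0], 0.3, 2))

g_bar = steady_state(1.0, [f1, f2])
```

At this input level the repression by g_1 leaves g_2 expressed well below g_1, which is the kind of competing-interaction behavior discussed in the Introduction.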

FIG. 1. Schematic models of transcriptional regulation. (a) Transcription factor proteins (A) can either activate (left) the expression of genes (B) by recruiting the RNA polymerase, or repress (right) the expression of genes by blocking the polymerase. Throughout the paper we will use the diagrammatic representation, in which an arrow depicts activation and a blunted arrow depicts repression. (b) A feed forward network in which the input c regulates the expression levels of g_1 and g_2, and g_1 also regulates g_2; the structure is feed forward because g_2 does not feed back on g_1.


A deterministic system with continuous inputs and outputs can transmit an infinite amount of information. In reality, information transmission is limited by the discreteness and randomness of the individual molecular events. If this noise is small, it can be described by adding Langevin "forces" to the deterministic kinetic equations [30-34]. To the extent that synthesis and degradation rates can be written as depending on the instantaneous concentrations or expression levels, as we have done in Eq. (1), then the system has no "hidden" memory, and the Langevin terms which describe the noise in the system should have effectively zero correlation time. We also will assume that the noise in the synthesis and degradation of different gene products are independent. Then we can write

$$\frac{dG_i}{dt} = r_{\rm max}\, f_i(c;\{G_j\}) - \frac{1}{\tau} G_i + \xi_i(t), \qquad (3)$$

$$\langle \xi_i(t)\, \xi_j(t') \rangle = N_i\, \delta_{ij}\, \delta(t - t'), \qquad (4)$$

where the N_i are the spectral densities of the noise forces. The spectral densities of noise have several components which add together. First, if we take Eq. (1) seriously, it describes synthesis and degradation as simple first order processes; the terms which appear in the deterministic equation as rates should then be interpreted as the mean rates of Poisson processes that describe the individual molecular transitions. Then we have

$$N_i^{(1)} = r_{\rm max}\, f_i(c;\{G_j\}) + \frac{1}{\tau} G_i. \qquad (5)$$

We are interested in small fluctuations around the steady state, so in this expression we should set G_i = Ḡ_i(c). Further, we can account for the possibility that synthesis is a multistep process by adding a "Fano factor" ν to the spectral density of synthesis noise, so we have

$$N_i^{(1)} = \nu\, r_{\rm max}\, f_i(c;\{\bar G_j(c)\}) + \frac{1}{\tau}\, \bar G_i(c) \qquad (6)$$

$$= \frac{1 + \nu}{\tau}\, \bar G_i(c). \qquad (7)$$

The Fano factor is a measure of the dispersion of the probability distribution, the ratio of the variance to the mean; here the relevant random variable is the number of reaction events in a small window of time. As an example, if synthesis occurs in bursts, then we expect ν > 1 [15,31,35,36]. Bursts of both proteins and mRNAs have now been experimentally observed in a number of bacterial and eukaryotic systems [9,37-39].
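The connection between bursting and a Fano factor above one can be illustrated with a small stochastic simulation: for synthesis in geometric bursts of mean size b and first order degradation, the stationary copy-number distribution is negative binomial with Fano factor 1 + b. The rates and burst size below are arbitrary illustrative choices, not parameters from the paper:

```python
import random

def bursty_fano(k_burst=50.0, b=4.0, tau=1.0, T=500.0, seed=1):
    """Gillespie simulation: bursts arrive at rate k_burst, each adding a
    geometric number of molecules (mean b); each molecule degrades at
    rate 1/tau. Returns the time-averaged mean and Fano factor of G."""
    rng = random.Random(seed)
    p = 1.0 / (1.0 + b)              # geometric on {0, 1, 2, ...}, mean b
    G, t = 0, 0.0
    w = m1 = m2 = 0.0                # time-weighted moment accumulators
    while t < T:
        rate = k_burst + G / tau
        dt = rng.expovariate(rate)
        w += dt; m1 += G * dt; m2 += G * G * dt
        t += dt
        if rng.random() < k_burst / rate:
            while rng.random() > p:  # draw the geometric burst size
                G += 1
        else:
            G -= 1                   # degradation of one molecule
    mean = m1 / w
    return mean, (m2 / w - mean**2) / mean

mean, fano = bursty_fano()
```

With mean burst size b = 4 the estimated Fano factor comes out near 1 + b = 5, well above the Poisson value of 1.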

A second source of noise comes from the diffusive arrival of transcription factor molecules at their targets. To describe this, we should remember that the regulation of transcription depends on events that occur in a very small volume, of linear dimension ℓ. The relevant concentrations are those in this volume, or more colloquially at the point where the transcription factors bind; binding and unbinding events act as localized sources and sinks for the diffusion equation that describes the spatiotemporal variations in concentration. Intuitively, this coupling generates noise because the diffusion of the transcription factors into the relevant volume is a stochastic process, so that even with the global concentration fixed there are local concentration fluctuations [40]. Analysis of this problem [4,5,15,41] shows that, so long as the time for diffusion through a distance ℓ is short, the net effect of the diffusive fluctuations is captured by a Langevin term

$$N_i^{(2)} = \left[ r_{\rm max}\, \frac{\partial f_i(c;\{G_j\})}{\partial c} \right]^2 \frac{c}{D\ell}, \qquad (8)$$

where D is the diffusion constant; again we should evaluate this at G_j = Ḡ_j(c) to describe the small fluctuations around the steady state.

The same arguments that apply to the input transcription factor also apply to each of the target gene products when they act as transcription factors. If there are G_i molecules of gene product i, then these proteins are present at concentration G_i/Ω, where Ω is the relevant volume [42], and the analog of Eq. (8) is

$$N_i^{(3)} = \sum_k \left[ r_{\rm max}\, \frac{\partial f_i(c;\{G_j\})}{\partial (G_k/\Omega)} \right]^2 \frac{G_k/\Omega}{D\ell} \qquad (9)$$

$$= \frac{\Omega}{D\ell} \sum_k \left[ r_{\rm max}\, \frac{\partial f_i(c;\{G_j\})}{\partial G_k} \right]^2 G_k. \qquad (10)$$

Finally, in putting all these terms together, it is convenient to measure expression levels in units of the maximum mean expression level, which is Ḡ_i = r_max τ molecules; that is, we define g_i = G_i/(r_max τ). Then we have

$$\tau \frac{dg_i}{dt} = f_i(c;\{g_j\}) - g_i + \eta_i(t), \qquad (11)$$

$$\langle \eta_i(t)\, \eta_j(t') \rangle = \delta_{ij}\, \delta(t - t')\, \frac{1}{r_{\rm max}^2} \left[ N_i^{(1)} + N_i^{(2)} + N_i^{(3)} \right] \qquad (12)$$

$$= \delta_{ij}\, \delta(t - t') \left[ \frac{1 + \nu}{r_{\rm max}}\, \bar g_i(c) + \left( \frac{\partial f_i(c;\{g\})}{\partial c} \right)^2 \frac{c}{D\ell} + \sum_k \left( \frac{\partial f_i(c;\{g\})}{\partial g_k} \right)^2 \frac{g_k}{D\ell\, (r_{\rm max}\tau/\Omega)} \right]_{\{g_k = \bar g_k(c)\}}. \qquad (13)$$

The parameter combinations which appear in these equations have simple interpretations. If ν = 1, the synthesis and degradation of each gene product molecule really is an independent Poisson process, and hence r_max τ is not just the maximum number of molecules that can be made (on average), but also the maximum number of independent molecules, N_g in the notation of Ref. [16]. For more complex processes, where ν > 1, we can write N_g = 2r_max τ/(1+ν). Similarly, the combination r_max τ/Ω is a concentration, the maximum (mean) concentration of the gene products. Since these are acting as transcription factors, we will assume that this maximum concentration is also the maximum concentration of the input transcription factor, c_max. As discussed previously [16], there is a natural scale of concentration, c_0 = N_g/(2Dτℓ), and if we use units in which c_0 = 1 we can write

$$\langle \eta_i(t)\, \eta_j(t') \rangle = \delta(t - t')\, N_{ij} = \delta(t - t')\, \delta_{ij}\, \frac{2\tau}{N_g} \left[ \bar g_i(c) + \left( \frac{\partial f_i(c;\{g\})}{\partial c} \right)^2 c + \frac{1}{c_{\rm max}} \sum_k \left( \frac{\partial f_i(c;\{g\})}{\partial g_k} \right)^2 g_k \right]_{\{g_k = \bar g_k(c)\}}. \qquad (14)$$

To complete our description of the system we need to specify the regulation functions f_i(c;{g_j}).
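To see how the three noise sources of Eq. (14) combine, one can evaluate the diagonal noise matrix numerically with finite-difference derivatives. The sketch below scales out the overall prefactor (which carries the factors of τ and N_g, and does not affect the relative comparison of noise sources); the regulation functions used in any application of it are up to the user, and the ones in the usage example below are illustrative Hill-type assumptions:

```python
# Sketch of the (diagonal) noise matrix of Eq. (14), up to its overall
# prefactor, with derivatives taken by central finite differences.
import numpy as np

def noise_matrix(c, f_list, g_bar, c_max, eps=1e-6):
    """f_list[i] is f_i(c, g); g_bar is the vector of steady-state levels.
    Returns the output-noise, input-noise, and cross-gene terms summed on
    the diagonal, as in Eq. (14)."""
    K = len(f_list)
    N = np.zeros((K, K))
    for i, f in enumerate(f_list):
        dfdc = (f(c + eps, g_bar) - f(c - eps, g_bar)) / (2 * eps)
        cross = 0.0
        for k in range(i):            # feed-forward: only genes j < i act on i
            gp, gm = g_bar.copy(), g_bar.copy()
            gp[k] += eps; gm[k] -= eps
            dfdg = (f(c, gp) - f(c, gm)) / (2 * eps)
            cross += dfdg**2 * g_bar[k]
        N[i, i] = g_bar[i] + dfdc**2 * c + cross / c_max
    return N
```

For the first gene in a feed-forward ordering the cross term vanishes, so its diagonal entry is just the output noise ḡ_1 plus the propagated input noise (∂f_1/∂c)² c.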

B. Regulation functions

The central event in transcriptional regulation is the binding of the transcription factor(s) to their specific sites along the DNA. These binding sites must be close enough to the start of the coding sequence of the target gene that binding can influence the initiation of transcription by the RNA polymerase and its associated proteins. In the simplest geometric picture, which might actually be accurate in bacteria [43-46], we can imagine the molecular interactions being so strong that the normalized mean rate of transcription, what we write as the regulation function f_i(c;{g_j}), is essentially the probability that certain binding sites are occupied or unoccupied. If n transcription factors bind cooperatively to nearby sites, and binding activates transcription, then we would expect

$$f(c) = \frac{c^n}{K^n + c^n}, \qquad (15)$$

and if the binding represses transcription, so that the mean rate is proportional to the fraction of empty sites, we should have

$$f(c) = \frac{K^n}{K^n + c^n}. \qquad (16)$$

In each case, the constant K measures the concentration for half-occupancy of the binding sites, and so F = −k_B T ln K can be interpreted as a binding energy for the transcription factor to its specific site along the genome. In the biological literature, these models are called Hill functions, after Hill's classical discussion of oxygen binding to hemoglobin [47]. While it is convenient to distinguish between activators and repressors, we note that, within the Hill function model, we can pass smoothly from one to the other by allowing n to change sign.

The Hill function model identifies the transcription of a gene as a logical function of binding site occupancy, so that the mean rate of transcription will be proportional to the probability of the sites being occupied (or, in the case of repressors, not occupied). Thus there is a natural generalization when multiple transcription factors converge to control a single gene, as in Fig. 2: the rate of transcription should be proportional to logical functions on the occupancy of multiple binding sites. Thus, if the input transcription factor and the gene j converge to activate gene i, transcription could depend on the AND of the two binding events; if these are independent, then the regulation function becomes

$$f_i(c, g_j) = \frac{c^{n_i}}{K_i^{n_i} + c^{n_i}}\; \frac{g_j^{n_{ij}}}{K_{ij}^{n_{ij}} + g_j^{n_{ij}}}. \qquad (17)$$

Alternatively, transcription could depend on the OR of the two binding site occupancies, in which case

$$f_i(c, g_j) = \frac{c^{n_i}}{K_i^{n_i} + c^{n_i}} + \frac{g_j^{n_{ij}}}{K_{ij}^{n_{ij}} + g_j^{n_{ij}}}, \qquad (18)$$

and we can construct functions that interpolate or mix AND and OR. In general, we find that the pure AND functions transmit the most information. Thus, we will write

$$f_i(c,\{g_j\}) = \frac{c^{n_i}}{K_i^{n_i} + c^{n_i}} \prod_{j<i} \frac{g_j^{n_{ij}}}{K_{ij}^{n_{ij}} + g_j^{n_{ij}}}, \qquad (19)$$

where the restriction j < i confines our attention to feed forward networks with no loops. If n_ij > 0, then K_ij → 0 means that gene j does not regulate gene i; if n_ij < 0, then the condition for no interaction is K_ij → ∞.
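The Hill AND form of Eq. (19), with activation and repression distinguished only by the sign of n as in Eqs. (15) and (16), can be written compactly. A sketch, where passing the parameters as dictionaries over upstream gene indices is a bookkeeping choice made for this example, not notation from the paper:

```python
def hill(x, K, n):
    """Hill term of Eqs. (15)-(16): x^n/(K^n + x^n) for n > 0 (activation);
    for n < 0 this equals K^|n|/(K^|n| + x^|n|) (repression)."""
    if n >= 0:
        return x**n / (K**n + x**n)
    m = -n
    return K**m / (K**m + x**m)

def f_and(c, g, K_c, n_c, K, n):
    """AND regulation of Eq. (19): a product of Hill terms, one for the
    input c and one for each upstream gene j < i. K and n map the
    upstream gene index to its half-saturation point and Hill exponent."""
    out = hill(c, K_c, n_c)
    for j in K:
        out *= hill(g[j], K[j], n[j])
    return out
```

The explicit branch on the sign of n avoids dividing by zero at x = 0 for repressors, while keeping the smooth activator-to-repressor interpolation that the text describes.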

FIG. 2. Regulation of transcription in the presence of two transcription factors. (a) Two types of transcription factors, squares and triangles, can bind to the regulatory region of the gene. This situation does not uniquely determine the form of regulation. (b) The Hill regulatory model with the AND logic for the binding of the two types of transcription factors. In this case both types of transcription factors must be bound to initiate transcription. The response of the gene is proportional to the product of the activation by each transcription factor separately. (c) The MWC model for regulation. The transcriptional complex has two states, active and inactive, and binding of any type of transcription factor shifts the equilibrium between the two states.


An alternative model for regulation envisions a large transcriptional complex that can be in one of two states, and the rate of transcription is proportional to the population of the "active" state; transcription factors act by binding and shifting the equilibrium between the two states. This is essentially the model proposed by Monod, Wyman, and Changeux for allosteric enzymes and cooperative binding [48]. The important idea is that, in each state, all effector molecules bind independently, but the binding energies are different in the two states. By detailed balance, this difference in binding energy means that binding will shift the equilibrium between the two states. In the case of one transcription factor at concentration c, with n binding sites, the probability of being in the active state is given by

$$f(c) = \frac{(1 + c/K_{\rm on})^n}{L^n (1 + c/K_{\rm off})^n + (1 + c/K_{\rm on})^n}, \qquad (20)$$

where k_B T ln L is the free energy difference between the two states in the absence of any bound molecules. If binding favors the active state, and the gene has a very small probability to be "on" in the absence of input (L ≫ 1), then taking K_on ≪ K_off reduces Eq. (20) to

$$f(c) = \frac{(1 + c/K_{\rm on})^n}{L^n + (1 + c/K_{\rm on})^n} \qquad (21)$$

$$= \frac{1}{1 + \exp\left[ -n \ln\left( \frac{1 + c/K_{\rm on}}{1 + c_{1/2}/K_{\rm on}} \right) \right]}, \qquad (22)$$

where we have introduced the constant c_{1/2} to simplify notation, L = 1 + c_{1/2}/K_on, and c_{1/2} is the input concentration for half maximal activation [49]. Note that activation and repression differ only in the sign of n.

When more than one type of protein regulates expression (Fig. 2(c)), each binding event makes its independent contribution to shifting the equilibrium between active and inactive states. Thus, in the same limit of very unequal binding constants to the two states, we can write

$$f_i(c,\{g_j\}) = \frac{1}{1 + \exp[-F_i(c,\{g_j\})]}, \qquad (23)$$

$$F_i(c,\{g_j\}) = n_i \ln\left( \frac{1 + c/K_i}{1 + c_{1/2}^{(i)}/K_i} \right) + \sum_{j<i} n_{ij} \ln\left( \frac{1 + g_j/K_{ij}}{1 + \bar g_j(c_{1/2}^{(i)})/K_{ij}} \right). \qquad (24)$$

For a single transcription factor, there is not much difference between the Hill and MWC models: the classes of functions are very similar, and when we go through the optimization of information transmission, the optimal parameter values are such that the input/output relations are nearly indistinguishable. But, as we shall see, when multiple transcription factors converge, the Hill and MWC models lead to significantly different behaviors. This raises the possibility that the molecular details of the regulatory process are still 'felt' at the more macroscopic, phenomenological level, but this is a huge set of questions that we leave aside for the moment, focusing just on these two models.
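The MWC form of Eqs. (23) and (24) can be sketched in the same style as the Hill AND function: each input adds its term to the free energy F_i, and f_i is a logistic function of F_i. As before, the dictionary-based argument names are a bookkeeping convention for this sketch, not notation from the paper:

```python
import math

def f_mwc(c, g, n_c, K_c, c_half, n, K, g_at_chalf):
    """MWC regulation of Eqs. (23)-(24): every input contributes additively
    to the free-energy difference F_i, and f_i = 1/(1 + exp(-F_i)).
    c_half is the reference input at which the c-term vanishes; n, K, and
    g_at_chalf are dicts over upstream genes j < i, with g_at_chalf[j]
    the upstream expression level evaluated at that reference input."""
    F = n_c * math.log((1 + c / K_c) / (1 + c_half / K_c))
    for j in K:
        F += n[j] * math.log((1 + g[j] / K[j]) / (1 + g_at_chalf[j] / K[j]))
    return 1.0 / (1.0 + math.exp(-F))
```

With no upstream genes, f passes through 1/2 exactly at c = c_half, and the sign of n_c alone decides whether the response rises or falls, mirroring the single-input limit of Eq. (22).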

C. Information transmission

We are interested in connecting the properties of the networks described in the previous section to the information that the expression levels carry about the input transcription factor concentration. This requires only a modest generalization of the discussion in Ref. [16], but, in the interest of clarity, we review the derivation.

In the limit that noise is small, the solution of the Langevin equations in Sec. II A has the form of Gaussian fluctuations around the steady state. Formally, the probability distribution of expression levels given the concentration of the input transcription factor is

$$P(\{g_i\}|c) = \frac{1}{(2\pi)^{K/2}} \exp\left[ \frac{1}{2} \log\det(K) - \frac{1}{2} \sum_{i,j=1}^{K} (g_i - \bar g_i(c))\, K_{ij}\, (g_j - \bar g_j(c)) \right]. \qquad (25)$$

The mean expression level of each gene, ḡ_i(c), defines an input/output relation for that gene, and the matrix K is the inverse covariance matrix of the fluctuations or noise in the expression levels at fixed input,

$$\langle (g_i - \bar g_i(c))(g_j - \bar g_j(c)) \rangle = (K^{-1})_{ij}; \qquad (26)$$

note that K is a function of c. The fact that the expression levels "carry information" about the input transcription factor concentration means that it should be possible to estimate c from knowledge of the {g_i}. From Bayes' rule, the distribution of concentrations consistent with some set of expression levels is given by

$$P(c|\{g_i\}) = \frac{P(\{g_i\}|c)\, P(c)}{P(\{g_i\})}. \qquad (27)$$

In the limit that noise is small (K is large), this too is an approximately Gaussian distribution,

$$P(c|\{g_i\}) \approx \frac{1}{\sqrt{2\pi \sigma_c^2}} \exp\left[ -\frac{(c - c^*(\{g_i\}))^2}{2\sigma_c^2} \right], \qquad (28)$$

where c^*({g_i}) is the most likely input given the output, and the effective variance is determined by

$$\frac{1}{\sigma_c^2(\{g_i\})} = \sum_{i,j=1}^{K} \left. \frac{d\bar g_i(c)}{dc}\, K_{ij}\, \frac{d\bar g_j(c)}{dc} \right|_{c = c^*(\{g_i\})}. \qquad (29)$$

In general it is difficult to find an explicit expression for c^*({g_i}), but we will see that this is not crucial. We can use these ingredients to calculate the amount of information, in bits, that the expression levels {g_i} provide about the input concentration c.

The mutual information between c and {g_i} is defined, following Shannon [50,51], as


$$I(c;\{g_i\}) = \int dc \int d^K g\; P(c,\{g_i\})\, \log_2\!\left[ \frac{P(c,\{g_i\})}{P(c)\, P(\{g_i\})} \right]. \qquad (30)$$

We can rewrite this as a difference in entropies,

$$I(c;\{g_i\}) = -\int dc\, P(c) \log_2 P(c) - \int d^K g\, P(\{g_i\}) \left[ -\int dc\, P(c|\{g_i\}) \log_2 P(c|\{g_i\}) \right], \qquad (31)$$

where the first term is the entropy of the overall distribution of input concentrations, and the second term is the (mean) entropy of the distribution of inputs conditional on the output. With the Gaussian approximation from Eq. (28), we can evaluate

$$-\int dc\, P(c|\{g_i\}) \log_2 P(c|\{g_i\}) = \frac{1}{2} \log_2\!\left[ 2\pi e\, \sigma_c^2(\{g_i\}) \right]. \qquad (32)$$

Further, when the noise is small, we can replace an average over expression levels by an average over inputs, setting the expression levels to their mean values along this path:

$$\int d^K g\, P(\{g_i\}) \left( \cdots \right) \approx \int dc\, P(c) \prod_i \delta\!\left( g_i - \bar g_i(c) \right) \left( \cdots \right). \qquad (33)$$

Putting these pieces together, we find an expression for the information in the low noise limit,

$$I(c;\{g_i\}) = -\int dc\, P(c) \log_2 P(c) - \frac{1}{2} \int dc\, P(c) \log_2\!\left[ 2\pi e\, \sigma_c^2(\{\bar g_i(c)\}) \right]. \qquad (34)$$

As Eq. (34) makes clear, the information transmitted through the regulatory network depends on the input/output relations, on the noise level, and on the distribution of inputs P(c). The cell can optimize information transmission by adjusting this input distribution to match the characteristics of the network [6,7], but this matching is constrained by the cost of making the input molecules. We can implement this constraint by trading bits against a cost function that counts the mean number of molecules, or more simply by insisting that c is bounded by some maximum concentration c_max. In Ref. [16], we have shown that these different ways of implementing the constraint give essentially identical results, so we use the c_max approach here. We still need a constraint to force the normalization of P(c), so we should maximize the functional

$$\mathcal{L} = I(c;\{g_i\}) - \Lambda \int dc\, P(c). \qquad (35)$$

The solution to this problem can be written, using the expressions for I(c;{g_i}) from above, as

$$P^*(c) \propto \frac{1}{\sigma_c(\{\bar g_i(c)\})} \qquad (36)$$

$$= \frac{1}{Z} \left[ \frac{1}{2\pi e} \sum_{i,j=1}^{K} \frac{d\bar g_i(c)}{dc}\, K_{ij}(c)\, \frac{d\bar g_j(c)}{dc} \right]^{1/2}, \qquad (37)$$

where the normalization constant

$$Z = \int_0^{c_{\rm max}} dc \left[ \frac{1}{2\pi e} \sum_{i,j=1}^{K} \frac{d\bar g_i(c)}{dc}\, K_{ij}(c)\, \frac{d\bar g_j(c)}{dc} \right]^{1/2}. \qquad (38)$$

Finally, the information itself, evaluated with the optimal input distribution, is

$$I^*(c;\{g_i\}) = \log_2 Z. \qquad (39)$$

This maximal mutual information still depends on all the parameters that describe the network, and our goal is to find the parameter values that maximize I^*. To do this, it is useful to be a little more explicit about how the parameters enter into the computation of Z.
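For a single gene (K = 1), Eqs. (37)-(39) reduce to a one-dimensional integral, Z = ∫ dc |dḡ/dc| / (√(2πe) σ_g), which is easy to evaluate numerically. In the sketch below, the small-noise output variance σ_g² = [ḡ + c (dḡ/dc)²]/N_g (in units c_0 = 1) is an assumption carried over from the noninteracting case of Ref. [16], and the input/output relation is an illustrative Hill function, not an optimized one:

```python
# Numerical sketch of Z and I* = log2 Z (Eqs. (38)-(39)) for one gene.
import numpy as np

def info_one_gene(g_of_c, c_max=1.0, Ng=100, M=4000):
    c = np.linspace(1e-4, c_max, M)
    g = g_of_c(c)
    dg = np.gradient(g, c)
    # assumed small-noise variance: output noise plus propagated input noise
    sigma_g = np.sqrt((g + c * dg**2) / Ng)
    integrand = np.abs(dg) / (np.sqrt(2 * np.pi * np.e) * sigma_g)
    Z = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(c))  # trapezoid rule
    return np.log2(Z)

# An illustrative Hill input/output relation, for two molecule budgets:
I_100 = info_one_gene(lambda c: c**2 / (0.25**2 + c**2), Ng=100)
I_400 = info_one_gene(lambda c: c**2 / (0.25**2 + c**2), Ng=400)
```

Because σ_g scales as 1/√N_g in this approximation, quadrupling N_g doubles Z and adds exactly one bit, illustrating how the capacity grows with the number of available molecules.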

D. Putting the pieces together

To evaluate Z in Eq. (38), we need to know the inverse covariance matrix K_ij that describes the noise in gene expression. This can be calculated from the Langevin equations in Sec. II A. To do this, we recall that we are looking at the small noise limit, so we linearize the Langevin equations around the steady state ḡ_i = ḡ_i(c), and then Fourier transform. With g_i(t) = ḡ_i + δg_i(t), and

\[
\delta g_i(t) = \int \frac{d\omega}{2\pi}\, e^{-i\omega t}\, \delta\tilde{g}_i(\omega),
\tag{40}
\]

Equation (11) becomes

\[
\begin{pmatrix}
1 - i\omega\tau & 0 & 0 & \cdots \\
-\Lambda_{21} & 1 - i\omega\tau & 0 & \cdots \\
-\Lambda_{31} & -\Lambda_{32} & 1 - i\omega\tau & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\begin{pmatrix}
\delta\tilde{g}_1(\omega) \\ \delta\tilde{g}_2(\omega) \\ \delta\tilde{g}_3(\omega) \\ \vdots
\end{pmatrix}
=
\begin{pmatrix}
\tilde{\xi}_1(\omega) \\ \tilde{\xi}_2(\omega) \\ \tilde{\xi}_3(\omega) \\ \vdots
\end{pmatrix},
\tag{41}
\]

where

\[
\Lambda_{ij} = \left.\frac{\partial f_i(c,\{g_k\})}{\partial g_j}\right|_{\{g_k = \bar{g}_k(c)\}}.
\tag{42}
\]

In this expression we explicitly restrict ourselves to networks with a feed-forward architecture, which (as noted above) means that we can assign labels i so that gene j regulates gene i only if j < i. We recall from Eq. (14) that

\[
\langle \xi_i(t)\, \xi_j(t') \rangle = \delta(t-t')\, N_{ij},
\tag{43}
\]

where the noise matrix is diagonal,

WALCZAK, TKAČIK, AND BIALEK PHYSICAL REVIEW E 81, 041905 �2010�

041905-6

\[
N_{ij} = \delta_{ij}\,\frac{\tau}{N_g}\left[ \bar{g}_i(c) + \left(\frac{\partial f_i(c;\{g\})}{\partial c}\right)^2 c + \frac{1}{c_{\rm max}}\sum_k \left(\frac{\partial f_i(c;\{g\})}{\partial g_k}\right)^2 \bar{g}_k \right]_{\{g_k = \bar{g}_k(c)\}}.
\tag{44}
\]

The delta function in time means that the noise is white,

\[
\langle \tilde{\xi}_i(\omega)\, \tilde{\xi}_j^*(\omega') \rangle = 2\pi\,\delta(\omega-\omega')\, N_{ij}.
\tag{45}
\]

Equations (41) are of the form

\[
\mathbf{A}(\omega)\cdot \delta\tilde{\mathbf{g}}(\omega) = \tilde{\boldsymbol{\xi}}(\omega),
\tag{46}
\]

where A(ω) is the matrix describing the linearized dynamics, δg̃(ω) = (δg̃₁(ω), δg̃₂(ω), ⋯), and similarly for ξ̃(ω). Evidently, δg̃(ω) = A⁻¹(ω) ξ̃(ω). We are interested in calculating the covariance matrix of the fluctuations in δg. Since these are stationary, in the frequency domain we will have

\[
\langle \delta\tilde{g}_i(\omega)\, \delta\tilde{g}_j^*(\omega') \rangle = 2\pi\,\delta(\omega-\omega')\, S_{ij}(\omega),
\tag{47}
\]

and the equal time correlations that we want to calculate are related to the power spectrum through

\[
\langle \delta g_i\, \delta g_j \rangle = \int \frac{d\omega}{2\pi}\, S_{ij}(\omega).
\tag{48}
\]

Following through the algebra, we have

\[
\langle \delta\tilde{g}_i(\omega)\, \delta\tilde{g}_j^*(\omega') \rangle
= [\mathbf{A}^{-1}(\omega)]_{ik}\,[\mathbf{A}^{-1}(\omega')]^*_{jm}\, \langle \tilde{\xi}_k(\omega)\, \tilde{\xi}_m^*(\omega') \rangle
\tag{49}
\]
\[
= [\mathbf{A}^{-1}(\omega)]_{ik}\,[\mathbf{A}^{-1}(\omega')]^*_{jm}\, N_{km}\, 2\pi\,\delta(\omega-\omega'),
\tag{50}
\]
\[
\Rightarrow S_{ij}(\omega) = [\mathbf{A}^{-1}(\omega)]_{ik}\, N_{km}\, [\mathbf{A}^{-1}(\omega)]^*_{jm}
= \left[\mathbf{A}^{-1}(\omega)\,\mathbf{N}\,(\mathbf{A}^{-1}(\omega))^\dagger\right]_{ij},
\tag{51}
\]

where M† denotes the Hermitian conjugate of the matrix M. The covariance matrix, which is the inverse of K_ij in Eq. (26), then takes the form

\[
(K^{-1})_{ij} = \int \frac{d\omega}{2\pi} \left[\mathbf{A}^{-1}(\omega)\,\mathbf{N}\,(\mathbf{A}^{-1}(\omega))^\dagger\right]_{ij}.
\tag{52}
\]

To complete the calculation of Z we need to do this frequency integral, and then an integral over the concentration c, as in Eq. (38). Some details of these calculations are given in the Appendix. Here we draw attention to two key points.

First, because the noise is white—that is, N is independent of frequency—all of the structure in the frequency integral comes from the inverse of the matrix A(ω). Since the matrix elements of A are linear in frequency, this results in an integrand which is a rational function of ω, so the integrals can all be done by closing a contour in the complex ω plane. Again, examples are in the Appendix.
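As a concrete check of the contour argument, the snippet below (an added illustration; τ is arbitrary) evaluates the basic Lorentzian integrals that appear in this problem by direct quadrature and recovers the residue-theorem values 1/(2τ) and 1/(4τ), obtained by closing the contour around the pole(s) at ω = i/τ.

```python
import math

def lorentzian_integral(tau, power=1, steps=100000):
    """Evaluate \\int d(omega)/(2*pi) [1 + (omega*tau)^2]^(-power) numerically.

    The substitution omega = tan(theta)/tau maps the whole real line
    onto the finite interval (-pi/2, pi/2), so a simple midpoint rule
    handles the infinite range of the frequency integral.
    """
    total = 0.0
    dtheta = math.pi / steps
    for i in range(steps):
        theta = -math.pi / 2 + (i + 0.5) * dtheta
        omega = math.tan(theta) / tau
        f = (1.0 / (1.0 + (omega * tau) ** 2)) ** power
        total += f * dtheta / (tau * math.cos(theta) ** 2)  # f * d(omega)
    return total / (2.0 * math.pi)
```

The quadrature reproduces the contour-integral results to high accuracy: 1/(2τ) for power = 1 and 1/(4τ) for power = 2, the two integrals worked out in the Appendix.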

Second, and most importantly, when we finish computing Z it depends on all the parameters we want to optimize—e.g., in the Hill description of the regulation functions, all of the half-maximal points K_ij and the cooperativities n_ij—and in addition it depends on the maximal concentration of the input transcription factor, cmax. But there are no other parameters in the problem. Thus, if we search for the maximum of Z, and hence the optimum information transmission, we will find the parameters of the network as a function of the available number of molecules, with no remaining arbitrariness.

III. OPTIMAL NETWORKS AND THEIR PARAMETERS

In previous work [16], we considered the case of many noninteracting genes regulated by a single transcription factor. Optimizing information transmission in these systems leads to two qualitatively different families of solutions, depending on the available dynamic range cmax/c0. At small cmax, the optimal strategy is for the different output genes to respond in the same way, allowing the system to make multiple independent “measurements” of the input concentration; in this case the target genes are completely redundant. At large cmax, a tiling solution appears, in which the genes are turned on sequentially at different concentrations; the precise setting of the thresholds is determined by a compromise between minimizing the noise and maximizing the use of the dynamic range. Even in this regime, however, the target genes are (partially) redundant. Here we explore how the interactions between target genes can be tuned to reduce redundancy and increase the capacity of the system to transmit information.

A. Two target genes

We first consider feed forward networks in which an input transcription factor at concentration c regulates the expression of two genes, with normalized expression levels g1 and g2; in addition, the product proteins of gene 1 regulate the expression of gene 2. Within this class of networks, we can calculate the information transmission following the outline above, and then search for optimal values of all the parameters.

We find that the general problem has many well defined local optima, corresponding to different topologies of the network, as shown in Fig. 3. With just two target genes and a single input, the constraint of considering feed forward networks means that the locations of the arrows in the graph describing the network are fixed, but it is conventional to distinguish different topologies based on the signs associated to the arrows, that is whether a transcription factor acts as an activator (A) or repressor (R), leaving us with multiple possibilities. For convenience we describe, for example, a network in which the input activates both target genes, while gene 1 represses gene 2, as A1A2−R12.

Our intuition is that the role of interactions is to reduce redundancy, and this is borne out by the solutions associated to each topology. In particular, if the input activates both targets, then having one target activate the other can't help to reduce redundancy, and correspondingly we find that in the A1A2−A12 topology the interaction between the target genes is actually driven to zero by the optimization, and this is true


whether we use the Hill model of regulation or the MWC model (top and bottom left panels of Fig. 3). The same intuition suggests that the R1R2−A12 topology should also be driven to zero interaction, and this too is what we find (right panels of Fig. 3). As an aside, these comparisons also verify that, in the noninteracting case, the Hill and MWC models achieve very similar input/output relations once we find the parameters that optimize information transmission.

The networks that are optimized with nontrivial interactions are the ones where one target gene represses the other, A1A2−R12 and R1R2−R12. For the Hill model, both of these networks are optimized when one of the target genes is expressed only in a finite band or stripe of input concentrations, so that if we monitor only the expression level g2, we see a nonmonotonic dependence on the input transcription factor. If we think of the input concentrations as varying along one axis of a two dimensional space—as they do in the genetic networks controlling embryonic development [17–20]—then this nonmonotonic behavior means that one of the genes will be expressed in a “stripe” through the middle of the spatial domain. The other gene, which is itself a repressor, is expressed only at the extremes of the concentration range, either at high or low input concentrations depending on whether the input is an activator or repressor, respectively. As in the case of noninteracting target genes, all of this structure depends on the available dynamic range of


FIG. 3. Locally optimal networks for four topologies of the two gene feed forward networks, with cmax/c0 = 10. Top panels are for the case of the Hill model, Eq. (19), and bottom panels are for the MWC model, Eq. (24). As explained in the text, the optimal parameters for the A1A2−A12 and the R1R2−A12 networks correspond to zero interaction between the target genes, so these solutions also provide a reference for input/output relations in the optimal noninteracting networks. Note that the noninteracting solutions are very similar for Hill and MWC, while the optimal interacting networks are qualitatively different.


inputs. As we make cmax/c0 smaller, the optimal strength of the interactions becomes smaller, and the networks that optimize information transmission are more nearly noninteracting; if cmax/c0 is sufficiently small, the optimal solution again is for all targets to be completely redundant, allowing the system to make multiple independent measurements of the same signal, rather than using the different targets to respond to different portions of the input dynamic range.

We know from the analysis of the noninteracting system that suboptimal settings of all the parameters really do incur large penalties, substantially reducing information transmission [16]. This remains true for the interacting case, in that if we choose a topology but set the parameters arbitrarily, there is a significant loss of information. On the other hand, once we choose parameters optimally, the different topologies have very similar information capacities, at least in the case of two target genes, as shown in Fig. 4. To set a scale for these results on information transmission, we could ask whether there really is an advantage to having two target genes. After all, noise is reduced by making more molecules, and with two targets the maximal number of output molecules will be increased. A useful comparison, then, is with a system that has just one target gene, but can generate twice as many output molecules, and this is shown in Fig. 5. We see that, as the structure of the optimal networks collapses at low values of cmax/c0, so too does the advantage of having two separate target genes. At very low concentrations of the input transcription factors, it really is better to have one target with twice the number of molecules, but as soon as there are enough molecules available to favor two target genes, the interactions between these genes become important in building the networks that optimize information flow.

One of the most interesting results of the optimization problem is that the Hill and MWC models, which are almost indistinguishable in the noninteracting case, are driven to

qualitatively different solutions in the interacting case. The basic idea of how interactions relieve redundancy can be understood in a limit where each gene is either fully expressed (g = 1) or completely turned off (g = 0). For the noninteracting network, for example, with the A1A2 topology, the genes “turn on” in sequence, and so the system accesses states 00, 01, and 11 in order as the input concentration is increased. In the interacting network A1A2−R12, the optimal Hill model (cf. Fig. 3, top left panel) accesses states 00, 01, and then switches to 10, not quite achieving the state 11. In contrast, the MWC model, once optimized, accesses all the states in sequence, 00→01→10→11, as the input concentration is increased, as we can see from the lower left panel of Fig. 3; the situation is reversed in the case of R1R2−R12. We can understand how this happens, because in the MWC model each binding event shifts the equilibrium between the active and inactive states; thus there is a competition on gene 2 between the activating effect of the input and the repressing effect of gene 1, and this competition continues throughout the range of input concentration. In the MWC model, by adjusting parameters, it is possible that the final increase in input concentration up to cmax overwhelms the repressive effect of gene 1, and indeed this is what happens at the parameter values that optimize information transmission.

The difference between the Hill and MWC models indicates that the relatively microscopic physics of protein-DNA interactions which underlies transcriptional regulation can be “felt” at a more macroscopic, phenomenological level. In this spirit, we should also ask if it matters that we assume each protein can act only as an activator or a repressor, but not both. It is easy to imagine the alternative in which, by proper placement of the binding sites, the transcription factor can activate one gene by recruiting the polymerase, but can


FIG. 4. Optimal information transmission in different network topologies, with Hill model regulation. The information capacity of a network depends on Z, from Eq. (38), which in turn is proportional to the number of independent copies of the output molecules, Ng. Here we plot Z̃ = 2πeZ²/Ng as a function of the maximal concentration of input transcription factor, cmax/c0.


FIG. 5. One target or two? We show the dependence of Z̃ on the maximum concentration of input molecules, in the case of Hill regulation functions, comparing an interacting two gene network, a noninteracting two gene network, and a single output gene with twice the number of molecules (N_g^single = 2N_g^two). Note that to make the comparison meaningful, we have to be careful about the definition of c0, since this concentration scale includes a factor of Ng; here we normalize to c0 for the two gene networks.


repress other genes by occlusion. We have analyzed these mixed networks, and find that the optimal information transmission in these cases is intermediate between the “pure” networks considered thus far. Thus, considering these additional topologies does not change the global picture of the problem, except to open the possibility of yet more local optima.

B. Three target genes

The analysis of networks with three target genes confirms and amplifies the lessons learned in the two gene case. Because the space of possible networks is much larger, we once again restrict individual proteins to act only as activators or repressors, and give explicit results only for the case of the Hill function models of regulation, Eq. (19); results are shown in Figs. 6 and 7. As with two genes, having the target genes activate one another cannot reduce the redundancy that is generated in the noninteracting networks, and hence the A1A2A3−A12A13A23 topology networks are driven back to the noninteracting A1A2A3 when we solve the optimization problem, and similarly for R1R2R3−A12A13A23. Topologies that include some repressive interactions generate nontrivial solutions, with progressively richer structure the more repression we allow.

The signature of the repressive interactions in the case of two targets was the emergence of nonmonotonic dependencies of the expression levels on the input transcription factor concentration. In the case of three targets, the parameters which maximize information transmission again lead to nonmonotonicity. In the case where the input is an activator (Fig. 6), allowing for one repressive interaction (A1A2A3−A12A13R23) leads to one gene having a nonmonotonic response. If we allow two repressive interactions (A1A2A3−R12R13A23) then two genes have nonmonotonic responses. Finally, with the maximum of three repressive interactions (A1A2A3−R12R13R23) one of the two nonmonotonic responses becomes yet more complex, with two “stripes” of expression in different ranges of the input concentration. In the case of an input repressor, we see a similar pattern (Fig. 7).

A striking feature of the networks with two target genes was that local optima with different topologies were nearly degenerate. This is less true for the case of three targets—the spread in capacities associated with the different topologies is larger, and the network A1A2A3−R12R13R23 is the clear global optimum. Corresponding to the larger spread in capacities, the enhancement of information transmission by interactions is larger, and the difference in capacity between the Hill and MWC models also is larger in the three gene networks (Fig. 8). Including the possibility that individual transcription factors could act as both activators and repressors does not change these conclusions.

As a caveat, we note that we have assigned the same maximal concentration to all the transcription factors in the network. This is equivalent to assuming that all these molecules come at the same cost to the organism. One could


FIG. 6. Locally optimal networks with the three target genes, with the input acting as an activator. Regulation is according to the Hill model, Eq. (19), and we choose parameters to maximize information transmission with cmax/c0 = 10. Top row shows the network topologies, and the middle row shows the average expression levels of the different target genes, as in Fig. 3. The bottom panel shows the distribution of input concentrations that matches the optimal networks, from Eq. (37).


imagine that there are significant differences between the molecules, and hence that one could be expressed at higher levels than the others for the same cost. If this symmetry is broken explicitly, then the ordering of the local optima associated to different network topologies can be shifted. While it is satisfying to identify a global optimum, it thus probably still makes sense to think of the problem as having many local optima that might all be relevant in different biological contexts. An interesting feature of these different optima is that, despite the changing structure of the input/output relations for the target genes, the distribution of input concentrations that optimizes information transmission (cf. Eq. (37)) is almost the same for all the cases where the input is an activator (bottom panels of Fig. 6) and again for all the cases where the input is a repressor (bottom panels of Fig. 7). Although maximizing information flow requires matching of the input distribution to the characteristics of the regulatory network [6,7], it seems that once the parameters of the network itself have been optimized, the solution to the matching problem has a more universal structure, with almost the same distribution providing the best match to several different local optima.

Finally, we consider the parameter values in these optimal networks. For simplicity we focus on the globally optimal solutions with Hill model regulation functions for two (A1A2−R12) and three (A1A2A3−R12R13R23) target genes, plotting in Fig. 9 the evolution of the different constants Ki and Kij as a function of the maximal input concentration cmax/c0. As noted above, the optimal interacting networks collapse onto the noninteracting case at small cmax; for repressors, interactions become negligible if Kij becomes large, and this is what we see at small cmax. At the opposite extreme, as cmax/c0 becomes large, the optimal Ki become nearly constant fractions of cmax, while the interactions become stronger and stronger even up to cmax/c0 ∼ 10². Put another way, the concentrations at which the target genes are turned on by the input become distributed at fixed fractions of the available dynamic range, while the points at which the targets are turned off by their mutual repression evolve as a function of


FIG. 7. Locally optimal networks with the three target genes, with the input acting as a repressor; all else as in Fig. 6.


FIG. 8. Dependence of Z̃ on the maximum available protein concentrations cmax/c0 for two and three gene networks. We show results for the noninteracting cases, and for the globally optimal topologies, using both the Hill and MWC models of regulation.


the maximal input concentration. Further, in the interacting networks, the difference between two and three genes is not simply an extra target, but a readjustment of the interactions to make optimal use of this extra output channel.

IV. DISCUSSION

The expression levels of target genes provide information about the concentration of the transcription factors which provide input to the regulatory network. Our goal here has been to understand how the structure of the network can be adjusted to optimize this information transmission. Earlier work [16] addressed this problem in the limit where the target genes do not interact, so the input is simply “broadcast” to multiple targets; here we considered the role of interactions within a network of target genes, limiting ourselves, for simplicity, to feed forward structures. We have found that there are an exponentially large number of locally optimal networks, each with parameters tuned to a well defined point that depends only on the maximal number of available molecules. These optimal networks have two components. As in the noninteracting case, the affinities of the input transcription factor for its multiple targets must be adjusted to balance the need to use the full dynamic range of inputs against the need to avoid noise at low input concentrations and output expression levels. Unique to the interacting case, mutual repression among the target genes is tuned to reduce the redundancy among the network's outputs.

The problem of redundancy reduction has a long history in the context of neural coding, dating back to the first decade after Shannon's original work [52–54]. As first emphasized for the retina, nearby neurons often receive correlated signals; since the dynamic range of neural outputs is limited, this redundancy compromises the ability of the system to transmit information. The solution to this problem is lateral inhibition—neighboring neurons that receive correlated inputs inhibit one another, so that each neuron transmits something which approximates the difference between its input and that of its neighbors. Lateral inhibition, sometimes mediated through a more complex network, is a common motif in neural circuitry, and in the retina it is expressed by the “center-surround” organization of the individual neurons' receptive fields. A more careful analysis shows that there is a tradeoff between redundancy reduction and noise reduction, so that the extent of inhibitory surrounds should be reduced, even to zero, as background light intensity is reduced and photon shot noise is increased, and this is in semiquantitative agreement with experiment [55,56]. The relation of these ideas in neural coding to our present discussion of genetic networks should be clear. Although there are many questions about how these ideas connect more quantitatively to experiment, it is attractive to think that similar physical principles, and even analogous implementations of these principles, might be relevant across such a wide range of biological organization.

There are relatively few transcriptional regulatory networks that have been characterized to the point of comparing quantitatively with the sorts of models we have explored here. In contrast, there is a growing literature on the qualitative, topological structure of these networks. In particular, a number of groups have focused on the local “motif” structure of large networks, searching for patterns of interaction that are over-represented relative to some randomized ensemble of networks [57,58]. One such motif is the feed forward loop, which essentially captures the whole set of feed forward networks with two target genes that we have considered here. In the language of Fig. 1(b), much of the discussion about the importance of this motif has centered on the nature of the transformation from the input concentration c to the expression level of gene 2. By choosing coherent (in our notation, A1A2−A12) or incoherent (A1A2−R12) loops, the system can achieve different temporal dynamics as well as nonmonotonic input/output relations [59–64]. Since we have considered systems in steady state, it is this last point which is significant for our discussion. Our results comparing Hill and


FIG. 9. Optimal parameters for Hill regulatory functions in the globally optimal topologies of two (A1A2−R12, gray lines) and three (A1A2A3−R12R13R23, black lines) gene networks compared to the parameters of the optimal noninteracting networks (A1A2, A1A2A3). Parameters are presented as a function of the maximum concentration of available input proteins cmax/c0.


MWC models of regulation emphasize, however, that it is not just the topology of the network which is important for the structure of these input/output relations.

One of the best opportunities for quantitative comparison with experiment is in the genetic networks controlling the early events of embryonic development. In these systems, the primary morphogen molecules are sometimes transcription factors, so that we can literally see the variations in input concentration laid out in space [18–20]. Along the anterior-posterior axis of the Drosophila embryo, for example, the maternally provided morphogen Bicoid varies in concentration almost exponentially, so that the long axis of the embryo is essentially a logarithmic concentration axis [65,66]. Bicoid is a transcription factor that provides input to the network of gap genes, hunchback, krüppel, giant, and knirps. Although Bicoid is an activator, these genes exhibit nonmonotonic expression levels, forming stripes along the length of the embryo, and these nonmonotonicities are mediated by mutual inhibitory interactions [67–71]. Qualitatively, the structure of the optimal networks that we derive here (especially the globally optimal A1A2A3−R12R13R23) resembles the gap gene network, and the expression profiles along the log c axis are reminiscent of the spatial profiles of gap expression. Generalizations of the experiments in Ref. [14] should make it possible to map the input/output relations more quantitatively, and perhaps even to measure the interactions directly by detecting the predicted correlations among the noise in the expression levels of different gap genes. To make truly quantitative comparisons, however, we need to solve our optimization problem allowing for networks with feedback.

ACKNOWLEDGMENTS

We thank J. Dubuis, T. Gregor, W. de Ronde, T. Mora, E. F. Wieschaus, and especially C. G. Callan for helpful discussions. Work at Princeton was supported in part by NSF under Grant No. PHY–0650617, and by NIH under Grant Nos. P50 GM071508 and R01 GM077599. G.T. was supported in part by NSF under Grant Nos. DMR04–25780 and IBN–0344678, and by the Vice Provost for Research at the University of Pennsylvania. W.B. also thanks his colleagues at the Center for Studies in Physics and Biology at Rockefeller University for their hospitality during a portion of this work.

APPENDIX

This appendix collects some technical details in the calculation of the inverse covariance matrix K, following the framework established in Sec. II D. We start by considering the case of one input transcription factor at concentration c controlling the expression of two output genes, with normalized expression levels g1 and g2. In this case the Langevin equations of motion describing the network are

\[
\tau \frac{dg_1}{dt} = f_1(c) - g_1 + \xi_1(t),
\tag{A1}
\]
\[
\tau \frac{dg_2}{dt} = f_2(c,g_1) - g_2 + \xi_2(t).
\tag{A2}
\]

We linearize the equations of motion around the steady state solutions (ḡ₁, ḡ₂) and Fourier transform to obtain Eq. (46), where the matrix A is explicitly given by

\[
\mathbf{A}(\omega) = \begin{pmatrix} 1-i\omega\tau & 0 \\ -\Lambda_{21} & 1-i\omega\tau \end{pmatrix},
\tag{A3}
\]

and hence

\[
\mathbf{A}^{-1}(\omega) = \frac{1}{(1-i\omega\tau)^2} \begin{pmatrix} 1-i\omega\tau & 0 \\ \Lambda_{21} & 1-i\omega\tau \end{pmatrix}.
\tag{A4}
\]

The covariance matrix can then be found from Eq. (52):

\[
\mathbf{K}^{-1} = \int \frac{d\omega}{2\pi}\, \mathbf{A}^{-1}(\omega)\,\mathbf{N}\,(\mathbf{A}^{-1}(\omega))^\dagger
= \int \frac{d\omega}{2\pi}\, \frac{1}{(1-i\omega\tau)^2}
\begin{pmatrix} 1-i\omega\tau & 0 \\ \Lambda_{21} & 1-i\omega\tau \end{pmatrix}
\begin{pmatrix} N_{11} & 0 \\ 0 & N_{22} \end{pmatrix}
\frac{1}{(1+i\omega\tau)^2}
\begin{pmatrix} 1+i\omega\tau & \Lambda_{21} \\ 0 & 1+i\omega\tau \end{pmatrix},
\tag{A5}
\]
\[
= \int \frac{d\omega}{2\pi}\, \frac{1}{[1+(\omega\tau)^2]^2}
\begin{pmatrix} N_{11}[1+(\omega\tau)^2] & \Lambda_{21} N_{11}(1-i\omega\tau) \\ \Lambda_{21} N_{11}(1+i\omega\tau) & N_{22}[1+(\omega\tau)^2] + \Lambda_{21}^2 N_{11} \end{pmatrix},
\tag{A6}
\]
\[
= \int \frac{d\omega}{2\pi}\, \frac{1}{[1+(\omega\tau)^2]^2}
\begin{pmatrix} N_{11}[1+(\omega\tau)^2] & \Lambda_{21} N_{11} \\ \Lambda_{21} N_{11} & N_{22}[1+(\omega\tau)^2] + \Lambda_{21}^2 N_{11} \end{pmatrix}.
\tag{A7}
\]

The elements of Nij are given explicitly by

\[
N_{11} = \frac{\tau}{N_g}\left[ \bar{g}_1(c) + \left(\frac{\partial \bar{g}_1(c)}{\partial c}\right)^2 c \right],
\tag{A8}
\]
\[
N_{22} = \frac{\tau}{N_g}\left[ \bar{g}_2(c) + \left(\frac{\partial f_2(c,g_1)}{\partial c}\right)^2 c + \frac{\bar{g}_1}{c_{\rm max}}\,\Lambda_{21}^2 \right],
\tag{A9}
\]

where we have chosen the maximum concentrations for all types of proteins to be equal to cmax, and we have set c0 = 1,


as described in the text. To finish the calculation, we need the integrals

\[
\int \frac{d\omega}{2\pi}\, \frac{1}{1+(\omega\tau)^2} = \frac{1}{2\tau},
\tag{A10}
\]
\[
\int \frac{d\omega}{2\pi}\, \frac{1}{[1+(\omega\tau)^2]^2} = \frac{1}{4\tau}.
\tag{A11}
\]

Then

\[
\mathbf{K}^{-1} = \frac{1}{2\tau} \begin{pmatrix} N_{11} & \Lambda_{21} N_{11}/2 \\ \Lambda_{21} N_{11}/2 & N_{22} + \Lambda_{21}^2 N_{11}/2 \end{pmatrix},
\tag{A12}
\]
\[
\mathbf{K} = \frac{2\tau}{N_{11}\left(N_{22} + \Lambda_{21}^2 N_{11}/2\right) - N_{11}^2 \Lambda_{21}^2/4}
\begin{pmatrix} N_{22} + \Lambda_{21}^2 N_{11}/2 & -\Lambda_{21} N_{11}/2 \\ -\Lambda_{21} N_{11}/2 & N_{11} \end{pmatrix}.
\tag{A13}
\]
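The algebra leading to Eq. (A12) is easy to check numerically. The sketch below (an added illustration; the values of τ, Λ21, N11, and N22 are arbitrary) builds A⁻¹(ω) for the two-gene chain, integrates A⁻¹N(A⁻¹)† over frequency by quadrature, and compares the result with the closed form of Eq. (A12).

```python
import math

def spectral_matrix(omega, tau, lam21, N11, N22):
    """S(omega) = A^{-1} N A^{-1 dagger}, Eq. (51), for the two-gene
    feed-forward chain with A(omega) from Eq. (A3)."""
    a = 1.0 - 1j * omega * tau
    Ainv = [[1.0 / a, 0.0], [lam21 / (a * a), 1.0 / a]]  # Eq. (A4)
    N = [N11, N22]
    return [[sum(Ainv[i][k] * N[k] * Ainv[j][k].conjugate() for k in range(2))
             for j in range(2)] for i in range(2)]

def Kinv_numeric(tau=1.0, lam21=0.7, N11=0.2, N22=0.3, steps=40000):
    """K^{-1}_{ij} = \\int d(omega)/(2*pi) S_{ij}(omega), Eq. (52),
    integrated by midpoint rule after substituting omega = tan(theta)/tau."""
    out = [[0.0, 0.0], [0.0, 0.0]]
    dtheta = math.pi / steps
    for m in range(steps):
        theta = -math.pi / 2 + (m + 0.5) * dtheta
        omega = math.tan(theta) / tau
        jac = dtheta / (tau * math.cos(theta) ** 2)  # d(omega)
        S = spectral_matrix(omega, tau, lam21, N11, N22)
        for i in range(2):
            for j in range(2):
                out[i][j] += S[i][j].real * jac / (2.0 * math.pi)
    return out

def Kinv_closed(tau=1.0, lam21=0.7, N11=0.2, N22=0.3):
    """Closed form of Eq. (A12)."""
    return [[N11 / (2 * tau), lam21 * N11 / (4 * tau)],
            [lam21 * N11 / (4 * tau), (N22 + lam21**2 * N11 / 2) / (2 * tau)]]
```

The numerical and analytical matrices agree to quadrature accuracy, and the off-diagonal elements come out symmetric, as they must for a covariance matrix.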

Lastly, we substitute the obtained covariance matrix into the expression for the normalization constant Z, as obtained in Eq. (38):

\[
Z = \int_0^{c_{\rm max}} dc \left[\frac{1}{2\pi e}\sum_{i,j=1}^{2} \frac{d\bar{g}_i(c)}{dc}\, K_{ij}(c)\, \frac{d\bar{g}_j(c)}{dc}\right]^{1/2}.
\tag{A14}
\]

Since the information we are trying to maximize is, from Eq. (39), I* = log₂ Z, we now can optimize Z over the parameters of the regulation function. It is now clear, having set the concentration variables in terms of the natural scale c0 = 1, that the only parameter in the problem is cmax; all other parameters will be determined by maximizing Z.

In the case of a three gene network, the calculation is analogous to the two gene case. The matrix A is explicitly given by

\[
\mathbf{A}(\omega) = \begin{pmatrix} 1-i\omega\tau & 0 & 0 \\ -\Lambda_{21} & 1-i\omega\tau & 0 \\ -\Lambda_{31} & -\Lambda_{32} & 1-i\omega\tau \end{pmatrix},
\tag{A15}
\]

and the elements of the noise spectrum matrix are

\[
N_{11} = \frac{1}{N_g}\left[ \left(\frac{\partial \bar{g}_1(c)}{\partial c}\right)^2 c + \bar{g}_1 \right],
\tag{A16}
\]
\[
N_{22} = \frac{1}{N_g}\left[ \left(\left.\frac{\partial f_2(c,g_1)}{\partial c}\right|_{g_1=\bar{g}_1(c)}\right)^2 c + \Lambda_{21}^2\,\frac{\bar{g}_1}{c_{\rm max}} + \bar{g}_2 \right],
\tag{A17}
\]
\[
N_{33} = \frac{1}{N_g}\left[ \left(\left.\frac{\partial f_3(c,g_1,g_2)}{\partial c}\right|_{\{g_i=\bar{g}_i(c)\}}\right)^2 c + \Lambda_{31}^2\,\frac{\bar{g}_1}{c_{\rm max}} + \Lambda_{32}^2\,\frac{\bar{g}_2}{c_{\rm max}} + \bar{g}_3 \right].
\tag{A18}
\]

Following Eq. (52) to find the inverse covariance integral, analogously to the two gene case, requires evaluating the frequency integrals. The general form of these integrals is

I_{ks} = \int \frac{d\omega}{2\pi}\, \frac{1}{(1+i\omega\bar{\tau})^k (1-i\omega\bar{\tau})^s} = \frac{1}{\bar{\tau}} \int \frac{d\bar{\omega}}{2\pi}\, \frac{1}{(1+i\bar{\omega})^k (1-i\bar{\omega})^s} = \frac{1}{\bar{\tau}} \binom{s+k-2}{k-1}\, 2^{1-s-k}.   (A19)
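The closed form in Eq. (A19) can be spot-checked against direct quadrature; the short sketch below (not part of the paper; the time scale is set to an illustrative value) does so for several (k, s) pairs.

```python
# Numerical spot-check of the closed form in Eq. (A19).
import numpy as np
from scipy.integrate import quad
from math import comb

tau = 1.0  # illustrative time scale

def I_numeric(k, s):
    # direct quadrature; the imaginary part is odd in w and integrates to zero
    def f(w):
        return (1.0 / ((1 + 1j * w * tau) ** k * (1 - 1j * w * tau) ** s)).real
    return quad(f, -np.inf, np.inf)[0] / (2 * np.pi)

def I_closed(k, s):
    # Eq. (A19): (1/tau) * C(s+k-2, k-1) * 2^(1-s-k)
    return comb(s + k - 2, k - 1) * 2.0 ** (1 - s - k) / tau

for k, s in [(1, 1), (2, 1), (2, 2), (3, 2), (3, 3)]:
    assert np.isclose(I_numeric(k, s), I_closed(k, s))
```

The cases (k, s) = (1, 1) and (2, 2) reproduce Eqs. (A10) and (A11), respectively.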

When the dust settles, the elements of the inverse of the covariance matrix, K^{-1} from Eq. (52), are

K^{-1}_{11} = N_{11},   (A20)

K^{-1}_{12} = \frac{1}{2} N_{11} \bar{\gamma}_{21},   (A21)

K^{-1}_{13} = \frac{1}{4} N_{11} \left( \bar{\gamma}_{21}\bar{\gamma}_{32} + 2\bar{\gamma}_{31} \right),   (A22)

K^{-1}_{21} = \frac{1}{2} N_{11} \bar{\gamma}_{21},   (A23)

K^{-1}_{22} = N_{22} + \frac{1}{2} N_{11} \bar{\gamma}_{21}^2,   (A24)

K^{-1}_{23} = \frac{1}{2} N_{11} \left( \bar{\gamma}_{21}^2 \bar{\gamma}_{32} + \bar{\gamma}_{31}\bar{\gamma}_{21} \right) + \frac{1}{2} N_{22} \bar{\gamma}_{32},   (A25)

K^{-1}_{31} = \frac{1}{4} N_{11} \left( \bar{\gamma}_{21}\bar{\gamma}_{32} + 2\bar{\gamma}_{31} \right),   (A26)

K^{-1}_{32} = \frac{1}{2} N_{11} \left( \bar{\gamma}_{21}^2 \bar{\gamma}_{32} + \bar{\gamma}_{21}\bar{\gamma}_{31} \right) + \frac{1}{2} N_{22} \bar{\gamma}_{32},   (A27)

K^{-1}_{33} = N_{33} + \frac{1}{2} \left( N_{22} \bar{\gamma}_{32}^2 + N_{11} \bar{\gamma}_{31}^2 \right) + N_{11} \bar{\gamma}_{21}\bar{\gamma}_{32}\bar{\gamma}_{31} + \frac{3}{8} N_{11} \left( \bar{\gamma}_{21}\bar{\gamma}_{32} \right)^2.   (A28)
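Independently of the algebra, the three-gene inverse covariance can be obtained by brute-force numerical integration of K^{-1} = \int (d\omega/2\pi) A^{-1} N (A^{-1})^\dagger built from Eq. (A15). The sketch below (the interaction strengths and noise densities are illustrative values, not from the paper) checks the structural properties the result must have: it is real, symmetric, and positive definite, with its (1,1) element fixed by Eq. (A10).

```python
# Brute-force evaluation of the three-gene inverse covariance from Eq. (A15);
# g21, g31, g32 and the noise matrix N are illustrative assumptions.
import numpy as np
from scipy.integrate import quad

tau = 1.0
g21, g31, g32 = 0.4, 0.3, 0.6      # assumed interaction strengths
N = np.diag([0.5, 0.8, 1.1])       # assumed noise spectral densities

def A(w):
    a = 1 - 1j * w * tau
    return np.array([[a, 0, 0],
                     [-g21, a, 0],
                     [-g31, -g32, a]])

def integrand(w, i, j):
    # real part only: the imaginary part is odd in w and integrates to zero
    Ainv = np.linalg.inv(A(w))
    return (Ainv @ N @ Ainv.conj().T)[i, j].real / (2 * np.pi)

Kinv = np.array([[quad(integrand, -np.inf, np.inf, args=(i, j))[0]
                  for j in range(3)] for i in range(3)])

assert np.allclose(Kinv, Kinv.T)              # symmetric
assert np.all(np.linalg.eigvalsh(Kinv) > 0)   # positive definite
assert np.isclose(Kinv[0, 0], N[0, 0] / (2 * tau))  # Eq. (A10) for element (1,1)
```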

WALCZAK, TKAČIK, AND BIALEK   PHYSICAL REVIEW E 81, 041905 (2010)

041905-14

[1] B. Alberts, A. Johnson, J. Lewis, M. Raff, K. Roberts, and P. Walter, Molecular Biology of the Cell (Garland Science, New York, 2002).
[2] M. Ptashne and A. Gann, Genes and Signals (Cold Spring Harbor Press, New York, 2002).
[3] R. Wagner, Transcription Regulation in Prokaryotes (Oxford University Press, Oxford, 2000).
[4] W. Bialek and S. Setayeshgar, Proc. Natl. Acad. Sci. U.S.A. 102, 10040 (2005).
[5] W. Bialek and S. Setayeshgar, Phys. Rev. Lett. 100, 258101 (2008).
[6] G. Tkačik, C. G. Callan, Jr., and W. Bialek, Proc. Natl. Acad. Sci. U.S.A. 105, 12265 (2008).
[7] G. Tkačik, C. G. Callan, Jr., and W. Bialek, Phys. Rev. E 78, 011910 (2008).
[8] M. B. Elowitz, A. J. Levine, E. D. Siggia, and P. S. Swain, Science 297, 1183 (2002).
[9] E. Ozbudak, M. Thattai, I. Kurtser, A. D. Grossman, and A. van Oudenaarden, Nat. Genet. 31, 69 (2002).
[10] W. J. Blake, M. Kaern, C. R. Cantor, and J. J. Collins, Nature (London) 422, 633 (2003).
[11] J. M. Raser and E. K. O'Shea, Science 304, 1811 (2004).
[12] N. Rosenfeld, J. W. Young, U. Alon, P. S. Swain, and M. B. Elowitz, Science 307, 1962 (2005).
[13] J. M. Pedraza and A. van Oudenaarden, Science 307, 1965 (2005).
[14] T. Gregor, D. W. Tank, E. F. Wieschaus, and W. Bialek, Cell 130, 153 (2007).
[15] G. Tkačik, T. Gregor, and W. Bialek, PLoS ONE 3, e2774 (2008).
[16] G. Tkačik, A. M. Walczak, and W. Bialek, Phys. Rev. E 80, 031920 (2009).
[17] L. Wolpert, J. Theor. Biol. 25, 1 (1969).
[18] P. A. Lawrence, The Making of a Fly: The Genetics of Animal Design (Blackwell, Oxford, 1992).
[19] J. B. Gurdon and P. Y. Bourillot, Nature (London) 413, 797 (2001).
[20] C. Nüsslein-Volhard, Coming to Life: How Genes Drive Development (Kales, Carlsbad, 2008).
[21] A comparable problem exists in the case of neurons. When multiple neurons respond independently to sensory inputs, the population of cells necessarily carries less information than would be expected by summing the information carried by the individual cells. This redundancy among conditionally independent events is quite general, and can be seen also for successive action potentials generated by the same neuron over time. See, for example, N. Brenner, S. P. Strong, R. Koberle, W. Bialek, and R. R. de Ruyter van Steveninck, Neural Comput. 12, 1531 (2000).

[22] There are examples of transcription factors that act on one target as an activator and on another target as a repressor, but it is unusual that this happens with the factor acting alone. In eukaryotes, where transcription factors often form complexes, an activator without its partner can function as a repressor (Ref. [23]).
[23] A. H. Brivanlou and J. E. Darnell, Jr., Science 295, 813 (2002).
[24] E. Ziv, I. Nemenman, and C. H. Wiggins, PLoS ONE 2, e1077 (2007).
[25] A. Mugler, E. Ziv, I. Nemenman, and C. H. Wiggins, IET Syst. Biol. 3, 379 (2009).
[26] R. C. Yu, C. G. Pesce, A. Colman-Lerner, L. Lok, D. Pincus, E. Serra, M. Holl, K. Benjamin, A. Gordon, and R. Brent, Nature (London) 456, 755 (2008).
[27] F. Tostevin and P. R. ten Wolde, Phys. Rev. Lett. 102, 218101 (2009).
[28] P. Mehta, S. Goyal, T. Long, B. L. Bassler, and N. S. Wingreen, Mol. Syst. Biol. 5, 325 (2009).

[29] In bacteria, it often is assumed that the dominant term in the decay of protein concentrations is growth and division of the cells, which would make all lifetimes equal. More generally, we consider all transcription factors to operate under similar constraints, so that all of the network structure comes out of the optimization problem, rather than being imposed by hand.
[30] N. G. van Kampen, Stochastic Processes in Physics and Chemistry (North Holland, Amsterdam, 2007).
[31] M. Thattai and A. van Oudenaarden, Proc. Natl. Acad. Sci. U.S.A. 98, 8614 (2001).
[32] J. Hasty, F. Isaacs, M. Dolnik, and D. McMillen, Chaos 11, 207 (2001).
[33] J. Hasty, D. McMillen, F. Isaacs, and J. Collins, Nat. Rev. Genet. 2, 268 (2001).
[34] J. Paulsson, Nature (London) 427, 415 (2004).
[35] S. Iyer-Biswas, F. Hayot, and C. Jayaprakash, Phys. Rev. E 79, 031911 (2009).
[36] P. Mehta, R. Mukhopadhyay, and N. S. Wingreen, Phys. Biol. 5, 026005 (2008).
[37] A. Raj, C. S. Peskin, D. Tranchina, D. Y. Vargas, and S. Tyagi, PLoS Biol. 4, e309 (2006).
[38] I. Golding, J. Paulsson, S. M. Zawilski, and E. C. Cox, Cell 123, 1025 (2005).
[39] P. J. Choi, L. Cai, K. Frieda, and X. S. Xie, Science 322, 442 (2008).
[40] H. C. Berg and E. M. Purcell, Biophys. J. 20, 193 (1977).
[41] G. Tkačik and W. Bialek, Phys. Rev. E 79, 051901 (2009).
[42] We note that, in eukaryotes, the relevant volume for sensing the concentration of transcription factor proteins is the nucleus; however, proteins are produced in the cytoplasm and then imported into the nucleus. Allowing for this partitioning, the volume that enters the noise estimates becomes an effective volume, not the literal volume of the cell or nucleus.

[43] L. Bintu, N. E. Buchler, H. Garcia, U. Gerland, T. Hwa, J. Kondev, and R. Phillips, Curr. Opin. Genet. Dev. 15, 116 (2005).
[44] L. Bintu, N. E. Buchler, H. Garcia, U. Gerland, T. Hwa, J. Kondev, T. Kuhlman, and R. Phillips, Curr. Opin. Genet. Dev. 15, 125 (2005).
[45] T. Kuhlman, Z. Zhang, M. H. Saier, Jr., and T. Hwa, Proc. Natl. Acad. Sci. U.S.A. 104, 6043 (2007).
[46] J. B. Kinney, G. Tkačik, and C. G. Callan, Jr., Proc. Natl. Acad. Sci. U.S.A. 104, 501 (2007).
[47] A. V. Hill, J. Physiol. 40, iv (1910).
[48] J. Monod, J. Wyman, and J. P. Changeux, J. Mol. Biol. 12, 88 (1965).
[49] We have also studied the general model, with both Kon and Koff parameters. The optimum information transmission always occurs at L >> 1, and the details of the optimal solution reproduce the behavior in the simpler limit considered here.
[50] C. E. Shannon, Bell Syst. Tech. J. 27, 379 (1948); reprinted in C. E. Shannon and W. Weaver, The Mathematical Theory of Communication (University of Illinois Press, Urbana, 1949).
[51] T. M. Cover and J. A. Thomas, Elements of Information Theory (Wiley, New York, 1991).
[52] F. Attneave, Psychol. Rev. 61, 183 (1954).
[53] H. B. Barlow, in Sensory Mechanisms, the Reduction of Redundancy, and Intelligence, Proceedings of the Symposium on the Mechanization of Thought Processes, edited by D. V. Blake and A. M. Uttley (HM Stationery Office, London, 1959), Vol. 2, pp. 537-574.

[54] H. B. Barlow, in Sensory Communication, edited by W. Rosenblith (MIT Press, Cambridge, 1961), pp. 217-234.
[55] J. J. Atick and A. N. Redlich, Neural Comput. 2, 308 (1990).
[56] J. H. van Hateren, Nature (London) 360, 68 (1992).
[57] S. S. Shen-Orr, R. Milo, S. Mangan, and U. Alon, Nat. Genet. 31, 64 (2002).
[58] U. Alon, Nat. Rev. Genet. 8, 450 (2007).
[59] S. Mangan, A. Zaslaver, and U. Alon, J. Mol. Biol. 334, 197 (2003).
[60] S. Mangan and U. Alon, Proc. Natl. Acad. Sci. U.S.A. 100, 11980 (2003).

[61] S. Ishihara, K. Fujimoto, and T. Shibata, Genes Cells 10, 1025 (2005).
[62] M. E. Wall, M. J. Dunlop, and W. S. Hlavacek, J. Mol. Biol. 349, 501 (2005).
[63] S. Kalir, S. Mangan, and U. Alon, Mol. Syst. Biol. 1, 2005.0006 (2005).
[64] S. Kaplan, A. Bren, E. Dekel, and U. Alon, Mol. Syst. Biol. 4, 203 (2008).
[65] B. Houchmandzadeh, E. Wieschaus, and S. Leibler, Nature (London) 415, 798 (2002).
[66] T. Gregor, E. F. Wieschaus, A. P. McGregor, W. Bialek, and D. W. Tank, Cell 130, 141 (2007).
[67] H. Jäckle, D. Tautz, R. Schuh, E. Seifert, and R. Lehmann, Nature (London) 324, 668 (1986).
[68] R. Rivera-Pomar and H. Jäckle, Trends Genet. 12, 478 (1996).
[69] L. Sanchez and D. Thieffry, J. Theor. Biol. 211, 115 (2001).
[70] M. Levine and E. H. Davidson, Proc. Natl. Acad. Sci. U.S.A. 102, 4936 (2005).
[71] D. Yu and S. Small, Curr. Biol. 18, 868 (2008).
