
Multivariate Analytic Combinatorics for Cost Constrained Channels and Subsequence Enumeration

Andreas Lenz∗ Stephen Melczer† Cyrus Rashtchian‡ Paul H. Siegel§

Abstract

Analytic combinatorics in several variables is a powerful tool for deriving the asymptotic behavior of combinatorial quantities by analyzing multivariate generating functions. We study information-theoretic questions about sequences in a discrete noiseless channel under cost and forbidden substring constraints. Our main contributions involve the relationship between the graph structure of the channel and the singularities of the bivariate generating function whose coefficients are the number of sequences satisfying the constraints. We combine these new results with methods from multivariate analytic combinatorics to solve questions in many application areas. For example, we determine the optimal coded synthesis rate for DNA data storage when the synthesis supersequence is any periodic string. This follows from a precise characterization of the number of subsequences of an arbitrary periodic string. Along the way, we provide a new proof of the equivalence of the combinatorial and probabilistic definitions of the cost-constrained capacity, and we show that the cost-constrained channel capacity is determined by a cost-dependent singularity, generalizing Shannon’s classical result for unconstrained capacity.

∗Institute for Communications Engineering, Technical University of Munich, Germany, [email protected]. AL has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 801434).

†Department of Combinatorics & Optimization, University of Waterloo, 200 University Avenue West, Waterloo, ON N2L 3G1, Canada, [email protected]. SM was supported by an NSERC Discovery Grant.

‡Google Research, [email protected]. Work partially completed while at UCSD.

§Department of Electrical and Computer Engineering, University of California, San Diego, [email protected]

arXiv:2111.06105v2 [cs.IT] 14 Nov 2021


Contents

1 Introduction
  1.1 Contributions
  1.2 Related Work
2 Preliminaries
  2.1 Labeled Graphs and Matrices
  2.2 Costly Constrained Channels and Channel Capacity
  2.3 Generating Functions
3 Our Results
4 Technical Overview
  4.1 Analytical Combinatorics in Several Variables
  4.2 On the Way to ACSV: Cost-Diversity and Matrix Analysis
  4.3 Connecting ACSV and Cost-Diverse Graphs via Spectral Analysis
5 Constrained Codes and ACSV
  5.1 Generating Function of the Follower Set Size
  5.2 ACSV Theorems
  5.3 Cost-Diverse Graphs and Equivalent Conditions
  5.4 Singularity and Critical Point Analysis
  5.5 Proof of Theorem 3.4
  5.6 Proof of the Other Theorems
6 Other Applications: DNA Synthesis and Subsequence Counting
  6.1 Synthesis and Subsequence Graphs
  6.2 Synthesis Information Rates
  6.3 Constrained Synthesis
7 Conclusion
  7.1 Open Questions
8 Acknowledgement
A Lemmas on Matrices
B Concavity and Maximality of Fixed-Length Capacity
C Variation of the Chinese Remainder Theorem
D Combinatorial Characterization of $\alpha_G^{\mathrm{lo}}$ and $\alpha_G^{\mathrm{up}}$
E Examples and Illustrations of our Methods
  E.1 A Geometric Visualization of Contributing Singularities

σ        τ(σ)
DOT      2
DASH     4
LETTER   3
WORD     6

[Figure 1 image: excerpt from [Sha48] containing the telegraph difference equation N(t) = N(t−2) + N(t−4) + N(t−5) + N(t−7) + N(t−8) + N(t−10), the resulting capacity C = 0.539, Theorem 1 on the determinantal equation, and the two-state directed graph (Shannon’s Fig. 2) with edges labeled DOT, DASH, LETTER SPACE, and WORD SPACE.]

\[ P(x) = \begin{bmatrix} 0 & x^2 + x^4 \\ x^3 + x^6 & x^2 + x^4 \end{bmatrix} \]

Figure 1: Shannon’s telegraph channel: cost function, directed graph, and cost-enumerator matrix.

1 Introduction

Since their introduction in Part I of Shannon’s landmark 1948 paper, A Mathematical Theory of Communication [Sha48], discrete noiseless channels have been a focus of research by information and coding theorists. They also have practical utilization in the design of transmission codes for digital communication systems and recording codes for data storage systems [MRS01]. To extend the possible applications, we study properties of discrete noiseless channels under an average cost constraint. Such constraints arise from system limitations on the average transmission power in an optical fiber [DRV10], the average recording voltage in a non-volatile memory [Liu20], or the average synthesis time per nucleotide in a DNA-based storage system [LLR+20].

Costly constrained channels. The iconic example of a discrete noiseless channel is Shannon’s telegraph channel, depicted in Figure 1. The channel allows the noiseless transmission of sequences of symbols drawn from the finite alphabet Σ = {DOT, DASH, LETTER SPACE, WORD SPACE}. The constraints on the allowed sequences of symbols are represented by a finite directed graph, with edge labels taken from the symbol alphabet. Each edge has an associated real-valued cost, in this case, the time duration τ(σ) of the edge label σ (here, meaning its equivalent number of bit transmissions), as shown in the table. The cost is assumed to be additive, so the cost of a sequence generated by a path in the graph is the sum of its edge costs. The directed graph structure and the edge costs can be summarized in a (one-step) cost-enumerator matrix, from which the costs associated with sequences/paths of a given length can be found by matrix multiplication.
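As a minimal sketch of the matrix-multiplication remark above, the following Python snippet stores each entry of the cost-enumerator matrix of Figure 1 as a map from cost to number of edges and squares the matrix. The state numbering (0: a space was sent last, 1: a dot or dash was sent last) and the names `EDGES`, `cost_enumerator`, and `poly_matmul` are illustrative assumptions, not notation from the paper.

```python
from collections import Counter

# Hypothetical encoding of the telegraph channel: (initial state, terminal state, label, cost).
# State 0: a space was sent last. State 1: a dot or dash was sent last.
EDGES = [
    (0, 1, "DOT", 2), (0, 1, "DASH", 4),
    (1, 1, "DOT", 2), (1, 1, "DASH", 4),
    (1, 0, "LETTER_SPACE", 3), (1, 0, "WORD_SPACE", 6),
]


def cost_enumerator(edges, num_states):
    """One-step matrix P(x); entry [i][j] is a Counter {cost: number of edges i -> j}."""
    P = [[Counter() for _ in range(num_states)] for _ in range(num_states)]
    for init, term, _label, cost in edges:
        P[init][term][cost] += 1
    return P


def poly_matmul(A, B):
    """Multiply two square matrices whose entries are polynomials stored as Counters."""
    n = len(A)
    C = [[Counter() for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for k in range(n):
            for j in range(n):
                for cost_a, num_a in A[i][k].items():
                    for cost_b, num_b in B[k][j].items():
                        # x^cost_a * x^cost_b = x^(cost_a + cost_b)
                        C[i][j][cost_a + cost_b] += num_a * num_b
    return C


P = cost_enumerator(EDGES, 2)
P2 = poly_matmul(P, P)      # enumerates all paths of length 2 by cost
print(dict(P2[0][1]))
```

Here `P2[0][1]` encodes the polynomial x^4 + 2x^6 + x^8: among the two-edge paths from state 0 to state 1, one has cost 4, two have cost 6, and one has cost 8.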

Discrete noiseless channels in which all edges have unit cost are often referred to as constrained systems and are well studied [MRS01]. Our interest is in the more general setting of varying edge costs, which we emphasize by use of the term costly constrained channel.

Cost-constrained capacity. Shannon introduced the concept of capacity, a fundamental combinatorial quantity associated with information-theoretic aspects of a costly constrained channel. The capacity C is the asymptotic growth rate of the number of sequences as a function of the sequence cost. It reflects the maximum rate, per unit cost, at which information can be transmitted on the channel. Under the assumption of integer edge costs, he analyzed a system of difference equations and derived the now-classical result on the capacity, namely that the capacity is given as the logarithm of the largest root of a determinantal equation associated with the channel.

Khandekar, McEliece, and Rodemich [KMR00] and Böcherer, Mathar, da Rocha, and Pimentel [BJP10] extend Shannon’s result to non-integer symbol costs under a mild assumption about the density of sequence costs. They also interpret the capacity in terms of a singularity of the univariate generating function $F(x)$ of the integer series $N(t)$ that comprises the number of sequences with cost at most t. In this context, a nod to analytic combinatorics in a single variable [FS09] was given in [KMR00], and we build on this direction. We consider an average cost constraint W, and we define the cost-constrained capacity or capacity-cost function as the asymptotic growth rate per unit cost of the number $N(Wn, n)$ of sequences of length n that have cost at most Wn. Equivalently, it is the growth rate of the number $N(t, \lfloor \alpha t \rfloor)$ of sequences of length $\lfloor \alpha t \rfloor$ that have cost at most t. We analyze a bivariate generating function associated with $N(t, \lfloor \alpha t \rfloor)$.

1.1 Contributions

Our results stem from an integration of contributions within and across three disparate areas, including (a) new results in the spectral theory pertaining to eigenvalues and eigenvectors of graphs and matrices associated with discrete noiseless channels; (b) a geometric and functional analysis of singularities of the complex bivariate generating function that encodes cost and length properties of the channel sequences; and (c) combining these results with a novel application of methods from analytic combinatorics in several variables (ACSV) to precisely evaluate the asymptotic behavior of the diagonal coefficients of the bivariate generating function.

As a by-product of our main results, we solve two long-standing open problems in information theory. First, we show that the cost-constrained capacity (Definition 2.8) of the discrete noiseless channel can be expressed explicitly as a simple function of a specific two-dimensional singularity of the bivariate generating function, thereby generalizing Shannon’s classical formula for the channel capacity without cost constraint. This determines the exact asymptotics of $N(t, \lfloor \alpha t \rfloor)$ and fully characterizes the capacity-cost function. Second, the expression for the cost-constrained capacity provides a direct proof of the equivalence of the combinatorial definition of cost-constrained capacity to the probabilistic definition. This equivalence was first established indirectly in 2006 via converse inequalities [SS06]. Finally, we apply our analytical results to two related problems of independent interest: the subsequence counting problem for periodic sequences and the determination of information-theoretic limits on efficient DNA synthesis for DNA-based storage systems.

We next give a brief sketch of the technical underpinnings of our results.

Spectral analysis of channel graphs. Central to our analysis is a set of new results about a strongly-connected graph G. These results then illuminate properties of the spectral radius $\rho_G(x)$ of the cost-enumerator matrix $P_G(x)$. We highlight two graph properties that we use, namely, cost-diversity and cost-periodicity. A cost-diverse graph has at least one pair of equal-length paths with different costs that connect the same pair of vertices. In a cost-periodic graph, for each pair of vertices, the costs of all connecting paths of the same length are congruent modulo a fixed integer. We show that cost-periodicity is equivalent to a useful formulation of the edge cost function called a periodic coboundary condition. Then, we use these definitions to prove structural properties of the eigenvalues of $P_G(x)$ on the complex unit circle and log-convexity properties of the spectral radius. We also show that the complement of cost-diverse graphs, namely cost-uniform graphs, plays a role in our characterization of costly constrained channels.

Generating functions and singularity analysis for cost-diverse graphs. We define a generating function $F_G(x, y)$ whose coefficients encode information about the number of paths of given length and cost emanating from vertices of G. For a cost constraint W, we let $\alpha = W^{-1}$. To study the asymptotics of $N(t, \lfloor \alpha t \rfloor)$ with the methods of ACSV, we use the previously derived properties of the spectral radius of $P_G(x)$ to characterize the singularities of $F_G(x, y)$. We first determine the set of minimal singularities. For a cost-diverse graph, we further identify those singular points that are smooth, satisfy the critical equation for α, and are non-degenerate.

Asymptotic expansions via ACSV. The singularity analysis allows the application of a fundamental result in ACSV regarding asymptotic properties of coefficients of multivariate generating functions [Mel21, Cor. 5.2]. This leads to our main contribution (Theorem 3.4), which gives a complete characterization of the asymptotic behavior of $N_G(t, \lfloor \alpha t \rfloor)$ for cost-diverse graphs in two regions of α. These two regions provide a characterization of the capacity-cost function, $C_G(\alpha)$, in a concise form that elegantly generalizes Shannon’s formula (Theorem 3.1): one corresponds to a linear scaling of $C_G(\alpha)$, and the other to a non-linear, concave behavior. We identify the threshold value of α separating the two regions, as well as the value $\alpha^*$ in the concave region corresponding to the classical combinatorial capacity, $C_G(\alpha^*) = C_G$. The expression for $C_G(\alpha)$ maps directly to the known formula for the cost-constrained capacity $C_G(W)$ [MR83, JH84, KMR00, SS06]. From these results, we extract an expansion for the number of paths generated by an arbitrary strongly-connected graph and an exact representation of the number of cost-constrained paths (Theorem 3.6).

Applications. Subsequence enumeration is a problem that arises in many areas (bioinformatics, information theory, and coding theory). However, developing tight and explicit formulas is an open question in general. Current results are either unwieldy [MKB08] or only apply to special cases, such as the alternating sequence [HR00]. Our motivation comes from theoretical models of DNA synthesis in DNA-based storage systems, and specifically, for a parallel, array-based synthesis process.¹ In [LLR+20], the authors provide a connection between costly constrained channels and efficient DNA synthesis. They show that the capacity of a suitably defined channel characterizes the information rate. However, their analysis focuses on the binary alternating supersequence, without considering other supersequences or non-cost constraints. We extend these results in a few ways: (i) we analyze the synthesis of strands with constraints, such as balanced GC content or homopolymer runlength constraints, (ii) we consider arbitrary periodic synthesis supersequences, and (iii) we derive the capacity-cost function for alternating synthesis sequences over non-binary alphabets.

1.2 Related Work

Discrete Noiseless Channels. Since the publication of Shannon’s paper [Sha48], a number of works have addressed the problem of constructing efficient codes for costly constrained channels, focusing on minimizing the average cost per source symbol [Mar57, Kar61, Kra62, Csi69, Mel80, Gil95, GR98]. An optimal construction for memoryless channels was presented in [Var71].

Beginning in the late 1960’s, rapidly evolving magnetic and optical recording technologies inspired a boom in research on coding for constrained systems (discrete noiseless channels with equal edge costs). Theoretical bounds on encoder and decoder properties were studied in [Ash87, Ash88, AM95, AMR95, Ash96, AKS96, AMR96], efficient construction algorithms were presented in [FW64, Fra68, Fra69, Fra70, TB70, ACH83, AFKM86], and codes for specific storage applications, some of which became industry standards, were described in [Fra72, Pat75, AHM82, INOO85, Imm95]. The extensive bibliographies in [MRS01, Imm04] provide additional references.

A number of works, some motivated by new applications in magnetic recording, flash memory storage, wireless communication, and optical fiber transmission, have addressed the problem of coding for a costly constrained channel with an average cost constraint per channel symbol [KN96, IMU97, Sor05, MK06, JFLMK10, BM11, Bo12, KKYW14, BSS15, BSS19, LHBS20a].

¹A description of the biochemical synthesis process can be found in [MRRY21, KC14b, YKGR+15].

Coding for costly constrained channels is closely related to non-equiprobable signaling. Recognizing this connection, several papers have studied the relationship between costly channel codes, distribution matching codes, and random number generation [HU00, Uch01, Liu20].

Analytic Combinatorics. The use of univariate analytic combinatorics to study subsequences or patterns in strings is, by now, classical; see, for instance, [FS09, Sect. I. 4.]. More recently, the development of analytic combinatorics in several variables [PW13, Mel21] has given a framework to address such problems using multivariate generating functions (for instance, sporadic examples involving string enumeration can be found in [PW08]). For reasons that will become clear from our arguments, the multivariate generating functions we study are similar to those arising from the transfer matrix method and the enumeration of walks on graphs. In particular, we single out [BBBP11] for a detailed analysis of quantum random walks using multivariate generating functions whose denominators have the form $\det(I - z_d M)$, where M is a matrix with polynomial entries in $z_1, \ldots, z_{d-1}$. This type of denominator, typical in such walk problems, is similar to one of the factors in the denominator of our generating functions; however, the underlying nature of a (quantum) random walk puts assumptions on M that are different from what we consider. Furthermore, the results of [BBBP11] rely on a smoothness assumption related to the pole set of the generating function that, due to an extra factor in the denominator, is not satisfied for us. The loss of smoothness, among other complications, requires us to adapt advanced ACSV methods.

DNA Synthesis. There has been much recent work on storing digital data in DNA, inspiring theoreticians to tackle various modeling and analysis questions that arise [CNS19, EZ17, OAC+18, SH21, YGM17, YKGR+15, YKG+15]. DNA synthesis is an expensive aspect of DNA-based storage systems [Car13, KC14a, LWG+20], and researchers have studied theoretical models of the array-based synthesis process. For example, there is work on the synthesis cost of random strings when using an alternating sequence for synthesis [MRRY21], coding schemes [AVA+19, LKG+19, LLR+20, JFSB20], and connections to related problems [BPRS20, CNT+20, KMP+02, LSWY20, NL06, NL11, Rah06, TF20]. Other works have circumvented the artificial synthesis of DNA by encoding information through the modification of existing DNA [TWA+20].

2 Preliminaries

2.1 Labeled Graphs and Matrices

Consider a labeled, directed graph $G = (V, E, \sigma, \tau)$ that has vertices V and edges E. Each edge $e \in E$ has an initial vertex $\mathrm{init}(e) \in V$ and a terminal vertex $\mathrm{term}(e) \in V$. Further, the edges are labeled with $\sigma : E \to \Sigma$ and have weights $\tau : E \to \mathbb{N}$. A path $p = (e_1, \ldots, e_n)$ of length n is a sequence of edges $e_1, \ldots, e_n \in E$ such that for all $i \in \{1, \ldots, n-1\}$ the final vertex $\mathrm{term}(e_i)$ of the edge $e_i$ is the same as the initial vertex $\mathrm{init}(e_{i+1})$ of the edge $e_{i+1}$. The path starts at $\mathrm{init}(e_1)$ and ends at $\mathrm{term}(e_n)$. A path generates a word $\sigma(p) = (\sigma(e_1), \ldots, \sigma(e_n)) \in \Sigma^n$ and has cost $\tau(p) = \tau(e_1) + \cdots + \tau(e_n)$. Figure 2 shows a graph with edge labels and costs. A directed graph is strongly connected if there is a path connecting any two vertices. A key graph property is that all distinct paths emerging from a vertex generate distinct words. This is guaranteed by the following notion of a graph being deterministic, a.k.a. right-resolving [MRS01].

Definition 2.1. A labeled, directed graph $G = (V, E, \sigma, \tau)$ is deterministic if for all vertices $v \in V$ the labels of all edges $e \in E$ with the same initial vertex $\mathrm{init}(e) = v$ are distinct.
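The check in Definition 2.1 can be sketched in a few lines of Python. The edge-list encoding `(init, term, label, cost)` and the names `TELEGRAPH` and `is_deterministic` are assumptions made for illustration only.

```python
# Hypothetical encoding of Shannon's telegraph channel: (init, term, label, cost).
TELEGRAPH = [
    (0, 1, "DOT", 2), (0, 1, "DASH", 4),
    (1, 1, "DOT", 2), (1, 1, "DASH", 4),
    (1, 0, "LETTER_SPACE", 3), (1, 0, "WORD_SPACE", 6),
]


def is_deterministic(edges):
    """Definition 2.1: no two edges leaving the same vertex carry the same label."""
    seen = set()
    for init, _term, label, _cost in edges:
        if (init, label) in seen:
            return False
        seen.add((init, label))
    return True


print(is_deterministic(TELEGRAPH))
# Adding a second DOT edge out of state 1 violates the condition.
print(is_deterministic(TELEGRAPH + [(1, 0, "DOT", 3)]))
```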

We next introduce another graph property that underlies many of our theorems and forms the basis of a characterization of costly channels that we can analyze (Section 5.3 has examples).

Definition 2.2. A strongly connected graph $G = (V, E, \sigma, \tau)$ is cost-uniform if for each pair of vertices $v_i, v_j$ and each length m, the costs of all length-m paths p connecting $v_i$ and $v_j$ are the same. If G is not cost-uniform, then we say that G is cost-diverse.
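To make Definition 2.2 concrete, the sketch below searches for a witness of cost-diversity, that is, a pair of vertices and a length m with two length-m paths of different cost. The search is bounded by a hypothetical parameter `max_length`, so finding no witness does not by itself prove cost-uniformity. The edge-list encoding and all names are illustrative assumptions.

```python
TELEGRAPH = [
    (0, 1, "DOT", 2), (0, 1, "DASH", 4),
    (1, 1, "DOT", 2), (1, 1, "DASH", 4),
    (1, 0, "LETTER_SPACE", 3), (1, 0, "WORD_SPACE", 6),
]
# Hypothetical two-vertex cycle: exactly one path per start vertex and length.
TWO_CYCLE = [(0, 1, "A", 1), (1, 0, "C", 2)]


def path_costs(edges, num_states, length):
    """costs[i][j] = set of costs of all paths with exactly `length` edges from i to j."""
    costs = [[{0} if i == j else set() for j in range(num_states)]
             for i in range(num_states)]
    for _ in range(length):
        nxt = [[set() for _ in range(num_states)] for _ in range(num_states)]
        for i in range(num_states):
            for init, term, _label, cost in edges:
                for c in costs[i][init]:
                    nxt[i][term].add(c + cost)   # extend each path by one edge
        costs = nxt
    return costs


def diversity_witness(edges, num_states, max_length):
    """Return (m, i, j) if two length-m paths from i to j differ in cost, else None."""
    for m in range(1, max_length + 1):
        costs = path_costs(edges, num_states, m)
        for i in range(num_states):
            for j in range(num_states):
                if len(costs[i][j]) > 1:
                    return (m, i, j)
    return None


print(diversity_witness(TELEGRAPH, 2, 5))   # DOT and DASH already differ at length 1
print(diversity_witness(TWO_CYCLE, 2, 5))   # no witness found up to length 5
```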

Examples illustrating cost-uniformity and cost-diversity are displayed in Figure 3 in Section 5.3. For cost-diverse graphs, we further introduce the following notion of cost-periodicity.

Definition 2.3. Let $G = (V, E, \sigma, \tau)$ be a strongly connected graph. We say that G has cost-period $c \in \mathbb{N}$ if for each pair of vertices $v_i, v_j$ and each length m the costs $\tau(p)$ of all length-m paths p connecting $v_i$ and $v_j$ are congruent modulo c.

We further define the following common notion of the period of a graph.

Definition 2.4. Let $G = (V, E, \sigma, \tau)$ be a strongly connected graph. We say G has period d if for each pair of vertices $v_i, v_j$, the lengths of all paths p connecting $v_i$ and $v_j$ are congruent modulo d.
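For a strongly connected graph, the largest d satisfying Definition 2.4 is the greatest common divisor of all cycle lengths, a standard fact about directed graphs. A minimal sketch, assuming the hypothetical edge-list encoding `(init, term, label, cost)` and a graph that is strongly connected, computes it from breadth-first distances:

```python
from collections import deque
from math import gcd

TELEGRAPH = [
    (0, 1, "DOT", 2), (0, 1, "DASH", 4),
    (1, 1, "DOT", 2), (1, 1, "DASH", 4),
    (1, 0, "LETTER_SPACE", 3), (1, 0, "WORD_SPACE", 6),
]
TWO_CYCLE = [(0, 1, "A", 1), (1, 0, "C", 2)]   # hypothetical example graph


def largest_period(edges, num_states):
    """gcd of all cycle lengths; assumes the graph is strongly connected."""
    adjacency = [[] for _ in range(num_states)]
    for init, term, _label, _cost in edges:
        adjacency[init].append(term)
    # Breadth-first distances (in number of edges) from vertex 0.
    dist = [None] * num_states
    dist[0] = 0
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
    # Every edge u -> v contributes dist[u] + 1 - dist[v], a multiple of the period.
    d = 0
    for init, term, _label, _cost in edges:
        d = gcd(d, dist[init] + 1 - dist[term])
    return d


print(largest_period(TELEGRAPH, 2))   # the self-loops at state 1 force period 1
print(largest_period(TWO_CYCLE, 2))   # every cycle has even length
```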

With this definition, a graph can have several periods. In particular, all divisors of a period d are also periods of the graph. We proceed with a novel property that significantly facilitates our analysis and which we will prove to be equivalent to cost-diversity in Lemma 5.8 (Section 5.3).

Definition 2.5. A strongly connected graph $G = (V, E, \sigma, \tau)$ satisfies the c-periodic coboundary condition if there exists a function $B : V \to \mathbb{R}$ and a constant $b \in \mathbb{Q}$ such that if $e \in E$ is an edge from state $v_i$ to state $v_j$ then the edge cost satisfies

\[ \tau(e) \equiv b + B(v_j) - B(v_i) \pmod{c}. \]

We say that a graph satisfies the coboundary condition if the congruence above holds without the modulo operation.

For many of our results, we analyze the spectrum of an adjacency matrix associated with the graph. In fact, we consider a family of adjacency matrices $P_G(x)$ parameterized by a value $x \in \mathbb{R}$.

Definition 2.6. Let $G = (V, E, \sigma, \tau)$ be a strongly connected graph, where $v_1, \ldots, v_{|V|}$ is an arbitrary ordering of the vertices V. The adjacency matrix $P_G(x)$ of G is the $|V| \times |V|$ matrix with entries

\[ [P_G(x)]_{ij} = \sum_{e \in E:\ \mathrm{init}(e) = v_i,\ \mathrm{term}(e) = v_j} x^{\tau(e)}. \]

The spectral radius of $P_G(x)$ is denoted by $\rho_G(x)$ and defined as the largest absolute value of the eigenvalues of $P_G(x)$. For real $x > 0$, the spectral radius $\rho_G(x)$ is equal to the unique real eigenvalue of $P_G(x)$ with the largest magnitude, which we refer to as the Perron root.
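A minimal numerical sketch of Definition 2.6 follows. It builds $P_G(x)$ from a hypothetical edge list and approximates the Perron root by power iteration on a shifted matrix; the shift (the largest row sum) is a choice made here so that the iteration also converges for periodic graphs, and is not taken from the paper. All function names are illustrative.

```python
TELEGRAPH = [
    (0, 1, "DOT", 2), (0, 1, "DASH", 4),
    (1, 1, "DOT", 2), (1, 1, "DASH", 4),
    (1, 0, "LETTER_SPACE", 3), (1, 0, "WORD_SPACE", 6),
]


def adjacency_matrix(edges, num_states, x):
    """[P_G(x)]_ij = sum of x**cost over all edges from state i to state j."""
    P = [[0.0] * num_states for _ in range(num_states)]
    for init, term, _label, cost in edges:
        P[init][term] += x ** cost
    return P


def perron_root(P, max_iter=10000, tol=1e-14):
    """Spectral radius of an irreducible nonnegative matrix (x > 0 assumed)."""
    n = len(P)
    shift = max(sum(row) for row in P)   # shift * I + P has a positive diagonal
    v, lam = [1.0] * n, 0.0
    for _ in range(max_iter):
        w = [shift * v[i] + sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]
        new_lam = max(w)                 # eigenvalue estimate, since max(v) == 1
        v = [wi / new_lam for wi in w]
        converged = abs(new_lam - lam) <= tol * new_lam
        lam = new_lam
        if converged:
            break
    return lam - shift                   # undo the shift


# At x = 1 the telegraph matrix is [[0, 2], [2, 2]] with Perron root 1 + sqrt(5).
print(perron_root(adjacency_matrix(TELEGRAPH, 2, 1.0)))
```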

Later we will see that $\rho_G(x)$ plays a central role in the asymptotic behavior of $N_{G,v}(t, n)$.

[Figure 2, panel (a): exemplary directed, labeled, and costly graph, with edges A|1 and C|2. Panel (b): graph with period 2 and cost-period 3, with edges A|1, C|1, C|2, and A|3.]

Figure 2: Exemplary directed graphs describing costly constrained channels with Σ = {A, C}. The edges e are marked by their labels and costs σ(e)|τ(e).

2.2 Costly Constrained Channels and Channel Capacity

Such graphs G induce a costly language [Sha48, KMR00], which comprises all words that are generated by paths through the graph G of some limited cost. We thus use costly constrained channel to refer to a labeled, directed, weighted graph G that is deterministic and strongly connected. An important quantity is the number of distinct words that are contained in the language of a system.

Definition 2.7. Given $G = (V, E, \sigma, \tau)$, for an arbitrary vertex $v \in V$ we define $L_{G,v}(t)$ to be the cost-t follower set of v, i.e., the set of all words that are generated by some path of cost at most t that starts in v. The size of the cost-t follower set is denoted as $N_{G,v}(t) \triangleq |L_{G,v}(t)|$. Accordingly we define $L_{G,v}(t, n) \triangleq L_{G,v}(t) \cap \Sigma^n$ to be the fixed-length follower set with size $N_{G,v}(t, n) \triangleq |L_{G,v}(t, n)|$.
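Because distinct paths from a vertex of a deterministic graph generate distinct words, the sizes in Definition 2.7 can be obtained by counting paths. The following dynamic-programming sketch does this for small integer t; the edge-list encoding and function names are illustrative assumptions, and `follower_set_size` additionally assumes that every edge cost is at least 1 so that the sum over lengths is finite.

```python
from collections import Counter

TELEGRAPH = [
    (0, 1, "DOT", 2), (0, 1, "DASH", 4),
    (1, 1, "DOT", 2), (1, 1, "DASH", 4),
    (1, 0, "LETTER_SPACE", 3), (1, 0, "WORD_SPACE", 6),
]


def fixed_length_count(edges, start, t, n):
    """N_{G,v}(t, n): number of n-edge paths from `start` with total cost at most t."""
    # state[(vertex, cost)] = number of paths from `start` ending at `vertex` with that cost
    state = Counter({(start, 0): 1})
    for _ in range(n):
        nxt = Counter()
        for (u, c), count in state.items():
            for init, term, _label, cost in edges:
                if init == u and c + cost <= t:
                    nxt[(term, c + cost)] += count
        state = nxt
    return sum(state.values())


def follower_set_size(edges, start, t):
    """N_{G,v}(t): sum of N_{G,v}(t, n) over all lengths n >= 0 (empty word included)."""
    min_cost = min(cost for _init, _term, _label, cost in edges)
    return sum(fixed_length_count(edges, start, t, n) for n in range(t // min_cost + 1))


print(fixed_length_count(TELEGRAPH, 0, 6, 2))   # two-symbol words of cost at most 6
print(follower_set_size(TELEGRAPH, 0, 4))       # all words of cost at most 4
```

For instance, starting after a space (state 0), the two-symbol words of cost at most 6 are DOT DOT, DOT LETTER SPACE, DOT DASH, and DASH DOT.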

Now we can formally define the capacity of a costly constrained channel. The capacity of a strongly connected graph is independent of the starting vertex, and we omit this in the definition.

Definition 2.8. The variable-length capacity of a costly constrained channel G is defined to be

\[ C_G = \limsup_{t \to \infty} \frac{\log_2 N_{G,v}(t)}{t}. \]

Similarly, the fixed-length capacity is defined as

\[ C_G(\alpha) = \limsup_{t \to \infty} \frac{\log_2 N_{G,v}(t, \lfloor \alpha t \rfloor)}{t}. \]

Here we use the terms variable-length and fixed-length capacity to stress their defining nature. In the literature the two quantities are often referred to as (combinatorial) capacity and cost-constrained capacity. Shannon also introduced a natural probabilistic definition of capacity in terms of the entropy $H(\mathcal{P})$ of stationary Markov chains $\mathcal{P}$ on the channel graph. The Markov chain defines a probability distribution on the edges of the graph, and a corresponding average cost $T(\mathcal{P})$.

Definition 2.9. The probabilistic capacity of a costly constrained channel G is defined as

\[ C_{\mathrm{prob}} = \sup_{\mathcal{P}} \frac{H(\mathcal{P})}{T(\mathcal{P})}, \]

where the supremum is taken over all stationary Markov chains on G.

Shannon remarked that $C_{\mathrm{prob}} \le C_G$, and then explicitly identified a Markov chain that achieves a normalized entropy equal to $C_G$, thus proving the fundamental equivalence $C_G = C_{\mathrm{prob}}$.

As with combinatorial capacity, there is a natural generalization of the probabilistic capacity to the costly constrained setting in which the supremum is taken over Markov chains with average cost at most W.

Definition 2.10. The cost-constrained probabilistic capacity of a costly constrained channel G with average symbol cost constraint W is defined as

\[ C_{\mathrm{prob}}(W) = \sup_{\mathcal{P}:\ T(\mathcal{P}) \le W} \frac{H(\mathcal{P})}{T(\mathcal{P})}, \]

where the supremum is over stationary Markov chains on G with average cost no more than W.

A concise parametric characterization of $C_{\mathrm{prob}}(W)$, found by constrained optimization methods, is stated in [JH84, KMR00, MR83].

2.3 Generating Functions

The methods of analytic combinatorics derive asymptotic properties of a sequence from analytic properties of its generating function [FS09]; see Section 4.1 for more details. We consider bivariate sequences $N_{G,v}(t, n)$ and denote their generating functions by

\[ F_{G,v}(x, y) = \sum_{n \ge 0} \sum_{t \ge 0} N_{G,v}(t, n)\, x^t y^n, \]

for complex variables x and y. As the sequence $N_{G,v}(t, n)$ admits a linear recursion in the variables t and n, which we will elaborate on later, the generating function $F_{G,v}(x, y)$ is a ratio

\[ F_{G,v}(x, y) = \frac{Q_{G,v}(x, y)}{H_G(x, y)} \]

of polynomials $Q_{G,v}(x, y)$, $H_G(x, y)$. Since $N_{G,v}(t) = \sum_{n \ge 0} N_{G,v}(t, n)$, for the variable-length case we will regularly abbreviate $F_{G,v}(x) \triangleq F_{G,v}(x, 1)$ as the generating function of the integer series $N_{G,v}(t)$, with numerator $Q_{G,v}(x) \triangleq Q_{G,v}(x, 1)$ and denominator $H_G(x) \triangleq H_G(x, 1)$.

3 Our Results

Our core results comprise the exact asymptotics of the number of limited-cost followers inside a given costly graph. Intriguingly, our first theorem shows that the cost-constrained capacity can be expressed as a logarithmic function of a two-dimensional contributing singularity $(x_0, 1/\rho_G(x_0))$ of a bivariate generating function. This generalizes Shannon’s iconic result, which expressed the capacity of a constrained system by means of a one-dimensional minimal singularity.

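Theorem 3.1. Let G be a strongly connected, deterministic, and cost-diverse graph. Abbreviate $\alpha_G^{\mathrm{lo}} \triangleq \rho_G(1)/\rho_G'(1)$ and $\alpha_G^{\mathrm{up}} \triangleq \lim_{x \to 0^+} \rho_G(x)/(x \rho_G'(x))$. For all α with $0 \le \alpha \le \alpha_G^{\mathrm{lo}}$, we have

\[ C_G(\alpha) = \alpha \log_2 \rho_G(1). \]

For all α with $\alpha_G^{\mathrm{lo}} < \alpha < \alpha_G^{\mathrm{up}}$,

\[ C_G(\alpha) = -\log_2 x_0 + \alpha \log_2 \rho_G(x_0), \]

where $x_0$ is the unique real solution to $\alpha x \rho_G'(x) = \rho_G(x)$ in the interval $0 < x < 1$. For all $\alpha > \alpha_G^{\mathrm{up}}$, we have that $C_G(\alpha) = 0$.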
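The three cases of Theorem 3.1 can be evaluated numerically. The sketch below is one possible way to do so under several assumptions that are not from the paper: the Perron root is approximated by shifted power iteration, $\rho_G'(x)$ by a central finite difference, the critical equation is solved by bisection on $x \rho_G'(x)/\rho_G(x)$, and the limit $x \to 0^+$ defining $\alpha_G^{\mathrm{up}}$ is replaced by evaluation at a small hypothetical cutoff `x_min`. The edge-list encoding and all function names are illustrative.

```python
from math import log2

TELEGRAPH = [
    (0, 1, "DOT", 2), (0, 1, "DASH", 4),
    (1, 1, "DOT", 2), (1, 1, "DASH", 4),
    (1, 0, "LETTER_SPACE", 3), (1, 0, "WORD_SPACE", 6),
]


def perron_root(edges, num_states, x, max_iter=10000, tol=1e-14):
    """rho_G(x) for x > 0 via power iteration on shift * I + P_G(x)."""
    P = [[0.0] * num_states for _ in range(num_states)]
    for init, term, _label, cost in edges:
        P[init][term] += x ** cost
    shift = max(sum(row) for row in P)
    v, lam = [1.0] * num_states, 0.0
    for _ in range(max_iter):
        w = [shift * v[i] + sum(P[i][j] * v[j] for j in range(num_states))
             for i in range(num_states)]
        new_lam = max(w)
        v = [wi / new_lam for wi in w]
        converged = abs(new_lam - lam) <= tol * new_lam
        lam = new_lam
        if converged:
            break
    return lam - shift


def log_derivative(edges, num_states, x, rel_step=1e-6):
    """x * rho'(x) / rho(x), with rho' from a central finite difference."""
    h = x * rel_step
    slope = (perron_root(edges, num_states, x + h)
             - perron_root(edges, num_states, x - h)) / (2 * h)
    return x * slope / perron_root(edges, num_states, x)


def capacity_cost(edges, num_states, alpha, x_min=1e-9):
    """Numerical evaluation of C_G(alpha) following the case split of Theorem 3.1."""
    if alpha * log_derivative(edges, num_states, 1.0) <= 1.0:    # alpha <= alpha_lo
        return alpha * log2(perron_root(edges, num_states, 1.0))
    if alpha * log_derivative(edges, num_states, x_min) >= 1.0:  # stand-in for alpha > alpha_up
        return 0.0
    lo, hi = x_min, 1.0                # bisection for alpha * x * rho'(x) = rho(x)
    for _ in range(60):
        mid = (lo + hi) / 2
        if alpha * log_derivative(edges, num_states, mid) < 1.0:
            lo = mid
        else:
            hi = mid
    x0 = (lo + hi) / 2
    return -log2(x0) + alpha * log2(perron_root(edges, num_states, x0))


# Locate x* with rho_G(x*) = 1 and the corresponding alpha* = 1 / (x* rho'(x*)).
lo, hi = 1e-9, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if perron_root(TELEGRAPH, 2, mid) < 1.0:
        lo = mid
    else:
        hi = mid
x_star = (lo + hi) / 2
alpha_star = 1.0 / log_derivative(TELEGRAPH, 2, x_star)
print(capacity_cost(TELEGRAPH, 2, alpha_star), -log2(x_star))
```

For the telegraph channel the two printed values should agree, since at $\alpha^*$ the term $\alpha \log_2 \rho_G(x_0)$ vanishes and the formula reduces to $-\log_2 x_0$.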


Theorem 3.1 improves over previous work [MR83, JH84, KMR00, SS06] in several ways. First, [MR83, JH84, KMR00] only consider the cost-constrained probabilistic capacity. Next, none of them recognizes the role of cost-diversity. Moreover, they do not address the full domain of the cost-constrained capacity. In contrast, our results explicitly determine the fixed-length capacity, they can be readily evaluated for cost-diverse graphs, and we consider the entire domain of the capacity function. Specifically, we identify a region for small α in which the capacity exhibits a linear scaling, we determine the exact slope in that region, and we find the threshold between the linear and non-linear regions. Interestingly, our formula for the fixed-length capacity is identical to the formula for the cost-constrained probabilistic capacity in [MR83, JH84] (up to differences in notation and a simple argument to address the linear scaling region). Thus, an immediate corollary of Theorem 3.1 is the equivalence between fixed-length capacity and cost-constrained probabilistic capacity. This fundamental result was first proved in [SS06] from [JH84] by establishing converse inequalities using typical sequence arguments, optimization techniques, and a variant of Lemma A.4.

Corollary 3.2. For any strongly connected and cost-diverse graph G, the cost-constrained probabilistic capacity $C_{\mathrm{prob}}(\alpha^{-1})$ is equal to the fixed-length capacity $C_G(\alpha)$.

Remark 3.3. The inverse of $\alpha_G^{\mathrm{lo}}$ is the average cost per edge, asymptotically in n, over all paths of length n in G. Equivalently, it is the average cost per edge associated with the unique stationary Markov chain of maximum entropy on G. The inverse of $\alpha_G^{\mathrm{up}}$ is the minimum average cost per edge among the cycles in G. For details, see Appendix D.

Theorem 3.1 follows from a stronger result that gives the precise behavior of $N_{G,v}(t, \alpha t)$. In the following, we mean by largest period d and largest cost-period c the largest integers such that the graph G has period d and cost-period c, respectively.

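Theorem 3.4. Let G be a strongly connected, deterministic, and cost-diverse graph with largest period d and largest cost-period c. Denote by $b$ and $B(v_j)$ the quantities from the c-periodic coboundary condition. For all α with $0 < \alpha < \alpha_G^{\mathrm{lo}}$ and for any $v \in V$, $N_{G,v}(t, \alpha t)$ has the asymptotic expansion

\[ N_{G,v}(t, \alpha t) = \sum_{j=0}^{d-1} \left(\lambda_j(1)\right)^{\alpha t} \left[ \mathbf{u}_j^{\mathsf{T}}(1)\, \mathbf{v}_j(1)\, \mathbf{1}^{\mathsf{T}} \right]_v + O\!\left(\tau^t\right), \]

where $0 < \tau < (\rho_G(1))^{\alpha}$ and $\mathbf{u}_j(x)$, $\mathbf{v}_j(x)$, normalized so that $\mathbf{v}_j(x)\mathbf{u}_j^{\mathsf{T}}(x) = 1$, are the right and left eigenvectors of $P_G(x)$ corresponding to the eigenvalues $\lambda_j(x) = \rho_G(x) e^{2\pi i j/d}$. For all α with $\alpha_G^{\mathrm{lo}} < \alpha < \alpha_G^{\mathrm{up}}$,

\[ N_{G,v}(t, \alpha t) = \sum_{k=0}^{c-1} \sum_{j=0}^{d-1} \left( \frac{\left(e^{2\pi i b k/c} \lambda_j(x_0)\right)^{\alpha}}{x_0 e^{2\pi i k/c}} \right)^{t} \frac{t^{-1/2}}{\sqrt{2\pi\alpha J(x_0)}} \left( \frac{\left[ D_k^{-1} \mathbf{u}_j^{\mathsf{T}}(x_0)\, \mathbf{v}_j(x_0)\, D_k \mathbf{1}^{\mathsf{T}} \right]_v}{1 - x_0 e^{2\pi i k/c}} + O\!\left(\frac{1}{t}\right) \right), \]

where $J(e^s) = \frac{\partial^2}{\partial s^2} \log \rho_G(e^s)$, $x_0$ is the unique positive solution to $\alpha x \rho_G'(x) = \rho_G(x)$, and the $D_k$ are the diagonal matrices with $[D_k]_{jj} = e^{2\pi i k B(v_j)/c}$. For all $\alpha > \alpha_G^{\mathrm{up}}$, $N_{G,v}(t, \alpha t)$ is eventually 0.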

To the best of our knowledge, this is the first first-order approximation of the size of the follower set of arbitrary strongly connected graphs. Notably, the multiplicative term following $t^{-1/2}$ is (asymptotically) independent of t and only depends on α. Further terms in the asymptotic expansion are effectively computable (with each successive term becoming ever more unwieldy).
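As a sanity check of the second expansion in Theorem 3.4, the following Python sketch compares the exact count with the leading term for the single-vertex graph of Example 5.3, where $P_G(x) = x + x^2$. Several things here are assumptions made for this illustration only: d = c = 1 (so the double sum has a single term and the bracketed eigenvector factor equals 1), the closed forms $x_0 = (1-\alpha)/(2\alpha-1)$ and $J(x_0) = x_0/(1+x_0)^2$ obtained by hand for this particular $\rho_G$, the range 2/3 < α < 1 for the smooth regime of this graph, and the function names.

```python
from math import comb, pi, sqrt

def exact_count(t: int, n: int) -> int:
    """N(t, n) for P_G(x) = x + x^2: paths with n edges and cost at most t.

    A path using k edges of cost 2 (and n - k of cost 1) has cost n + k.
    """
    return sum(comb(n, k) for k in range(0, min(n, t - n) + 1))

def leading_term(t: int, n: int) -> float:
    """Leading term of Theorem 3.4 (smooth regime), specialised by hand."""
    alpha = n / t
    x0 = (1 - alpha) / (2 * alpha - 1)   # solves alpha * x * rho'(x) = rho(x)
    rho = x0 + x0 ** 2                   # spectral radius at x0
    J = x0 / (1 + x0) ** 2               # d^2/ds^2 log rho(e^s) at e^s = x0
    growth = (rho ** alpha / x0) ** t
    return growth / sqrt(2 * pi * alpha * J * t) / (1 - x0)

if __name__ == "__main__":
    t, n = 400, 300                      # alpha = 0.75
    print(exact_count(t, n) / leading_term(t, n))  # expected to be close to 1
```

The ratio is expected to approach 1 at rate O(1/t), in line with the error term of the theorem.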

We further present the results for the case of variable-length sequences. With the following theorem, we give an alternate proof of one of Shannon’s results on discrete noiseless channels [Sha48].


Theorem 3.5. Let G be a strongly connected and deterministic graph and denote by $x_0$ the unique positive solution to $\rho_G(x) = 1$. Then, the combinatorial capacity of G satisfies $C_G = -\log_2 x_0$.
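The statement of Theorem 3.5 can be evaluated numerically. The sketch below is one possible way to do so, not the paper's procedure: it finds $x_0$ by bisection on $\rho_G(x) = 1$, assuming positive integer edge costs so that $\rho_G$ is nondecreasing on (0, 1] with $\rho_G(1) \ge 1$. The helper names and the callable `P_of_x` (returning the cost-enumerator matrix at x) are hypothetical.

```python
import numpy as np

def spectral_radius(P_of_x, x):
    """Largest eigenvalue modulus of the cost-enumerator matrix at x."""
    return float(max(abs(np.linalg.eigvals(P_of_x(x)))))

def capacity(P_of_x, lo=1e-9, hi=1.0, iters=200):
    """Return (C_G, x0) with rho_G(x0) = 1 and C_G = -log2(x0)."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if spectral_radius(P_of_x, mid) < 1.0:
            lo = mid            # root lies to the right of mid
        else:
            hi = mid            # root lies at or to the left of mid
    x0 = 0.5 * (lo + hi)
    return -np.log2(x0), x0

if __name__ == "__main__":
    # Single-vertex graph with two self-loops of cost 1 and 2.
    C, x0 = capacity(lambda x: np.array([[x + x ** 2]]))
    print(C, x0)
```

For the single-vertex graph with $P_G(x) = x + x^2$, the equation $x + x^2 = 1$ gives $x_0 = (\sqrt{5}-1)/2$, so the capacity is the base-2 logarithm of the golden ratio.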

We also obtain an exact explicit representation of the size of the costly constrained language $N_{G,v}(t)$. This uses a univariate singularity analysis of the generating function $F_{G,v}(x)$.

Theorem 3.6. Let G be a strongly connected and deterministic graph and denote by $x_1, \dots, x_m$ the solutions to $(1 - x)\det(I - P_G(x)) = 0$. Then, for any vertex v ∈ V, there exist polynomials $\Pi_{G,v,i}(t)$, calculable from the generating function $F_{G,v}(x)$, such that

\[ N_{G,v}(t) = \sum_{i=1}^{m} \Pi_{G,v,i}(t)\, x_i^{-t}. \]

The degree of the polynomial $\Pi_{G,v,i}(t)$ is equal to the multiplicity of the root $x_i$ minus one.

Example E.1 illustrates that the polynomials $\Pi_{G,v,i}(t)$ can easily be computed with a partial fraction decomposition of the generating function.
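To make the shape of Theorem 3.6 concrete, here is a small Python sketch for the single-vertex graph with $P_G(x) = x + x^2$, for which $(1-x)(1-x-x^2)$ has three simple roots and every $\Pi_i$ is therefore a constant. The constants below come from a partial fraction decomposition carried out by hand for this illustration (it is not taken from Example E.1), and the function names are hypothetical.

```python
from math import sqrt

def count_by_recursion(t_max):
    """N(t) for t = 0..t_max via N(t) = 1 + N(t-1) + N(t-2), N(t) = 0 for t < 0."""
    N = []
    for t in range(t_max + 1):
        prev1 = N[t - 1] if t >= 1 else 0
        prev2 = N[t - 2] if t >= 2 else 0
        N.append(1 + prev1 + prev2)      # the 1 accounts for the empty sequence
    return N

def count_by_roots(t):
    """N(t) = sum_i Pi_i * x_i^(-t) with constant Pi_i (all roots are simple)."""
    s5 = sqrt(5.0)
    roots = [1.0, (s5 - 1) / 2, -(s5 + 1) / 2]   # zeros of (1 - x)(1 - x - x^2)
    phi, psi = 1 / roots[1], 1 / roots[2]
    coeffs = [-1.0, phi ** 3 / s5, -psi ** 3 / s5]
    return sum(c * r ** (-t) for c, r in zip(coeffs, roots))

if __name__ == "__main__":
    exact = count_by_recursion(10)
    print(exact)
    print([round(count_by_roots(t)) for t in range(11)])
```

Both lists should agree, with the root of smallest modulus, $(\sqrt{5}-1)/2$, governing the exponential growth.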

Applications. Section 6 contains our results on DNA synthesis and the size of deletion balls. The analysis focuses on a graph G(s), where paths of a given cost correspond to subsequences of a periodic sequence s (Definitions 6.1 and 6.3). Then, for this graph G(s) and a character v, we have that $N_{G(s),v}(t)$ counts all subsequences of the t characters of s starting at v. Similarly, $N_{G(s),v}(t, \lfloor \alpha t \rfloor)$ counts subsequences that also have length $\lfloor \alpha t \rfloor$. For DNA synthesis, the capacity of G(s) maps to certain information rates. We use Theorem 3.4, Theorem 3.5, and Theorem 3.6 to derive new results, such as Proposition 6.7, Example 6.12, and Example 6.13.

4 Technical Overview

We now provide an overview of the tools of analytic combinatorics in several variables [Mel21] and matrix spectral analysis [HJ12] that we use to analyze the singularities of a bivariate generating function. The behavior of the generating function near specific singularities maps to the asymptotic expansion of the diagonal coefficients corresponding to an average cost constraint. The analysis involves the relationship between graph structure, channel cost function, and the singular set.

4.1 Analytic Combinatorics in Several Variables

We start with analytic preliminaries (see Appendix E for examples illustrating these concepts). Analytic combinatorics [FS09] gives asymptotics of a sequence $(N(t) : t \in \mathbb{N}) = N(0), N(1), \dots$ from properties of its generating function $F(x) = \sum_{t \ge 0} N(t) x^t$. Analytic combinatorics in several variables (ACSV) [PW13, Mel21] considers multivariate sequences $(N(\mathbf{t})) = (N(t_1, \dots, t_d))$ and generating functions

\[ F(\mathbf{x}) = \sum_{\mathbf{t} \in \mathbb{N}^d} N(\mathbf{t})\, \mathbf{x}^{\mathbf{t}} = \sum_{t_1, \dots, t_d \in \mathbb{N}} N(t_1, \dots, t_d)\, x_1^{t_1} \cdots x_d^{t_d}. \]

ACSV resembles the univariate case, transferring behavior of a generating function near singularities to an asymptotic expansion of its coefficients. There are many ways for the coefficient vector $\mathbf{t}$ to grow to infinity in the multivariate setting. Under mild assumptions, if $\mathbf{t} = t\boldsymbol{\alpha}$ with t → ∞ and $\boldsymbol{\alpha}$ fixed, then the asymptotic behaviour of $N(\mathbf{t})$ is uniform as $\boldsymbol{\alpha}$ varies in cones of $\mathbb{R}^d$. Thus, the methods of ACSV derive asymptotic behaviour of the $\boldsymbol{\alpha}$-diagonals $N(t\boldsymbol{\alpha})$ as functions of $\boldsymbol{\alpha}$.

Fix a direction $\boldsymbol{\alpha} \in \mathbb{R}^d_{>0}$ and coprime polynomials $Q(\mathbf{x})$ and $H(\mathbf{x})$ such that the multivariate rational function $F(\mathbf{x}) = Q(\mathbf{x})/H(\mathbf{x})$ has a convergent power series expansion $F(\mathbf{x}) = \sum_{\mathbf{t} \in \mathbb{N}^d} N(\mathbf{t})\, \mathbf{x}^{\mathbf{t}}$ near the origin. We assume that H is not a constant (which is true for the generating functions under consideration), meaning (when d ≥ 2) that F admits an infinite set $\mathcal{V}$ of singularities defined by the vanishing of H. An asymptotic analysis begins with the Cauchy integral formula, which in d variables states that

\[ N(t\boldsymbol{\alpha}) = \frac{1}{(2\pi i)^d} \int_{T} F(\mathbf{x})\, \frac{d\mathbf{x}}{x_1^{t\alpha_1 + 1} \cdots x_d^{t\alpha_d + 1}}, \tag{1} \]

where T is a product of circles close to the origin. Our results hold when $t\boldsymbol{\alpha}$ has non-negative integer coordinates so that $N(t\boldsymbol{\alpha})$ is well defined.
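For intuition, the Cauchy integral (1) can be evaluated numerically in the bivariate case by discretizing the torus T. The following Python sketch does this for the rational function $F(x, y) = 1/((1-x)(1-y(x+x^2)))$ of Example 5.3. The radii, the grid size, and the function names are arbitrary choices for this illustration, and the trapezoidal rule used here is only a numerical device, not part of the ACSV method.

```python
import numpy as np

def F(x, y):
    """Generating function of Example 5.3 (single vertex, P_G(x) = x + x^2)."""
    return 1.0 / ((1.0 - x) * (1.0 - y * (x + x ** 2)))

def cauchy_coefficient(t, n, r1=0.4, r2=0.5, m=64):
    """Approximate the coefficient of x^t y^n via (1) on the torus |x|=r1, |y|=r2.

    With x = r1*e^{i*theta}, dx = i*x*dtheta, so (1) becomes the average of
    F(x, y) * x^(-t) * y^(-n) over the torus, approximated on an m-by-m grid.
    """
    theta = 2.0 * np.pi * np.arange(m) / m
    x = (r1 * np.exp(1j * theta))[:, None]
    y = (r2 * np.exp(1j * theta))[None, :]
    integrand = F(x, y) * x ** (-t) * y ** (-n)
    return integrand.mean().real

if __name__ == "__main__":
    # Paths with 2 edges and cost at most 3: costs 2, 3, 3 qualify, cost 4 does not.
    print(cauchy_coefficient(3, 2))
```

The radii are chosen so that the torus lies inside the domain of convergence ($|x| < 1$ and $|y(x + x^2)| < 1$), which is the sense in which T is "close to the origin".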

If d = 1 one can expand the circle of integration T out to infinity, picking up an asymptotic contribution from each pole of F that can be explicitly determined through a residue computation. If d > 1 there are many possible ways to deform the domain of integration T away from the origin, and it is not immediately clear how to proceed. Roughly, the idea is to push T to the singular set $\mathcal{V}$ determined by the zero set of the denominator $H(\mathbf{x})$, perform a residue computation to reduce the Cauchy integral in (1) to a simpler integral ‘on’ the variety $\mathcal{V}$, then use the saddle-point method to asymptotically approximate this reduced integral. To determine where on the singular set to perform this procedure, two properties of singularities come into play.

Minimal points are the singularities of $F(\mathbf{x})$ on the boundary of its domain of absolute convergence. Equivalently, $\mathbf{x} \in \mathcal{V}$ is minimal if and only if H has no other root $\mathbf{y}$ with strictly smaller coordinate-wise modulus. A minimal point is strictly minimal if no other singularity has the same coordinate-wise modulus, and finitely minimal if only a finite number of other singularities have the same coordinate-wise modulus. Minimal points are those singularities to which we can easily deform the domain of integration T without crossing $\mathcal{V}$.

Critical points are singularities where locally restricting to the variety results in a saddle-point integral. Although the critical points can be encoded by a finite number of algebraic sets, their explicit definition depends heavily on the geometry of $\mathcal{V}$ and can be quite complicated. The simplest situation is the square-free smooth case, when H and all of its partial derivatives do not simultaneously vanish and the critical points are solutions of the system

\[ H(\mathbf{x}) = \alpha_2 x_1 H_{x_1}(\mathbf{x}) - \alpha_1 x_2 H_{x_2}(\mathbf{x}) = \cdots = \alpha_d x_1 H_{x_1}(\mathbf{x}) - \alpha_1 x_d H_{x_d}(\mathbf{x}) = 0 \tag{2} \]

containing d equations in d variables. If $\mathcal{V}$ is locally a manifold near a point $\mathbf{x} \in \mathcal{V}$ but H and its partial derivatives simultaneously vanish at $\mathbf{x}$, then $\mathbf{x}$ is a critical point if and only if (2) holds when H is replaced by the product of its unique irreducible factors. The only other situation encountered in this paper is the complete intersection case, when $\mathcal{V}$ is locally the union of d manifolds near $\mathbf{x} \in \mathcal{V}$, and any such $\mathbf{x}$ is a critical point.

Under a few additional conditions, the existence of minimal critical points means asymptotics of $N(t\boldsymbol{\alpha})$ can be determined from local properties of $F(\mathbf{x})$ near these points. Although the generating functions we study here admit minimal critical points, their existence is not guaranteed in general.


4.2 On the Way to ACSV: Cost-Diversity and Matrix Analysis

To use the ACSV techniques, our starting point is Lemma 5.1, which provides a linear recursion for the sequence counting problem. Lemma 5.2 then uses this recursion to derive the family of multivariate generating functions that we analyze. The specific ACSV theorems that we use are contained in Section 5.2. For example, one of our key theorems states that for a strongly connected, cost-diverse G with cost-enumerator matrix $P_G(x)$ admitting a unique real eigenvalue $\rho_G(x)$ with maximum modulus, there is (under moderate assumptions) an asymptotic expansion of the form

\[ N_{G,v}(t, \alpha t) \sim x_0^{-t}\, y_0^{-\alpha t}\, t^{-1/2} \cdot \mathrm{Formula}(\alpha, x_0, y_0) \]

as t → ∞, where α > 0 determines the cost constraint, $x_0$ is the unique positive solution to $\alpha x \rho_G'(x) = \rho_G(x)$ and $y_0 = 1/\rho_G(x_0)$. The exact expression is in Theorem 3.4.
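The pair $(x_0, y_0)$ above can be located numerically. The following Python sketch is an illustration under stated assumptions, not the method used in the paper: it approximates $\rho_G'$ by a central finite difference and applies bisection to $g(x) = \alpha x \rho_G'(x) - \rho_G(x)$, assuming g changes sign exactly once on the search interval (which is what strict log-log-convexity of $\rho_G$ suggests for α in the smooth regime). The names `critical_point` and `P_of_x`, the bracket, and the step size are hypothetical choices.

```python
import numpy as np

def spectral_radius(P_of_x, x):
    """Largest eigenvalue modulus of the cost-enumerator matrix at x."""
    return float(max(abs(np.linalg.eigvals(P_of_x(x)))))

def critical_point(P_of_x, alpha, lo=1e-3, hi=50.0, h=1e-6, iters=100):
    """Return (x0, y0) with alpha*x0*rho'(x0) = rho(x0) and y0 = 1/rho(x0)."""
    def g(x):
        # Central finite difference for rho'(x).
        d_rho = (spectral_radius(P_of_x, x + h)
                 - spectral_radius(P_of_x, x - h)) / (2 * h)
        return alpha * x * d_rho - spectral_radius(P_of_x, x)

    if not (g(lo) < 0 < g(hi)):
        raise ValueError("no sign change on [lo, hi]; alpha may be outside the smooth regime")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    x0 = 0.5 * (lo + hi)
    return x0, 1.0 / spectral_radius(P_of_x, x0)

if __name__ == "__main__":
    # Single-vertex graph with P_G(x) = x + x^2 and alpha = 3/4.
    print(critical_point(lambda x: np.array([[x + x ** 2]]), 0.75))
```

For $P_G(x) = x + x^2$ the equation reduces to $\alpha(1 + 2x) = 1 + x$, so with α = 3/4 one expects $x_0 = 1/2$ and $y_0 = 4/3$.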

Before invoking these results, we need to establish some theory about costly constrained channels. Doing so will both verify that our generating functions satisfy necessary conditions and help us understand the implications of the ACSV theorems in our setting. In particular, we must first rule out certain degenerate cases. Khandekar, McEliece, and Rodemich [KMR00] have observed that graphs with constant edge cost are uninteresting, since the capacity is simply determined by the number of paths in the graph of a given length. Generalizing this observation, we focus on cost-diverse graphs (Definition 2.2). Interestingly, we also show that if a graph is not cost-diverse (i.e., it is cost-uniform), then the average cost of fixed-length paths approaches a constant. Hence, the only ways for the capacity to be non-zero are if the graph is cost-diverse or if we restrict to a specific path length. Similarly, we consider strongly connected graphs to avoid the situation where the asymptotic counting results vary significantly based on the starting vertex.

Focusing on cost-diverse and strongly connected graphs, we dive into some of the interesting properties that such graphs have. We aim to determine graph properties that are sufficiently general to aid in our spectral and analytic analysis. This is the content of Section 5.3. A key stepping stone for our results is Lemma 5.8, which provides an equivalence between many properties of strongly connected graphs. Specifically, we establish a connection between (i) cost-diversity, (ii) coboundary conditions, (iii) nice behavior of the spectral radius under rotations, and (iv) log-log-linearity or log-log-convexity of the spectral radius. Proving Lemma 5.8 involves both combinatorial reasoning and matrix spectral analysis.

4.3 Connecting ACSV and Cost-Diverse Graphs via Spectral Analysis

Our equivalence result in Lemma 5.8 sets the stage for us to derive properties of the generating functions by using our knowledge of the underlying properties of the matrix $P_G(x)$. This appears in Section 5.4. At a high level, we need to understand the singularities and critical points of our generating functions $F_G(x, y)$ to apply the theorems in Section 5.2. In order to apply the standard results of ACSV, strong information about the singular set is needed. In particular, we must identify minimal critical points and points with the same coordinate-wise moduli; these are real algebraic properties (involving relationships among the moduli of complex variables) and thus expensive to determine computationally even on many explicit bivariate examples (see [MS21] for a discussion of the complexity of such operations). These issues are also complicated by the non-smoothness of the singular set of the generating functions under consideration.

To get around these issues, we use characteristics of our generating functions that are guaranteed by properties of the underlying cost-diverse and strongly connected graph G. In Lemma 5.19 we identify the minimal singularities of $F_G(x, y)$, expressing them as a function of the graph period d, cost-period c, and spectral radius $\rho_G(x_0)$. The idea is to use the Perron-Frobenius theorem to show that $\rho_G(x_0)$ is the single real eigenvalue of maximum modulus, and then to show that $(x_0, 1/\rho_G(x_0))$ is a minimal singularity. To prove uniqueness, we explicitly write out the other singularities and verify that they are not minimal.

Next, Lemma 5.20 proves that the points identified in Lemma 5.19 are smooth points of $F_G(x, y)$, and gives a condition on α and $\rho_G(x)$ that guarantees them to be critical. We proceed by reasoning about the eigenvectors of $P_G(x)$. A key component of the proof is Lemma 5.11, which shows that when we rotate x, replacing it by $x e^{2\pi i k/c}$, the eigenvalues also scale by exactly $e^{2\pi i b k/c}$. We use this property to find a nice formulation of a related adjoint matrix, and to conclude that the points are smooth.

Going one step further, we investigate the conditions on α and $\rho_G(x)$ that ensure a solution to the critical point equations (2). Specifically, in Lemma 5.21 we show that α must be in a certain range depending on the behavior of $\rho_G(x)$. This proof is one of the places where we use the strict log-log-convexity of $\rho_G(x)$ that we prove in Lemma 5.17.

The final component needed is Lemma 5.22, which proves that the singular set near the critical points has non-degenerate geometry. We again analyze the matrix spectrum and use the rotation property in Lemma 5.11. A key step involves a characterization of non-degenerate critical singularities based on a property of a certain Hessian. Specifically, we can show this Hessian is non-singular by again using the strict log-log-convexity of $\rho_G(x)$.

To establish Theorem 3.4 we then apply results from [Mel21] and use the spectral properties of $P_G$ that we have derived from the graph properties. When $(x_0, 1/\rho_G(x_0))$ is a smooth point of the singular set of the generating function, asymptotic behaviour is determined using Proposition 5.4, while in the non-smooth case it follows from an application of Proposition 5.5. We can compute the terms in the expression for $N_{G,v}(t, \alpha t)$ using Lemma 5.11 and Lemma A.2, and also verify that the conditions on α hold as in the statement of Theorem 3.4.

Implications and Applications. After we have established Theorem 3.4, we derive the other results as corollaries in Section 5.6. For example, we give first-order approximations of the number of followers in Theorem 3.4 and Theorem 3.6. These results then directly imply the exponential growth rate of the number of sequences, and the growth rate determines the capacity. Specifically, the first-order approximations lead to our characterization of the fixed-length capacity $C_G(\alpha)$ in Theorem 3.1 and the variable-length capacity $C_G$ in Theorem 3.5.

To provide some concrete applications of our general theorems, we consider questions relating to DNA synthesis and deletion balls in Section 6. Prior work [LSWY20] formulated these problems in terms of a subsequence graph G(s) for a sequence s. The paths in G(s) correspond to subsequences of s, and the cost of the paths corresponds, roughly, to the distance between characters in s. The previous analysis is limited to the case when s is the alternating sequence. By using our new theorems to analyze G(s) and the associated sequence counting problems, we are able to extend the previous result in two ways. First, we allow s to be an arbitrary periodic sequence, i.e., repeating a string r to form s = (rrrr . . .). Second, we incorporate constraints by analyzing a labeled graph product (Definition 6.9) that is formed from G(s) and a constraint graph.


5 Constrained Codes and ACSV

As noted above, our general approach involves using spectral theory and graph properties to characterize the singular behaviour of a generating function, so that we can apply ACSV results. We thus start by deriving the generating function of the sequence $N_{G,v}(t, n)$, after which we present the ACSV theorems that we make use of. Introducing these theorems illustrates what singular properties of the generating function we need to establish, and to this end we derive some general results on the spectral properties of strongly connected and cost-diverse graphs. Finally, we establish the connection between graph properties and the singularities of the generating function, allowing us to use the powerful machinery of ACSV.

5.1 Generating Function of the Follower Set Size

The starting point of our derivations is the following recursion, which is key to deriving the generating function of the sequence $N_{G,v}(t, n)$.

Lemma 5.1. Let G be a deterministic graph. Then the size of the follower set of any vertex v ∈ V obeys the recursion

\[ N_{G,v}(t, n) = \sum_{e \in E:\, \mathrm{init}(e) = v} N_{G,\mathrm{term}(e)}(t - \tau(e), n - 1) \]

for all n > 0 and t ≥ 0, where $N_{G,v}(t, 0) = 1$ for all t ≥ 0 and $N_{G,v}(t, n) = 0$ for all t < 0 or n < 0.

Proof. Let $\mathcal{P}_{G,v}(t, n)$ denote the set of all length-n paths through G that start from vertex v and have cost at most t. By the deterministic property of the graph, $N_{G,v}(t, n) = |\mathcal{P}_{G,v}(t, n)|$. Partition $\mathcal{P}_{G,v}(t, n)$ according to the first traversed edge e ∈ E into parts $\mathcal{P}_{G,v,e}(t, n) = \{\mathbf{p} \in \mathcal{P}_{G,v}(t, n) : \mathbf{p} = (e, e_2, \dots)\}$, for any e ∈ E that leaves v (i.e., with init(e) = v). To start with, $\mathcal{P}_{G,v}(t, n) = \bigcup_{e \in E:\, \mathrm{init}(e) = v} \mathcal{P}_{G,v,e}(t, n)$ and the parts are disjoint by definition. Now, any path $\mathbf{p} \in \mathcal{P}_{G,v,e}(t, n)$ starts by traversing e, which costs τ(e) and results in the vertex term(e). Therefore, each path $\mathbf{p} \in \mathcal{P}_{G,v,e}(t, n)$ can be assembled by prepending e to some path of cost at most t − τ(e) and length n − 1 that starts from term(e), i.e., $\mathcal{P}_{G,v,e}(t, n) = \{\mathbf{p} = (e, \mathbf{p}') : \mathbf{p}' \in \mathcal{P}_{G,\mathrm{term}(e)}(t - \tau(e), n - 1)\}$. Thus $|\mathcal{P}_{G,v,e}(t, n)| = N_{G,\mathrm{term}(e)}(t - \tau(e), n - 1)$, which proves the recursive statement of the lemma. The initial condition $N_{G,v}(t, 0) = 1$ for all t ≥ 0 comes from the fact that we include the length-zero string in our computations.
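The recursion of Lemma 5.1 translates directly into a memoized function. The Python sketch below checks it against brute-force path enumeration on a small two-vertex graph. The edge list `EDGES`, given as (init, term, cost) triples, is a made-up example rather than a graph from the paper, and the function names are illustrative.

```python
from functools import lru_cache
from itertools import product

# Hypothetical graph: (init, term, cost) triples.
EDGES = [(0, 0, 1), (0, 1, 2), (1, 0, 1)]

@lru_cache(maxsize=None)
def follower_count(v, t, n):
    """N_{G,v}(t, n): paths with n edges from v and cost at most t (Lemma 5.1)."""
    if t < 0 or n < 0:
        return 0
    if n == 0:
        return 1                      # the empty path
    return sum(follower_count(term, t - cost, n - 1)
               for init, term, cost in EDGES if init == v)

def brute_force(v, t, n):
    """Count the same paths by enumerating all edge sequences of length n."""
    if t < 0:
        return 0
    count = 0
    for path in product(EDGES, repeat=n):
        current, cost, valid = v, 0, True
        for init, term, c in path:
            if init != current:       # consecutive edges must be connected
                valid = False
                break
            current, cost = term, cost + c
        if valid and cost <= t:
            count += 1
    return count

if __name__ == "__main__":
    print(follower_count(0, 5, 3), brute_force(0, 5, 3))
```

Memoization keeps the cost polynomial in t, n, and the graph size, whereas the brute-force enumeration grows exponentially in n and serves only as a cross-check.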

This recursion enables us to derive the exact generating function of the integer sequences $N_{G,v}(t, n)$. Note that restricting our attention to the sequence $N_{G,v}(t, n)$ also yields results for $N_{G,v}(t)$ because $N_{G,v}(t) = \sum_{n \ge 0} N_{G,v}(t, n)$. The next lemma derives the generating function of the sequence $N_{G,v}(t, n)$ for vertices v ∈ V.

Lemma 5.2. Let G be a deterministic graph, and let v be a vertex. Let (1, . . . , 1) denote the all-ones vector of length |V| and I denote the |V| × |V| identity matrix. The generating function $F_{G,v}(x, y)$ of $N_{G,v}(t, n)$ is given by the entry of

\[ F_G(x, y) = \frac{1}{1 - x} \cdot (I - y P_G(x))^{-1} (1, \dots, 1)^{T} \]

corresponding to the vertex v.


Note that $I - y P_G(x)$ is not always invertible, but the values of x and y where $I - y P_G(x)$ is singular form the singularities of our generating functions, which are precisely the objects we need to characterize to determine the asymptotic behavior of the integer sequence $N_{G,v}(t, n)$.

Proof. Starting from the recursive expression of $N_{G,v}(t, n)$, we first incorporate the beginning of the recursion and obtain

\[ N_{G,v}(t, n) = \sum_{e \in E:\, \mathrm{init}(e) = v} N_{G,\mathrm{term}(e)}(t - \tau(e), n - 1) + U(t, n), \]

where U(t, n) = 1 if n = 0 and t ≥ 0, and U(t, n) = 0 otherwise. Multiplying by $x^t y^n$ on both sides and summing over n, t yields

\begin{align*}
F_{G,v}(x, y) &= \sum_{e \in E:\, \mathrm{init}(e) = v}\ \sum_{t, n \ge 0} N_{G,\mathrm{term}(e)}(t - \tau(e), n - 1)\, x^t y^n + \sum_{t, n \ge 0} U(t, n)\, x^t y^n \\
&= \sum_{e \in E:\, \mathrm{init}(e) = v} x^{\tau(e)} y \sum_{t \ge -\tau(e),\, n \ge -1} N_{G,\mathrm{term}(e)}(t, n)\, x^t y^n + \sum_{t \ge 0} x^t \\
&\overset{(a)}{=} \sum_{e \in E:\, \mathrm{init}(e) = v} x^{\tau(e)} y\, F_{G,\mathrm{term}(e)}(x, y) + \frac{1}{1 - x},
\end{align*}

where in (a) we used that $N_{G,v}(t, n) = 0$ for any t < 0 or n < 0 and the fact that the geometric series is equal to 1/(1 − x). Combining the generating functions of all vertices into $F_G(x, y)$, we obtain

\[ F_G(x, y) = y P_G(x) F_G(x, y) + \frac{1}{1 - x} (1, \dots, 1)^{T}. \]

Rearranging the above equality gives the claim.

Example 5.3. Consider the graph in Figure 2. In this case, $P_G(x) = x + x^2$ and thus the generating function of the single vertex is given by $F(x, y) = \frac{1}{(1 - x)(1 - y(x + x^2))}$.
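The generating function of Example 5.3 can be checked coefficient by coefficient. In the Python sketch below, the coefficient of $y^n$ in $F(x, y)$ is expanded as $(x + x^2)^n/(1 - x)$ using plain list arithmetic and compared with the recursion of Lemma 5.1 specialized to a single vertex with self-loops of cost 1 and 2. The function names are illustrative.

```python
def series_coefficients(t_max, n):
    """Coefficients of x^0..x^t_max in (x + x^2)^n / (1 - x)."""
    poly = [1]                               # (x + x^2)^0
    for _ in range(n):                       # multiply by (x + x^2)
        new = [0] * (len(poly) + 2)
        for i, coeff in enumerate(poly):
            new[i + 1] += coeff
            new[i + 2] += coeff
        poly = new
    out, running = [], 0
    for t in range(t_max + 1):               # dividing by (1 - x) = prefix sums
        if t < len(poly):
            running += poly[t]
        out.append(running)
    return out

def recursion(t, n):
    """N(t, n) from Lemma 5.1 for one vertex with self-loops of cost 1 and 2."""
    if t < 0 or n < 0:
        return 0
    if n == 0:
        return 1
    return recursion(t - 1, n - 1) + recursion(t - 2, n - 1)

if __name__ == "__main__":
    print(series_coefficients(6, 2))
    print([recursion(t, 2) for t in range(7)])
```

Both lines should list the same numbers, namely the counts of two-edge paths with cost at most t for t = 0, ..., 6.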

5.2 ACSV Theorems

We now introduce the theorems from ACSV that we will use to prove our asymptotic results, taking care to note what properties we need to establish to determine asymptotics. These theorems are specialized to the bivariate case we consider.

First, we see a theorem for ‘smooth’ asymptotics, which applies when asymptotics are determined by a singularity near which the singular set forms a smooth manifold. We will apply this result in the regime when $\alpha > \alpha^{\mathrm{lo}}_G$. A singularity $(x_0, y_0)$ of a rational generating function F(x, y) is called non-degenerate with respect to the direction $(\alpha_1, \alpha_2)$ when the quantity

\[ J(x_0, y_0) = \frac{\alpha_1(\alpha_1 + \alpha_2)}{\alpha_2^2} + \frac{x_0^2 H_{xx}(x_0, y_0)}{y_0 H_y(x_0, y_0)} - 2\, \frac{\alpha_1 x_0 H_{xy}(x_0, y_0)}{\alpha_2 H_y(x_0, y_0)} + \frac{\alpha_1^2 y_0 H_{yy}(x_0, y_0)}{\alpha_2^2 H_y(x_0, y_0)} \]

exists and is non-zero, where subscripted variables refer to partial derivatives.


Proposition 5.4 (Melczer [Mel21, Theorem 5.1]). Let $\alpha_1, \alpha_2 > 0$ and let Q(x, y), H(x, y) be coprime polynomials such that F(x, y) = Q(x, y)/H(x, y) admits a power series expansion $F(x, y) = \sum_{t, n \ge 0} N(t, n)\, x^t y^n$. Suppose that the system of polynomial equations

\[ H(x, y) = \alpha_2 x H_x(x, y) - \alpha_1 y H_y(x, y) = 0 \tag{3} \]

admits a finite number of solutions, exactly one of which, $(x_0, y_0) \in \mathbb{C}^2$, is minimal. Suppose further that $(x_0, y_0)$ has non-zero coordinates, $H_y(x_0, y_0) \neq 0$, and $(x_0, y_0)$ is non-degenerate. Then, as t → ∞,

\[ N(t\alpha_1, t\alpha_2) = x_0^{-t\alpha_1}\, y_0^{-t\alpha_2}\, t^{-1/2}\, \frac{1}{\sqrt{2\pi \alpha_2 J(x_0, y_0)}} \left( \frac{-Q(x_0, y_0)}{y_0 H_y(x_0, y_0)} + O\!\left(\frac{1}{t}\right) \right) \tag{4} \]

when $t(\alpha_1, \alpha_2) \in \mathbb{N}^2$.

A straightforward generalization of Proposition 5.4 exists when the critical point equations (3) admit a finite set of minimal singularities that all have the same coordinate-wise modulus. If all such points satisfy the conditions of the proposition then an asymptotic expansion is obtained by summing the right-hand side of (4) at each of the points; see [Mel21, Corollary 5.2]. To establish asymptotics in the smooth case, we thus need to: characterize the minimal points that satisfy (3) for our generating functions; show that $H_y$ does not vanish at these points (meaning they are ‘smooth’ points); prove that the points are non-degenerate; and verify that the numerator of the generating function does not vanish (to be sure we have a non-zero dominant asymptotic term).
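A quick way to get a feel for (4) is to apply it to a textbook example that is not one of the generating functions of this paper: $F(x, y) = 1/(1 - x - y)$, whose coefficients are binomial coefficients. Here Q = 1 and H = 1 − x − y, the system (3) gives $x_0 = \alpha_1/(\alpha_1 + \alpha_2)$ and $y_0 = \alpha_2/(\alpha_1 + \alpha_2)$, and all second-order partial derivatives of H vanish, so only the first term of J remains. The Python sketch below, with illustrative function names, compares the leading term with the exact coefficient.

```python
from math import comb, pi, sqrt

def smooth_estimate(t, a1, a2):
    """Leading term of (4) for F = 1/(1 - x - y) in direction (a1, a2)."""
    x0 = a1 / (a1 + a2)
    y0 = a2 / (a1 + a2)
    Hy = -1.0                              # partial derivative of H = 1 - x - y
    J = a1 * (a1 + a2) / a2 ** 2           # H is linear: H_xx = H_xy = H_yy = 0
    amplitude = -1.0 / (y0 * Hy)           # -Q / (y0 * H_y) with Q = 1
    return (x0 ** (-t * a1) * y0 ** (-t * a2) * t ** -0.5
            / sqrt(2 * pi * a2 * J) * amplitude)

def exact_coefficient(t, a1, a2):
    """Coefficient of x^(t*a1) y^(t*a2) in 1/(1 - x - y)."""
    return comb(t * (a1 + a2), t * a1)

if __name__ == "__main__":
    t, a1, a2 = 200, 1, 2
    print(exact_coefficient(t, a1, a2) / smooth_estimate(t, a1, a2))
```

The printed ratio should be close to 1, and for $\alpha_1 = \alpha_2 = 1$ the leading term reduces to the classical estimate $4^t/\sqrt{\pi t}$ for the central binomial coefficient.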

The other case of interest is the ‘multiple-point case’ where two smooth branches of the singular set collide. In this case, the asymptotic behavior is obtained using the following theorem.

Proposition 5.5 (Melczer [Mel21, Prop. 9.1 and Thm. 9.1]). Let $\alpha_1, \alpha_2 > 0$ and let Q(x, y), H(x, y) be coprime polynomials such that F(x, y) = Q(x, y)/H(x, y) admits a power series expansion $F(x, y) = \sum_{t, n \ge 0} N(t, n)\, x^t y^n$. Suppose that $(x_0, y_0)$ is a strictly minimal point, and near $(x_0, y_0)$ the zero set of H is locally the union of the sets defined by the vanishing of polynomials R(x, y) and S(x, y) such that $R(x_0, y_0) = S(x_0, y_0) = 0$ and the gradients of R and S are linearly independent at $(x_0, y_0)$ (in particular, both gradients must be non-zero so each of the zero sets is locally smooth near $(x_0, y_0)$). If there exist $\nu_1, \nu_2 > 0$ such that

\[ (\alpha_1, \alpha_2) = \nu_1 \left( 1, \frac{y_0 R_y(x_0, y_0)}{x_0 R_x(x_0, y_0)} \right) + \nu_2 \left( 1, \frac{y_0 S_y(x_0, y_0)}{x_0 S_x(x_0, y_0)} \right) \]

then, as t → ∞,

\[ N(t\alpha_1, t\alpha_2) = x_0^{-t\alpha_1}\, y_0^{-t\alpha_2}\, \frac{Q(x_0, y_0)}{|\det \mathbf{J}|} + O(\tau^t) \]

where $0 < \tau < x_0^{-\alpha_1} y_0^{-\alpha_2}$ and

\[ \mathbf{J} = \begin{pmatrix} x_0 R_x(x_0, y_0) & y_0 R_y(x_0, y_0) \\ x_0 S_x(x_0, y_0) & y_0 S_y(x_0, y_0) \end{pmatrix}. \]

As in the smooth case, if the point $(x_0, y_0)$ is finitely minimal and all points with the same coordinate-wise modulus satisfy the conditions of Proposition 5.5, then we get an asymptotic expansion by summing the asymptotic contributions of each. Note that in this multiple-point case our asymptotic expansion has an exponentially small error term. Applying Proposition 5.5 is easier than applying Proposition 5.4, and our task is to prove the existence of the constants $\nu_1$ and $\nu_2$.
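The multiple-point formula can be tried out on the generating function of Example 5.3, $F(x, y) = 1/((1-x)(1-y(x+x^2)))$, with R = 1 − x and S = 1 − y(x + x²) meeting at $(x_0, y_0) = (1, 1/2)$. The following Python sketch is a hand-worked illustration under assumptions of ours: the direction is (1, α), and solving the cone condition by hand suggests it holds for 0 < α < 2/3 for this graph. The function names are hypothetical.

```python
from math import comb

def exact_count(t, n):
    """N(t, n) for P_G(x) = x + x^2: paths with n edges and cost at most t."""
    return sum(comb(n, k) for k in range(0, min(n, t - n) + 1))

def multiple_point_estimate(t, n):
    """Leading term of Proposition 5.5 at (x0, y0) = (1, 1/2), direction (t, n)."""
    x0, y0 = 1.0, 0.5
    Rx, Ry = -1.0, 0.0                        # R = 1 - x
    Sx = -y0 * (1 + 2 * x0)                   # S = 1 - y (x + x^2)
    Sy = -(x0 + x0 ** 2)
    det_J = (x0 * Rx) * (y0 * Sy) - (y0 * Ry) * (x0 * Sx)
    Q = 1.0                                   # numerator of F
    return x0 ** (-t) * y0 ** (-n) * Q / abs(det_J)

if __name__ == "__main__":
    t, n = 200, 120                           # alpha = 0.6 < 2/3
    print(exact_count(t, n) / multiple_point_estimate(t, n))
```

In this regime the estimate equals $2^{n}$, reflecting that the cost bound is loose enough for almost all $2^n$ paths with n edges to qualify, and the ratio should be exponentially close to 1.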

Remark. In Appendix E we give concrete examples that help illustrate why the conditions of these theorems are necessary and how they are applied in our setting.


5.3 Cost-Diverse Graphs and Equivalent Conditions

An important aspect and requirement of Theorem 3.4 is that the graph G is cost-diverse. Roughly speaking, by Definition 2.2, cost-diversity means that some paths of the same length have different costs. This property is important in the derivation of the asymptotics of the bivariate series $N_{G,v}(t, \alpha t)$ as it entails a smooth behavior of the series in the parameter α. Conversely, if G is cost-uniform then there is only a single value for α for which the series does not vanish asymptotically, which is the value where $N_{G,v}(t, \alpha t) = N_{G,v}(t)$. The following examples illustrate cost-diversity.

[Figure 3, panels: (a) Cost-uniform graph; (b) Cost-diverse graph; (c) Cost-uniform graph; (d) Cost-diverse graph with cost-period 2; (e) Graph with period 2 and cost-period 3.]

Figure 3: Examples of graphs illustrating cost-diversity and cost-periodicity

Example 5.6. Figure 3 illustrates cost-diversity on different graphs. Figure 3a is a graph with constant cost and thus all paths of length m have cost exactly m. The graph is therefore not cost-diverse. Figure 3b on the other hand is cost-diverse; there are two paths of length 2 from the left vertex to itself having cost either 2 or 4. Finally, Figure 3c shows a cost-uniform graph, since any cycle of length m from the left vertex to itself has cost 2m; any cycle of length m from the right vertex to itself has cost 2m; any path of length m from the left to the right vertex has cost 2m − 1; and any path of length m from the right to the left vertex has cost 2m + 1.
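The path-cost claims of Example 5.6 can be checked mechanically for short paths. The Python sketch below collects, for every pair of endpoints, the set of costs of all paths with exactly m edges. The edge list `FIG_3C` is our reconstruction from the costs quoted in the example (self-loops of cost 2, left-to-right cost 1, right-to-left cost 3) and should be read as an assumption, as should the single-vertex graph `TWO_LOOPS`. A check up to a finite length gives evidence only and is not a proof of cost-uniformity.

```python
from collections import defaultdict

# Assumed edge lists as (init, term, cost) triples.
FIG_3C = [("L", "L", 2), ("L", "R", 1), ("R", "L", 3), ("R", "R", 2)]
TWO_LOOPS = [("v", "v", 1), ("v", "v", 2)]

def path_costs(edges, m):
    """Map (start, end) -> set of costs over all paths with exactly m edges."""
    costs = defaultdict(set)
    for v in {e[0] for e in edges}:
        costs[(v, v)].add(0)                 # paths with zero edges
    for _ in range(m):
        extended = defaultdict(set)
        for (start, end), values in costs.items():
            for init, term, c in edges:
                if init == end:
                    extended[(start, term)].update(x + c for x in values)
        costs = extended
    return costs

def looks_cost_uniform(edges, max_len):
    """True if, for each length up to max_len, equal endpoints imply equal cost."""
    return all(len(values) == 1
               for m in range(1, max_len + 1)
               for values in path_costs(edges, m).values())

if __name__ == "__main__":
    print(looks_cost_uniform(FIG_3C, 6), looks_cost_uniform(TWO_LOOPS, 3))
    print(path_costs(FIG_3C, 4)[("L", "R")])   # the example predicts cost 2m - 1
```

For the reconstructed graph every path with m edges from the left to the right vertex should have cost 2m − 1, matching the example.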

Connections between cost-diversity and the spectral radius will be integral to Theorem 3.4. As we will see, the coboundary condition defined in Definition 2.5 arises in a variety of results related to the Perron-Frobenius theorem and is very useful for proving several of our results. We further need the notion of log-log-convexity, which is defined as follows.

Definition 5.7. Let $I \subseteq \mathbb{R}^+$ be an interval and $f : I \to \mathbb{R}^+$ a function on that interval. We call f(x) log-log-convex if $\log f(e^s)$ is convex in the variable s on the interval $\log I \triangleq \{\log x : x \in I\}$. Analogously, we introduce the notions of strict log-log-convexity and log-log-linearity.

With these definitions we arrive at the central statement of this section.

Lemma 5.8. Let G be a strongly connected graph. The following statements are equivalent.

(a) The graph G has cost-period c.

(b) The graph G satisfies the c-periodic coboundary condition.

(c) For all $x \in \mathbb{C}$ and $k \in \mathbb{Z}$, $\rho_G(x e^{2\pi i k/c}) = \rho_G(x)$.


[Figure 4 diagram: properties of the graph G (cost period, Definition 2.3; coboundary condition, Definition 2.5), spectral properties of $P_G(x)$ (spectral radius on the complex circle; strict log-log-convexity), and singularities of $F_G(x, y)$ (non-degeneracy; smoothness and criticality; finite minimality), linked by Lemma 5.10, Lemma 5.13, Corollary 5.15, Corollary 5.16, Lemma 5.17, Lemma 5.19, Lemma 5.20, and Lemma 5.22.]

Figure 4: Relationships between the properties of strongly connected graphs, the nature of the cost-enumerator spectrum, and the singularities of the generating function.

Further, if G is cost-uniform, then the spectral radius $\rho_G(x)$ is log-log-linear on $x \in \mathbb{R}^+$. If G is cost-diverse then the spectral radius $\rho_G(x)$ is strictly log-log-convex on $x \in \mathbb{R}^+$.

We prove Lemma 5.8 by establishing Lemma 5.10, Lemma 5.13, Lemma 5.17, Corollary 5.15, and Corollary 5.16. Figure 4 depicts the relationships between the properties.
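The dichotomy between log-log-linearity and strict log-log-convexity in Lemma 5.8 can be observed numerically by taking a second finite difference of $s \mapsto \log \rho_G(e^s)$. The Python sketch below does this for two small graphs. Both cost-enumerator matrices are assumptions made for illustration: `uniform_P` encodes our reading of the cost-uniform graph of Figure 3c (self-loops of cost 2, cross edges of cost 1 and 3), and `diverse_P` is the single-vertex graph with $P_G(x) = x + x^2$.

```python
import numpy as np

def log_rho(P_of_x, s):
    """log of the spectral radius of the cost-enumerator matrix at x = e^s."""
    eigenvalues = np.linalg.eigvals(P_of_x(np.exp(s)))
    return float(np.log(max(abs(eigenvalues))))

def second_difference(P_of_x, s, h=1e-3):
    """Finite-difference approximation of d^2/ds^2 log rho_G(e^s)."""
    return (log_rho(P_of_x, s + h) - 2 * log_rho(P_of_x, s)
            + log_rho(P_of_x, s - h)) / h ** 2

def uniform_P(x):
    # Assumed cost-uniform graph: rho_G(x) = 2 x^2, so log rho is linear in s.
    return np.array([[x ** 2, x], [x ** 3, x ** 2]])

def diverse_P(x):
    # Cost-diverse single vertex: rho_G(x) = x + x^2.
    return np.array([[x + x ** 2]])

if __name__ == "__main__":
    print(second_difference(uniform_P, 0.0))   # expected to be near 0
    print(second_difference(diverse_P, 0.0))   # expected to be strictly positive
```

For $\rho_G(x) = x + x^2$ the second derivative is $x/(1+x)^2$, which equals 1/4 at s = 0. This is the same quantity that appears as $J(x_0)$ in Theorem 3.4.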

Remark 5.9. Lemma 5.13 and Corollary 5.15 below state that for any strongly connected graph G the equation $\rho_G(x e^{i\varphi}) = \rho_G(x)$ either has a finite number of solutions $\varphi_k = 2\pi k/c$, when G is cost-diverse with cost-period c, or holds for all φ when G is cost-uniform. Because cost-period zero refers to cost-uniformity, this means statement (c) in Lemma 5.8 covers the case of invariance of the spectral radius on the complex circle. Furthermore, we see that when $\rho_G(x e^{i\varphi}) = \rho_G(x)$ has an infinite number of solutions then the set of such solutions comprises a full circle.

Equivalence of Cost-Period and Coboundary Condition

We start by proving the equivalence of the cost-period and the coboundary condition. For convenience, we will say that two integers are congruent modulo 0 if and only if they are equal. The following result is a generalization of the equivalence between the coboundary condition and cost-uniformity observed in [LHBS20b] to arbitrary cost periods c.

Lemma 5.10. Let G be a strongly connected graph. Then G has cost-period c if and only if it fulfills the c-periodic coboundary condition.

Proof. The c-periodic coboundary condition implies cost-period c: Let $\mathbf{p} = (e_1, e_2, \dots, e_m)$ be a path from state $v_i$ to state $v_j$ with path cost $\tau(\mathbf{p}) = \sum_{k=1}^{m} \tau(e_k)$. Suppose $\mathbf{p}$ is represented by the state sequence $v_i = v_{i_0} \to v_{i_1} \to \cdots \to v_{i_m} = v_j$. The coboundary condition allows the path cost to be written as

\[ \tau(\mathbf{p}) = \sum_{k=1}^{m} \left( b + B(v_{i_k}) - B(v_{i_{k-1}}) \right) + zc = mb + B(v_j) - B(v_i) + zc \]

for some integer $z \in \mathbb{Z}$. Thus, the costs of all paths of length m that connect $v_i$ and $v_j$ are congruent modulo c and, by definition, the graph G has cost-period c.

Cost-period c implies the c-periodic coboundary condition: We will start by showing that there exists $b \in \mathbb{R}$ such that the cost of any cycle $\mathbf{p}$ of length m satisfies $\tau(\mathbf{p}) \equiv bm \pmod{c}$. Let $v_1 \in V$ and let $\mathbf{p}_1$ be a cycle at state $v_1$ of length $m_1$. Such a cycle exists by strong connectivity of the graph. Suppose $\tau(\mathbf{p}_1) = t_1$. Now, let $v_2 \in V$ and let $\mathbf{p}_2$ be a cycle of length $m_2$ at state $v_2$ with cost $\tau(\mathbf{p}_2) = t_2$. Denote by $g_{12} = \gcd(m_1, m_2)$ the greatest common divisor of $m_1$ and $m_2$. Strong connectivity of G implies there is a path $\mathbf{p}_{1\to 2}$ from $v_1$ to $v_2$ with length n ≥ 1 and cost $\tau(\mathbf{p}_{1\to 2}) = t$. Define the path $\mathbf{p}$ comprising $m_2/g_{12}$ repetitions of the cycle $\mathbf{p}_1$ followed by $\mathbf{p}_{1\to 2}$, and the path $\mathbf{p}'$ comprising $\mathbf{p}_{1\to 2}$ followed by $m_1/g_{12}$ repetitions of the cycle $\mathbf{p}_2$. The paths $\mathbf{p}$ and $\mathbf{p}'$ both have length $m_1 m_2/g_{12} + n$. Since G has cost-period c, $m_2 t_1/g_{12} + t = \tau(\mathbf{p}) = \tau(\mathbf{p}') + zc = m_1 t_2/g_{12} + t + zc$ for some $z \in \mathbb{Z}$. This implies that

\[ m_2 t_1 - m_1 t_2 = g_{12}\, z c. \tag{5} \]

Next we employ a variation of the Chinese Remainder Theorem in Lemma C.1, which implies that there exists $b \in \mathbb{Q}$ such that any cycle $\mathbf{p}$ of length m in G has cost $\tau(\mathbf{p}) \equiv mb \pmod{c}$.

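Now, define a function $B : V \to \mathbb{R}$ as follows. Set $B(v_1) = 0$. For a state $v_i \neq v_1$, choose a path $\mathbf{p}_{1\to i}$ from $v_1$ to $v_i$ of length n ≥ 1, and define $B(v_i) = \tau(\mathbf{p}_{1\to i}) - nb$. We claim that $(B(v_i) \bmod c)$ is independent of the chosen path $\mathbf{p}_{1\to i}$. To see this, suppose $\mathbf{p}'_{1\to i}$ and $\mathbf{p}''_{1\to i}$ are two such paths from $v_1$ to $v_i$ of length n′ and n′′, respectively, and let $\mathbf{p}_{i\to 1}$ be a path of length p from $v_i$ to $v_1$. The cycle $\mathbf{p}' = (\mathbf{p}'_{1\to i}, \mathbf{p}_{i\to 1})$ has length n′ + p, so $\tau(\mathbf{p}') = (n' + p)b + z'c$, where $z' \in \mathbb{Z}$. Similarly, the cycle $\mathbf{p}'' = (\mathbf{p}''_{1\to i}, \mathbf{p}_{i\to 1})$ has length n′′ + p and cost $\tau(\mathbf{p}'') = (n'' + p)b + z''c$ for some $z'' \in \mathbb{Z}$. Then $\tau(\mathbf{p}'_{1\to i}) = \tau(\mathbf{p}') - \tau(\mathbf{p}_{i\to 1}) = (n' + p)b + z'c - \tau(\mathbf{p}_{i\to 1})$ and $\tau(\mathbf{p}''_{1\to i}) = \tau(\mathbf{p}'') - \tau(\mathbf{p}_{i\to 1}) = (n'' + p)b + z''c - \tau(\mathbf{p}_{i\to 1})$. It follows that

\[ \tau(\mathbf{p}_{i\to 1}) = (n' + p)b + z'c - \tau(\mathbf{p}'_{1\to i}) = (n'' + p)b + z''c - \tau(\mathbf{p}''_{1\to i}), \]

from which we conclude that

\[ \tau(\mathbf{p}'_{1\to i}) - n'b = \tau(\mathbf{p}''_{1\to i}) - n''b + (z' - z'')c. \]

This confirms that $(B(v_i) \bmod c)$ is independent of the choice of path from $v_1$ to $v_i$.

Finally, let e ∈ E be an edge from state $v_i$ to state $v_j$, and let $\mathbf{p}_{j\to 1}$ denote a path from state $v_j$ to $v_1$ of length q. Consider the cycle $\mathbf{p}_1 = (\mathbf{p}_{1\to i}, e, \mathbf{p}_{j\to 1})$, with cost $\tau(\mathbf{p}_1) = (n + 1 + q)b + z_1 c$ for some $z_1 \in \mathbb{Z}$. Noting that $\tau(\mathbf{p}_1) = \tau(\mathbf{p}_{1\to i}, e) + \tau(\mathbf{p}_{j\to 1})$, and using the fact that $(\mathbf{p}_{1\to i}, e)$ is a path of length n + 1 from $v_1$ to $v_j$, so that $\tau(\mathbf{p}_{1\to i}, e) = B(v_j) + (n + 1)b + z_j c$ for some $z_j \in \mathbb{Z}$, we find

\[ \tau(\mathbf{p}_{j\to 1}) = (n + 1 + q)b + z_1 c - \left( B(v_j) + (n + 1)b + z_j c \right) = qb - B(v_j) + (z_1 - z_j)c. \]

We can also write $\tau(\mathbf{p}_1) = \tau(\mathbf{p}_{1\to i}) + \tau(e) + \tau(\mathbf{p}_{j\to 1})$. Writing $\tau(\mathbf{p}_{1\to i}) = B(v_i) + nb + z_i c$ for some $z_i \in \mathbb{Z}$, this implies that

\begin{align*}
\tau(e) &= \tau(\mathbf{p}_1) - \left( \tau(\mathbf{p}_{1\to i}) + \tau(\mathbf{p}_{j\to 1}) \right) \\
&= (n + 1 + q)b - \left( (B(v_i) + nb) + (qb - B(v_j)) \right) + (z_j - z_i)c \\
&= b + B(v_j) - B(v_i) + (z_j - z_i)c.
\end{align*}

This confirms that the c-periodic coboundary condition holds.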
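The construction of B in the second half of the proof is essentially a graph traversal. The Python sketch below builds B(v) as the cost of a path from the root minus b times its length, reduced modulo c, and then checks the coboundary condition on every edge. It assumes integer b and c > 0 for simplicity, and the graph `EDGES` is a made-up example with cost-period 2, not one taken from the paper. The function names are illustrative.

```python
from collections import deque

# Hypothetical graph with cost-period c = 2 and b = 1: (init, term, cost).
EDGES = [(0, 0, 1), (0, 0, 3), (0, 1, 2), (1, 0, 2)]

def coboundary_potentials(edges, b, c, root):
    """B(v) = (cost of a path root -> v) - (number of edges) * b, reduced mod c."""
    B = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for init, term, cost in edges:
            if init == u and term not in B:
                # Extending the path by one edge adds its cost and subtracts b.
                B[term] = (B[u] + cost - b) % c
                queue.append(term)
    return B

def satisfies_coboundary(edges, b, c, B):
    """Check tau(e) = b + B(term) - B(init) modulo c for every edge."""
    return all((cost - (b + B[term] - B[init])) % c == 0
               for init, term, cost in edges)

if __name__ == "__main__":
    B = coboundary_potentials(EDGES, b=1, c=2, root=0)
    print(B, satisfies_coboundary(EDGES, 1, 2, B))
```

With a wrong guess for b (for instance b = 0 on this graph) the edge check fails, which mirrors the role of Lemma C.1 in fixing b before B is defined.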


Cost-Period and Spectral Properties

We next show that cost-diversity implies that there can only be a finite number of solutions to $\rho_G(x e^{i\varphi}) = \rho_G(x)$ over 0 ≤ φ < 2π. This property is vital as it implies that the minimal singularities of our generating functions will be finitely minimal. We start with an auxiliary result on the structure of the cost-enumerator matrix.

Lemma 5.11. Let G be a strongly connected graph with cost-period c. Then, for all $x \in \mathbb{C}$, $k \in \mathbb{Z}$,

\[ P_G\!\left(x e^{2\pi i k/c}\right) = e^{2\pi i k b/c}\, D_k^{-1} P_G(x) D_k, \]

where $D_k$ is the diagonal matrix with entries $[D_k]_{ii} = e^{2\pi i k B(v_i)/c}$ and b and $B(v_i)$ are defined by the coboundary condition. Denoting by $\lambda_1(x), \dots, \lambda_{|V|}(x)$ the eigenvalues of $P_G(x)$, it holds that

\[ \lambda_j\!\left(x e^{2\pi i k/c}\right) = e^{2\pi i k b/c}\, \lambda_j(x). \]

Proof. The graph $G$ satisfies the $c$-periodic coboundary condition by Lemma 5.10. Hence, there exist a constant $b$ and a function $B : V \to \mathbb{R}$ such that for any two vertices $v_i$ and $v_j$, each edge $e$ from $v_i$ to $v_j$ has cost $\tau(e)$, which can be written as

$$\tau(e) = b + B(v_j) - B(v_i) + z_e c,$$

for an integer $z_e$. Abbreviate $\phi_k \triangleq 2\pi k/c$. Since $e^{i\phi_k z_e c} = e^{2\pi i k z_e} = 1$, the entries of the cost-enumerator matrix are given by

$$\big[P_G\big(xe^{i\phi_k}\big)\big]_{ij} = \sum_{e \,:\, v_i \to v_j} x^{\tau(e)} e^{i\phi_k \tau(e)} = [P_G(x)]_{ij}\, e^{i\phi_k (b + B(v_j) - B(v_i))},$$

where the sum runs over all edges $e$ from $v_i$ to $v_j$. Introducing the diagonal matrix $D_k$ with entries $[D_k]_{ii} = e^{i\phi_k B(v_i)}$, we can decompose the cost-enumerator matrix as

$$P_G\big(xe^{i\phi_k}\big) = e^{i\phi_k b} D_k^{-1} P_G(x) D_k.$$

The second part of the lemma follows directly from the similarity of the matrices $P_G(xe^{i\phi_k})$ and $e^{i\phi_k b} P_G(x)$ proven in the first part of the lemma.
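The identity of Lemma 5.11 can be checked numerically on a small example. The following is a minimal sketch, assuming a hypothetical two-vertex toy graph (not taken from the paper) whose edge costs satisfy the $c$-periodic coboundary condition with $b = 1$, $B = (0, 2)$ and $c = 3$; the names `EDGES`, `cost_enumerator`, `rotated_by_lemma` and `max_deviation` are illustrative.

```python
import cmath

# Hypothetical toy graph: edges are (source, target, cost).
# Every cost has the form b + B[target] - B[source] + z_e * c with b = 1, B = (0, 2), c = 3.
EDGES = [(0, 0, 1), (0, 0, 4), (0, 1, 3), (1, 0, 2), (1, 1, 1)]
b, B, c = 1, (0, 2), 3
N = 2


def cost_enumerator(x):
    """Return P_G(x): entry (i, j) sums x**cost over all edges from i to j."""
    P = [[0j] * N for _ in range(N)]
    for i, j, cost in EDGES:
        P[i][j] += x ** cost
    return P


def rotated_by_lemma(x, k):
    """Entries of e^{i phi_k b} D_k^{-1} P_G(x) D_k with phi_k = 2 pi k / c."""
    phi = 2 * cmath.pi * k / c
    P = cost_enumerator(x)
    return [[cmath.exp(1j * phi * (b + B[j] - B[i])) * P[i][j] for j in range(N)]
            for i in range(N)]


def max_deviation(x, k):
    """Largest entrywise gap between P_G(x e^{i phi_k}) and the right-hand side above."""
    phi = 2 * cmath.pi * k / c
    lhs = cost_enumerator(x * cmath.exp(1j * phi))
    rhs = rotated_by_lemma(x, k)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(N) for j in range(N))
```

For any complex `x`, `max_deviation(x, k)` should be at the level of floating-point rounding for every integer `k`, since the factor $e^{i\phi_k z_e c}$ equals one for each edge.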

We will use this property to prove that the Eigenvalues of $P_G(x)$ have a very special structure when $x$ varies over the complex circle. We continue with another auxiliary result that will serve to prove the subsequent result on the spectral structure of $P_G(x)$ on the complex circle.

Lemma 5.12. Let $G$ be a strongly connected and cost-diverse graph with largest cost-period $c$. Then, there exist two equal-length cycles at the same state whose cost difference is precisely $c$.

Proof. For convenience, for a path $p$ we denote by $\mathrm{init}(p)$ the initial vertex of its first edge and by $\mathrm{term}(p)$ the terminal vertex of its last edge. Recall the definition of the cost-period $c$. Since $c$ is the largest cost-period, Definition 2.3 guarantees the existence of $\eta \in \mathbb{N}$ pairs of paths $p_i, p'_i$, $i \in [\eta]$, both of length $m_i$, where for each $i$, $p_i$ and $p'_i$ start in the same vertex $v_i \triangleq \mathrm{init}(p_i) = \mathrm{init}(p'_i)$ and end in the same vertex $u_i \triangleq \mathrm{term}(p_i) = \mathrm{term}(p'_i)$, such that the greatest common divisor of their cost differences $\tau(p_i) - \tau(p'_i)$ is $c$. Hence, by Bézout's identity, there exist (possibly negative) integers $z_i \in \mathbb{Z}$ such that

$$\sum_{i=1}^{\eta} \big(\tau(p_i) - \tau(p'_i)\big) z_i = c.$$


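For each $i$, choose an arbitrary path $p_{u_i \to v_i}$ that connects $u_i$ to $v_i$ and construct two cycles $\Gamma_i = (p_i, p_{u_i \to v_i})$ and $\Delta_i = (p'_i, p_{u_i \to v_i})$ that share the same return path $p_{u_i \to v_i}$ from $u_i$ to $v_i$. Further choose arbitrary paths $q_i$ connecting $v_i$ to $v_{i+1}$ for $1 \le i < \eta$, and $q_\eta$ connecting $v_\eta$ to $v_1$. Now, set $\mu_i = \max\{z_i, 0\}$ and $\mu'_i = \mu_i - z_i$, denote by $\Gamma_i^{\mu}$, $\mu \in \mathbb{N}_0$, the $\mu$-fold repetition of the cycle $\Gamma_i$, and construct two large cycles $\Gamma$ and $\Delta$ by

$$\Gamma = \big(\Gamma_1^{\mu_1}, \Delta_1^{\mu'_1}, q_1, \Gamma_2^{\mu_2}, \Delta_2^{\mu'_2}, q_2, \dots, q_\eta\big),$$

$$\Delta = \big(\Gamma_1^{\mu'_1}, \Delta_1^{\mu_1}, q_1, \Gamma_2^{\mu'_2}, \Delta_2^{\mu_2}, q_2, \dots, q_\eta\big).$$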

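That is, $\Gamma$ starts at $v_1$, circles $\mu_1$ times along $\Gamma_1$, then $\mu'_1$ times along $\Delta_1$, and then proceeds along $q_1$ to $v_2$. There it circles $\mu_2$ times along $\Gamma_2$ and $\mu'_2$ times along $\Delta_2$, and so on, until it moves back from $v_\eta$ to $v_1$ along $q_\eta$. The cycle $\Delta$ is created similarly, with the roles of $\mu_i$ and $\mu'_i$ exchanged. Notice that $\mu_i$ and $\mu'_i$ are guaranteed to be non-negative by their definitions, and that $\Gamma$ and $\Delta$ have the same length because $\Gamma_i$ and $\Delta_i$ have the same length for each $i$. Computing the cost difference of $\Gamma$ and $\Delta$, the contributions of the shared return paths and of the paths $q_i$ cancel, and one obtains

$$\tau(\Gamma) - \tau(\Delta) = \sum_{i=1}^{\eta} \big(\mu_i \tau(p_i) + \mu'_i \tau(p'_i)\big) - \sum_{i=1}^{\eta} \big(\mu'_i \tau(p_i) + \mu_i \tau(p'_i)\big) = \sum_{i=1}^{\eta} \big(\tau(p_i) - \tau(p'_i)\big) z_i = c.$$

Hence, there exist two cycles at the vertex $v_1$ of the same length $m$ whose cost difference is precisely $c$.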
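The arithmetic behind this construction can be sketched in a few lines. The following is an illustrative sketch only, assuming the cost differences $\tau(p_i) - \tau(p'_i)$ are given as a list of integers; the function names `extended_gcd`, `bezout_coefficients` and `repetition_counts` are hypothetical. It computes Bézout coefficients $z_i$ and the repetition counts $\mu_i = \max\{z_i, 0\}$, $\mu'_i = \mu_i - z_i$ used in the proof.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with s * a + t * b == g and |g| == gcd(a, b)."""
    old_r, r, old_s, s, old_t, t = a, b, 1, 0, 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def bezout_coefficients(diffs):
    """Return (c, z) with sum(z[i] * diffs[i]) == c, where c > 0 is the gcd of diffs."""
    g, z = 0, []
    for d in diffs:
        g, s, t = extended_gcd(g, d)
        # Rescale the coefficients found so far and append the new one.
        z = [s * zi for zi in z] + [t]
    if g < 0:
        g, z = -g, [-zi for zi in z]
    return g, z


def repetition_counts(z):
    """Return (mu, mu_prime) with mu_i = max(z_i, 0) and mu'_i = mu_i - z_i >= 0."""
    mu = [max(zi, 0) for zi in z]
    mu_prime = [m - zi for m, zi in zip(mu, z)]
    return mu, mu_prime
```

Since $\mu_i - \mu'_i = z_i$, repeating $\Gamma_i$ and $\Delta_i$ with these counts yields the cost difference $\sum_i z_i (\tau(p_i) - \tau(p'_i)) = c$, while both large cycles use $\mu_i + \mu'_i$ repetitions at vertex $v_i$ and hence have equal length.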

Lemma 5.11 and Lemma 5.12 can be combined to prove the following result on the structure of the spectral radius on the complex circle.

Lemma 5.13. Let $G$ be a strongly connected and cost-diverse graph with largest cost-period $c$. Then, for any $x \in \mathbb{R}^+$, there are precisely $c$ solutions $\phi_k = 2\pi k/c$, $k \in \{0, 1, \dots, c-1\}$, to the equation $\rho_G(xe^{i\phi}) = \rho_G(x)$ in the interval $0 \le \phi < 2\pi$. For all other $\phi$, $\rho_G(xe^{i\phi}) < \rho_G(x)$.

Proof. By Lemma 5.11, for all $k \in \{0, 1, \dots, c-1\}$ and $j \in [|V|]$, we have $\lambda_j(xe^{i\phi_k}) = e^{i\phi_k b}\lambda_j(x)$, which implies that $\rho_G(xe^{i\phi_k}) = \rho_G(x)$.

We proceed with proving that for all other values of $\phi$ the spectral radius $\rho_G(xe^{i\phi})$ is strictly less than $\rho_G(x)$. We start with the observation that for any $\phi \in \mathbb{R}$, we have

$$\big|[P_G(xe^{i\phi})]_{ij}\big| = \Big|\sum_{e \,:\, v_i \to v_j} (xe^{i\phi})^{\tau(e)}\Big| \le \sum_{e \,:\, v_i \to v_j} x^{\tau(e)} = [P_G(x)]_{ij},$$

where the sums run over all edges $e$ from $v_i$ to $v_j$. By Wielandt's theorem [Mey00, Sec. 8.3], it follows that the spectral radius satisfies $\rho_G(xe^{i\phi}) \le \rho_G(x)$ with equality if and only if there exist $\theta, \theta_1, \theta_2, \dots, \theta_{|V|}$ such that

$$P_G(xe^{i\phi}) = e^{i\theta} D^{-1} P_G(x) D,$$

where $D$ is a diagonal matrix with entries $[D]_{jj} = e^{i\theta_j}$. It therefore suffices to prove that this equality cannot be fulfilled for any $0 \le \phi < 2\pi$ that is not equal to some $\phi_k$. Raising the above equation to the power $m \in \mathbb{N}$, it follows that

$$P_G^m(xe^{i\phi}) = e^{im\theta} D^{-1} P_G^m(x) D.$$

In particular, the entry $(i, j)$ of this equation reads

$$[P_G^m(xe^{i\phi})]_{ij} = e^{i(m\theta + \theta_j - \theta_i)} [P_G^m(x)]_{ij}$$


and it follows that

$$\big|[P_G^m(xe^{i\phi})]_{ij}\big| = \big|[P_G^m(x)]_{ij}\big| = [P_G^m(x)]_{ij}.$$

Denote now by $\mathcal{P}_{ij}(m) = \{p = (e_1, \dots, e_m) : \mathrm{init}(e_1) = v_i, \mathrm{term}(e_m) = v_j\}$ the set of paths of length $m$ from $v_i$ to $v_j$. It is well-known [KMR00] that $[P_G^m(x)]_{ij} = \sum_{p \in \mathcal{P}_{ij}(m)} x^{\tau(p)}$. By Lemma 5.12, for a graph with largest cost-period $c$ there exist a length $m$ and a vertex $v_i$ such that there are two cycles of length $m$ at $v_i$ whose costs differ by exactly $c$. In the following, let $m$ and $v_i$ fulfill this property and denote by $\tau$ and $\tau + c$ the costs of the two cycles of length $m$ at $v_i$. Thus, the polynomial $[P_G^m(x)]_{ii}$ contains at least the two monomials $x^{\tau}$ and $x^{\tau + c}$, each with a positive integer coefficient. Now, recall that the triangle inequality for a sum of complex numbers is tight if and only if the complex angles of all summands agree. At the point $xe^{i\phi}$, the angles of these two monomials differ by $\phi c$. Therefore, if $\phi c$ is not an integer multiple of $2\pi$, then $|[P_G^m(xe^{i\phi})]_{ii}| < [P_G^m(x)]_{ii}$ and the claim follows.
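The last step of the proof rests on the tightness condition of the triangle inequality for the two monomials $x^{\tau}$ and $x^{\tau + c}$. A minimal numerical sketch of this step, with hypothetical helper names and arbitrary illustrative values for $x$, $\tau$ and $c$, is:

```python
import cmath


def two_term_modulus(x, phi, tau, c):
    """Modulus of z**tau + z**(tau + c) at z = x * e^{i phi}, for real x > 0."""
    z = x * cmath.exp(1j * phi)
    return abs(z ** tau + z ** (tau + c))


def two_term_real_value(x, tau, c):
    """Value of x**tau + x**(tau + c), the upper bound from the triangle inequality."""
    return x ** tau + x ** (tau + c)
```

For $\phi = 2\pi k/c$ the two summands have the same complex angle and the modulus matches the real value up to rounding; for any other $\phi$ the modulus is strictly smaller.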

Conversely, also for cost-uniform graphs, the Eigenvalues of $P_G(x)$ have a special structure that can be derived explicitly. The following lemma proves this fact.

Lemma 5.14. Let $G = (V, E, \sigma, \tau)$ be a strongly connected graph that satisfies the coboundary condition. Then, for all $j \in [|V|]$ and $x \in \mathbb{C}$, the Eigenvalues $\lambda_1(x), \dots, \lambda_{|V|}(x)$ of $P_G(x)$ are

$$\lambda_j(x) = \lambda_j(1)\, x^b,$$

where $b$ is the constant of the coboundary condition.

Proof. The graph $G$ satisfies the coboundary condition by assumption. Hence, there exist a constant $b$ and a function $B : V \to \mathbb{R}$ such that for any two vertices $v_i$ and $v_j$, each edge from $v_i$ to $v_j$ has cost $\tau_{ij}$, which can be written as

$$\tau_{ij} = b + B(v_j) - B(v_i).$$

Further, the number of edges from $v_i$ to $v_j$ is precisely $[P_G(1)]_{ij}$. It follows that each entry $[P_G(x)]_{ij}$ of $P_G(x)$ is equal to

$$[P_G(x)]_{ij} = [P_G(1)]_{ij}\, x^{\tau_{ij}} = [P_G(1)]_{ij}\, x^{b + B(v_j) - B(v_i)}.$$

Introducing the diagonal matrix $D(x)$ with entries $[D(x)]_{ii} = x^{B(v_i)}$, we can decompose the cost-enumerator matrix as

$$P_G(x) = x^b D^{-1}(x) P_G(1) D(x).$$

Thus, the characteristic polynomial $\phi(\lambda, x)$ of the cost-enumerator matrix $P_G(x)$ becomes

$$\begin{aligned}
\det(\lambda I - P_G(x)) &= \det\big(\lambda I - x^b D^{-1}(x) P_G(1) D(x)\big) \\
&= \det(D(x)) \det\big(\lambda I - x^b D^{-1}(x) P_G(1) D(x)\big) \det(D^{-1}(x)) \\
&\overset{(a)}{=} \det\big(\lambda I - x^b P_G(1)\big),
\end{aligned}$$

where in (a) we used the multiplicativity of the determinant. Thus, $\phi(\lambda, x) = x^{b|V|} \phi(x^{-b}\lambda, 1)$. Since the Eigenvalues of $P_G(x)$ are precisely the roots of the characteristic polynomial, we can identify $\lambda_j(x)$ as the roots of $\phi(\lambda, x)$ and $\lambda_j(1)$ as the roots of $\phi(\lambda, 1)$. By a variable substitution, it follows that $\lambda_j(x) = \lambda_j(1) x^b$ for all $j \in [|V|]$ and $x \in \mathbb{C}$.
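As a quick sanity check of the monomial structure, consider a hypothetical two-vertex graph (an assumption for illustration, not an example from the paper) whose edge costs satisfy the coboundary condition with $b = 2$ and $B = (0, 1)$. The sketch below uses the closed form for the largest Eigenvalue of a non-negative $2 \times 2$ matrix; the names `EDGES`, `cost_enumerator` and `perron_root_2x2` are illustrative.

```python
import math

# Hypothetical cost-uniform toy graph: (source, target, cost) with
# cost = b + B[target] - B[source], where b = 2 and B = (0, 1).
EDGES = [(0, 0, 2), (0, 1, 3), (1, 0, 1), (1, 1, 2)]
COBOUNDARY_B = 2


def cost_enumerator(x):
    """Return the 2x2 matrix P_G(x) for real x > 0."""
    P = [[0.0, 0.0], [0.0, 0.0]]
    for i, j, cost in EDGES:
        P[i][j] += x ** cost
    return P


def perron_root_2x2(P):
    """Largest Eigenvalue of a non-negative 2x2 matrix via the quadratic formula."""
    (a, b_), (c_, d) = P
    return 0.5 * (a + d + math.sqrt((a - d) ** 2 + 4.0 * b_ * c_))
```

For this toy graph one expects `perron_root_2x2(cost_enumerator(x))` to agree with `perron_root_2x2(cost_enumerator(1.0)) * x ** COBOUNDARY_B` for every real `x > 0`, which is the statement $\lambda_j(x) = \lambda_j(1) x^b$ specialized to the spectral radius.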


Lemma 5.14 illustrates that the Eigenvalues of cost-uniform graphs have the very special structure of being monomials in $x$. All Eigenvalues share the same exponent $b$ from the coboundary condition and their coefficient is given by the corresponding Eigenvalue of the matrix $P_G(1)$.

This puts us in the position to prove the converse to Lemma 5.13. That is, we can show that if a graph is cost-uniform, or equivalently, satisfies the coboundary condition, then the spectral radius is invariant on the complex circle.

Corollary 5.15. Let $G$ be a strongly connected graph that satisfies the coboundary condition. Then, for any $x \in \mathbb{C}$, $\rho_G(xe^{i\phi}) = \rho_G(x)$ for all $0 \le \phi < 2\pi$.

Proof. The corollary follows directly from Lemma 5.14, using that the coboundary condition implies that for all $x \in \mathbb{C}$, we have $\rho_G(xe^{i\phi}) = \rho_G(1) \cdot |xe^{i\phi}|^b = \rho_G(1) \cdot |x|^b = \rho_G(x)$.

Cost-Diversity and Strict Log-Log-Convexity

We next analyze the spectral radius $\rho_G(x)$. First, we show that the spectral radius is log-log-linear on the real axis when the graph is cost-uniform, which will be part of proving Theorem 3.6.

Corollary 5.16. Let $G$ be a strongly connected graph. If $G$ is cost-uniform, then $\rho_G(x)$ is log-log-linear on the interval $x \in \mathbb{R}^+$.

Proof. By Lemma 5.14, for all real-valued $x = e^s > 0$, we have $\log \rho_G(e^s) = \log(\rho_G(1)) + bs$, which is a linear function in $s$.

We turn towards proving the converse to the previous corollary, i.e., we show that if the graph $G$ is cost-diverse, then $\rho_G(x)$ is strictly log-log-convex.

Lemma 5.17. If $G$ is a strongly connected and cost-diverse graph, then $\rho_G(x)$ is strictly log-log-convex for all $x > 0$.

The proof will make use of the following lemma [SWB06, Theorem 1.37].

Lemma 5.18. [SWB06, Theorem 1.37] Let $P(s)$, $s \in \mathbb{R}$, be an irreducible matrix such that all non-zero entries are log-convex functions of $s$. Then the Perron-Frobenius Eigenvalue $\rho(s)$ of $P(s)$ is log-convex. If, additionally, at least one entry of $P(s)$ is strictly log-convex, then $\rho(s)$ is strictly log-convex.

Proof of Lemma 5.17. Consider the $m$-th power $P_G^m(x)$ of $P_G(x)$. We know by [KMR00] that, denoting by $\mathcal{P}_{ij}(m)$ the set of paths of length $m$ from $v_i$ to $v_j$, the entry $(i, j)$ of the matrix $P_G^m(x)$ is given by $[P_G^m(x)]_{ij} = \sum_{p \in \mathcal{P}_{ij}(m)} x^{\tau(p)}$. We will show that this entry is strictly log-log-convex if there exist two paths of length $m$ from $v_i$ to $v_j$ with different costs. Taking the second derivative of the log-log expression, we obtain

$$\frac{\partial^2}{\partial s^2} \log\big([P_G^m(e^s)]_{ij}\big) = \frac{\sum_p e^{s\tau(p)} \sum_p \tau(p)^2 e^{s\tau(p)} - \big(\sum_p \tau(p) e^{s\tau(p)}\big)^2}{\big([P_G^m(e^s)]_{ij}\big)^2}.$$

Identifying the vectors $u = (e^{s\tau(p)/2} : p \in \mathcal{P}_{ij}(m))$ and $v = (\tau(p) e^{s\tau(p)/2} : p \in \mathcal{P}_{ij}(m))$ that both have length $|\mathcal{P}_{ij}(m)|$, the numerator is equal to $(u \cdot u)(v \cdot v) - (u \cdot v)^2$, where $u \cdot v$ denotes the inner product of the vectors $u$ and $v$. The numerator is therefore non-negative by the Cauchy-Schwarz inequality, and thus the entries $[P_G^m(e^s)]_{ij}$ are either $0$ or positive and log-convex in $s$. Further, due to the cost-diversity of the graph $G$, there exist $m, i, j$ such that there are two paths of length $m$ from $v_i$ to $v_j$ with different costs, and hence $u$ and $v$ are linearly independent. In this case, the Cauchy-Schwarz inequality holds with strict inequality and thus the numerator is positive, which implies that $[P_G^m(e^s)]_{ij}$ is strictly log-convex in $s$.

The Perron-Frobenius Eigenvalue of $P_G^m(e^s)$ is given by $\rho_G^m(e^s)$. With Lemma 5.18 it follows that $\rho_G^m(e^s)$ is strictly log-convex. Since taking a positive integer power only scales the logarithm by a positive constant and therefore does not change log-convexity, $\rho_G(e^s)$ is also strictly log-convex. By definition, $\rho_G(x)$ is thus strictly log-log-convex.
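Strict log-log-convexity can be observed numerically on a small cost-diverse example. The sketch below assumes a hypothetical two-vertex graph with cost-enumerator matrix $P_G(x) = \begin{pmatrix} x & x^2 \\ x & 0 \end{pmatrix}$ (at the first vertex, the length-2 cycles have costs 2 and 3), and approximates the second derivative of $\log \rho_G(e^s)$ by a central second difference; the function names are illustrative.

```python
import math


def perron_root(x):
    """Spectral radius of the toy matrix [[x, x^2], [x, 0]] for real x > 0."""
    a, b, c, d = x, x ** 2, x, 0.0
    # Closed form for the largest Eigenvalue of a non-negative 2x2 matrix.
    return 0.5 * (a + d + math.sqrt((a - d) ** 2 + 4.0 * b * c))


def log_rho(s):
    """The function s -> log rho_G(e^s) whose convexity is studied."""
    return math.log(perron_root(math.exp(s)))


def second_difference(s, h=0.1):
    """Central second difference of log_rho; positive values indicate convexity near s."""
    return log_rho(s + h) - 2.0 * log_rho(s) + log_rho(s - h)
```

On this example the second difference stays strictly positive over a range of $s$, in line with Lemma 5.17; for a cost-uniform graph the same quantity would vanish up to rounding, as in Corollary 5.16.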

5.4 Singularity and Critical Point Analysis

The main challenge in proving Theorem 3.4 is showing that the prerequisites of Propositions 5.4 and 5.5 are fulfilled. We establish the necessary conditions through a careful study of the singularities of our generating functions

$$F_G(x, y) = \frac{1}{1 - x} \cdot (I - yP_G(x))^{-1} (1, \dots, 1)^T.$$

Note that we can write $F_G(x, y) = Q_G(x, y)/H_G(x, y)$ for a polynomial vector $Q_G(x, y)$ and the polynomial $H_G(x, y) = (1 - x)\det(I - yP_G(x))$. In particular, every coordinate of $F_G(x, y)$ is a rational function with the same denominator, so notions like minimal and critical points can be considered as properties of $F_G(x, y)$ instead of just for each coordinate. We always work with respect to the direction $(\alpha_1, \alpha_2) = (1, \alpha)$, and this direction is assumed when discussing notions like critical points and non-degeneracy.

Lemma 5.19. Let $G$ be a strongly connected and cost-diverse graph with largest period $d$ and largest cost-period $c$. The points

$$\{(x_0, 1/\rho_G(x_0)) : 0 < x_0 < 1\} \cup \{(1, y_0) : y_0 \in \mathbb{C}, |y_0| \le 1/\rho_G(1)\}$$

are minimal singularities of each coordinate of $F_G(x, y)$. All other minimal singularities have the form

$$\big(x_0 e^{2\pi i k/c},\; e^{-2\pi i (kb/c + j/d)}/\rho_G(x_0)\big)$$

for some $0 < x_0 \le 1$, $k \in \{0, 1, \dots, c-1\}$, and $j \in \{0, 1, \dots, d-1\}$, where $b$ is the constant of the $c$-periodic coboundary condition.

Proof. The singularities of the coordinates of $F_G(x, y)$ are a subset of the solutions to $(1 - x)\det(I - yP_G(x)) = 0$, and any root of the denominator where the numerator does not vanish is a singularity. Using that $\det(I - yP_G(x)) = \prod_j (1 - y\lambda_j(x))$, where $\lambda_j(x)$ are the Eigenvalues of $P_G(x)$, the singularities of $F_G$ are thus a subset of the variety

$$\mathcal{X} = \{(x, 1/\lambda_j(x)) : x \in \mathbb{C}, 1 \le j \le |V|\} \cup \{(1, y) : y \in \mathbb{C}\}.$$

We start by investigating the first set of singularities. Right away, we see that for all $x \in \mathbb{C}$ with $|x| > 1$ the singularities $(x, 1/\lambda_j(x))$ cannot be minimal, since there exists $y \in \mathbb{C}$ such that $(1, y)$ has coordinate-wise smaller modulus than $(x, 1/\lambda_j(x))$. We thus focus on those singularities with $0 < |x| \le 1$. Since the graph $G$ is strongly connected, $P_G(x_0)$ is irreducible for all $x_0 \in \mathbb{R}^+$ and thus, by the Perron-Frobenius Theorem, has a single real Eigenvalue $\rho_G(x_0)$ of maximum modulus. In the following we identify the Perron-Frobenius Eigenvalue as the first Eigenvalue $\rho_G(x_0) = \lambda_1(x_0)$.

We now show that for all $0 < x_0 \le 1$ the points $(x_0, 1/\rho_G(x_0))$ are minimal singularities. To begin, the numerator of $F_G$ at this point can be expressed as

$$Q_G(x_0, 1/\rho_G(x_0)) = \mathrm{adj}\Big(I - \frac{P_G(x_0)}{\rho_G(x_0)}\Big) \mathbf{1}^T = \rho_G(x_0)^{1 - |V|}\, \mathrm{adj}\big(\rho_G(x_0) I - P_G(x_0)\big) \mathbf{1}^T,$$

so an application of Lemma A.2 shows that the numerator is non-zero, as $\mathrm{adj}(\rho_G(x_0) I - P_G(x_0))$ is either all-positive or all-negative. In particular, these points are singularities of each coordinate and it remains to show minimality. We prove minimality using Proposition 5.4 of [Mel21], which states that a singularity $(x_0, 1/\rho_G(x_0))$ with positive coordinates is minimal if and only if $H_G(tx_0, t/\rho_G(x_0))$ is non-zero for all $0 < t < 1$. The term $(1 - tx_0)$ does not vanish for $0 < t < 1$, so if $H_G(tx_0, t/\rho_G(x_0)) = 0$ then $t/\rho_G(x_0) = 1/\lambda_j(tx_0)$ for some $0 < t < 1$ and $j \ge 2$. We find that

$$t/\rho_G(x_0) < 1/\rho_G(x_0) \overset{(a)}{\le} 1/\rho_G(tx_0) \le |1/\lambda_j(tx_0)|,$$

where inequality (a) uses that each entry of $P_G(x_0)$ is monotonically increasing in $x_0$ and thus $\rho_G(x_0)$ is also monotonically increasing in $x_0$. Hence $H_G(tx_0, t/\rho_G(x_0))$ does not vanish on $0 < t < 1$ and it follows that any point $(x_0, 1/\rho_G(x_0))$ with $0 < x_0 < 1$ is a minimal singularity.

We next prove that the only other minimal singularities in $\{(x, 1/\lambda_j(x)) : x \in \mathbb{C}, 1 \le j \le |V|\}$ are as given in the statement of the lemma. To start with, by an extension of the Perron-Frobenius Theorem for periodic graphs [MRS01, Thm. 3.18], for each $0 < x_0 \le 1$ there are precisely $d$ simple Eigenvalues $\lambda_1(x_0), \dots, \lambda_d(x_0)$ with the same modulus as the spectral radius and they are given by

$$\lambda_j(x_0) = \rho_G(x_0) e^{2\pi i (j-1)/d}.$$

Due to the similarity of $P_G(x)$ and $P_G(xe^{i\phi_k})$ for all $\phi_k = 2\pi k/c$ and $k \in \{0, 1, \dots, c-1\}$, which was derived in Lemma 5.11, the Eigenvalues of $P_G(x_0 e^{i\phi_k})$ are given by $\lambda_j(x_0 e^{i\phi_k}) = e^{i\phi_k b}\lambda_j(x_0)$. Therefore, for each $j$ and $k$ we obtain one candidate for a minimal singularity,

$$\big(x_0 e^{i\phi_k},\; e^{-i(\phi_k b + 2\pi (j-1)/d)}/\rho_G(x_0)\big).$$

For all other $\phi$ that are not integer multiples of $2\pi/c$, the singularities $(x_0 e^{i\phi}, 1/\lambda_j(x_0 e^{i\phi}))$ are not minimal, as $\rho_G(x_0 e^{i\phi}) < \rho_G(x_0)$ in this case by Lemma 5.13. Furthermore, all other Eigenvalues $\lambda_j(x)$ with $j > d$ have $|\lambda_j(x)| < \rho_G(x)$, which implies that the corresponding singularities cannot be minimal.

Finally, we study the singularities in $\{(1, y) : y \in \mathbb{C}\}$. All points $(1, y_0)$, $y_0 \in \mathbb{C}$, $|y_0| \le 1/\rho_G(1)$ are singularities, since the matrix $I - y_0 P_G(1)$ is invertible. Furthermore, these singularities are minimal due to the fact that $(1, 1/\rho_G(1))$ is minimal as proven above. Conversely, for $|y_0| > 1/\rho_G(1)$ the points $(1, y_0)$ are not minimal due to the existence of the singularities $(x_0, 1/\rho_G(x_0))$.

Lemma 5.20. Let $G$ be a strongly connected and cost-diverse graph with period $d$ and cost-period $c$. For all $x_0 \in \mathbb{R}^+$ with $x_0 \ne 1$ and all $k \in \{0, 1, \dots, c-1\}$, $j \in \{0, 1, \dots, d-1\}$, the points

$$\big(x_0 e^{2\pi i k/c},\; e^{-2\pi i (kb/c + j/d)}/\rho_G(x_0)\big)$$

are smooth points of $F_G(x, y)$ and critical if and only if $\alpha x_0 \rho'_G(x_0) = \rho_G(x_0)$. Any point $(1, y_0)$ with $y_0 \in \mathbb{C}$ and $|y_0| < 1/\rho_G(1)$ is not a root of $\det(I - yP_G(x))$ and thus is a smooth point that is never critical.


Proof. Abbreviate for convenience $\phi_k \triangleq 2\pi k/c$ and $\theta_j \triangleq 2\pi j/d$. We start by verifying that for all $x_0 \in \mathbb{R}^+$ with $x_0 \ne 1$ and $k \in \{0, 1, \dots, c-1\}$, $j \in \{0, 1, \dots, d-1\}$, the points $(x_0 e^{i\phi_k}, e^{-i(\phi_k b + \theta_j)}/\rho_G(x_0))$ are smooth. By Jacobi's formula, the partial derivative in direction $y$ satisfies

$$\frac{\partial H_G(x, y)}{\partial y} = -(1 - x)\, \mathrm{tr}\big(\mathrm{adj}(I - yP_G(x)) P_G(x)\big).$$

For the rest of this proof we write $\lambda_j(x_0 e^{i\phi_k})$ for the $d$ maximum modulus Eigenvalues of $P_G(x_0 e^{i\phi_k})$ that satisfy $\lambda_j(x_0 e^{i\phi_k}) = e^{i(\phi_k b + \theta_j)}\lambda_1(x_0)$, where $\lambda_1(x_0) = \rho_G(x_0)$ is the Perron root of $P_G(x_0)$, according to [MRS01, Thm. 3.18] and Lemma 5.11. The corresponding normalized Eigenvectors are $u_j(x_0 e^{i\phi_k})$ and $v_j(x_0 e^{i\phi_k})$, and plugging in the points of interest, we obtain

$$\frac{\partial H_G(x, y)}{\partial y}\bigg|_{x = x_0 e^{i\phi_k},\, y = 1/\lambda_j(x_0 e^{i\phi_k})} = -(1 - x_0 e^{i\phi_k})\, \mathrm{tr}\Big(\mathrm{adj}\big(I - P_G(x_0 e^{i\phi_k})/\lambda_j(x_0 e^{i\phi_k})\big) P_G(x_0 e^{i\phi_k})\Big).$$

Here we can use Lemma 5.11 to simplify the cost-enumerator matrix and Lemma A.2 to find an explicit representation of the adjoint matrix, simplifying the above expression to

$$\begin{aligned}
&- c_j(x_0)(1 - x_0 e^{i\phi_k})(\lambda_j(x_0))^{1 - |V|}\, \mathrm{tr}\big(e^{i\phi_k b} v_j(x_0) P_G(x_0) u_j^T(x_0)\big) \\
&\quad = - c_j(x_0)(1 - x_0 e^{i\phi_k})(\lambda_j(x_0))^{1 - |V|} \rho_G(x_0) e^{i(\phi_k b + \theta_j)},
\end{aligned}$$

where $c_j(x_0) \in \mathbb{R} \setminus \{0\}$ is a non-zero constant. This expression is non-zero for all $x_0 \in \mathbb{R}^+$ with $x_0 \ne 1$ and $k \in \{0, 1, \dots, c-1\}$, $j \in \{0, 1, \dots, d-1\}$, and thus the points are smooth.

We now examine when these minimal points are solutions of the critical point equations

$$\alpha x \frac{\partial H_G(x, y)}{\partial x} = y \frac{\partial H_G(x, y)}{\partial y}.$$

The partial derivative of the denominator with respect to $x$ is given by

$$\frac{\partial H_G(x, y)}{\partial x} = -\det(I - yP_G(x)) - (1 - x)\, y\, \mathrm{tr}\Big(\mathrm{adj}(I - yP_G(x)) \frac{\partial P_G(x)}{\partial x}\Big).$$

Evaluating this partial derivative at the points $(x_0 e^{i\phi_k}, 1/\lambda_j(x_0 e^{i\phi_k}))$, we obtain

$$\frac{\partial H_G(x, y)}{\partial x}\bigg|_{x = x_0 e^{i\phi_k},\, y = 1/\lambda_j(x_0 e^{i\phi_k})} = -(1 - x_0 e^{i\phi_k})\, \mathrm{tr}\Big(\mathrm{adj}\big(I - P_G(x_0 e^{i\phi_k})/\lambda_j(x_0 e^{i\phi_k})\big) P'_G(x_0 e^{i\phi_k})\Big),$$

where $P'_G(x)$ is the partial derivative of the cost-enumerator matrix with respect to $x$. Here we use that $\det(I - yP_G(x))$ evaluated at these points is $0$, as $\lambda_j(x_0 e^{i\phi_k})$ is an Eigenvalue of $P_G(x_0 e^{i\phi_k})$. Similar to the case of the derivative with respect to $y$, we simplify this expression to

$$\begin{aligned}
&- c_j(x_0)(1 - x_0 e^{i\phi_k})(\lambda_j(x_0))^{1 - |V|} e^{i\phi_k (b - 1)}\, \mathrm{tr}\big(v_j(x_0) P'_G(x_0) u_j^T(x_0)\big) \\
&\quad \overset{(a)}{=} - c_j(x_0)(1 - x_0 e^{i\phi_k})(\lambda_j(x_0))^{1 - |V|} \lambda'_1(x_0) e^{i(\phi_k (b - 1) + \theta_j)},
\end{aligned}$$

where, in the first step, we used that $P'_G(x_0 e^{i\phi_k}) = e^{i\phi_k (b - 1)} D_k^{-1} P'_G(x_0) D_k$ according to Lemma 5.11, and equality (a) follows from an application of Lemma A.4. Substituting our expressions for the partial derivatives into the critical point equations shows that the critical point equations simplify to $\alpha x_0 \lambda'_1(x_0) = \lambda_1(x_0)$. Since $\lambda_1(x_0) = \rho_G(x_0)$, the lemma follows.

The singularities $(1, y_0)$ with $|y_0| < 1/\rho_G(1)$ are not roots of $\det(I - yP_G(x))$, as $\rho_G$ is an Eigenvalue of $P_G$ of largest modulus. Thus, near these points the zero set of the denominator is locally the zero set of the factor $1 - x$ and is therefore smooth (algebraically, the partial derivative with respect to $x$ is non-zero at these points). These points can never be critical because the partial derivative of $H_G(x, y)$ with respect to $y$ vanishes at any such point.

Lemma 5.21. Let $G$ be a strongly connected and cost-diverse graph. Then, the critical point equation $\alpha x \rho'_G(x) = \rho_G(x)$ has a positive real solution $x_0$ if and only if

$$\lim_{x \to \infty} \frac{\rho_G(x)}{x\rho'_G(x)} < \alpha < \lim_{x \to 0^+} \frac{\rho_G(x)}{x\rho'_G(x)}.$$

This solution, if it exists, is unique among all positive real $x$. If $\alpha > \rho_G(1)/\rho'_G(1)$ then $x_0 < 1$, and $x_0 > 1$ otherwise.

Proof. Since $\rho_G(x) > 0$ for $x \in \mathbb{R}^+$, we can rewrite the equation we are trying to solve as $f(x) = 1$, where $f(x) \triangleq \alpha x \rho'_G(x)/\rho_G(x)$. To start we investigate the limit of $f(x)$ as $x \to 0^+$. Note that $f(x) > 0$ for all $x \in \mathbb{R}^+$. Furthermore, the strict log-log-convexity of $\rho_G(x)$ proven in Lemma 5.17 implies that $f'(x) > 0$: strict log-log-convexity of $\rho_G(x)$ means that $\log \rho_G(e^s)$ is strictly convex in $s$, and substituting $x = e^s$ gives

$$\frac{\partial}{\partial x} f(x) = e^{-s} \frac{\partial}{\partial s} f(e^s) = \alpha e^{-s} \frac{\partial}{\partial s} \frac{e^s \rho'_G(e^s)}{\rho_G(e^s)} = \alpha e^{-s} \frac{\partial^2}{\partial s^2} \log \rho_G(e^s) > 0.$$

Since $f'(x) > 0$ and $f(x) > 0$ we see that $f(x)$ is bounded and decreasing as $x \to 0$ from above, and the monotone convergence theorem implies that $\lim_{x \to 0^+} f(x)$ exists. Consequently, if $\alpha < \lim_{x \to 0^+} \rho_G(x)/(x\rho'_G(x))$ then $\lim_{x \to 0^+} f(x) < 1$, as both limits exist. Notice that we allow the upper bound on $\alpha$ to diverge to $\infty$, in which case we can take $\alpha$ as large as desired. This can happen, for example, when there exists a cycle of weight $0$ in $G$. Similarly, the limit $\lim_{x \to \infty} 1/f(x)$ exists, as $1/f(x)$ is decreasing and positive. Hence, if $\alpha > \lim_{x \to \infty} \rho_G(x)/(x\rho'_G(x))$ then $\lim_{x \to \infty} 1/f(x) < 1$.

To summarize, under our conditions on $\alpha$ we have $\lim_{x \to 0^+} f(x) < 1$ and $\lim_{x \to \infty} f(x) > 1$. By the intermediate value theorem, there is at least one solution to $f(x) = 1$ in $x \in \mathbb{R}^+$. This solution is unique, due to the strict monotonicity of $f$ coming from $f'(x) > 0$. We further see that if $\alpha$ is not within these boundaries, there will be no solution in $x \in \mathbb{R}^+$ due to this monotonicity.

If $\alpha \rho'_G(1) > \rho_G(1)$ then $f(1) > 1$, and the solution to $f(x) = 1$ must occur at $x_0 < 1$. Similarly, if $\alpha \rho'_G(1) < \rho_G(1)$ then $f(1) < 1$ and it follows that $x_0 > 1$.
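Because $f$ is strictly increasing, the critical point can be located by bisection. The following is a minimal sketch under the assumption of a hypothetical single-vertex graph with two self-loops of costs 1 and 2, so that $\rho_G(x) = x + x^2$; for this toy case the limits in Lemma 5.21 are $1/2$ and $1$, and the closed-form solution is $x_0 = (1 - \alpha)/(2\alpha - 1)$. The function names are illustrative.

```python
import math


def rho(x):
    """Spectral radius of the toy single-vertex graph: rho_G(x) = x + x^2."""
    return x + x * x


def rho_prime(x):
    """Derivative of rho with respect to x."""
    return 1.0 + 2.0 * x


def f(x, alpha):
    """f(x) = alpha * x * rho'(x) / rho(x); the critical point solves f(x) = 1."""
    return alpha * x * rho_prime(x) / rho(x)


def solve_critical_point(alpha, lo=1e-9, hi=1e9, iterations=200):
    """Bisection on a logarithmic scale; returns None if f(x) = 1 has no root in (lo, hi)."""
    if not (f(lo, alpha) < 1.0 < f(hi, alpha)):
        return None
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)  # geometric midpoint, suited to the range (0, infinity)
        if f(mid, alpha) < 1.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

For this toy graph $\rho_G(1)/\rho'_G(1) = 2/3$, so a value such as $\alpha = 0.8$ should give $x_0 < 1$ and $\alpha = 0.6$ should give $x_0 > 1$, while values of $\alpha$ outside $(1/2, 1)$ yield no solution.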

Lemma 5.22. Let $G$ be a strongly connected and cost-diverse graph with period $d$ and cost-period $c$. For all $x_0 \in \mathbb{R}^+$ and $k \in \{0, 1, \dots, c-1\}$, $j \in \{0, 1, \dots, d-1\}$, the points

$$\big(x_0 e^{2\pi i k/c},\; e^{-2\pi i (kb/c + j/d)}/\rho_G(x_0)\big)$$

are non-degenerate.

Proof. Write $\xi_k \triangleq x_0 e^{2\pi i k/c}$. According to [MRS01, Thm. 3.18] and Lemma 5.11, there are $d$ Eigenvalues of maximum modulus $\lambda_j(\xi_k)$ of $P_G(\xi_k)$ that satisfy $\lambda_j(\xi_k) = e^{i(\phi_k b + \theta_j)}\rho_G(x_0)$, where $\rho_G(x_0)$ is the Perron root of $P_G(x_0)$. Lemma 5.5 of [Mel21] implies that the quantity $\mathcal{H}$ determining non-degeneracy in the smooth case is the second derivative of

$$\phi(\theta) = \log\Big(\frac{\lambda_j(\xi_k)}{\lambda_j(\xi_k e^{i\theta})}\Big) + \frac{i\theta}{\alpha}$$

at $\theta = 0$. Differentiating twice with respect to $\theta$ gives

$$\frac{\partial^2}{\partial \theta^2} \phi(\theta)\bigg|_{\theta = 0} \overset{(a)}{=} -\frac{\partial^2}{\partial \theta^2} \log \lambda_j(x_0 e^{i\theta})\bigg|_{\theta = 0} = \frac{\partial^2}{\partial s^2} \log \lambda_j(e^s)\bigg|_{s = \log x_0} \overset{(b)}{=} \frac{\partial^2}{\partial s^2} \log \rho_G(e^s)\bigg|_{s = \log x_0} \overset{(c)}{>} 0.$$

In (a) we used Lemma 5.11 to conclude that $\lambda_j(\xi_k e^{i\theta}) = e^{2\pi i k b/c}\lambda_j(x_0 e^{i\theta})$. Note that the differentiation on the left and right hand side of (b) should be understood with respect to complex-valued $s$ and real-valued $s$, respectively, as $\rho_G(e^s)$ is not complex differentiable in $s$ in general. In (b) we used that for analytic functions, by the definition of complex differentiation, the derivative along the real line equals the complex derivative. Inequality (c) follows from the strict log-log-convexity of $\rho_G(x)$ for $x \in \mathbb{R}^+$, as was proven in Lemma 5.17.

5.5 Proof of Theorem 3.4

We can now prove Theorem 3.4 by combining Lemmas 5.19 to 5.22 with Propositions 5.4 and 5.5.

Proof of Theorem 3.4. We differentiate between the two cases $0 < \alpha < \alpha_G^{\mathrm{low}}$ and $\alpha_G^{\mathrm{low}} < \alpha < \alpha_G^{\mathrm{up}}$. In the first case, the non-smooth singularity $(1, 1/\rho_G(1))$ and those with the same coordinate-wise modulus will be the singularities that determine the asymptotic behavior. In the second case, the singularities $(x_0, 1/\rho_G(x_0))$ with $0 < x_0 < 1$ and $\alpha x_0 \rho'_G(x_0) = \rho_G(x_0)$, and those with the same coordinate-wise modulus, will be the ones contributing.

We start with the non-smooth case $0 < \alpha < \alpha_G^{\mathrm{low}}$, aiming to apply Proposition 5.5 with the extension [Mel21, Corollary 9.1]. For any $x_0 \in \mathbb{R}^+$ let $\lambda_j(x_0)$ with $j \in \{0, 1, \dots, d-1\}$ denote the $d$ Eigenvalues of maximum modulus of $P_G(x_0)$ and let $u_j(x_0)^T$ and $v_j(x_0)$ be the corresponding right and left Eigenvectors. We start by proving that $(x_0, y_0) = (1, 1/\lambda_j(1))$ satisfies the conditions of Proposition 5.5. The two surfaces defined by the vanishing of $R(x, y) = 1 - x$ and $S(x, y) = \det(I - yP_G(x))$ intersect at this point. Direct computation shows $R_x(x, y) = -1$ and $R_y(x, y) = 0$, while Jacobi's formula implies $S_x(x, y) = -y\, \mathrm{tr}(\mathrm{adj}(I - yP_G(x)) P'_G(x))$ and $S_y(x, y) = -\mathrm{tr}(\mathrm{adj}(I - yP_G(x)) P_G(x))$. Hence,

$$\nu_1 (1, 0) + \nu_2 \Big(1, \frac{\mathrm{tr}(\mathrm{adj}(\lambda_j(1) I - P_G(1)) P_G(1))}{\mathrm{tr}(\mathrm{adj}(\lambda_j(1) I - P_G(1)) P'_G(1))}\Big) = \Big(\nu_1 + \nu_2,\; \nu_2 \frac{\rho_G(1)}{\rho'_G(1)}\Big).$$

As $\alpha < \rho_G(1)/\rho'_G(1)$ there exist $\nu_1, \nu_2 > 0$ such that $(\nu_1 + \nu_2, \nu_2 \rho_G(1)/\rho'_G(1)) = (1, \alpha)$, and the required conditions hold. The remaining singularities $(x, y) \in \mathbb{C}^2$ with $(|x|, |y|) = (1, 1/\rho_G(1))$ are smooth; however, by [Mel21, Corollary 5.6], none of them are critical as $(1, 1/\rho_G(1))$ is not critical by Lemma 5.21. Therefore Proposition 5.5 is applicable and it remains to compute the required terms. The numerator of the generating function is given by

$$Q_{G,v}(1, 1/\lambda_j(1)) = \lambda_j(1)^{1 - |V|}\, \mathrm{adj}(\lambda_j(1) I - P_G(1)) \mathbf{1}^T \overset{(a)}{=} c_j(1) \lambda_j(1)^{1 - |V|} u_j^T(1) v_j(1) \mathbf{1}^T,$$

where (a) follows from an application of Lemma A.2. Similarly, for $\det J$ we obtain

$$\det J = -\frac{1}{\lambda_j(1)} S_y(1, 1/\lambda_j(1)) = \frac{1}{\lambda_j(1)} \mathrm{tr}\big(\mathrm{adj}(I - P_G(1)/\lambda_j(1)) P_G(1)\big) = c_j(1) \lambda_j(1)^{1 - |V|}.$$

Plugging these results into the expressions of Proposition 5.5 and summing over all contributing points $(1, 1/\lambda_j(1))$ according to [Mel21, Corollary 9.1] proves the first statement of Theorem 3.4.

We now move to the smooth case $\alpha_G^{\mathrm{low}} < \alpha < \alpha_G^{\mathrm{up}}$. Lemma 5.19, Lemma 5.20, Lemma 5.21, and Lemma 5.22 show that the point $(x_0, 1/\rho_G(x_0))$ where $0 < x_0 < 1$ and $\alpha x_0 \rho'_G(x_0) = \rho_G(x_0)$ is unique and a smooth, finitely minimal, critical and non-degenerate singularity. Furthermore, all other singularities with the same coordinate-wise modulus, which are $(x_0 e^{2\pi i k/c}, e^{-2\pi i (kb/c + j/d)}/\rho_G(x_0))$ for some $k, j \in \mathbb{Z}$, fulfill these properties as well. This allows us to invoke the extension [Mel21, Cor. 5.2] of Proposition 5.4. Notice that there may be values of $j$ and $k$ where the numerator vanishes; however, we have shown in Lemma 5.19 that this does not occur when $k = j = 0$. Thus, it is possible that the leading asymptotic terms from some of these points vanish, but the sum of all terms always captures the dominant asymptotic behaviour of the sequence under consideration. The quantity $\mathcal{H}$ appearing in the asymptotic expansion was derived in Lemma 5.22. Abbreviating $\phi_k = 2\pi k/c$ and $\theta_j = 2\pi j/d$ in the following, we find that

$$Q_{G,v}\big(x_0 e^{i\phi_k}, 1/\lambda_j(x_0 e^{i\phi_k})\big) = c_j(x_0) \lambda_j(x_0)^{1 - |V|} D_k^{-1} u_j^T(x_0) v_j(x_0) D_k \mathbf{1}^T,$$

where we used that for any two square matrices $D$ and $P$, with $D$ invertible, the adjoint of the product $D^{-1} P D$ is given by $\mathrm{adj}(D^{-1} P D) = D^{-1} \mathrm{adj}(P) D$.

5.6 Proof of the Other Theorems

Proof of Theorem 3.1. Theorem 3.1 follows directly from Theorem 3.4. By the definition of the capacity, we take the logarithm of the asymptotic expansion of $N_{G,v}(t, \alpha t)$ and divide by $t$. Computing the limit as $t \to \infty$, all terms except for the exponential in $t$ vanish.

Theorem 3.5 and Theorem 3.6 can be proven via standard univariate singularity analysis [FS09]. We start with a more general statement on the exact representation of the follower set size.

Proof of Theorem 3.6. By Lemma 5.2, the generating function of the integer series $N_{G,v}(t)$ is

$$F_{G,v}(x) = Q_{G,v}(x)/H_G(x),$$

where $Q_{G,v}(x) = [\mathrm{adj}(I - P_G(x)) \mathbf{1}^T]_v$ and $H_G(x) = (1 - x)\det(I - P_G(x))$ are both polynomials in $x$. Therefore, the singularities are a subset of the solutions to $(1 - x)\det(I - P_G(x)) = 0$. Invoking [FS09, Theorem IV.9] then proves Theorem 3.6. Note that in principle not all solutions have to be singularities, as the numerator and denominator are not guaranteed to be coprime. This case is covered by setting $\Pi_{G,v,i}(t) = 0$ in the partial fraction decomposition for all roots that share common factors with the numerator.

Proof of Theorem 3.5. We claim that Theorem 3.5 follows from Theorem 3.6. First, we can directly compute $C_G = \lim_{t \to \infty} \log_2 N_{G,v}(t)/t$. Then, we use the fact that the denominator factors as

$$H_G(x) = (1 - x) \prod_j (1 - \lambda_j(x)),$$

where $\lambda_j(x)$ are the Eigenvalues of $P_G(x)$. Since $P_G(x)$ is an irreducible matrix, there is an Eigenvalue that is equal to the spectral radius. Thus, the singularity of smallest magnitude of $F_{G,v}(x)$ is that for which $\rho_G(x) = 1$. The numerator at this singularity is non-zero due to Lemma A.2.

s      A  C  G  T  A  C  G  T  A  C  G  T
x1     -  C  -  T  A  C  G  -  -  C  -  -
x2     A  -  G  T  A  -  -  T  A  -  -  -
x3     -  C  -  T  -  -  -  T  -  -  -  T
Cycle  1  2  3  4  5  6  7  8  9  10 11 12

Figure 5: Synthesis of three strands x1 = (CTACGC), x2 = (AGTATA), and x3 = (CTTT) using the synthesis sequence s = (ACGTACGTACGT). A dash marks a cycle in which no nucleotide is attached. The strand x1 is synthesized by attaching the nucleotides in cycles 2, 4, 5, 6, 7, 10, x2 is synthesized in cycles 1, 3, 4, 5, 8, 9, and similarly x3 is synthesized in cycles 2, 4, 8, 12. Hence, x1 can be synthesized in 10 cycles, x2 in 9 cycles and x3 in 12 cycles, when using the synthesis sequence s.
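The role of the root of $\rho_G(x) = 1$ can be illustrated on a toy example. The sketch below assumes a hypothetical single-vertex graph with two self-loops of costs 1 and 2 (so $\rho_G(x) = x + x^2$, with positive root $x_0 = (\sqrt{5} - 1)/2$) and counts paths of cost at most $t$ directly; the names `cumulative_path_count` and `X0` are illustrative, and the comparison is a numerical check rather than a restatement of Theorem 3.5.

```python
import math

# Positive root of rho_G(x) = x + x^2 = 1 for the hypothetical toy graph.
X0 = (math.sqrt(5.0) - 1.0) / 2.0


def cumulative_path_count(t):
    """Number of paths (sequences of self-loops of cost 1 or 2) with total cost at most t."""
    exact = [0] * (t + 1)  # exact[n] = number of paths with cost exactly n
    exact[0] = 1           # the empty path
    for n in range(1, t + 1):
        exact[n] = exact[n - 1] + (exact[n - 2] if n >= 2 else 0)
    return sum(exact)
```

For large $t$ the ratio `cumulative_path_count(t + 1) / cumulative_path_count(t)` approaches $1/x_0$, so that $\log_2 N(t)/t$ tends to $-\log_2 x_0$ on this example.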

6 Other Applications: DNA Synthesis and Subsequence Counting

Writing data into DNA, which is known as DNA synthesis, is a costly part of existing DNA storage systems. We study codes that minimize the time and the amount of materials needed to produce the DNA strands. Consider a system where digital data is encoded and synthesized into several DNA strands $x_1, x_2, \dots \in \Sigma^*$ in parallel. These strands can be of equal or different lengths. The synthesis is performed by choosing a synthesis sequence of nucleotides $s = (s_1, s_2, \dots) \in \Sigma^*$, and in each cycle $i = 1, 2, \dots$, for each DNA strand $x_j$, it is possible to either attach the symbol $s_i$ to the strand $x_j$ or to perform no action. Therefore, a DNA strand $x$ can be synthesized in $t$ cycles using the synthesis sequence $s$ if and only if $x$ is a subsequence of $(s_1, \dots, s_t)$. Figure 5 depicts the synthesis process. We also apply our results to the characterization of deletion balls centered at an arbitrary periodic sequence.

We now apply the general results on costly constrained channels from the previous sections to the specific application of efficient DNA synthesis.

6.1 Synthesis and Subsequence Graphs

We associate with the synthesis process a costly constrained channel that compactly describes the sequences that can be synthesized with a given synthesis sequence $s$. The system is defined by a subsequence graph or synthesis graph. We construct it so that its associated costly language with maximum cost $t$ contains the set of sequences that can be synthesized with $s$ in at most $t$ cycles.

Definition 6.1. The subsequence graph $G_{\mathrm{sub}}(s)$ of a synthesis sequence $s = (s_1, s_2, \dots) \in \Sigma^*$ is a directed, labeled and weighted graph. It has vertices $V = \{v_0, v_1, v_2, \dots\}$, where the vertex $v_i$, $i \ge 1$, is associated with the symbol $s_i$ and $v_0$ is an auxiliary starting vertex. The vertices $v_i$ and $v_j$ are connected by an edge $e$ if $j > i$ and $s_k \ne s_j$ for all $i < k < j$. The label of such an edge is $\sigma(e) = s_j$ and the cost is $\tau(e) = j - i$.

Figure 6: Subsequence graph $G_{\mathrm{sub}}(s)$ for the synthesis sequence s = (ACGTACGT...). The label of an edge is equal to the symbol of its final vertex and the cost is equal to the number of symbols that are skipped plus one.

Figure 6 shows the subsequence graph $G_{\mathrm{sub}}(s)$ for s = (ACGTACGT...). By construction of the graph $G_{\mathrm{sub}}(s)$, a sequence $x$ can be synthesized using $s$ if and only if there exists a path through $G_{\mathrm{sub}}(s)$ whose traversed edge labels generate $x$. Further, the cost of the path is equal to the number of cycles required to synthesize $x$ using $s$. For example, synthesizing the sequence x1 = (CTACGC) from Figure 5 using the synthesis sequence s = (ACGTACGT...) corresponds to the path Start → C → T → A → C → G → C and has cost 2 + 2 + 1 + 1 + 1 + 3 = 10. Note that although the synthesis process would allow edges from $v_i$ to $v_j$ for all $j > i$, we only draw $|\Sigma|$ outgoing edges to the next appearance of each $\sigma \in \Sigma$ for the following two reasons. First, it is desirable to attach a nucleotide to a sequence as soon as possible to minimize the required synthesis cycles. Second, this property is useful as it makes the graph deterministic, since for each vertex $v$ any two edges that start from $v$ have distinct labels. This directly leads to the following equivalence.
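The path cost in this example can be reproduced by walking greedily through a periodic synthesis sequence. The following is a small sketch; the function name `edge_costs` and the restriction to a periodic sequence given by its period are assumptions made for illustration.

```python
def edge_costs(x, period):
    """Costs of the edges traversed when x is synthesized greedily with s = (period period ...).

    Each symbol of x is attached at its next occurrence in s, so every entry is
    the number of skipped symbols plus one. Returns None if x contains a symbol
    that does not occur in the period.
    """
    if any(symbol not in period for symbol in x):
        return None
    costs = []
    position = 0  # number of synthesis cycles used so far
    for symbol in x:
        start = position
        while period[position % len(period)] != symbol:
            position += 1
        position += 1  # the cycle in which the symbol is attached
        costs.append(position - start)
    return costs
```

For instance, `edge_costs("CTACGC", "ACGT")` yields the per-edge costs of the path above, and their sum is the number of cycles needed; the same call on the strands x2 and x3 of Figure 5 gives totals of 9 and 12.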

Proposition 6.2. The sequence $x$ can be synthesized in $t$ cycles using the synthesis sequence $s$ if and only if it is contained in the cost-$t$ follower set of the starting vertex, i.e., $x \in \mathcal{L}_{G_{\mathrm{sub}}(s), v_0}(t)$.

Proof. A sequence $x$ can be synthesized in $t$ cycles using the synthesis sequence $s$ if and only if $x$ is a subsequence of $(s_1, \dots, s_t)$. We thus prove that $\mathcal{L}_{G_{\mathrm{sub}}(s), v_0}(t)$ is the set of all subsequences of $(s_1, \dots, s_t)$. On the one hand, each word in $\mathcal{L}_{G_{\mathrm{sub}}(s), v_0}(t)$ is a subsequence of $(s_1, \dots, s_t)$ due to the following. Each path $p = (e_1, \dots, e_n)$ that starts at $v_0$ generates a word $\sigma(p) = (s_{i_1}, \dots, s_{i_n})$, where $1 \le i_1 < i_2 < \dots < i_n$ are the indices of the traversed vertices. Further, the cost of each edge is $\tau(e_j) = i_j - i_{j-1}$, with $i_0 = 0$, and hence $\tau(p) = i_1 + (i_2 - i_1) + \dots + (i_n - i_{n-1}) = i_n$. Thus $i_n \le t$ and $\sigma(p)$ is a subsequence of $(s_1, \dots, s_t)$. On the other hand, if $x = (x_1, \dots, x_n)$ is a subsequence of $(s_1, \dots, s_t)$, we can left-align it in the subsequence graph $G_{\mathrm{sub}}(s)$ such that it is generated by a path that starts at $v_0$. Due to the left alignment, the last symbol $x_n$ will be aligned with a symbol $s_i$ with $i \le t$ and thus the path has cost at most $t$.
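Proposition 6.2 suggests counting distinct subsequences by counting paths from $v_0$ of cost at most $t$. The sketch below builds the edges of Definition 6.1 for a finite prefix and compares the path count with a brute-force enumeration; the function names are hypothetical, and the brute-force check is only practical for short sequences.

```python
from itertools import combinations


def subsequence_graph_edges(s):
    """Edges (i, j) of G_sub(s): vertex 0 is the start, vertex k >= 1 carries s[k - 1]."""
    edges = []
    n = len(s)
    for i in range(n + 1):
        seen = set()
        for j in range(i + 1, n + 1):
            if s[j - 1] not in seen:  # first occurrence of this symbol after position i
                seen.add(s[j - 1])
                edges.append((i, j))
    return edges


def count_by_graph(s, t):
    """Number of paths from the start vertex with cost at most t (the empty path included)."""
    prefix = s[:t]
    paths = [0] * (len(prefix) + 1)
    paths[0] = 1
    # Edges are generated with increasing source index, so paths[i] is final when used.
    for i, j in subsequence_graph_edges(prefix):
        paths[j] += paths[i]
    return sum(paths)


def count_by_brute_force(s, t):
    """Number of distinct subsequences of the first t symbols (the empty word included)."""
    prefix = s[:t]
    words = {"".join(prefix[k] for k in idx)
             for r in range(len(prefix) + 1)
             for idx in combinations(range(len(prefix)), r)}
    return len(words)
```

Since the cost of a path equals the index of its final vertex and the graph is deterministic, the two counts should coincide for every `t`.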

Since the subsequence graph is deterministic, for each vertex v of Gsub(s) there is one-to-onecorrespondence between the follower set LGsub(s),v(t) and paths that start at v and have cost atmost t. Thus, the subsequence graph can be used to efficiently count the number of subsequences ofa given supersequence s using standard algorithms that enumerate paths through weighted graphs.

Figure 7: Periodic subsequence graph $G(\mathtt{ACGT})$ with vertices A, C, G, T; each edge is annotated with its label and cost in the form label|cost (for example, C|1). The last symbol of the period, T, can take the role of the starting vertex.

However, due to the lack of cycles in this graph, we cannot directly use the well-known results from constrained system theory, as those usually require strongly connected graphs. We therefore fold the subsequence graph for the case of periodic synthesis sequences to obtain an equivalent strongly connected graph.

To start with, a periodic synthesis sequence $s$ is a sequence that can be written as $s = (rrr\ldots)$, where $r \in \Sigma^M$ is the period of $s$ and $M \in \mathbb{N}$ is the period length. For periodic sequences, the subsequence graph can be folded to obtain a strongly connected graph with $M$ vertices as follows.

Definition 6.3. The periodic subsequence graph $G(r)$ of a periodic synthesis sequence $s = (rrr\ldots)$ with $r = (r_1, \ldots, r_M) \in \Sigma^M$ is a directed, labeled and weighted graph. It has $M$ vertices $V = \{v_1, \ldots, v_M\}$, corresponding to the symbols of $r$. There is an edge $e$ from $v_i$ to $v_j$ if either $i < j$ and $r_k \neq r_j$ for all $k \in \{i+1, \ldots, j-1\}$, or $i \ge j$ and $r_k \neq r_j$ for all $k \in \{i+1, \ldots, M, 1, \ldots, j-1\}$. The label of such an edge is $\sigma(e) = r_j$ and the cost is

$$\tau(e) = \begin{cases} j - i, & \text{if } i < j, \\ M - i + j, & \text{if } i \ge j. \end{cases}$$
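As an illustration, the definition can be turned into a short construction. The following Python sketch uses the hypothetical name `periodic_subsequence_graph` and walks forward cyclically from each vertex, recording the first occurrence of every symbol. It takes the cost of an edge to be the number of cycles stepped forward, so that a wrap-around edge from $v_i$ to $v_j$ costs $M - i + j$ and a self loop costs $M$, in agreement with the edge A|1 leaving T in Figure 7.

```python
def periodic_subsequence_graph(r):
    """Return {i: [(j, label, cost), ...]} for the vertices 1..M of G(r)."""
    M = len(r)
    graph = {}
    for i in range(1, M + 1):
        edges = []
        seen = set()
        for step in range(1, M + 1):       # walk forward cyclically from v_i
            j = (i - 1 + step) % M + 1     # 1-based vertex reached after `step` cycles
            if r[j - 1] not in seen:       # first occurrence of this symbol after v_i
                seen.add(r[j - 1])
                edges.append((j, r[j - 1], step))
        graph[i] = edges
    return graph

g = periodic_subsequence_graph("ACGT")
print(g[4])  # edges leaving v_4 = T: [(1, 'A', 1), (2, 'C', 2), (3, 'G', 3), (4, 'T', 4)]
```

Because each symbol is recorded only once per vertex, the labels of the outgoing edges of a vertex are distinct by construction.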

The periodic subsequence graph $G(\mathtt{ACGT})$ is visualized in Figure 7. It is straightforward to verify that the analogue of Proposition 6.2 also holds for the periodic subsequence graph with starting vertex $v_M$. Note that the periodic subsequence graph has self loops with weight $M$ for symbols that occur exactly once in the period. In fact, we can derive the following properties of the synthesis graph.

Proposition 6.4. For $M \ge 2$, let $r \in \Sigma^M$ be an arbitrary sequence which contains at least two different symbols, with corresponding synthesis graph $G(r)$. Then, $G(r)$ is deterministic, strongly connected and cost-diverse. Further, $G(r)$ has period 1 and cost-period $M$.

Proof. To start with, $G(r)$ is deterministic because all outgoing edges from the same vertex have distinct labels by definition. The states $v_1, \ldots, v_M$ in $G(r)$ correspond to the symbols in $r = (r_1 r_2 \ldots r_M)$. Next, $G(r)$ is strongly connected since there is a Hamiltonian cycle $p_1$ of length $M$ starting at state $v_1$ and connecting $v_i$ to state $v_{i+1}$ for $i = 1, 2, \ldots, M-1$ and $v_M$ to $v_1$.

Now, there is also a cycle $p_2$ starting at state $v_1$ of length $M-1$ and cost $M$. This cycle can be constructed as follows. Let $i$ be such that $r_i \neq r_{i+1}$, which is guaranteed to exist because $r$ has two different symbols. Note that $v_{i-1}$ and $v_{i+1}$ are connected since $r_i \neq r_{i+1}$. Then, define a cycle $p_2$ that contains all the vertices except $v_i$, namely,

$$p_2 = (v_1 \to v_2 \to \cdots \to v_{i-1} \to v_{i+1} \to v_{i+2} \to \cdots \to v_M \to v_1).$$

Note that $p_2$ has length $M-1$ and cost $M$. Hence, there exist two cycles $p_1$ and $p_2$ of length $M$ and $M-1$, respectively. Since these lengths are coprime, $G(r)$ has period 1. Consider the $(M-1)$-fold repetition of cycle $p_1$ and the $M$-fold repetition of cycle $p_2$. Both have length $M(M-1)$, while the first has cost $M(M-1)$ and the second has cost $M^2$, implying cost diversity. Finally, $G(r)$ has cost-period $M$ since the costs of any two paths connecting $v_i$ and $v_j$ are congruent to $j - i$ modulo $M$.

6.2 Synthesis Information Rates

For the synthesis process, we aim to characterize the maximum amount of information that can be synthesized in $t$ synthesis cycles using the synthesis sequence $s$. In other words, we seek the maximum number of information bits that can be injectively encoded to DNA strands, such that each encoded DNA strand can be synthesized in $t$ cycles using the synthesis sequence $s$. Since the mapping from information words to DNA strands must be injective to guarantee decodability of the original information, the number of possible information words is equal to the number of sequences that can be synthesized in time $t$ using $s$. Recall, to this end, the subsequence graph presented in the previous section. By Proposition 6.2, a sequence $x$ can be synthesized if and only if there is a path through the subsequence graph with the label $x$. Using the fact that the subsequence graph is deterministic, we can deduce that there is a one-to-one correspondence between a sequence $x$ and a path through the subsequence graph (that starts at the starting vertex). Thus, putting everything together, the number of possible information words that can be synthesized in $t$ cycles is equal to the number of paths through the subsequence graph that start at the starting vertex and have cost at most $t$. To this end, recall that $L_{G,v}(t)$ and $N_{G,v}(t)$ from Definition 2.7 denote the set and number of words, respectively, that are generated by a path through a graph $G$ that starts from the vertex $v$ and has weight at most $t$. We thus naturally arrive at the following definition.

Definition 6.5. For an arbitrary $r \in \Sigma^M$, we define $L_r(t) \triangleq L_{G(r),v_M}(t)$ and $N_r(t) \triangleq |L_r(t)|$. Accordingly, we define $L_r(n,t) \triangleq L_r(t) \cap \Sigma^n$ and $N_r(t,n) \triangleq |L_r(n,t)|$.

With this definition, $L_r(t)$ is the set of sequences that can be synthesized in time $t$ when using the periodic synthesis sequence $s = (rrr\ldots)$, and $\log_2 N_r(t)$ is the number of information bits that can be used for encoding. Note that in principle it is also possible to define $N_r(t)$ for aperiodic sequences; however, in such a case, an asymptotic analysis is not immediate using costly constrained channels as presented in this paper. Given the above definitions, we are now in the position to present the main figures of merit for the synthesis process.
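For small $t$, the quantity $N_r(t)$ can be computed directly from the periodic subsequence graph. The sketch below is an illustration under two assumptions that are not taken from the paper: the empty word is counted as synthesizable, and the helper names `count_synthesizable` and `brute_force` are hypothetical. It counts paths of cost at most $t$ starting at $v_M$ by memoized recursion and compares the result with an exhaustive enumeration of the distinct subsequences of $(s_1, \ldots, s_t)$.

```python
from functools import lru_cache
from itertools import combinations

def count_synthesizable(r, t):
    """Number of words generated by paths of cost <= t from v_M in G(r), empty word included."""
    M = len(r)

    def out_edges(i):
        # 0-based vertex i; yields (target, cost) to the next occurrence of each symbol
        seen = set()
        for step in range(1, M + 1):
            j = (i + step) % M
            if r[j] not in seen:
                seen.add(r[j])
                yield j, step

    @lru_cache(maxsize=None)
    def paths(i, budget):
        # 1 for the empty continuation plus all affordable one-edge extensions
        return 1 + sum(paths(j, budget - c) for j, c in out_edges(i) if c <= budget)

    return paths(M - 1, t)

def brute_force(r, t):
    """Distinct subsequences of the first t symbols of s = (r r r ...), by enumeration."""
    s = (r * t)[:t]
    return len({"".join(s[i] for i in idx)
                for n in range(t + 1)
                for idx in combinations(range(t), n)})

print(count_synthesizable("AC", 3), brute_force("AC", 3))  # 7 7
```

The recursion relies on the graph being deterministic, so that distinct paths generate distinct words; the brute-force enumeration is exponential in $t$ and only serves as a cross-check.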

Let $s = (rrr\ldots)$ be a semi-infinite sequence and let $0 \le \alpha \le 1$ be a parameter controlling the length of the synthesized sequences.

Definition 6.6. For an arbitrary period $r \in \Sigma^M$, the synthesis capacity, measured in bits per synthesis cycle, is defined as $C_r \triangleq C_{G(r)}$. Similarly, the fixed-length synthesis capacity is defined as $C_r(\alpha) \triangleq C_{G(r)}(\alpha)$.

The alternating sequence is a prominent sequence because it is known to maximize the number of distinct subsequences [HR00] and hence also the unconstrained synthesis capacity. We proceed with deriving the synthesis capacity of alternating sequences. As explained before, the number of length-$\alpha t$ sequences that can be synthesized with $s$ in time $t$ is equal to the number of subsequences of length $\alpha t$ of the synthesis sequence of length $t$. This implies that the following result on the synthesis capacity is also a result for the exponential growth rate of the number of length-$\alpha t$ subsequences.

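Proposition 6.7. Consider the $q$-ary alternating sequence with period $r_q = (0, 1, \ldots, q-1)$. The synthesis capacity of this sequence is given by

$$C_{r_q} = -\log_2 x_q,$$

where $x_q$ is the unique positive solution to $\sum_{i=1}^{q} x^i = 1$. The fixed-length synthesis capacity is

$$C_{r_q}(\alpha) = \begin{cases} \alpha \log_2 q, & \alpha < \frac{2}{q+1}, \\ -\log_2 x_q(\alpha) + \alpha \log_2\left(\sum_{i=1}^{q} x_q(\alpha)^i\right), & \frac{2}{q+1} < \alpha < 1, \end{cases}$$

where $x_q(\alpha)$ is the unique solution to $\sum_{i=1}^{q} (1 - \alpha i)\, x^i = 0$ on the interval $0 < x < 1$. For $q = 2$ and $q = 3$, we obtain $C_{r_2} \approx 0.694$, $C_{r_3} \approx 0.879$ and

$$C_{r_2}(\alpha) = \alpha\, h\!\left(\frac{1-\alpha}{\alpha}\right), \qquad C_{r_3}(\alpha) = \alpha\, h\!\left(\frac{\gamma}{\alpha}\right) + \gamma\, h\!\left(\frac{1-\alpha-\gamma}{\gamma}\right),$$

where $\gamma = -\frac{2}{3}\alpha + \frac{1}{6}\sqrt{-8\alpha^2 + 12\alpha - 3} + \frac{1}{2}$.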
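The values in the proposition can be checked numerically. The sketch below is a hedged illustration, not the authors' computation: it assumes the saddle-point form $-\log_2 x + \alpha \log_2 \lambda(x)$ with $\lambda(x) = \sum_{i=1}^{q} x^i$ for the concave region $\alpha > 2/(q+1)$, where $x$ solves $\sum_{i=1}^{q}(1-\alpha i)x^i = 0$ on $(0,1)$, and it takes $h$ to be the binary entropy function. The helper names (`bisect`, `lam`, `capacity`, `fixed_length_capacity`) are hypothetical.

```python
from math import log2

def bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return (lo + hi) / 2

def lam(x, q):
    """lambda(x) = x + x^2 + ... + x^q."""
    return sum(x ** i for i in range(1, q + 1))

def capacity(q):
    """Variable-length capacity -log2(x_q) with lambda(x_q) = 1."""
    return -log2(bisect(lambda x: lam(x, q) - 1, 0.0, 1.0))

def fixed_length_capacity(q, alpha):
    """Fixed-length capacity for 0 <= alpha <= 1 (assumed saddle-point form in the concave region)."""
    if alpha >= 1:
        return 0.0                      # only one word of length t fits into t cycles
    if alpha <= 2 / (q + 1):
        return alpha * log2(q)          # linear region
    # critical equation divided by x, which removes the trivial root x = 0
    g = lambda x: sum((1 - alpha * i) * x ** (i - 1) for i in range(1, q + 1))
    x = bisect(g, 0.0, 1.0)
    return -log2(x) + alpha * log2(lam(x, q))

def h(p):
    """Binary entropy function in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

print(round(capacity(2), 3), round(capacity(3), 3))
print(fixed_length_capacity(2, 0.8), 0.8 * h(0.2 / 0.8))
```

For $q = 2$ the critical equation is linear, $x = (1-\alpha)/(2\alpha-1)$, so the last line compares the numerical value with the closed form $\alpha\, h((1-\alpha)/\alpha)$ at $\alpha = 0.8$.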

We will make use of the following lemma. Denote by $S(v) = \{\tau(e) : e \in E, \mathrm{init}(e) = v\}$ the multiset of costs of all outgoing edges from $v \in V$.

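Lemma 6.8. Let $G$ be a strongly connected graph with $S(v_1) = S(v_2)$ for all $v_1, v_2 \in V$. Then,

$$F_{G,v}(x,y) = \frac{1}{(1-x)\,(1 - y\lambda_G(x))}$$

for all $v \in V$, where $\lambda_G(x) = \sum_{\tau \in S} x^\tau$ and $S$ denotes the common multiset $S(v)$.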

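Proof. Let $\mathbf{1} = (1, \ldots, 1)$ denote the all-ones vector of length $|V|$. If $S(v_1) = S(v_2)$ for all $v_1, v_2 \in V$, it follows that

$$P_G(x)\mathbf{1}^T = \left(\sum_{\tau \in S(v_1)} x^\tau, \ldots, \sum_{\tau \in S(v_{|V|})} x^\tau\right)^T = \left(\sum_{\tau \in S} x^\tau\right)\mathbf{1}^T$$

and thus $\mathbf{1}^T$ is a right eigenvector of $P_G(x)$ with eigenvalue $\lambda_G(x) = \sum_{\tau \in S} x^\tau$. It follows that $(I - yP_G(x))\mathbf{1}^T = (1 - y\lambda_G(x))\mathbf{1}^T$ and therefore

$$F_G(x,y) = \frac{1}{1-x}\,(I - yP_G(x))^{-1}\mathbf{1}^T = \frac{1}{(1-x)(1 - y\lambda_G(x))}\mathbf{1}^T.$$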
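The closed form can be checked on the alternating sequence by comparing series coefficients with direct path counts. The sketch below assumes, as our reading of the notation, that the coefficient of $x^t y^n$ in $F_{G,v}(x,y)$ counts the paths of length $n$ and cost at most $t$ that start at $v$; the function names are hypothetical.

```python
def poly_mul(a, b, max_deg):
    """Product of two coefficient lists, truncated at degree max_deg."""
    out = [0] * (max_deg + 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if i + j > max_deg:
                    break
                out[i + j] += ai * bj
    return out

def gf_coefficient(costs, t, n):
    """[x^t y^n] of 1 / ((1 - x)(1 - y*lambda(x))), lambda(x) = sum of x^c over costs."""
    lam = [0] * (t + 1)
    for c in costs:
        if c <= t:
            lam[c] += 1
    power = [1] + [0] * t               # lambda(x)^0
    for _ in range(n):
        power = poly_mul(power, lam, t)
    return sum(power)                   # the factor 1/(1-x) sums coefficients up to x^t

def count_paths(q, start, t, n):
    """Paths of length n and cost <= t in the graph of the q-ary alternating sequence."""
    if n == 0:
        return 1
    total = 0
    for j in range(q):
        c = (j - start) % q or q        # cyclic distance; a self loop costs q
        if c <= t:
            total += count_paths(q, j, t - c, n - 1)
    return total

print(gf_coefficient(range(1, 5), 6, 3), count_paths(4, 3, 6, 3))
```

Since every vertex has the same cost multiset $\{1, \ldots, q\}$, the count does not depend on the starting vertex, which is exactly what the lemma expresses.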

Figure 8: Synthesis capacity $C_{r_q}(\alpha)$ of the alternating sequences $r_q$ as a function of $\alpha$ for the alphabet sizes $q = 2$, $q = 3$, $q = 4$, $q = 8$ and $q \to \infty$. The maxima $C_{r_2}$, $C_{r_3}$, $C_{r_4}$ are highlighted for $q \in \{2, 3, 4\}$ together with their maximizing $\alpha$ (approximately 0.72, 0.62 and 0.57, respectively). Notice that these plots confirm the concavity of the fixed-length capacity in $\alpha$ and its maximum at the variable-length capacity, which is derived in Proposition B.1.

Proof of Proposition 6.7. For the special case of the alternating sequence, the synthesis graph $G(r_q)$ is a complete graph, where each vertex has $q$ outgoing edges. The cost spectrum of the outgoing edges is $S(v) = \{1, 2, \ldots, q\}$ for all vertices $v$ of $G(r_q)$. Thus, we can apply Lemma 6.8 and obtain

$$\lambda_{G(r_q)}(x) = \sum_{i=1}^{q} x^i.$$

The results on the fixed-length capacity then directly follow from applying Theorem 3.1. Similarly, the variable-length capacity follows from Theorem 3.5. The results for $q = 2$ and $q = 3$ follow from an explicit computation of the determining equations followed by some algebraic reformulations.

Figure 8 visualizes the capacity for the alternating sequences over different alphabets. The relevant case for DNA synthesis is $q = 4$.

6.3 Constrained Synthesis

In many applications it is helpful for the synthesized strands to fulfill certain constraints, such as forbidding specified substrings [MRRY21, JFSB20]. We thus present a method to compute the maximum synthesis information rate under a constraint. Formally, let $s = (rrr\ldots)$ be a periodic synthesis sequence. Further, let the constraint that we wish to impose on the DNA strands be given in the form of a constrained system, i.e., a directed, labeled graph $G_c = (V_c, E_c, \sigma_c)$ with vertices $V_c$, edges $E_c$ and labels $\sigma_c : E_c \to \Sigma$. The set of constrained DNA sequences that are allowed to be synthesized is given by all words that are generated by paths through the graph $G_c = (V_c, E_c, \sigma_c)$ that start from a dedicated starting vertex $v_s \in V_c$. The following graph product will generate the set of constrained DNA sequences that can be synthesized in $t$ synthesis cycles.

Definition 6.9. Let $G_1 = (V_1, E_1, \sigma_1, \tau_1)$ be a directed, labeled, weighted graph and $G_2 = (V_2, E_2, \sigma_2)$ be a directed, labeled graph. We define their label product as the directed, labeled, weighted graph $G = G_1 \times G_2$, $G = (V, E, \sigma, \tau)$, consisting of vertices $V = V_1 \times V_2$, edges

$$E = \{(e_1, e_2) \in E_1 \times E_2 : \sigma_1(e_1) = \sigma_2(e_2)\},$$

and labeling $\sigma : E \to \Sigma$ such that for $e = (e_1, e_2) \in E$, it holds that $\sigma(e) = \sigma_1(e_1) = \sigma_2(e_2)$. The weights are $\tau : E \to \mathbb{N}$ with $\tau(e) = \tau_1(e_1)$.
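The label product is straightforward to construct explicitly. The following sketch uses a hypothetical adjacency-list encoding (a dictionary from each vertex to its outgoing edges) and the hypothetical function name `label_product`. The two small input graphs are our reading of $G(\mathtt{AC})$ and of a constraint graph in which C must be followed by A; they are assumptions made for illustration.

```python
def label_product(g1, g2):
    """Label product of a weighted graph g1 and an unweighted graph g2.

    g1: {u: [(u_next, label, cost), ...]}
    g2: {v: [(v_next, label), ...]}
    Returns {(u, v): [((u_next, v_next), label, cost), ...]}.
    """
    return {
        (u, v): [((u2, v2), a, c)
                 for (u2, a, c) in g1[u]
                 for (v2, b) in g2[v]
                 if a == b]             # keep only edge pairs with matching labels
        for u in g1
        for v in g2
    }

# Assumed encoding of G(AC): vertices are named after the symbols of the period.
g_ac = {"A": [("C", "C", 1), ("A", "A", 2)],
        "C": [("A", "A", 1), ("C", "C", 2)]}
# Assumed constraint graph: after emitting C (state v2) only A is allowed.
g_c = {"v1": [("v1", "A"), ("v2", "C")],
       "v2": [("v1", "A")]}

prod = label_product(g_ac, g_c)
for vertex, edges in prod.items():
    print(vertex, edges)
```

With this encoding the product has four vertices and six edges, and each product edge inherits its cost from the weighted graph only, as in the definition.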

With that definition, we can define the central quantity of interest for constrained synthesis, which is the number of constrained sequences that can be synthesized in $t$ cycles.

Definition 6.10. Let $r \in \Sigma^M$ be an arbitrary period and $G_c = (V_c, E_c, \sigma_c)$ a strongly connected, directed, labeled graph with starting vertex $v_s \in V_c$. The constrained synthesis capacity, measured in bits per synthesis cycle, is defined as

$$C_{r,G_c,v_s} = \limsup_{t \to \infty} \frac{\log_2 N_{G(r) \times G_c,(v_M,v_s)}(t)}{t}.$$

Similarly, the fixed-length constrained synthesis capacity is defined as

$$C_{r,G_c,v_s}(\alpha) = \limsup_{t \to \infty} \frac{\log_2 N_{G(r) \times G_c,(v_M,v_s)}(t, \lfloor \alpha t \rfloor)}{t}.$$

It remains to prove that the label product of the synthesis graph $G$ and the constraint graph $G_c$ generates exactly the set of constrained sequences that can be synthesized in $t$ cycles. This claim is proven in a more general form in the following lemma.

Lemma 6.11. Let $G_1 = (V_1, E_1, \sigma_1, \tau_1)$ be a directed, labeled, weighted graph and $G_2 = (V_2, E_2, \sigma_2)$ be a directed, labeled graph. Then, for any $v_1 \in V_1$ and $v_2 \in V_2$,

$$L_{G_1 \times G_2,(v_1,v_2)}(t) = L_{G_1,v_1}(t) \cap L_{G_2,v_2}.$$

Proof. We denote by $G = (V, E, \sigma, \tau) = G_1 \times G_2$ the label product of $G_1$ and $G_2$. To start with, assume that a sequence $x$ is contained in $L_{G_1,v_1}(t) \cap L_{G_2,v_2}$. This means that there exists a path $p_1$ starting from $v_1$ through $G_1$ with $\sigma_1(p_1) = x$ and $\tau_1(p_1) \le t$, and there exists another path $p_2$ starting from $v_2$ through $G_2$ with $\sigma_2(p_2) = x$. For every $i$, the $i$-th edge in $p_1$ has the same label as the $i$-th edge of $p_2$, such that $p \triangleq (p_1, p_2)$ is a path through $G_1 \times G_2$ that starts from $(v_1, v_2)$. Its label is $\sigma(p) = \sigma_1(p_1) = \sigma_2(p_2) = x$ and its cost is $\tau(p) = \tau_1(p_1) \le t$. Hence, $L_{G_1 \times G_2,(v_1,v_2)}(t) \supseteq L_{G_1,v_1}(t) \cap L_{G_2,v_2}$. Conversely, let $x \in L_{G_1 \times G_2,(v_1,v_2)}(t)$. Thus, there exists a path $p$ through $G_1 \times G_2$, starting from $(v_1, v_2)$, with $\sigma(p) = x$ and $\tau(p) \le t$. Writing $p = (p_1, p_2)$, we see that $p_1$ is a path through $G_1$ that starts at $v_1$ and has cost $\tau_1(p_1) = \tau(p) \le t$ and label $\sigma_1(p_1) = \sigma(p) = x$. Further, $p_2$ is a path through $G_2$ that starts at $v_2$ and has label $\sigma_2(p_2) = \sigma(p) = x$, and thus $L_{G_1 \times G_2,(v_1,v_2)}(t) \subseteq L_{G_1,v_1}(t) \cap L_{G_2,v_2}$, which proves the claim.

With this result, the set of constrained sequences that can be synthesized in $t$ cycles is precisely $L_{G(r) \times G_c,(v_M,v_s)}(t)$. Note that the graph $G(r) \times G_c$ is deterministic; however, it is not always strongly connected, as we will show in the following example.

Figure 9: Periodic synthesis graph, constraint graph and their label product from Example 6.12. (a) Periodic subsequence graph $G(\mathtt{AC})$. (b) Constraint graph $G_c$ imposing the constraint that the letter C is always followed by A. (c) Label product graph of the graphs $G(\mathtt{AC})$ and $G_c$. Edges are annotated in the form label|cost.

Example 6.12. Consider the periodic synthesis sequence $s$ with period $r = (\mathtt{AC})$. Further, let $G_c$ be the constraint graph of the language of words over $\Sigma = \{\mathtt{A}, \mathtt{C}\}$ comprising only words that do not contain runs of the letter C of length more than 1. Both graphs and their label product are displayed in Figure 9. The vertex $(\mathtt{A}, v_2)$ has no incoming edges and thus the graph is not strongly connected. However, this vertex does not play a role if we let either C be the starting vertex for the synthesis graph or $v_1$ be the starting vertex for the constraint graph. In these cases, the vertex can simply be discarded, resulting in a strongly connected graph.

While it is known that for unconstrained synthesis the alternating sequence with period $r = (\mathtt{A}, \mathtt{C}, \mathtt{G}, \mathtt{T})$ maximizes the information rate, we will show in the following example that such a claim is not necessarily true for constrained synthesis.

Example 6.13. Consider the periodic synthesis sequence $s$ with period $r = (\mathtt{ACGT})$. Further, let $G_{\mathtt{TT}}$ be the constraint graph that enumerates all words over $\Sigma = \{\mathtt{A}, \mathtt{C}, \mathtt{G}, \mathtt{T}\}$ in which every run of the letter T has length at least 2. Figure 10 shows the constraint graph and the product graph, while the synthesis graph is displayed in Figure 7. We find that $C_{(\mathtt{ACGT}),G_{\mathtt{TT}}} \approx 0.7188$; however, if we use a synthesis sequence with period $r = (\mathtt{ACG})$ instead, we can get a higher constrained synthesis rate of $C_{(\mathtt{ACG}),G_{\mathtt{TT}}} \approx 0.8791$. This result is intuitive: as compared to the unconstrained case, the constraint increases the cost of synthesizing T by 4 cycles (following the constraint, we always need to synthesize two T's in succession), which means that it is more efficient to remove T from the synthesis sequence in order to reduce the cost of the remaining symbols. Notice that it is not a coincidence that $C_{(\mathtt{ACG}),G_{\mathtt{TT}}} = C_{(\mathtt{ACG})} = C_{r_3}$ (cf. Proposition 6.7), as the constraint $G_{\mathtt{TT}}$ admits every sequence over the alphabet $\Sigma = \{\mathtt{A}, \mathtt{C}, \mathtt{G}\}$.
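Capacities of this kind can be approximated numerically. The sketch below is an illustration under stated assumptions and not the authors' procedure: it assumes that the variable-length capacity of a strongly connected costly graph equals $-\log_2 x_0$, where $x_0 \in (0,1)$ is the point at which the matrix with entries $\sum_e x^{\tau(e)}$ (our reading of $P_G(x)$) has spectral radius 1, in the spirit of Shannon's classical characterization. The graph encoding and the names `spectral_radius`, `cost_matrix` and `capacity` are hypothetical. Only the unconstrained graph $G(\mathtt{ACG})$ is evaluated here, since the product graph of Figure 10 is not reproduced in the text.

```python
from math import log2

def spectral_radius(P, iters=500):
    """Perron root of a nonnegative irreducible matrix via power iteration on I + P."""
    n = len(P)
    v = [1.0] * n
    rho = 1.0
    for _ in range(iters):
        w = [v[i] + sum(P[i][j] * v[j] for j in range(n)) for i in range(n)]
        rho = max(w)                    # eigenvalue estimate for I + P
        v = [wi / rho for wi in w]
    return rho - 1.0                    # shift back to the eigenvalue of P

def cost_matrix(graph, vertices, x):
    """Matrix whose (u, w) entry sums x**cost over all edges from u to w."""
    idx = {v: k for k, v in enumerate(vertices)}
    P = [[0.0] * len(vertices) for _ in vertices]
    for u in vertices:
        for (w, _label, cost) in graph[u]:
            P[idx[u]][idx[w]] += x ** cost
    return P

def capacity(graph, iters=60):
    """-log2 of the x in (0, 1) at which the spectral radius of the cost matrix equals 1."""
    vertices = list(graph)
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if spectral_radius(cost_matrix(graph, vertices, mid)) < 1.0:
            lo = mid
        else:
            hi = mid
    return -log2((lo + hi) / 2)

# Periodic subsequence graph G(ACG): {vertex: [(target, label, cost), ...]}
g_acg = {0: [(1, "C", 1), (2, "G", 2), (0, "A", 3)],
         1: [(2, "G", 1), (0, "A", 2), (1, "C", 3)],
         2: [(0, "A", 1), (1, "C", 2), (2, "G", 3)]}
print(round(capacity(g_acg), 4))  # close to C_{r_3} from Proposition 6.7
```

Iterating on $I + P$ rather than $P$ is a design choice that avoids oscillation on periodic graphs; for graphs that are not strongly connected, the power iteration is not guaranteed to converge and the sketch should be treated with care.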

Figure 10: Constraint graph and label product from Example 6.13. (a) Constraint graph $G_{\mathtt{TT}}$ of sequences in which the letter T only occurs in runs of length at least 2. (b) Label product graph of the graphs $G(\mathtt{ACGT})$ and $G_{\mathtt{TT}}$, with edges annotated by their costs. The edge labels are equal to the label of the final vertex and have been omitted for readability.

7 Conclusion

Our work establishes a general theory that allows us to associate properties of costly graphs with characteristics of the singularities of generating functions. This builds the first bridge between the literature on analytic combinatorics in several variables and costly graphs, bringing a new perspective and a set of powerful results to the literature of costly constrained channels. We use this connection to deduce results on the asymptotic behavior of the size of the limited-cost follower sets. As part of our analysis, we identify and analyze key properties of graphs that result in a well-behaved expression of the capacity. In particular, we introduce the notion of cost-diversity, which is the property that entails a continuous behavior of the fixed-length capacity and whose absence results in a degenerate behavior. Next, studying the Perron-Frobenius theory of cost-diverse graphs and its implications on the nature of the singular variety of the generating functions, we provide a combinatorial proof of the precise asymptotic expansions of both the number of fixed-length and the number of variable-length followers. Finally, we show that our results allow us to give a full characterization of the subsequence spectrum of arbitrary periodic sequences. We use this insight to solve a related problem about cost-efficient synthesis for DNA-based data storage.

7.1 Open Questions

1. Can we derive asymptotic expressions for the number of subsequences of a large family of non-periodic sequences?

2. Is it possible to extend our techniques to handle non-integer costs?

3. Our results determine the number of subsequences and the synthesis rate in the presence of constraints. It would be interesting to turn this around and to characterize the optimal synthesis sequence, maximizing the number of subsequences that adhere to the constraint. For example, is it true that there exists an optimal supersequence that itself satisfies the constraint?

4. Counting subsequences corresponds to determining the size of a deletion ball. Can we achieve similar results for insertion balls when there is also a constraint on the supersequences we are counting?

8 Acknowledgement

The authors would like to thank Antonia Wachter-Zeh and Eitan Yaakobi for contributing valuable ideas in the early stages of this project and for helping with formalizing the problem statement for efficient synthesis for DNA-based data storage. We would also like to thank Andrew Tan for designing explicit shaping codes and performing numerical evaluations regarding the probabilistic channel capacity. We finally would like to thank Yi Liu for interesting discussions on the interpretation of the boundary between the linear and concave region of the fixed-length capacity and also on the equivalence between the probabilistic and combinatorial capacity.

References

[ACH83] Roy L. Adler, Don Coppersmith, and Martin Hassner. Algorithms for sliding block codes: an application of symbolic dynamics to information theory. IEEE Transactions on Information Theory, 29(1):5–22, January 1983.

[AFKM86] Roy L. Adler, Joel Friedman, Bruce Kitchens, and Brian H. Marcus. State splitting for variable-length graphs. IEEE Transactions on Information Theory, 32(1):108–113, January 1986.

[AHM82] Roy L Adler, Martin Hassner, and John Moussouris. Method and apparatus for generating a noiseless sliding block code for a (1, 7) channel with rate 2/3, 1982. US Patent 4,413,251.

[AKS96] Jonathan J. Ashley, Razmik Karabed, and Paul H. Siegel. Complexity and sliding block decodability. IEEE Transactions on Information Theory, 42(6):1925–1947, November 1996.

[AM95] Jonathan J. Ashley and Brian H. Marcus. Canonical encoders for sliding block decoders. SIAM Journal on Discrete Mathematics, 8(4):555–605, 1995.

[AMR95] Jonathan J. Ashley, Brian H. Marcus, and Ron M. Roth. Construction of encoders with small decoding look-ahead for input-constrained channels. IEEE Transactions on Information Theory, 41(1):55–76, January 1995.

[AMR96] Jonathan J. Ashley, Brian H. Marcus, and Ron M. Roth. On the decoding delay of encoders for input-constrained channels. IEEE Transactions on Information Theory, 42(6):1948–1956, November 1996.

[Ash87] Jonathan J. Ashley. Performance Bounds in Constrained Sequence Coding. PhD thesis, University of California, Santa Cruz, 1987.

[Ash88] Jonathan J. Ashley. A linear bound for sliding block decoder window size. IEEE Transactions on Information Theory, 34(3):389–399, May 1988.

[Ash96] Jonathan J. Ashley. A linear bound for sliding block decoder window size, II. IEEE Transactions on Information Theory, 42(6):1913–1924, November 1996.

[AVA+19] Leon Anavy, Inbal Vaknin, Orna Atar, Roee Amit, and Zohar Yakhini. Data storage in DNA with fewer synthesis cycles using composite DNA letters. Nature Biotechnology, 37(10):1229–1236, 2019.

[BBBP11] Yuliy Baryshnikov, Wil Brady, Andrew Bressler, and Robin Pemantle. Two-dimensional quantum random walk. J. Stat. Phys., 142(1):78–107, 2011.

[BJP10] Georg Bocherer, Valdemar Cardoso da Rocha Junior, and Cecilio Pimentel. Capacity of General Discrete Noiseless Channels. arXiv:0802.2451 [cs, math], June 2010.

[BM11] Georg Bocherer and Rudolf Mathar. Matching dyadic distributions to channels. In 2011 Data Compression Conference, pages 23–32, 2011.

[BPRS20] Vinnu Bhardwaj, Pavel A Pevzner, Cyrus Rashtchian, and Yana Safonova. Trace reconstruction problems in computational biology. IEEE Transactions on Information Theory, 67(6):3295–3314, 2020.

[BSS15] Georg Bocherer, Fabian Steiner, and Patrick Schulte. Bandwidth efficient and rate-matched low-density parity-check coded modulation. IEEE Transactions on Communications, 63(12):4651–4665, 2015.

[BSS19] Georg Bocherer, Patrick Schulte, and Fabian Steiner. Probabilistic shaping and forward error correction for fiber-optic communication systems. Journal of Lightwave Technology, 37(2):230–244, 2019.

[Bo12] Georg Bocherer. Capacity-Achieving Probabilistic Shaping for Noisy and Noiseless Channels. PhD thesis, RWTH Aachen University, 2012.

[Car13] Marvin H. Caruthers. The chemical synthesis of DNA/RNA: Our gift to science. Journal of Biological Chemistry, 288(2):1420–1427, January 2013.

[CNS19] Luis Ceze, Jeff Nivala, and Karin Strauss. Molecular digital data storage using DNA. Nature Reviews Genetics, 20(8):456–466, 2019.

[CNT+20] Shubham Chandak, Joachim Neu, Kedar Tatwawadi, Jay Mardia, Billy Lau, Matthew Kubit, Reyna Hulett, Peter Griffin, Mary Wootters, Tsachy Weissman, and Hanlee Ji. Overcoming high nanopore basecaller error rates for DNA storage via basecaller-decoder integration and convolutional codes. In IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2020.

[Csi69] I. Csiszar. Simple proofs of some theorems on noiseless channels. Information and Control, 14:285–298, 1969.

[DRV10] Ivan Djordjevic, William Ryan, and Bane Vasic. Coding for optical channels. Springer Science & Business Media, 2010.

[EZ17] Yaniv Erlich and Dina Zielinski. DNA Fountain enables a robust and efficient storage architecture. Science, 355(6328):950–954, 2017.

[Fra68] Peter A. Franaszek. Sequence-state coding for digital transmission. Bell System Technical Journal, 47:143–155, 1968.

[Fra69] Peter A. Franaszek. On synchronous variable length coding for discrete noiseless channels. Information and Control, 15:155–164, 1969.

[Fra70] Peter A. Franaszek. Sequence-state methods for run-length-limited coding. IBM Journal of Research and Development, 14:376–383, 1970.

[Fra72] Peter A. Franaszek. Run-length-limited variable length coding with error propagation limitation, 1972. US Patent 3,689,899.

[FS09] Philippe Flajolet and Robert Sedgewick. Analytic Combinatorics. Cambridge University Press, Cambridge; New York, 2009.

[FW64] C. V. Freiman and A. D. Wyner. Optimum block codes for noiseless input restricted channels. Information and Control, 7:398–415, 1964.

[Gil95] E. N. Gilbert. Coding with digits of unequal cost. IEEE Transactions on Information Theory, 41(2):596–600, March 1995.

[GR98] Mordecai J. Golin and Gunter Rote. A dynamic programming algorithm for constructing optimal prefix-free codes with unequal letter costs. IEEE Transactions on Information Theory, 44(5):1770–1781, September 1998.

[HJ12] Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, Cambridge; New York, 2nd edition, 2012.

[HR00] Daniel S. Hirschberg and Mireille Regnier. Tight bounds on the number of string subsequences. Journal of Discrete Algorithms, 1(1):123–132, 2000.

[HU00] T.-S. Han and Osamu Uchida. Source code with cost as a nonuniform random number generator. IEEE Transactions on Information Theory, 46(2):712–717, 2000.

[Imm95] Kees A. Schouhamer Immink. EFMPlus: The coding format of the multimedia compact disc. IEEE Transactions on Consumer Electronics, 41(3):491–497, August 1995.

[Imm04] Kees A. Schouhamer Immink. Codes for Mass Data Storage Systems. Shannon Foundation Publishers, 2004.

[IMU97] Kenichi Iwata, Masakatu Morii, and Tomohiko Uyematsu. An efficient universal coding algorithm for noiseless channel with symbols of unequal cost. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 80:2232–2237, 1997.

[INOO85] Kornelis A. Immink, Jakob G. Nijboer, Hiroshi Ogawa, and Kentaro Odaka. Method for encoding binary data, 1985. US Patent 4,501,000.

[JFLMK10] Ashish Jagmohan, Michele Franceschini, Luis A. Lastras-Montano, and John Karidis. Adaptive endurance coding for NAND flash. In 2010 IEEE Globecom Workshops, pages 1841–1845, 2010.

[JFSB20] Siddharth Jain, Farzad Farnoud, Moshe Schwartz, and Jehoshua Bruck. Coding for optimized writing rate in DNA storage. In 2020 IEEE International Symposium on Information Theory (ISIT), pages 711–716. IEEE, 2020.

[JH84] J. Justesen and T. Høholdt. Maxentropic Markov chains. IEEE Transactions on Information Theory, 30(4):665–667, July 1984.

[Kar61] R. Karp. Minimum-redundancy coding for the discrete noiseless channel. IRE Transactions on Information Theory, 7(1):27–38, January 1961.

[KC14a] Sriram Kosuri and George Church. Large-scale de novo DNA synthesis: technologies and applications. Nature Methods, pages 499–507, 2014.

[KC14b] Sriram Kosuri and George M Church. Large-scale de novo DNA synthesis: Technologies and applications. Nature Methods, 11(5):499–507, May 2014.

[KKYW14] Victor Krachkovsky, Razmik Karabed, Shaohua Yang, and Bruce Wilson. On modulation coding for channels with cost constraints. 2014 IEEE International Symposium on Information Theory, pages 421–425, 2014.

[KMP+02] Andrew B Kahng, II Mandoiu, Pavel A Pevzner, Sherief Reda, and AZ Zelikovsky. Border length minimization in DNA array design. In International Workshop on Algorithms in Bioinformatics, pages 435–448. Springer, 2002.

[KMR00] A. Khandekar, Robert J. McEliece, and Eugene R. Rodemich. The discrete noiseless channel revisited. In Coding, Communications, and Broadcasting, pages 115–137. Research Studies Series Ltd., UK, Baldock, 2000.

[KN96] Ali S. Khayrallah and David L. Neuhoff. Coding for channels with cost constraints. IEEE Transactions on Information Theory, 42(3):854–867, May 1996.

[Kra62] R. M. Krause. Channels which transmit letters of unequal duration. Information and Control, 55:13–24, 1962.

[LHBS20a] Yi Liu, Pengfei Huang, Alexander W. Bergman, and Paul H. Siegel. Rate-constrained shaping codes for structured sources. IEEE Trans. on Information Theory, 66(8):5261–5281, 2020.

[LHBS20b] Yi Liu, Pengfei Huang, Alexander W. Bergman, and Paul H. Siegel. Rate-constrained shaping codes for structured sources. arXiv:2001.02748 [cs, math], January 2020.

[Liu20] Yi Liu. Coding Techniques to Extend the Lifetime of Flash Memories. PhD thesis, University of California, San Diego, San Diego, 2020.

[LKG+19] Henry H Lee, Reza Kalhor, Naveen Goela, Jean Bolot, and George M Church. Terminator-free template-independent enzymatic DNA synthesis for digital information storage. Nature Communications, 10(1):1–12, 2019.

[LLR+20] Andreas Lenz, Yi Liu, Cyrus Rashtchian, Paul H. Siegel, Antonia Wachter-Zeh, and Eitan Yaakobi. Coding for efficient DNA synthesis. In Proc. Int. Symp. Inf. Theory, pages 2885–2890, Los Angeles, CA, USA, June 2020. IEEE.

[LSWY20] Andreas Lenz, Paul H. Siegel, Antonia Wachter-Zeh, and Eitan Yaakobi. Coding over sets for DNA storage. IEEE Transactions on Information Theory, 66(4):2331–2351, April 2020.

[LWG+20] Howon Lee, Daniel Wiegand, Kettner Griswold, Sukanya Punthambaker, Honggu Chun, Richie Kohman, and George Church. Photon-directed multiplexed enzymatic DNA synthesis for molecular digital data storage. Preprint available at https://www.biorxiv.org/content/10.1101/2020.02.19.956888v1, Feb. 2020.

[Mar57] R. S. Marcus. Discrete Noiseless Coding. PhD thesis, Massachusetts Institute of Technology, 1957.

[Mel80] K. Melhorn. An efficient algorithm for constructing nearly optimal prefix codes. IEEE Transactions on Information Theory, 26(5):513–517, September 1980.

[Mel21] Stephen Melczer. An Invitation to Analytic Combinatorics: From One to Several Variables. Texts & Monographs in Symbolic Computation. Springer International Publishing, 2021.

[Mey00] C. D. Meyer. Matrix Analysis and Applied Linear Algebra. Society for Industrial and Applied Mathematics, Philadelphia, 2000.

[MK06] Stephen W. McLaughlin and Ali S. Khayrallah. Shaping codes constructed from cost-constrained graphs. IEEE Trans. Inf. Theor., 43(2):692–699, September 2006.

[MKB08] Hugues Mercier, Majid Khabbazian, and Vijay K. Bhargava. On the number of subsequences when deleting symbols from a string. IEEE Transactions on Information Theory, 54(7):3279–3285, July 2008.

[MR83] Robert J. McEliece and Eugene R. Rodemich. A maximum entropy Markov chain. In Proc. Conf. Inform. Sciences and Systems, pages 245–248, Johns Hopkins University, March 1983.

[MRRY21] Konstantin Makarychev, Miklos Z Racz, Cyrus Rashtchian, and Sergey Yekhanin. Batch Optimization for DNA Synthesis. In 2021 IEEE International Symposium on Information Theory (ISIT), pages 1949–1954. IEEE, 2021.

[MRS01] B. H. Marcus, Ron M. Roth, and Paul H. Siegel. An Introduction to Coding for Constrained Systems. Elsevier Press, 2001.

[MS21] Stephen Melczer and Bruno Salvy. Effective coefficient asymptotics of multivariate rational functions via semi-numerical algorithms for polynomial systems. J. Symbolic Comput., 103:234–279, 2021.

[NL06] Kang Ning and Hon Wai Leong. The distribution and deposition algorithm for multiple oligonucleotide arrays. Genome Informatics, 17(2):89–99, 2006.

[NL11] Kang Ning and Hon Wai Leong. The multiple sequence sets: problem and heuristic algorithms. Journal of Combinatorial Optimization, 22(4):778–796, 2011.

[OAC+18] Lee Organick, Siena Dumas Ang, Yuan-Jyue Chen, Randolph Lopez, Sergey Yekhanin, Konstantin Makarychev, Miklos Z Racz, Govinda Kamath, Parikshit Gopalan, Bichlien Nguyen, Christopher N Takahashi, Sharon Newman, Hsing-Yeh Parker, Cyrus Rashtchian, Kendall Stewart, Gagan Gupta, Robert Carlson, John Mulligan, Douglas Carmean, Georg Seelig, Luis Ceze, and Karin Strauss. Random access in large-scale DNA data storage. Nature Biotechnology, 36(3):242, 2018.

[Pat75] Arvind M. Patel. Zero-modulation encoding in magnetic recording. IBM Journal of Research and Development, 19:366–378, 1975.

[PW08] Robin Pemantle and Mark C Wilson. Twenty combinatorial examples of asymptotics derived from multivariate generating functions. SIAM Review, 50(2):199–272, 2008.

[PW13] Robin Pemantle and Mark C. Wilson. Analytic Combinatorics in Several Variables. Number 140 in Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2013.

[Rah06] Sven Rahmann. Subsequence combinatorics and applications to microarray production, DNA sequencing and chaining algorithms. In Annual Symposium on Combinatorial Pattern Matching, pages 153–164. Springer, 2006.

[SH21] Ilan Shomorony and Reinhard Heckel. DNA-based storage: Models and fundamental limits. IEEE Transactions on Information Theory, 67(6):3675–3689, 2021.

[Sha48] Claude E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379–423, October 1948.

[Sor05] Joseph B. Soriaga. On Near-Capacity Code Design for Partial-Response Channels. PhD thesis, University of California, San Diego, 2005.

[SS06] Joseph B. Soriaga and Paul H. Siegel. On the Design of Finite-State Shaping Encoders for Partial-Response Channels. In Proc. Inf. Theory Appl. Workshop, San Diego, CA, USA, February 2006.

[SWB06] Slawomir Stanczak, Marcin Wiczanowski, and Holger Boche. Resource Allocation in Wireless Networks. Lecture Notes in Computer Science. Springer Berlin Heidelberg, Berlin, Heidelberg, 2006.

[TB70] D. T. Tang and L. R. Bahl. Block codes for a class of constrained noiseless channels. Information and Control, 17:436–461, 1970.

[TF20] Yuanyuan Tang and Farzad Farnoud. Error-correcting codes for short tandem duplication and substitution errors. In 2020 IEEE International Symposium on Information Theory (ISIT), pages 734–739. IEEE, 2020.


[TWA+20] S Kasra Tabatabaei, Boya Wang, Nagendra Bala Murali Athreya, Behnam Enghiad, Alvaro Gonzalo Hernandez, Christopher J Fields, Jean-Pierre Leburton, David Soloveichik, Huimin Zhao, and Olgica Milenkovic. DNA punch cards for storing data on native DNA sequences via enzymatic nicking. Nature Communications, 11(1):1–10, 2020.

[Uch01] Osamu Uchida. Maximum generating rate of the variable-length nonuniform random number. In Proc. 2001 IEEE Info. Theory Workshop (Cat. No.01EX494), pages 141–143, 2001.

[Var71] B. Varn. Optimal variable length codes (arbitrary symbol cost and equal code word probability). Information and Control, 19:289–301, 1971.

[Wil88] J. H. Wilkinson. The Algebraic Eigenvalue Problem. Monographs on Numerical Analysis. Clarendon Press; Oxford University Press, Oxford; New York, 1988.

[YGM17] S. M. Hossein Tabatabaei Yazdi, Ryan Gabrys, and Olgica Milenkovic. Portable and error-free DNA-based data storage. Scientific Reports, 7(1):5011, 2017.

[YKG+15] S. M. Hossein Tabatabaei Yazdi, Han Mao Kiah, Eva Garcia-Ruiz, Jian Ma, Huimin Zhao, and Olgica Milenkovic. DNA-based storage: Trends and methods. IEEE Transactions on Molecular, Biological and Multi-Scale Communications, 1(3):230–248, September 2015.

[YKGR+15] S. M. Hossein Tabatabaei Yazdi, Han Mao Kiah, Eva Garcia-Ruiz, Jian Ma, Huimin Zhao, and Olgica Milenkovic. DNA-based storage: Trends and methods. IEEE Transactions on Molecular, Biological and Multi-Scale Communications, 1(3):230–248, 2015.

A Lemmas on Matrices

In the following statements, the period $d$ of a non-negative matrix $P \in \mathbb{R}^{M \times M}$ refers to the period $d$ of the associated directed graph (Definition 2.4), which has $M$ vertices and an edge from $v_i$ to $v_j$ if $[P]_{ij} > 0$.

Lemma A.1. Let $P$ be an irreducible matrix with period $d$ and spectral radius $\rho$. Then the adjoint matrix $\operatorname{adj}(\rho e^{2\pi i j/d} I - P)$ has rank one for all $j \in \{0, 1, \dots, d-1\}$.

Proof. The result can be deduced from, e.g., [Mey00, Problem 6.2.11]; we provide a short proof for the reader's convenience. Denote by $M$ the number of rows (and columns) of $P$ and abbreviate $\theta_j \triangleq 2\pi j/d$. We first show that $\operatorname{rank}(\rho e^{i\theta_j} I - P) = M - 1$. The Eigenvalues of $\rho e^{i\theta_j} I - P$ are given by $\rho e^{i\theta_j} - \lambda_i$, $i \in \{1, \dots, M\}$, where $\lambda_i$ are the (not necessarily distinct) Eigenvalues of $P$. Since $P$ is irreducible and has period $d$, by the Perron-Frobenius Theorem and [MRS01, Thm 3.18] the values $\rho e^{i\theta_j}$, $j \in \{0, 1, \dots, d-1\}$, are Eigenvalues of multiplicity one, and thus exactly one of the Eigenvalues $\rho e^{i\theta_j} - \lambda_i$ is zero and all others are non-zero. Therefore $\operatorname{rank}(\rho e^{i\theta_j} I - P) = M - 1$. Next, we observe that $\operatorname{adj}(\rho e^{i\theta_j} I - P)(\rho e^{i\theta_j} I - P) = \det(\rho e^{i\theta_j} I - P) I = 0$, and thus the rows of $\operatorname{adj}(\rho e^{i\theta_j} I - P)$ span a subspace of the left nullspace of $\rho e^{i\theta_j} I - P$. Since $\rho e^{i\theta_j} I - P$ has rank $M - 1$, this nullspace is one-dimensional and it follows that $\operatorname{rank}(\operatorname{adj}(\rho e^{i\theta_j} I - P)) \le 1$. On the other hand, $\rho e^{i\theta_j} I - P$ has rank $M - 1$ and thus there exists a non-singular $(M-1) \times (M-1)$ submatrix of $\rho e^{i\theta_j} I - P$ [Mey00, Chapter 4.5], so at least one entry of $\operatorname{adj}(\rho e^{i\theta_j} I - P)$ is non-zero. It follows that $\operatorname{adj}(\rho e^{i\theta_j} I - P)$ cannot have rank zero and thus has rank one.

Lemma A.2. Let $P$ be an irreducible matrix of period $d$. Then there are $d$ Eigenvalues $\rho e^{2\pi i j/d}$, $j \in \{0, 1, \dots, d-1\}$, of maximum modulus, and we denote their corresponding right and left non-zero Eigenvectors by $u_j$ and $v_j$. The adjoint matrix $\operatorname{adj}(\rho e^{2\pi i j/d} I - P)$ is given by

$$\operatorname{adj}(\rho e^{2\pi i j/d} I - P) = c_j \cdot u_j^T v_j,$$

where $c_j \neq 0$ is a linear scaling factor. Thus, $\operatorname{adj}(\rho I - P)$ is either all-positive or all-negative.

Proof. Abbreviate $\theta_j \triangleq 2\pi j/d$. By Lemma A.1, the adjoint matrix has rank one. It follows that $\operatorname{adj}(\rho e^{i\theta_j} I - P)$ can be written as the product of a column vector and a row vector. The properties of the adjoint matrix [HJ12, p. 20] imply that

$$\operatorname{adj}(\rho e^{i\theta_j} I - P)(\rho e^{i\theta_j} I - P) = (\rho e^{i\theta_j} I - P)\operatorname{adj}(\rho e^{i\theta_j} I - P) = \det(\rho e^{i\theta_j} I - P) I.$$

By [MRS01, Theorem 3.18], the rotated spectral radius $\rho e^{i\theta_j}$ is an Eigenvalue of $P$ and the matrix $\rho e^{i\theta_j} I - P$ is singular, so $\det(\rho e^{i\theta_j} I - P) = 0$. Hence

$$\operatorname{adj}(\rho e^{i\theta_j} I - P)(\rho e^{i\theta_j} I - P) = (\rho e^{i\theta_j} I - P)\operatorname{adj}(\rho e^{i\theta_j} I - P) = 0.$$

This implies that the columns of $\operatorname{adj}(\rho e^{i\theta_j} I - P)$ are right Eigenvectors of $P$ associated to $\rho e^{i\theta_j}$. Similarly, the rows of $\operatorname{adj}(\rho e^{i\theta_j} I - P)$ are left Eigenvectors of $P$ associated to $\rho e^{i\theta_j}$, and therefore $\operatorname{adj}(\rho e^{i\theta_j} I - P) = c_j \cdot u_j^T v_j$, where $u_j$ and $v_j$ are right and left Eigenvectors corresponding to $\rho e^{i\theta_j}$ and $c_j$ is a scalar.

The case $c_j = 0$ is not possible: it would imply $\operatorname{adj}(\rho e^{i\theta_j} I - P) = 0$, whereas the adjoint has rank exactly one by Lemma A.1.

We proceed with proving the second statement. By the Perron-Frobenius theorem, $u_0$ is either an all-zero, an all-positive, or an all-negative vector, and the same applies to $v_0$. If we now assume that $B \triangleq \operatorname{adj}(\rho I - P)$ satisfies $[B]_{11} > 0$, the observations above imply that every entry of $B$ must be positive. Similarly, if $[B]_{11} < 0$, we can conclude that every entry of $B$ must be negative.
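Lemmas A.1 and A.2 can be sanity-checked numerically. The sketch below uses an example that is not taken from the paper: the cyclic permutation matrix on three states, which is irreducible with period $d = 3$ and spectral radius $\rho = 1$. The helper names (`det`, `adjugate`, `is_rank_one`, `shifted_adjugate`) are illustrative only.

```python
import cmath

def det(m):
    """Determinant by cofactor expansion along the first row (tiny matrices only)."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def adjugate(m):
    """Adjoint (adjugate) matrix: the transpose of the cofactor matrix."""
    n = len(m)
    def minor(i, j):
        return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
    return [[(-1) ** (i + j) * det(minor(j, i)) for j in range(n)] for i in range(n)]

def is_rank_one(a, tol=1e-9):
    """True if a is non-zero and all of its 2x2 minors vanish."""
    n = len(a)
    nonzero = any(abs(a[i][j]) > tol for i in range(n) for j in range(n))
    minors_vanish = all(
        abs(a[i][j] * a[k][l] - a[i][l] * a[k][j]) < tol
        for i in range(n) for k in range(n) for j in range(n) for l in range(n))
    return nonzero and minors_vanish

# Assumed example: a 3-cycle, irreducible with period d = 3 and rho = 1.
P = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
d, rho = 3, 1.0

def shifted_adjugate(j):
    """adj(rho * exp(2 pi i j / d) * I - P)."""
    lam = rho * cmath.exp(complex(0, 2 * cmath.pi * j / d))
    n = len(P)
    return adjugate([[(lam if r == c else 0) - P[r][c] for c in range(n)]
                     for r in range(n)])

rank_one_flags = [is_rank_one(shifted_adjugate(j)) for j in range(d)]
adj0 = shifted_adjugate(0)  # Lemma A.2: entries should share one sign
```

For this matrix, `adj0` is the all-ones matrix, which is consistent with the sign claim of Lemma A.2 for $j = 0$.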

Lemma A.3. Let $P(x)$ be a matrix with spectral radius $\rho(x)$, whose entries are analytic functions in $x \in \mathbb{C}$. Also assume that $P(x)$ is irreducible with period $d$ for all $x \in \mathbb{R}^+$. Then, for each $j \in \{0, 1, \dots, d-1\}$ and all real-valued $x \in \mathbb{R}^+$ there exists a unique Eigenvalue $\lambda_j(x)$ of $P(x)$ with $\lambda_j(x) = \rho(x) e^{2\pi i j/d}$, which is analytic in a complex neighborhood around the positive real axis. Further, the associated right and left Eigenvectors $u_j(x)$ and $v_j(x)$, normalized to $v_j^T(x) u_j(x) = 1$, are analytic on the same domain.

Proof. By the Perron-Frobenius theorem and the extension [MRS01, Theorem 3.18], for every $j \in \{0, 1, \dots, d-1\}$ and $x_0 > 0$, $\lambda_j(x_0) = \rho(x_0) e^{2\pi i j/d}$ is a simple root of the characteristic polynomial $\varphi(\lambda) = \det(\lambda I - P(x_0))$. The coefficients of this polynomial $\varphi(\lambda)$ are polynomials in analytic functions, as the entries of $P(x)$ are analytic by assumption. The implicit function theorem for algebraic functions [Wil88, pp. 66-67] then implies that for each $x_0 > 0$ there exists an $\varepsilon > 0$ such that $\lambda_j(x)$ is an Eigenvalue of $P(x)$ and $\lambda_j(x)$ is an analytic function for all $x \in \mathbb{C}$ with $|x - x_0| < \varepsilon$. As proven in [Wil88, pp. 66-67], the associated Eigenvectors are also analytic functions in $x$ in a neighborhood around the positive real axis.

Lemma A.4. Let $P(x)$ be a matrix with spectral radius $\rho(x)$, whose entries are analytic functions in $x \in \mathbb{C}$. Further, let $P(x)$ be irreducible with period $d$ for all $x \in \mathbb{R}^+$. Then, the Eigenvalues $\lambda_j(x)$ of $P(x)$ of maximum modulus and the associated right and left Eigenvectors $u_j(x)$ and $v_j(x)$, normalized to $v_j^T(x) u_j(x) = 1$, are analytic in a neighborhood around $\mathbb{R}^+$ and it holds that

$$v_j(x) \frac{\partial P(x)}{\partial x} u_j^T(x) = \frac{\partial \lambda_j(x)}{\partial x}.$$


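Proof. To start with, denote by $\lambda_j(x)$ the Eigenvalues of maximum modulus whose existence is guaranteed by Lemma A.3. Due to the analyticity proven in Lemma A.3, $\lambda_j(x)$ and $u_j(x)$, $v_j(x)$ are differentiable. Differentiating $P(x) u_j^T(x) = \lambda_j(x) u_j^T(x)$ on both sides with respect to $x$ yields

$$P(x) \frac{\partial u_j^T(x)}{\partial x} + \frac{\partial P(x)}{\partial x} u_j^T(x) = \lambda_j(x) \frac{\partial u_j^T(x)}{\partial x} + \frac{\partial \lambda_j(x)}{\partial x} u_j^T(x).$$

Multiplying with $v_j(x)$ from the left (notice that $v_j(x) P(x) = \lambda_j(x) v_j(x)$, since $v_j(x)$ is a left Eigenvector of $P(x)$), the terms containing $\partial u_j^T(x)/\partial x$ cancel, and the normalization of the Eigenvectors gives the desired equality

$$v_j(x) \frac{\partial P(x)}{\partial x} u_j^T(x) = \frac{\partial \lambda_j(x)}{\partial x}.$$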
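The identity of Lemma A.4 can be checked numerically on a small example. The sketch below assumes a hypothetical cost-enumerator matrix $P(x) = \begin{pmatrix} x & x^2 \\ x & 0 \end{pmatrix}$, which is not taken from the paper, whose Perron Eigenvalue and Eigenvectors have closed forms. It compares $v (\partial P/\partial x) u^T$, normalized so that $v u^T = 1$, against a central finite difference of the Eigenvalue.

```python
import math

def lam(x):
    """Perron eigenvalue of the assumed matrix P(x) = [[x, x^2], [x, 0]].

    It is the positive root of l^2 - x*l - x^3 = 0.
    """
    return (x + math.sqrt(x * x + 4 * x ** 3)) / 2

def derivative_via_eigenvectors(x):
    """Evaluate v (dP/dx) u^T after scaling so that v u^T = 1."""
    l = lam(x)
    u = (l, x)        # right eigenvector: P u^T = l u^T
    v = (l, x * x)    # left eigenvector:  v P = l v
    dP = ((1.0, 2 * x), (1.0, 0.0))  # entrywise derivative of P(x)
    dPu = (dP[0][0] * u[0] + dP[0][1] * u[1],
           dP[1][0] * u[0] + dP[1][1] * u[1])
    scale = v[0] * u[0] + v[1] * u[1]  # enforces the normalization v u^T = 1
    return (v[0] * dPu[0] + v[1] * dPu[1]) / scale

def derivative_via_finite_difference(x, h=1e-6):
    """Central difference approximation of d lam / dx."""
    return (lam(x + h) - lam(x - h)) / (2 * h)

gaps = [abs(derivative_via_eigenvectors(x) - derivative_via_finite_difference(x))
        for x in (0.3, 1.0, 2.5)]
```

The two derivatives agree up to finite-difference error, as the lemma predicts for the Eigenvalue of maximum modulus.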

B Concavity and Maximality of Fixed-Length Capacity

Proposition B.1. Let $G$ be a strongly connected, deterministic, cost-diverse graph. Then, $C_G(\alpha)$ is a concave function in $\alpha$ and its maximum is equal to $C_G(\alpha^*) = C_G$, where $\alpha^* = 2^{C_G} / \rho_G'(2^{-C_G})$.

Proof. To start with, $C_G(\alpha)$ is linear in the interval $0 \le \alpha < \alpha^{\mathrm{lo}}_G$. In the interval $\alpha^{\mathrm{lo}}_G \le \alpha < \alpha^{\mathrm{up}}_G$, $C_G(\alpha) = -\log x_0(\alpha) + \alpha \log \rho_G(x_0(\alpha))$, where $x_0(\alpha)$ is the unique positive solution to $f(x) = \alpha^{-1}$ with $f(x) \triangleq x \rho_G'(x) / \rho_G(x)$. Notice that $\rho_G(x) > 0$ for all $x > 0$ and thus, by Lemma A.3, $f(x)$ is analytic for all $x > 0$. Further, as in the proof of Lemma 5.21, we can show that $\frac{\partial}{\partial x} f(x) > 0$, which means that $x_0(\alpha)$ is analytic in $\alpha$ and also strictly monotonically decreasing in $\alpha$. Therefore, for $\alpha^{\mathrm{lo}}_G \le \alpha < \alpha^{\mathrm{up}}_G$,

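$$\begin{aligned}
\frac{\partial}{\partial \alpha} C_G(\alpha) &= -\frac{x_0'(\alpha)}{x_0(\alpha)} + \log \rho_G(x_0(\alpha)) + \alpha \frac{x_0'(\alpha) \rho_G'(x_0(\alpha))}{\rho_G(x_0(\alpha))} \\
&= -\frac{x_0'(\alpha)}{x_0(\alpha)} + \log \rho_G(x_0(\alpha)) + \alpha f(x_0(\alpha)) \frac{x_0'(\alpha)}{x_0(\alpha)} \\
&\overset{(a)}{=} \log \rho_G(x_0(\alpha)),
\end{aligned}$$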

where we used in (a) that $f(x_0(\alpha)) = \alpha^{-1}$ by definition. Since $x_0(\alpha)$ is strictly monotonically decreasing in $\alpha$ and both $\rho_G(x)$ and the logarithm are strictly monotone functions (see Lemma A.4), $C_G(\alpha)$ is strictly concave in the considered interval. By definition of $\alpha^{\mathrm{lo}}_G$, $C_G(\alpha) \to \alpha^{\mathrm{lo}}_G \log \rho_G(1)$ as $\alpha$ approaches $\alpha^{\mathrm{lo}}_G$ from both the left and the right, proving continuity of $C_G(\alpha)$. Therefore, $C_G(\alpha)$ is a concave function on the full interval $0 \le \alpha \le \alpha^{\mathrm{up}}_G$. From the above derivation of the derivative of $C_G(\alpha)$, we further see that $\alpha^*$ with $\rho_G(x_0(\alpha^*)) = 1$ is the unique stationary point of $C_G(\alpha)$, with capacity $C_G(\alpha^*) = -\log x_0(\alpha^*) = C_G$. It follows that $\alpha^*$ is the unique solution for $\alpha$ to the system of two equations $f(x) = \alpha^{-1}$ and $\rho_G(x) = 1$, $x > 0$. The above exposition proves that the solution to these equations is given by $\alpha^*$ and $x_0(\alpha^*)$, where $\alpha^*$ is as given in the statement.
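As an illustration of Proposition B.1, the following sketch evaluates $C_G(\alpha)$ on a grid for the cost enumerator $\rho_G(x) = x + x^2$ used in Example E.1. Two things are assumptions made here rather than statements from the paper: that this example meets the hypotheses of the proposition, and that logarithms are taken to base 2 (suggested by the expression $2^{C_G}$ in the statement). For this $\rho_G$, solving $f(x) = \alpha^{-1}$ gives $x_0(\alpha) = (1-\alpha)/(2\alpha-1)$ on $2/3 \le \alpha < 1$.

```python
import math

def x0(alpha):
    """Positive solution of x*rho'(x)/rho(x) = 1/alpha for rho(x) = x + x^2."""
    return (1 - alpha) / (2 * alpha - 1)

def capacity(alpha):
    """C_G(alpha) = -log2 x0 + alpha * log2 rho(x0), valid for 2/3 <= alpha < 1."""
    x = x0(alpha)
    return -math.log2(x) + alpha * math.log2(x + x * x)

# Grid inside the interval (alpha_lo, alpha_up) = (2/3, 1) of this example.
alphas = [0.67 + 0.001 * k for k in range(320)]
values = [capacity(a) for a in alphas]

# Discrete concavity check: all second differences should be negative.
second_differences = [values[k - 1] - 2 * values[k] + values[k + 1]
                      for k in range(1, len(values) - 1)]

best_alpha = alphas[values.index(max(values))]

phi = (1 + math.sqrt(5)) / 2
c_g = math.log2(phi)                            # capacity from Example E.1
alpha_star = 2 ** c_g / (1 + 2 * 2 ** (-c_g))   # 2^{C_G} / rho'(2^{-C_G})
```

On this grid the maximizer lands next to `alpha_star` (about 0.7236) and the maximum value is close to `c_g`.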

C Variation of the Chinese Remainder Theorem

Lemma C.1. Let $(m_1, \tau_1), (m_2, \tau_2), \dots$ be pairs of integers $(m_i, \tau_i) \in \mathbb{N}^2$ and denote by $d$ the greatest common divisor of all $m_i$. If these pairs satisfy

$$m_i \tau_j \equiv m_j \tau_i \pmod{cd}$$

for all $i, j$, then there exists $b \in \mathbb{Z}$ such that for all $i$,

$$d \tau_i \equiv m_i b \pmod{cd}.$$

Further, any $b' \in \mathbb{Z}$ with $b' \equiv b \pmod{c}$ has the same property.

Proof. We prove the statement by a direct construction. Assume without loss of generality that $\gcd(m_1, \dots, m_n) = d$ for some $n \in \mathbb{N}$. This is possible, since there exist finitely many $m_i$ such that their greatest common divisor is equal to $d$. By Bezout's identity, there exist $z_1, \dots, z_n \in \mathbb{Z}$ with $z_1 m_1 + \dots + z_n m_n = d$. Choosing $b = z_1 \tau_1 + \dots + z_n \tau_n$, we obtain for any $1 \le i \le n$,

$$m_i b = z_i m_i \tau_i + \sum_{j \neq i} z_j m_i \tau_j = \tau_i \Big( d - \sum_{j \neq i} z_j m_j \Big) + \sum_{j \neq i} z_j m_i \tau_j = \tau_i d + \sum_{j \neq i} z_j (m_i \tau_j - m_j \tau_i).$$

By assumption $m_i \tau_j - m_j \tau_i \equiv 0 \pmod{cd}$ and thus $m_i b \equiv \tau_i d \pmod{cd}$. On the other hand, for any $i > n$, we set $z_i = 0$ and obtain via a similar argument

$$m_i b = z_i m_i \tau_i + \sum_{j=1}^{n} z_j m_i \tau_j = \tau_i d + \sum_{j=1}^{n} z_j (m_i \tau_j - m_j \tau_i),$$

which implies that $m_i b \equiv \tau_i d \pmod{cd}$. This concludes the proof.
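The construction in the proof is directly executable. The sketch below computes Bezout coefficients with the extended Euclidean algorithm, forms $b = \sum_i z_i \tau_i$, and returns it. The function names and the numerical instance ($m = (12, 18, 30)$, $\tau = (19, 31, 50)$, $c = 5$) are hypothetical and chosen only so that the hypothesis of Lemma C.1 holds.

```python
def ext_gcd(a, b):
    """Return (g, s, t) with s*a + t*b == g == gcd(a, b) for positive a, b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t

def bezout(ms):
    """Return (d, zs) with sum(z*m for z, m in zip(zs, ms)) == d == gcd of ms."""
    d, zs = ms[0], [1]
    for m in ms[1:]:
        g, s, t = ext_gcd(d, m)
        zs = [s * z for z in zs] + [t]  # rescale old coefficients, append new one
        d = g
    return d, zs

def lemma_c1_witness(ms, taus, c):
    """Return (d, b) as constructed in the proof; checks the hypothesis first."""
    d, zs = bezout(ms)
    mod = c * d
    for i in range(len(ms)):
        for j in range(len(ms)):
            if (ms[i] * taus[j] - ms[j] * taus[i]) % mod != 0:
                raise ValueError("hypothesis m_i*tau_j = m_j*tau_i (mod cd) fails")
    b = sum(z * tau for z, tau in zip(zs, taus))
    return d, b

# Hypothetical instance satisfying the hypothesis with c*d = 30.
ms, taus, c = [12, 18, 30], [19, 31, 50], 5
d, b = lemma_c1_witness(ms, taus, c)
```

Shifting `b` by any multiple of `c` preserves the congruences, since $d$ divides every $m_i$.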

D Combinatorial Characterization of $\alpha^{\mathrm{lo}}_G$ and $\alpha^{\mathrm{up}}_G$

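Recall the definitions $\alpha^{\mathrm{lo}}_G \triangleq \rho_G(1) / \rho_G'(1)$ and $\alpha^{\mathrm{up}}_G \triangleq \lim_{x \to 0^+} \rho_G(x) / (x \rho_G'(x))$.

For $x > 0$, the cost-enumerator matrix $P_G(x)$ is irreducible, so by [MRS01, Lemma 3.5]

$$\lim_{m \to \infty} \frac{1}{m} \log \Big( \sum_{u,v} (P_G(x)^m)_{u,v} \Big) = \log \rho_G(x). \tag{6}$$

The terms $(P_G(x)^m)_{u,v}$ in the summation are the generating functions for the costs of paths $p$ of length $m$ from state $u$ to state $v$, i.e.,

$$(P_G(x)^m)_{u,v} = \sum_{\substack{p \,:\, u \to v \\ \text{length } p = m}} x^{\tau(p)}. \tag{7}$$

Taking the derivative with respect to $x$ and, by uniform convergence, changing the order of differentiation and limit on the left, we have

$$\lim_{m \to \infty} \Big( \frac{1}{m} \log \Big( \sum_{u,v} (P_G(x)^m)_{u,v} \Big) \Big)' = \frac{\rho_G'(x)}{\rho_G(x)}. \tag{8}$$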

The derivative of the argument of the limit, evaluated at $x = 1$, is

$$\Big( \frac{1}{m} \log \Big( \sum_{u,v} (P_G(x)^m)_{u,v} \Big) \Big)' \Big|_{x=1} = \frac{1}{m} \frac{\sum_p \tau(p) x^{\tau(p)-1}}{\sum_p x^{\tau(p)}} \Big|_{x=1} = \frac{\sum_p \tau(p)/m}{N(m)} \triangleq T_{\mathrm{ave},m}, \tag{9}$$

where the sums in the numerator and denominator of the expressions on the right run over all paths $p$ of length $m$ in $G$ and $N(m)$ denotes the total number of such paths. Since $\tau(p)/m$ is the average edge cost of the path $p$, the derivative evaluated at $x = 1$ equals $T_{\mathrm{ave},m}$, the average edge cost over all paths of length $m$ in $G$. Setting $x = 1$ in (8), we obtain

$$T_{\mathrm{ave}} \triangleq \lim_{m \to \infty} \frac{\sum_p \tau(p)/m}{N(m)} = \frac{\rho_G'(1)}{\rho_G(1)} = (\alpha^{\mathrm{lo}}_G)^{-1}, \tag{10}$$

where we can interpret $T_{\mathrm{ave}}$ as the asymptotic average cost per edge over all paths in $G$. Noting that the Markov chain of maximum entropy $H_{\max}$ on $G$ assigns probability approximately $2^{-m H_{\max}}$ to each path in $G$, we see that this is also the average cost per edge with respect to this Markov chain.
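The interpretation of $(\alpha^{\mathrm{lo}}_G)^{-1}$ as an average edge cost can be checked by brute force on a small graph. The sketch below counts paths by total cost with a dynamic program and evaluates $T_{\mathrm{ave},m}$ from (9). The edge-list representation and function names are assumptions made for illustration. The test graph is the single-state system with two self-loops of costs 1 and 2, chosen because its cost enumerator $\rho(x) = x + x^2$ matches Example E.1, so that $\rho'(1)/\rho(1) = 3/2$.

```python
from collections import defaultdict

def path_cost_counts(edges, num_states, m):
    """Map total cost -> number of length-m paths (any start and end state).

    edges is a list of (u, v, cost) triples with integer costs.
    """
    # cur[v][c] = number of paths of the current length ending in v with cost c
    cur = [defaultdict(int) for _ in range(num_states)]
    for v in range(num_states):
        cur[v][0] = 1
    for _ in range(m):
        nxt = [defaultdict(int) for _ in range(num_states)]
        for (u, v, cost) in edges:
            for c, n in cur[u].items():
                nxt[v][c + cost] += n
        cur = nxt
    total = defaultdict(int)
    for v in range(num_states):
        for c, n in cur[v].items():
            total[c] += n
    return dict(total)

def t_ave(edges, num_states, m):
    """Average cost per edge over all paths of length m, as in equation (9)."""
    counts = path_cost_counts(edges, num_states, m)
    n_paths = sum(counts.values())
    return sum(c * n for c, n in counts.items()) / (m * n_paths)

# Assumed example: one state, self-loops of cost 1 and 2, rho(x) = x + x^2.
edges = [(0, 0, 1), (0, 0, 2)]
value = t_ave(edges, 1, 10)
```

For this graph the finite-length average already equals the limit $3/2$. For graphs with several states, `t_ave` only approaches $\rho_G'(1)/\rho_G(1)$ as `m` grows.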

Turning to $\alpha^{\mathrm{up}}_G$, we note from [MRS01, Theorem 3.17] that an expression similar to (6) applies to the cycles at each state $u$ in $G$, namely

$$\lim_{m \to \infty} \frac{1}{m} \log (P_G(x)^m)_{u,u} = \log \rho_G(x). \tag{11}$$

The matrix element $(P_G(x)^m)_{u,u}$ is the generating function for the cost of cycles of length $m$ at vertex $u$. So the logarithm in the argument of the limit can be written as

$$\log (P_G(x)^m)_{u,u} = \log \Big( \sum_{\text{length-}m \text{ cycles } p \text{ at } u} x^{\tau(p)} \Big). \tag{12}$$

Let $T_{\min,m}(u)$ denote the minimum average cost per edge over cycles of length $m$ at $u$. We note that the limit $T_{\min} = \liminf_{m \to \infty} T_{\min,m}(u)$ is independent of $u$ and equals the minimum average cost per edge in a simple (non-intersecting) cycle in $G$.

Rewriting $x = e^{\log(x)}$ and applying the log-sum-exp inequality to the resulting expression we obtain

$$\max_{\text{cycles } p} \{ \log(x) \tau(p) \} \le \log (P_G(x)^m)_{u,u} \le \max_{\text{cycles } p} \{ \log(x) \tau(p) \} + \log (P_G(1)^m)_{u,u}, \tag{13}$$

where the last term on the right is the logarithm of the number of cycles of length $m$ at $u$. We are interested in the limit as $x \to 0^+$, so, restricting to the range $0 < x < 1$ where $\log(x) < 0$, the maximum evaluates to $\log(x) \, m \, T_{\min,m}$, yielding

$$\log(x) \, m \, T_{\min,m} \le \log (P_G(x)^m)_{u,u} \le \log(x) \, m \, T_{\min,m} + \log (P_G(1)^m)_{u,u}. \tag{14}$$

Normalizing by $m$ and taking the limit, we invoke (11) to conclude that

$$\log(x) \, T_{\min} \le \log \rho_G(x) \le \log(x) \, T_{\min} + \log \rho_G(1). \tag{15}$$

We divide by $\log(x)$ and take the limit as $x \to 0^+$. Recalling that $\rho_G(x) \to 0$ as $x \to 0^+$, we apply l'Hôpital's rule to conclude

$$T_{\min} = \lim_{x \to 0^+} \frac{x \rho_G'(x)}{\rho_G(x)} = (\alpha^{\mathrm{up}}_G)^{-1}. \tag{16}$$


E Examples and Illustrations of our Methods

In this appendix we give further illustrations of our techniques. First, we give an example showing how to apply univariate methods.

Example E.1 (Capacity of the Binary Alternating Sequence). Consider the constrained system from Figure 2. The adjacency matrix of the system is given by $P(x) = x + x^2$ and thus the generating function of the series $N(t)$ is equal to

$$F(x) = \frac{1}{(1-x)(1-x-x^2)}.$$

The poles of this function are given by $x_1 = 1$, $x_2 = -\frac{\sqrt{5}+1}{2}$, $x_3 = \frac{\sqrt{5}-1}{2}$. To analyze the coefficients of this generating function, it is standard procedure to calculate the partial fraction decomposition, and we obtain

$$F(x) = -\frac{1}{1 - x/x_1} + \frac{1 - \frac{2\sqrt{5}}{5}}{1 - x/x_2} + \frac{1 + \frac{2\sqrt{5}}{5}}{1 - x/x_3},$$

such that $\Pi_1(t) = -1$, $\Pi_2(t) = 1 - \frac{2\sqrt{5}}{5}$ and $\Pi_3(t) = 1 + \frac{2\sqrt{5}}{5}$. Using this decomposition, the sequence $N(t)$ is precisely

$$N(t) = -1 + \Big( 1 - \frac{2\sqrt{5}}{5} \Big) \Big( \frac{1 - \sqrt{5}}{2} \Big)^t + \Big( 1 + \frac{2\sqrt{5}}{5} \Big) \Big( \frac{1 + \sqrt{5}}{2} \Big)^t.$$

Since the last summand is asymptotically dominant, the capacity of this constrained system is $C = \log \big( \frac{1+\sqrt{5}}{2} \big)$, which can alternatively be derived from the positive solution $x_0 = \frac{\sqrt{5}-1}{2}$ to $\rho(x) = 1$, where $\rho(x) = x + x^2$, confirming Theorem 3.5.
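The closed form for $N(t)$ can be compared against the power series coefficients of $F(x)$ computed by a recurrence. In the sketch below the function names are illustrative, and the base-2 logarithm for the capacity is an assumption about the paper's convention.

```python
import math

def series_coefficients(n_terms):
    """First n_terms power series coefficients of 1/((1-x)(1-x-x^2))."""
    # Coefficients a_t of 1/(1-x-x^2) satisfy a_t = a_{t-1} + a_{t-2}, a_0 = 1.
    a = []
    for t in range(n_terms):
        if t == 0:
            a.append(1)
        else:
            a.append(a[t - 1] + (a[t - 2] if t >= 2 else 0))
    # Multiplying by 1/(1-x) turns coefficients into their prefix sums.
    out, running = [], 0
    for value in a:
        running += value
        out.append(running)
    return out

def closed_form(t):
    """N(t) from the partial fraction decomposition of Example E.1."""
    s5 = math.sqrt(5)
    return (-1
            + (1 - 2 * s5 / 5) * ((1 - s5) / 2) ** t
            + (1 + 2 * s5 / 5) * ((1 + s5) / 2) ** t)

coeffs = series_coefficients(20)
matches = [round(closed_form(t)) == c for t, c in enumerate(coeffs)]

# Empirical growth rate (1/t) log2 N(t) versus the claimed capacity.
capacity = math.log2((1 + math.sqrt(5)) / 2)
empirical_rate = math.log2(series_coefficients(400)[-1]) / 399
```

The growth rate approaches the capacity slowly, because the constant factor in front of the dominant term contributes an $O(1/t)$ correction.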

Next, we go through a multivariate example from first principles to sketch how the techniques of ACSV work. Note that in the example here we have a strictly minimal critical point, which simplifies the analysis from what we need to consider in the general case.

Example E.2 (ACSV for the Binary Alternating Sequence). Consider the constrained system from Figure 2. The corresponding generating function is $F(x, y) = \frac{1}{(1-x)(1-yx(x+1))}$, so the Cauchy residue theorem gives the representation

$$N(t, \alpha t) = \frac{1}{(2\pi i)^2} \int_{|x| = \varepsilon, \, |y| = \varepsilon} \frac{1}{(1-x)(1-yx(x+1))} \frac{dx \, dy}{x^{t+1} y^{\alpha t+1}}$$

for the power series coefficients of $F$. By inspection we see that the only non-zero terms $c_{i,j} x^i y^j$ appearing in the power series expansion of $F$ have $i \ge j$, so we determine asymptotics of this sequence as $t \to \infty$ and $\alpha \in (0, 1)$. That $\alpha$ lies in $(0, 1)$ is also a reflection of the fact that our models use integer weights.

Let $H_1(x, y) = 1 - yx(x+1)$ and $H_2(x, y) = 1 - x$. The critical points on the smooth part of the singular set defined by $H_1(x, y) = 0$ and $H_2(x, y) \neq 0$ satisfy (2) with $H = H_1$, which has a unique solution

$$(x_0, y_0) = \Big( \frac{1-\alpha}{2\alpha-1}, \frac{(1-2\alpha)^2}{\alpha(1-\alpha)} \Big).$$


Because of the factor $1 - x$ in $H(x, y)$ the point $(x_0, y_0)$ is minimal only when $\alpha \in [2/3, 1)$, and since $y = 1/(x(x+1))$ when $H_1(x, y) = 0$ this point is strictly minimal when $\alpha \in (2/3, 1)$. None of the smooth points where $H_2(x, y) = 0$ and $H_1(x, y) \neq 0$ are critical, but the nonsmooth point $(x_1, y_1) = (1, 1/2)$ defined by $H_1(x, y) = H_2(x, y) = 0$ is also a minimal critical point.
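The stated critical point can be verified in exact rational arithmetic. The sketch below assumes that the critical point equations referred to as (2), which are not reproduced in this appendix, take the standard smooth ACSV form $H = 0$ and $\alpha \, x H_x = y H_y$ for the direction $(1, \alpha)$. It also checks the threshold $\alpha = 2/3$ at which $x_0$ crosses the pole $x = 1$ of the factor $1 - x$.

```python
from fractions import Fraction

def critical_point(alpha):
    """(x0, y0) as given in Example E.2."""
    x0 = (1 - alpha) / (2 * alpha - 1)
    y0 = (1 - 2 * alpha) ** 2 / (alpha * (1 - alpha))
    return x0, y0

def h1(x, y):
    return 1 - y * x * (x + 1)

def h1_x(x, y):
    """Partial derivative of H1 with respect to x."""
    return -y * (2 * x + 1)

def h1_y(x, y):
    """Partial derivative of H1 with respect to y."""
    return -x * (x + 1)

def satisfies_critical_equations(alpha):
    """Check H1 = 0 and alpha * x * H1_x = y * H1_y (assumed form of (2))."""
    x, y = critical_point(alpha)
    return h1(x, y) == 0 and alpha * x * h1_x(x, y) == y * h1_y(x, y)

checks = [satisfies_critical_equations(Fraction(n, 20)) for n in range(11, 20)]
x_at_threshold = critical_point(Fraction(2, 3))[0]
```

Using `Fraction` avoids rounding, so the equalities hold exactly rather than up to a tolerance.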

If $\alpha \in (2/3, 1)$ then $(x_0, y_0)$ is a minimal point so we may write

$$N(t, \alpha t) = \frac{1}{(2\pi i)^2} \int_{|x| = \sigma_1} \Big( \int_{|y| = \sigma_2 - \varepsilon} \frac{1}{(1-x)(1-yx(x+1))} \frac{dy}{y^{\alpha t+1}} \Big) \frac{dx}{x^{t+1}}$$

for any sufficiently small $\varepsilon > 0$. Because $(x_0, y_0)$ is strictly minimal, we may truncate the circle $|x| = \sigma_1$ in this domain of integration to any neighbourhood $\mathcal{N}$ of the circle around the positive real axis, while introducing only an exponentially negligible error. Furthermore, because this integral decays exponentially with the modulus of $y$, replacing $|y| = \sigma_2 - \varepsilon$ with $|y| = \sigma_2 + \varepsilon$ results in an integral which grows exponentially slower than $N(t, \alpha t)$. Combined, these two facts imply that the coefficient $N(t, \alpha t)$ is, up to negligible error, approximated by the integral

$$\frac{1}{(2\pi i)^2} \int_{\mathcal{N}} \Big( \int_{|y| = \sigma_2 - \varepsilon} \frac{1}{(1-x)(1-yx(x+1))} \frac{dy}{y^{\alpha t+1}} - \int_{|y| = \sigma_2 + \varepsilon} \frac{1}{(1-x)(1-yx(x+1))} \frac{dy}{y^{\alpha t+1}} \Big) \frac{dx}{x^{t+1}},$$

where $\mathcal{N} = \{ \sigma_1 e^{i\theta} : -a < \theta < a \}$ for any sufficiently small $a > 0$. Because $(x_0, y_0)$ is strictly minimal and $\mathcal{V}$ is smooth at $(x_0, y_0)$, for any $x \in \mathcal{N}$ the inner integral equals the univariate residue in $y$ at the singularity $y = 1/(x(x+1))$, giving the asymptotic approximation

$$N(t, \alpha t) \sim \frac{1}{2\pi i} \int_{\mathcal{N}} \frac{1}{(1-x) x (1+x)} \frac{dx}{(x(x+1))^{-\alpha t-1} x^{t+1}}.$$

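Parametrizing $x = \sigma_1 e^{i\theta}$ then gives

$$N(t, \alpha t) \sim \frac{1}{2\pi} \int_{-a}^{a} \frac{1}{1 - \sigma_1 e^{i\theta}} e^{-t \varphi(\theta)} \, d\theta, \tag{17}$$

where $\varphi(\theta) = \log(\sigma_1 e^{i\theta}) - \alpha \log(\sigma_1 e^{i\theta}) - \alpha \log(1 + \sigma_1 e^{i\theta})$. Taking derivatives shows that the derivative $\varphi'(\theta)$ vanishes only at the origin and, after some algebraic manipulation, gives the series expansion

$$\varphi(\theta) = \log \sigma_1 + \alpha \log \sigma_2 + \frac{(1-\alpha)(2\alpha-1)}{2\alpha} \theta^2 + O(\theta^3),$$

where we note that vanishing of $\varphi'$ at the origin occurs precisely because $(x_0, y_0)$ is a critical point. The classical saddle-point method then implies we can replace $\varphi$ in (17) by its power series at the origin, giving explicit dominant asymptotic behaviour

$$\begin{aligned}
N(t, \alpha t) &\sim \frac{\sigma_1^{-t} \sigma_2^{-\alpha t}}{2\pi} \int_{-\infty}^{\infty} \frac{1}{1 - \sigma_1} e^{-t \frac{(1-\alpha)(2\alpha-1)}{2\alpha} \theta^2} \, d\theta \\
&= \Big( \frac{2\alpha-1}{1-\alpha} \Big)^t \Big( \frac{\alpha(1-\alpha)}{(1-2\alpha)^2} \Big)^{\alpha t} t^{-1/2} \frac{\sqrt{(2\alpha-1)\alpha}}{(3\alpha-2) \sqrt{2\pi(1-\alpha)}}.
\end{aligned}$$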
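The saddle-point asymptotics can be compared with exact coefficients. Expanding $F(x, y) = \frac{1}{1-x} \sum_j y^j x^j (1+x)^j$ shows that the coefficient of $x^t y^j$ is the partial binomial sum $\sum_{k=0}^{t-j} \binom{j}{k}$. This expansion and the function names below are a sketch added for illustration, not part of the original derivation. The formula is evaluated with $j = \alpha t$ for $\alpha = 3/4$, which lies in $(2/3, 1)$.

```python
import math

def exact_coefficient(t, j):
    """Coefficient of x^t y^j in 1/((1-x)(1-y*x*(x+1)))."""
    # [y^j] F = x^j (1+x)^j / (1-x); dividing by (1-x) takes prefix sums in x.
    return sum(math.comb(j, k) for k in range(0, t - j + 1))

def smooth_point_asymptotic(t, j):
    """Leading-order asymptotic formula of Example E.2 for alpha = j/t in (2/3, 1)."""
    alpha = j / t
    inv_sigma1 = (2 * alpha - 1) / (1 - alpha)                # 1 / x0
    inv_sigma2 = alpha * (1 - alpha) / (1 - 2 * alpha) ** 2   # 1 / y0
    const = (math.sqrt((2 * alpha - 1) * alpha)
             / ((3 * alpha - 2) * math.sqrt(2 * math.pi * (1 - alpha))))
    return inv_sigma1 ** t * inv_sigma2 ** j * const / math.sqrt(t)

t, j = 400, 300  # alpha = 3/4
ratio = exact_coefficient(t, j) / smooth_point_asymptotic(t, j)
```

The ratio is close to 1 and is expected to approach 1 as $t$ grows, with a relative error of order $1/t$ coming from the saddle-point approximation.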

If $\alpha \in (0, 2/3)$ then $(x_0, y_0)$ is no longer minimal, and the above argument breaks down. However, the point $(x_1, y_1) = (1, 1/2)$ is still minimal so we may write

$$N(t, \alpha t) = \frac{1}{(2\pi i)^2} \int_{|x| = 0.95, \, |y| = 0.45} \frac{1}{(1-x)(1-yx(x+1))} \frac{dx \, dy}{x^{t+1} y^{\alpha t+1}}.$$

Similar to the previous case, it can be shown that the integrand exponentially decreases for $x$ and $y$ away from the positive real axis, so for any arbitrarily small neighbourhood $\mathcal{N}$ of $(0.95, 0.45)$ in the product of circles $\{ (x, y) \in \mathbb{C}^2 : |x| = 0.95, |y| = 0.45 \}$ we may approximate

$$N(t, \alpha t) = \frac{1}{(2\pi i)^2} \int_{\mathcal{N}} \frac{1}{(1-x)(1-yx(x+1))} \frac{dx \, dy}{x^{t+1} y^{\alpha t+1}} + O(\omega^t)$$

for some $\omega > 0$ that will be less than the exponential growth of $N(t, \alpha t)$. As above, our goal will be to compute a residue to simplify this integral. However, because the zero sets of $H_1(x, y) = 1 - yx(x+1)$ and $H_2(x, y) = 1 - x$ split $\mathbb{C}^2$ into four chambers near $(x_1, y_1)$ (see Figure 11), we will be able to take a 'double' residue to directly compute $N(t, \alpha t)$ up to an exponentially small error, without needing a saddle-point analysis.

To perform this analysis, we first note that the exponential growth of the integrand $I$ under consideration is captured by the height function

$$h_\alpha(x, y) = \lim_{t \to \infty} |I|^{1/t} = x^{-1} y^{-\alpha}.$$

Since the gradient of $h_\alpha$ at $(1, 1/2)$ is a positive² linear combination

$$(\nabla h_\alpha)(1, 1/2) = \alpha 2^{\alpha} (\nabla H_1)(1, 1/2) + 2^{\alpha} (1 - 3\alpha/2) (\nabla H_2)(1, 1/2)$$

of the gradients of $H_1$ and $H_2$ at $(1, 1/2)$, we are able to pick a point arbitrarily close to $(1, 1/2)$ in each of the three chambers of $\mathbb{R}^2 \setminus \mathcal{V}(H_1 H_2)$ not containing the origin where the value of $h_\alpha(x, y)$ is less than $h_\alpha(1, 1/2) = 2^{\alpha}$. See Figure 11 for an example of such points when $\alpha = 1/2$.

A smaller value of the height function corresponds to an exponentially smaller Cauchy integral, so for each such point $(a, b)$ we can add the integral over any arbitrarily small neighbourhood of $\mathbb{R}^2_{>0}$ in $\{ (x, y) : |x| = a, |y| = b \}$ while introducing only an exponentially negligible error. Adding a signed sum of such integrals then allows us to take two residues, ultimately giving

$$N(t, \alpha t) = \frac{1}{(2\pi i)^2} \operatorname*{Res}_{\substack{x = 1 \\ y = 1/(x(x+1))}} \frac{1}{(1-x)(1-yx(x+1))} \frac{1}{x^{t+1} y^{\alpha t+1}} + O(\omega^t) = 2^{\alpha t} + O(\omega^t)$$

for some $\omega \in (0, 1/2)$. Note that in this case we obtain asymptotics up to an exponentially small error (the polynomial error term in the first case was introduced by the saddle-point method, which is no longer necessary when we have two factors intersecting).
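The claim $N(t, \alpha t) \approx 2^{\alpha t}$ for $\alpha \in (0, 2/3)$ can be checked against exact coefficients. As a sketch, the code below again uses the partial binomial sum $\sum_{k=0}^{t-j} \binom{j}{k}$ for the coefficient of $x^t y^j$, an expansion derived here for illustration, with $j = \alpha t$.

```python
import math

def exact_coefficient(t, j):
    """Coefficient of x^t y^j in 1/((1-x)(1-y*x*(x+1)))."""
    return sum(math.comb(j, k) for k in range(0, t - j + 1))

# alpha = 3/5 lies in (0, 2/3): the nonsmooth point (1, 1/2) dominates.
ratio_three_fifths = exact_coefficient(500, 300) / 2 ** 300

# alpha = 1/2: the partial sum covers all binomial coefficients exactly.
ratio_one_half = exact_coefficient(600, 300) / 2 ** 300
```

In contrast to the saddle-point regime, the relative error here is exponentially small in $t$, since only a far tail of the binomial distribution is missing from the sum.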

E.1 A Geometric Visualization of Contributing Singularities

One of the most serendipitous aspects of our analysis is that, for $\alpha \in (0, \alpha^{\mathrm{up}}_G)$, Proposition 5.4 applies exactly in the situations where Proposition 5.5 doesn't, so we can always determine asymptotics. In this section we explain this dichotomy through a geometric interpretation of minimal critical points.

The amoeba $\operatorname{amoeba}(H) = \{ (\log|x|, \log|y|) : (x, y) \in \mathbb{C}_*^2, \ H(x, y) = 0 \}$ of a bivariate polynomial $H(x, y)$ is the image of its roots with non-zero coordinates under the Relog map $\operatorname{Relog}(a, b) = (\log|a|, \log|b|)$. The amoeba of a polynomial is a connected region of $\mathbb{R}^2$ whose complement $\operatorname{amoeba}(H)^c = \mathbb{R}^2 \setminus \operatorname{amoeba}(H)$ is the union of connected convex sets (see [Mel21, Sect. 3.3] for details on the amoeba properties discussed here). The connected components of $\operatorname{amoeba}(H)^c$ are in bijection with the different convergent Laurent expansions of $1/H$, and it follows directly from the definition that the amoeba of a product is the union of amoebas: $\operatorname{amoeba}(H_1 H_2) = \operatorname{amoeba}(H_1) \cup \operatorname{amoeba}(H_2)$. Figure 12 shows the amoeba of the example discussed above.

Figure 11: Plot of real points on the curves $1 - x = 0$ (solid) and $1 - yx(x+1) = 0$ (dashed) together with the points $(0.95, 0.45)$, $(1.05, 0.45)$, $(0.95, 0.6)$, and $(1.05, 0.6)$. When $\alpha = 1/2$ the value of $h_\alpha(x, y)$ at the final three points is less than $2^{\alpha} = \sqrt{2}$.

² Note that in general one must orient the gradients in the right manner (for instance, multiplying a factor by $-1$ should not change the approach). This is taken into consideration in Proposition 5.5.

Let $H(x, y) = (1 - x) \det(I - yP(x))$ be the denominator appearing in one of our generating functions. The power series expansion we consider corresponds to the component of $\operatorname{amoeba}(H)^c$ containing all points of the form $(-N, -N)$ for all sufficiently large $N > 0$. If we let $D$ denote this component then the minimal points of $\mathcal{V}$ are those which map to the boundary $\partial D$ under the Relog map. In fact, because our generating functions have non-negative power series coefficients, $\partial D$ corresponds to the image of the minimal points of $\mathcal{V}$ with positive coordinates under the Relog map (see [Mel21, Prop. 5.4]). On the positive real points of $\mathcal{V}$ the Relog map has inverse $(x, y) \mapsto (e^x, e^y)$, so the tangent line to $\partial D$ at a point $(\log x, \log y)$ has normal $\nabla_{\log} H = (x H_x(x, y), y H_y(x, y))$. In particular, a minimal point $(x, y)$ with $0 < x < 1$ and $y > 0$ is critical in the direction $(1, \alpha)$ if and only if $(1, \alpha)$ is the outward normal to the tangent plane of $\partial D$ at $(\log x, \log y)$.

The other minimal point of interest is the unique minimal point $(1, y)$ with $y > 0$. At $(0, \log y)$ the boundary $\partial D$ does not have an outward normal vector, but an outward normal cone. Writing $H_1(x, y) = \det(I - yP(x))$ and $H_2(x, y) = 1 - x$, the outward normal cone is the positive span of the vectors $-(\nabla_{\log} H_1)(1, y)$ and $-(\nabla_{\log} H_2)(1, y)$. Thus, $(1, y)$ determines asymptotics in the direction $(1, \alpha)$ if and only if $(1, \alpha)$ lies in the outward normal cone at $(0, \log y)$. See Figure 13 for an illustration.

This geometric interpretation shows why we can always apply Propositions 5.4 and 5.5 for appropriate values of $\alpha$. In particular, when $\alpha > 0$ is close to zero the direction vector $(1, \alpha)$ is almost horizontal, and thus lies in the outward normal cone to $\partial D$ at the point $(0, \log y)$ discussed above. This means we can apply Proposition 5.5 to determine asymptotics for $\alpha > 0$ sufficiently small. Once $(1, \alpha)$ has larger slope than the logarithmic gradient $(\nabla_{\log} H_1)(1, y)$, which occurs when $\alpha > \alpha^{\mathrm{lo}}_G$, it is no longer in this outward normal cone, but it is a scalar multiple of the outward normal to $\partial D$ at a smooth point. This means we can apply Proposition 5.4 to determine asymptotics for $\alpha^{\mathrm{lo}}_G < \alpha < \alpha^{\mathrm{up}}_G$. The quantity $\alpha^{\mathrm{up}}_G$ is the limit slope of outward normals to $\partial D$ at minimal points $(x, y) \in \partial D$ with $x \to -\infty$. Thus, for $\alpha > \alpha^{\mathrm{up}}_G$ there are no minimal critical points; in this situation every line with normal $(1, \alpha)$ intersects $D$, which implies that the coefficient sequence $N(n, \alpha n)$ is eventually zero (see [Mel21, Proposition 5.7]).

Figure 12: The amoeba of $H(x, y) = -x^2 y - xy + 1$ with the line $x = 0$ (corresponding to the amoeba of $1 - x$).

Figure 13: Left: The image of three minimal critical points under the Relog map, together with the directions in which they are critical (normals to the corresponding tangent planes of $D$). Right: The outward normal cone to $D$ at $(0, \log y)$ contains the directions in which $(1, y)$ determines asymptotics.
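For the running example $H_1(x, y) = 1 - yx(x+1)$, the slopes of the outward normals can be computed exactly. The sketch below evaluates the logarithmic gradient $(x H_x, y H_y)$ along the positive real curve $y = 1/(x(x+1))$ and returns the value $\alpha$ for which $(1, \alpha)$ is parallel to it. The function name is illustrative, and the example is the one from Example E.2 rather than a general graph.

```python
from fractions import Fraction

def normal_slope(x):
    """alpha with (1, alpha) parallel to the log-gradient of H1 at (x, 1/(x(x+1)))."""
    y = 1 / (x * (x + 1))
    log_grad_x = x * (-y * (2 * x + 1))   # x * dH1/dx
    log_grad_y = y * (-x * (x + 1))       # y * dH1/dy, equals -1 on the curve
    return log_grad_y / log_grad_x

slope_at_corner = normal_slope(Fraction(1))          # at the point (1, 1/2)
slope_midway = normal_slope(Fraction(1, 2))          # at the point (1/2, 4/3)
slope_near_zero = normal_slope(Fraction(1, 10 ** 6)) # x -> 0+, i.e. log x -> -infinity
```

The slope equals $2/3$ at the corner point $(1, 1/2)$ and tends to $1$ as $x \to 0^+$, which is consistent with the thresholds $\alpha \in (2/3, 1)$ appearing in Example E.2.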
