
Nonconvex Optimization for Communication Systems

Mung Chiang

Electrical Engineering Department, Princeton University, Princeton, NJ 08544, [email protected]

Summary. Convex optimization has provided both a powerful tool and an intriguing mentality to the analysis and design of communication systems over the last few years. A main challenge today lies in the nonconvex problems arising in these applications. This paper presents an overview of some of the important nonconvex optimization problems in point-to-point and networked communication systems. Three typical applications are covered: Internet congestion control through nonconcave network utility maximization, wireless network power control through geometric and sigmoidal programming, and DSL spectrum management through distributed nonconvex optimization. A variety of nonconvex optimization techniques are showcased: from standard dual relaxation to sum-of-squares programming through successive SDP relaxation, signomial programming through successive GP relaxation, and leveraging the specific structures in problems for efficient and distributed heuristics.

Key words: Nonconvex optimization, Geometric programming, Semidefinite programming, Sum of squares, Duality, Network utility maximization, TCP/IP, Wireless network, Power control.

1 Introduction

There have been two major “waves” in the history of optimization theory: the first started with linear programming (LP) and the simplex method in the late 1940s, and the second with convex optimization and the interior point method in the late 1980s. Each has been followed by a transforming period of appreciation-application cycle: as more people appreciate the use of LP/convex optimization, more look for their formulations in various applications; then more work on its theory, efficient algorithms, and software; the more powerful the tools become; then more people appreciate its usage. Communication systems benefit significantly from both waves, including multicommodity flow solutions (e.g., Bellman Ford algorithm) from LP, and basic network utility maximization and robust transceiver design from convex optimization.


Much of the current research frontier is about the potential of the third wave, on nonconvex optimization. If one word is used to differentiate between easy and hard problems, convexity is probably the “watershed”. But if a longer description length is allowed, many useful conclusions can be drawn even for nonconvex optimization. Indeed, convexity is a very disturbing watershed, since it is not a topological invariant under change of variable (e.g., see geometric programming) or higher-dimension embedding (e.g., see sum of squares method). A variety of approaches have been proposed, from nonlinear transformation to turn an apparently nonconvex problem into a convex problem, to characterization of attraction regions and systematically jumping out of a local optimum, from successive convex approximation to dualization, from leveraging the specific structures of the problems (e.g., Difference of Convex functions, concave minimization, low rank nonconvexity) to developing more efficient branch-and-bound procedures.

Researchers in communications and networking have been examining nonconvex optimization using domain-specific structures in important problems in the areas of wireless networking, Internet engineering, and communication theory. Perhaps four typical topics best illustrate the variety of challenging issues arising from nonconvex optimization in communication systems:

• Nonconvex objective to be minimized. An example is congestion control for inelastic applications.

• Nonconvex constraint set. An example is power control in low SIR regimes.

• Integer constraints. Two important examples are single path routing and multiuser detection.

• Constraint sets that are convex but require an exponential number of inequalities to explicitly describe. An example is optimal scheduling.

This chapter overviews the latest results in recent publications about the first two topics, with a particular focus on showing the connections between the engineering intuitions about important problems in communication systems and the state-of-the-art algorithms in nonconvex optimization theory.

2 Internet Congestion Control

2.1 Introduction

Basic network utility maximization

Since the publication of the seminal paper [24] by Kelly, Maulloo, and Tan in 1998, the framework of Network Utility Maximization (NUM) has found many applications in network rate allocation algorithms and Internet congestion control protocols (e.g., surveyed in [32, 47]). It has also led to a systematic understanding of the entire network protocol stack in a unifying framework (e.g., surveyed in [11, 31]). By allowing nonlinear concave utility objective functions, NUM substantially expands the scope of the classical LP-based Network Flow Problems.

Special Volume 3

Consider a communication network with $L$ links, each with a fixed capacity of $c_l$ bps, and $S$ sources (i.e., end users), each transmitting at a source rate of $x_s$ bps. Each source $s$ emits one flow, using a fixed set $L(s)$ of links in its path, and has a utility function $U_s(x_s)$. Each link $l$ is shared by a set $S(l)$ of sources. Network Utility Maximization (NUM), in its basic version, is the following problem of maximizing the total utility of the network $\sum_s U_s(x_s)$, over the source rates $x$, subject to linear flow constraints $\sum_{s: l \in L(s)} x_s \le c_l$ for all links $l$:

$$
\begin{array}{ll}
\text{maximize} & \sum_s U_s(x_s) \\
\text{subject to} & \sum_{s \in S(l)} x_s \le c_l, \quad \forall l, \\
& x \succeq 0
\end{array}
\tag{1}
$$

where the variables are $x \in \mathbf{R}^S$.

There are many nice properties of the basic NUM model due to several simplifying assumptions of the utility functions and flow constraints, which provide the mathematical tractability of problem (1) but also limit its applicability. In particular, the utility functions $\{U_s\}$ are often assumed to be increasing and strictly concave functions.

Assuming that $U_s(x_s)$ becomes concave for large enough $x_s$ is reasonable, because the law of diminishing marginal utility eventually will be effective. However, $U_s$ may not be concave throughout its domain. In his seminal paper published a decade ago, Shenker [45] differentiated inelastic network traffic from elastic traffic. Utility functions for elastic traffic were modeled as strictly concave functions. While inelastic flows with nonconcave utility functions represent important applications in practice, they have received little attention, and rate allocation among them has scarcely any mathematical foundation, except three recent publications [28, 12, 15] (see also earlier work in [54, 29, 30] related to the approach in [28]).

In this section, we investigate the extension of the basic NUM to maximization of nonconcave utilities, as in the approach of [15]. We provide a centralized algorithm for off-line analysis and establishment of a performance benchmark for nonconcave utility maximization when the utility function is a polynomial or signomial. Based on the semialgebraic approach to polynomial optimization, we employ convex sum-of-squares (SOS) relaxations solved by a sequence of semidefinite programs (SDP), to obtain increasingly tighter upper bounds on total achievable utility for polynomial utilities. Surprisingly, in all our experiments, a very low order and often a minimal order relaxation yields not just a bound on attainable network utility, but the globally maximized network utility. When the bound is exact, which can be proved using a sufficient test, we can also recover a globally optimal rate allocation.


Canonical distributed algorithm

A reason that the assumption of concave utility functions is upheld in almost all papers on NUM is that it leads to three highly desirable mathematical properties of the basic NUM:

• It is a convex optimization problem, therefore the global optimum can be computed (at least in centralized algorithms) in worst-case polynomial-time complexity [5].

• Strong duality holds for (1) and its Lagrange dual problem. Zero duality gap enables a dual approach to solve (1).

• Minimization of a separable objective function over linear constraints can be conducted by distributed algorithms based on the dual approach.

Indeed, the basic NUM (1) is such a “nice” optimization problem that its theoretical and computational properties have been well studied since the 1960s in the field of monotropic programming, e.g., as summarized in [41]. For network rate allocation problems, a dual-based distributed algorithm has been widely studied (e.g., in [24, 32]), and is summarized below.

Zero duality gap for (1) implies that solving the Lagrange dual problem is equivalent to solving the primal problem (1). The Lagrange dual problem is readily derived. We first form the Lagrangian of (1):

$$
L(x, \lambda) = \sum_s U_s(x_s) + \sum_l \lambda_l \Big( c_l - \sum_{s \in S(l)} x_s \Big)
$$

where $\lambda_l \ge 0$ is the Lagrange multiplier (link congestion price) associated with the linear flow constraint on link $l$. Additivity of total utility and linearity of flow constraints lead to a Lagrangian dual decomposition into individual source terms:

$$
\begin{aligned}
L(x, \lambda) &= \sum_s \Big[ U_s(x_s) - \Big( \sum_{l \in L(s)} \lambda_l \Big) x_s \Big] + \sum_l c_l \lambda_l \\
&= \sum_s L_s(x_s, \lambda^s) + \sum_l c_l \lambda_l
\end{aligned}
$$

where $\lambda^s = \sum_{l \in L(s)} \lambda_l$. For each source $s$, $L_s(x_s, \lambda^s) = U_s(x_s) - \lambda^s x_s$ only depends on the local $x_s$ and the link prices $\lambda_l$ on those links used by source $s$.

The Lagrange dual function $g(\lambda)$ is defined as the maximized $L(x, \lambda)$ over $x$. This “net utility” maximization obviously can be conducted distributively by each source, as long as the aggregate link price $\lambda^s = \sum_{l \in L(s)} \lambda_l$ is available to source $s$, where source $s$ maximizes a strictly concave function $L_s(x_s, \lambda^s)$ over $x_s$ for a given $\lambda^s$:

$$
x_s^*(\lambda^s) = \arg\max_{x_s} \left[ U_s(x_s) - \lambda^s x_s \right], \quad \forall s. \tag{2}
$$


The Lagrange dual problem is

$$
\begin{array}{ll}
\text{minimize} & g(\lambda) = L(x^*(\lambda), \lambda) \\
\text{subject to} & \lambda \succeq 0
\end{array}
\tag{3}
$$

where the optimization variable is $\lambda$. Any algorithm that finds a pair of primal-dual variables $(x, \lambda)$ satisfying the KKT optimality condition would solve (1) and its dual problem (3). One possibility is a distributed, iterative subgradient method, which updates the dual variables $\lambda$ to solve the dual problem (3):

$$
\lambda_l(t+1) = \Big[ \lambda_l(t) - \alpha(t) \Big( c_l - \sum_{s \in S(l)} x_s(\lambda^s(t)) \Big) \Big]^+, \quad \forall l \tag{4}
$$

where $t$ is the iteration number and $\alpha(t) > 0$ are step sizes. Certain choices of step sizes, such as $\alpha(t) = \alpha_0 / t$, $\alpha_0 > 0$, guarantee that the sequence of dual variables $\lambda(t)$ will converge to the dual optimal $\lambda^*$ as $t \to \infty$. The primal variable $x(\lambda(t))$ will also converge to the primal optimal variable $x^*$. For a primal problem that is a convex optimization, the convergence is towards the global optimum.

The sequence of the pair of algorithmic steps (2) and (4) forms a canonical distributed algorithm that globally solves the network utility optimization problem (1) and the dual (3) and computes the optimal rates $x^*$ and link prices $\lambda^*$.
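The two steps (2) and (4) can be sketched in a few lines of Python. This is a minimal illustration under assumptions that are not in the text: the utility is taken to be $U_s(x_s) = \log x_s$ (so the maximizer in (2) is simply $1/\lambda^s$), rates are capped at a hypothetical `x_max`, a constant step size replaces $\alpha_0/t$ to speed up the toy run, and the topology is the 2 link, 3 user network with $c = [1, 2]$ used later in Example 1. The function name and parameters are illustrative only.

```python
def dual_subgradient(routes, capacity, step=0.1, iters=2000, x_max=10.0):
    """Alternate the source step (2) and the link price step (4).

    Assumes U_s(x) = log(x), an illustrative choice, so that the
    maximizer of U_s(x) - price * x is x = 1 / price.
    """
    num_links = len(capacity)
    lam = [1.0] * num_links          # link prices, arbitrary positive start
    x = [0.0] * len(routes)          # source rates
    for _ in range(iters):
        # Source step (2): each source reacts to the sum of prices on its path.
        for s, links in enumerate(routes):
            price = sum(lam[l] for l in links)
            x[s] = min(1.0 / price, x_max) if price > 0 else x_max
        # Link step (4): projected subgradient update of each link price.
        for l in range(num_links):
            load = sum(x[s] for s, links in enumerate(routes) if l in links)
            lam[l] = max(0.0, lam[l] - step * (capacity[l] - load))
    return x, lam


# Source 0 uses both links, source 1 uses link 0, source 2 uses link 1.
routes = [[0, 1], [0], [1]]
capacity = [1.0, 2.0]
rates, prices = dual_subgradient(routes, capacity)
print("rates:", rates)
print("prices:", prices)
```

Each source only needs the aggregate price on its own path, and each link only needs its own load, which is the sense in which the algorithm is distributed.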

Nonconcave Network Utility Maximization

It is known that for many multimedia applications, user satisfaction may assume a non-concave shape as a function of the allocated rate. For example, the utility for streaming applications is better described by a sigmoidal function: with a convex part at low rate and a concave part at high rate, and a single inflection point $x_0$ (with $U_s''(x_0) = 0$) separating the two parts. The concavity assumption on $U_s$ is also related to the elasticity assumption on rate demands by users. When demands for $x_s$ are not perfectly elastic, $U_s(x_s)$ may not be concave.

Suppose we remove the critical assumption that $\{U_s\}$ are concave functions, and allow them to be any nonlinear functions. The resulting NUM becomes a nonconvex optimization problem that is significantly harder to analyze and solve, even by centralized computational methods. In particular, a local optimum may not be a global optimum and the duality gap can be strictly positive. The standard distributed algorithms that solve the dual problem may produce infeasible or suboptimal rate allocations.
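A small numerical experiment makes the positive duality gap concrete. The sketch below is an illustration under assumptions that are not from the text: two identical sources share a single link of capacity 6, each with the sigmoidal utility $1/(1+e^{-(x-5)})$, and both the primal optimum and the dual optimum are approximated by plain grid search (the grid ranges and step sizes are arbitrary choices).

```python
import math


def sigmoid_utility(x):
    """Sigmoidal utility with inflection point at x = 5 (illustrative choice)."""
    return 1.0 / (1.0 + math.exp(-(x - 5.0)))


def primal_optimum(capacity, step=0.01):
    """Grid search over x1 with x2 = capacity - x1 (utilities are increasing)."""
    n = int(round(capacity / step))
    return max(
        sigmoid_utility(i * step) + sigmoid_utility(capacity - i * step)
        for i in range(n + 1)
    )


def dual_optimum(capacity, x_max=20.0, x_step=0.01, lam_max=0.3, lam_step=0.001):
    """Grid approximation of min over prices of g(lam) for two identical sources."""
    xs = [i * x_step for i in range(int(round(x_max / x_step)) + 1)]
    us = [sigmoid_utility(x) for x in xs]
    best = float("inf")
    for k in range(int(round(lam_max / lam_step)) + 1):
        lam = k * lam_step
        net_utility = max(u - lam * x for u, x in zip(us, xs))  # step (2) per source
        best = min(best, 2.0 * net_utility + capacity * lam)    # g(lam)
    return best


primal = primal_optimum(6.0)
dual = dual_optimum(6.0)
print("primal optimum:", primal)
print("dual optimum:  ", dual)
print("duality gap:   ", dual - primal)
```

In this toy instance the dual value stays strictly above the best achievable total utility, so no link price makes the sources' individually optimal rates both feasible and optimal.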

Despite such difficulties, there have been two very recent publications on distributed algorithms for nonconcave utility maximization. In [28], a “self-regulation” heuristic is proposed to avoid the resulting oscillation in rate allocation and is shown to converge to an optimal rate allocation asymptotically when the proportion of nonconcave utility sources vanishes. In [12], a set of sufficient conditions and necessary conditions is presented under which the canonical distributed algorithm still converges to the globally optimal solution. However, these conditions may not hold in many cases. These two approaches illustrate the choice between admission control and capacity planning to deal with nonconvexity (see also the discussion in [23]). But neither approach provides a theoretically polynomial-time and practically efficient algorithm (distributed or centralized) for nonconcave utility maximization.

Fig. 1. Some examples of utility functions $U_s(x_s)$ (horizontal axis: $x$; vertical axis: $U(x)$): it can be concave or sigmoidal as shown in the graph, or any general nonconcave function. If the bottleneck link capacity used by the source is small enough, i.e., if the dotted vertical line is pushed to the left, a sigmoidal utility function effectively becomes a convex utility function.

In this section, we remove the concavity assumption on utility functions, thus turning NUM into a nonconvex optimization problem with a strictly positive duality gap. Such problems in general are NP hard, thus extremely unlikely to be polynomial-time solvable even by centralized computations. Using a family of convex semidefinite programming (SDP) relaxations based on the sum-of-squares (SOS) relaxation and the Positivstellensatz Theorem in real algebraic geometry, we apply a centralized computational method to bound the total network utility in polynomial time. A surprising result is that for all the examples we have tried, wherever we could verify the result, the tightest possible bound (i.e., the globally optimal solution) of NUM with nonconcave utilities is computed with a very low order relaxation. This efficient numerical method for off-line analysis also provides the benchmark for distributed heuristics.

These three different approaches: proposing distributed but suboptimal heuristics (for sigmoidal utilities) in [28], determining optimality conditions for the canonical distributed algorithm to converge globally (for all nonlinear utilities) in [12], and proposing an efficient but centralized method to compute the global optimum (for a wide class of utilities that can be transformed into polynomial utilities) in [15] and this section, are complementary in the study of distributed rate allocation by nonconcave NUM.

2.2 Global maximization of nonconcave network utility

Sum-of-squares method

We would like to bound the maximum network utility by $\gamma$ in polynomial time and search for a tight bound. Even without link capacity constraints, maximizing a polynomial is already an NP hard problem, but it can be relaxed into an SDP [46]. This is because testing if the bounding inequality $\gamma \ge p(x)$ holds, where $p(x)$ is a polynomial of degree $d$ in $n$ variables, is equivalent to testing the nonnegativity of $\gamma - p(x)$, which can be relaxed into testing if $\gamma - p(x)$ can be written as a sum of squares (SOS): $\gamma - p(x) = \sum_{i=1}^{r} q_i(x)^2$ for some polynomials $q_i$, where the degree of $q_i$ is less than or equal to $d/2$. This is referred to as the SOS relaxation. If a polynomial can be written as a sum of squares, it must be non-negative, but not vice versa. Conditions under which this relaxation is tight have been studied since Hilbert. Determining if a sum of squares decomposition exists can be formulated as an SDP feasibility problem, and is thus polynomial-time solvable.
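As a one-variable illustration of the SOS test (the polynomial $p(x) = 2x - x^2$ is chosen here for convenience and is not an example from the references), the bound $\gamma \ge p(x)$ can be certified by writing $\gamma - p(x)$ as a quadratic form in the monomial vector $[1, x]^T$ and requiring the Gram matrix to be positive semidefinite:

```latex
% Illustrative instance: p(x) = 2x - x^2, unconstrained.
\[
\gamma - p(x) \;=\; x^2 - 2x + \gamma
\;=\;
\begin{bmatrix} 1 \\ x \end{bmatrix}^{T}
\begin{bmatrix} \gamma & -1 \\ -1 & 1 \end{bmatrix}
\begin{bmatrix} 1 \\ x \end{bmatrix}
\;=\; (x - 1)^2 + (\gamma - 1).
\]
% The 2x2 Gram matrix is positive semidefinite if and only if
% \gamma \ge 0 and \det = \gamma - 1 \ge 0, i.e. \gamma \ge 1.
\[
\min \{\, \gamma : \gamma - p(x) \text{ is SOS} \,\} \;=\; 1 \;=\; p(1) \;=\; \max_x p(x).
\]
```

Here the semidefinite condition on the Gram matrix is the SDP feasibility problem mentioned above, and in this simple case the relaxation is tight.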

Constrained nonconcave NUM can be relaxed by a generalization of the Lagrange duality theory, which involves nonlinear combinations of the constraints instead of linear combinations in the standard duality theory. The key result is the Positivstellensatz, due to Stengle [48], in real algebraic geometry, which states that for a system of polynomial inequalities, either there exists a solution in $\mathbf{R}^n$ or there exists a polynomial which is a certificate that no solution exists. This infeasibility certificate was recently shown to be also computable by an SDP of sufficient size [38, 37], a process that is referred to as the sum-of-squares method and automated by the software SOSTOOLS [39] initiated by Parrilo in 2000. For a complete theory and many applications of SOS methods, see [38] and references therein.

Furthermore, the bound $\gamma$ itself can become an optimization variable in the SDP and can be directly minimized. A nested family of SDP relaxations, each indexed by the degree of the certificate polynomial, is guaranteed to produce the exact global maximum. Of course, given the problem is NP hard, it is not surprising that the worst-case degree of certificate (thus the number of SDP relaxations needed) is exponential in the number of variables. What is interesting is the observation that in applying SOSTOOLS to nonconcave utility maximization, a very low order, often the minimum order relaxation already produces the globally optimal solution.

Application of SOS method to nonconcave NUM

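Using sum-of-squares and the Positivstellensatz, we set up the following problem whose objective value converges to the optimal value of problem (1), where $\{U_s\}$ are now general polynomials, as the degree of the polynomials involved is increased.

$$
\begin{array}{ll}
\text{minimize} & \gamma \\
\text{subject to} & \gamma - \sum_s U_s(x_s) - \sum_l \lambda_l(x) \big( c_l - \sum_{s \in S(l)} x_s \big) \\
& \quad - \sum_{j,k} \lambda_{jk}(x) \big( c_j - \sum_{s \in S(j)} x_s \big) \big( c_k - \sum_{s \in S(k)} x_s \big) - \cdots \\
& \quad - \lambda_{12 \ldots n}(x) \big( c_1 - \sum_{s \in S(1)} x_s \big) \cdots \big( c_n - \sum_{s \in S(n)} x_s \big) \ \text{is SOS}, \\
& \lambda_l(x), \lambda_{jk}(x), \ldots, \lambda_{12 \ldots n}(x) \ \text{are SOS}.
\end{array}
\tag{5}
$$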

The optimization variables are $\gamma$ and all of the coefficients in polynomials $\lambda_l(x), \lambda_{jk}(x), \ldots, \lambda_{12 \ldots n}(x)$. Note that $x$ is not an optimization variable; the constraints hold for all $x$, therefore imposing constraints on the coefficients. This formulation uses Schmüdgen’s representation of positive polynomials over compact sets [44] (see footnote 1). Two alternative representations are discussed in [15].

Let $D$ be the degree of the expression in the first constraint in (5). We refer to problem (5) as the SOS relaxation of order $D$ for the constrained NUM. For a fixed $D$, the problem can be solved via SDP. As $D$ is increased, the expression includes more terms, the corresponding SDP becomes larger, and the relaxation gives tighter bounds. An important property of this nested family of relaxations is guaranteed convergence of the bound to the global maximum.

Regarding the choice of degree $D$ for each level of relaxation, clearly a polynomial of odd degree cannot be SOS, so we need to consider only the cases where the expression has even degree. Therefore, the degree of the first non-trivial relaxation is the smallest even number greater than or equal to the degree of $\sum_s U_s(x_s)$, and the degree is increased by 2 for the next level.

A key question now becomes: How do we find out, after solving an SOS relaxation, if the bound happens to be exact? Fortunately, there is a sufficient test that can reveal this, using the properties of the SDP and its dual solution. In [19, 26], a parallel set of relaxations, equivalent to the SOS ones, is developed in the dual framework. The dual of checking the nonnegativity of a polynomial over a semi-algebraic set turns out to be finding a sequence of moments that represent a probability measure with support in that set. To be a valid set of moments, the sequence should form a positive semidefinite moment matrix. Then, each level of relaxation fixes the size of this matrix, i.e., considers moments up to a certain order, and therefore solves an SDP. This is equivalent to fixing the order of the polynomials appearing in SOS relaxations. The sufficient rank test checks a rank condition on this moment matrix and recovers (one or several) optimal $x^*$, as discussed in [19].

1 Schmüdgen’s representation applies when $\gamma - \sum_s U_s(x_s)$ is strictly positive on the feasible set. Therefore the convergence is asymptotic in theory; in practice, however, finite convergence is observed most of the time. If we were to use Stengle’s Positivstellensatz, we would have finite convergence but could not have $\gamma$ as an optimization variable, and at each relaxation level we would have to use a bisection on $\gamma$. For computational convenience, we choose Schmüdgen’s form.

In summary, we have the following algorithm for centralized computation of a globally optimal rate allocation to nonconcave utility maximization, where the utility functions can be written as or converted into polynomials.

Algorithm 1. Sum-of-squares for nonconcave utility maximization.

1) Formulate the relaxed problem (5) for a given degree $D$.

2) Use SDP to solve the $D$th order relaxation, which can be conducted using SOSTOOLS [39].

3) If the resulting dual SDP solution satisfies the sufficient rank condition, the $D$th order optimizer $\gamma^*(D)$ is the globally optimal network utility, and a corresponding $x^*$ can be obtained (see footnote 2).

4) Increase $D$ to $D + 2$, i.e., the next higher order relaxation, and repeat.
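The control flow of Algorithm 1 can be sketched as a short loop. The sketch below is only a skeleton under stated assumptions: `solve_relaxation` is a hypothetical caller-supplied function standing in for steps 1 and 2 (for instance a wrapper around an SDP solver), assumed to return the bound, the outcome of the rank test, and the recovered rates; `placeholder_solver` returns invented numbers purely to exercise the loop and does not solve anything.

```python
def run_sos_hierarchy(solve_relaxation, first_degree, max_degree):
    """Climb the relaxation orders of Algorithm 1 until the rank test passes.

    solve_relaxation(degree) is a hypothetical callback that must return
    (gamma, rank_test_passed, rates) for the relaxation of that order.
    """
    best_bound = None
    last_degree = None
    for degree in range(first_degree, max_degree + 1, 2):          # step 4: D -> D + 2
        gamma, rank_test_passed, rates = solve_relaxation(degree)  # steps 1 and 2
        last_degree = degree
        best_bound = gamma if best_bound is None else min(best_bound, gamma)
        if rank_test_passed:                                       # step 3
            return {"degree": degree, "bound": gamma, "exact": True, "rates": rates}
    # No order passed the test: the best value is only a provable upper bound.
    return {"degree": last_degree, "bound": best_bound, "exact": False, "rates": None}


def placeholder_solver(degree):
    """Stand-in for an SDP solve; the values below are invented for illustration."""
    table = {2: (7.0, False, None), 4: (5.0, True, [0.0, 1.0, 2.0])}
    return table[degree]


result = run_sos_hierarchy(placeholder_solver, first_degree=2, max_degree=4)
print(result)
```

Capping the loop at `max_degree` is a practical choice made here, since the SDP size grows quickly with the order of the relaxation.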

In the following subsection, we give examples of the application of SOS relaxation to the nonconcave NUM. We also apply the above sufficient test to check if the bound is exact, and if so, we recover the optimum rate allocation $x^*$ that achieves this tightest bound.

2.3 Numerical Examples and Sigmoidal Utilities

Polynomial utility examples

First, consider quadratic utilities, i.e., $U_s(x_s) = x_s^2$, as a simple case to start with (this can be useful, for example, when the bottleneck link capacity limits sources to their convex region of a sigmoidal utility). We present examples that are typical, in our experience, of the performance of the relaxations.

Example 1. A small illustrative example. Consider the simple 2 link, 3 user network shown in Figure 2, with $c = [1, 2]$. The optimization problem is

$$
\begin{array}{ll}
\text{maximize} & \sum_s x_s^2 \\
\text{subject to} & x_1 + x_2 \le 1 \\
& x_1 + x_3 \le 2 \\
& x_1, x_2, x_3 \ge 0.
\end{array}
\tag{6}
$$

Fig. 2. Network topology for example 1 (users $x_1$, $x_2$, $x_3$; links $c_1$, $c_2$).

2 Otherwise, $\gamma^*(D)$ may still be the globally optimal network utility but is only provably an upper bound.

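The first level relaxation with $D = 2$ is

$$
\begin{array}{ll}
\text{minimize} & \gamma \\
\text{subject to} & \gamma - (x_1^2 + x_2^2 + x_3^2) - \lambda_1(-x_1 - x_2 + 1) - \lambda_2(-x_1 - x_3 + 2) \\
& \quad - \lambda_3 x_1 - \lambda_4 x_2 - \lambda_5 x_3 - \lambda_6(-x_1 - x_2 + 1)(-x_1 - x_3 + 2) \\
& \quad - \lambda_7 x_1(-x_1 - x_2 + 1) - \lambda_8 x_2(-x_1 - x_2 + 1) - \lambda_9 x_3(-x_1 - x_2 + 1) \\
& \quad - \lambda_{10} x_1(-x_1 - x_3 + 2) - \lambda_{11} x_2(-x_1 - x_3 + 2) - \lambda_{12} x_3(-x_1 - x_3 + 2) \\
& \quad - \lambda_{13} x_1 x_2 - \lambda_{14} x_1 x_3 - \lambda_{15} x_2 x_3 \ \text{is SOS}, \\
& \lambda_i \ge 0, \quad i = 1, \ldots, 15.
\end{array}
\tag{7}
$$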

The first constraint above can be written as $x^T Q x$ for $x = [1, x_1, x_2, x_3]^T$ and an appropriate $Q$. For example, the (1,1) entry, which is the constant term, reads $\gamma - \lambda_1 - 2\lambda_2 - 2\lambda_6$, the (2,1) entry, coefficient of $x_1$, reads $\lambda_1 + \lambda_2 - \lambda_3 + 3\lambda_6 - \lambda_7 - 2\lambda_{10}$, and so on. The expression is SOS if and only if $Q \succeq 0$. The optimal $\gamma$ is 5, which is achieved by, e.g., $\lambda_1 = 1$, $\lambda_2 = 2$, $\lambda_3 = 1$, $\lambda_8 = 1$, $\lambda_{10} = 1$, $\lambda_{12} = 1$, $\lambda_{13} = 1$, $\lambda_{14} = 2$ and the rest of the $\lambda_i$ equal to zero. Using the sufficient test (or, in this example, by inspection) we find the optimal rates $x^0 = [0, 1, 2]$.
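The certificate quoted for Example 1 can be checked numerically. Expanding the expression in (7) with $\gamma = 5$ and the listed multipliers shows that every coefficient cancels, so the expression is the zero polynomial (trivially SOS); this is an observation obtained by expanding the terms, not a statement from the text. The sketch below evaluates the expression at random points to confirm it; the function and variable names are illustrative.

```python
import random

# Nonzero multipliers quoted for Example 1 (indices follow problem (7)).
MULTIPLIERS = {1: 1.0, 2: 2.0, 3: 1.0, 8: 1.0, 10: 1.0, 12: 1.0, 13: 1.0, 14: 2.0}


def certificate_residual(gamma, lam, x1, x2, x3):
    """Evaluate the expression required to be SOS in (7) at one point."""
    g1 = 1.0 - x1 - x2           # capacity slack on link 1
    g2 = 2.0 - x1 - x3           # capacity slack on link 2
    products = [
        g1, g2, x1, x2, x3,              # lambda_1 .. lambda_5
        g1 * g2,                         # lambda_6
        x1 * g1, x2 * g1, x3 * g1,       # lambda_7 .. lambda_9
        x1 * g2, x2 * g2, x3 * g2,       # lambda_10 .. lambda_12
        x1 * x2, x1 * x3, x2 * x3,       # lambda_13 .. lambda_15
    ]
    weighted = sum(lam.get(i + 1, 0.0) * p for i, p in enumerate(products))
    return gamma - (x1 ** 2 + x2 ** 2 + x3 ** 2) - weighted


rng = random.Random(0)
worst = max(
    abs(certificate_residual(5.0, MULTIPLIERS, *(rng.uniform(-3, 3) for _ in range(3))))
    for _ in range(1000)
)
print("largest |residual| over random points:", worst)
print("utility at x = [0, 1, 2]:", 0 ** 2 + 1 ** 2 + 2 ** 2)
```

Since every multiplier is nonnegative and every product of constraints is nonnegative on the feasible set, the identity shows that the total utility cannot exceed 5 there, and the rate vector $[0, 1, 2]$ attains it.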

In this example, many of the $\lambda_i$ could be chosen to be zero. This means not all product terms appearing in (7) are needed in constructing the SOS polynomial. Such information is valuable from the decentralization point of view, and can help determine to what extent our bound can be calculated in a distributed manner. This is a challenging topic for future work.

Example 2. Larger tree topology. As a larger example, consider the network shown in Figure 2.3 with 7 links. There are 9 users, with the following routing table that lists the links on each user’s path.

User:   x1    x2      x3    x4    x5    x6      x7    x8   x9
Links:  1,2   1,2,4   2,3   4,5   2,4   6,5,7   5,6   7    5

For $c = [5, 10, 4, 3, 7, 3, 5]$, we obtain the bound $\gamma = 116$ with $D = 2$, which turns out to be globally optimal, and the globally optimal rate vector can be recovered: $x^0 = [5, 0, 4, 0, 1, 0, 0, 5, 7]$. In this example, exhaustive search is too computationally intensive, and the sufficient condition test plays an important role in proving the bound was exact and in recovering $x^0$.
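The reported rate vector for Example 2 can be checked directly against the routing table and capacities. The short sketch below (helper names are illustrative) computes the load on each link, confirms feasibility, and evaluates the quadratic utility; it verifies that the point is feasible and attains the bound, not that the bound itself is optimal.

```python
# Routing table of Example 2: user -> links on its path (1-indexed as in the text).
ROUTES = {
    1: [1, 2], 2: [1, 2, 4], 3: [2, 3], 4: [4, 5], 5: [2, 4],
    6: [6, 5, 7], 7: [5, 6], 8: [7], 9: [5],
}
CAPACITY = {1: 5, 2: 10, 3: 4, 4: 3, 5: 7, 6: 3, 7: 5}
RATES = {1: 5, 2: 0, 3: 4, 4: 0, 5: 1, 6: 0, 7: 0, 8: 5, 9: 7}


def link_loads(routes, rates, capacity):
    """Sum the rates of all users whose path crosses each link."""
    loads = {link: 0 for link in capacity}
    for user, links in routes.items():
        for link in links:
            loads[link] += rates[user]
    return loads


def is_feasible(routes, rates, capacity):
    """Check nonnegativity and every link capacity constraint."""
    loads = link_loads(routes, rates, capacity)
    return all(r >= 0 for r in rates.values()) and all(
        loads[link] <= capacity[link] for link in capacity
    )


def quadratic_utility(rates):
    """Total utility for U_s(x_s) = x_s ** 2."""
    return sum(r * r for r in rates.values())


print("link loads:", link_loads(ROUTES, RATES, CAPACITY))
print("feasible:", is_feasible(ROUTES, RATES, CAPACITY))
print("total utility:", quadratic_utility(RATES))
```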

Example 3. Large m-hop ring topology. Consider a ring network with $n$ nodes, $n$ users and $n$ links where each user’s flow starts from a node and goes clockwise through the next $m$ links, as shown in Figure 2.3 for $n = 6$, $m = 2$. As a large example, with $n = 25$, $m = 2$ and capacities chosen randomly from a uniform distribution on $[0, 10]$, using relaxation of order $D = 2$ we obtain the exact bound $\gamma = 321.11$ and recover an optimal rate allocation. For $n = 30$, $m = 2$, and capacities randomly chosen from $[0, 15]$, it turns out that the $D = 2$ relaxation yields the exact bound 816.95 and a globally optimal rate allocation.

[Figure: network topology with seven links labeled c1 to c7; caption not recovered.]

[Figure: network topology with six links labeled c1 to c6; caption not recovered.]

Sigmoidal utility examples

Now consider sigmoidal utilities in a standard form:

$$
U_s(x_s) = \frac{1}{1 + e^{-(a_s x_s + b_s)}},
$$

where $\{a_s, b_s\}$ are constant integers. Even though these sigmoidal functions are not polynomials, we show the problem can be cast as one with polynomial cost and constraints, with a change of variables.

Example 4. Sigmoidal utility. Consider the simple 2 link, 3 user example shown in Figure 2 for $a_s = 1$ and $b_s = -5$.

The NUM problem is to

$$
\begin{array}{ll}
\text{maximize} & \sum_s \dfrac{1}{1 + e^{-(x_s - 5)}} \\
\text{subject to} & x_1 + x_2 \le c_1 \\
& x_1 + x_3 \le c_2 \\
& x \ge 0.
\end{array}
\tag{8}
$$

Let $y_s = \frac{1}{1 + e^{-(x_s - 5)}}$, then $x_s = -\log\big(\frac{1}{y_s} - 1\big) + 5$. Substituting for $x_1, x_2$ in the first constraint, arranging terms and taking exponentials, then multiplying the sides by $y_1 y_2$ (note that $y_1, y_2 > 0$), we get

$$
(1 - y_1)(1 - y_2) \ge e^{(10 - c_1)} y_1 y_2,
$$

which is polynomial in the new variables $y$. This applies to all capacity constraints, and the non-negativity constraints for $x_s$ translate to $y_s \ge \frac{1}{1 + e^5}$. Therefore the whole problem can be written in polynomial form, and SOS methods apply. This transformation renders the problem polynomial for general sigmoidal utility functions, with any $a_s$ and $b_s$.

We present some numerical results, using a small illustrative example. Here SOS relaxations of order 4 ($D = 4$) were used. For $c_1 = 4$, $c_2 = 8$, we find $\gamma = 1.228$, which turns out to be a global optimum, with $x^0 = [0, 4, 8]$ as the optimal rate vector. For $c_1 = 9$, $c_2 = 10$, we find $\gamma = 1.982$ and $x^0 = [0, 9, 10]$. Placing a weight of 2 on $y_1$, while the other $y_s$ have weight one, we obtain $\gamma = 1.982$ and $x^0 = [9, 0, 1]$.
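The change of variables in Example 4 can be sanity-checked numerically. The sketch below (helper names are illustrative) compares the capacity constraint in the original rates with its polynomial form in $y$ at random points, and evaluates the total unweighted utility at the two rate vectors reported for $(c_1, c_2) = (4, 8)$ and $(9, 10)$. It checks consistency of the transformation and of the reported values; it does not solve the SOS relaxation.

```python
import math
import random


def to_y(x):
    """Sigmoidal utility with a_s = 1, b_s = -5, i.e. the new variable y_s."""
    return 1.0 / (1.0 + math.exp(-(x - 5.0)))


def capacity_ok_in_x(x1, x2, c):
    """Original linear capacity constraint."""
    return x1 + x2 <= c


def capacity_ok_in_y(y1, y2, c):
    """Polynomial form of the same constraint after the change of variables."""
    return (1.0 - y1) * (1.0 - y2) >= math.exp(10.0 - c) * y1 * y2


def total_utility(rates):
    return sum(to_y(x) for x in rates)


def count_disagreements(c, trials=10000, seed=0):
    """Count random points where the two forms of the constraint disagree."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        x1, x2 = rng.uniform(0.0, 10.0), rng.uniform(0.0, 10.0)
        if abs(x1 + x2 - c) < 1e-6:
            continue  # skip near-ties, where rounding alone decides the outcome
        if capacity_ok_in_x(x1, x2, c) != capacity_ok_in_y(to_y(x1), to_y(x2), c):
            bad += 1
    return bad


print("disagreements for c1 = 4:", count_disagreements(4.0))
print("utility at [0, 4, 8]:  ", round(total_utility([0.0, 4.0, 8.0]), 3))
print("utility at [0, 9, 10]: ", round(total_utility([0.0, 9.0, 10.0]), 3))
```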

In general, if $a_s \ne 1$ for some $s$, however, the degree of the polynomials in the transformed problem may be very high. If we write the general problem as

$$
\begin{array}{ll}
\text{maximize} & \sum_s \dfrac{1}{1 + e^{-(a_s x_s + b_s)}} \\
\text{subject to} & \sum_{s \in S(l)} x_s \le c_l, \quad \forall l, \\
& x \ge 0,
\end{array}
\tag{9}
$$

each capacity constraint after transformation will be

$$
\prod_s (1 - y_s)^{r_{ls} \prod_{k \ne s} a_k} \ge \exp\Big( -\prod_s a_s \Big( c_l + \sum_s \frac{r_{ls}}{a_s} b_s \Big) \Big) \prod_s y_s^{r_{ls} \prod_{k \ne s} a_k},
$$

where $r_{ls} = 1$ if $l \in L(s)$ and equals 0 otherwise. Since the product of the $a_s$ appears in the exponents, $a_s > 1$ significantly increases the degree of the polynomials appearing in the problem and hence the dimension of the SDP in the SOS method.

It is therefore also useful to consider alternative representations of sigmoidal functions such as the following rational function:

$$
U_s(x_s) = \frac{x_s^n}{a + x_s^n},
$$

where the inflection point is $x_0 = \big( \frac{a(n-1)}{n+1} \big)^{1/n}$ and the slope at the inflection point is $U_s'(x_0) = \frac{(n-1)(n+1)}{4n} \big( \frac{n+1}{a(n-1)} \big)^{1/n}$. Let $y_s = U_s(x_s)$; the NUM problem in this case is equivalent to

$$
\begin{array}{ll}
\text{maximize} & \sum_s y_s \\
\text{subject to} & x_s^n - y_s x_s^n - a y_s = 0 \\
& \sum_{s \in S(l)} x_s \le c_l, \quad \forall l \\
& x \ge 0
\end{array}
\tag{10}
$$

which again can be accommodated in the SOS method and be solved by Algorithm 1.

The benefit of this choice of utility function is that the largest degree of the polynomials in the problem is $n + 1$, therefore growing linearly with $n$. The disadvantage compared to the exponential form for sigmoidal functions is that the location of the inflection point and the slope at that point cannot be set independently.
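The properties of the rational sigmoid can be checked numerically. The sketch below, with illustrative names and the arbitrary parameter choice $a = 2$, $n = 3$, locates the inflection point by bisection on a finite-difference second derivative and compares it with the closed form $(a(n-1)/(n+1))^{1/n}$; it also confirms that $y_s = U_s(x_s)$ satisfies the polynomial equality constraint in (10).

```python
def rational_sigmoid(x, a, n):
    """U(x) = x^n / (a + x^n)."""
    return x ** n / (a + x ** n)


def numeric_inflection(a, n, lo=0.1, hi=5.0, h=1e-3, iters=60):
    """Bisection on the sign of a central second difference of U.

    Assumes U is convex at lo and concave at hi, which holds for the
    parameter values used below.
    """
    def second_diff(x):
        return (rational_sigmoid(x + h, a, n)
                - 2.0 * rational_sigmoid(x, a, n)
                + rational_sigmoid(x - h, a, n))

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if second_diff(mid) > 0.0:   # still in the convex part
            lo = mid
        else:                        # already in the concave part
            hi = mid
    return 0.5 * (lo + hi)


def closed_form_inflection(a, n):
    return (a * (n - 1) / (n + 1)) ** (1.0 / n)


def constraint_residual(x, a, n):
    """Left-hand side of x^n - y x^n - a y = 0 with y = U(x)."""
    y = rational_sigmoid(x, a, n)
    return x ** n - y * x ** n - a * y


a, n = 2.0, 3
print("numeric inflection:    ", numeric_inflection(a, n))
print("closed-form inflection:", closed_form_inflection(a, n))
print("constraint residual at x = 1.7:", constraint_residual(1.7, a, n))
```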

2.4 Alternative representations for convex relaxations to nonconcave NUM

The SOS relaxation we used in the last two sections is based on Schmüdgen’s representation for positive polynomials over compact sets described by other polynomials. We now briefly discuss two other representations of relevance to the NUM, that are interesting from both theoretical (e.g., interpretation) and computational points of view.

LP relaxation

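Exploiting linearity of the constraints in NUM and with the additional assumption of nonempty interior for the feasible set (which holds for NUM), we can use Handelman’s representation [18] and refine the Positivstellensatz condition to obtain the following convex relaxation of the nonconcave NUM problem:

$$
\begin{array}{ll}
\text{minimize} & \gamma \\
\text{subject to} & \gamma - \sum_s U_s(x_s) = \sum_{\alpha \in \mathbf{N}^L} \lambda_\alpha \prod_{l=1}^{L} \big( c_l - \sum_{s \in S(l)} x_s \big)^{\alpha_l}, \quad \forall x \\
& \lambda_\alpha \ge 0, \quad \forall \alpha,
\end{array}
\tag{11}
$$

where the optimization variables are $\gamma$ and $\lambda_\alpha$, and $\alpha$ denotes an ordered set of integers $\{\alpha_l\}$. Here $\gamma$ is minimized, as in (5), because it is an upper bound on the total utility.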

Fixing $D$ where $\sum_l \alpha_l \le D$, and equating the coefficients on the two sides of the equality in (11), yields a linear program (LP). (Note that there are no SOS terms, therefore no semidefiniteness conditions.) As before, increasing the degree $D$ gives higher order relaxations and a tighter bound.
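A one-variable instance shows how coefficient matching produces an LP. The instance below is chosen for illustration and is not from the references: bound $x^2$ over $0 \le x \le 1$, treating the non-negativity constraint $x \ge 0$ as one more linear constraint alongside the capacity constraint $1 - x \ge 0$ (an assumption made here so that the feasible set is compact), with products of total degree at most $D = 2$.

```latex
% Illustrative instance: U(x) = x^2 on 0 <= x <= 1, constraints g_1 = x, g_2 = 1 - x.
\[
\gamma - x^2 \;=\; \lambda_{00} + \lambda_{10}\,x + \lambda_{01}(1-x)
  + \lambda_{20}\,x^2 + \lambda_{11}\,x(1-x) + \lambda_{02}(1-x)^2,
\qquad \lambda_{\alpha} \ge 0 .
\]
% Equating coefficients of x^2, x, and 1 gives three linear equalities:
\[
-1 = \lambda_{20} - \lambda_{11} + \lambda_{02}, \qquad
 0 = \lambda_{10} - \lambda_{01} + \lambda_{11} - 2\lambda_{02}, \qquad
 \gamma = \lambda_{00} + \lambda_{01} + \lambda_{02}.
\]
% Eliminating \lambda_{11} and \lambda_{01}:
\[
\gamma \;=\; 1 + \lambda_{00} + \lambda_{10} + \lambda_{20} \;\ge\; 1,
\]
% with equality, e.g., for \lambda_{01} = \lambda_{11} = 1 and all other multipliers zero:
\[
1 - x^2 \;=\; (1 - x) + x(1 - x).
\]
```

The smallest $\gamma$ for which the identity holds is therefore 1, which equals the maximum of $x^2$ on $[0, 1]$, so the order 2 LP relaxation is exact for this instance.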

We provide a pricing interpretation for problem (11). First, normalize each capacity constraint as $1 - u_l(x) \ge 0$, where $u_l(x) = \sum_{s \in S(l)} x_s / c_l$. We can interpret $u_l(x)$ as link usage, or the probability that link $l$ is used at any given point in time. Then, in (11), we have terms linear in $u$ such as $\lambda_l (1 - u_l(x))$, in which $\lambda_l$ has a similar interpretation as in concave NUM, as the price of using link $l$. We also have product terms such as $\lambda_{jk} (1 - u_j(x))(1 - u_k(x))$, where $\lambda_{jk} u_j(x) u_k(x)$ indicates the probability of simultaneous usage of links $j$ and $k$, for links whose usage probabilities are independent (e.g., they do not share any flows). Products of more terms can be interpreted similarly.

While the above price interpretation is not complete and does not justify all the terms appearing in (11) (e.g., powers of the constraints; product terms for links with shared flows), it does provide some useful intuition: this relaxation results in a pricing scheme that provides better incentives for the users to observe the constraints, by putting additional reward (since the corresponding term adds positively to the utility) for simultaneously keeping two links free. Such incentive helps tighten the upper bound and eventually achieve a feasible (and optimal) allocation.

This relaxation is computationally attractive since we need to solve an LP instead of the previous SDP at each level. However, significantly more levels may be required [27].

Relaxation with no product terms

Putinar [40] showed that a polynomial positive over a compact set (see footnote 3) can be represented as an SOS-combination of the constraints. This yields the following convex relaxation for the nonconcave NUM problem:

$$
\begin{array}{ll}
\text{minimize} & \gamma \\
\text{subject to} & \gamma - \sum_s U_s(x_s) = \sum_{l=1}^{L} \lambda_l(x) \big( c_l - \sum_{s \in S(l)} x_s \big), \quad \forall x \\
& \lambda_l(x) \ \text{is SOS}, \quad \forall l,
\end{array}
\tag{12}
$$

where the optimization variables are $\gamma$ and the coefficients in $\lambda_l(x)$. Similar to the SOS relaxation (5), fixing the order $D$ of the expression in (12) results in an SDP. This relaxation has the nice property that no product terms appear: the relaxation becomes exact with a high enough $D$ without the need of product terms. However, this degree might be much higher than what the previous SOS method requires.

2.5 Concluding Remarks and Future Directions

We consider the NUM problem in the presence of inelastic flows, i.e., flows with nonconcave utilities. Despite its practical importance, this problem has not been studied widely, mainly due to the fact it is a nonconvex problem. There has been no effective mechanism, centralized or distributed, to compute the globally optimal rate allocation for nonconcave utility maximization problems in networks. This limitation has made performance assessment and design of networks that include inelastic flows very difficult.

3 With an extra assumption that always holds for linear constraints, as in NUM problems.

To address this problem, we employed convex SOS relaxations, solved by a sequence of SDPs, to obtain high quality, increasingly tighter upper bounds on total achievable utility. In practice, the performance of our SOSTOOLS-based algorithm was surprisingly good, and bounds obtained using a polynomial-time (and indeed a low-order and often minimal order) relaxation were found to be exact, achieving the global optimum of nonconcave NUM problems. Furthermore, a dual-based sufficient test, if successful, detects the exactness of the bound, in which case the optimal rate allocation can also be recovered. This performance of the proposed algorithm brings up a fundamental question on whether there is any particular property or structure in nonconcave NUM that makes it especially suitable for SOS relaxations.

We further examined the use of two more specialized polynomial representations: one that uses products of constraints with constant multipliers, resulting in LP relaxations; and, at the other end of the spectrum, one that uses a 'linear' combination of constraints with SOS multipliers. We expect these relaxations to give higher-order certificates, so their potential computational benefits need to be examined further. We also showed that they admit economic interpretations (e.g., prices, incentives) that provide some insight into how the SOS relaxations work in the framework of link congestion pricing for the simultaneous usage of multiple links.

An important research issue to be further investigated is decentralization methods for rate allocation among sources with nonconcave utilities. The algorithm proposed here is not easy to decentralize, given the products of the constraints or polynomial multipliers that destroy the separable structure of the problem. However, when relaxations become exact, the sparsity pattern of the coefficients can provide information about partially decentralized computation of optimal rates. For example, if we obtain an exact bound after solving the NUM off-line, and the coefficient of the cross-term $x_i x_j$ turns out to be zero, then users $i$ and $j$ do not need to communicate with each other to find their optimal rates. An interesting next step in this area of research is to investigate a distributed version of the proposed algorithm through limited message passing among clusters of network nodes and links.

3 Wireless Network Power Control

3.1 Introduction

Due to the broadcast nature of radio transmission, data rates and other Quality of Service (QoS) metrics in a wireless network are affected by interference. This is particularly important in CDMA systems, where users transmit at the same time over the same frequency bands and their spreading codes are not perfectly orthogonal. Transmit power control is often used to tackle this problem of signal interference. We study how to optimize over the transmit powers to create the optimal set of Signal-to-Interference Ratios (SIR) on wireless links. Optimality here can be with respect to a variety of objectives, such as maximizing a system-wide efficiency metric (e.g., the total system throughput), maximizing a QoS metric for a user in the highest QoS class, or maximizing a QoS metric for the user with the minimum QoS metric value (i.e., a maxmin optimization).

While the objective represents a system-wide goal to be optimized, individual users' QoS requirements also need to be satisfied. Any power allocation must therefore be constrained by a feasible set formed by these minimum requirements from the users. Such a constrained optimization captures the tradeoff between user-centric constraints and some network-centric objective. Because a higher power level from one transmitter increases the interference levels at other receivers, there may not be any feasible power allocation that satisfies the requirements of all the users. Sometimes an existing set of requirements can be satisfied, but when a new user is admitted into the system, either there is no longer any feasible power control solution, or the maximized objective is reduced due to the tightening of the constraint set, leading to the need for admission control and admission pricing, respectively.

Because many QoS metrics are nonlinear functions of SIR, which is in turn a nonlinear (and neither convex nor concave) function of transmit powers, power control optimization or feasibility problems are in general difficult nonlinear optimization problems that may appear to be NP-hard. This section shows that, when SIR is much larger than 0 dB, a class of nonlinear optimization called Geometric Programming (GP) can be used to efficiently compute the globally optimal power control in many of these problems, and to efficiently determine the feasibility of user requirements by returning either a feasible (and indeed optimal) set of powers or a certificate of infeasibility. This also leads to an effective admission control and admission pricing method.

The key observation is that, despite the apparent nonconvexity, through a log change of variable the GP technique turns these constrained power control optimization problems into convex optimization, which is intrinsically tractable despite its nonlinearity in objective and constraints. However, when SIR is comparable to or below 0 dB, the power control problems are truly nonconvex with no efficient and global solution methods. In this case, we present a heuristic that is provably convergent and empirically almost always computes the globally optimal power allocation by solving a sequence of GPs through the approach of successive convex approximations.

The GP approach reveals the hidden convexity structure, which implies efficient solution methods and the global optimality of any local optimum in power control problems with nonlinear objective functions. It clearly differentiates the tractable formulations in the high-SIR regime from the intractable ones in the low-SIR regime. Power control by GP is applicable to formulations in both cellular networks with single-hop transmission between mobile users and base stations, and ad hoc networks with multihop transmission among the nodes, as illustrated through several numerical examples in this section. Traditionally, GP is solved by centralized computation through highly efficient interior point methods. In this section we present a new result on how GP can be solved distributively with message passing, which has independent value for general maximization of coupled objectives, and apply it to power control problems with a further reduction of message passing overhead by leveraging the specific structures of power control problems.

3.2 Geometric Programming

GP is a class of nonlinear, nonconvex optimization problems with many useful theoretical and computational properties. It was invented in 1967 by Duffin, Peterson, and Zener [14], and many of the developments up to the early 1980s were summarized in [1]. Since a GP can be turned into a convex optimization problem, a local optimum is also a global optimum, the Lagrange duality gap is zero under mild conditions, and a global optimum can be computed very efficiently. Numerical efficiency holds both in theory and in practice: interior point methods applied to GP have provably polynomial time complexity [35], and are very fast in practice with high-quality software downloadable from the Internet (e.g., the MOSEK package). Convexity and duality properties of GP are well understood, and large-scale, robust numerical solvers for GP are available. Furthermore, special structures in GP and its Lagrange dual problem lead to distributed algorithms, physical interpretations, and computational acceleration beyond the generic results for convex optimization. A detailed tutorial on GP and a comprehensive survey of its recent applications to communication systems can be found in [10]. This subsection contains a brief introduction to GP terminology.

There are two equivalent forms of GP: standard form and convex form. The first is a constrained optimization of a type of function called a posynomial, and the second form is obtained from the first through a logarithmic change of variable.

We first define a monomial as a function $f : \mathbf{R}^n_{++} \to \mathbf{R}$:

\[
f(x) = d\, x_1^{a^{(1)}} x_2^{a^{(2)}} \cdots x_n^{a^{(n)}},
\]

where the multiplicative constant $d \ge 0$ and the exponential constants $a^{(j)} \in \mathbf{R}$, $j = 1, 2, \ldots, n$. A sum of monomials, indexed by $k$ below, is called a posynomial:

\[
f(x) = \sum_{k=1}^{K} d_k\, x_1^{a_k^{(1)}} x_2^{a_k^{(2)}} \cdots x_n^{a_k^{(n)}},
\]

where $d_k \ge 0$, $k = 1, 2, \ldots, K$, and $a_k^{(j)} \in \mathbf{R}$, $j = 1, 2, \ldots, n$, $k = 1, 2, \ldots, K$.

For example, $2 x_1^{-\pi} x_2^{0.5} + 3 x_1 x_3^{100}$ is a posynomial in $x$, $x_1 - x_2$ is not a posynomial, and $x_1/x_2$ is a monomial, thus also a posynomial.
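The definitions above translate directly into a few lines of code. The following is a minimal sketch, assuming a representation of our own choosing (a posynomial as a list of `(d_k, a_k)` pairs); the function names are illustrative and do not come from any GP package.

```python
import math

def monomial(d, a, x):
    """Evaluate d * prod_j x_j ** a_j for d >= 0, real exponents a_j, and x > 0."""
    if d < 0 or any(xj <= 0 for xj in x):
        raise ValueError("a monomial needs d >= 0 and strictly positive variables")
    return d * math.prod(xj ** aj for xj, aj in zip(x, a))

def posynomial(terms, x):
    """Evaluate a sum of monomials; terms is a list of (d_k, a_k) pairs."""
    return sum(monomial(d, a, x) for d, a in terms)

# The posynomial 2 x1^(-pi) x2^(0.5) + 3 x1 x3^(100) used as an example in the text.
example_terms = [(2.0, (-math.pi, 0.5, 0.0)), (3.0, (1.0, 0.0, 100.0))]
value = posynomial(example_terms, (1.0, 4.0, 1.0))

# x1 / x2 is a monomial with d = 1 and exponents (1, -1).
ratio = monomial(1.0, (1.0, -1.0), (3.0, 2.0))
```

Note that $x_1 - x_2$ cannot be represented this way, because the second term would need the negative coefficient $d = -1$.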


Minimizing a posynomial subject to posynomial upper bound inequality constraints and monomial equality constraints is called GP in standard form:

\[
\begin{array}{ll}
\text{minimize} & f_0(x) \\
\text{subject to} & f_i(x) \le 1, \quad i = 1, 2, \ldots, m, \\
& h_l(x) = 1, \quad l = 1, 2, \ldots, M,
\end{array}
\tag{13}
\]

where $f_i$, $i = 0, 1, \ldots, m$, are posynomials: $f_i(x) = \sum_{k=1}^{K_i} d_{ik}\, x_1^{a_{ik}^{(1)}} x_2^{a_{ik}^{(2)}} \cdots x_n^{a_{ik}^{(n)}}$, and $h_l$, $l = 1, 2, \ldots, M$, are monomials: $h_l(x) = d_l\, x_1^{a_l^{(1)}} x_2^{a_l^{(2)}} \cdots x_n^{a_l^{(n)}}$.

GP in standard form is not a convex optimization problem, because posynomials are not convex functions. However, with a logarithmic change of the variables and multiplicative constants, $y_i = \log x_i$, $b_{ik} = \log d_{ik}$, $b_l = \log d_l$, and a logarithmic change of the functions' values, we can turn it into the following equivalent problem in $y$:

\[
\begin{array}{ll}
\text{minimize} & p_0(y) = \log \sum_{k=1}^{K_0} \exp(a_{0k}^T y + b_{0k}) \\
\text{subject to} & p_i(y) = \log \sum_{k=1}^{K_i} \exp(a_{ik}^T y + b_{ik}) \le 0, \quad i = 1, 2, \ldots, m, \\
& q_l(y) = a_l^T y + b_l = 0, \quad l = 1, 2, \ldots, M.
\end{array}
\tag{14}
\]

This is referred to as GP in convex form, which is a convex optimization problem since it can be verified that the log-sum-exp function is convex [5].
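The change of variables can be checked numerically. The sketch below, under the assumption of a small hypothetical two-variable posynomial (its coefficients are not from the text), evaluates the log-sum-exp form in (14), confirms that it equals the logarithm of the posynomial at $x = e^y$, and computes a midpoint convexity gap for one pair of points. A single nonnegative gap is only a spot check, not a proof of convexity.

```python
import math

def posynomial(terms, x):
    """Evaluate sum_k d_k * prod_j x_j ** a_kj; terms is a list of (d_k, a_k)."""
    return sum(d * math.prod(xj ** aj for xj, aj in zip(x, a)) for d, a in terms)

def log_sum_exp_form(terms, y):
    """Evaluate p(y) = log sum_k exp(a_k^T y + b_k) with b_k = log d_k."""
    exponents = [sum(aj * yj for aj, yj in zip(a, y)) + math.log(d) for d, a in terms]
    shift = max(exponents)  # subtract the maximum for numerical stability
    return shift + math.log(sum(math.exp(e - shift) for e in exponents))

# Hypothetical posynomial 2 x1^(-1) x2^(0.5) + 3 x1 x2^2 (illustrative values only).
terms = [(2.0, (-1.0, 0.5)), (3.0, (1.0, 2.0))]

x = (0.7, 1.9)
y = tuple(math.log(v) for v in x)
mismatch = abs(math.log(posynomial(terms, x)) - log_sum_exp_form(terms, y))

# Midpoint convexity: p((y1 + y2) / 2) <= (p(y1) + p(y2)) / 2.
y1, y2 = (-1.0, 0.3), (0.8, -0.5)
midpoint = tuple(0.5 * (u + v) for u, v in zip(y1, y2))
convexity_gap = (0.5 * (log_sum_exp_form(terms, y1) + log_sum_exp_form(terms, y2))
                 - log_sum_exp_form(terms, midpoint))
```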

In summary, GP is a nonlinear, nonconvex optimization problem that can be transformed into a nonlinear, convex problem. GP in standard form can be used to formulate network resource allocation problems with nonlinear objectives under nonlinear QoS constraints. The basic idea is that resources are often allocated proportional to some parameters, and when resource allocations are optimized over these parameters, we are maximizing an inverted posynomial subject to lower bounds on other inverted posynomials, which is equivalent to GP in standard form.

SP/GP, SOS/SDP

Note that, although a posynomial seems to be a non-convex function, it becomes a convex function after the log transformation, as shown in an example in Figure 5. Compared to the (constrained or unconstrained) minimization of a polynomial, the minimization of a posynomial in GP relaxes the integer constraint on the exponential constants but imposes a positivity constraint on the multiplicative constants and variables. There is a sharp contrast between these two problems: polynomial minimization is NP-hard, but GP can be turned into convex optimization with provably polynomial-time algorithms for a global optimum.

In an extension of GP called Signomial Programming, to be discussed later in this section, the restriction of non-negative multiplicative constants is removed. This results in a general class of nonlinear and truly non-convex problems that is simultaneously a generalization of GP and of polynomial minimization over the positive quadrant, as summarized in the comparison in Table 1.

[Figure 5: two surface plots. Left panel: function value over axes X and Y. Right panel: function value over axes A and B.]

Fig. 5. A bi-variate posynomial before (left graph) and after (right graph) the log transformation. A non-convex function is turned into a convex one.

            GP                   PMoP                 SP
$c$         $\mathbf{R}_{+}$     $\mathbf{R}$         $\mathbf{R}$
$a^{(j)}$   $\mathbf{R}$         $\mathbf{Z}_{+}$     $\mathbf{R}$
$x_j$       $\mathbf{R}_{++}$    $\mathbf{R}_{++}$    $\mathbf{R}_{++}$

Table 1. Comparison of GP, constrained polynomial minimization over the positive quadrant (PMoP), and Signomial Programming (SP). All three types of problems minimize a sum of monomials subject to upper bound inequality constraints on sums of monomials, but have different definitions of monomial, $c \prod_j x_j^{a^{(j)}}$, as shown in the table. GP is known to be polynomial-time solvable, but PMoP and SP are not.

The objective function of Signomial Programming can be formulated as minimizing a ratio between two posynomials, which is not a posynomial (since posynomials are closed under positive multiplication and addition but not division). As shown in Figure 6, a ratio between two posynomials is a non-convex function both before and after the log transformation. Although it does not seem likely that Signomial Programming can be turned into a convex optimization problem, there are heuristics to solve it through a sequence of GP relaxations. However, due to the absence of the algebraic structures found in polynomials, such methods for Signomial Programming currently lack a theoretical foundation of convergence to global optimality. This is in contrast to the sum-of-squares method [38], which uses a nested family of SDP relaxations to solve constrained polynomial minimization problems, as explained in the last section.

[Figure 6: two surface plots. Left panel: function value over axes X and Y. Right panel: function value over axes A and B.]

Fig. 6. Ratio between two bi-variate posynomials before (left graph) and after (right graph) the log transformation. It is a non-convex function in both cases.

3.3 Power Control by Geometric Programming: Convex Case

Various schemes for power control, centralized or distributed, have been extensively studied since the 1990s based on different transmission models and application needs, e.g., in [2, 16, 34, 43, 50, 55]. This subsection summarizes the new approach of formulating power control problems through GP. The key advantage is that globally optimal power allocations can be efficiently computed for a variety of nonlinear system-wide objectives and user QoS constraints, even when these nonlinear problems appear to be nonconvex optimization problems.

Basic model

Consider a wireless (cellular or multihop) network with $n$ logical transmitter/receiver pairs. Transmit powers are denoted as $P_1, \ldots, P_n$. In the cellular uplink case, all logical receivers may reside in the same physical receiver, i.e., the base station. In the multihop case, since the transmission environment can be different on the links comprising an end-to-end path, power control schemes must consider each link along a flow's path.


Under Rayleigh fading, the power received from transmitter $j$ at receiver $i$ is given by $G_{ij}F_{ij}P_j$, where $G_{ij} \ge 0$ represents the path gain (it may also encompass antenna gain and coding gain) that is often modeled as proportional to $d_{ij}^{-\gamma}$, where $d_{ij}$ denotes distance and $\gamma$ is the power fall-off factor, and the $F_{ij}$ model Rayleigh fading and are independent and exponentially distributed with unit mean. The distribution of the received power from transmitter $j$ at receiver $i$ is then exponential with mean value $E[G_{ij}F_{ij}P_j] = G_{ij}P_j$. The SIR for the receiver on logical link $i$ is:

\[
\mathrm{SIR}_i = \frac{P_i G_{ii} F_{ii}}{\sum_{j \ne i}^{N} P_j G_{ij} F_{ij} + n_i}
\tag{15}
\]

where $n_i$ is the noise power for receiver $i$.

The constellation size $M$ used by a link can be closely approximated for MQAM modulations as follows: $M = 1 + \frac{-\phi_1}{\ln(\phi_2 \mathrm{BER})}\,\mathrm{SIR}$, where BER is the bit error rate and $\phi_1, \phi_2$ are constants that depend on the modulation type. Defining $K = \frac{-\phi_1}{\ln(\phi_2 \mathrm{BER})}$ leads to an expression of the data rate $R_i$ on the $i$th link as a function of the SIR: $R_i = \frac{1}{T} \log_2(1 + K\,\mathrm{SIR}_i)$, which can be approximated as

\[
R_i = \frac{1}{T} \log_2(K\,\mathrm{SIR}_i)
\tag{16}
\]

when $K\,\mathrm{SIR}$ is much larger than 1. This approximation is reasonable either when the signal level is much higher than the interference level or, in CDMA systems, when the spreading gain is large. For notational simplicity in the rest of this section, we redefine $G_{ii}$ as $K$ times the original $G_{ii}$, thus absorbing the constant $K$ into the definition of SIR.

The aggregate data rate for the system can then be written as

\[
R_{system} = \sum_i R_i = \frac{1}{T} \log_2 \Bigl[ \prod_i \mathrm{SIR}_i \Bigr].
\]

So in the high SIR regime, aggregate data rate maximization is equivalent to maximizing a product of SIRs. The system throughput is the aggregate data rate supportable by the system given a set of users with specified QoS requirements.
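To make the high-SIR approximation concrete, the following sketch computes per-link SIRs and rates for a hypothetical three-link instance. Several things here are assumptions rather than content of the text: the gain matrix, powers, noise levels, and $T$ are illustrative values; the fading terms $F_{ij}$ are replaced by their unit mean; and the constant $K$ is taken as already absorbed into $G_{ii}$.

```python
import math

def sir(G, P, noise, i):
    """SIR on link i with fading replaced by its mean:
    G[i][i]*P[i] / (sum_{j != i} G[i][j]*P[j] + noise[i])."""
    interference = sum(G[i][j] * P[j] for j in range(len(P)) if j != i)
    return G[i][i] * P[i] / (interference + noise[i])

def rates(G, P, noise, T, high_sir=False):
    """Per-link rates (1/T) log2(1 + SIR_i), or the high-SIR form (1/T) log2(SIR_i)."""
    result = []
    for i in range(len(P)):
        s = sir(G, P, noise, i)
        result.append(math.log2(s if high_sir else 1.0 + s) / T)
    return result

# Hypothetical instance (illustrative values only).
G = [[1.5, 0.01, 0.02],
     [0.02, 1.5, 0.01],
     [0.01, 0.02, 1.5]]
P = [1.0, 1.0, 1.0]
noise = [0.01, 0.01, 0.01]
T = 1e-6

exact_total = sum(rates(G, P, noise, T))
approx_total = sum(rates(G, P, noise, T, high_sir=True))

# The high-SIR aggregate rate equals (1/T) log2 of the product of the SIRs.
product_form = math.log2(math.prod(sir(G, P, noise, i) for i in range(len(P)))) / T
relative_error = (exact_total - approx_total) / exact_total
```

In this instance every SIR is 37.5 (about 15.7 dB), and the approximation underestimates the aggregate rate by less than one percent; the gap grows as the SIRs approach 0 dB.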

Outage probability is another important QoS parameter for reliable communication in wireless networks. A channel outage is declared and packets are lost when the received SIR falls below a given threshold $\mathrm{SIR}_{th}$, often computed from the BER requirement. Most systems are interference dominated and the thermal noise is relatively small, thus the $i$th link outage probability is

\[
\begin{aligned}
P_{o,i} &= \mathrm{Prob}\{\mathrm{SIR}_i \le \mathrm{SIR}_{th}\} \\
&= \mathrm{Prob}\Bigl\{G_{ii}F_{ii}P_i \le \mathrm{SIR}_{th} \sum_{j \ne i} G_{ij}F_{ij}P_j\Bigr\}.
\end{aligned}
\]


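The outage probability can be expressed as $P_{o,i} = 1 - \prod_{j \ne i} \frac{1}{1 + \frac{\mathrm{SIR}_{th} G_{ij} P_j}{G_{ii} P_i}}$ [25], which means that the upper bound $P_{o,i} \le P_{o,i,\max}$ can be written as an upper bound on a posynomial in $\mathbf{P}$:

\[
\prod_{j \ne i} \Bigl( 1 + \frac{\mathrm{SIR}_{th} G_{ij} P_j}{G_{ii} P_i} \Bigr) \le \frac{1}{1 - P_{o,i,\max}}.
\tag{17}
\]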
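The closed-form outage expression can be sanity-checked by simulation. The sketch below is a hedged illustration on a hypothetical three-link instance (gains, powers, and threshold are our own values): it evaluates the closed form and compares it with a Monte Carlo estimate that draws independent unit-mean exponential fading terms and ignores noise, matching the interference-dominated assumption stated above.

```python
import random

def outage_probability(G, P, i, sir_th):
    """Closed form: 1 - prod_{j != i} 1 / (1 + sir_th * G[i][j]*P[j] / (G[i][i]*P[i]))."""
    no_outage = 1.0
    for j in range(len(P)):
        if j != i:
            no_outage /= 1.0 + sir_th * G[i][j] * P[j] / (G[i][i] * P[i])
    return 1.0 - no_outage

def simulate_outage(G, P, i, sir_th, trials=100_000, seed=0):
    """Monte Carlo estimate with unit-mean exponential fading and no noise."""
    rng = random.Random(seed)
    outages = 0
    for _ in range(trials):
        signal = G[i][i] * rng.expovariate(1.0) * P[i]
        interference = sum(G[i][j] * rng.expovariate(1.0) * P[j]
                           for j in range(len(P)) if j != i)
        if signal <= sir_th * interference:
            outages += 1
    return outages / trials

# Hypothetical instance (illustrative values only).
G = [[1.0, 0.1, 0.2],
     [0.1, 1.0, 0.1],
     [0.2, 0.1, 1.0]]
P = [1.0, 1.0, 1.0]

closed_form = outage_probability(G, P, i=0, sir_th=0.5)
estimate = simulate_outage(G, P, i=0, sir_th=0.5)
```

For link 0 in this instance the closed form gives $1 - 1/(1.05 \times 1.1) \approx 0.134$, and the simulated frequency should agree up to sampling error.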

Cellular wireless networks

We first present how GP-based power control applies to cellular wireless networks with one-hop transmission from $N$ users to a base station. These results extend the scope of power control beyond the classical solution in CDMA systems that equalizes SIRs, and beyond the iterative algorithms (e.g., in [2, 16, 34]) that minimize total power (a linear objective function) subject to SIR constraints.

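We start the discussion of the suite of power control problem formulations with a simple objective function and basic constraints. The following constrained problem of maximizing the SIR of a particular user $i^*$ is a GP:

\[
\begin{array}{ll}
\text{maximize} & R_{i^*}(\mathbf{P}) \\
\text{subject to} & R_i(\mathbf{P}) \ge R_{i,\min}, \quad \forall i, \\
& P_{i_1} G_{i_1} = P_{i_2} G_{i_2}, \\
& 0 \le P_i \le P_{i,\max}, \quad \forall i.
\end{array}
\]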

The first constraint, equivalent to $\mathrm{SIR}_i \ge \mathrm{SIR}_{i,\min}$, sets a floor on the SIR of other users and protects these users from user $i^*$ increasing her transmit power excessively. The second constraint reflects the classical power control criterion in solving the near-far problem in CDMA systems: the expected received power from one transmitter $i_1$ must equal that from another $i_2$. The third constraint represents regulatory or system limitations on transmit powers. All constraints can be verified to be upper bounds on posynomials, or monomial equalities, in the transmit power vector $\mathbf{P}$.

Alternatively, we can use GP to maximize the minimum rate among all users. The maxmin fairness objective

\[
\text{maximize}_{\mathbf{P}} \quad \min_i \{R_i\}
\]

can be accommodated in GP-based power control because it can be turned into equivalently maximizing an auxiliary variable $t$ such that $\mathrm{SIR}_i(\mathbf{P}) \ge \exp(t)$, $\forall i$, which has posynomial objective and constraints in $(\mathbf{P}, t)$.

Example 5. A small illustrative example. A simple system comprised of five users is used for a numerical example. The five users are spaced at distances $d$ of 1, 5, 10, 15, and 20 units from the base station. The power fall-off factor is $\gamma = 4$. Each user has a maximum power constraint of $P_{\max} = 0.5$ mW. The noise power is 0.5 µW for all users. The SIR of all users, other than the user we are optimizing for, must be greater than a common threshold SIR level $\beta$. In different experiments, $\beta$ is varied to observe the effect on the optimized user's SIR. This is done independently for the near user at $d = 1$, a medium distance user at $d = 15$, and the far user at $d = 20$. The results are plotted in Figure 7.

[Figure 7: plot titled "Optimized SIR vs. Threshold SIR". Horizontal axis: Threshold SIR (dB), from −5 to 10. Vertical axis: Optimized SIR (dB), from −20 to 20. Curves: near, medium, far.]

Fig. 7. Constrained optimization of power control in a cellular network (Example 5).

Several interesting effects are illustrated. First, when the required threshold SIR in the constraints is sufficiently high, there is no feasible power control solution. At moderate threshold SIR, as $\beta$ is decreased, the optimized SIR initially increases rapidly. This is because the optimized user is allowed to increase its own power by the sum of the power reductions in the four other users, and the noise is relatively insignificant. At low threshold SIR, the noise becomes more significant and the power trade-off from the other users less significant, so the curve starts to bend over. Eventually, the optimized user reaches its upper bound on power and cannot utilize the excess power allowed by the lower threshold SIR for other users. This is exhibited by the transition from a sharp bend in the curve to a much shallower sloped curve.

We now proceed to show that GP can also be applied to problem formulations with an overall system objective of total system throughput, under both user data rate constraints and outage probability constraints.

The following constrained problem of maximizing system throughput is a GP:

\[
\begin{array}{ll}
\text{maximize} & R_{system}(\mathbf{P}) \\
\text{subject to} & R_i(\mathbf{P}) \ge R_{i,\min}, \quad \forall i, \\
& P_{o,i}(\mathbf{P}) \le P_{o,i,\max}, \quad \forall i, \\
& 0 \le P_i \le P_{i,\max}, \quad \forall i,
\end{array}
\tag{18}
\]

where the optimization variables are the transmit powers $\mathbf{P}$. The objective is equivalent to minimizing the posynomial $\prod_i \mathrm{ISR}_i$, where ISR is $\frac{1}{\mathrm{SIR}}$. Each ISR is a posynomial in $\mathbf{P}$ and the product of posynomials is again a posynomial. The first constraint is from the data rate demand $R_{i,\min}$ by each user. The second constraint represents the outage probability upper bounds $P_{o,i,\max}$. These inequality constraints put upper bounds on posynomials of $\mathbf{P}$, as can be readily verified through (16) and (17). Thus (18) is indeed a GP, and efficiently solvable for global optimality.

There are several obvious variations of problem (18) that can be solved by GP, e.g., we can lower bound $R_{system}$ as a constraint and maximize $R_{i^*}$ for a particular user $i^*$, or have a total power $\sum_i P_i$ constraint or objective function.

The objective function to be maximized can also be generalized to a weighted sum of data rates, $\sum_i w_i R_i$, where $w \succeq 0$ is a given weight vector. This is still a GP because maximizing $\sum_i w_i \log \mathrm{SIR}_i$ is equivalent to maximizing $\log \prod_i \mathrm{SIR}_i^{w_i}$, which is in turn equivalent to minimizing $\prod_i \mathrm{ISR}_i^{w_i}$. Now use auxiliary variables $\{t_i\}$, and minimize $\prod_i t_i^{w_i}$ over the original constraints in (18) plus the additional constraints $\mathrm{ISR}_i \le t_i$ for all $i$. This is readily verified to be a GP in $(\mathbf{P}, t)$, and is equivalent to the original problem.

Generalizing the above discussions and observing that the high-SIR assumption is needed for GP formulation only when there are sums of $\log(1 + \mathrm{SIR})$ in the optimization problem, we have the following summary.

Proposition 1 In the high-SIR regime, any combination of objectives (A)-(E) and constraints (a)-(e) in Table 2 (pick any one of the objectives and any subset of the constraints) is a power control optimization problem that can be solved by GP, i.e., can be transformed into a convex optimization with efficient algorithms to compute the globally optimal power vector. When objectives (C)-(D) or constraints (c)-(d) do not appear, the power control optimization problem can be solved by GP in any SIR regime.

In addition to efficient computation of the globally optimal power allocation with nonlinear objectives and constraints, GP can also be used for admission control based on the feasibility study described in [10], and for determining which QoS constraint is a performance bottleneck, i.e., met tightly at the optimal power allocation⁴.

⁴ This is because most GP solution algorithms solve both the primal GP and its Lagrange dual problem, and by the complementary slackness condition, a resource constraint is tight at the optimal power allocation when the corresponding optimal dual variable is non-zero.


Table 2. Suite of power control optimization problems solvable by GP

Objective function                                      Constraints
(A) Maximize $R_{i^*}$ (specific user)                  (a) $R_i \ge R_{i,\min}$ (rate constraint)
(B) Maximize $\min_i R_i$ (worst-case user)             (b) $P_{i_1} G_{i_1} = P_{i_2} G_{i_2}$ (near-far constraint)
(C) Maximize $\sum_i R_i$ (total throughput)            (c) $\sum_i R_i \ge R_{system,\min}$ (total throughput constraint)
(D) Maximize $\sum_i w_i R_i$ (weighted rate sum)       (d) $P_{o,i} \le P_{o,i,\max}$ (outage prob. constraint)
(E) Minimize $\sum_i P_i$ (total power)                 (e) $0 \le P_i \le P_{i,\max}$ (power constraint)

Extensions

In wireless multihop networks, system throughput may be measured either by end-to-end transport layer utilities or by link layer aggregate throughput. GP application to the first approach has appeared in [9], and to the second approach in [10]. Furthermore, delay and buffer overflow properties can also be accommodated in the constraints or objective function of GP-based power control.

3.4 Power Control by Geometric Programming: Non-convex Case

If we maximize the total throughput $R_{system}$ in the medium to low SIR case, i.e., when SIR is not much larger than 0 dB, the approximation of $\log(1 + \mathrm{SIR})$ as $\log \mathrm{SIR}$ does not hold. Unlike SIR, which is an inverted posynomial, $1 + \mathrm{SIR}$ is not an inverted posynomial. Instead, $\frac{1}{1 + \mathrm{SIR}}$ is a ratio between two posynomials:

\[
\frac{f(\mathbf{P})}{g(\mathbf{P})} = \frac{\sum_{j \ne i} G_{ij} P_j + n_i}{\sum_{j} G_{ij} P_j + n_i}.
\tag{19}
\]
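The step behind (19) can be written out explicitly. The short derivation below assumes the SIR expression with the fading terms dropped, $\mathrm{SIR}_i = G_{ii} P_i / (\sum_{j \ne i} G_{ij} P_j + n_i)$; that this is the form intended in (19) is our reading of the text.

\[
\begin{aligned}
1 + \mathrm{SIR}_i
&= \frac{\sum_{j \ne i} G_{ij} P_j + n_i}{\sum_{j \ne i} G_{ij} P_j + n_i}
 + \frac{G_{ii} P_i}{\sum_{j \ne i} G_{ij} P_j + n_i}
 = \frac{\sum_{j} G_{ij} P_j + n_i}{\sum_{j \ne i} G_{ij} P_j + n_i}, \\
\frac{1}{1 + \mathrm{SIR}_i}
&= \frac{\sum_{j \ne i} G_{ij} P_j + n_i}{\sum_{j} G_{ij} P_j + n_i}.
\end{aligned}
\]

Both the numerator and the denominator are posynomials in $\mathbf{P}$, but the denominator has more than one term, so the ratio is not itself a posynomial.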

Minimizing or upper bounding a ratio between two posynomials belongs to a truly nonconvex class of problems known as Complementary GP [1, 10], which is an intrinsically intractable NP-hard problem. An equivalent generalization of GP is Signomial Programming (SP) [1, 10]: minimizing a signomial subject to upper bound inequality constraints on signomials, where a signomial $s(x)$ is a sum of monomials, possibly with negative multiplicative coefficients: $s(x) = \sum_{i=1}^{N} c_i g_i(x)$, where $c \in \mathbf{R}^N$ and $g_i(x)$ are monomials⁵.

⁵ An SP can always be converted into a Complementary GP, because an inequality in SP, which can be written as $f_{i1}(x) - f_{i2}(x) \le 1$, where $f_{i1}, f_{i2}$ are posynomials, is equivalent to an inequality $\frac{f_{i1}(x)}{1 + f_{i2}(x)} \le 1$ in Complementary GP.

Successive convex approximation method

Consider the following nonconvex problem:

\[
\begin{array}{ll}
\text{minimize} & f_0(x) \\
\text{subject to} & f_i(x) \le 1, \quad i = 1, 2, \ldots, m,
\end{array}
\tag{20}
\]

where $f_0$ is convex without loss of generality⁶, but the $f_i(x)$'s, $\forall i$, are nonconvex. Since directly solving this problem is NP-hard, we want to solve it by a series of approximations $\tilde{f}_i(x) \approx f_i(x)$, $\forall x$, each of which can be optimally solved in an easy way. It is known [33] that if the approximations satisfy the following three properties, then the solutions of this series of approximations converge to a point satisfying the necessary Karush-Kuhn-Tucker (KKT) optimality conditions of the original problem:

(1) $f_i(x) \le \tilde{f}_i(x)$ for all $x$,
(2) $f_i(x_0) = \tilde{f}_i(x_0)$, where $x_0$ is the optimal solution of the approximated problem in the previous iteration,
(3) $\nabla f_i(x_0) = \nabla \tilde{f}_i(x_0)$.

The following algorithm describes the generic successive approximation approach. Given a method to approximate $f_i(x)$ with $\tilde{f}_i(x)$, $\forall i$, around some point of interest $x_0$, it outputs a vector that satisfies the KKT conditions of the original problem.

Algorithm 2. Successive approximation to a nonconvex problem.
1) Choose an initial feasible point $x^{(0)}$ and set $k = 1$.
2) Form an approximated problem of (20) based on the previous point $x^{(k-1)}$.
3) Solve the $k$-th approximated problem to obtain $x^{(k)}$.
4) Increment $k$ and go to step 2 until convergence to a stationary point.
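As a minimal sketch of Algorithm 2, consider a toy one-dimensional instance that is not taken from the text: minimize $x$ over $x > 0$ subject to $f(x) = 5 - x^2 \le 1$, whose feasible set $x \ge 2$ comes from a concave (hence nonconvex) constraint function. Linearizing $-x^2$ at the previous point $x_0$ gives $\tilde{f}(x) = 5 + x_0^2 - 2 x_0 x$, which satisfies the three conditions: $\tilde{f}(x) - f(x) = (x - x_0)^2 \ge 0$, with equal value and gradient at $x_0$. Each approximated problem is then linear and has the closed-form solution used below; the function name and tolerance are our own choices.

```python
def successive_approximation(x0, tol=1e-12, max_iter=100):
    """Toy instance of Algorithm 2: minimize x subject to 5 - x**2 <= 1, x > 0.

    At each iteration the constraint is replaced by its upper approximation
    5 + x_prev**2 - 2*x_prev*x <= 1, i.e. x >= (4 + x_prev**2) / (2*x_prev),
    so the approximated problem is solved exactly by that lower bound.
    """
    if x0 <= 0 or 5.0 - x0 ** 2 > 1.0:
        raise ValueError("x0 must be a feasible starting point (x0 >= 2)")
    x_prev = x0
    for k in range(1, max_iter + 1):
        x_new = (4.0 + x_prev ** 2) / (2.0 * x_prev)  # step 3: solve k-th problem
        if abs(x_new - x_prev) <= tol:                # step 4: convergence check
            return x_new, k
        x_prev = x_new                                # step 2: re-approximate
    return x_prev, max_iter

x_star, iterations = successive_approximation(5.0)
```

Starting from the feasible point $x^{(0)} = 5$, the iterates decrease monotonically toward $x = 2$, which is the global optimum of this toy problem. In general the method only guarantees convergence to a KKT point.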

Single condensation method. Complementary GPs involve upper bounds on the ratio of posynomials as in (19); they can be turned into GPs by approximating the denominator of the ratio of posynomials, $g(x)$, with a monomial $\tilde{g}(x)$, but leaving the numerator $f(x)$ as a posynomial.

Lemma 1 Let $g(x) = \sum_i u_i(x)$ be a posynomial. Then

\[
g(x) \ge \tilde{g}(x) = \prod_i \Bigl( \frac{u_i(x)}{\alpha_i} \Bigr)^{\alpha_i}.
\tag{21}
\]

If, in addition, $\alpha_i = u_i(x_0)/g(x_0)$, $\forall i$, for any fixed positive $x_0$, then $\tilde{g}(x_0) = g(x_0)$, and $\tilde{g}(x)$ is the best local monomial approximation to $g(x)$ near $x_0$ in the sense of first order Taylor approximation.

Proof. The arithmetic-geometric mean inequality states that $\sum_i \alpha_i v_i \ge \prod_i v_i^{\alpha_i}$, where $v \succ 0$ and $\alpha \succeq 0$, $\mathbf{1}^T \alpha = 1$. Letting $u_i = \alpha_i v_i$, we can write this basic inequality as $\sum_i u_i \ge \prod_i \bigl( \frac{u_i}{\alpha_i} \bigr)^{\alpha_i}$. The inequality becomes an equality if we let $\alpha_i = u_i / \sum_i u_i$, $\forall i$, which satisfies the condition that $\alpha \succeq 0$ and $\mathbf{1}^T \alpha = 1$. It can be readily verified that the best local monomial approximation of $g(x)$ near $x_0$ is $\tilde{g}(x)$.

⁶ If $f_0$ is nonconvex, we can move the objective function to the constraints by introducing an auxiliary scalar variable $t$ and writing: minimize $t$ subject to the additional constraint $f_0(x) - t \le 0$.
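The condensation in Lemma 1 is easy to carry out numerically. The sketch below assumes the same hypothetical `(d_k, a_k)` term representation as a list of pairs, requires every $d_k > 0$ so that all weights $\alpha_k$ are strictly positive, and uses an illustrative denominator of the "interference plus noise" type; none of the numbers come from the text.

```python
import math

def evaluate(terms, x):
    """Evaluate the posynomial sum_k d_k * prod_j x_j ** a_kj."""
    return sum(d * math.prod(xj ** aj for xj, aj in zip(x, a)) for d, a in terms)

def condense(terms, x0):
    """Return (coeff, exponents) of the monomial in (21) with
    alpha_k = u_k(x0) / g(x0). Assumes every d_k > 0."""
    values = [evaluate([term], x0) for term in terms]
    total = sum(values)
    alphas = [v / total for v in values]
    # prod_k (d_k x^{a_k} / alpha_k)^{alpha_k}
    #   = [prod_k (d_k / alpha_k)^{alpha_k}] * x^{sum_k alpha_k a_k}
    coeff = math.prod((d / al) ** al for (d, _), al in zip(terms, alphas))
    exponents = tuple(
        sum(al * a[j] for (_, a), al in zip(terms, alphas))
        for j in range(len(x0))
    )
    return coeff, exponents

# Hypothetical denominator g(P) = 1.0*P1 + 2.0*P2 + 0.5 (illustrative values only).
g_terms = [(1.0, (1.0, 0.0)), (2.0, (0.0, 1.0)), (0.5, (0.0, 0.0))]
P0 = (1.0, 1.0)
coeff, exponents = condense(g_terms, P0)

def g_tilde(P):
    """The condensed monomial approximation of g around P0."""
    return evaluate([(coeff, exponents)], P)

touches_at_P0 = abs(g_tilde(P0) - evaluate(g_terms, P0)) < 1e-9
lower_bound_holds = all(
    g_tilde(P) <= evaluate(g_terms, P) + 1e-12
    for P in [(0.2, 3.0), (2.5, 0.1), (1.0, 1.0), (4.0, 4.0)]
)
```

Here the weights are $\alpha = (1, 2, 0.5)/3.5$, so the condensed monomial is $3.5\, P_1^{2/7} P_2^{4/7}$: it matches $g$ at $P_0$ and lies below it at the other test points, as the lemma requires.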

Proposition 2 The approximation of a ratio of posynomials $f(x)/g(x)$ with $f(x)/\tilde{g}(x)$, where $\tilde{g}(x)$ is the monomial approximation of $g(x)$ using the arithmetic-geometric mean approximation of Lemma 1, satisfies the three conditions for the convergence of the successive approximation method.

Proof. Conditions (1) and (2) are clearly satisfied since $g(x) \ge \tilde{g}(x)$ and $\tilde{g}(x_0) = g(x_0)$ (Lemma 1). Condition (3) is easily verified by taking derivatives of $g(x)$ and $\tilde{g}(x)$.

Double condensation method. Another choice of approximation is to make a double monomial approximation for both the denominator and the numerator in (19). However, in order to satisfy the three conditions for the convergence of the successive approximation method, a monomial approximation $\tilde{f}(x)$ for the numerator $f(x)$ should satisfy $f(x) \le \tilde{f}(x)$.

Applications to power control

Figure 8 shows a block diagram of the approach of GP-based power control for the general SIR regime. In the high SIR regime, we need to solve only one GP. In the medium to low SIR regimes, the power control problems are truly nonconvex and cannot be turned into a convex formulation; we solve them through a series of GPs.

[Figure 8: block diagram. High SIR branch: Original Problem, then Solve 1 GP. Medium to Low SIR branch: Original Problem, SP, Complementary GP, (Condensed) Solve 1 GP.]

Fig. 8. GP-based power control for general SIR regime.

GP-based power control problems in the medium to low SIR regimes become SP (or, equivalently, Complementary GP), which can be solved by the single or double condensation method. We focus on the single condensation method here. Consider a representative problem formulation of maximizing total system throughput in a cellular wireless network subject to user rate and outage probability constraints in problem (18), which can be explicitly written out as:

\[
\begin{array}{ll}
\text{minimize} & \prod_{i=1}^{N} \frac{1}{1 + \mathrm{SIR}_i} \\
\text{subject to} & (2^{T R_{i,\min}} - 1) \frac{1}{\mathrm{SIR}_i} \le 1, \quad i = 1, \ldots, N, \\
& (\mathrm{SIR}_{th})^{N-1} (1 - P_{o,i,\max}) \prod_{j \ne i}^{N} \frac{G_{ij} P_j}{G_{ii} P_i} \le 1, \quad i = 1, \ldots, N, \\
& P_i (P_{i,\max})^{-1} \le 1, \quad i = 1, \ldots, N.
\end{array}
\tag{22}
\]

All the constraints are posynomials. However, the objective is not a posynomial, but a ratio between two posynomials as in (19). This power control problem can be solved by the condensation method by solving a series of GPs. Specifically, we have the following single-condensation algorithm:

Algorithm 3. Single condensation GP power control.
1) Evaluate the denominator posynomial of the objective function in (22) with the given $\mathbf{P}$.
2) Compute, for each term $i$ in this posynomial,

\[
\alpha_i = \frac{\text{value of } i\text{th term in posynomial}}{\text{value of posynomial}}.
\]

3) Condense the denominator posynomial of the (22) objective function into a monomial using (21) with weights $\alpha_i$.
4) Solve the resulting GP using an interior point method.
5) Go to step 1 using $\mathbf{P}$ of step 4.
6) Terminate the $k$th loop if $\| \mathbf{P}^{(k)} - \mathbf{P}^{(k-1)} \| \le \epsilon$, where $\epsilon$ is the error tolerance for the exit condition.

As condensing the objective in the above problem gives us an underestimate of the objective value, each GP in the condensation iteration loop tries to improve the accuracy of the approximation to a particular minimum in the original feasible region. All three conditions for convergence are satisfied, and the algorithm is convergent. Empirically, through extensive numerical experiments, we observe that it almost always computes the globally optimal power allocation.

Example 6. Single condensation example. We consider a wireless cellular network with 3 users. Let $T = 10^{-6}$ s, $G_{ii} = 1.5$, and generate $G_{ij}$, $i \ne j$, as independent random variables uniformly distributed between 0 and 0.3. The threshold SIR is $\mathrm{SIR}_{th} = -10$ dB, and the minimal data rate requirements are 100 kbps, 600 kbps and 1000 kbps for logical links 1, 2 and 3, respectively. Maximal outage probabilities are 0.01 for all links, and maximal transmit powers are 3 mW, 4 mW and 5 mW for links 1, 2 and 3, respectively. For each instance of SP power control (22), we pick a random initial feasible power vector $\mathbf{P}$ uniformly between 0 and $\mathbf{P}_{\max}$. Figure 9 compares the maximized total network throughput achieved over five hundred sets of experiments with different initial vectors. With the (single) condensation method, SP converges to different optima over the entire set of experiments, achieving (or coming very close to) the global optimum at 5290 bps 96% of the time and a local optimum at 5060 bps 4% of the time. The average number of GP iterations required by the condensation method over the same set of experiments is 15 if an extremely tight exit condition is picked for the SP condensation iteration: $\epsilon = 1 \times 10^{-10}$. This average can be substantially reduced by using a larger $\epsilon$, e.g., increasing $\epsilon$ to $1 \times 10^{-2}$ requires on average only 4 GPs.

[Figure 9: plot titled "Optimized total system throughput". Horizontal axis: Experiment index, from 0 to 500. Vertical axis: Total system throughput achieved, from 5050 to 5300.]

Fig. 9. Maximized total system throughput achieved by the (single) condensation method for 500 different initial feasible vectors (Example 6). Each point represents a different experiment with a different initial power vector.

We have thus far discussed a power control problem (22) where the objective function needs to be condensed. The method is also applicable if some constraint functions are signomials and need to be condensed [51].

3.5 Distributed Implementation

A limitation for GP-based power control in ad hoc networks without base stations is the need for centralized computation (e.g., by interior point methods). The GP formulations of power control problems can also be solved by a new distributed algorithm for GP. The basic idea is that each user solves its own local optimization problem and the coupling among users is taken care of by message passing among the users. Interestingly, the special structure of coupling for the problem at hand (all coupling among the logical links can be lumped together using interference terms) allows one to further reduce the amount of message passing among the users. Specifically, we use a dual decomposition method to decompose a GP into smaller subproblems whose solutions are jointly and iteratively coordinated by the use of dual variables. The key step is to introduce auxiliary variables and to add extra equality constraints, thus transferring the coupling in the objective to coupling in the constraints, which can be solved by introducing “consistency pricing” (in contrast to “congestion pricing”). We illustrate this idea through an unconstrained GP followed by an application of the technique to power control.

Distributed algorithm for GP

Suppose we have the following unconstrained standard form GP in x ≻ 0:

$$\text{minimize} \quad \sum_i f_i\left(x_i, \{x_j\}_{j \in I(i)}\right) \tag{23}$$

where $x_i$ denotes the local variable of the ith user, $\{x_j\}_{j \in I(i)}$ denote the coupled variables from other users, and $f_i$ is either a monomial or posynomial. Making a change of variable $y_i = \log x_i$, $\forall i$, in the original problem, we obtain

$$\text{minimize} \quad \sum_i f_i\left(e^{y_i}, \{e^{y_j}\}_{j \in I(i)}\right).$$

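We now rewrite the problem by introducing auxiliary variables $y_{ij}$ for the coupled arguments and additional equality constraints to enforce consistency:

$$\begin{array}{ll} \text{minimize} & \sum_i f_i\left(e^{y_i}, \{e^{y_{ij}}\}_{j \in I(i)}\right) \\ \text{subject to} & y_{ij} = y_j, \ \forall j \in I(i), \ \forall i. \end{array} \tag{24}$$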

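The ith user controls the local variables $(y_i, \{y_{ij}\}_{j \in I(i)})$. Next, the Lagrangian of (24) is formed as

$$L(\{y_i\}, \{y_{ij}\}; \{\gamma_{ij}\}) = \sum_i f_i\left(e^{y_i}, \{e^{y_{ij}}\}_{j \in I(i)}\right) + \sum_i \sum_{j \in I(i)} \gamma_{ij}(y_j - y_{ij}) = \sum_i L_i(y_i, \{y_{ij}\}; \{\gamma_{ij}\})$$

where

$$L_i(y_i, \{y_{ij}\}; \{\gamma_{ij}\}) = f_i\left(e^{y_i}, \{e^{y_{ij}}\}_{j \in I(i)}\right) + \Big(\sum_{j: i \in I(j)} \gamma_{ji}\Big) y_i - \sum_{j \in I(i)} \gamma_{ij} y_{ij}. \tag{25}$$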

The minimization of the Lagrangian with respect to the primal variables $(\{y_i\}, \{y_{ij}\})$ can be done simultaneously and distributively by each user in parallel. In the more general case where the original problem (23) is constrained, the additional constraints can be included in the minimization at each $L_i$.


In addition, the following master Lagrange dual problem has to be solved to obtain the optimal dual variables or consistency prices $\{\gamma_{ij}\}$:

$$\max_{\{\gamma_{ij}\}} \ g(\{\gamma_{ij}\}) \tag{26}$$

where

$$g(\{\gamma_{ij}\}) = \sum_i \min_{y_i, \{y_{ij}\}} L_i(y_i, \{y_{ij}\}; \{\gamma_{ij}\}).$$

Note that the transformed primal problem (24) is convex with zero duality gap; hence the Lagrange dual problem indeed solves the original standard GP problem. A simple way to solve the maximization in (26) is with the following subgradient update for the consistency prices:

$$\gamma_{ij}(t+1) = \gamma_{ij}(t) + \delta(t)\left(y_j(t) - y_{ij}(t)\right). \tag{27}$$

An appropriate choice of the stepsize $\delta(t) > 0$, e.g., $\delta(t) = \delta_0/t$ for some constant $\delta_0 > 0$, leads to convergence of the dual algorithm.

Summarizing, the ith user has to: i) minimize the function $L_i$ in (25) involving only local variables, upon receiving the updated dual variables $\{\gamma_{ji}, j : i \in I(j)\}$, and ii) update the local consistency prices $\{\gamma_{ij}, j \in I(i)\}$ with (27), and broadcast the updated prices to the coupled users.
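The two steps can be traced on a small instance. The sketch below assumes a hypothetical two-user unconstrained GP, minimize $x_1 + x_1^{-1}x_2^{-1} + x_2 + x_2^{-1}$, in which only user 1 depends on the other user's variable, so there is a single consistency price $\gamma_{12}$. For this toy problem the local minimizers of $L_1$ and $L_2$ have closed forms (derived for this example only, valid while $\gamma_{12} < 0$); the starting price, $\delta_0$ and the iteration count are arbitrary choices.

```python
import math

def local_min_user1(gamma12):
    """argmin of L_1 = e^{y1} + e^{-y1-y12} - gamma12*y12 (needs gamma12 < 0)."""
    c = -gamma12
    return math.log(c), -2.0 * math.log(c)   # (y1, y12)

def local_min_user2(gamma12):
    """argmin of L_2 = e^{y2} + e^{-y2} + gamma12*y2, i.e. 2*sinh(y2) = -gamma12."""
    return math.asinh(-gamma12 / 2.0)

def distributed_gp(gamma12=-1.0, delta0=0.5, iters=2000):
    """Dual decomposition with the consistency-price update (27)."""
    for t in range(1, iters + 1):
        _, y12 = local_min_user1(gamma12)    # step i) at user 1
        y2 = local_min_user2(gamma12)        # step i) at user 2
        gamma12 += (delta0 / t) * (y2 - y12) # step ii), stepsize delta0/t
    y1, y12 = local_min_user1(gamma12)
    y2 = local_min_user2(gamma12)
    return math.exp(y1), math.exp(y2), y2 - y12

x1, x2, consistency_gap = distributed_gp()
```

Setting the gradient of the toy objective to zero gives $x_1^4 + x_1^3 = 1$ and $x_2 = x_1^{-2}$, which the iterates approach as the consistency gap $y_2 - y_{12}$ shrinks.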

Applications to power control

As an illustrative example, we maximize the total system throughput in the high SIR regime with constraints local to each user. If we directly applied the distributed approach described in the last subsection, the resulting algorithm would require knowledge by each user of the interfering channels and interfering transmit powers, which would translate into a large amount of message passing. To obtain a practical distributed solution, we can leverage the structures of the power control problems at hand, and instead keep a local copy of each of the effective received powers $P^R_{ij} = G_{ij}P_j$. Again using problem (18) as an example formulation and assuming high SIR, we can write the problem as follows (after the log change of variable):

$$\begin{array}{ll} \text{minimize} & \sum_i \log\left(G_{ii}^{-1}\exp(-P_i)\left(\sum_{j \neq i}\exp(P^R_{ij}) + \sigma^2\right)\right) \\ \text{subject to} & P^R_{ij} = G_{ij} + P_j, \\ & \text{Constraints local to each user, e.g., (a), (d) and (e) in Table 2.} \end{array} \tag{28}$$

The partial Lagrangian is

$$L = \sum_i \log\left(G_{ii}^{-1}\exp(-P_i)\left(\sum_{j \neq i}\exp(P^R_{ij}) + \sigma^2\right)\right) + \sum_i \sum_{j \neq i} \gamma_{ij}\left(P^R_{ij} - (G_{ij} + P_j)\right), \tag{29}$$


and the local ith Lagrangian function in (29) is distributed to the ith user, from which the dual decomposition method can be used to determine the optimal power allocation $P^*$. The distributed power control algorithm is summarized as follows.

Algorithm 4. Distributed power allocation update to maximize $R_{system}$.

At each iteration t:

1) The ith user receives the term $\left(\sum_{j \neq i}\gamma_{ji}(t)\right)$ involving the dual variables from the interfering users by message passing and minimizes the following local Lagrangian with respect to $P_i(t)$, $\{P^R_{ij}(t)\}_j$ subject to the local constraints:

$$L_i\left(P_i(t), \{P^R_{ij}(t)\}_j; \{\gamma_{ij}(t)\}_j\right) = \log\left(G_{ii}^{-1}\exp(-P_i(t))\left(\sum_{j \neq i}\exp(P^R_{ij}(t)) + \sigma^2\right)\right) + \sum_{j \neq i}\gamma_{ij}(t)P^R_{ij}(t) - \Big(\sum_{j \neq i}\gamma_{ji}(t)\Big)P_i(t).$$

2) The ith user estimates the effective received power from each of the interfering users, $P^R_{ij}(t) = G_{ij}P_j(t)$ for $j \neq i$, updates the dual variable by

$$\gamma_{ij}(t+1) = \gamma_{ij}(t) + (\delta_0/t)\left(P^R_{ij}(t) - \log\left(G_{ij}P_j(t)\right)\right), \tag{30}$$

and then broadcasts the updated dual variables by message passing to all interfering users in the system.

Example 7. Distributed GP power control. We apply the distributed algorithm to solve the above power control problem for three logical links with $G_{ij} = 0.2$, $i \neq j$, $G_{ii} = 1$, $\forall i$, and maximal transmit powers of 6 mW, 7 mW and 7 mW for links 1, 2 and 3, respectively. Figure 10 shows the convergence of the dual objective function towards the globally optimal total throughput of the network. Figure 11 shows the convergence of the two auxiliary variables in links 1 and 3 towards the optimal solutions.


[Figure 10: plot of the dual objective function (0.4 to 2.2, in units of $10^4$) against iteration (0 to 200).]

Fig. 10. Convergence of the dual objective function through distributed algorithm (Example 7).

[Figure 11: plot titled "Consistency of the auxiliary variables"; x-axis: iteration (0 to 200); y-axis from -10 to 2; curves: $\log(P_2)$, $\log(P^R_{12}/G_{12})$, $\log(P^R_{32}/G_{32})$.]

Fig. 11. Convergence of the consistency constraints through distributed algorithm (Example 7).

Power control problems with nonlinear objective and constraints may seem to be difficult, NP-hard problems to solve for global optimality. However, when SIR is much larger than 0 dB, GP can be used to turn these problems into intrinsically tractable convex formulations, accommodating a variety of possible combinations of objective and constraint functions involving data rate, delay, and outage probability. Then interior point algorithms can efficiently compute the globally optimal power allocation even for a large network. Feasibility analysis of GP naturally leads to admission control and pricing schemes. When the high SIR approximation cannot be made, these power control problems become SP and may be solved by the heuristic of the condensation method through a series of GPs. Distributed optimal algorithms for GP-based power control in multihop networks can also be carried out through message passing.

Several interesting research issues remain to be further explored, in particular, reduction of SP solution complexity (e.g., by using the high-SIR approximation to obtain the initial power vector and by solving the series of GPs only approximately except the last GP), and combination of the SP solution and the distributed algorithm for distributed power control in the low SIR regime.


4 DSL Spectrum Management

4.1 Introduction

Digital Subscriber Line (DSL) technologies transform traditional voice-band copper channels into high bandwidth data pipes, which are capable of delivering data rates up to several Mbps per twisted-pair over a distance of about 10 kft. The major obstacle for performance improvement in modern DSL systems (e.g., ADSL and VDSL) is the crosstalk, which is the interference generated between different lines in the same binder. The crosstalk is typically 10-20 dB larger than the background noise, and direct crosstalk cancellation (e.g., [6, 17]) may not be feasible in many cases due to complexity issues or as a result of unbundling. To mitigate the detriments caused by crosstalk, static spectrum management that mandates a spectrum mask or flat power backoff across all frequencies (i.e., tones) has been implemented in the current system.

Dynamic spectrum management (DSM) techniques, on the other hand, can significantly improve data rates over the current practice of static spectrum management. Within the current capability of the DSL modems, each modem has the capability to shape its own power spectrum density (PSD) across different tones, but can only treat crosstalk as background noise (i.e., no signal level coordination, such as vector transmission or iterative decoding, is allowed), and each modem is inherently a single-input-single-output communication system. The objective would be to optimize the PSD of all users on all tones (i.e., continuous power loading or discrete bit loading), such that they are “compatible” with each other and the system performance (e.g., weighted rate sum as discussed below) is maximized.

Compared to power control in wireless networks treated in the last section, the channel gains are not time-varying in DSL systems, but the problem dimension increases tremendously because there are many “tones” (or frequency carriers) over which transmission takes place. Nonconvexity still remains a major technical challenge, and the high SIR approximation in general cannot be made. However, utilizing the specific structures of the problem (e.g., the channel gain values), an efficient and distributed heuristic is shown to perform close to the optimum in many realistic DSL network scenarios.

This section develops, analyzes, and simulates a new algorithm for spectrum management in frequency selective interference channels for DSL, called Autonomous Spectrum Balancing (ASB). It is autonomous (a distributed algorithm across the users without explicit information exchange) with linear complexity, while being provably convergent and coming close to the globally optimal rate region in practice, thus overcoming bottlenecks in the state-of-the-art algorithms in DSM, such as IW, OSB, and ISB summarized below.

Let K be the number of tones and N the number of users (lines). The iterative waterfilling (IW) algorithm [56] is one of the first DSM algorithms proposed. In IW, each user views any crosstalk experienced as additive Gaussian noise, and seeks to maximize its data rate by “waterfilling” over the aggregated noise plus interference. No information exchange is needed among users, and all the actions are completely autonomous. IW leads to a great performance improvement over the static approach, and enjoys a low complexity that is linear in N. However, the greedy nature of IW leads to a performance far from optimal in near-far scenarios such as mixed CO/RT deployment and upstream VDSL.

To address this, an optimal spectrum balancing (OSB) algorithm [8] has been proposed, which finds the best possible spectrum management solution under the current capabilities of the DSL modems. OSB avoids the selfish behaviors of individual users by aiming at the maximization of a total weighted sum of users' rates, which corresponds to a boundary point of the achievable rate region. On the other hand, OSB has a high computational complexity that is exponential in N, which quickly leads to intractability when N is larger than 6. Moreover, it is a completely centralized algorithm where a spectrum management center at the central office needs to know the global information (i.e., all the noise PSDs and crosstalk channel gains in the same binder) to perform the algorithm.

As an improvement to the OSB algorithm, an iterative spectrum balancing (ISB) algorithm [7] has been proposed, which is based on a weighted sum rate maximization similar to that of OSB. Different from OSB, ISB performs the optimization iteratively through users, which leads to a quadratic complexity in N. Close to optimal performance can be achieved by the ISB algorithm in most cases. However, each user still needs to know the global information as in OSB, thus ISB is still a centralized algorithm and considered to be impractical in many cases.

This section presents the ASB algorithm [21], which further reduces the complexity relative to the ISB algorithm, and achieves close to optimal performance similar to ISB and OSB. The basic idea is to use the concept of a reference line to mimic a “typical” victim line in the current binder. By setting the power spectrum level to protect the reference line, a good balance between selfish and global maximizations can be achieved. The ASB algorithm enjoys a linear complexity in N and K, and can be implemented in a completely autonomous way. We prove the convergence of ASB for both the 2-user and N-user cases, under both sequential and parallel updates.

Table 3 compares various aspects of different DSM algorithms. Utilizing the structures of the DSL problem, in particular, the lack of channel variation and user mobility, is the key to providing a linear complexity, distributed, convergent, and almost optimal solution to this coupled nonconvex optimization problem.

4.2 System Model

Table 3. Comparison of different DSM algorithms

Algorithm   Operation     Complexity    Performance    Reference
IW          Autonomous    O(KN)         Suboptimal     [56]
OSB         Centralized   O(K e^N)      Optimal        [8]
ISB         Centralized   O(K N^2)      Near optimal   [7]
ASB         Autonomous    O(KN)         Near optimal   [21]

Using the notation as in [8, 7], we consider a DSL bundle with $\mathcal{N} = \{1, ..., N\}$ modems (i.e., lines, users) and $\mathcal{K} = \{1, ..., K\}$ tones. Assuming discrete multi-tone (DMT) modulation is employed by all modems, transmission can be modeled independently on each tone as

$$y_k = H_k x_k + z_k.$$

The vector $x_k = \{x_k^n, n \in \mathcal{N}\}$ contains transmitted signals on tone k, where $x_k^n$ is the signal transmitted onto line n at tone k. $y_k$ and $z_k$ have similar structures. $y_k$ is the vector of received signals on tone k. $z_k$ is the vector of additive noise on tone k and contains thermal noise, alien crosstalk, single-carrier modems, radio frequency interference, etc. $H_k = [h_k^{n,m}]_{n,m \in \mathcal{N}}$ is the $N \times N$ channel transfer matrix on tone k, where $h_k^{n,m}$ is the channel from TX m to RX n on tone k. The diagonal elements of $H_k$ contain the direct channels whilst the off-diagonal elements contain the crosstalk channels. We denote the transmit power spectrum density (PSD) $s_k^n = E\{|x_k^n|^2\}$. In the last section's notation for single-carrier systems, we would have $s_k^n = P_n$, $\forall k$. For convenience we denote the vector containing the PSD of user n on all tones as $s^n = \{s_k^n, k \in \mathcal{K}\}$. We denote the DMT symbol rate as $f_s$.

Assume that each modem treats interference from other modems as noise. When the number of interfering modems is large, the interference can be well approximated by a Gaussian distribution. Under this assumption the achievable bit loading of user n on tone k is

$$b_k^n = \log\left(1 + \frac{1}{\Gamma}\,\frac{s_k^n}{\sum_{m \neq n}\alpha_k^{n,m}s_k^m + \sigma_k^n}\right), \tag{31}$$

where $\alpha_k^{n,m} = |h_k^{n,m}|^2 / |h_k^{n,n}|^2$ is the normalized crosstalk channel gain, and $\sigma_k^n$ is the noise power density normalized by the direct channel gain $|h_k^{n,n}|^2$. Here $\Gamma$ denotes the SINR-gap to capacity, which is a function of the desired BER, coding gain and noise margin [49]. Without loss of generality, we assume $\Gamma = 1$. The data rate on line n is thus

$$R^n = f_s \sum_{k \in \mathcal{K}} b_k^n. \tag{32}$$

Each modem n is typically subject to a total power constraint $P^n$, due to the limitations on each modem's analog front end:

$$\sum_{k \in \mathcal{K}} s_k^n \leq P^n. \tag{33}$$
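The model (31)-(33) translates directly into array operations. The following is a minimal sketch, assuming NumPy arrays indexed as `s[n, k]`, `alpha[n, m, k]` and `sigma[n, k]`; the function names and the numerical values (including the symbol rate `fs = 4000.0`) are illustrative assumptions, and the natural logarithm is used so that rates are in nats per symbol.

```python
import numpy as np

def bit_loading(s, alpha, sigma, gap=1.0):
    """Eq. (31): b[n, k] for PSDs s (N x K), crosstalk gains alpha (N x N x K)
    and normalized noise sigma (N x K); gap is the SINR-gap Gamma."""
    n_users = s.shape[0]
    b = np.empty(s.shape, dtype=float)
    for n in range(n_users):
        # Crosstalk from every other line m != n, tone by tone.
        interference = sum(alpha[n, m] * s[m] for m in range(n_users) if m != n)
        b[n] = np.log(1.0 + s[n] / (gap * (interference + sigma[n])))
    return b

def data_rates(s, alpha, sigma, fs, gap=1.0):
    """Eq. (32): R[n] = fs * sum_k b[n, k]."""
    return fs * bit_loading(s, alpha, sigma, gap).sum(axis=1)

def power_feasible(s, p_total):
    """Eq. (33): total power constraint of every modem."""
    return bool(np.all(s.sum(axis=1) <= p_total))

# Hypothetical 2-user, 2-tone instance.
s = np.ones((2, 2))
alpha = np.full((2, 2, 2), 0.5)
sigma = np.full((2, 2), 0.5)
rates = data_rates(s, alpha, sigma, fs=4000.0)
```

In this instance every tone sees interference plus noise equal to 1, so each bit loading is $\log 2$ and each rate is $2 f_s \log 2$.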

4.3 Spectrum Management Problem Formulation

The spectrum management problem is defined as follows

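$$\begin{array}{ll} \text{maximize} & R^1 \\ \text{subject to} & R^n \geq R^{n,target}, \ \forall n > 1, \\ & \sum_{k \in \mathcal{K}} s_k^n \leq P^n, \ \forall n. \end{array} \tag{34}$$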

Here $R^{n,target}$ is the target rate constraint of user n. In other words, we try to maximize the achievable rate of user 1, under the condition that all other users achieve their target rates $R^{n,target}$. The mutual interference in (31) causes Problem (34) to be coupled across users on each tone, and the individual total power constraint causes Problem (34) to be coupled across tones as well. Moreover, the objective function in Problem (34) is non-convex due to the coupling of interference, and the convexity of the rate region cannot be guaranteed in general.

However, it has been shown in [57] that the duality gap of Problem (34) goes to zero when the number of tones K gets large (e.g., for VDSL), thus Problem (34) can be solved by the dual decomposition method, which brings the complexity as a function of K down to linear. Moreover, a frequency-sharing property ensures the rate region is convex with large enough K, and each boundary point of the rate region can be achieved by a weighted rate maximization as follows [8]:

$$\begin{array}{ll} \text{maximize} & R^1 + \sum_{n > 1} w_n R^n \\ \text{subject to} & \sum_{k \in \mathcal{K}} s_k^n \leq P^n, \ \forall n \in \mathcal{N}, \end{array} \tag{35}$$

such that the nonnegative weight coefficient $w_n$ is adjusted to ensure that the target rate constraint of user n is met. Without loss of generality, here we define $w_1 = 1$. By changing the rate constraints $R^{n,target}$ for users $n > 1$ (or equivalently, changing the weight coefficients $w_n$ for $n > 1$), every boundary point of the convex rate region can be tracked.

We observe that at the optimal solutions of (34), each user chooses a PSD level that leads to a good balance between maximizing her own rate and minimizing the damage she causes to the other users. To accurately calculate the latter, the user needs to know the global information of the noise PSDs and crosstalk channel gains. However, if we aim at a less aggressive objective and only require each user to give enough protection to the other users in the binder while maximizing her own rate, then global information may not be needed. Indeed, we can introduce the concept of a “reference line”, a virtual line that represents a “typical” victim in the current binder. Then instead of solving (34), each user tries to maximize the achievable data rate on the reference line, subject to its own data rate and total power constraint. Define the rate of the reference line to user n as

$$R^{n,ref} = \sum_{k \in \mathcal{K}} \tilde{b}_k^n = \sum_{k \in \mathcal{K}} \log\left(1 + \frac{\tilde{s}_k}{\tilde{\alpha}_k^n s_k^n + \tilde{\sigma}_k}\right).$$

The coefficients $\{\tilde{s}_k, \tilde{\sigma}_k, \tilde{\alpha}_k^n, \forall k, n\}$ are parameters of the reference line and can be obtained from field measurement. They represent the conditions of a “typical” victim user in an interference channel (here a binder of DSL lines), and are known to the users a priori. They can be further updated on a much slower timescale through channel measurement data. User n then wants to solve the following problem local to itself:

$$\begin{array}{ll} \text{maximize} & R^{n,ref} \\ \text{subject to} & R^n \geq R^{n,target}, \\ & \sum_{k \in \mathcal{K}} s_k^n \leq P^n. \end{array} \tag{36}$$

By using Lagrangian relaxation on the rate target constraint in Problem (36) with a weight coefficient (dual variable) $w_n$, the relaxed version of (36) is

$$\begin{array}{ll} \text{maximize} & w_n R^n + R^{n,ref} \\ \text{subject to} & \sum_{k \in \mathcal{K}} s_k^n \leq P^n. \end{array} \tag{37}$$

The weight coefficient wn needs to be adjusted to enforce the rate constraint.

4.4 ASB Algorithms

We first introduce the basic version of the ASB algorithm (ASB-I), where each user n chooses the PSD $s^n$ to solve (36), and updates the weight coefficient $w_n$ to enforce the target rate constraint. Then we introduce a variation of the ASB algorithm (ASB-II) that enjoys even lower computational complexity and provable convergence.

ASB-I

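For each user n, we replace the original optimization (36) with the Lagrange dual problem

$$\max_{\lambda_n \geq 0, \ \sum_{k \in \mathcal{K}} s_k^n \leq P^n} \ \sum_{k \in \mathcal{K}} \max_{s_k^n} J_k^n\left(w_n, \lambda_n, s_k^n, s_k^{-n}\right), \tag{38}$$

where

$$J_k^n\left(w_n, \lambda_n, s_k^n, s_k^{-n}\right) = w_n b_k^n + \tilde{b}_k^n - \lambda_n s_k^n. \tag{39}$$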

By introducing the dual variable $\lambda_n$, we decouple (36) into several smaller subproblems, one for each tone, and define $J_k^n$ as user n's objective function on tone k. The optimal PSD that maximizes $J_k^n$ for given $w_n$ and $\lambda_n$ is

$$s_k^{n,I}\left(w_n, \lambda_n, s_k^{-n}\right) = \arg\max_{s_k^n \in [0, P^n]} J_k^n\left(w_n, \lambda_n, s_k^n, s_k^{-n}\right), \tag{40}$$

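which can be found by solving the first order condition, $\partial J_k^n\left(w_n, \lambda_n, s_k^n, s_k^{-n}\right)/\partial s_k^n = 0$, which leads to

$$\frac{w_n}{s_k^{n,I} + \sum_{m \neq n}\alpha_k^{n,m}s_k^m + \sigma_k^n} - \frac{\tilde{\alpha}_k^n \tilde{s}_k}{\left(\tilde{s}_k + \tilde{\alpha}_k^n s_k^{n,I} + \tilde{\sigma}_k\right)\left(\tilde{\alpha}_k^n s_k^{n,I} + \tilde{\sigma}_k\right)} - \lambda_n = 0. \tag{41}$$

Note that (41) can be simplified into a cubic equation which has three solutions. The optimal PSD can be found by substituting these three solutions back into the objective function $J_k^n\left(w_n, \lambda_n, s_k^n, s_k^{-n}\right)$, as well as checking the boundary solutions $s_k^n = 0$ and $s_k^n = P^n$, and picking the one that yields the largest value of $J_k^n$.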
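The per-tone step can be sketched as follows, assuming $\Gamma = 1$ and natural logarithms. Multiplying (41) by its three denominators gives a cubic whose coefficients are assembled here with NumPy polynomial helpers; `interf` stands for $\sum_{m \neq n}\alpha_k^{n,m}s_k^m + \sigma_k^n$, and all names and numerical values are illustrative assumptions.

```python
import numpy as np

def tone_objective(s, w, lam, interf, a_ref, s_ref, sig_ref):
    """J of (39) on one tone: w*b + b_ref - lam*s, natural logs, Gamma = 1."""
    return (w * np.log1p(s / interf)
            + np.log1p(s_ref / (a_ref * s + sig_ref)) - lam * s)

def asb1_tone_psd(w, lam, interf, a_ref, s_ref, sig_ref, p_max):
    """Maximize J over [0, p_max] via the stationary points of (41)."""
    p1 = np.array([a_ref, sig_ref + s_ref])  # a_ref*s + sig_ref + s_ref
    p2 = np.array([a_ref, sig_ref])          # a_ref*s + sig_ref
    p3 = np.array([1.0, interf])             # s + interf
    q = np.polymul(p1, p2)
    # (41) times (s + interf) * q: w*q - a_ref*s_ref*(s + interf) - lam*(s + interf)*q
    cubic = np.polysub(np.polysub(w * q, a_ref * s_ref * p3),
                       lam * np.polymul(p3, q))
    candidates = [0.0, p_max]                # boundary solutions
    for r in np.roots(cubic):
        if abs(r.imag) < 1e-9 and 0.0 <= r.real <= p_max:
            candidates.append(float(r.real))
    return max(candidates,
               key=lambda s: tone_objective(s, w, lam, interf, a_ref, s_ref, sig_ref))

params = dict(w=1.0, lam=0.2, interf=0.1, a_ref=0.5, s_ref=1.0, sig_ref=0.05)
s_star = asb1_tone_psd(p_max=5.0, **params)
```

Comparing only the real roots inside $[0, P^n]$ and the two boundary points keeps the per-tone cost constant, which is what makes the overall complexity linear in K.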

The user then updates λn to enforce the power constraint, and updateswn to enforce the target rate constraint. The complete algorithm is given asfollows, where ελ and εw are small stepsizes for updating λn and wn.

Algorithm 5. Autonomous Spectrum Balancing.

repeat
    for each user n = 1, ..., N
        repeat
            for each tone k = 1, ..., K, find $s_k^{n,I} = \arg\max_{s_k^n \geq 0} J_k^n$
            $\lambda_n = \left[\lambda_n + \varepsilon_\lambda\left(\sum_k s_k^{n,I} - P^n\right)\right]^+$;
            $w_n = \left[w_n + \varepsilon_w\left(R^{n,target} - \sum_k b_k^n\right)\right]^+$;
        until convergence
    end
until convergence

ASB-II with Frequency-Selective Waterfilling

To obtain the optimal PSD in ASB-I (for fixed $\lambda_n$ and $w_n$), we have to solve for the roots of a cubic equation. To reduce the computational complexity and gain more insight into the solution structure, we assume that the reference line operates in the high SIR regime whenever it is active: if $\tilde{s}_k > 0$, then $\tilde{s}_k \gg \tilde{\sigma}_k \gg \tilde{\alpha}_k^n s_k^n$ for any feasible $s_k^n$, $n \in \mathcal{N}$ and $k \in \mathcal{K}$. This assumption is motivated by our observations on optimal solutions in DSL type of interference channels. It means that the reference PSD is much larger than the reference noise, which is in turn much larger than the interference from user n. Then on any tone $k \in \bar{\mathcal{K}} = \{k \mid \tilde{s}_k > 0, k \in \mathcal{K}\}$, the reference line's achievable rate is

$$\log\left(1 + \frac{\tilde{s}_k}{\tilde{\alpha}_k^n s_k^n + \tilde{\sigma}_k}\right) \approx \log\left(\frac{\tilde{s}_k}{\tilde{\sigma}_k}\right) - \frac{\tilde{\alpha}_k^n s_k^n}{\tilde{\sigma}_k},$$

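and user n's objective function on tone k can be approximated by

$$J_k^{n,II,1}\left(w_n, \lambda_n, s_k^n, s_k^{-n}\right) = w_n b_k^n - \frac{\tilde{\alpha}_k^n s_k^n}{\tilde{\sigma}_k} - \lambda_n s_k^n + \log\left(\frac{\tilde{s}_k}{\tilde{\sigma}_k}\right).$$

The corresponding optimal PSD is

$$s_k^{n,II,1}\left(w_n, \lambda_n, s_k^{-n}\right) = \left[\frac{w_n}{\lambda_n + \tilde{\alpha}_k^n/\tilde{\sigma}_k} - \sum_{m \neq n}\alpha_k^{n,m}s_k^m - \sigma_k^n\right]^+. \tag{42}$$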

This is a waterfilling type of solution and is intuitively satisfying: the PSD should be smaller when the power constraint is tighter (i.e., $\lambda_n$ is larger), or the interference coefficient to the reference line $\tilde{\alpha}_k^n$ is higher, or the noise level on the reference line $\tilde{\sigma}_k$ is smaller, or there is more interference plus noise $\sum_{m \neq n}\alpha_k^{n,m}s_k^m + \sigma_k^n$ on the current tone. It is different from the conventional waterfilling in that the water level in each tone is not only determined by the dual variables $w_n$ and $\lambda_n$, but also by the parameters of the reference line, $\tilde{\alpha}_k^n/\tilde{\sigma}_k$.

On the other hand, on any tone where the reference line is inactive, i.e., $k \in \bar{\mathcal{K}}^C = \{k \mid \tilde{s}_k = 0, k \in \mathcal{K}\}$, the objective function is

$$J_k^{n,II,2}\left(w_n, \lambda_n, s_k^n, s_k^{-n}\right) = w_n b_k^n - \lambda_n s_k^n,$$

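and the corresponding optimal PSD is

$$s_k^{n,II,2}\left(w_n, \lambda_n, s_k^{-n}\right) = \left[\frac{w_n}{\lambda_n} - \sum_{m \neq n}\alpha_k^{n,m}s_k^m - \sigma_k^n\right]^+. \tag{43}$$

This is the same solution as the iterative waterfilling.

The choice of optimal PSD in ASB-II can be summarized as follows:

$$s_k^{n,II}\left(w_n, \lambda_n, s_k^{-n}\right) = \begin{cases} \left[\dfrac{w_n}{\lambda_n + \tilde{\alpha}_k^n/\tilde{\sigma}_k} - \sum_{m \neq n}\alpha_k^{n,m}s_k^m - \sigma_k^n\right]^+, & k \in \bar{\mathcal{K}}, \\[2ex] \left[\dfrac{w_n}{\lambda_n} - \sum_{m \neq n}\alpha_k^{n,m}s_k^m - \sigma_k^n\right]^+, & k \in \bar{\mathcal{K}}^C. \end{cases} \tag{44}$$

This is essentially a waterfilling type of solution, with different water levels for different tones (frequencies). We call it frequency selective waterfilling.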
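The closed form (44) needs only a one-dimensional search for $\lambda_n$. The following is a minimal sketch, assuming the total power constraint is met with equality and that $\lambda_n$ is found by bisection (one possible choice, not prescribed by the text); `interf` stands for the per-tone interference plus noise, `ref_ratio` holds $\tilde{\alpha}_k^n/\tilde{\sigma}_k$ on tones where the reference line is active and 0 elsewhere, and all values are illustrative.

```python
import numpy as np

def asb2_psd(lam, w, interf, ref_ratio):
    """Eq. (44) on all tones at once for a given lambda > 0."""
    return np.maximum(w / (lam + ref_ratio) - interf, 0.0)

def asb2_waterfill(w, interf, ref_ratio, p_total, iters=100):
    """Bisection on lambda so that the PSD sums to (just under) p_total."""
    lo, hi = 0.0, 1.0
    while asb2_psd(hi, w, interf, ref_ratio).sum() > p_total:
        hi *= 2.0                      # total power is non-increasing in lambda
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if asb2_psd(mid, w, interf, ref_ratio).sum() > p_total:
            lo = mid
        else:
            hi = mid
    return hi, asb2_psd(hi, w, interf, ref_ratio)

# Hypothetical 4-tone example; the reference line is active on the first two tones.
interf = np.array([0.1, 0.2, 0.3, 0.4])
ref_ratio = np.array([0.5, 0.5, 0.0, 0.0])
lam, psd = asb2_waterfill(w=1.0, interf=interf, ref_ratio=ref_ratio, p_total=1.0)
```

On the first two tones the water level is $w_n/(\lambda_n + 0.5)$ and on the last two it is $w_n/\lambda_n$, so tones shared with the reference line receive less power for the same interference.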

4.5 Convergence Analysis

In this subsection, we show the convergence for both ASB-I and ASB-II, for the case where users fix their weight coefficients $w_n$, which is also called Rate Adaptive (RA) spectrum balancing [49] that aims at maximizing users' rates subject to power constraints (see footnote 7).

Convergence in the Two-user case

The first result is on the convergence of the ASB-I algorithm, with fixed $w = (w_1, w_2)$ and $\lambda = (\lambda_1, \lambda_2)$.

Proposition 3 The ASB-I algorithm converges in a two-user case under fixed w and $\lambda$, if users start from initial PSD values $(s_k^1, s_k^2) = (0, P^2)$ or $(s_k^1, s_k^2) = (P^1, 0)$ on all tones.

The proof of Proposition 3 uses supermodular game theory [53] and a strategy transformation similar to [20].

Now consider the ASB-II algorithm where two users sequentially optimize their PSD levels under fixed values of w, but adjust $\lambda$ to enforce the power constraint. The following lemma will be useful in proving the main convergence results.

Lemma 1. Consider any non-decreasing function $f(x)$ and non-increasing function $g(x)$, where there exists a unique $x^*$ such that $f(x^*) = g(x^*)$, $\partial f(x)/\partial x|_{x = x^*} > 0$ and $\partial g(x)/\partial x|_{x = x^*} < 0$. Then

$$x^* = \arg\min_x \{\max\{f(x), g(x)\}\}.$$

Proof. For any $\Delta x > 0$,

$$f(x^* + \Delta x) > f(x^*) = g(x^*) > g(x^* + \Delta x),$$

and

$$\max\{f(x^* + \Delta x), g(x^* + \Delta x)\} = f(x^* + \Delta x) > f(x^*) = \max\{f(x^*), g(x^*)\}.$$

Similarly, for any $\Delta x < 0$,

$$f(x^* + \Delta x) < f(x^*) = g(x^*) < g(x^* + \Delta x),$$

and

$$\max\{f(x^* + \Delta x), g(x^* + \Delta x)\} = g(x^* + \Delta x) > g(x^*) = \max\{f(x^*), g(x^*)\}.$$

It is then clear that

$$x^* = \arg\min_x \{\max\{f(x), g(x)\}\}.$$
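A quick numerical illustration of Lemma 1, as a sketch with two hypothetical functions chosen only because they satisfy the lemma's assumptions: $f(x) = e^x$ (increasing) and $g(x) = 2 - x$ (decreasing). The crossing point found by bisection should coincide with the grid minimizer of $\max\{f, g\}$ up to the grid spacing.

```python
import math

def f(x):
    return math.exp(x)   # non-decreasing, strictly increasing everywhere

def g(x):
    return 2.0 - x       # non-increasing, strictly decreasing everywhere

def crossing(lo=-2.0, hi=2.0, iters=100):
    """Bisection for the unique x* with f(x*) = g(x*); f - g is increasing."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < g(mid):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def grid_argmin(lo=-2.0, hi=2.0, n=40001):
    """Brute-force minimizer of max{f, g} on a uniform grid."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return min(xs, key=lambda x: max(f(x), g(x)))

x_cross = crossing()
x_min = grid_argmin()
```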

Footnote 7: The second main category of spectrum balancing problems is Fixed Margin (FM), which is concerned with finding a minimal power allocation such that the minimum target data rates for each user are satisfied. For example, problem (34) is a mixed RA/FM problem.


Denote $s_k^{n,t}$ as the PSD of user n in tone k after iteration t, where $\sum_k s_k^{n,t} = P^n$ is satisfied for any n and t. One iteration is defined as one round of updates of all users. We can show the following.

Proposition 4 The ASB-II algorithm globally converges to the unique fixed point in a two-user system under fixed w, if $\max_k \alpha_k^{2,1} \max_k \alpha_k^{1,2} < 1$.

The convergence result of iterative waterfilling in the two-user case [56] is a special case of Proposition 4, obtained by setting $\tilde{s}_k = 0$, $\forall k$.

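Proof. Define $\hat{\alpha}_k^n$ as the equivalent interference channel gain from user n to the reference line,

$$\hat{\alpha}_k^n = \begin{cases} \tilde{\alpha}_k^n, & k \in \bar{\mathcal{K}}, \\ 0, & k \in \bar{\mathcal{K}}^C, \end{cases}$$

and we can simplify (44) as

$$s_k^{n,t+1} = \left[\frac{w_n}{\lambda_n^{t+1} + \hat{\alpha}_k^n/\tilde{\sigma}_k} - \alpha_k^{n,m}s_k^{m,t} - \sigma_k^n\right]^+, \quad \forall n, m \in \{1, 2\}, \ m \neq n, \ \forall k, \ \forall t,$$

where $\lambda_n^{t+1}$ is chosen such that $\sum_k s_k^{n,t+1} = P^n$.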

Define $[x]^+ = \max(x, 0)$ and $[x]^- = \max(-x, 0)$; then it is clear that

$$\sum_k \left[s_k^{n,t} - s_k^{n,t'}\right]^+ = \sum_k \left[s_k^{n,t} - s_k^{n,t'}\right]^-, \quad \forall n, t, t', \tag{45}$$

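since the total power constraint is always satisfied after any iteration. Also define

$$f^{n,t}(x) = \sum_k \left[\frac{w_n}{x + \hat{\alpha}_k^n/\tilde{\sigma}_k} - \alpha_k^{n,m}s_k^{m,t} - \sigma_k^n - s_k^{n,t}\right]^-,$$

$$g^{n,t}(x) = \sum_k \left[\frac{w_n}{x + \hat{\alpha}_k^n/\tilde{\sigma}_k} - \alpha_k^{n,m}s_k^{m,t} - \sigma_k^n - s_k^{n,t}\right]^+,$$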

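and it is clear that $f^{n,t}(x)$ ($g^{n,t}(x)$, respectively) is non-decreasing (non-increasing) in x, and strictly increasing (strictly decreasing) at $x = \lambda_n^{t+1}$. Also $f^{n,t}(\lambda_n^{t+1}) = g^{n,t}(\lambda_n^{t+1})$. Then by Lemma 1,

$$\max\left\{f^{n,t}(x), g^{n,t}(x)\right\} \geq \max\left\{f^{n,t}(\lambda_n^{t+1}), g^{n,t}(\lambda_n^{t+1})\right\}, \quad \forall x.$$

Taking $x = \lambda_n^t$, we have

$$\max\left\{f^{n,t}(\lambda_n^t), g^{n,t}(\lambda_n^t)\right\} \geq \max\left\{f^{n,t}(\lambda_n^{t+1}), g^{n,t}(\lambda_n^{t+1})\right\}.$$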

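This leads to

$$\begin{align} & \max\left\{\sum_k \left[s_k^{1,t+1} - s_k^{1,t}\right]^+, \ \sum_k \left[s_k^{1,t+1} - s_k^{1,t}\right]^-\right\} \tag{46} \\ &= \max\left\{f^{1,t}(\lambda_1^{t+1}), \ g^{1,t}(\lambda_1^{t+1})\right\} \notag \\ &\leq \max\left\{f^{1,t}(\lambda_1^{t}), \ g^{1,t}(\lambda_1^{t})\right\} \notag \\ &= \max\left\{\sum_k \left[\frac{w_1}{\lambda_1^t + \hat{\alpha}_k^1/\tilde{\sigma}_k} - \alpha_k^{1,2}s_k^{2,t} - \sigma_k^1 - s_k^{1,t}\right]^-, \ \sum_k \left[\frac{w_1}{\lambda_1^t + \hat{\alpha}_k^1/\tilde{\sigma}_k} - \alpha_k^{1,2}s_k^{2,t} - \sigma_k^1 - s_k^{1,t}\right]^+\right\} \notag \\ &= \max\left\{\sum_k \alpha_k^{1,2}\left[s_k^{2,t-1} - s_k^{2,t}\right]^+, \ \sum_k \alpha_k^{1,2}\left[s_k^{2,t-1} - s_k^{2,t}\right]^-\right\} \notag \\ &\leq \max_k\left\{\alpha_k^{1,2}\right\} \max\left\{\sum_k \left[s_k^{2,t} - s_k^{2,t-1}\right]^+, \ \sum_k \left[s_k^{2,t} - s_k^{2,t-1}\right]^-\right\} \notag \\ &\leq \max_k\left\{\alpha_k^{1,2}\right\} \max_k\left\{\alpha_k^{2,1}\right\} \max\left\{\sum_k \left[s_k^{1,t} - s_k^{1,t-1}\right]^+, \ \sum_k \left[s_k^{1,t} - s_k^{1,t-1}\right]^-\right\} \notag \\ &< \max\left\{\sum_k \left[s_k^{1,t} - s_k^{1,t-1}\right]^+, \ \sum_k \left[s_k^{1,t} - s_k^{1,t-1}\right]^-\right\}. \tag{47} \end{align}$$

The last inequality is due to the fact that $\max_k \alpha_k^{1,2} \max_k \alpha_k^{2,1} < 1$. This shows that the algorithm is a contraction mapping from any initial PSD values, and thus globally converges to a unique fixed point [4].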
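The contraction property can be observed numerically. The sketch below runs sequential ASB-II updates for two users from two different initial PSDs and checks that both runs reach the same point. It assumes hypothetical crosstalk gains with $\max_k \alpha_k^{1,2} \max_k \alpha_k^{2,1} = 0.06 < 1$, fixed weights, and a bisection search for $\lambda_n$ (an implementation choice, not taken from the text); all names and numbers are illustrative.

```python
import numpy as np

def waterfill(w, interf, ref_ratio, p_total, iters=100):
    """Frequency selective waterfilling (44); lambda is set by bisection."""
    def psd(lam):
        return np.maximum(w / (lam + ref_ratio) - interf, 0.0)
    lo, hi = 0.0, 1.0
    while psd(hi).sum() > p_total:     # grow hi until the budget is met
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if psd(mid).sum() > p_total:
            lo = mid
        else:
            hi = mid
    return psd(hi)

def asb2_two_user(s2_init, a12, a21, sig1, sig2, ref1, ref2, w, p, rounds=50):
    """Sequential updates: user 1 reacts to s2, then user 2 reacts to the new s1."""
    s2 = s2_init.copy()
    for _ in range(rounds):
        s1 = waterfill(w[0], a12 * s2 + sig1, ref1, p[0])
        s2 = waterfill(w[1], a21 * s1 + sig2, ref2, p[1])
    return s1, s2

a12 = np.array([0.3, 0.2, 0.1, 0.3])   # alpha_k^{1,2}
a21 = np.array([0.2, 0.1, 0.2, 0.1])   # alpha_k^{2,1}
sig = np.full(4, 0.1)
ref = np.array([0.5, 0.5, 0.0, 0.0])   # reference line active on tones 1 and 2
w, p = (1.0, 1.0), (1.0, 1.0)
run_a = asb2_two_user(np.zeros(4), a12, a21, sig, sig, ref, ref, w, p)
run_b = asb2_two_user(np.array([1.0, 0.0, 0.0, 0.0]), a12, a21, sig, sig, ref, ref, w, p)
```

Both runs end at the same PSD pair, as Proposition 4 predicts for this choice of gains.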

Convergence in the N-user case

We further extend the convergence results to a system with an arbitrary number N > 2 of users. We consider both sequential and parallel PSD updates of the users. In the more realistic but harder-to-analyze parallel updates, time is divided into slots, and each user n updates the PSD simultaneously in each time slot according to (44) based on the PSDs in the previous slot, where $\lambda_n$ is adjusted such that the power constraint is satisfied.

Proposition 5 Assume $\max_{n,m,k} \alpha_k^{n,m} < \frac{1}{N-1}$; then the ASB-II algorithm globally converges (to the unique fixed point) in an N-user system under fixed w, with either sequential or parallel updates.

Proposition 5 contains the convergence of iterative waterfilling in an N-user case with sequential updates (proved in [13]) as a special case of ASB convergence with sequential or parallel updates. Moreover, the convergence proof for the parallel updates turns out to be simpler than the one for sequential updates. The proof extends that of Proposition 4, and can be found in [21].


[Figure 12: topology diagram with one central office (CO) line and three remote terminal lines (RT1, RT2, RT3), each terminating at customer premises (CP); segment lengths labeled 5 km, 4 km, 3.5 km, 3 km, 2 km, 3 km and 4 km.]

Fig. 12. An example of mixed CO/RT deployment topology (Example 8).

4.6 Simulation Results

Example 8. Mixed CO-RT DSL. Here we summarize a typical numerical example comparing the performance of the ASB algorithms with IW, OSB and ISB. We consider a standard mixed Central Office (CO) and Remote Terminal (RT) deployment. A 4-user scenario has been selected to make a comparison with the highly complex OSB algorithm possible. As depicted in Figure 12, the scenario consists of 1 CO distributed line and 3 RT distributed lines. The target rates on RT1 and RT2 have both been set to 2 Mbps. For a variety of different target rates on RT3, the CO line attempts to maximize its own data rate either by transmitting at full power in IW, or by setting its corresponding weight $w_{co}$ to unity in OSB, ISB and ASB. This produces the rate regions shown in Figure 13, which shows that ASB achieves near optimal performance similar to OSB and ISB, and a significant gain over IW even though both ASB and IW are autonomous. For example, with a target rate of 1 Mbps on CO, the rate on RT3 reaches 7.3 Mbps under the ASB algorithm, which is a 121% increase compared with the 3.3 Mbps achieved by IW. We have also performed extensive simulations (more than 10,000 scenarios) with different CO and RT positions, line lengths and reference line parameters. We found that the performance of ASB is very insensitive to the definition of the reference line: with a single choice of the reference line we observe good performance in a broad range of scenarios.


[Figure 13: rate regions, panel title "RT1 @ 2 Mbps, RT2 @ 2 Mbps"; x-axis: RT3 (Mbps), 0 to 8; y-axis: CO (Mbps), 0.4 to 2; curves: Optimal Spectrum Balancing, Iterative Spectrum Balancing, Autonomous Spectrum Balancing, Iterative Waterfilling.]

Fig. 13. Rate regions obtained by ASB, IW, OSB, and ISB.

Dynamic spectrum management techniques can greatly improve the performance of DSL lines by inducing cooperation among interfering users in the same binder. For example, the iterative waterfilling (IW) algorithm is a completely autonomous DSM algorithm with linear complexity in the number of users and number of tones, but the performance could be far from optimal in the mixed CO/RT deployment scenario. The optimal spectrum balancing (OSB) and iterative spectrum balancing (ISB) algorithms achieve optimal and close to optimal performances, respectively, but have high complexities in terms of the number of users and are completely centralized.

This section surveys an autonomous dynamic spectrum management (DSM) algorithm called autonomous spectrum balancing (ASB). ASB utilizes the concept of a “reference line,” which mimics a typical victim line in the binder. By setting the power spectrum level to protect the reference line, a good balance between selfish and global maximization can be achieved. Compared with IW, OSB, and ISB, the ASB algorithm combines completely autonomous operation with low (linear) complexity in both the number of users and the number of tones. Simulation shows that the ASB algorithm achieves close-to-optimal performance and is robust to the choice of reference line parameters.
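The reference-line idea can be illustrated with a small sketch. It assumes that each user maximizes its own rate plus a weighted rate of the reference line, subject to a total power budget, and it replaces any closed-form per-tone solution with a plain grid search and a bisection on a power price. The function `asb_user_update` and all of its parameter names are hypothetical, introduced only for this illustration.

```python
import math


def asb_user_update(gain, noise, ref_psd, ref_cross, ref_noise,
                    weight, budget, grid):
    """One reference-line-aware power update for a single user (sketch).

    gain[k], noise[k] : own channel gain and interference-plus-noise on tone k
    ref_psd[k]        : assumed transmit PSD of the reference line
    ref_cross[k]      : assumed crosstalk gain from this user into the reference line
    ref_noise[k]      : assumed background noise on the reference line
    weight            : weight placed on the reference-line rate
    grid              : candidate per-tone powers, ascending, starting at 0.0
    """
    def allocate(price):
        powers = []
        for k in range(len(gain)):
            def objective(s):
                own = math.log2(1.0 + gain[k] * s / noise[k])
                ref = math.log2(
                    1.0 + ref_psd[k] / (ref_cross[k] * s + ref_noise[k]))
                return own + weight * ref - price * s
            powers.append(max(grid, key=objective))
        return powers

    powers = allocate(0.0)
    if sum(powers) <= budget:
        return powers  # budget is not binding
    # Raise the price until the allocation is feasible, then bisect.
    lo, hi = 0.0, 1.0
    while sum(allocate(hi)) > budget:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if sum(allocate(mid)) > budget:
            lo = mid
        else:
            hi = mid
    return allocate(hi)
```

With `weight` set to zero the update ignores the reference line and behaves selfishly; a larger weight makes the user back off on tones where it would hurt the reference line most. The per-user work grows linearly with the number of tones, and no information about the other real lines is needed.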

Acknowledgment

The author would like to acknowledge collaborations with Raphael Cendrillon, Maryam Fazel, Prashanth Hande, Jianwei Huang, Daniel Palomar, and Chee Wei Tan on the publications related to this survey [12, 15, 51, 21], as well as helpful general discussions on related topics with Stephen Boyd, Robert Calderbank, John Doyle, David Julian, Jang-Won Lee, Ying Li, Steven Low, Daniel O’Neill, Asuman Ozdaglar, Pablo Parrilo, Ness Shroff, R. Srikant, and Ao Tang.


References

1. M. Avriel, Ed., Advances in Geometric Programming, Plenum Press, New York, 1980.

2. N. Bambos, “Toward power-sensitive network architectures in wireless communications: Concepts, issues, and design aspects,” IEEE Pers. Comm. Mag., vol. 5, no. 3, pp. 50-59, 1998.

3. D. P. Bertsekas, Nonlinear Programming, Athena Scientific, 1999.

4. D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, Prentice Hall, 1989.

5. S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.

6. R. Cendrillon, G. Ginis, and M. Moonen, “Improved linear crosstalk precompensation for downstream VDSL,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2004, pp. 1053–1056.

7. R. Cendrillon and M. Moonen, “Iterative spectrum balancing for digital subscriber lines,” in IEEE International Communications Conference (ICC), 2005.

8. R. Cendrillon, W. Yu, M. Moonen, J. Verlinden, and T. Bostoen, “Optimal multi-user spectrum management for digital subscriber lines,” accepted by IEEE Transactions on Communications, 2005.

9. M. Chiang, “Balancing transport and physical layers in wireless multihop networks: Jointly optimal congestion control and power control,” IEEE J. Sel. Area Comm., vol. 23, no. 1, pp. 104-116, Jan. 2005.

10. M. Chiang, “Geometric programming for communication systems,” Foundations and Trends of Communications and Information Theory, vol. 2, no. 1-2, pp. 1-156, Aug. 2005.

11. M. Chiang, S. H. Low, R. A. Calderbank, and J. C. Doyle, “Layering as optimization decomposition,” to appear in Proceedings of IEEE, 2006.

12. M. Chiang, S. Zhang, and P. Hande, “Distributed rate allocation for inelastic flows: Optimization framework, optimality conditions, and optimal algorithms,” Proc. IEEE Infocom, Miami, FL, March 2005.

13. S. T. Chung, “Transmission schemes for frequency selective Gaussian interference channels,” Ph.D. dissertation, Stanford University, 2003.

14. R. J. Duffin, E. L. Peterson, and C. Zener, Geometric Programming: Theory and Applications, Wiley, 1967.

15. M. Fazel and M. Chiang, “Nonconcave network utility maximization by sum of squares programming,” Proc. IEEE CDC, Dec. 2005.

16. G. Foschini and Z. Miljanic, “A simple distributed autonomous power control algorithm and its convergence,” IEEE Trans. Veh. Tech., vol. 42, no. 4, 1993.

17. G. Ginis and J. Cioffi, “Vectored transmission for digital subscriber line systems,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1085–1104, 2002.

18. D. Handelman, “Representing polynomials by positive linear functions on compact convex polyhedra,” Pacific J. Math., vol. 132, pp. 35-62, 1988.

19. D. Henrion and J. B. Lasserre, “Detecting global optimality and extracting solutions in GloptiPoly,” Research report, LAAS-CNRS, 2003.

20. J. Huang, R. Berry, and M. L. Honig, “A game theoretic analysis of distributed power control for spread spectrum ad hoc networks,” Proc. IEEE ISIT, July 2005.

21. J. Huang, R. Cendrillon, and M. Chiang, “Autonomous Spectrum Balancing in DSL Interference Channels,” Proc. IEEE ISIT, July 2006.

22. D. Julian, M. Chiang, D. O’Neill, and S. Boyd, “QoS and fairness constrained convex optimization of resource allocation for wireless cellular and ad hoc networks,” Proc. IEEE Infocom, New York, NY, June 2002.

23. F. P. Kelly, “Models for a self-managed Internet,” Philosophical Transactions of the Royal Society, A358, 2335-2348, 2000.

24. F. P. Kelly, A. Maulloo, and D. Tan, “Rate control for communication networks: shadow prices, proportional fairness and stability,” Journal of Operations Research Society, vol. 49, no. 3, pp. 237-252, March 1998.

25. S. Kandukuri and S. Boyd, “Optimal power control in interference limited fading wireless channels with outage probability specifications,” IEEE Trans. Wireless Comm., vol. 1, no. 1, pp. 46-55, Jan. 2002.

26. J. B. Lasserre, “Global optimization with polynomials and the problem of moments,” SIAM J. Optim., vol. 11, no. 3, pp. 796-817, 2001.

27. J. B. Lasserre, “Polynomial programming: LP-relaxations also converge,” SIAM J. Optimization, vol. 15, no. 2, pp. 383-393, 2004.

28. J. W. Lee, R. Mazumdar, and N. B. Shroff, “Non-convex optimization and rate control for multi-class services in the Internet,” IEEE/ACM Trans. Networking, vol. 13, no. 4, pp. 827-840, Aug. 2005.

29. J. W. Lee, R. Mazumdar, and N. B. Shroff, “Downlink power allocation for multi-class CDMA wireless networks,” IEEE/ACM Trans. Networking, vol. 13, no. 4, pp. 854-867, Aug. 2005.

30. J. W. Lee, R. Mazumdar, and N. B. Shroff, “Opportunistic power scheduling for multi-server wireless systems with minimum performance constraints,” IEEE Trans. Wireless Comm., vol. 5, no. 5, May 2006.

31. X. Lin, N. B. Shroff, and R. Srikant, “A tutorial on cross-layer optimization in wireless networks,” IEEE J. Sel. Area Comm., Aug. 2006.

32. S. H. Low, “A duality model of TCP and queue management algorithms,” IEEE/ACM Trans. Networking, vol. 11, no. 4, pp. 525-536, Aug. 2003.

33. B. R. Marks and G. P. Wright, “A general inner approximation algorithm for nonconvex mathematical program,” Operations Research, 1978.

34. D. Mitra, “An asynchronous distributed algorithm for power control in cellular radio systems,” Proceedings of 4th WINLAB Workshop, Rutgers University, NJ, 1993.

35. Yu. Nesterov and A. Nemirovsky, Interior Point Polynomial Methods in Convex Programming, SIAM Press, 1994.

36. D. Palomar and M. Chiang, “Alternative decompositions for distributed maximization of network utility: Framework and applications,” Proc. IEEE Infocom, Barcelona, Spain, April 2006.

37. P. A. Parrilo, “Structured semidefinite programs and semi-algebraic geometry methods in robustness and optimization,” PhD thesis, Caltech, May 2002.

38. P. A. Parrilo, “Semidefinite programming relaxations for semi-algebraic problems,” Math. Program., vol. 96, pp. 293-320, 2003.

39. S. Prajna, A. Papachristodoulou, and P. A. Parrilo, “SOSTOOLS: Sum of squares optimization toolbox for Matlab,” available from http://www.cds.caltech.edu/sostools, 2002-04.

40. M. Putinar, “Positive polynomials on compact semi-algebraic sets,” Indiana University Mathematics Journal, vol. 42, no. 3, pp. 969-984, 1993.


41. R. T. Rockafellar, Network Flows and Monotropic Programming, Athena Scientific, 1998.

42. R. T. Rockafellar, “Lagrange multipliers and optimality,” SIAM Review, vol. 35, pp. 183-283, 1993.

43. C. Saraydar, N. Mandayam, and D. Goodman, “Pricing and power control in a multicell wireless data network,” IEEE J. Sel. Areas Comm., vol. 19, no. 10, pp. 1883-1892, 2001.

44. K. Schmudgen, “The k-moment problem for compact semialgebraic sets,” Math. Ann., vol. 289, pp. 203-206, 1991.

45. S. Shenker, “Fundamental design issues for the future Internet,” IEEE J. Sel. Area Comm., vol. 13, no. 7, pp. 1176-1188, Sept. 1995.

46. N. Z. Shor, “Quadratic optimization problems,” Soviet J. Comput. Systems Sci., vol. 25, pp. 1-11, 1987.

47. R. Srikant, The Mathematics of Internet Congestion Control, Birkhauser, 2004.

48. G. Stengle, “A Nullstellensatz and a Positivstellensatz in semialgebraic geometry,” Math. Ann., vol. 207, pp. 87-97, 1974.

49. T. Starr, J. Cioffi, and P. Silverman, Understanding Digital Subscriber Line Technology, Prentice Hall, 1999.

50. C. Sung and W. Wong, “Power control and rate management for wireless multimedia CDMA systems,” IEEE Trans. Comm., vol. 49, no. 7, pp. 1215-1226, 2001.

51. C. W. Tan, D. Palomar, and M. Chiang, “Solving non-convex power control problems in wireless networks: Low SIR regime and distributed algorithms,” Proc. IEEE Globecom, St. Louis, MO, Nov. 2005.

52. C. W. Tan, D. Palomar, and M. Chiang, “Distributed Optimization of Coupled Systems with Applications to Network Utility Maximization,” Proc. IEEE ICASSP 2006, Toulouse, France, May 2006.

53. D. M. Topkis, Supermodularity and Complementarity, Princeton University Press, 1998.

54. M. Xiao, N. B. Shroff, and E. K. P. Chong, “Utility based power control (UBPC) in cellular wireless systems,” IEEE/ACM Trans. Networking, vol. 11, no. 10, pp. 210-221, March 2003.

55. R. Yates, “A framework for uplink power control in cellular radio systems,” IEEE J. Sel. Areas Comm., vol. 13, no. 7, pp. 1341-1347, 1995.

56. W. Yu, G. Ginis, and J. Cioffi, “Distributed multiuser power control for digital subscriber lines,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 5, pp. 1105–1115, June 2002.

57. W. Yu, R. Lui, and R. Cendrillon, “Dual optimization methods for multiuser orthogonal frequency division multiplex systems,” in Proceedings of IEEE Globecom, vol. 1, 2004, pp. 225–229.
