454 ieee transactions on communications, vol. 59, no. 2
TRANSCRIPT
![Page 1: 454 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2](https://reader031.vdocuments.mx/reader031/viewer/2022012505/6180f0cc30fe322e9c3198c6/html5/thumbnails/1.jpg)
454 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2, FEBRUARY 2011
Exploiting Sparse User Activity inMultiuser Detection
Hao Zhu, Student Member, IEEE, and Georgios B. Giannakis, Fellow, IEEE
AbstractβThe number of active users in code-division multipleaccess (CDMA) systems is often much lower than the spreadinggain. The present paper exploits fruitfully this a priori infor-mation to improve performance of multiuser detectors. A low-activity factor manifests itself in a sparse symbol vector withentries drawn from a finite alphabet that is augmented by thezero symbol to capture user inactivity. The non-equiprobablesymbols of the augmented alphabet motivate a sparsity-exploitingmaximum a posteriori probability (S-MAP) criterion, which isshown to yield a cost comprising the β2 least-squares errorpenalized by the π-th norm of the wanted symbol vector(π = 0, 1, 2). Related optimization problems appear in variableselection (shrinkage) schemes developed for linear regression,as well as in the emerging field of compressive sampling (CS).The contribution of this work to such sparse CDMA systemsis a gamut of sparsity-exploiting multiuser detectors tradingoff performance for complexity requirements. From the vantagepoint of CS and the least-absolute shrinkage selection operator(Lasso) spectrum of applications, the contribution amounts tosparsity-exploiting algorithms when the entries of the wantedsignal vector adhere to finite-alphabet constraints.
Index TermsβSparsity, multiuser detection, compressive sam-pling, Lasso, sphere decoding.
I. INTRODUCTION
MULTIUSER detection (MUD) algorithms play a majorrole for mitigating multi-access interference present
in code-division multiple access (CDMA) systems; see e.g.,[17] and references therein. These well-appreciated MUDalgorithms simultaneously detect the transmitted symbols ofall active user terminals. However, they require knowledgeof which terminals are active, and exploit no possible user(in)activity.
In this paper, MUD algorithms are developed when the ac-tive terminals are unknown and the activity factor (probabilityof each user being active) is low β a typical scenario in tacticalor commercial CDMA systems deployed. The inactivity per
Paper approved by X. Wang, the Editor for Multiuser Detection andEqualization of the IEEE Communications Society. Manuscript receivedSeptember 21, 2009; revised April 2, 2010, July 3, 2010, and August 30,2010.
This work was supported by the NSF grants CCF-0830480, 1016605,and ECCS-0824007, 1002180; and also through the collaborative partic-ipation in the Communications and Networks Consortium sponsored bythe U. S. Army Research Laboratory under the Collaborative TechnologyAlliance Program, Cooperative Agreement DAAD19-01-2-0011. The U. S.Government is authorized to reproduce and distribute reprints for Governmentpurposes notwithstanding any copyright notation thereon. Part of this workwas presented at the IEEE Intl. Symp. on Info. Theory, Seoul, South Korea,June 2009.
The authors are with the Department of Electrical and Computer Engineer-ing, University of Minnesota, 200 Union Street SE, Minneapolis, MN 55455,USA (e-mail: {zhuh, georgios}@umn.edu).
Digital Object Identifier 10.1109/TCOMM.2011.121410.090570
user can be naturally incorporated by augmenting the under-lying alphabet with an extra zero constellation point. Lowactivity thus implies a sparse symbol vector to be detected.With non-equiprobable symbols in the augmented alphabet,the optimal sparsity-embracing MUD naturally suggests amaximum a posteriori (MAP) criterion for detection. Sparsityhas been used for estimating parameters of communicationsystems, see e.g., [1], [3], [6], [10], but not for multiuserdetection1.
Reconstruction of sparse signals has become very popularrecently, especially after the emergence of the compressivesampling (CS) theory; see e.g., [2], [4], [7], [16] and refer-ences therein. Exploiting sparsity has also received growinginterest in regression problems because it offers parsimoniousstatistical models that prevent data overfitting. In particular,the least-absolute shrinkage and selection operator (Lasso) hasbeen applied for variable selection (VS) to diverse applicationsinvolving sparse signals [15]. Lasso regression amounts toregularizing the ordinary least-squares (LS) with the β1 normof the wanted vector, thus retaining the attractive features ofboth VS and LS regression. However, neither CS nor Lassohave dealt with sparse signals under finite-alphabet constraints;and this is the subject of the present work.
Different from sparse signal reconstruction or regression,the multiuser detectors developed in this paper must accountfor sparsity but also for the finite alphabet of the desiredsymbol vector. Taking into consideration the sparsity in MUDunder integer constraints however, incurs complexity that isexponential in the vector length. To cope with this challenge,we will exploit sparsity to either relax or judiciously searchover subsets of the alphabet. The resultant MUD algorithmstrade off optimality in detection error performance with com-putational complexity. In a nutshell, this paperβs contributionis the development of efficient algorithms for MUD undersparsity and finite-alphabet constraints.
Also in the CS literature a number of works is available todeal with the performance of recovering sparse sequences ofsize exceeding the number of observations; see e.g., [4], [5]and references therein. These works establish the minimumnumber of observations needed to estimate sparse vectorswith overwhelming probability. In the design of practicalCDMA systems however, one is also interested in savingbandwidth and power resources. This becomes possible byreducing the size of the required spreading gain, which in turnreduces latency and energy consumption. Taking advantage ofthe low activity factor, enables designs with spreading gains
1Recently, along with the conference pre-cursor of this paper in [20], workrelated to sparse sphere decoding was reported independently in [14].
0090-6778/11$25.00 cβ 2011 IEEE
![Page 2: 454 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2](https://reader031.vdocuments.mx/reader031/viewer/2022012505/6180f0cc30fe322e9c3198c6/html5/thumbnails/2.jpg)
ZHU and GIANNAKIS: EXPLOITING SPARSE USER ACTIVITY IN MULTIUSER DETECTION 455
smaller than the number of candidate users. Efficient multiuserdetectors are also developed here for this kind of under-determined systems encountered in CDMA communications.
The rest of the paper is organized as follows. SectionII presents the model and formulates the problem; whileSection III introduces the optimal sparsity-exploiting MAPdetector. (Sub-) optimal algorithms are developed in SectionsIV and V. Section VI lists several extensions. All the detectionalgorithms are tested and compared numerically in SectionVII, before concluding in Section VIII.
II. MODELING AND PROBLEM STATEMENT
Consider the uplink of a CDMA system with πΎ userterminals and spreading gain π . Assume first that π β₯ πΎ .The under-determined case (π < πΎ) will also be addressedlater on in Section VI-B. Suppose the system has a relativelylow activity factor, which analytically means that each termi-nal is active with probability (w.p.) ππ < 1/2 per symbol,and the events βactiveβ are independent across symbols andacross users. The case of correlated (in)activity of users acrosssymbols will be dealt with in Section VI-A. Let ππ β πdenote the symbol, drawn from a finite alphabet by the π-thuser, when active; otherwise, ππ = 0. Incorporating possible(in)activity per user is equivalent to having ππ take values froman augmented alphabet ππ := πβͺ{0}.
The access point (AP) receives the superimposed modulated(quasi-) synchronous signature waveforms through (possiblyfrequency-selective) fading channels in the presence of ad-ditive white Gaussian noise (AWGN); and projects on theorthonormal space spanned by the aggregate waveforms toobtain the received chip samples collected in the π Γ 1vector y. With the πΎ Γ 1 vector b containing the symbolsof all (active and inactive) users, the canonical input-outputrelationship is, see e.g., [17, Sec. 2.9]
y = Hb + w (1)
where H is an πΓπΎ matrix capturing transmit-receive filters,spreading sequences, channel impulse responses, and timingoffsets; and the π Γ 1 vector w is the AWGN. Without lossof generality (w.l.o.g.), w can be scaled to have unit variance.Note that (1) holds for quasi-synchronous systems too, whererelative user asynchronism is bounded to a few chips persymbol, provided that: either i) user transmissions includeguard bands to eliminate inter-symbol interference (ISI); orii) the received vector y collects only the chips of each userthat belong to the part of the common symbol interval underconsideration.
The low activity factor implies that b is a sparse vector.However, the AP is neither aware of the positions nor thenumber of zero entries in b. In order to perform multiuserdetection (MUD) needed to determine the optimal b, the APmust account for the augmented alphabet ππ, i.e., consider allthe possible candidate vectors b β ππΎ
π . This way, the MUDalso determines the π-th userβs activity captured by the extraconstellation point ππ = 0. Supposing that the AP has thechannel matrix H available (e.g., via training), the goal ofthis paper is to detect the optimal b given the received vectory by exploiting the sparsity of active users.
Fig. 1. Wireless sensors access a UAV.
To motivate this sparsity-exploiting MUD setup in theCDMA context, consider a set of terminals wishing to linkwith a common AP. Suppose that the AP acquires the fullmatrix H (with all terminals active) during a training phase.Those channels may include either non-dispersive or multipathfading, and are assumed invariant during the coherence time,which is typically larger compared to the symbol period. Eachterminal accesses the channel randomly, and the AP receivesthe superposition of signals from the active terminals only.The AP is interested in determining both the active terminalsand the symbols transmitted.
Another scenario where H is known and sparsity-exploitingMUD is well motivated, entails an unmanned aerial vehicle(UAV) collecting information from a ground wireless sensornetwork (WSN) placed over a grid, as depicted in Fig. 1.As the UAV flies over the grid of sensors, it collects thesignals from a random subset of them. If the channel fadingis predominantly affected by path loss, the UAV can acquireH based on the relative AP-to-sensor distances. Again, theUAV faces the problem of determining both identities of activesensors, and the symbols each active sensor transmits.
With this problem setup in mind, we will develop differentMUD strategies, which account for the low activity factor.First, we will look at the maximum a posteriori probability(MAP) optimum MUD that exploits the sparsity present.
III. SPARSITY-EXPLOITING MAP DETECTOR
The goal is to detect b in (1), given a prescribed activityfactor, the received vector y, and the matrix H. Recall thoughthat the low activity factor leads to a sparse b, i.e., each entryππ is more likely to take the value 0 from the alphabet. Becauseentries {ππ}πΎπ=1 are non-equiprobable, the optimal detector inthe sense of minimizing the detection error rate is the MAPone.
Aiming at a sparsity-aware MAP criterion, consider firstthe prior probability for b. For simplicity in exposition,suppose for now that each terminal transmits binary symbolswhen active, i.e., π = {Β±1}. (It will become clear lateron that all sparsity-cognizant MUD schemes are applicableto finite alphabets of general constellations, not necessarilybinary.) If ππ takes values from {β1, 0, 1}, with correspondingprobabilities {ππ/2, 1 β ππ, ππ/2}, and since each entry ππ isindependent from ππβ² for π β= πβ², the prior probability for b
![Page 3: 454 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2](https://reader031.vdocuments.mx/reader031/viewer/2022012505/6180f0cc30fe322e9c3198c6/html5/thumbnails/3.jpg)
456 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2, FEBRUARY 2011
can be expressed as
Pr(b) =
πΎβπ=1
Pr(ππ) = (1 β ππ)πΎββ₯bβ₯0(ππ/2)β₯bβ₯0 (2)
where β₯bβ₯0 denotes the β0 (pseudo) norm that is by definitionequal to the number of non-zero entries in the vector b. Upontaking logarithms, (2) yields
ln Pr(b) = βπβ₯bβ₯0 + πΎ ln(1 β ππ) (3)
where
π := ln1 β ππππ/2
. (4)
Since low activity factor means ππ < 1/2, it follows readilyfrom (4) that π > 0. With the prior distribution of b in (3),the sparsity-aware MAP (S-MAP) detector is
bMAP =arg maxbβππΎ
π
Pr(bβ£y)=arg minbβππΎ
π
βln π(yβ£b)βln Pr(b)
= arg minbβππΎ
π
1
2β₯y βHbβ₯22 + πβ₯bβ₯0 (5)
where the last equality follows from (3) and the Gaussianityof w. Hence, the S-MAP detection task amounts to findingthe vector in the constraint set ππΎ
π , which minimizes the costin (5).
Two remarks are now in order.Remark 1: (General constellations). Beyond binary alpha-bets, it is easy to see why the S-MAP detector in (5) appliesto more general constellations, including pulse amplitudemodulation (PAM), phase-shift keying (PSK), and quadratureamplitude modulation (QAM). Specifically, for general π -aryconstellations with π β₯ 2 it suffices to adjust accordinglythe prior probability as a function of β₯bβ₯0 in (2). Thiswill render the S-MAP MUD in (5) applicable to generalπ -ary constellations, provided that π in (4) is replaced byπ := ln 1βππ
ππ/π.
Remark 2: (Scale π as a function of ππ). The definition in (4)reveals the explicit relationship of π with the activity factorππ. Different from CS and VS approaches, where π is a tuningparameter often chosen with cross-validation techniques asa function of the data size π and πΎ , here it is directlycoupled with the user activity factor ππ. Such a couplingcarries over even when users have distinct activity factors.Specifically, if the user π is active w.p. ππ,π, then the termπβ₯bβ₯0 := π
βπΎπ=1 β£ππβ£ in (5) should change to
βπΎπ=1 ππβ£ππβ£,
where ππ is defined as in (4) with ππ,π substituting ππ. Thisuser-specific regularization will be used in Section VI-A toadaptively estimate user activity factors on-the-fly, and thusenable sparsity-aware MUD which accounts for correlated user(in)activity across the time slots.
With the variable ππ only taking values from {Β±1, 0}, itholds for π β₯ 1 that
β₯bβ₯0 =
πΎβπ=1
β£ππβ£π = β₯bβ₯ππ, β b β ππΎπ . (6)
Hence, the S-MAP detector (5) for binary transmissions isequivalent to
bMAP = arg minbβππΎ
π
1
2β₯y βHbβ₯22 + πβ₯bβ₯ππ , β π β₯ 1. (7)
Notice that the equivalence between (5) and (7) is based onthe norm equivalence in (6), which holds only for constantmodulus constellations. Although the cost in (7) will turn outto be of interest on its own, it is not an S-MAP detector fornon-constant modulus constellations.
Interestingly, since low-activity factor implies a positiveπ, the problem (7) entails a convex cost function, comparedto the non-convex one in (5). In lieu of the finite-alphabetconstraint, the criterion of (7) consists of the least-squares(LS) cost regularized by the βπ norm, which in the context oflinear regression has been adopted to mitigate data overfitting.In contrast, the LS estimator only considers the goodness-of-fit, and thus tends to overfit the data. Shrinking the LSestimator, by penalizing its size through the βπ norm, typicallyoutperforms LS in practice. For example, the Lasso adoptsthe β1 norm through which it effects sparsity [15]. In theMUD context for CDMA systems with low-activity factor,the vector b has a sparse structure, which motivates well thisregularizing strategy. What is distinct and interesting here isthat this penalty-augmented LS approach under finite-alphabetconstraints emerges naturally as the logarithm of the prior inthe S-MAP detector.
However, the finite-alphabet constraint renders the solutionof (7) combinatorially complex. For general H and y, thesolution of (7) requires exhaustive search over all the 3πΎ
feasible points, with the complexity growing exponentiallyin the problem dimension πΎ . Likewise, for general π -aryalphabets the complexity incurred by (5) is πͺ((π +1)πΎ). Onthe other hand, many (sub-) optimal alternatives are availablein the MUD literature; see e.g., [17, Ch. 5-7]. Similarly here,we will develop different (sub-) optimal algorithms to tradeoff complexity for probability of error performance in sparsity-exploiting MUD alternatives.
Since the exponential complexity of MUD stems from thefinite-alphabet constraint, one reduced-complexity approachis to solve the unconstrained convex problem, and thenquantize the resultant soft decision to the nearest point inthe alphabet. This approach includes the sub-optimal linearMUD algorithms (decorrelating and minimum mean-squareerror (MMSE) detectors). Another approach is to search over(possibly a subset of) the alphabet lattice directly as inthe decision-directed detectors or the sphere decoders [18].Likewise, it is possible to devise (sub-) optimal algorithmsfor solving the S-MAP MUD problem (7) along these twocategories. First, we will present the sparsity-exploiting MUDalgorithms by relaxing the finite-alphabet constraint.
IV. RELAXED S-MAP DETECTORS
In addition to offering an S-MAP detector for constantmodulus constellations, the cost in (7) is convex. Thus, byrelaxing the combinatorial constraint, the optimization prob-lem (7) can be solved efficiently by capitalizing on convexity.As mentioned earlier, this problem is similar to the penalty-augmented LS criterion that is used for VS in linear regression,where the choice of π is important for controlling the shrinkingeffect, that is the degree of sparsity in the solution. Next, wewill develop detectors for two choices of π, and compare themin terms of complexity and performance.
![Page 4: 454 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2](https://reader031.vdocuments.mx/reader031/viewer/2022012505/6180f0cc30fe322e9c3198c6/html5/thumbnails/4.jpg)
ZHU and GIANNAKIS: EXPLOITING SPARSE USER ACTIVITY IN MULTIUSER DETECTION 457
A. Linear Ridge MUD
The choice π = 2 is a popular one in statistics, well-known as Ridge regression. Its popularity is mainly due tothe fact that it can regularize the LS solution while retainingits closed-form expression as a linear function of the datay. A relaxed detection algorithm for S-MAP MUD can bedeveloped accordingly with π = 2, what we term Ridgedetector (RD). Ignoring the finite-alphabet constraint, theoptimal solution of (7) for π = 2 takes a linear form
bRD = arg minb
1
2β₯y βHbβ₯22 + πβ₯bβ₯22
= (HπH + 2πI)β1Hπy. (8)
In addition to its simplicity, and different from LS, theexistence of the inverse in (8) is ensured even for ill-posedor under-determined problems; i.e., when H is rank deficientor fat (π < πΎ). Notice that bRD takes a form similar tothe linear MMSE multiuser detector, with the parameter πreplacing the noise variance, and connecting the activity factorwith the degree of regularization applied to the matrix HπH.
Upon quantizing each entry of the soft decision bRD withthe operator
π¬π(π₯) := sign(π₯)11(β£π₯β£ β₯ π) (9)
where π > 0, sign(π₯) = 1(β1) with π₯ > (<)0, and 11denoting the indicator function, the hard RD is
bRD = π¬π(bRD) = π¬π
((HπH + 2πI)β1Hπy
). (10)
Because the detector in (8) is linear, it is possible toexpress linearly its soft output bRD with respect to (w.r.t.) theinput symbol vector b. Based on this relationship, one cansubsequently derive the symbol error rate (SER) of the harddetected symbols in bRD as a function of the quantizationthreshold π. These steps will be followed next to obtain theperformance of the RD.
1) Performance Analysis: Letting b denote the vectortransmitted, and substituting y = Hb + w into (8) yields
bRD = (HπH + 2πI)β1Hπ (Hb + w) = Gb + wβ² (11)
where G := I β 2π(HπH + 2πI)β1, and the colorednoise wβ² := (HπH + 2πI)β1Hπw is zero-mean Gaussianwith covariance matrix Ξ£π€β² := πΈ{wβ²(wβ²)π } = (HπH +2πI)β2HπH.
It follows readily from (11) that the π-th entry of bRD
satisfies
πRDπ = πΊππ οΏ½οΏ½π +
ββ β=π
πΊπβοΏ½οΏ½β + π€β²π (12)
where πΊπβ and π€β²π are the (π, β)-th and π-th entries of G
and wβ², respectively. The last two terms in the right-hand sideof (12) capture the multiuser interference-plus-noise effect,which has variance
π2π = var
β§β¨β©ββ β=π
πΊπβοΏ½οΏ½β + π€β²π
β«β¬β =
ββ β=π
πΊ2πβππ + Ξ£π€β²,ππ (13)
where Ξ£π€β²,ππ denotes the (π, π)-th entry of Ξ£π€β² .With the interference-plus-noise term being approximately
Gaussian distributed, deciphering οΏ½οΏ½π from (12) amounts to
detecting a ternary deterministic signal in the presence of zero-mean, Gaussian noise of known variance. Hence, the symbolerror rate (SER) for the π-th entry using the quantization rulein (10) entry-wise can be analytically obtained as
π RDπ,π = 2(1 β ππ)π
(π
ππ
)+ πππ
(πΊππ β π
ππ
)(14)
where π(π) := (1/β
2π)β«βπ
exp(βπ2/2)ππ denotes theGaussian tail function.
The SER in (14) is a convex function of the threshold π.Thus, taking the first-order derivative w.r.t. π and setting itequal to zero yields the optimal threshold for the π-th entryas
ππ =πΊππ
2+
π2π
πΊπππ . (15)
The corresponding minimum SER becomes [cf. (14) and (15)]
π RDπ,π =2(1 β ππ)π
(πΊππ
2ππ+
πππ
πΊππ
)+πππ
(πΊππ
2ππβ πππ
πΊππ
).
(16)
As the CDMA system signal-to-noise ratio (SNR) goes toinfinity, asymptotically we have G β I and Ξ£π€β² β 0, sothe optimal threshold in (15) approaches 0.5. The numericaltests in Section VII will also confirm that selecting ππ = 0.5approaches the minimum SER π RD
π,π over the range of SNRvalues encountered in most practical settings.
The clear advantage of RD-MUD is its simplicity as alinear detector. However, using the β2 norm for regularization,the RD-MUD inherently introduces a Gaussian prior for theunconstrained symbol vector and is thus not effecting sparsityin bRD; see also [15]. This renders the performance of RDdependent on the quantization threshold π β a fact also corrob-orated by the simulations in Section VII. These considerationsmotivate the ensuing development of an alternative relaxed S-MAP algorithm, which accounts for the sparsity present inb.
B. Lasso-based MUD
Another popular regression method is the Lasso one, whichregularizes the LS cost with the β1 norm. In the Bayesianformulation, regularization with the β1 norm corresponds toadopting a Laplacian prior for b [15]. The nice feature ofLasso-based regression is that it ensures sparsity in the resul-tant estimates. The degree of sparsity depends on the valueof π, which is selected here using the a priori informationavailable on the activity factor [cf. (4)]. The optimal solutionof (7) for π = 1 without the finite-alphabet constraint yieldsthe Lasso detector (LD) as
bLD = arg minb
1
2β₯y βHbβ₯22 + πβ₯bβ₯1. (17)
While a closed-form solution is impossible for general H,the minimization in (17) is a quadratic programming (QP)problem that can be readily accomplished using available QPsolvers, such as SeDuMi [13]. Upon slicing the solution in(17), we obtain the detection result as
bLD = π¬π(bLD). (18)
![Page 5: 454 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2](https://reader031.vdocuments.mx/reader031/viewer/2022012505/6180f0cc30fe322e9c3198c6/html5/thumbnails/5.jpg)
458 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2, FEBRUARY 2011
The larger π is, the more sparse bLD becomes [cf. (4)]. Thisis intuitively reasonable, because π is inversely proportionalto the activity factor ππ. Since the Lasso approach (17)yields sparse estimates systematically, and can be obtainedvia QP solvers in polynomial time, LD is a competitive MUDalternative. Lack of a closed-form solution prevents analyticalevaluation of the SER, which will be tested using simulationsin Section IV.
Remark 3: So far, we assumed π = {Β±1} to ensureequivalence of the βπ-norm regularized S-MAP detector (7)with the more general one in (5). However, the sub-optimalalgorithms of this section ignore the finite-alphabet constraint,and just rely on the convexity of the cost function in (7) tooffer MUD schemes that can be implemented efficiently, eitherin linear closed-form or through quadratic programming. Infact, starting from (7) and forgoing its equivalence with (5),the RD and LD relaxations of (7) apply also for general π -ary alphabets for any π > 2. Of course, the quantizationthresholds required for slicing the soft symbol estimates inorder to obtain hard symbol estimates must be modified inaccordance with the corresponding constellation. For non-constant modulus transmissions, the cost in (7) favors low-energy (close to the origin) constellation points, but this effectis mitigated by the judicious selection of the quantizationthresholds.
Note that forgoing the equivalence of (7) with (5) is lessof an issue for RD and LD because the major limitationof these simple relaxation-based algorithms is their sub-optimality, which emerges because they do not account forthe finite-alphabet symbol constraints. Next, MUD algorithmsare developed to minimize the S-MAP cost while adhering tothe constraint in (5) explicitly.
V. S-MAP DETECTORS WITH LATTICE SEARCH
User symbols in this section are drawn from an π -ary PAM alphabet π = {Β±1,Β±3, . . . ,Β±(π β 1)}, withπ even. Consider also reformulating the S-MAP problemin (5) using the QR decomposition of the matrix H (as-sumed here to be square or tall with full column rank) asH = QR, where R is a πΎ Γ πΎ upper triangular matrix,and Q is an π Γ πΎ unitary matrix. Substituting this QRdecomposition into (5), and left multiplying with the uni-tary Q inside the LS cost, the S-MAP detector becomes:bMAP = arg minbβππ
π
12
β₯β₯Qπy βQπ (QR)bβ₯β₯22
+ πβ₯bβ₯0 =arg minbβππΎ
π
12β₯yβ² βRbβ₯22 + πβ₯bβ₯0, where yβ² := Qπy; or,
after using the definitions of the norms,
bMAP = arg minbβππΎ
π
πΎβπ=1
β§β¨β©1
2
(π¦β²π β
πΎββ=π
π πβπβ
)2
+ πβ£ππβ£0
β«β¬β .
(19)
Although the optimal solution of (19) still incurs expo-nential complexity, the upper triangular form of R enablesdecomposition of (19) into sub-problems involving only scalarvariables. As it will be seen later in Section V-A, the S-MAPproblem accepts a neat closed-form solution in the scalar case(πΎ = 1). This is instrumental for the development of efficient(near-) optimal algorithms searching over the finite-alphabet
induced lattice. One such sub-optimal MUD algorithm isdescribed next.
A. Sparsity-Exploiting Decision-Directed MUD
Close look at (19) reveals that once the estimates {οΏ½οΏ½β}πΎβ=π+1
are available, the optimal οΏ½οΏ½π can be obtained by minimizingthe cost corresponding to the π-th summand of (19). This leadsto the per-symbol optimal decision-directed detector (DDD),which following related schemes in different contexts, couldbe also called successive interference cancellation or decision-feedback decoding, see e.g., [11, Sec. 9.4] and [17, Ch. 7]. Themain difference here is that b is sparse.
The DDD algorithm relies on back substitution to decom-pose the overall S-MAP cost into πΎ sub-costs each dependenton a single scalar variable and accepting a closed-form solu-tion. Specifically, supposing the symbols {οΏ½οΏ½DDD
β }πΎβ=π+1 havebeen already detected, the DDD algorithm detects the π-thsymbol as
οΏ½οΏ½DDDπ = arg min
ππβππ
1
2
(π¦β²π β
πΎββ=π+1
π πβοΏ½οΏ½DDDβ β π ππππ
)2
+ πβ£ππβ£0 . (20)
This minimization problem entails only one scalar variabletaking one of (π + 1) possible points in ππ. Thus, theminimum is found after comparing the costs correspondingto these (π + 1) values. Appendix A proves that this optimalsolution can be found in closed form as
οΏ½οΏ½DDDπ = βπLS
π β11 (2πLSπ βπLS
π β β βπLSπ β2 β 2π/π 2
ππ > 0)
(21)
where πLSπ := (π¦β²
πββπΎ
β=π+1 π πβοΏ½οΏ½DDDβ )/π ππ , and ββ β quantizes
to the nearest point in π. The simple implementation steps aretabulated as Algorithm 1.
Algorithm 1 (DDD): Input yβ², R, π, and π even.Output bDDD.
1: for π = πΎ,πΎ β 1, . . . , 1 do2: (Back substitution) Compute the unconstrained LS solution
πLSπ := (π¦β²
π ββπΎβ=π+1π πβοΏ½οΏ½
DDDβ )/π ππ .
3: (Quantize to π) Set οΏ½οΏ½DDDπ := βπLS
π β.4: (Compare with 0) If 2πLS
π βπLSπ β β βπLS
π β2 β 2π/π 2ππ β€ 0, then
set οΏ½οΏ½DDDπ := 0.
5: end for
When R is diagonal, Algorithm 1 yields the optimal S-MAPdetection result; i.e., bMAP = bDDD for this case. However,since the DDD detects symbols sequentially, it is prone toerror propagation, especially at low SNR values. The errorpropagation can be mitigated by preprocessing and orderingmethods [17, Ch. 7]. Also similar to all related detectors thatrely on back substitution, error performance of the sparseDDD will be analyzed assuming there is no error propagation.For the special case of π = 2, Appendix B shows that underthis assumption, the SER for (21) becomes
πDDDπ,π = 2(1 β ππ)π
( β£π ππβ£2
+π
β£π ππβ£)
+πππ
( β£π ππβ£2
β π
β£π ππβ£)
. (22)
![Page 6: 454 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2](https://reader031.vdocuments.mx/reader031/viewer/2022012505/6180f0cc30fe322e9c3198c6/html5/thumbnails/6.jpg)
ZHU and GIANNAKIS: EXPLOITING SPARSE USER ACTIVITY IN MULTIUSER DETECTION 459
For a general π -ary constellation, it is also possible toapproximate the SER using the union bound.
Because it accounts for the finite-alphabet constraint, sparseDDD outperforms the relaxed detectors of the previous sectionβ a fact that will be confirmed also by simulations. How-ever, the error propagation emerging at medium-low SNRdegrades the sparse DDD performance when compared tothe optimal but computationally complex S-MAP detector. Asa compromise, a branch-and-bound type of MUD algorithmis developed next, to attain (near-) optimal performance byexploiting the finite-alphabet and sparsity constraints, at theprice of increased complexity compared to DDD.
B. Sparsity-Exploiting Sphere Decoding-based MUD
Sphere decoding (SD) algorithms have been widely used formaximum-likelihood (ML) demodulation of multiuser and/ormultiple-input multiple-output (MIMO) transmissions. Givena PAM or QAM alphabet, SD yields (near-) ML performanceat polynomial average complexity; see e.g., [11, Sec. 5.2].However, different from the ML-optimal SD that minimizes anLS cost, the S-MAP problem (19) entails also a regularizationterm to account for sparsity in b. Although the resultantalgorithm will be termed sparse sphere decoder (SSD), itsearches in fact within an ββ0-norm regularized sphere,β whichis not a sphere but a hyper-solid.
The goal is to find the unknown πΎ Γ 1 vector b β ππΎπ ,
which minimizes the distance metric [cf. (19)]
π·πΎ1 (b) :=
πΎβπ=1
β§β¨β©1
2
(π¦β²π β
πΎββ=π
π πβπβ
)2
+ πβ£ππβ£0
β«β¬β . (23)
For a large enough threshold π , candidate vectors (and thusthe minimizer of π·πΎ
1 too) satisfy2
π·πΎ1 (b) < π (24)
that specifies a hyper-solid inside which the wanted minimizermust lie. Define now [cf. (21)]
ππ :=
(π¦β²π β
πΎββ=π+1
π πβπβ
)/π ππ, π = πΎ,πΎ β 1 β β β 1
(25)
where ππΎ := π¦β²πΎ/π πΎπΎ . Note that ππ depends on {πβ}πΎβ=π+1.
Using (25), the hyper-solid in (24) can be expressed as
π·πΎ1 (b) =
πΎβπ=1
{π 2
ππ
2(ππ β ππ)
2+ πβ£ππβ£0
}< π (26)
or, in a more compact form as π·πΎ1 (b) =
βπΎπ=1 ππ(ππ) < π ,
where ππ(ππ) := (π 2ππ/2) (ππ β ππ)
2+ πβ£ππβ£0. In addition to
the overall metric π·πΎ1 assessing a candidate vector b β ππΎ
π ,as well as the per entry metric ππ for each candidate symbolππ β ππ, it will be useful to define the accumulated metricπ·πΎ
π :=βπΎ
β=π πβ(πβ) corresponding to the πΎβπ+1 candidatesymbols from entry πΎ down to entry π.
2At initialization, π is set equal to β so that (24) is always satisfied.
Per entry π, which subsequently will be referred to as levelπ, eq. (26) implies a set of inequalities:
Level π : ππ(ππ) <π β π·πΎπ+1 ,
for π = πΎ,πΎ β 1 . . . 1, (27)
with π·πΎπΎ+1 := 0. SSD relies on the Schnorr-Euchner (SE)
enumeration, see e.g., [9], properly adapted here to account forthe β0-norm regularization. SE capitalizes on the inequalities(27) to search efficiently over all possible vectors b withentries belonging to ππ. Any candidate b β ππΎ
π obeying theπΎ inequalities in (27) for a given π , will be termed admissible.Threshold π is reduced after each admissible b is identified, aswill be detailed soon. The SE-based SSD amounts to a depth-first tree search, which seeks and checks candidate vectorsstarting from entry (level) πΎ and working downwards to entry1 per candidate vector.
At level πΎ , SE search chooses ππΎ = βππΎβ11(2ππΎβππΎβ ββππΎβ2 β 2π/π 2
πΎπΎ > 0), which we know from (21) is theconstellation point yielding the smallest ππΎ . If this choiceof ππΎ does not satisfy the inequality (27) with π = πΎ ,no other constellation point will satisfy it either, and theminimizer of π·πΎ
1 in (23) must lie outside3 the hyper-solidpostulated by (24). If this choice of ππΎ satisfies (27), SEproceeds to level πΎ β 1 in which (25) is used first withπ = πΎ β 1 to find ππΎβ1 that depends on the chosen ππΎfrom level πΎ; subsequently, ππΎβ1 is selected as ππΎβ1 =βππΎβ1β11(2ππΎβ1βππΎβ1β β βππΎβ1β2 β 2π/π 2
πΎβ1,πΎβ1 > 0).If this choice of ππΎβ1 does not satisfy (27) with π = πΎ β 1,then we move back to level πΎ , and select ππΎ equal to theconstellation point yielding the second smallest ππΎ , and soon; otherwise, we proceed to level πΎ β 2. Continuing thisprocedure down to level 1, yields the first candidate vector b,which is deemed admissible since it has entries belonging toππ and also satisfying (24). This candidate is stored, and thethreshold is updated to π := π·πΎ
1 (b).Then, the search proceeds looking for a better candidate.
Now at level 1, we move up to level 2 and choose π2 equal tothe constellation point yielding the second smallest cost π2.If this π2 satisfies (27) at level 2 with the current π , we movedown to level 1 to update the value of π1 (note that π2 hasjust been updated and {πβ}πΎβ=3 are equal to the correspondingentries in b). If (27) at level 2 is not satisfied with the currentπ , we move up to level 3 to update the value of π3, and soon.
Finally, when it fails to find any other admissible candidatesatisfying (27), the search stops, and the latest admissiblecandidate b is the optimal bMAP solution sought. Withπ = β, the first found admissible candidate b is the bDDD
solution of Section V-A.Before summarizing the SSD steps, it is prudent to elaborate
on the ordered enumeration of the constellation points perlevel, which in fact constitutes the main difference of SSDrelative to SD. In lieu of the 0 constellation point and theβ0 norm, SE in SD enumerates the PAM symbols per levelπ in the order of increasing cost as: {ππ, ππ + 2Ξπ, ππ β2Ξπ, ππ + 4Ξπ, ππ β 4Ξπ, β β β }
β©π, with ππ := βππβ andΞπ := sign(ππ β ππ). (With βππβ yielding the smallest ππ,
3This will never happen with π = β in (24).
![Page 7: 454 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2](https://reader031.vdocuments.mx/reader031/viewer/2022012505/6180f0cc30fe322e9c3198c6/html5/thumbnails/7.jpg)
460 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2, FEBRUARY 2011
if Ξπ = 1, then βππβ + 2 yields the second smallest ππ andβππββ2 the third; and the other way around, if Ξπ = β1.) SDeffects such an ordered enumeration by alternately updatingππ = ππ+2Ξπ and Ξπ = βΞπβsign(Ξπ) [9]. To demonstratehow SSD further accounts for the β0-norm regularization andthe augmented alphabet of S-MAP which includes 0, letπ(π)π β ππ denote the symbol for level π incurring the π-th
smallest (π = 1, 2, . . . ,π +1) cost ππ. If π(π)π β π, then π
(π+1)π
will be either 0 or π(π)π + 2Ξπ. If ππ(0) < ππ(π
(π)π + 2Ξπ),
then the next symbol in the ordered enumeration should beπ(π+1)π = 0, and an auxiliary variable π
(π)π is used to cache the
subsequent symbol in the order as π(π+2)π = π
(π)π + 2Ξπ. With
π(π+1)π = 0, the auxiliary variable allows the wanted π
(π+2)π at
the next enumeration step to be retrieved from π(π)π .
Similar to SD, the ordered enumeration pursued by SSD perlevel implies a corresponding order in considering all b β ππΎ
π ,which leads to a repetition-free and exhaustive search of alladmissible candidate vectors. At the same time, the hyper-solid postulated by (24) shrinks as π decreases, until no otheradmissible vector can be found. This guarantees that the SSDoutputs the vector with the smallest π·πΎ
1 , and thus the optimalsolution to (19). The SSD algorithm can be summarized inthe following six steps 1β6 tabulated as Algorithm 2.
Remark 4: SSD inherits all the attractive features of SD [9].Specifically, during the search one basically needs to storeπ·πΎ
π per level π. Its in place update for each ππ candidateimplies that SSD memory requirements are only linear in πΎ . Inaddition, the computational efficiency of SSD (relative to thatof ML which is πͺ((π +1)πΎ)) stems from four factors: (i) theDDD solution provides an admissible initialization reducingthe search space at the outset; (ii) the recursive search enabledby the QR decomposition gives rise to the causally dependentinequalities (27), which restrict admissible candidate entries tochoices that even decrease over successive depth-first passesof the search; (iii) ordering per level increases the likelihoodof finding βnear-optimal admissibleβ candidates early, whichmeans quick and sizeable shrinkage of the hyper-solid, andthus fast convergence to the S-MAP optimal solution; and(iv) metrics involved in the search can be efficiently reusedsince children of the same level in the tree share the alreadycomputed accumulated metric of the βpartial pathβ from thislevel to the root.
Compared to other sub-optimal detection schemes proposedin previous sections, the SSD algorithm can return the S-MAP optimal solution possibly at exponential complexity,unless one stops the search at the affordable complexity βcase in which the solution is only ensured to be near-optimal.Fortunately, at medium-high SNR, both SD and SSD return theoptimal solution at average complexity which is polynomial(typically cubic). Moreover, SSD can be generalized to providesymbol-by-symbol soft output with approximate a posterioriprobabilities, as is the case with the SD; see e.g., [11, Chapter5].
VI. GENERALIZATIONS OF S-MAP DETECTORS
Up to now, four sparsity-exploiting MUD algorithms havebeen developed to solve the integer program associated withthe linear model in (1). The present section will present
Algorithm 2 (SSD): Input π = β, yβ², R, π, and π even.Output the solution bMAP := bSSD to (19).
1: (Initialization) Set π := πΎ, π·πΎπ+1 := 0.
2: Compute ππ as in (25), set ππ := βππβ, π(π)π := 0, and Ξπ :=sign(ππ β ππ).If 2ππππ β π2π β 2π/π 2
ππ < 0, then// symbol 0 yields smaller ππ than βππβ
Set π(π)π := ππ, and ππ := 0.End if and go to Step 3.
3: If ππ(ππ) := (π 2ππ/2)(ππ β ππ)
2 + πβ£ππβ£0 β₯ π βπ·πΎπ+1, then
go to Step 4. // outside hyper-solid in (24)Else if β£ππβ£ > π β 1, then
go to Step 6. // inside hyper-solid in (24),but outside ππ
Else if π > 1, thencompute π·πΎ
π := π·πΎπ+1 + ππ(ππ); set π := π β 1, and go
to Step 2. // go the next level(deeper in the tree)
Else go to Step 5. // π = 1, at the treeβs bottomEnd if
4: If π = πΎ, then terminateElse set π := π + 1, go to Step 6.End if
5: (An admissible b is found)Set π := π·πΎ
2 + π1(π1), bSSD := b, and π := π + 1; then go toStep 6.
6: (Enumeration at level π proceeds to the candidate symbol nextin the order)If ππ = 0, then
Retrieve the next (based on cost ππ ordering) symbol ππ :=π(π)π , and set π(π)π := πΉπΏπ΄πΊ.
Else set ππ := ππ + 2Ξπ , and Ξπ := βΞπ β sign(Ξπ).
If π(π)π β= πΉπΏπ΄πΊ and 2ππππ β π2π β 2π/π 2ππ < 0, then
// 0 yields smaller ππ than ππSet π(π)π := ππ, and ππ := 0.
End ifEnd if and go to Step 3.
interesting generalizations to account for correlated user(in)activity across symbols, and under-determined CDMAsystems.
A. Exploiting user (in)activity across symbols
Sparsity-aware detectors for the linear model in (1) wereso far developed on a symbol-by-symbol basis, which doesnot account for the fact that user (in)activity typically persistsacross multiple symbols. To this end, user activity across timecan for instance be thought of as a Markov chain with twostates (active-inactive). Once a user terminal starts transmittingto the AP, it becomes more likely to stay active for thenext symbol slot too; and likewise, inactive once it stopstransmitting. In this model, the state transition probability fromeither one state to the other is relatively much smaller thanthat of staying unchanged, and this manifests itself to the saiddependence of user (in)activities across time.
Admittedly, MUD schemes accounting for this dependencemust process the aggregation of data vectors y obeying (1)across slots. With ππ denoting the number of slots, the numberof unknowns (πΎππ ) can grow prohibitively with ππ . Oneapproach to cope with this βcurse of dimensionalityβ is via
![Page 8: 454 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2](https://reader031.vdocuments.mx/reader031/viewer/2022012505/6180f0cc30fe322e9c3198c6/html5/thumbnails/8.jpg)
ZHU and GIANNAKIS: EXPLOITING SPARSE USER ACTIVITY IN MULTIUSER DETECTION 461
dynamic programming, which can take advantage of the factthat correlation is only present between two consecutive slots;see e.g., [17, Sec. 4.2]. However, for π -ary alphabets, theresultant sequential detector requires evaluating per symbolslot the path weights of all (π +1)πΎ possible symbol vectors.This high computational burden is impractical for real-timeimplementations.
The proposed alternative to bypass this challenge stemsfrom the observation that for a given slot π‘ the influence ofall previous slots {π‘β²}π‘β1
π‘β²=0 on the S-MAP detection rule isreflected only in the prior probability of each user being activeat time π‘; i.e., user πβs current (and time-varying) activityfactor ππ,π(π‘). The natural means to capture this influenceonline is to track each userβs activity factor using the recursiveLS (RLS) estimator [12, Ch. 12], based on activity factorsfrom previous slots; that is,
ππ,π(π‘)=arg minπ
π‘β1βπ‘β²=0
π½π‘,π‘β²π (π β β£οΏ½οΏ½π(π‘β²)β£0)2, π‘ = 1, . . . (28)
where οΏ½οΏ½π(π‘β²) denotes user πβs detected symbol at time π‘β², andthe so-called βforgetting-factorβ π½π‘,π‘β²
π describes the effectivememory (data windowing). A popular choice is the exponen-tially decaying window, for which π½π‘,π‘β²
π := π½π‘βπ‘β²π for some
0 βͺ π½π < 1. Accordingly, (28) can expressed in closed form,recursively as
ππ,π(π‘)=1 β π½π
π½π
(π‘β1βπ‘β²=0
π½π‘βπ‘β²π β£οΏ½οΏ½π(π‘β²)β£0
)/(1 β π½π‘
π)
=π½π β π½π‘
π
1 β π½π‘π
ππ,π(π‘ β 1)+1 β π½π
1 β π½π‘π
β£οΏ½οΏ½π(π‘ β 1)β£0, π‘ = 1, . . .
(29)
where the last equality comes from back substitution ofππ,π(π‘ β 1). The choice of π½π critically depends on the(in)activity correlation between consecutive slots. In the ex-treme case where user (in)activities across slots are inde-pendent, the infinite-memory window (π½π = 1) is optimal,and (29) reduces to the simple online time-averaging estimateππ,π(π‘) = 1
π‘
βπ‘β1π‘β²=0 β£οΏ½οΏ½π(π‘β²)β£0.
Adapting the user activity factors allows one to weighentries of β0-norm regularization which in turn affects the priorprobability in the S-MAP detector (5) through the coefficientππ(π‘) corresponding to ππ,π(π‘) (cf. Remark 2). Note that whenthe correlation across time is strong, it is possible that ππ,π(π‘)can approach 1, case in which ππ(π‘) is not guaranteed tostay positive. This will cause problems to the relaxed S-MAP detectors of Section IV, as those schemes rely on theconvexity of the cost function in (7). However, the DDDand SSD algorithms will remain operational, because theyrely on enumeration per symbol (in DDD) or per group ofsymbols within a sphere (in SSD). Note also that with π < 0the regularization term in the minimization of (19) is non-positive; hence, π < 0 encourages searching over the non-zeroconstellation points (and thus discourages sparsity), whereasπ > 0 promotes sparsity.
B. Under-determined CDMA Systems
Minimal-size spreading sequences, even smaller than thenumber of users, is well motivated for bandwidth and power
savings. Without finite-alphabet constraints on the wantedvector, results available in the CS literature guarantee recoveryof sparse signals from a few observations; see e.g., [4], [5] andreferences therein. Specifically, [4] shows that if the vectorof interest is sparse (or compressible) over a known basis,then it is possible to reconstruct it with very high accuracyfrom a small number of random linear projections at least inthe ideal noise-free case. For non-ideal observations corruptedwith unknown noise of bounded perturbation, [5] provides anupper bound on the reconstruction error, which is proportionalto the noise variance for a sufficiently sparse signal. However,CS theory pertains to sparse analog-valued signals. Moreover,the noise considered in a practical communication system istypically Gaussian, or generally drawn from a distributionhaving possibly unbounded support. Therefore, existing resultsfrom the CS literature do not carry over to the present context.
Nevertheless, it is still interesting to consider extensions ofall the (sub-)optimal sparsity-exploiting MUD algorithms toan under-determined CDMA system with π < πΎ , where theobservation matrix H becomes fat. Consider first the two typesof relaxed S-MAP detectors. The RD-MUD in (10) clearlyworks when π < πΎ , since the 2πI term inside the inversionrenders the overall matrix full rank. However, since the RDis a linear detector, it is expected to lose identifiability inthe under-determined case, similar to the MMSE detectorsfor the sparsity-agnostic MUD schemes. Similarly, the LDproblem (17) is also solvable for a fat H matrix, as the Lassoproblem in CS. However, neither of them accounts for theaugmented finite-alphabet constraint present in the original S-MAP problem (7).
The S-MAP detectors with lattice search are challengingto implement when π < πΎ . The main obstacle is the QRdecomposition of the fat matrix H, which yields the uppertriangular matrix R of the same dimension π ΓπΎ . Instead ofa single unknown symbol, now the sparse DDD must optimizeover the last (π β πΎ + 1) symbols in b. However, apartfrom exhaustive search there is no low-complexity methodto solve the aforementioned problem involving (π β πΎ +1) variables, because sub-optimal alternatives introduce severeerror propagation.
The same problem appears also with the SSD. To tacklethe under-determined case, the generalized SD in [9] fixes thelast (π βπΎ) symbols of b and relies on the standard SSD todetect the remaining πΎ symbols that minimize a cost similarto the one in (23). Repeating this search for every choice of thelast (π βπΎ) symbols, yields eventually the overall optimumvector. The complexity of the latter is exponential in (πβπΎ),regardless of the SNR. Recently, an alternative SD approach toavoid this exponential complexity has been developed for theunder-determined case [8]. This algorithm takes advantage ofthe fact that for constant-modulus constellations the usual LScost can be modified without affecting optimality, by addingthe β2-norm bπb constant for every vector in the alphabet.This extra term allows one to obtain an equivalent full-ranksystem on which the standard SD algorithm can be applied.This efficient method can be readily extended to handle non-constant modulus constellations.
Interestingly, for our S-MAP detectors in (7) with latticesearch of binary transmitted symbols, the norm term needed
![Page 9: 454 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2](https://reader031.vdocuments.mx/reader031/viewer/2022012505/6180f0cc30fe322e9c3198c6/html5/thumbnails/9.jpg)
462 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2, FEBRUARY 2011
for regularization comes naturally from the Bernoulli prior.Specifically, with π = 2 the reformulated S-MAP detectors in(7) can be equivalently written as
bMAP = arg minbβππΎ
π
1
2[bπ (HπH + 2πI)bβ 2yπHb]
= arg minbβππΎ
π
1
2β₯yβ² β Rbβ₯22 (30)
where R is the full rank πΎ ΓπΎ upper triangular matrix suchthat Rπ R = HπH + 2πI, and yβ² := RβπHπy. Utilizingthe metric of (30), the back substitution of DDD and thelattice point search of SSD can be implemented easily. Hence,these S-MAP detectors can be readily extended to under-determined systems. In this way, all the (sub-)optimal S-MAPdetectors are applicable with less observations than unknownsin a CDMA system with low activity factor, but their SERperformance will certainly be affected. In Section VII, we willprovide simulated performance comparisons of the differentoptimal and sub-optimal S-MAP algorithms proposed, for avariable number of observations.
C. Group Lassoing block activity
The last generalization considered pertains to user(in)activity in a (quasi-)synchronous block fashion, whereduring a block of ππ symbol slots, user π remains (in)activeindependently from other users and across blocks. Concatenatethe πΎ user symbols across ππ time slots in the πΎΓππ matrixB := [b(1) . . .b(ππ )], where b(π‘) collects the symbols of allπΎ users at slot π‘, and likewise for the receive-data matrixY as well as the noise matrix W, both of size π Γ ππ .With these definitions, the counterpart of (1) for this blockmodel is Y = HB + W. Letting the ππ Γ 1 vector bπ :=[ππ(1), . . . , ππ(ππ )]
π collect the ππ symbols of user π, it isuseful to consider it drawn from an augmented (due to possibleinactivity) block alphabet ππ,ππ := πππ
βͺ{0ππ }. Assumingagain binary transmissions, the S-MAP block detector willnow yield
BMAP = arg minbπβππ,ππ
1
2β₯Y βHBβ₯2πΉ +
πΎβπ=1
ππβππ
β₯bπβ₯0
= arg minbπβππ,ππ
1
2β₯Y βHBβ₯2πΉ +
πΎβπ=1
ππβ₯bπβ₯2 (31)
where ππ := 1βππ
ln(
1βππ
ππ/2ππ
).
Similar to (7), the convex reformulation of the cost in (31)will lead to what is referred to in statistics as Group Lasso[19], which effects group sparsity on a block of symbols(bπ in our case). This Group-Lasso based formulation isparticularly handy for under-determined CDMA systems. Infact, the unconstrained version of (31) can be solved first tounveil the nonzero rows (i.e., the support) of B, with improvedreliability as ππ increases. Subsequently, standard sparsity-agnostic MUD schemes can be run on the estimated set ofactive users. Note that such a two-step approach works in theunder-determined case, and also reduces number of symbolsto be detected per time slot.
5 10 15 20 25
10β3
10β2
Average SNR(dB)
SE
R
P
e
Optimal ΞΈk
ΞΈ = 0.5ΞΈ = 0.35ΞΈ = 0.65
Fig. 2. SER vs. average SNR (in dB) for RD-MUD with π = 32 andπΎ = 20 and different quantization threshold πβs.
VII. SIMULATIONS
We simulate a CDMA system with πΎ = 20 users, eachwith activity factor ππ = 0.3. Random sequences with lengthπ = 32 are used for spreading. We consider non-dispersiveindependent Rayleigh fading channels between AP and users,where the channel gain ππ of the π-th user is Rayleighdistributed with variance πΈ[π2π] = π2. Thus, the averagesystem SNR is set to be π2 since the AWGN w has unitvariance.
Test Case 1 (Quantization thresholds for RD). First, we testthe RD scheme with different quantization thresholds π in(9). The optimum threshold for the π-th symbol is obtainedas in (15) per channel realization H. The resulting SER iscompared for four choices of π: 0.5, 0.35, 0.65, and ππ. Thetheoretical minimum SER π RD
π,π in (16) using the optimum π isalso added for comparison. Fig. 2 shows that the SER curvewith π = 0.5 comes very close to the one of the optimalππ, and thus constitutes a near-optimal choice in practice.Moreover, those two curves also coincide with the analyticalSER formulation corresponding to π RD
π,π , thus corroborating theclosed-form expression in (16).
Test Case 2 (S-MAP MUD algorithms). Next, the RD, LD,DDD, and SSD MUD algorithms are all tested for both BPSKand 4-PAM constellations, and their SER performance is com-pared. For LD, the quadratic program in (17) is solved usingthe general-purpose SeDuMi toolbox [13]. The quantizationrule chooses the nearest point in ππ for both RD and LD. Forcomparison, we also include the ordinary LS detector, whichcorresponds to the RD solution in (10) with π = 0.
Fig. 3(a) shows that the LS detector exhibits the worstperformance. This is intuitive since it neither exploits the finitealphabet nor the sparsity present in b. The SSD exhibits thebest performance at the price of highest complexity. The LDoutperforms the RD algorithm, as predicted. It is interestingto observe that even at low SNR region the DDD algorithmis surprisingly competitive, especially in view of its low com-plexity that grows only linearly in the number of symbols πΎ .The diversity orders for those detectors are basically the same.
![Page 10: 454 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2](https://reader031.vdocuments.mx/reader031/viewer/2022012505/6180f0cc30fe322e9c3198c6/html5/thumbnails/10.jpg)
ZHU and GIANNAKIS: EXPLOITING SPARSE USER ACTIVITY IN MULTIUSER DETECTION 463
5 10 15 20 2510
β4
10β3
10β2
10β1
Average SNR(dB)
SE
R
LSRDLDDDDSSD
(a)
5 10 15 20 25
10β3
10β2
10β1
Average SNR(dB)
SE
R
LSRDLDDDDSSD
(b)
Fig. 3. SER vs. average SNR (in dB) of sparsity-exploiting MUD algorithmswith π = 32 and πΎ = 20 for (a) BPSK, and (b) 4-PAM alphabets.
This is reasonable since independent Rayleigh fading channelsbetween AP and users were simulated here. The correspondingcurves for 4-PAM depicted in Fig. 3(b) follow the same trend.However, compared to Fig. 3(a), the RD algorithm degradesnoticeably as its SER approaches the LS one. This is becausechoices of quantization thresholds become more influential asthe constellation size increases. As expected, the LD exhibitsresilience to this influence. The DDD and SSD algorithmshave almost identical performance in high-SNR region.
Test Case 3 ((In)activity across symbols). In this case, theuser (in)activity is correlated across time slots. We modelthis random (in)activity process as a two-state (active-inactive)stationary Markov chain. The state transition matrix is P =[π (1 β π); π (1 β π)], where π is uniformly distributed over[0.8 0.85], and π over [0.05 0.1], for each user. For this model,the expected number of successive active slots is 1/(1 β π),and 1/π for the inactive ones. Also, the limiting probabilityfor the βactiveβ state becomes π/(1 β π + π), taking valuesfrom the interval [0.2 0.4]. Note that the activity factor overtime is still quite low. We use the RLS approach to estimateππ,π(π‘) as in (29) using π½ = 0.5, and test both the DDD
0 5 10 15 20 25 30 35 40 45 50
10β3
10β2
Time slot t
SE
R
DDDSSD
SNR = 5 dB
SNR = 10 dB
SNR = 15 dB
Fig. 4. SER vs. time π‘ of sparsity-exploiting MUD algorithms with RLSestimation of the activity factors.
5 10 15 20 25
10β4
10β3
10β2
10β1
Average SNR(dB)
SE
R
RDDDDSSD
N = 32
N = 16
N = 8
Fig. 5. SER vs. average SNR (in dB) of sparsity-exploiting MUD algorithmswith π = 32, 16, or 8 and πΎ = 20.
and SSD algorithms in solving the resultant S-MAP detectionproblem. The empirical SER is plotted in Fig. 4 across timefor different SNR values. Clearly, the proposed scheme iseffective in tracking the evolving time-correlated activity. Forthe same SNR value, it yields SER performance similar to theindependent case in Fig. 3(a).
Test Case 4 (Under-determined CDMA systems). We alsotest the S-MAP MUD algorithms for under-determined sys-tems, by varying π from 32 to 16 and 8. The results aredepicted in Fig. 5. Since the RD is a simple linear detector,it is expected that once π < πΎ , it will lose identifiability,and exhibits a considerably flat SER curve. At the same time,DDD still enjoys almost full diversity with a moderate choiceof π = 16. Being the optimum detector, the SSD collects thefull diversity even if π = 8; however, the other two kinds ofdetectors exhibit flat SER curves, as expected.
The Group Lasso scheme for recovering block activity isalso included for the under-determined case. Fig. 6 illustratesthe activity recovery error rate for different values of π andππ . The number of observations π affects the diversity order,
![Page 11: 454 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2](https://reader031.vdocuments.mx/reader031/viewer/2022012505/6180f0cc30fe322e9c3198c6/html5/thumbnails/11.jpg)
464 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2, FEBRUARY 2011
5 10 15 20
10β2
10β1
100
Average SNR(dB)
Err
or R
ate
N=16,N
s=20
N=16,Ns=10
N=16,Ns=1
N=8,Ns=20
N=8,Ns=10
N=8,Ns=1
Fig. 6. Error rate for block activity vs. average SNR (in dB) of Group Lassoalgorithm with π = 16, 8 and ππ = 20, 10, 1.
while the block size ππ influences the recovery accuracy.
VIII. CONCLUSIONS
The MUD problem of sparse symbol vectors emerging withCDMA systems having low-activity factor was considered.Viewing user inactivity as augmenting the underlying alphabet,the a priori available sparsity information was exploited inthe optimal S-MAP detector. The exponential complexityassociated with the S-MAP detector was reduced by resortingto (sub-) optimal algorithms. Relaxed S-MAP detectors (RDand LD) come with low complexity but sacrifice optimality,because they ignore the finite-alphabet constraint. The secondkind of detectors (DDD and SSD), searches over a subset ofthe alphabet and exhibits improved performance at increasedcomplexity. The performance was analyzed for the RD andDDD algorithms, and closed-form expressions were derivedfor the corresponding symbol error rates.
S-MAP detectors were further generalized to deal withcorrelated user (in)activity across symbols by recursively esti-mating each userβs activity factor online; and also with under-determined (a.k.a. over-saturated CDMA) settings emergingwhen the spreading gain is smaller than the potential numberof users. Coping with the latter becomes possible throughregularization with the β0 norm prior, or, with the GroupLasso-based recovery of the active user set. The numericaltests corroborated our analytical findings and quantified therelative performance of the novel sparsity-exploiting MUDalgorithms.
Our future research agenda includes analytical performanceevaluation in the under-determined case, which has beenavailable for the recovery of sparse signals but without finite-alphabet constraints. In addition, it will be interesting todevelop S-MAP detectors based on lattice search for higher-order constellations in the under-determined case.
APPENDIX APROOF OF (21)
To solve the optimization problem (20), consider first theunconstrained solution to the LS part of the cost, which can
be written as
πLSπ :=
(π¦β²π β
βπΎβ=π+1 π πβοΏ½οΏ½
DDDβ
)/π ππ. (32)
The detected symbol in (20) can be equivalently expressed as
οΏ½οΏ½DDDπ = arg min
ππβππ
π(ππ),
π(ππ) :=(πLSπ β ππ
)2+ (2π/π 2
ππ)β£ππβ£0. (33)
The solution οΏ½οΏ½DDDπ can be obtained by comparing π(0) with
minππβπ π(ππ). Specifically, as the cost π(ππ) is quadratic forππ β π, the minimum is achieved at π(βπLS
π β), by quantizingπLSπ to the nearest point in π. Thus, οΏ½οΏ½DDD
π = 0 only if π(0) β€π(βπLS
π β), or equivalently, after using the definition of π(β ), if2πLS
π βπLSπ β β βπLS
π β2 β 2π/π 2ππ β€ 0. This completes the proof
of (21).
APPENDIX BPROOF OF (22)
When π = 2, the DDD solution (21) reduces to
οΏ½οΏ½DDDπ = sign(πLS
π )11(2β£πLSπ β£ β 1 β 2π/π 2
ππ > 0) (34)
due to the fact that βπLSπ β = sign(πLS
π ) β {Β±1}.Recalling that b denotes the transmitted vector, and substi-
tuting y = Hb + w yields
yβ² := Qπy = Rb + u (35)
where u := Qπw is zero-mean Gaussian with identity covari-ance matrix. Supposing that there is no error propagation, theπLSπ term in (34) becomes [cf. (32)]
πLSπ =
(πΎββ=π
π πβοΏ½οΏ½β + π’π βπΎβ
β=π+1
π πβοΏ½οΏ½DDDπ
)/π ππ
= οΏ½οΏ½β + π’π/π ππ (36)
where π’π denotes the π-th entry of u.To analyze the error probability for the detector in (34)
consider the following three cases.a) οΏ½οΏ½π = 0 is sent: under this case, πLS
π = π’π/π ππ and adetection error emerges when οΏ½οΏ½DDD
π = 1 or β1. With theclosed-form DDD detector (34) in mind, such an error occursonly if both πLS
π β= 0 and 2β£πLSπ β£ β 2π/π 2
ππ β 1 > 0 hold.The first case corresponds to π’π β= 0 and the second one isequivalent to β£π’πβ£ > β£π ππβ£/2 + π/β£π ππβ£, which is included inthe event π’π β= 0. Hence, to evaluate the error probability itsuffices to consider only the case β£π’πβ£ > β£π ππβ£/2 + π/β£π ππβ£.b) οΏ½οΏ½π = 1 is sent: following the analysis in a), an error occursif sign(π ππ)π’π β€ ββ£π ππβ£/2 + π/β£π ππβ£.c) οΏ½οΏ½π = β1 is sent: following the analysis in a), an erroroccurs if sign(π ππ)π’π β₯ β£π ππβ£/2 β π/β£π ππβ£.
Given that the Gaussian distributed π’π has variance 1, theoverall SER for the π-th entry ππ becomes
πDDDπ,π =
βπ=0,Β±1 π (errorβ£οΏ½οΏ½π = π)π (οΏ½οΏ½π = π)
= 2(1 β ππ)π
( β£π ππβ£2
+π
β£π ππβ£)
+ πππ
( β£π ππβ£2
β π
β£π ππβ£)
. (37)
![Page 12: 454 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 59, NO. 2](https://reader031.vdocuments.mx/reader031/viewer/2022012505/6180f0cc30fe322e9c3198c6/html5/thumbnails/12.jpg)
ZHU and GIANNAKIS: EXPLOITING SPARSE USER ACTIVITY IN MULTIUSER DETECTION 465
ACKNOWLEDGEMENT
The authors wish to thank the anonymous reviewers for theirhelpful feedback which improved exposition of this paper.
REFERENCES
[1] D. Angelosante, E. Grossi, G. B. Giannakis, and M. Lops, βSparsity-aware estimation of CDMA system parameters," EURASIP J. AdvancesSignal Process., June 2010, article ID 417981.
[2] R. G. Baraniuk, βCompressive sensing," IEEE Signal Process. Mag.,vol. 24, pp. 118-121, July 2007.
[3] C. R. Berger, S. Zhou, J. C. Preisig, and P. Willett, βSparse channelestimation for multicarrier underwater acoustic communication: fromsubspace methods to compressed sensing," IEEE Trans. Signal Process.,vol. 58, no. 3, pp. 1708-1721, Mar. 2010.
[4] E. J. CandΓ¨s and T. Tao, βNear-optimal signal recovery from randomprojections: universal encoding strategies?" IEEE Trans. Inf. Theory,vol. 52, pp. 5406-5425, Dec. 2006.
[5] E. CandΓ¨s and M. B. Wakin, βAn introduction to compressive sampling,"IEEE Signal Process. Mag., vol. 25, pp. 21-30, Mar. 2008.
[6] M. Cetin and B. M. Sadler, βSemi-blind sparse channel estimation withconstant modulus symbols," in Proc. Intl. Conf. Acoust., Speech, SignalProcess., Philadelphia, PA, Mar. 2005.
[7] S. S. Chen, D. L. Donoho, and M. A. Saunders, βAtomic decompositionby basis pursuit," SIAM J. Scientific Comput., vol. 20, pp. 33-61, 1998.
[8] T. Cui and C. Tellambura, βAn efficient generalized sphere decoder forrank-deficient MIMO systems," IEEE Commun. Lett., vol. 9, pp. 423-425, May 2005.
[9] M. O. Damen, H. E. Gamal, and G. Caire, βOn maximum-likelihooddetection and the search for the closest lattice point," IEEE Trans. Inf.Theory, vol. 49, pp. 2389-2402, Oct. 2003.
[10] A. K. Fletcher, S. Rangan, and V. K. Goyal, βA sparsity detectionframework for on-off random access channels," in IEEE Proc. Intl.Symp. Inf. Theory, pp. 169-173, Seoul, South Korea, June 2009.
[11] G. B. Giannakis, Z. Liu, X. Ma, and S. Zhou, Space-Time Coding forBroadband Wireless Communications. John Wiley & Sons, Inc., 2007.
[12] A. H. Sayed, Fundamentals of Adaptive Filtering. John Wiley & Sons,2003.
[13] J. F. Sturm βUsing SeDuMi 1.02, a MATLAB toolbox for optimizationover symmetric cones," Optimization Methods Software, vol.11-12, pp.625-653, 1999. [Online], Available: http://sedumi.mcmaster.ca.
[14] Z. Tian, G. Leus, and V. Lottici, βDetection of sparse signals underfinite-alphabet constraints," in Proc. Intl. Conf. Acoust., Speech SignalProcess., pp. 2349-2352, Taipei, Apr. 2009.
[15] R. Tibshirani, βRegression shrinkage and selection via the lasso," J.Royal Statistical Soceity: Series B, vol. 58, pp. 267-288, 1996.
[16] M. E. Tipping, βSparse Bayesian learning and the relevance vectormachine," J. Machine Learning Research, vol. 1, pp. 211-244, 2001.
[17] S. Verdu, Multiuser Detection. Cambridge, UK: Cambridge Univ. Press,1998.
[18] E. Viterbo and J. Boutros, βA universal lattice code decoder for fadingchannels," IEEE Trans. Inf. Theory , vol. 45, pp. 1639-1642, July 1999.
[19] M. Yuan and Y. Lin, βModel selection and estimation in regression withgrouped variables," J. Royal Statistical Soceity: Series B, vol. 68, pp.49-67, 2006.
[20] H. Zhu and G. B. Giannakis, βSparsity-embracing multiuser detectionfor CDMA systems with low activity factor," in Proc. IEEE Intl. Symp.Inf. Theory, pp. 164-168, Seoul, South Korea, June 2009.
Hao Zhu (Sβ07) received her Bachelor degree fromTsinghua University, Beijing, China in 2006 andthe M.Sc. degree from the University of Minnesota,Minneapolis in 2009, both in electrical engineering.
From 2005 to 2006, she was working as a researchassistant with the Intelligent Transportation informa-tion Systems (ITiS) Laboratory, Tsinghua Univer-sity. Since September 2006, she has been workingtowards her Ph.D. degree with the Department ofElectrical and Computer Engineering, University ofMinnesota. Her research interests include sparse
signal recovery and blind source separation in signal processing.
Georgios B. Giannakis (Fellowβ97) received hisDiploma in Electrical Engr. from the Ntl. Tech. Univ.of Athens, Greece, 1981. From 1982 to 1986 he waswith the Univ. of Southern California (USC), wherehe received his MSc. in Electrical Engineering,1983, MSc. in Mathematics, 1986, and Ph.D. inElectrical Engr., 1986. Since 1999 he has been aprofessor with the Univ. of Minnesota, where he nowholds an ADC Chair in Wireless Telecommunica-tions in the ECE Department, and serves as directorof the Digital Technology Center.
His general interests span the areas of communications, networking andstatistical signal processing, subjects on which he has published more than300 journal papers, 500 conference papers, two edited books and two researchmonographs. Current research focuses on compressive sensing, cognitiveradios, network coding, cross-layer designs, mobile ad hoc networks, wirelesssensor and social networks. He is the (co-) inventor of 18 patents issued, andthe (co-) recipient of seven paper awards from the IEEE Signal Processing(SP) and Communications Societies, including the G. Marconi Prize PaperAward in Wireless Communications. He also received Technical AchievementAwards from the SP Society (2000), from EURASIP (2005), a Young FacultyTeaching Award, and the G. W. Taylor Award for Distinguished Research fromthe University of Minnesota. He is a Fellow of EURASIP, and has served theIEEE in a number of posts, including that of a Distinguished Lecturer for theIEEE-SP Society.