a unified treatment of superposition coding aided communications: theory and practice


IEEE COMMUNICATIONS SURVEYS & TUTORIALS, VOL. 13, NO. 3, THIRD QUARTER 2011 503

A Unified Treatment of Superposition Coding Aided Communications: Theory and Practice

Rong Zhang and Lajos Hanzo

Abstract—In this tutorial, a unified treatment of the topic of SuperPosition Coding (SPC) aided communications systems is offered, where we focus our attention not only on the theoretical background of SPC but also on its implementation techniques. The design methodology of SPC aided communications transceivers is described, including the Multiple Level Modulation (MLM) concept, the corresponding iterative Successive Interference Cancellation (SIC) aided receiver and the quasi-random coding principle. Building on these fundamental discussions, various future wireless applications of the SPC technique are provided, which encompass cooperative communications-assisted relay systems and cross-layer SPC aided multiplexed Hybrid Automatic Repeat reQuest (HARQ) techniques. Our discussions demonstrate that the SPC technique is a promising enabler for diverse future wireless communications scenarios, ranging from cellular networks to cooperative networks.

Index Terms—Superposition coding, turbo receiver, quasi-random coding, cooperative communications, HARQ.

I. INTRODUCTION

EFFICIENT information delivery over wireless links is becoming more and more crucial, requiring sophisticated enabling techniques to satisfy the ever-evolving service requirements. It has been widely recognised that there is no single enabling technique that is capable of improving the achievable system performance in all scenarios. This is the consequence of having to satisfy inherent design tradeoffs, leading to sophisticated cross-layer interactions [1]. Despite these challenges, from the stand-alone physical layer perspective, Multiple Input Multiple Output (MIMO) [2] and Orthogonal Frequency Division Multiplexing (OFDM) [3] constitute promising techniques, especially when combined with powerful adaptive transmission [4] and Hybrid Automatic Repeat reQuest (HARQ) [5] in the link layer, which requires sophisticated Digital Signal Processing (DSP) algorithms. At the time of writing, the Third Generation Partnership Project's (3GPP) Long Term Evolution (LTE) [6] has proposed to employ OFDMA in the DownLink (DL) and single carrier Frequency Division Multiple Access (FDMA) combined with Frequency Domain Equalisation (FDE) in the UpLink (UL). Although substantial data-rate improvements have been achieved by the 3GPP LTE system, it should be recognised that this improvement is mostly attributed to the MIMO technique employed and to its flexible bandwidth allocation [7]. Future wireless systems are expected to support massive data rates both in the UL and in the DL, leading to the key problem of minimising the cost-per-bit. These requirements may not be readily fulfilled, unless further substantial advances are made. Despite the above-mentioned great strides in technology, there is a lack of a unified treatment of the philosophy of advanced SuperPosition Coding (SPC) aided communications and its applications. In a nutshell, a SPC scheme is constituted by multiple superimposed layers, where each layer is protected by a near-capacity binary channel code and the resultant coded layers are appropriately weighted by their layer-specific amplitude scaling factors and phase-rotated before their linear superposition takes place. Although the concept of SPC has been exploited in wireless communications [8]–[10], until recently SPC schemes have not been implemented. Hence, in this tutorial we aim to provide a unified treatment of near-capacity SPC aided communications, commencing from the underlying theory and covering sophisticated applications. An important motivation for aiming to stimulate further research on SPC-aided cooperative transceivers is that they may be considered power-efficient 'green' solutions, since in contrast to the classic modulation schemes obeying the logarithmic Shannon-Hartley law, they allow a linear increase of the throughput as a function of the transmit power.

Manuscript received 11 November 2009; revised 7 and 9 March 2010. The financial support of the EPSRC under the auspices of the UK-India Centre of Excellence in Wireless Communications is gratefully acknowledged.

The authors are with the School of ECS, Univ. of Southampton, SO17 1BJ, UK (e-mail: {rz,lh}@ecs.soton.ac.uk, Tel: +44-23-80-593 125, Fax: +44-23-80-593 045, http://www-mobile.ecs.soton.ac.uk).

Digital Object Identifier 10.1109/SURV.2011.061610.00102

1553-877X/11/$25.00 © 2011 IEEE

In its simplest guise, a SPC scheme can be viewed as a specific modulation technique, which may be referred to as the Multiple Level Modulation (MLM) technique, where the complex-valued phasor constellation may be viewed as being Gaussian distributed, rather than obeying a predefined ordered structure, as in the classic Quadrature Amplitude Modulation (QAM) schemes [11]. On the other hand, a SPC scheme may also be viewed as being a multiplexing technique, where instead of simultaneously transmitting from multiple colocated antennas as in the Bell-Labs Layered Space Time Architecture (BLAST) [12] or instead of using multiple spreading codes, known as multicode transmission, as employed in High Speed Packet Access (HSPA) [13], a high throughput is achieved by simultaneously transmitting in the form of multiple superimposed layers. In both interpretations, the SPC concept relies on the employment of capacity-approaching channel codes, which obey the random coding principle [8]. This Gaussian distributed channel input of SPC is desirable from an information theoretic point of view, because it is capable of approaching the ergodic channel capacity derived by Shannon [14] and achieving the so-called shaping gain [15], which is the mutual information difference between a Gaussian distributed channel input system and a discrete conventional modulation input system. However, it renders the receiver more complex, since there is no clearly defined decision boundary as in the context of conventional modulation schemes [16]. As a result, classic stochastic estimation theory plays a crucial role in detecting such a composite signal obeying Bayesian Inference [17]. On the other hand, Successive Interference Cancellation (SIC) techniques provide a low-complexity design alternative, which in theory approaches the ultimate Bayesian performance [8]. For the sake of eliminating the residual interference during cancellation and avoiding error propagation, an iterative SIC receiver has been advocated [18]. Hence, in the theory part of this tutorial, we introduce the fundamental knowledge required for designing a SPC aided communications system, namely the theory of SPC, the algorithm of the iterative SIC receiver and the employment of quasi-random codes.

The concept of SPC has already been adopted implicitly in many modern communications system designs. The most important one may be the family of Linear Dispersion Codes (LDC) [19], where each antenna transmits a weighted sum of many channel coded input symbols at a time as in MLM and the weighting factor of each antenna's stream is typically found subject to a predefined design criterion. Dirty Paper Coding (DPC) of [20] is also reminiscent of the SPC principle, where in addition to simply superimposing multiple layers in MLM, we also subtract the recognisable sources of interference prior to transmission. This may be referred to as using SIC at the transmitter. Importantly, the SPC concept is closely related to non-orthogonal multiuser communications in the cellular DL [21]. The intra-cell interference can be mitigated by the orthogonality realized in the synchronous OFDMA DL, while a non-orthogonal interference-limited scenario is encountered, when the inter-cell interference of multiple cells is taken into account from a system-level point of view. In fact, the non-orthogonal approach can be realized in a generalised code domain having typical instantiations such as for instance Trellis Code Multiple Access (TCMA) [22], [23] and Interleave Division Multiple Access (IDMA) [24], [25]. In addition to the above-mentioned classic applications, we will introduce two emerging applications of the SPC technique in the area of cooperative communications [26], [27] and in the context of HARQ schemes. More explicitly, the SPC scheme may be invoked in a Cooperative Multiple Access (CMA) scenario [28], where multiple sources forming a cluster jointly communicate with the destination, which is also known as Multiple Source Cooperation (MSC) [29]–[31]. It is also important to highlight the similarities between the SPC technique and the Network Coding (NC) technique [32]–[34], where the latter has been considered to be of high significance in future wireless networks. Furthermore, the SPC technique may also be employed in the context of HARQ mechanisms, leading to a novel Multiplexed HARQ (M-HARQ) scheme [35], where the main philosophy is that the M-HARQ jointly encodes the current new packet to be transmitted and any packets that are about to be retransmitted.

This tutorial is organised as follows. Firstly, in Section II we introduce the fundamental knowledge required for designing a SPC aided communications system. In Section II-A, the information theoretical aspects of SPC are considered, the iterative SIC aided SPC receiver concept is introduced in Section II-B, while the benefits of quasi-random codes are discussed in Section II-C. These discussions are followed by a discourse on the diverse applications of SPC aided communications systems in Section III, where we apply the SPC concept in cooperative system scenarios in Section III-A. A novel SPC aided M-HARQ scheme is presented in Section III-B. Finally, we will conclude with a range of future research ideas in Section IV and provide a glossary of terms in the Appendix for the reader's convenience.

II. THEORY - THREE DESIGN ASPECTS

The family of SPC aided communications systems has two essential building blocks, namely MLM and near-capacity channel coding, both of which are inherently assumed. Hence, in this section, we provide a unified view of SPC aided transceiver design, focusing our attention on the theory of SPC, on the architecture of a powerful iterative SIC-assisted receiver and on the construction of potent quasi-random channel codes.

A. Fundamental Theory of Superposition Coding

The foundations of information theory were laid by Shannon in his landmark paper in 1948 [14], which influenced virtually all areas of digital technologies [36]. It was later revealed that the capacity of a Gaussian Multiple Access (GMA) channel may be approached by SIC combined with single-user decoding, provided that the users benefit from appropriate power allocation or rate allocation [8], [37]. In this section, we first introduce the concept of MLM and then demonstrate that the SPC scheme is capable of approaching the channel capacity. This section closely follows classic information theory, summarising several important formulas and conclusions. More detailed expositions of information theory can be found in standard textbooks, such as [8].

1) The Features of Multiple Layer Modulation: Apart from classic modulation schemes [11], there is a less conventional modulation scheme [11], namely the so-called multiple layer modulation arrangement defined as:

$$x = \sum_{l=1}^{L} x_l, \tag{1}$$

where $x_l \in \mathcal{A}_l$ is referred to as a layer or symbol and $x \in \mathcal{A}$ is termed a super-symbol. Without loss of generality, we assume that the symbol alphabet is the same for all the $L$ layers, i.e. $\mathcal{A}_l = \mathcal{A}_0, \forall l \in [1, L]$. There are two important features associated with MLM. Firstly, the signal constellation points representing the super-symbol $x \in \mathcal{A}$ exhibit a non-equiprobable nature. Since the entropy is maximised by equiprobable symbols, this property is not beneficial from the entropy maximisation point of view. Secondly, MLM has a reduced cardinality, where the super-symbol set obeys $|\mathcal{A}| \le 2^L$, when we assume $\mathcal{A}_0 = \{\pm 1\}$ for instance. The second property makes the symbol to super-symbol mapping ambiguous, where several combinations of $x_l, l \in [1, L]$ result in the same constellation point $x$. These two undesired properties of MLM are further exemplified below.

Consider a MLM scheme having $L$ BPSK modulated layers, where $\mathcal{A}_0 = \{\pm 1\}$. Then the super-symbol alphabet $\mathcal{A} = \{u_1, \ldots, u_{|\mathcal{A}|}\}$ in this case has $|\mathcal{A}| = L + 1$ distinct constellation points with its $i$th entry being $u_i = (2i - 2 - L)$ with a probability of [38]:

$$P(x = u_i) = 2^{-L} \binom{L}{i-1}. \tag{2}$$

Fig. 1. The super-symbol alphabet $\mathcal{A} = \{u_1, \ldots, u_{|\mathcal{A}|}\}$ and its corresponding probability for different numbers of layers, where each layer employs BPSK modulation $\mathcal{A}_0 = \{\pm 1\}$. [Six panels, L = 2, ..., 7; horizontal axis: super-symbol constellation; vertical axis: probability.]

Fig. 1 shows the super-symbol alphabet and the corresponding probability. When we have $L = 3$, there are $L + 1 = 4$ distinct points with unequal probabilities. Since only four distinct constellation points are created, the resultant capacity will be strictly lower than the sum of the individual layers' rates, which is 3 symbols/super-symbol, corresponding to eight distinct signal constellation points. Based on the above discussions, we may conclude that the MLM scheme's constellation-constrained capacity is bounded by the sum of the individual rates of each layer. The actual capacity depends on the specific number of distinct signal constellation points and their associated probability of occurrence.
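As a minimal numerical sketch of the above example (it is not part of the original tutorial), the following Python fragment enumerates all $2^L$ combinations of equal-weight BPSK layers, tallies the resultant super-symbols and compares them with $u_i = 2i - 2 - L$ and Eq. (2). The function names `mlm_alphabet` and `entropy_bits` are illustrative assumptions. The entropy of the super-symbol is included merely as a simple indicator of the loss imposed by the non-equiprobable, reduced-cardinality alphabet.

```python
from collections import Counter
from itertools import product
from math import comb, log2


def mlm_alphabet(num_layers):
    """Return {super-symbol value: probability} for equal-weight BPSK layers."""
    # Enumerate every combination of layer symbols and add them up, as in Eq. (1).
    counts = Counter(sum(layers) for layers in product((-1, 1), repeat=num_layers))
    total = 2 ** num_layers
    return {value: count / total for value, count in sorted(counts.items())}


def entropy_bits(probabilities):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * log2(p) for p in probabilities if p > 0)


if __name__ == "__main__":
    L = 3
    alphabet = mlm_alphabet(L)
    for i, (u, p) in enumerate(alphabet.items(), start=1):
        # Compare the brute-force enumeration with u_i = 2i - 2 - L and Eq. (2).
        assert u == 2 * i - 2 - L
        assert abs(p - comb(L, i - 1) / 2 ** L) < 1e-12
        print(u, p)
    print(entropy_bits(alphabet.values()))
```

For $L = 3$ the four points $\{-3, -1, 1, 3\}$ occur with probabilities $\{1/8, 3/8, 3/8, 1/8\}$ and the entropy is about 1.81 bits, which is below the 3 bits conveyed by the three binary layers.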

Although the MLM scheme's constellation-constrained capacity is reduced owing to the above-mentioned unfavourable properties, it does have a particularly attractive property, namely that the resultant super-symbol $x$ is approximately Gaussian distributed, provided that the number of layers $L$ becomes sufficiently high, which is a consequence of the central limit theorem [38]. Having a Gaussian distributed transmitted signal is the key to approaching the Shannon capacity. This signalling scheme achieves a so-called shaping gain [15] in contrast to the conventional QAM or Phase Shift Keying (PSK) modulation [11]. For the sake of achieving a high capacity for a MLM scheme, firstly we have to increase its cardinality $|\mathcal{A}|$ and then make the super-symbols equiprobable. Two simple operations are thus necessary, namely that of using a layer-specific amplitude scaling factor $\rho_l, l \in [1, L]$ and/or a layer-specific phase rotation parameter $\theta_l, l \in [1, L]$. More explicitly, Eq. (1) becomes:

$$x = \sum_{l=1}^{L} \rho_l e^{j\theta_l} x_l. \tag{3}$$
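The effect of the layer-specific weights in Eq. (3) can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' design: it evaluates Eq. (3) for every combination of BPSK layer symbols and counts the distinct constellation points. The helper names and the particular phase set $\{0, \pi/4, \pi/2\}$ are hypothetical choices made for this example only.

```python
import cmath
from itertools import product
from math import pi


def super_symbols(amplitudes, phases):
    """Evaluate Eq. (3) for every combination of BPSK layer symbols."""
    points = []
    for layers in product((-1, 1), repeat=len(amplitudes)):
        x = sum(rho * cmath.exp(1j * theta) * x_l
                for rho, theta, x_l in zip(amplitudes, phases, layers))
        points.append(x)
    return points


def count_distinct(points, decimals=9):
    """Count distinct constellation points, ignoring floating-point rounding noise."""
    return len({(round(p.real, decimals), round(p.imag, decimals)) for p in points})


if __name__ == "__main__":
    unit = [1.0, 1.0, 1.0]
    # No rotation: the three layers collapse onto L + 1 = 4 points.
    print(count_distinct(super_symbols(unit, [0.0, 0.0, 0.0])))
    # Hypothetical layer-specific rotations: all 2^3 = 8 combinations stay distinct.
    print(count_distinct(super_symbols(unit, [0.0, pi / 4, pi / 2])))
```

Not every phase set is suitable: for unit amplitudes the set $\{0, \pi/3, 2\pi/3\}$ maps two different layer combinations to the origin, hence the rotations have to be chosen with the resultant constellation in mind.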

Let us now reconsider the BPSK example of Fig. 1 with $L = 3$. If we assign a unique layer-specific phase rotation parameter $\theta_l$ to each layer, then a total of $|\mathcal{A}| = 8$ distinct equiprobable signal constellation points can be created, thus a sum-rate of 3 symbols/super-symbol may be achieved as a maximum.

2) The Relatives of Multiple Layer Modulation: MLM is also known as superposition modulation [39] and sigma mapping [40]. By allocating a layer-specific amplitude scaling factor $\rho_l$ and phase-rotation parameter $\theta_l$, the MLM in conjunction with the coding and interleaving employed becomes analogous to the so-called MultiLevel Coding (MLC) [41], [42] concept portrayed in Fig. 2(a). In MLC, each bit level employs a different code-rate, such as $C_1$, $C_2$ and $C_3$ in Fig. 2(a), and is then modulated using a conventional QAM/PSK arrangement combined with a particular bit-to-symbol mapping scheme. By contrast, in MLM, each layer is superimposed, while employing a different layer-specific amplitude and phase-rotation. The outer channel code can be employed on a per-layer basis as in Fig. 2(b) or across layers, as seen in Fig. 2(c). A common property of these two schemes is constituted by the unequal error protection capability of the different levels/layers, where unequal protection is achieved by assigning the appropriate code-rate of $C_1$, $C_2$ and $C_3$ in MLC, while it is achieved by appropriate amplitude scaling in MLM. In addition, in order to fully exploit the benefits of unequal error protection, while still maintaining a sufficiently high performance even for the least-protected levels/layers, both schemes rely on an iterative SIC and decoding aided receiver, which is another similarity¹.

¹ MLM may be viewed as a power-efficient technique, because overlaying four layers to quadruple the throughput potentially requires four times or 6 dB more power, while 16QAM for example may require about 9 dB more power. Furthermore, each layer of MLM may be potentially amplified by individual modest-linearity, high-power-efficiency amplifiers before their superposition, while 16QAM typically requires higher-linearity, low-power-efficiency power amplifiers.

Fig. 2. The block diagram of the transmitter of (a) MLC, using per-layer codes and conventional modulation; (b) MLM Type I, using per-layer codes and MLM; (c) MLM Type II, using joint coding and MLM. [Diagram labels: codes $C_1, C_2, C_3$ and a mapper in (a); code $C$, interleavers $\pi_1, \pi_2, \pi_3$ and per-layer weights $(\rho_l, \theta_l)$ feeding a summation in (b) and (c).]

It is important to note that Eq. (3) effectively describes the entire family of channel-division based systems, such as for example Spatial Division Multiple Access (SDMA) [43], where the weighting factor $\rho_l e^{j\theta_l}$ of Eq. (3) can be considered as a unique user-specific or data-stream-specific virtual channel. By contrast, Eq. (1) describes an ensemble of code-division based systems, such as IDMA, where the differentiation of signalling layers is based on different interleavers, i.e. we have $x_l = \pi_l[C(b_l)]$. In other words, combining a 'channel-division' element with the 'code-division' philosophy is beneficial. On the other hand, combining the principle of 'code-division' with that of 'channel-division' is also beneficial. This is because, at the first stage of the receiver, having distinct channel-division information is essential in order to facilitate a unique mapping of $y \to x_l$. At the second receiver stage, having distinct code-division information ensures that a unique mapping from the coded bits $x_l$ to the original information bits $b_l$ can take place.

3) The Capacity of Superposition Coding: Let us now briefly introduce the underlying information theory behind the concept of SPC. Recall from Eq. (3) that, provided each constituent layer $x_l$ is protected by a capacity-achieving code and provided that its layer-specific amplitude scaling factor $\rho_l$ is appropriately chosen, we arrive at the concept of superposition coding. Without loss of generality, we consider a two-layer SPC scheme transmitting over Additive White Gaussian Noise (AWGN) channels, as characterised by:

$$y = \rho_1 x_1 + \rho_2 x_2 + n, \tag{4}$$

where $y$ and $n$ represent the channel output observation and the AWGN process, respectively. Furthermore, $\rho_l, l = 1, 2$ denotes the layer-specific amplitude scaling factor under the super-symbol power constraint of $\rho_1^2 + \rho_2^2 = P$ and we assume $\rho_1 > \rho_2$. Given the independent channel input distributions of $p_{x_1}$ and $p_{x_2}$ for the two layers, the capacity of a two-layer channel is described by the so-called capacity region, which may be defined as the set of all rate pairs satisfying three constraints, namely the capacity constraint of layer $x_1$, the capacity constraint of layer $x_2$ and the sum capacity constraint [10], as formulated in:

$$\begin{aligned} R_1 &< I(x_1; y \,|\, x_2), \\ R_2 &< I(x_2; y \,|\, x_1), \\ R_1 + R_2 &< I(x_1, x_2; y). \end{aligned}$$

A particular rate-pair representing one of the optimal operating points can be realized by the classic SIC type receiver, where the underlying principle is the so-called chain-rule of mutual information [8], allowing us to write $I(x_1, x_2; y) = I(x_1; y) + I(x_2; y \,|\, x_1)$. This implies that the SIC receiver may detect $x_1$ first, treating the signal $x_2$ as interference. Then, the SIC receiver detects $x_2$ conditioned on the already detected information of $x_1$. This results in the rate pair of:

$$(R_1, R_2) = \left\{ \ln\left(1 + \frac{\rho_1^2}{\rho_2^2 + N_0}\right),\ \ln\left(1 + \frac{\rho_2^2}{N_0}\right) \right\}. \tag{5}$$

At this point, layer $x_1$ operates at the rate $I(x_1; y)$. Since the sum rate constraint $I(x_1, x_2; y)$ is tight at this point, layer $x_2$ achieves its highest rate of $I(x_2; y \,|\, x_1)$.
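The chain-rule argument may be illustrated numerically. The following sketch simply evaluates Eq. (5) as printed (in nats) and confirms that the two rates add up to $\ln(1 + P/N_0)$, i.e. that the sum-rate constraint is met with equality at this operating point. The function name and the numerical values are assumptions introduced for illustration only.

```python
from math import log


def sic_rate_pair(rho1_sq, rho2_sq, noise_power):
    """Rate pair of Eq. (5) in nats: detect x1 first, then x2 after cancellation."""
    r1 = log(1.0 + rho1_sq / (rho2_sq + noise_power))  # x2 is treated as interference
    r2 = log(1.0 + rho2_sq / noise_power)              # x1 has already been removed
    return r1, r2


if __name__ == "__main__":
    # Hypothetical powers obeying rho1^2 + rho2^2 = P = 10 with N0 = 1.
    r1, r2 = sic_rate_pair(8.0, 2.0, 1.0)
    print(r1, r2, r1 + r2, log(1.0 + 10.0))
```

Algebraically, $\left(1 + \frac{\rho_1^2}{\rho_2^2 + N_0}\right)\left(1 + \frac{\rho_2^2}{N_0}\right) = \frac{P + N_0}{N_0}$, hence the sum-rate does not depend on how the power $P$ is split between the two layers; only the individual rates do.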

Let us now generalise these concepts to $L$ layers and find the appropriate amplitude scaling factor $\rho_l, \forall l \in [1, L]$. Assuming, without any loss of generality, that we have $\rho_1 > \rho_2 > \ldots > \rho_L$, the resultant output Signal to Interference plus Noise Ratio (SINR) when detecting the $l$th layer is given by:

$$\gamma_l = \frac{\rho_l^2}{\sum_{j=l+1}^{L} \rho_j^2 + N_0}. \tag{6}$$

Since the channel capacity for each layer is a monotonically increasing function of the output SINR $\gamma_l$, when an equal-rate system is assumed, we arrive at $\gamma_i = \gamma_j, \forall i, j \in [1, L], i \neq j$. This results in an exponential power allocation law [44], associated with $\rho_{l+1}^2 = \rho_l^2 / \beta$ for the consecutive layers. Following further manipulations, we arrive at the power allocation factor $\beta = \sqrt[L]{P/N_0 + 1}$, where $P = \sum_{l=1}^{L} \rho_l^2$ is the sum-power constraint. It may be observed that the power allocation factor $\beta$ is only a function of two factors, namely of the total signal power to noise power ratio as well as of the number of layers $L$. The achievable rate of an $L$-layer SPC scheme is quantified in Fig. 3 with the aid of the optimum power allocation scheme, which approaches the channel capacity when the number of layers $L$ increases. Furthermore, the SPC scheme may be considered a power-efficient 'green' solution, since in contrast to the classic modulation schemes obeying the logarithmic Shannon-Hartley law, it allows a linear increase of the throughput as a function of the transmit power, as long as the number of layers is sufficiently large. What is also worth emphasising is that:

• the power allocation scheme requires perfect SIC, implying that the interference imposed by the already detected layers was perfectly removed. Since the perfect SIC commences its operation from the strong layers and proceeds to the weaker layers, in common parlance SIC is also referred to as the so-called onion peeling [37] or stripping aided detection [45].

• the power allocation scheme does not explicitly specify the distributions of the channel input, which may either be a Gaussian input as typically assumed in theoretical studies or any type of modulation constellation constrained input, where the channel capacity is a monotonically increasing function of the SINR for both.

• the power allocation scheme forms one of the $L!$ possible power allocation combinations, where the sum-rate equals the channel capacity. Moreover, $(L! - 1)$ other power allocation schemes may be obtained by using different power assignments across layers. For example, when assuming $\rho_L > \rho_{L-1} > \ldots > \rho_1$, another power allocation scheme may be found.
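The exponential power allocation law can be sketched in a few lines of Python. The fragment below is an illustration under the perfect-SIC assumption stated above; the function names are hypothetical. It relies on a step that is implied by, but not spelt out in, the text: since the last layer experiences no interference, Eq. (6) gives $\rho_L^2 = \gamma N_0$, and equating all SINRs under the sum-power constraint yields $\gamma = \beta - 1$.

```python
def exponential_power_allocation(total_power, noise_power, num_layers):
    """Per-layer powers rho_l^2 (strongest first) giving equal SINR under perfect SIC."""
    beta = (total_power / noise_power + 1.0) ** (1.0 / num_layers)
    # Weakest layer: rho_L^2 = gamma * N0 with gamma = beta - 1 (derived assumption).
    weakest = (beta - 1.0) * noise_power
    # Consecutive layers differ by the factor beta: rho_l^2 = beta * rho_{l+1}^2.
    return [weakest * beta ** (num_layers - l) for l in range(1, num_layers + 1)]


def sinr_per_layer(powers, noise_power):
    """Evaluate Eq. (6): layer l only sees layers l+1..L as interference."""
    return [p / (sum(powers[l + 1:]) + noise_power) for l, p in enumerate(powers)]


if __name__ == "__main__":
    powers = exponential_power_allocation(15.0, 1.0, 4)
    print(powers)                       # per-layer powers, strongest first
    print(sum(powers))                  # should meet the sum-power constraint P
    print(sinr_per_layer(powers, 1.0))  # identical SINR for every layer
```

For $P/N_0 = 15$ and $L = 4$ we have $\beta = 2$, which gives the powers $(8, 4, 2, 1)$ and an SINR of 1 for each layer. Each layer then supports $\ln 2$ nats according to Eq. (5), so that the $L$ layers together attain $\ln(1 + P/N_0) = \ln 16$.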

4) The Effect of Non-ideal Codes: Although in typical theoretical studies the employment of capacity-achieving channel codes is assumed, the benefits of SPC may be retained even when employing non-ideal channel codes. Let us now quantify the effects of non-ideal channel codes and generate the matching profile of amplitude scaling factors. Given a total super-symbol power constraint of $P$ and assuming an equal-rate scenario for each layer, we have $R = Lr$, where $R$ and $r$ denote the sum-rate and each layer's rate, respectively. The minimum required SINR $\gamma_m$ at a given rate $R$ is given by [8], $\gamma_m = (2^{2R} - 1)$, while according to the individual channel decoder of layer $x_l$, we have a threshold value $\gamma_t$, at which the channel decoder exhibits an infinitesimally low Bit Error Ratio (BER). The channel's output SINR required for successfully detecting the $l$th layer should be equal to the threshold value $\gamma_t$ for the channel code, which is given by $\rho_l^2 / (N_0 + I)$, where the total uncancelled interference contribution is given by $I = \sum_{j=l+1}^{L} \rho_j^2$ and the noise power is given by $N_0 = P / (\tau \gamma_m)$, where $\tau > 1$ represents the difference between the minimum SINR $\gamma_m$ and the required SINR $\gamma_r$ at rate $R$. In this way, we may arrive at a certain power profile $\rho_l^2 = \gamma_t (N_0 + I)$ for layers $l = 1, \ldots, L$. For each value of $R$, there is a corresponding value of $\tau$ at which we have $\sum_{l=1}^{L} \rho_l^2 = P$ and the resultant value $\tau$ quantifies the distance from capacity when using ideal codes.

Fig. 3. The achievable rate of $L$ layer superposition coding, where each layer employs BPSK modulation and $L = 1, \ldots, 10$. [Axes: SNR (dB) versus rate (bits/s/Hz); curves: superimposed layers L = 1, 2, ..., 10 and the capacity.]

Fig. 4. The achievable rate of the RA code aided SPC system for a short information bit length of 4096, a long information bit length of 10 000 and an infinite information sequence length. [Axes: SNR (dB) versus rate (bits/s/Hz); curves: channel capacity, RA long, short and infinite information length.]
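The construction of the power profile lends itself to a short sketch. The Python fragment below is a hedged illustration, not the authors' procedure: it builds $\rho_l^2 = \gamma_t (N_0 + I)$ from the weakest layer upwards and evaluates the value of $\tau$ that meets the sum-power constraint. The closed form used for $\tau$ is our own derivation from the relations above: the profile sums to $N_0[(1 + \gamma_t)^L - 1]$, and substituting $N_0 = P/(\tau \gamma_m)$ gives $\tau = [(1 + \gamma_t)^L - 1]/\gamma_m$. The function names and the example threshold values are assumptions.

```python
def power_profile(threshold_sinr, noise_power, num_layers):
    """Powers rho_l^2 = gamma_t * (N0 + I_l), built from the weakest layer upwards."""
    powers = []
    interference = 0.0  # the last layer sees noise only
    for _ in range(num_layers):
        p = threshold_sinr * (noise_power + interference)
        powers.append(p)
        interference += p  # this layer interferes with all stronger layers
    return powers[::-1]    # strongest layer first, matching rho_1 > ... > rho_L


def capacity_gap(threshold_sinr, sum_rate, num_layers):
    """Value of tau (linear scale) for which the profile meets the power constraint."""
    gamma_m = 2.0 ** (2.0 * sum_rate) - 1.0
    return ((1.0 + threshold_sinr) ** num_layers - 1.0) / gamma_m


if __name__ == "__main__":
    L, R, P = 4, 2.0, 1.0
    ideal_threshold = 2.0 ** (2.0 * R / L) - 1.0   # assumed ideal code at rate r = R/L
    print(capacity_gap(ideal_threshold, R, L))      # no gap expected for ideal codes
    tau = capacity_gap(1.2, R, L)                   # hypothetical non-ideal threshold
    n0 = P / (tau * (2.0 ** (2.0 * R) - 1.0))
    print(tau, sum(power_profile(1.2, n0, L)))      # the profile should sum to P
```

With an ideal per-layer code, i.e. $\gamma_t = 2^{2r} - 1$, the expression collapses to $\tau = 1$, which is consistent with $\tau$ quantifying the distance from capacity.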

Fig. 4 shows the achievable rate of a Repeat Accumulate (RA) code [46] aided SPC system for both a short information sequence length of 4096 bits and a long sequence of 10 000 bits. Although not explicitly shown here owing to space limitations, when a rate $R_c = 1/4$ RA code was employed, the actual $E_b/N_0$ values required at these two information sequence lengths for achieving an infinitesimally low BER were observed to be 0.8 dB and 0.5 dB, respectively, when using bit-by-bit Monte-Carlo simulations in conjunction with $I_{RA} = 50$ RA decoding iterations. Furthermore, the minimum decoding threshold $E_b/N_0$ value for a rate $R_c = 1/4$ RA code having an infinite information sequence length was reported to be 0.1 dB, when using the sum-product decoding algorithm [47]. Observe in Fig. 4 that the system operates close to the channel capacity achievable in conjunction with ideal codes. However, the overall SINR difference $\tau$ increases further, when the number of superimposed layers $L$ becomes high, especially when a short information sequence length is used.

Fig. 5. Block diagram of the $q$th iteration of the iterative successive interference cancellation receiver, which consists of the detector (DET) and the decoder (DEC), separated by layer-specific interleavers $\pi_l$, $l \in [1, L]$. Further, RC is short for reconstruction.

B. Iterative Receiver of Superposition Coding

1) The Rationale of Iterative SIC: A strong assumption made in the above SPC theory-related discourse was that the SIC receiver is capable of perfectly cancelling the interference when detecting each layer, which implies that the interference imposed when detecting the current layer $x_l$ is constituted by the undetected layers $[x_{l+1}, \ldots, x_L]$. In other words, the interference imposed by all the previously decoded layers $[x_1, \ldots, x_{l-1}]$ was perfectly cancelled, because the reconstruction of the previously decoded layers was perfect. Naturally, this cannot be achieved in practice owing to inaccurate power allocation or imperfect channel decoding, hence the residual interference may result in potentially catastrophic error propagation at low SNRs [48], [49]. Hence, the employment of an iterative SIC receiver becomes necessary [50], where the SIC operation is repeated for several iterations exchanging extrinsic information between the detector and the channel decoder [51] in order to eliminate the effects of residual interference, as seen in the schematic of Fig. 5.

More explicitly, at iteration $q$ of the iterative SIC, when detecting the $l$th layer $x_l$, the detector cancels the interference imposed by the previously decoded layers using their reconstructed estimates $\hat{x}^q_{l-1}$. The detected layer $x_l$ is then subjected to channel decoding and signal reconstruction, resulting in $\hat{x}^q_l$, which is then subtracted from the composite multi-layer signal for the sake of detecting the next layer $x_{l+1}$. These procedures are continued for several iterations, until a predetermined stopping criterion is satisfied. More explicitly, the SIC iterations may be curtailed upon approaching convergence associated with high-confidence soft values or upon reaching the maximum affordable number of iterations. The detection, decoding and reconstruction process carried out at each iteration for each layer may operate on either hard or soft values, where the latter option results in a powerful so-called turbo receiver [18], [51]–[55].

2) The Principle of Iterative SIC: Recall from Eq. (3) that at iteration $q$, when detecting the $l$th layer $x_l$ with the aid of the reconstructed versions of the previously decoded layers $[\hat{x}^q_1, \ldots, \hat{x}^q_{l-1}]$, the signal after interference cancellation may be expressed as:

$$r^q_l = h_l x_l + \sum_{j=l+1}^{L} h_j (x_j - \hat{x}^{q-1}_j) + \sum_{j=1}^{l-1} h_j (x_j - \hat{x}^{q}_j) + n, \qquad (7)$$

where we have $h_l = \rho_l e^{j\theta_l}$, $l \in [1, L]$ and the second term stands for the residual interference imposed by the layers, which have not been detected at iteration $q$, but have already been channel decoded and hence 'de-contaminated' at iteration $(q - 1)$. In particular, we have $\hat{x}^{q-1}_j = 0$ for $q = 1$, $\forall j \in [1, L]$. On the other hand, the third term of Eq. (7) stands for the residual interference arising from the layers, which have already been channel decoded at iteration $q$. It may be readily inferred from Eq. (7) that for a single iteration $q = 1$ and for the perfect reconstruction of $\hat{x}_j = x_j$, $j \in [1, l - 1]$, typically assumed in theoretical studies, the third term of Eq. (7) is cancelled and the interference only arises from the undetected layers $[x_{l+1}, \ldots, x_L]$ constituted by the second term of Eq. (7). We now introduce the more practical scenario when imperfect SIC is encountered.
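The cancellation step of Eq. (7) can be illustrated by a short Python sketch. It is a simplified illustration under stated assumptions: real-valued channel gains are used, and hard decisions stand in for the channel decoding and reconstruction stages, whereas the receiver of Fig. 5 exchanges soft information. The function names are hypothetical.

```python
def cancel_interference(r, h, x_hat_prev, x_hat_curr, l):
    """Signal used for detecting layer l (0-based) at iteration q, cf. Eq. (7).

    r          : received composite sample, r = sum_j h[j] * x[j] + n
    x_hat_prev : estimates of all layers from iteration q - 1 (zeros for q = 1)
    x_hat_curr : estimates already produced at iteration q (entries j < l)
    """
    residual = r
    for j in range(len(h)):
        if j < l:
            residual -= h[j] * x_hat_curr[j]  # layers already decoded at iteration q
        elif j > l:
            residual -= h[j] * x_hat_prev[j]  # estimates from iteration q - 1
    return residual


def hard_sic_iteration(r, h, x_hat_prev):
    """One SIC pass over all layers using BPSK hard decisions (assumption)."""
    x_hat_curr = list(x_hat_prev)
    for l in range(len(h)):
        residual = cancel_interference(r, h, x_hat_prev, x_hat_curr, l)
        x_hat_curr[l] = 1.0 if residual * h[l] >= 0 else -1.0
    return x_hat_curr


# Noise-free toy example with three layers of decreasing amplitude.
gains = [2.0, 1.0, 0.5]
symbols = [1.0, -1.0, 1.0]
received = sum(g * s for g, s in zip(gains, symbols))
estimates = hard_sic_iteration(received, gains, [0.0, 0.0, 0.0])
```

In this toy setting the amplitudes are sufficiently separated for a single pass to recover all three symbols; with noise or a less favourable power allocation, further iterations would be required.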

Focusing our attention on detecting the $l$th layer $x_l$, the objective of the detector is to maximise the a posteriori probability $\Pr(x_l|r^q_l)$. When BPSK is employed, where we have $x_l \in \mathcal{A} = \{\pm 1\}$, $l \in [1, L]$, the corresponding a posteriori probability can be expressed in terms of the numerically more stable Log Likelihood Ratios (LLR) [56], resulting in:

$$L_{det} = \ln\frac{\Pr(x_l = +1|r^q_l)}{\Pr(x_l = -1|r^q_l)} = \underbrace{\ln\frac{p(r^q_l|x_l = +1)}{p(r^q_l|x_l = -1)}}_{\text{extrinsic } L^e_{det}} + \underbrace{\ln\frac{\Pr(x_l = +1)}{\Pr(x_l = -1)}}_{\text{a priori } L^a_{det}}, \qquad (8)$$

where the second equality was derived based on Bayes' rule. The extrinsic information $L^e_{det}$ is delivered by the SIC detector to the channel decoder, while the a priori information $L^a_{det}$ gleaned from the channel decoder is used by the SIC detector, as seen in Fig. 5. To elaborate a little further, the extrinsic information $L^e_{det}$ may be found based on diverse algorithms, such as the Minimum Mean Square Error (MMSE) algorithm [18] or the Joint Gaussian Sum-Product (JGSP) algorithm [57]. On the other hand, the channel decoder invokes a classic soft decoding algorithm, such as the BCJR algorithm [51], processing its input a priori information $L^a_{dec}$ provided by the SIC detector's extrinsic information $L^e_{det}$ and delivers the extrinsic information $L^e_{dec}$ of the codeword, which in turn acts as the a priori information $L^a_{det}$ input to the SIC detector. This input and output information processing forms the basis of a single turbo iteration and its convergence can be visualised and tracked for example by EXtrinsic Information Transfer (EXIT) charts [58], [59], which were also detailed in [2].

3) Complexity versus Performance Tradeoff: When each layer is protected by the same channel code, the receiver complexity of the SPC aided scheme is jointly determined by the complexity of the detection algorithm employed and the number of turbo iterations exchanging extrinsic information between the detection and decoding components. For the sake of further augmenting the complexity versus performance tradeoffs, we assume that each superimposed layer has been allocated the same power and this allows the receiver to have a parallel structure.

Instead of drawing the conventional BER curve, we provide more insightful EXIT charts for characterising the iterative receiver. Firstly, we record the a priori information $L^a_{det}$, $L^a_{dec}$ and the extrinsic information $L^e_{det}$, $L^e_{dec}$ of both the detector and decoder components. This information is then converted to the corresponding Mutual Information (MI), namely to the a priori MI $I^a_{det}$, $I^a_{dec}$ and to the extrinsic MI $I^e_{det}$, $I^e_{dec}$. Since the detector's (decoder's) extrinsic MI acts as the a priori MI for the decoder (detector), in the EXIT charts we alternately swap the abscissa and ordinate axes, depending on which of the two components acts as the source of a priori information, corresponding to the abscissa, as discussed in [58]. Fig. 6 compares the two detectors' EXIT curves for an $L = 8$-layer SPC aided system communicating over an uncorrelated non-dispersive Rayleigh fading channel, when employing an RA code having a code-rate of $r = 1/4$ and operating at $E_b/N_0 = 11$ dB, where we employed a sub-optimum Matched Filter (MF) detector and the optimum Maximum Likelihood (ML) detector.
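The conversion of recorded LLRs to mutual information can be sketched as follows. The text does not specify which estimator was used, so this example assumes the widely used time-averaging estimate $I \approx 1 - \mathrm{E}[\log_2(1 + e^{-xL})]$, which presumes that the transmitted BPSK symbols $x \in \{\pm 1\}$ are known and that a positive LLR favours $x = +1$, as in Eq. (8). The function names and the AWGN test source are illustrative assumptions.

```python
import math
import random


def mutual_information(llrs, symbols):
    """Estimate I(X; L) in bits from LLRs and the transmitted +1/-1 symbols."""
    total = 0.0
    for llr, x in zip(llrs, symbols):
        z = -x * llr
        # log2(1 + exp(z)), evaluated in a numerically stable manner
        total += (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / math.log(2.0)
    return 1.0 - total / len(llrs)


def awgn_llrs(symbols, sigma, rng):
    """LLRs of BPSK symbols observed in real-valued Gaussian noise."""
    return [2.0 * (x + rng.gauss(0.0, sigma)) / sigma ** 2 for x in symbols]


rng = random.Random(1)  # fixed seed, chosen arbitrarily for repeatability
tx = [rng.choice((-1.0, 1.0)) for _ in range(10000)]
mi = mutual_information(awgn_llrs(tx, 1.0, rng), tx)
```

Evaluating such an estimate for the detector's and the decoder's extrinsic LLRs over a range of a priori MI values yields the points of the corresponding EXIT curves.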

At the left of Fig. 6, which corresponds to the interference-limited region, where no Inter-Layer-Interference (ILI) has been cancelled, the ML detector outputs only marginally higher extrinsic information than the MF detector. As the amount of available a priori information increases, the discrepancy between these two EXIT curves becomes more substantial owing to the suboptimal nature of the MF detector. Ultimately, these two curves tend to the corresponding single-layer performance, when the effects of ILI have been more or less eliminated, corresponding to the noise-limited region at the right of Fig. 6. The consistently higher extrinsic information provided by the ML detector comes at the cost of an exponentially increasing detection complexity, which is on the order of $2^L$, while the MF detector only exhibits a linearly increasing detection complexity as a function of $L$.

To provide further insights, the bit-by-bit Monte-Carlo simulation based decoding trajectory is also portrayed in Fig. 6, where each staircase-like step of the trajectory represents a specific turbo iteration between the detector and decoder components. This trajectory demonstrates that in order to achieve a near error-free single-layer performance at the point of perfect extrinsic information associated with $I^e_{dec} = 1$, a sufficiently high number of iterations is required. This implies that there is a tradeoff between the complexity expressed in terms of the number of iterations and the achievable performance. In general, the fewer iterations are used, the lower the extrinsic MI gleaned. Upon appropriately configuring the number of iterations and intelligently activating the various detector algorithms, the hybrid detector [60] characterised in Fig. 6 may be created, whose trajectory switches between those of the two constituent detectors and hence succeeds in maintaining an open EXIT tunnel between the EXIT curves of the activated detector and the decoder. This ensures convergence to a vanishingly low BER.

Fig. 6. EXIT chart analysis of the SPC aided system. [Plot: $I^e_{det}/I^a_{dec}$ versus $I^a_{det}/I^e_{dec}$ for $L = 8$ and $E_b/N_0 = 11$ dB; curves: 1/4 RA code, ML detector, MF detector and the switched trajectory; the interference-limited and noise-limited regions are marked.]

C. Channel Coding for Superposition Coding

When iterative SIC is employed, the construction and employment of appropriate channel codes becomes important, since these codes are expected to be capable of eliminating the residual interference as well as of differentiating the constituent layers. Both of the above-mentioned properties are elaborated on below.

1) The Concept of Random Coding: The design of near-capacity transmission arrangements typically relies on channel-coding schemes. A probabilistic approach is that of using a technique referred to as random coding, which was used to prove the well-known Shannon-coding theorem [8], [16]. The Shannon-coding theorem states that for any rate $R_c < C$, there exists at least one sequence of codes for the memoryless channel ensuring that the probability of error obeys $P_e \rightarrow 0$, as the block length tends to infinity. Conversely, any sequence associated with $P_e \rightarrow 0$ must have $R_c < C$. The detailed proof of Shannon's coding theorem can be found in various standard textbooks [8], [16]. As a result of this proof, there is a high chance that a randomly generated codebook is indeed that of a near-capacity code, provided that we have a sufficiently long block length.

However, this random coding principle does not specify an explicit way of constructing an efficient channel encoding and decoding procedure. Hence, other types of codes, which have attractive practical implementations, have been proposed, such as the family of algebraic codes [5]. It is important to note that two popular channel coding families, namely those of Low Density Parity Check (LDPC) codes [61]–[63] and of concatenated codes [52], [64], are based on this random coding principle and may be referred to as quasi-random code families, where random interleavers are either implicitly or explicitly embedded in the code's construction.

2) The Introduction of Factor Graph: Let us first briefly introduce the graphical representation of binary channel codes [65], which is employed in order to characterise the above-mentioned two classes of quasi-random codes. In [66], Kschischang et al. used the so-called factor graph, which was originally introduced by Tanner [67] as an effective model for the description, construction and decoding of both block-based and convolutional error-correcting codes, where the fundamentals of iterative decoding on graphs were also presented. Typically, factor graphs have two groups of nodes [68], which are referred to as variable nodes and function nodes and are denoted by circles and squares, respectively, as seen in Fig. 7.

A linear block code of rate $R_c = M/N$ can be defined in terms of a $[(N - M) \times N]$-element parity-check matrix $H$, where $M$ and $N$ denote the length of the information bit and coded bit segments, respectively. More specifically, Fig. 7 shows the factor graph of an (8,4) Hamming code having the parity-check matrix $H$ of:

$$H_{8,4} = \begin{bmatrix}
1 & 1 & 1 & 0 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 1 & 0 \\
0 & 1 & 1 & 1 & 0 & 0 & 0 & 1
\end{bmatrix}. \qquad (9)$$

In this figure, there is one variable node for each of the $N = 8$ coded bits (columns), and there is one function node for each of the $(N - M) = 4$ parity-check equations of $H$ (rows). An edge exists between the $i$th variable node and the $j$th parity-check node, if and only if $h_{i,j} = 1$. For example, a binary one in the first column indicates the presence of a parity check link based on the mod 2 connection of the first three input bits. The number $d^f_j$ of the edges connected to the $j$th function node and $d^v_i$ of the edges connected to the $i$th variable node is referred to as the degree of function node $j$ and of variable node $i$, respectively. Furthermore, the function nodes are termed as being regular, since their degrees are the same, i.e. we have $d^f_j = 4$, $j = 1, \ldots, 4$, while the variable nodes are irregular, since their degrees fall into two groups, namely $d^v_i = 3$, $i = 1, \ldots, 4$ and $d^v_i = 1$, $i = 5, \ldots, 8$. There are a total of 16 edges, which coincide with the total number of binary ones in the parity-check matrix of Eq. (9). The term socket refers to a point on a node to which an edge may be attached. There are 16 possible variable node sockets and 16 function node sockets in Fig. 7.
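The node degrees and the edge count quoted above can be checked directly against Eq. (9). The short Python sketch below is merely an illustration; the helper name `node_degrees` is an assumption.

```python
# Parity-check matrix of the (8,4) Hamming code of Eq. (9)
H_8_4 = [
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0, 1],
]


def node_degrees(H):
    """Return (function node degrees, variable node degrees) of H."""
    function_degrees = [sum(row) for row in H]        # ones per row
    variable_degrees = [sum(col) for col in zip(*H)]  # ones per column
    return function_degrees, variable_degrees


d_f, d_v = node_degrees(H_8_4)
num_edges = sum(d_f)  # each binary one in H corresponds to one edge
```

Here the row sums give the function node degrees, the column sums give the variable node degrees, and both sums add up to the number of edges.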

Fig. 7. Factor graph representation of a Hamming (8,4) code, where eight variable nodes $v_i$ (columns) and four function nodes $f_j$ (rows) are denoted by circles and squares, respectively.

The so-called sum-product algorithm [66] is the typical decoding algorithm used in conjunction with the factor graph representation of channel codes. The messages represented in the form of extrinsic LLRs are processed by both the variable nodes as well as by the function nodes, which iteratively exchange their soft estimates along the edges. Let us denote the a priori LLR and extrinsic LLR of a node by $L^a$ and $L^e$, respectively. The sum-product algorithm used during this information exchange process may be summarised as follows [66]:

• $v \rightarrow f$: the message passed along edge $j$ from a variable node to a function node is given by the sum of all the other function nodes' a priori information, which is expressed as: $L^e_j = \sum_{i=1, i \neq j}^{d_v} L^a_i$.

• $f \rightarrow v$: the message passed along edge $j$ from a function node to a variable node is given by the box-plus operation [69] of all the other variable nodes' a priori information, which is expressed as: $L^e_j = 2\tanh^{-1}\left(\prod_{i=1, i \neq j}^{d_f} \tanh(L^a_i/2)\right)$.
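The two message update rules listed above can be sketched in Python as follows. This is a minimal illustration of the node-level computations only, not a complete decoder; the function names and the clipping constant are assumptions.

```python
import math


def variable_to_function(la, j):
    """v -> f: sum of the a priori LLRs on all edges of the node except edge j."""
    return sum(value for i, value in enumerate(la) if i != j)


def function_to_variable(la, j):
    """f -> v: box-plus of the a priori LLRs on all edges except edge j."""
    product = 1.0
    for i, value in enumerate(la):
        if i != j:
            product *= math.tanh(value / 2.0)
    # Clip the product so that atanh remains finite for very reliable inputs.
    limit = 1.0 - 1e-12
    product = max(-limit, min(limit, product))
    return 2.0 * math.atanh(product)


# Example: a degree-3 node with incoming LLRs on edges 0, 1 and 2.
incoming = [2.0, 3.0, -1.0]
to_function = variable_to_function(incoming, 2)  # 2.0 + 3.0
to_variable = function_to_variable(incoming, 2)  # box-plus of 2.0 and 3.0
```

For two inputs $a$ and $b$, the box-plus output equals $\ln[(1 + e^{a+b})/(e^a + e^b)]$, hence its magnitude never exceeds that of the least reliable input.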

3) Quasi-random Codes with Implicit Interleaver: In order to show the implicit interleaver embedded in a linear block code, let us reconsider the above simple (8,4) Hamming code with the aid of the factor graph seen in Fig. 7. We emphasise that an edge connection can be viewed as an interleaver. To elaborate a little further, viewing the ordering of edges from the variable node sockets' side, this interleaver pattern is $\pi = \{1, 5, 9, 2, 6, 13, 3, 10, 14, 7, 11, 15, 4, 8, 12, 16\}$, which states that the variable node socket $i$ is connected to the function node socket $\pi(i)$. For example, as shown by the dotted line, the 6th variable node socket connects to the 13th function node socket, which means that we have $\pi(6) = 13$ according to the above-mentioned interleaver pattern $\pi$. Since the block length of the above Hamming code is short and the number of binary ones in it is high, potentially a large number of short cycles may be encountered, one of which is depicted by the dashed line seen in Fig. 7. It was shown that these short cycles represent an inter-dependence amongst the nodes, hence preventing effective message passing and information updating on the basis of independent soft information, thus resulting in a poor performance. These problems are mitigated by the family of the well-known LDPC codes [61].
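The implicit interleaver may be derived from the parity-check matrix itself. The Python sketch below assumes a particular socket numbering, which is not spelt out in the text: sockets are numbered consecutively node by node, the edges of each variable node are taken in the order of increasing row index and the sockets of each function node are filled in the order of increasing column index. Under this assumption, the routine reproduces the pattern $\pi$ quoted above for the matrix of Eq. (9).

```python
# Parity-check matrix of the (8,4) Hamming code of Eq. (9)
H_8_4 = [
    [1, 1, 1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0, 1],
]


def socket_interleaver(H):
    """Map variable node sockets to function node sockets (1-based indices)."""
    row_weights = [sum(row) for row in H]
    # Index of the first socket (0-based) belonging to each function node
    offsets = [sum(row_weights[:j]) for j in range(len(H))]
    used = [0] * len(H)          # sockets already occupied at each function node
    pattern = []
    for i in range(len(H[0])):   # variable nodes (columns), left to right
        for j in range(len(H)):  # function nodes (rows), top to bottom
            if H[j][i]:
                pattern.append(offsets[j] + used[j] + 1)
                used[j] += 1
    return pattern


pi = socket_interleaver(H_8_4)
```

With this numbering the sixth entry of the pattern is 13, in agreement with the example of $\pi(6) = 13$ given in the text.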

An LDPC code is a linear block code equipped with an implicit interleaver having a parity-check matrix that is sparse, which means that it has a low number of binary one entries. A regular LDPC code construction was proposed by Gallager [61], where the randomly placed binary ones and zeros in the parity-check matrix $H$ are subjected to the constraint that each row of $H$ has the same number $d_f$ of binary ones and similarly, each column of $H$ has the same number $d_v$ of ones. For example, a $(15 \times 20)$-element parity-check matrix may be formulated as Eq. (10), which has $d_f = 4$ as well as $d_v = 3$ and defines an LDPC code with a length of $N = 20$. Hence it may be referred to as a regular $(d_f, d_v, N)$ LDPC code. In such a regular $(d_f, d_v, N)$ LDPC code ensemble, each coded bit is involved in $d_v$ parity checks and each parity check involves $d_f$ coded bits. The fraction of ones in the parity-check matrix of a regular LDPC code is $d_f/N$, which approaches zero, as the block length $N$ becomes large and hence leads to the terminology of low-density parity-check codes.

In fact, LDPC codes constitute quasi-random codes in the sense that the interleaver pattern employed generates the resultant $2^M$ codewords by a random selection from the entire codebook of size $2^N$, when the block length $N$ is large. This is an essential assumption in the context of the random coding principle. However, having a large block length $N$ in general results in a complex decoder, hence full-search based decoding becomes unrealistic. On the other hand, the message passing algorithm [66] may be employed to circumvent the full search, which ensures convergence towards a near-optimum solution. The resultant performance depends on the construction of the parity-check matrix. It was shown in [63] that specifically designed irregular LDPC codes are capable of approaching the Shannonian channel capacity within a small discrepancy of about 0.5 dB. Having introduced the quasi-random LDPC coding approach based on implicit interleavers, let us consider the explicitly-interleaved quasi-random code based counterpart next.

4) Quasi-random Codes with Explicit Interleaver: Apart from LDPC codes, another class of quasi-random codes is constituted by the family of interleaved concatenated random codes [52], [64]. Consider for example transmitting $M = 100$ information bits employing short codes, such as for instance repetition codes having a repetition factor of $\kappa = 4$, which have only two codewords (the all-one and the all-zero). The total number of coded bits becomes $N = 400$. We then invoke a random interleaver, which scrambles these identical consecutive coded bits, resulting in an interleaved repetition code having $2^M$ legitimate codewords, which are randomly selected from a $2^N$-entry codebook. However, this interleaving-aided permutation does not change the code's properties, since one can deinterleave, i.e. descramble the randomised codewords in order to recover the original codewords. In order to prevent this decomposition, another code acting as an inner code may be serially concatenated with the original code acting as an outer code, resulting in a serial concatenated code, as seen in Fig. 8(a). It is known that the inner code should have a recursive property, i.e. an infinite impulse response, so as to provide a good performance by ensuring that the interleaved outer codewords cannot be decomposed. When a simple unity-rate code acting as an accumulator is incorporated, the resultant serial concatenated code becomes a powerful, yet low-complexity RA code [46], where the encoder may be simply represented as:

• Repetition: $[c_{(i-1)\kappa+1}, \ldots, c_{i\kappa}] = b(i)\mathbf{1}_{1 \times \kappa}$, $i \in [1, M]$, where $c$ and $b$ denote the repeated bits and information bits, respectively. Furthermore, $\mathbf{1}_{1 \times \kappa}$ denotes a $(1 \times \kappa)$-component all-one vector.

• Accumulate: $d(i) = \tilde{c}(i) \oplus d(i - 1)$; $d(0) = 0$, where $d$ and $\tilde{c}$ denote the accumulated bits and the interleaved version of $c$, respectively. Furthermore, $\oplus$ denotes modulo-2 addition.
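The repeat, interleave and accumulate steps can be sketched as a short Python routine. This is a minimal illustration of a non-systematic RA encoder; the function name is hypothetical, and the interleaver convention, namely that the $i$th repeated bit is written to position $\pi(i)$ of the accumulator's input, is an assumption borrowed from the socket convention used for the Hamming code above. The pattern used in the example is the one quoted for the factor graph of Fig. 8(b).

```python
def ra_encode(info_bits, kappa, pattern):
    """Repeat each bit kappa times, interleave, then accumulate modulo 2.

    pattern holds 1-based positions: repeated bit i is moved to pattern[i].
    """
    repeated = [bit for bit in info_bits for _ in range(kappa)]
    interleaved = [0] * len(repeated)
    for i, target in enumerate(pattern):
        interleaved[target - 1] = repeated[i]
    accumulated = []
    state = 0                      # d(0) = 0
    for bit in interleaved:
        state ^= bit               # d(i) = c~(i) XOR d(i - 1)
        accumulated.append(state)
    return accumulated


# Three information bits, a length-4 repetition code and the Fig. 8(b) pattern
pi = [3, 5, 1, 6, 4, 9, 11, 2, 7, 12, 8, 10]
codeword = ra_encode([1, 0, 1], 4, pi)
```

Since both the repetition and the accumulation require a constant number of operations per coded bit, the encoding effort of this sketch grows linearly with the codeword length.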

Fig. 8. (a) The block diagram of the repeat accumulate code, consisting of the repetition code (REP), the interleaver ($\Pi$) and the accumulator (ACC). (b) Factor graph representation of a regular RA code, where we have 3 information nodes, a rate-1/4 repetition code and an interleaver pattern of $\pi = \{3, 5, 1, 6, 4, 9, 11, 2, 7, 12, 8, 10\}$. Hence we have a total of 12 check nodes.

RA codes may be viewed both as a serial concatenated code as described above, as well as a special type of LDPC code. Consider the factor graph of a regular non-systematic RA code depicted in Fig. 8(b), where we have 3 information bits and a length-4, i.e. rate-1/4 repetition code. The interleaver pattern viewed from the repetition code's side is $\pi = \{3, 5, 1, 6, 4, 9, 11, 2, 7, 12, 8, 10\}$. The major drawback of LDPC codes is that the conversion of the large parity-check matrix $H$ to the generator matrix $G$ is complex and requires a large memory. The family of RA codes is of particular interest, since it circumvents the above-mentioned problem and it has an encoding complexity, which only linearly rather than exponentially increases with $N$. Furthermore, RA codes are capable of approaching the capacity at a moderate complexity [70].

It is now clear that the design of both types of quasi-random codes relies on the interleaver employed. To demonstrate the importance of the interleaver, we may take a look from the classic coding point of view by considering for example the family of convolutional codes. The free distance of convolutional codes is relatively low for a low code memory, while convolutional codes with a large memory are generally complex and thus may be unattractive. The weakness of convolutional codes with a low memory is that they have a large number of low-weight codewords [5], which may result in persistent errors even at high SINRs. By combining them with appropriately designed interleavers, an additional source of memory is introduced and the resultant code becomes effectively a long random block code. Thus the distance spectrum substantially improves due to interleaving, which is the fundamental benefit of turbo codes [52], [56]. To elaborate a little further, turbo coding employs simple component codes and achieves a near-optimum performance by exchanging extrinsic information between the component decoders separated by interleavers during each iteration.

5) The Necessity of Code Division: When members of the family of quasi-random channel codes are employed in the context of an SPC aided system, the MLM component constitutes an inner code as well, while the receiver becomes a serial concatenation of the SIC detector and of the channel decoder, as seen in Fig. 5. Consider for example an SPC aided communications system transmitting two independent layers generated by the following two different configurations:

• different outer channel codes, the same interleavers;

• the same outer channel codes, different interleavers.

The corresponding factor graphs are depicted in Fig. 9 and Fig. 10. In subfigure (a) of both figures, the function nodes $\psi$ denoted by squares represent the MLM operation, while each side of the function nodes corresponds to one of the two layers, which are interleaved and channel coded separately. Then the iterative SIC receiver detects the first layer by treating the second layer as interference and then performs channel decoding for the first layer. It then detects the second layer by partially subtracting the reconstructed interference of the first layer and then performs channel decoding for the second layer. These operations take place several times, passing soft information from side to side along the edges. According to the first configuration of Fig. 9, when the same interleavers are employed, the graph can be decomposed into subgraphs. Since the iterative SIC receiver has to pass information between the two layers several times, this decomposition is not beneficial from an iterative soft message passing point of view, since it results in short cycles between these subgraphs and introduces a high correlation between the soft information exchanged, regardless of whether the outer channel code employed is the same or different. However, for the second configuration of Fig. 10, although the same channel code is employed, the differently interleaved codewords effectively obey the random coding principle. Hence, the resultant factor graph cannot be decomposed into subgraphs, which in turn enlarges the decoder's full-search-based codebook size when passing information between the two layers.

The $(15 \times 20)$-element parity-check matrix of the regular LDPC code referred to as Eq. (10) is:

$$H_{m,n} = \begin{bmatrix}
1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 1
\end{bmatrix} \qquad (10)$$

These observations suggest that the channel codes employed for an SPC aided communications system should be not only capacity achieving but also distinguishable based on the principle of code division, when we treat the interleaver in conjunction with the channel codes as a constituent component. The most straightforward way to simultaneously increase the grade of randomness as well as to improve the differentiation of codes is to use layer-specific interleavers, which invokes the fundamental principle of IDMA systems. Furthermore, when LDPC codes are employed, the interleavers are redundant, since they have already been implicitly embedded in the generator matrix. Unfortunately, an optimum interleaver pattern design, which is capable of effectively lengthening the cycles and hence improving the efficiency of the iterative decoding process, remains an open problem, especially for a large number of data streams [71]. More explicitly, the problem is tractable, when the block length is short, while this optimisation becomes unrealistic for a long block length. Nonetheless, simple random interleavers are capable of promising a good performance based on the random coding principle.

Let us conclude our discussion of the role of interleavers by revisiting the iterative SIC receiver introduced in Sec. II-B, which consists of a data detector and a bank of $L$ individual single-user channel decoders, where the detector acts as the inner 'decoder' and the channel decoder acts as the outer decoder, as seen in Fig. 5. These two components are separated by random interleavers, which play two primary roles in the SPC aided communications system:

• We may consider the interleavers in conjunction with the outer channel codes as random codes, since they effectively enlarge the memory of the outer channel codes, hence increasing their minimum distance.

• The interleaver employed makes the consecutive coded bits uncorrelated, since any existing statistical dependencies between the intrinsic and extrinsic information degrade the achievable iterative decoding performance.
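A simple way of obtaining the layer-specific random interleavers discussed above is to draw one pseudo-random permutation per layer. The Python sketch below is only an illustration: seeding each permutation with the layer index is an assumption made for the sake of repeatability, and the function names are hypothetical.

```python
import random


def make_interleaver(length, layer_seed):
    """Return a pseudo-random permutation of range(length) for one layer."""
    pattern = list(range(length))
    random.Random(layer_seed).shuffle(pattern)
    return pattern


def interleave(bits, pattern):
    """Output position k carries the input bit at position pattern[k]."""
    return [bits[p] for p in pattern]


def deinterleave(bits, pattern):
    """Invert interleave() for the same pattern."""
    restored = [0] * len(bits)
    for k, p in enumerate(pattern):
        restored[p] = bits[k]
    return restored


# Two layers share the same coded bits, but use different interleavers.
coded = [1, 0, 0, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 1]
patterns = [make_interleaver(len(coded), layer) for layer in range(2)]
layers = [interleave(coded, pattern) for pattern in patterns]
```

The receiver applies `deinterleave` with the pattern of the layer concerned before passing the detector's extrinsic information to the corresponding channel decoder, and `interleave` in the reverse direction.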

III. PRACTICE: SUPERPOSITION CODING AIDED COMMUNICATIONS DESIGN EXAMPLES

Having discussed the principles of an SPC aided communications system in Section II, which covered the fundamental theoretical aspects, the channel coding design and the receiver's architecture, let us now consider two detailed design examples in the field of cooperative communications and HARQ-aided cross-layer design.

A. Cooperative Multiple Access

1) Multiple Source Cooperation Strategy: For the sake of improving the capacity of wireless communications systems, the MIMO concept [13] was shown to be capable of providing both diversity and coding gains in the context of Space-Time Codes (STC) [72] as well as of supporting a high multiplexing gain, when using for example the BLAST architecture [12]. However, at the Mobile Station (MS), it may be impractical to accommodate multiple antennas. Alternatively, the novel concept of cooperative communications [73], [74] allows us to assign the individual MIMO elements to geographically separated single-antenna-aided cooperating MSs, which are no longer prone to shadowing-induced correlated fading, leading to the concept of Virtual MIMOs (VMIMO) [26], [27]. Hence this novel technique is capable of improving the BER performance, while supporting a high throughput as well as providing an improved cell-edge coverage [75]. Recently, the Cooperative Multiple Access (CMA) channel has attracted substantial research interest [28], where multiple sources forming a cluster of cooperating nodes communicate with the destination, which is also known as Multiple Source Cooperation (MSC) [29]–[31].

Consider a cluster of single-antenna-aided sources cooperatively communicating with a destination employing a single receive antenna, which constitute a VMIMO system, as seen in Fig. 11. In this VMIMO cluster, we assume having a total of $N$ Cooperative Sources (CS), which consist of $K$ Active Sources (AS) and $(N - K)$ Relay Sources (RS). Our MSC scheme entails two phases. In the Phase-I cooperation portrayed in Fig. 11, the source information emanating from all $K$ ASs is broadcast to all $N$ CSs in a Time Division Duplex (TDD) manner under the simplifying assumption of having perfect synchronisation. By contrast, Phase-II cooperation is defined as the joint VMIMO transmission of a combined signal generated by the concerted action of all the $N$ CSs. Therefore, in the MSC setting, each CS simultaneously transmits multiple ASs' information, resulting in a high throughput. This implies that each AS is served simultaneously by multiple CSs and hence the entire set of ASs benefits from a high diversity gain.

2) Encoding at Multiple Cooperative Nodes: We now focus our attention on developing channel coding schemes for MSC in the context of Phase-II cooperation. Various relaying techniques may be employed at the CSs, including for example the amplify-forward principle, where the analogue waveform-based signals received at each of the $N$ CSs are amplified and retransmitted to the destination. Alternatively, each of the $N$ CSs may decode its received signal and then re-encode it, followed by its retransmission to the destination, resulting in the so-called decode-forward principle. In the context of the decode-forward relaying technique designed for MSC, which is different from classic time-multiplexing of multiple sources' information, we may employ the SPC technique [39] of Section II, where multiple sources' information is code-multiplexed in order to generate the superimposed and appropriately rotated composite signal. On the other hand, we may employ the so-called NC technique [32] in the physical layer for conveying a linear combination of multiple information streams. Both encoding techniques are discussed in more detail below.

Fig. 9. (a) Factor graph representation of the SPC receiver employing configuration 1, where we employ different channel codes $C_1$ and $C_2$ and the same interleaver. The interleaver pattern is $\pi = \{11, 3, 6, 14, 2, 12, 8, 10, 1, 16, 4, 15, 5, 9, 13, 7\}$. (b) In configuration 1, the original graph is decomposed into subgraphs.

Fig. 10. (a) Factor graph representation of the SPC receiver employing configuration 2, where we employ the same channel code $C_1$ and different interleavers. The interleaver pattern of the left data stream is $\pi = \{11, 3, 6, 14, 2, 12, 8, 10, 1, 16, 4, 15, 5, 9, 13, 7\}$, while the interleaver pattern of the right data stream is $\pi = \{11, 8, 5, 15, 13, 2, 14, 12, 7, 6, 16, 4, 10, 1, 3, 9\}$. (b) In configuration 2, the original graph cannot be decomposed into subgraphs.

Fig. 11. Multiple source cooperation scenario (K ASs, N CSs) employing the decode-forward relaying technique, where the SPC and PANC coding schemes are illustrated. [Diagram panels: Phase-I TDD; Encoding At Nodes (SPC: Interleave, Puncture, Superimpose; PANC: Repeat, Joint Encode, Puncture); Phase-II Joint Tx.]

Fig. 12. Encoding diagram of SPC and PANC in a N = K = 2-source MSC arrangement. The outer channel code-rate of SPC is Rc = 1/8, while the repetition factor of PANC is κ = 4; both schemes have the same effective throughput of η = 1/2.

The philosophy of NC was proposed by Yeung [32] for the sake of enhancing the wired channel’s capacity. Apart from the original network-layer applications, it has recently been recognised that the physical layer of wireless networks also benefits from NC. The most classic application is the two-way communications scenario [76], where sources A and B are communicating with each other with the aid of a relay node C. More explicitly, in the first phase, source A (B) transmits its packet sA (sB) to the relay node C, which then broadcasts a combined packet sR = sA ⊕ sB to both sources A and B. At the intended destination B of the source node A, node B retrieves source A’s packet, when receiving the combined packet sR from the relay node C, by the operation sA = sR ⊕ sB = sA ⊕ sB ⊕ sB. In simple terms, the binary ones in the ⊕ function indicate the positions of different logical values at the source and destination. It is only these symbols which have to be explicitly signalled, while the identical symbols are already known at both ends of the link. However, the extension of NC to MSC is not straightforward, since the unique recovery of the ith information flow si from an aggregate of N modulo-2 superimposed information flows created as s1 ⊕ s2, . . . , sN−1 ⊕ sN is generally impossible [77], simply because there are numerous legitimate decompositions. We therefore generalise the concept of network coding, leading to the so-called Physical-layer Algebraic Network Coding (PANC) scheme.
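The two-way relaying exchange described above may be illustrated by a few lines of Python. This is only a minimal sketch, assuming that the packets are equal-length lists of bits; the helper name `xor_bits` and the example bit patterns are hypothetical and do not originate from the cited literature. The last lines also hint at why a single modulo-2 aggregate cannot be uniquely decomposed without side information.

```python
def xor_bits(a, b):
    """Bitwise modulo-2 addition of two equal-length bit lists."""
    if len(a) != len(b):
        raise ValueError("packets must have equal length")
    return [x ^ y for x, y in zip(a, b)]


# Hypothetical example packets of sources A and B.
s_a = [1, 0, 1, 1, 0, 0, 1, 0]
s_b = [0, 0, 1, 0, 1, 0, 1, 1]

# Relay C broadcasts the combined packet sR = sA xor sB.
s_r = xor_bits(s_a, s_b)

# Each node removes its own (known) packet to recover the other one.
recovered_at_b = xor_bits(s_r, s_b)  # node B obtains sA
recovered_at_a = xor_bits(s_r, s_a)  # node A obtains sB

# Without side information the decomposition is ambiguous: flipping the
# same bit positions in both packets leaves the aggregate unchanged.
flip = [1, 1, 0, 0, 0, 0, 0, 0]
alt_a, alt_b = xor_bits(s_a, flip), xor_bits(s_b, flip)
same_aggregate = xor_bits(alt_a, alt_b) == s_r
```

Here each node relies on its own packet as side information, which is exactly what a destination receiving an aggregate of several unknown flows lacks.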

PANC may be defined as a coding function f(·), which jointly encodes all the incoming multiple information flows. Thus, the PANC of K linearly coded information flows siGi, i ∈ [1, K] becomes equivalent to encoding the vector s = [s1, s2, . . . , sK ] using a nested Generator Matrix (GM), which can be written as:

\begin{align}
\mathbf{c} &= \mathbf{s}_1\mathbf{G}_1 \oplus \mathbf{s}_2\mathbf{G}_2 \oplus \cdots \oplus \mathbf{s}_K\mathbf{G}_K \tag{11} \\
&= [\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_K]\,[\mathbf{G}_1, \mathbf{G}_2, \ldots, \mathbf{G}_K]^T. \tag{12}
\end{align}

It is important to note that the NC concept may be considered as a SPC scheme defined over the Galois Field (GF) 2, while the SPC concept may be considered as a NC scheme defined over the complex-valued field. Furthermore, PANC may be considered as a conventional NC scheme exhibiting a channel coding gain, which is a benefit of the mutual dependencies introduced by the linear modulo-2 addition of multiple streams.
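The equivalence of (11) and (12) may be checked numerically. The following is a minimal sketch, assuming randomly drawn binary generator matrices of hypothetical size 4 × 8 and K = 3 flows; the function names `encode_flowwise` and `encode_nested` are illustrative only, and the nested GM is formed by stacking the individual GMs row-wise, which is how the transpose in (12) is interpreted here.

```python
import numpy as np


def encode_flowwise(flows, gms):
    """Eq. (11): encode every flow separately and add the results modulo 2."""
    codeword = np.zeros(gms[0].shape[1], dtype=np.int64)
    for s_i, g_i in zip(flows, gms):
        codeword ^= (s_i @ g_i) % 2
    return codeword


def encode_nested(flows, gms):
    """Eq. (12): encode the concatenated vector with the stacked (nested) GM."""
    s = np.concatenate(flows)   # [s1, s2, ..., sK]
    g = np.vstack(gms)          # G1, G2, ..., GK stacked row-wise
    return (s @ g) % 2


# Hypothetical dimensions: K = 3 flows of 4 bits, each mapped to 8 coded bits.
rng = np.random.default_rng(0)
K, k_bits, n_bits = 3, 4, 8
gms = [rng.integers(0, 2, size=(k_bits, n_bits), dtype=np.int64) for _ in range(K)]
flows = [rng.integers(0, 2, size=k_bits, dtype=np.int64) for _ in range(K)]

c_flowwise = encode_flowwise(flows, gms)
c_nested = encode_nested(flows, gms)
```

Both routes yield the same codeword, because the block-wise product of the concatenated vector and the stacked matrix is the modulo-2 sum of the individual products.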

Having established the relations between SPC and PANC, let us now introduce their encoding schemes conceptually. As seen in Fig. 11, the SPC technique based encoding is comprised of interleaving, puncturing and superimposing operations, while encoding in PANC is comprised of repeat, joint encode and puncture operations. In SPC, each of the K decoded information streams arriving from the K ASs at the nth CS, namely sk,n, is individually re-encoded by an identical function f(·) and interleaved by a unique, distinguishable interleaver πk,n, resulting in cn,k, which is then individually punctured and finally, all the K punctured information streams are linearly combined, i.e. superimposed. On the other hand, in PANC, all the K decoded information streams that arrived from the K ASs at the nth CS are considered as a single amalgamated information stream, which is jointly repeated, interleaved and superimposed as specified by an appropriately nested GM, where the resultant coded information stream is subjected to puncturing as detailed below. All operations are defined over the GF 2.

Let us for example consider the encoding diagram of the transmitted signal of the first CS in a N = K = 2-source MSC arrangement. The SPC encoding is illustrated on the left of Fig. 12, where the outer channel code-rate of the encoder f(·) is Rc = 1/8 and the streams of K = 2 sources are superimposed, while the PANC encoding procedure is illustrated on the right of Fig. 12, where the repetition factor is κ = 4. Both schemes have the same effective throughput of η = 1/2, i.e. 4 inputs and 8 outputs. It may be seen in Fig. 12 that the PANC arrangement benefits from the joint non-linear encoding of multiple copies of the same information stream, while SPC performs single-stream channel encoding and independent linear superposition modulation.
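The repeat, interleave, joint encode and puncture chain of PANC may be sketched as follows. This is a toy illustration under stated assumptions rather than the encoder of the cited work: the interleavers are pseudo-random permutations, the joint encoder is modelled as a plain accumulator (running modulo-2 sum), and the puncturer simply keeps every second bit so that the overall rate becomes 1/2. The function name `panc_like_encode` and its parameters are hypothetical.

```python
import numpy as np


def panc_like_encode(streams, kappa=4, keep_every=2, seed=0):
    """Toy repeat-interleave-accumulate-puncture chain over GF(2)."""
    rng = np.random.default_rng(seed)
    # Amalgamate all decoded streams into a single information stream.
    s = np.concatenate([np.asarray(x, dtype=np.uint8) for x in streams])
    # Repeat kappa times, each copy with its own pseudo-random interleaver.
    copies = [s[rng.permutation(s.size)] for _ in range(kappa)]
    repeated = np.concatenate(copies)
    # Joint encoding modelled as an accumulator: running XOR of its input.
    accumulated = np.bitwise_xor.accumulate(repeated)
    # Puncture by keeping every keep_every-th coded bit (an assumption).
    return accumulated[::keep_every]


# Hypothetical decoded streams of K = 2 active sources.
s1 = [1, 0, 1, 1, 0, 0, 1, 0]
s2 = [0, 1, 1, 0, 1, 0, 0, 1]

codeword = panc_like_encode([s1, s2])
rate = (len(s1) + len(s2)) / codeword.size
```

With κ = 4 and half of the accumulator output retained, the 16 information bits are mapped to 32 coded bits. Every stage of this sketch is linear over GF(2), so the whole chain could equally be written as a multiplication by a single generator matrix.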

At the destination, iterative receivers are employed for both the SPC and PANC aided MSC, where the detector algorithm used may be identical. However, the soft channel decoder design of SPC aided MSC depends on the choice of the specific outer channel coding function f(·) employed. On the other hand, the soft decoding of a PANC may be carried out by the sum-product algorithm and inherits quasi-random code-like properties [70], where a full decoding iteration comprises a three-stage process, namely the soft-information exchange between the detector and the joint channel decoder-inverse repetition chain. Since the joint encoder is a simple accumulator, we will employ a RA code as the outer channel code f(·), when the SPC scheme is used.

3) Performance of SPC and PANC: Let us now quantify the achievable performance of both coding schemes. We assume error-free Phase-I cooperation, which is achieved with the aid of a Cyclic Redundancy Check (CRC) during each Phase-I transmission and by ensuring that cooperation is only activated by a perfect CRC check. The flat Rayleigh faded channels between the N CSs and the destination are assumed to be independent and are perfectly known at the destination. Block fading is used, where the fades are kept constant during a SPC or PANC codeword, while fading independently between consecutive codewords.

Before comparing these two coding schemes, we firstly define our performance metric set χ, which consists of the achievable throughput η, the BLock Error Ratio (BLER) $P_e^{bl}$, the delay τ and the complexity ι, i.e. we have $\chi = \{\eta, P_e^{bl}, \tau, \iota\}$. By setting the same system throughput η and the same source information segment length of M = 512 symbols, resulting in a fixed delay τ, we compare the two coding schemes in terms of their BLER $P_e^{bl}$ and associated complexity ι. The complexity ι is simply quantified in terms of the number of iterations invoked. The total number of iterations of a SPC aided MSC scheme is the product of the number of detector-decoder outer iterations and the number of inner iterations within the channel decoder, while that of a PANC aided system is deemed to be proportional to the number of iterations exchanging extrinsic information between the detector and the joint channel decoder-inverse repetition chain.

Fig. 13. BLER of the SPC and PANC scheme aided MSC against their upper and lower bounds, where we have K = N = 2 and η = 1/2. [Plot: BLER versus Eb/N0 (dB); curves: outage prob (η = 1/2), SPC (ι = 5 x 20), SPC (ι = 1 x 20), PANC (ι = 20).]

Fig. 13 suggests that both coding schemes are capable of approaching the outage probability bound depicted by the dashed line at the corresponding system throughput η = 1/2, when N = K = 2 sources cooperate in a cluster. For the PANC scheme, the system throughput becomes η = N/κ, where the repetition factor is κ = 4. For the SPC scheme the system throughput is formulated as η = RcNL, where the outer channel code rate is Rc = 1/8 and the number of superimposed layers is L = N = 2. As characterised in Fig. 13, the PANC scheme employing ιPANC = 20 iterations performs within a small fraction of a dB from the SPC scheme, which requires a total of ιSPC = 5 × 20 = 100 iterations, hence the former is deemed to have a lower complexity. If the affordable complexity is not an issue, then the SPC system may outperform the PANC system, as seen in Fig. 13. Since the complexity imposed determines the total power consumption, the PANC scheme may be considered as being more power-efficient.
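The throughput and complexity figures quoted above may be reproduced with exact rational arithmetic. The snippet below is merely a worked check of η = N/κ, η = RcNL and the iteration counts; the function names are hypothetical.

```python
from fractions import Fraction


def panc_throughput(num_sources, repetition_factor):
    """System throughput of PANC: eta = N / kappa."""
    return Fraction(num_sources, repetition_factor)


def spc_throughput(code_rate, num_sources, num_layers):
    """System throughput of SPC: eta = Rc * N * L."""
    return code_rate * num_sources * num_layers


# Parameters quoted in the text for N = K = 2 cooperating sources.
eta_panc = panc_throughput(num_sources=2, repetition_factor=4)
eta_spc = spc_throughput(Fraction(1, 8), num_sources=2, num_layers=2)

# Complexity measured by the number of iterations invoked.
iterations_spc = 5 * 20   # outer detector-decoder times inner decoder iterations
iterations_panc = 20      # detector to joint decoder-inverse repetition chain
complexity_ratio = Fraction(iterations_spc, iterations_panc)
```

Both schemes hence operate at η = 1/2, while the SPC receiver invokes five times as many iterations as the PANC receiver in this configuration.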

B. Multiplexed HARQ

1) The Motivation of M-HARQ: A general packet-based wireless network is constituted by a wired link spanning from a server to an access point and a wireless link from the access point to the MS. The classic Transmission Control Protocol (TCP) supports reliable end-to-end data transmission and facilitates congestion control, where the transmission frame loss due to link errors is often assumed to be negligible in wired TCP networks [78]. However, direct transplantation of the TCP into wireless applications suffers from link impairments, such as radio link attenuation, fading, handover, mobility and co-channel interference. For the sake of avoiding congestion due to physical retransmissions induced by channel errors, link-layer approaches such as HARQ attempt to conceal the channel-induced packet loss events from the TCP-enabled transmitter by reducing the effects of wireless link errors with the aid of channel coding combined with retransmissions on a prompt packet-based timescale [79]. In HARQ, the receiver asks for a packet’s retransmission using the reverse-direction channel with the aid of a single-bit Negative-ACKnowledgement (NACK) flag, whenever its currently decoded packet is deemed to be erroneous based on the decision of the CRC scheme [5]. This solution is appealing, since it does not incur the typical overhead associated with TCP-awareness and yet obeys the TCP semantics. However, this HARQ aided approach introduces extra delay due to local link layer retransmissions, which may potentially lead to a timeout and hence may trigger the slow-start phase of the TCP transmission.

Fig. 14. Classic HARQ and the proposed multiplexed HARQ in conjunction with the number of retransmissions Q = 2 and a total of M transmission packets. [Diagram panels: Conventional HARQ; Proposed HARQ.]

The conventional strategy of transmitting the next new packet only when the successful reception of the current one was confirmed is highly inefficient. However, we may exploit the multiplexing capability inherently provided by channel codes having a channel coding rate less than unity by superimposing different packets and minimising their interference with the aid of their unique, packet-specific interleavers [80]. If the receiver is capable of tolerating a modest amount of additional interference, the next new packet can be simultaneously transmitted with the retransmissions of the previous erroneous packets, as seen in Fig. 14. In other words, the new packets are continuously transmitted, while the erroneous packets are transmitted on a virtual channel, appropriately combined with the new packets, leading to the concept of the SPC aided M-HARQ scheme, which is aimed at improving the overall end-to-end TCP transmission efficiency by reducing the link layer’s hop-by-hop HARQ retransmission delay. As a benefit, the SPC aided M-HARQ scheme is capable of jointly and simultaneously transmitting multiple packets and the proposed solution is equally applicable to both Type I and II HARQ techniques [5]. Hence, the technique advocated may be seamlessly integrated with diverse existing and future systems.

2) The Construction of M-HARQ: In general, different packets require a different number of HARQ retransmissions Q, depending on the instantaneous channel conditions. We consider the worst-case scenario, where each packet exploits the maximum number of retransmissions. In the worst-case scenario considered and when employing the SPC scheme to be introduced shortly, the resultant interference of our M-HARQ arrangement becomes similar to that of the Inter-Symbol-Interference (ISI) effects experienced for transmission over a dispersive channel in the absence of HARQ transmissions. Analogously, our scheme may be interpreted as generating Inter-Packet-Interference (IPI), as seen in Fig. 14.

Let us assume that there are a total of M packets $\mathbf{u}_m$, m = 1, . . . , M in a frame. Generally speaking, the joint encoding function F of the mth transmission can be represented as $F(\mathbf{u}_{a_1}, \ldots, \mathbf{u}_{a_2})$, where we have:

\[
(a_1, a_2) =
\begin{cases}
(m, 1), & 1 \le m \le Q, \\
(m, m - Q), & Q < m \le M, \\
(M, m - Q), & M < m \le M + Q.
\end{cases}
\tag{13}
\]
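The index ranges of (13) may be evaluated by the short sketch below, which assumes the worst-case scenario of the text (every packet uses all Q retransmissions) and M ≥ Q. The function names are hypothetical. The compact min/max form used here agrees with the three cases of (13) under that assumption, and the slot counts at the end are an inference drawn from Fig. 14 and from the range of m in (13), not a result quoted from the paper.

```python
def layer_range(m, M, Q):
    """Return (a1, a2) of Eq. (13) for the m-th transmission, assuming M >= Q."""
    if not 1 <= m <= M + Q:
        raise ValueError("m must satisfy 1 <= m <= M + Q")
    a1 = min(m, M)       # newest packet carried by this transmission
    a2 = max(1, m - Q)   # oldest packet that is still being retransmitted
    return a1, a2


def interference_pattern(packet, M, Q):
    """Number of superimposed layers seen by a packet in each of its Q + 1 attempts."""
    pattern = []
    for m in range(packet, packet + Q + 1):
        a1, a2 = layer_range(m, M, Q)
        pattern.append(a1 - a2 + 1)
    return pattern


def conventional_slots(M, Q):
    """Worst-case number of transmission slots when packets are sent one at a time."""
    return M * (Q + 1)


def multiplexed_slots(M, Q):
    """Worst-case number of transmission slots of M-HARQ, i.e. m = 1, ..., M + Q."""
    return M + Q


# Example with Q = 2 retransmissions and a hypothetical frame of M = 10 packets.
patterns = [interference_pattern(p, M=10, Q=2) for p in (1, 2, 3)]
```

For Q = 2 the first three packets experience 1-2-3, 2-3-3 and 3-3-3 superimposed layers over their three transmission attempts.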

Although in principle specifically designed coding functions may be created, we opt for the previously detailed SPC concept of Section II:

\[
F(\cdot) = \sum_{i=a_2}^{a_1} \rho_i e^{j\theta_i} f_{\mathrm{modu}}\left[ f_{\mathrm{code}}^{\,m-i}(\mathbf{u}_i) \right], \tag{14}
\]

where each superimposed packet is referred to as a layer, while ρi and θi ∈ [0, π) denote the layer-specific amplitude and phase-rotation, respectively. Without loss of generality, an identical amplitude allocation combined with layer-specific uniformly increasing phase rotations are employed for the individual superimposed layers. The benefit of choosing this particular SPC technique is that by opting for this simple linear superposition operation, the specific modulation function fmodu(·) and channel coding function fcode(·) of the individual layers may be retained, where the superscript of the channel coding functions denotes the specific channel code employed for different transmission attempts, which are assumed to be identical.
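The linear superposition of (14) may be sketched as follows. This is an illustrative example under several assumptions: the input layers are taken to be already channel-coded (and interleaved) bit sequences, so fcode(·) is not modelled; fmodu(·) is a Gray-mapped unit-energy QPSK mapper; the amplitudes are identical and normalised to unit total power; and the phase rotations are chosen as θ = lπ/(2L) for layer index l, which is one possible instance of uniformly increasing rotations within [0, π) that accounts for the π/2 rotational symmetry of QPSK. The function names are hypothetical.

```python
import numpy as np


def qpsk_map(bits):
    """Map pairs of bits to unit-energy QPSK symbols (bit 0 -> +1, bit 1 -> -1)."""
    b = np.asarray(bits, dtype=int).reshape(-1, 2)
    return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)


def superimpose(coded_layers):
    """Linearly superimpose equal-amplitude, phase-rotated QPSK layers."""
    num_layers = len(coded_layers)
    rho = 1.0 / np.sqrt(num_layers)           # identical amplitude allocation
    composite = np.zeros(len(coded_layers[0]) // 2, dtype=complex)
    for idx, bits in enumerate(coded_layers):
        theta = idx * np.pi / (2 * num_layers)  # assumed rotation rule
        composite += rho * np.exp(1j * theta) * qpsk_map(bits)
    return composite


# Two hypothetical coded packets of 8 bits each, i.e. 4 QPSK symbols per layer.
layers = [[0, 0, 1, 1, 0, 1, 1, 0],
          [1, 0, 0, 1, 1, 1, 0, 0]]
x = superimpose(layers)
```

Since the superposition is a plain weighted sum, the modulation and channel coding of each layer remain untouched, which is the benefit emphasised in the text.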

The M-HARQ scheme employs iterative detection and channel decoding exchanging extrinsic information between these two receiver components. The choice of the decoding algorithm depends on the specific channel code employed. However, a host of detection schemes may be invoked, where we may opt for employing the low-complexity SIC of Fig. 5 discussed in Section II-B, whose detection complexity increases linearly with the number of superimposed packets. The previously detected packets may provide a priori information for the current detection even if they were partially erroneous and the soft-detected packets generated from all different transmission attempts may be appropriately combined before soft-decoding. Alternatively, they may be individually soft-decoded without requiring the buffering of the previous transmission packets.

3) The Performance of M-HARQ: Fig. 15 shows the link layer Packet Error Ratio (PER) performance of the M-HARQ arrangement against that of the conventional scheme for a total of Q + 1 = 3 transmissions employing Type-I HARQ. In our simulations, each packet of length Ni = 256 bits is QPSK modulated and channel coded by a rate-1/3 irregular systematic RA code [46]. A Rayleigh distributed block-fading channel is used and the feedback channel conveying the NACK indicator is assumed to be error-free. Again, we consider the worst-case scenario, where each of the M packets employs the maximum affordable number of Q = 2 retransmissions. We investigate the PER of all the (Q + 1) transmissions for each of the first (Q + 1) packets, since they correspond to different typical interference patterns Ω. For instance, when Q = 2 is considered, the number of layers for each of the 3 transmissions of the first 3 packets is given by Ωpck1 = [1, 2, 3], Ωpck2 = [2, 3, 3] and Ωpck3 = [3, 3, 3]. Fig. 15 suggests that during the first transmission the PER performance of our proposed scheme is the same as that of the conventional scheme. By contrast, for two and three transmissions, an observable but marginal PER degradation is imposed by our proposed scheme compared to that of the conventional one. Apart from this slight difference, all packets experience a near-identical PER performance.

Fig. 15. The PER performance of all Q + 1 = 3 transmissions for both the conventional HARQ and for the first Q + 1 = 3 packets of the proposed M-HARQ scheme. [Plot: PER versus Eb/N0 (dB); curves: Conventional; Proposed, packet 1; Proposed, packet 2; Proposed, packet 3; grouped by 1st, 2nd and 3rd transmission.]

The normalised effective throughput achieved at the TCP layer may be measured by the so-called mean transmission frame arrival rate λ encountered, which is determined by the average number of TCP frames successfully transmitted within the average Round Trip Time (RTT) [81] constraint. Fig. 16 compares the mean successful frame arrival rate λ recorded for a Poissonian source frame generation process for both the conventional HARQ scheme and for the M-HARQ arrangement at the TCP layer. Two different phenomena may be observed in Fig. 16, namely the ’buffer-size limited’ and the ’Frame Error Ratio (FER) limited’ situations seen at the right and left of the figure, respectively. More explicitly, at low Eb/N0 values, λ is limited by the high FER imposed by the channel, which activates more retransmissions. On the other hand, λ is limited by the finite buffer-size of B TCP frames, when the FER is low, such as $P_e^f \le 0.02$ for our proposed scheme and $P_e^f \le 0.003$ for the conventional one. Although both schemes reach the same maximum value of λ bounded by the buffer-size B, the proposed M-HARQ arrangement substantially improves the mean successful TCP frame arrival rate at low Eb/N0 values, when comparing points A and C in Fig. 16. This clearly implies that the SPC aided M-HARQ scheme is more tolerant to frame error events and hence has a higher end-to-end throughput.

Fig. 16. The mean arrival rate for both the conventional HARQ and for the proposed M-HARQ scheme. [Plot: mean arrival rate λ versus Eb/N0 (dB) for the proposed and conventional schemes with increasing buffer size B = [1, 4, 8, 12, 16, 20]; annotations: FER-limited regions of both schemes, maximum gain of approximately 5 dB, point A: pe = 0.0183, point C: pe = 0.0028.]

The M-HARQ scheme is based on the SPC approach and hence the resultant composite packet of multiple superimposed layers becomes effectively ’interference-limited’. Therefore, the per-layer throughput should not be excessive in order to ensure that the decoded PER approaches the single-layer best-case performance. More explicitly, this requirement discourages the employment of high-throughput, but interference-sensitive, high-order modulation schemes. Furthermore, relatively low-rate channel codes are preferred for the sake of supporting the transmission of multiple superimposed layers at a near-single-layer PER performance. Since the number of retransmissions is typically low in practice, so is the number of superimposed layers. This makes the SPC aided M-HARQ scheme particularly suitable for delay-constrained low-rate applications providing cell-edge users with an improved transmission integrity.

IV. CONCLUSION

We may conclude that the SPC technique is capable of providing both a high bandwidth efficiency and power efficiency because of its non-orthogonal code-multiplexing nature, which theoretically approaches the channel capacity, as discussed in Section II-A3. For example, in the HARQ scenario of Section III-B employing the SPC technique, the amount of delay imposed by local retransmissions through conventional time-multiplexing of retransmitted packets is reduced, which ultimately reduces the end-to-end delay. We may also conclude that the SPC technique is practically implementable, where the error propagation effects of SIC may be overcome with the aid of using an iterative receiver. This is evidenced by our cooperative communications design example of Section III-A, where the performance of a SPC aided MSC arrangement was shown to approach the lower bound of the outage probability. Finally, the concept of SPC has an inherent linkage with the concept of NC, where the former is defined over the complex-valued field, while the latter is defined over GF 2. This may be verified by our example provided in Fig. 12 of Section III-A as detailed in the literature [82], [83].

TABLE I
ACRONYMS

3GPP | Third Generation Partnership Project | AWGN | Additive White Gaussian Noise
AS | Active Sources | BER | Bit Error Ratio
BLAST | Bell-Labs Layered Space Time Architecture | BLER | BLock Error Ratio
CMA | Cooperative Multiple Access | CRC | Cyclic Redundancy Check
CS | Cooperative Sources | DL | DownLink
DPC | Dirty Paper Coding | DSP | Digital Signal Processing
EXIT | EXtrinsic Information Transfer | FDE | Frequency Domain Equalisation
FDMA | Frequency Division Multiple Access | FER | Frame Error Ratio
GF | Galois Field | GM | Generator Matrix
GMA | Gaussian Multiple Access | HARQ | Hybrid Automatic Repeat reQuest
HSPA | High Speed Packet Access | IDMA | Interleave Division Multiple Access
ILI | Inter-Layer-Interference | IPI | Inter-Packet-Interference
ISI | Inter-Symbol-Interference | JGSP | Joint Gaussian Sum-Product
LDC | Linear Dispersion Codes | LDPC | Low Density Parity Check
LLR | Log Likelihood Ratios | LTE | Long Term Evolution
MF | Matched Filter | M-HARQ | Multiplexed HARQ
MI | Mutual Information | MIMO | Multiple Input Multiple Output
ML | Maximum Likelihood | MLC | MultiLevel Coding
MLM | Multiple Level Modulation | MMSE | Minimum Mean Square Error
MSC | Multiple Source Cooperation | NACK | Negative-ACKnowledgement
NC | Network Coding | OFDM | Orthogonal Frequency Division Multiplexing
PANC | Physical-layer Algebraic Network Coding | PER | Packet Error Ratio
PSK | Phase Shift Keying | QAM | Quadrature Amplitude Modulation
RA | Repeat Accumulate | RS | Relay Sources
RTT | Round Trip Time | SDMA | Spatial Division Multiple Access
SIC | Successive Interference Cancellation | SINR | Signal to Interference plus Noise Ratio
SPC | SuperPosition Coding | STC | Space-Time Codes
TCP | Transmission Control Protocol | TCMA | Trellis Code Multiple Access
TDD | Time Division Duplex | UL | UpLink
VMIMO | Virtual MIMOs | |

However, there are also some open problems associated with the SPC technique. The first problem is its relatively high Peak to Average Power Ratio (PAPR). When the number of layers of the SPC scheme is sufficiently high, the resultant signalling constellation becomes near-Gaussian distributed, which has the drawback of a high PAPR from an implementation point of view, since linear amplifiers having a high dynamic range are costly. This may be overcome for example with the aid of peak-clipping and compensation [84] or amplifier-linearisation [11]. Another problem associated with the SPC technique is the power allocation issue. Although a range of power allocation methods are available, when non-ideal channel codes are employed, the power allocation should be jointly designed with practical codes. The impact of practical receiver architectures should also be taken into account, since even the number of iterations employed at the receiver would change the power allocation design [85]. In this case, we may seek deterministic optimisation solutions, such as the employment of linear programming techniques [86].
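The growth of the PAPR with the number of superimposed layers, raised above as the first open problem, may be illustrated by exhaustively enumerating a small composite constellation. This is a minimal sketch, assuming equal-amplitude BPSK layers normalised to unit total power and, optionally, rotations of θ = lπ/L for layer index l; these choices and the function name `spc_papr` are illustrative assumptions rather than the configuration of any cited scheme.

```python
import cmath
import itertools
import math


def spc_papr(num_layers, rotate=True):
    """Peak-to-average power ratio of an equal-amplitude BPSK superposition."""
    rho = 1.0 / math.sqrt(num_layers)
    thetas = [idx * math.pi / num_layers if rotate else 0.0
              for idx in range(num_layers)]
    powers = []
    # Enumerate all 2^L sign combinations of the superimposed layers.
    for signs in itertools.product((1, -1), repeat=num_layers):
        point = sum(rho * cmath.exp(1j * t) * s for t, s in zip(thetas, signs))
        powers.append(abs(point) ** 2)
    return max(powers) / (sum(powers) / len(powers))


# PAPR (linear scale) without rotation for an increasing number of layers.
papr_unrotated = {L: spc_papr(L, rotate=False) for L in (1, 2, 4, 8)}
```

Without phase rotation the peak occurs when all layers add coherently, so the PAPR of this toy constellation grows linearly with the number of layers, while suitable rotations may reduce it for a small number of layers.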

In conclusion, this tutorial provided a comprehensive treatment of SPC aided communications systems, outlining its potential applications in future wireless networks and we closed with the portrayal of a range of open design challenges.

APPENDIX

Acronyms (see Table I.)

REFERENCES

[1] S. Shakkottai, T. S. Rappaport, and P. C. Karlsson, “Cross-layer design for wireless networks,” IEEE Commun. Mag., vol. 41, pp. 74–80, Oct. 2003.

[2] L. Hanzo, O. Alamri, M. El-Hajjar, and N. Wu, Near-Capacity Multi-functional MIMO Systems. IEEE Press - John Wiley, April 2009.

[3] L. Hanzo, M. Munster, B. J. Choi, and T. Keller, OFDM and MC-CDMA for Broadcasting Multi-User Communications, WLANs and Broadcasting. Wiley-IEEE Press, 2003.

[4] L. Hanzo, C. H. Wong, and M. S. Yee, Adaptive Wireless Transceivers: Turbo-Coded, Space-Time Coded TDMA, CDMA and OFDM Systems. Wiley-IEEE Press, 2002.

[5] S. Lin and D. J. Costello, Error Control Coding: Fundamentals and Applications, 2nd ed. New York: Prentice-Hall, Inc., 2005.

[6] “Third Generation Partnership (3GPP),” Technical specification group radio access network, 1999, ftp://ftp.3gpp.org/.

[7] H. Ekstrom, A. Furuskar, J. Karlsson, M. Meyer, S. Parkvall, J. Torsner, and M. Wahlqvist, “Technical solutions for the 3G long-term evolution,” IEEE Commun. Mag., vol. 44, pp. 38–45, Mar. 2006.

[8] T. Cover and J. Thomas, Elements of Information Theory. New York: Wiley, 2004.

[9] A. J. Goldsmith, Wireless Communication. Cambridge University Press, 2005.

[10] D. N. C. Tse and P. Viswanath, Fundamentals of Wireless Communication. Cambridge University Press, 2005.

[11] L. Hanzo, S. X. Ng, T. Keller, and W. T. Webb, Quadrature Amplitude Modulation: From Basics to Adaptive Trellis-Coded, Turbo-Equalised and Space-Time Coded OFDM, CDMA and MC-CDMA Systems. Wiley-IEEE Press, 2004.

[12] G. J. Foschini and M. J. Gans, “On limits of wireless communications in a fading environment when using multiple antennas,” Wireless Personal Communications, vol. 6, pp. 311–355, Mar. 1998.

[13] L. Hanzo, J. Blogh, and S. Ni, 3G, HSDPA, HSUPA and Intelligent FDD versus TDD Networking: Smart Antennas and Adaptive Modulation. IEEE Press - John Wiley, 2008.

[14] C. E. Shannon, “A mathematical theory of communication,” Bell Syst. Tech. J., vol. 27, pp. 379–423, 623–656, 1948.

[15] A. R. Calderbank and L. H. Ozarow, “Non-equiprobable signalling on the Gaussian channel,” IEEE Trans. Inf. Theory, vol. 36, pp. 726–740, Jan. 1990.

[16] J. G. Proakis, Digital Communications. New York: McGraw-Hill, 2001.

[17] D. J. C. MacKay, Information Theory, Inference and Learning Algorithms. Cambridge University Press, 2003.

[18] X. D. Wang and H. V. Poor, “Iterative (turbo) soft interference cancellation and decoding for coded CDMA,” IEEE Trans. Commun., vol. 47, pp. 1046–1061, July 1999.

[19] B. Hassibi and B. M. Hochwald, “High-rate codes that are linear in space and time,” IEEE Trans. Inf. Theory, vol. 48, pp. 1804–1824, June 2002.

[20] M. Costa, “Writing on dirty paper,” IEEE Trans. Inf. Theory, vol. 29, pp. 439–441, May 1983.

[21] P. Wang, J. Xiao, and P. Li, “Comparison of orthogonal and non-orthogonal approaches to future wireless cellular systems,” IEEE Veh. Technol. Mag., vol. 1, pp. 4–11, Sept. 2006.

[22] F. Brannstrom, T. M. Aulin, and L. K. Rasmussen, “Iterative detectors for trellis-code multiple-access,” IEEE Trans. Commun., vol. 50, pp. 1478–1485, Sept. 2002.

[23] F. Brannstrom, T. M. Aulin, and L. K. Rasmussen, “Iterative multi-user detection of trellis code multiple access using a posteriori probabilities,” in Proc. IEEE ICC ’01, vol. 1, Helsinki, Finland, June 11–14, 2001, pp. 11–15.

[24] P. Li, L. H. Liu, K. Y. Wu, and W. K. Leung, “Interleave-division multiple-access,” IEEE Trans. Wireless Commun., vol. 5, pp. 938–947, Apr. 2006.

[25] H. Schoeneich and P. A. Hoeher, “Adaptive interleave-division multiple access - A potential air interface for 4G bearer services and wireless LANs,” in Proc. WOCN ’04, Muscat, Oman, June 7–9, 2004, pp. 179–182.

[26] A. Sendonaris, E. Erkip, and B. Aazhang, “User cooperation diversity. Part I. System description,” IEEE Trans. Commun., vol. 51, pp. 1927–1938, Nov. 2003.

[27] A. Sendonaris, E. Erkip, and B. Aazhang, “User cooperation diversity. Part II. Implementation aspects and performance analysis,” IEEE Trans. Commun., vol. 51, pp. 1939–1948, Nov. 2003.

[28] K. Azarian, H. E. Gamal, and P. Schniter, “On the achievable diversity multiplexing tradeoff in half-duplex cooperative channels,” IEEE Trans. Inf. Theory, vol. 51, pp. 4152–4172, Dec. 2005.

[29] O. Shalvi, “Multiple source cooperation diversity,” IEEE Commun. Lett., vol. 8, pp. 712–714, Dec. 2004.

[30] A. Ribeiro, R. Q. Wang, and G. B. Giannakis, “Multi-source cooperation with full-diversity spectral-efficiency and controllable-complexity,” IEEE J. Sel. Areas Commun., vol. 25, pp. 415–425, Feb. 2007.

[31] R. Zhang and L. Hanzo, “Interleaved random space-time coding for multisource cooperations,” IEEE Trans. Veh. Technol., vol. 58, pp. 2120–2125, May 2009.

[32] R. Ahlswede, N. Cai, S. Y. R. Li, and R. W. Yeung, “Network information flow,” IEEE Trans. Inf. Theory, vol. 46, pp. 1204–1216, July 2000.

[33] Y. D. Chen, S. Kishore, and J. Li, “Wireless diversity through network coding,” in Proc. IEEE WCNC ’06, Las Vegas, USA, Apr. 3–6, 2006, pp. 1681–1686.

[34] R. Zhang and L. Hanzo, “Coding schemes for energy efficient multi-source cooperation aided uplink transmission,” IEEE Signal Process. Lett., vol. 16, pp. 438–441, May 2009.

[35] R. Zhang and L. Hanzo, “Superposition-coding aided multiplexed hybrid ARQ scheme for improved end-to-end transmission efficiency,” IEEE Trans. Veh. Technol. [Online]. Available: http://eprints.ecs.soton.ac.uk/17230/

[36] S. Verdu, “Fifty years of Shannon theory,” IEEE Trans. Inf. Theory,vol. 44, pp. 2057–2078, Oct. 1998.

[37] G. Caire, S. Guemghar, A. Roumy, and S. Verdu, “Maximizing thespectral efciency of coded cdma under successive decoding,” IEEETrans. Inf. Theory, vol. 50, pp. 152–164, Jan. 2004.

[38] M. H. DeGroot and M. J. Schervish, Probability and Statistics. AddisonWesley, 2001.

[39] E. G. Larsson and B. R. Vojcic, “Cooperative transmit diversity based onsuperposition modulation,” IEEE Commun. Lett., vol. 9, pp. 778–780,Sept. 2005.

[40] X. Ma and P. Li, “Coded modulation using superimposed binary codes,”IEEE Trans. Inf. Theory, vol. 50, pp. 3331–3343, Dec. 2004.

[41] H. Imai and S. Hirakawa, “A new multilevel coding method using error-correcting codes,” IEEE Trans. Inf. Theory, vol. 23, pp. 371–377, May1977.

[42] M. Isaka and H. Imai, “On the iterative decoding of multilevel codes,”IEEE J. Sel. Areas Commun., vol. 19, pp. 935–943, May 2001.

[43] R. Roy, “Spatial division multiple access technology and its application to wireless communication systems,” in Proc. IEEE VTC 97-Spring, Phoenix, USA, May 4–7, 1997, pp. 730–734.

[44] P. A. Hoeher and H. Schoeneich, “Interleave-division multiple access from a multiuser point of view,” in Proc. 5th Int. Symposium on Turbo Codes and Related Topics in connection with the 6th Int. ITG-Conference on Source and Channel Coding, Munich, Germany, Apr. 3–7, 2006, pp. 140–144.

[45] B. Rimoldi and R. Urbanke, “A rate-splitting approach to the Gaussian multiple-access channel,” IEEE Trans. Inf. Theory, vol. 42, pp. 364–375, Mar. 1996.

[46] H. Jin, A. Khandekar, and R. J. McEliece, “Irregular repeat-accumulate codes,” in Proc. 2nd International Conference on Turbo Codes, Munich, Germany, Sept. 4–7, 2000, pp. 125–127.

[47] H. Jin, Analysis and Design of Turbo-like Codes. Ph.D. Thesis, California Institute of Technology, Pasadena, 2001.

[48] S. Verdu, Multiuser Detection. Cambridge University Press, 1998.

[49] A. Duel-Hallen, J. Holtzman, and Z. Zvonar, “Multiuser detection for CDMA systems,” IEEE Pers. Commun. Mag., pp. 53–66, Apr. 1995.

[50] M. Kobayashi, J. Boutros, and G. Caire, “Successive interference cancellation with SISO decoding and EM channel estimation,” IEEE J. Sel. Areas Commun., vol. 19, pp. 1450–1460, Aug. 2001.

[51] L. Hanzo, T. H. Liew, and B. L. Yeap, Turbo Coding, Turbo Equalisation and Space-Time Coding for Transmission over Fading Channels. Wiley-IEEE Press, 2002.

[52] C. Berrou and A. Glavieux, “Near optimum error-correcting coding and decoding: turbo codes,” IEEE Trans. Commun., vol. 44, pp. 1261–1271, Oct. 1996.

[53] J. Hagenauer, “The turbo principle in mobile communications,” in Proc. 2004 Nordic Radio Symposium, Oulu, Finland, Aug. 16–18, 2004.

[54] H. V. Poor, “Turbo multiuser detection: A primer,” J. Commun. Netw., vol. 3, pp. 196–201, Sept. 2001.

[55] Z. N. Shi and C. B. Schlegel, “Iterative multiuser detection and error control code decoding in random CDMA,” IEEE Trans. Signal Process., vol. 54, pp. 1886–1895, May 2006.

[56] B. Sklar, “A primer on turbo code concepts,” IEEE Commun. Mag., vol. 35, pp. 94–102, Dec. 1997.

[57] Q. H. Guo and P. Li, “LMMSE turbo equalization based on factor graphs,” IEEE J. Sel. Areas Commun., vol. 26, pp. 311–319, Feb. 2008.

[58] S. ten Brink, “Convergence behavior of iteratively decoded parallel concatenated codes,” IEEE Trans. Commun., vol. 49, pp. 1727–1737, Oct. 2001.

[59] K. Li and X. D. Wang, “EXIT chart analysis of turbo multiuser detection,” IEEE Trans. Wireless Commun., vol. 4, pp. 300–311, Jan. 2005.

[60] R. Zhang, L. Xu, S. Chen, and L. Hanzo, “Repeat accumulate code division multiple access and its hybrid detection,” in Proc. IEEE ICC ’08, Beijing, China, May 19–23, 2008, pp. 4790–4794.

[61] R. G. Gallager, Low-Density Parity-Check Codes. MIT Press, Cambridge, 1963.

[62] S. Y. Chung, T. J. Richardson, and R. Urbanke, “Analysis of sum-product decoding of low-density parity-check codes using a Gaussian approximation,” IEEE Trans. Inf. Theory, vol. 47, pp. 657–670, Feb. 2001.

[63] T. J. Richardson, M. A. Shokrollahi, and R. L. Urbanke, “Design of capacity-approaching irregular low-density parity-check codes,” IEEE Trans. Inf. Theory, vol. 47, pp. 619–637, Feb. 2001.

[64] S. Benedetto, D. Divsalar, G. Montorsi, and F. Pollara, “Serial concatenation of interleaved codes: performance analysis, design, and iterative decoding,” IEEE Trans. Inf. Theory, vol. 44, pp. 909–926, May 1998.

[65] C. B. Schlegel and L. C. Perez, Trellis and Turbo Coding. IEEE Press - John Wiley, 2004.

[66] F. Kschischang, B. Frey, and H. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inf. Theory, vol. 47, pp. 498–519, Feb. 2001.

[67] R. M. Tanner, “A recursive approach to low complexity codes,” IEEE Trans. Inf. Theory, vol. 27, pp. 533–547, Sept. 1981.

[68] N. Wiberg, “Codes and decoding on general graphs,” Ph.D. dissertation, Linkoping University, Linkoping, Sweden, 1996.

[69] J. Hagenauer, E. Offer, and L. Papke, “Iterative decoding of binary block and convolutional codes,” IEEE Trans. Inf. Theory, vol. 42, pp. 429–445, Mar. 1996.

[70] S. ten Brink and G. Kramer, “Design of repeat-accumulate codes for iterative detection and decoding,” IEEE Trans. Signal Process., vol. 51, pp. 2764–2772, Nov. 2003.

[71] A. Tarable, G. Montorsi, and S. Benedetto, “Analysis and design of interleavers for iterative multiuser receivers in coded CDMA systems,” IEEE Trans. Inf. Theory, vol. 51, pp. 1650–1666, May 2005.

[72] S. M. Alamouti, “A simple transmit diversity technique for wireless communications,” IEEE J. Sel. Areas Commun., vol. 16, pp. 1451–1458, Oct. 1998.

[73] J. N. Laneman, D. N. C. Tse, and G. W. Wornell, “Cooperative diversity in wireless networks: Efficient protocols and outage behavior,” IEEE Trans. Inf. Theory, vol. 50, pp. 3062–3080, Dec. 2004.

[74] G. Kramer, M. Gastpar, and P. Gupta, “Cooperative strategies and capacity theorems for relay networks,” IEEE Trans. Inf. Theory, vol. 51, pp. 3037–3063, Sept. 2005.

[75] R. Pabst, B. H. Walke, D. C. Schultz, P. Herhold, H. Yanikomeroglu, S. Mukherjee, H. Viswanathan, M. Lott, W. Zirwas, M. Dohler, H. Aghvami, D. D. Falconer, and G. P. Fettweis, “Relay-based deployment concepts for wireless and mobile broadband radio,” IEEE Commun. Mag., vol. 42, pp. 80–89, Sept. 2004.

[76] C. Hausl and J. Hagenauer, “Iterative network and channel decoding for the two-way relay channel,” in Proc. IEEE ICC ’06, Istanbul, Turkey, June 11–15, 2006, pp. 1568–1573.

[77] T. R. Wang and G. B. Giannakis, “Complex field network coding for multiuser cooperative communications,” IEEE J. Sel. Areas Commun., vol. 26, pp. 561–571, Apr. 2008.

[78] W. R. Stevens, TCP/IP Illustrated, Volume I: The Protocols. MA: Addison-Wesley, 1994.

[79] A. Bakre and B. R. Badrinath, “Improving TCP over wireless through adaptive link layer setting,” in Proc. IEEE GLOBECOM ’01, San Antonio, USA, Nov. 25–29, 2001, pp. 1766–17 703.

[80] R. Zhang and L. Hanzo, “Three design aspects of multicarrier interleave division multiple access,” IEEE Trans. Veh. Technol., vol. 57, pp. 3607–3617, Nov. 2008.

[81] L. Kleinrock, Queueing Systems, Volume I and II. New York: Wiley, 1976.

[82] N. Fawaz, D. Gesbert, and M. Debbah, “When network coding and dirty paper coding meet in a cooperative ad hoc network,” IEEE Trans. Wireless Commun., vol. 7, pp. 1862–1867, May 2008.

[83] L. Xiao, T. Fuja, J. Kliewer, and D. Costello, “A network coding approach to cooperative diversity,” IEEE Trans. Inf. Theory, vol. 53, pp. 3714–3722, Oct. 2007.

[84] J. Tong and P. Li, “Iterative decoding of superposition coding,” in Proc. 5th Int. Symposium on Turbo Codes and Related Topics in conjunction with the 6th Int. ITG-Conference on Source and Channel Coding, Munich, Germany, Apr. 3–7, 2006, pp. 135–139.

[85] P. Wang, P. Li, and L. H. Liu, “Power allocation for multiple access systems with practical coding and iterative multiuser detection,” in Proc. IEEE ICC ’06, Istanbul, Turkey, June 11–15, 2006, pp. 260–269.

[86] J. Boutros and G. Caire, “Iterative multiuser decoding: unified framework and asymptotic performance analysis,” IEEE Trans. Inf. Theory, vol. 48, pp. 1772–1793, 2002.

Rong Zhang received the BEng degree in Communications Engineering from Southeast University, Nanjing, China, in 2003; the MSc degree with Distinction in Radio Frequency Communications Engineering from the University of Southampton, U.K., in 2005; and the PhD degree in Wireless Communications from the University of Southampton, U.K., in 2009. He was a system engineer with the Mobile Communications Division of China Telecom from 2003 to 2004 and is currently a Research Fellow with the Communications Research Group, School of Electronics and Computer Science, University of Southampton. He is a member of the IEEE and was awarded a joint EPSRC and Mobile VCE scholarship in 2006. His current research interests include multiple access techniques, MIMO and cooperative communications.

Lajos Hanzo FREng, FIEEE, FIET, DSc received his degree in electronics in 1976 and his doctorate in 1983. During his 34-year career in telecommunications he has held various research and academic posts in Hungary, Germany and the UK. Since 1986 he has been with the School of Electronics and Computer Science, University of Southampton, UK, where he holds the chair in telecommunications. He has co-authored 19 John Wiley - IEEE Press books on mobile radio communications totalling in excess of 10 000 pages, published 684 research papers at IEEE Xplore, acted as TPC Chair of IEEE conferences, presented keynote lectures and been awarded a number of distinctions. Currently he is directing an academic research team, working on a range of research projects in the field of wireless multimedia communications sponsored by industry, the Engineering and Physical Sciences Research Council (EPSRC) UK, the European IST Programme and the Mobile Virtual Centre of Excellence (VCE), UK. He is an enthusiastic supporter of industrial and academic liaison and he offers a range of industrial courses. He is also an IEEE Distinguished Lecturer as well as a Governor of both the IEEE ComSoc and the VTS. He is the acting Editor-in-Chief of the IEEE Press. For further information on research in progress and associated publications please refer to http://www-mobile.ecs.soton.ac.uk