

3198 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 60, NO. 11, NOVEMBER 2012

Correction of Mismatched L-values in BICM Receivers

Leszek Szczecinski, Senior Member, IEEE

Abstract—In this work we analyze the problem of correction of the reliability metrics (L-values) in bit-interleaved coded modulation (BICM) receivers. First, we propose a method for finding the linear correction factors that minimize the probability of error of a maximum likelihood decoder that uses the corrected L-values. To this end, we use an efficient approximation of the pairwise error probability in the domain of the cumulant generating functions (CGF) of the L-values and conclude that the optimal correction factors are equal to twice the value of the saddlepoint of the CGF. Next, a simple extension of the proposed method to non-linear correction is presented. We provide numerical examples demonstrating the improvements attainable with the proposed method and compare it with competing solutions.

Index Terms—Generalized mutual information, linear correction, LLR, logarithmic likelihood ratio, L-value, maximum likelihood decoding, mismatched decoding, mismatched L-values, ML, non-linear correction, pairwise error probability, PEP.

I. INTRODUCTION

The logarithmic likelihood ratios (LLRs, or L-values) calculated at the receiver for the transmitted bits are a convenient representation of the likelihood of the observations and are often used in all of the processing operations in the receiver (such as “soft” detection, decoding, iterative processing, etc.). In this work we consider the so-called mismatched L-values, which only approximate the true L-values, and we analyze the linear scaling aiming at the correction of the mismatch. We formulate the problem in the context of bit-interleaved coded modulation (BICM) receivers and aim at the minimization of the probability of error of the maximum likelihood (ML) decoder that uses the corrected L-values.

Paper approved by H. Leib, the Editor for Communication and Information Theory of the IEEE Communications Society. Manuscript received October 17, 2011; revised May 2, 2012.

The author is with INRS-EMT, Montreal, Canada (e-mail: [email protected]).

Part of this work was presented at the IEEE International Conference on Communications, 10–15 June 2012, Ottawa, Canada.

Digital Object Identifier 10.1109/TCOMM.2012.082812.110697

The L-value $l_n$ of the bit $C_n$ (transmitted at time $n$) is a well-known representation of the reliability of the transmitted bit. It is related to the observation $y_n$ via

$$
l_n = \log\frac{p_{Y_n|C_n}(y_n|1)}{p_{Y_n|C_n}(y_n|0)} \tag{1}
$$

where $p_{Y_n|C_n}(y_n|b)$ is the probability density function (pdf) of the observation $Y_n$ conditioned on the sent bit $C_n = b$.

The L-values are basic signals/messages exchanged between the processing units. The multiplications of probabilities required in many processing steps transform into additions of the corresponding L-values; the numerical simplicity of the resulting operations is the reason behind the popularity of the L-values. For example, in BICM receivers, the L-values are calculated by the front-end detector and then passed to the decoder [1]. In some cases, operations on the L-values are carried out before decoding, as happens when combining the signals obtained in independent transmissions of the same bit [2]. The L-values are also used in binary decoders that operate in an iterative fashion, e.g., turbo-decoders [3] or message passing algorithms used for decoding of low-density parity check (LDPC) codes [4].
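The definition (1) and the additive property noted above can be illustrated with a minimal Python sketch. It assumes a Gaussian channel of the form y = (2c - 1) + noise with noise variance 1/(2·snr); this model, the helper names `gaussian_pdf` and `l_value`, and the numerical values are assumptions made only for illustration.

```python
import math

def gaussian_pdf(x, mean, var):
    """Gaussian density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def l_value(y, snr):
    """L-value (1) for the assumed channel y = (2c - 1) + noise, noise variance 1/(2*snr)."""
    var = 1.0 / (2.0 * snr)
    return math.log(gaussian_pdf(y, +1.0, var) / gaussian_pdf(y, -1.0, var))

# Two independent observations of the same bit (hypothetical values).
y1, y2, snr = 0.3, -0.1, 2.0
var = 1.0 / (2.0 * snr)

# Multiplying the likelihoods of both observations and taking the log-ratio ...
joint = math.log(
    (gaussian_pdf(y1, 1.0, var) * gaussian_pdf(y2, 1.0, var))
    / (gaussian_pdf(y1, -1.0, var) * gaussian_pdf(y2, -1.0, var))
)
# ... gives the same number as adding the two L-values.
combined = l_value(y1, snr) + l_value(y2, snr)
```

Under this assumed model the L-value reduces to 4·snr·y, so `joint` and `combined` agree up to rounding.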

In some situations, however, the L-values are not appropriately calculated, or are mismatched. Ignoring the mismatch when processing the L-values is, in general, suboptimal, and to correct it, nonlinear operations on the L-values may be required. To make the correction simple, a linear operation (i.e., multiplication by a correction factor) is often considered. This idea was already studied in the context of BICM receivers [5], turbo-decoding [6]–[10], or LDPC decoding [4]. However, the correction factor was most often found through a brute-force search, that is, among the results obtained for different correction factors the one ensuring the best performance is deemed optimal. While this is a pragmatic approach when searching for one or two correction factors, it cannot be applied when many correction factors have to be found (the search space becomes too large) and/or when the correction has to be done on-line (i.e., when it depends on many continuously varying parameters).

The works in [6], [8], [10], [11] aimed at finding the correction factor using the pdf of the L-value. The method of [6], based on a Gaussian model of the L-value, fails to capture the properties of non-Gaussian pdfs; in such cases [8], [10], [11] can be used, but they rely on simulation to find the correction factor. The drawback of [11] is that the pdf has to be known or estimated and the functional in the optimization problem is not explicitly related to any performance criterion. This disadvantage was recently removed in [12][13], where the correction factor was formally found via maximization of the so-called generalized mutual information (GMI) between the L-values and the corresponding bits. In that approach, numerical integration is necessary, as in most cases analytical solutions are not available. While this approach was (experimentally) shown to improve the performance of BICM receivers operating with capacity-approaching codes, it does not explicitly address the problem of minimizing the error probability of the decoder.

In this paper we explicitly aim at the minimization of the probability of error in ML decoders, which results in a novel correction principle and provides new insight into the correction of the L-values. The problem is formulated in the domain of the cumulant generating function (CGF) of the L-values, and since the CGF may be found analytically in many practically interesting cases, numerical integration is then avoided. We find a simple correction principle which says that the correction factor should be equal to twice the value of the so-called saddlepoint of the CGF. The proposed correction principle is then extended to the case of non-linear correction.

0090-6778/12$31.00 © 2012 IEEE

The paper is organized as follows. The definitions and notation are presented in Section II, and the new correction principle we propose is explained in Section III. A detailed illustration of our analysis is shown using numerical examples in Section IV, and the conclusions are drawn in Section V.

II. MODEL

We consider a scenario where a codeword of $N$ bits $\mathbf{c} = [c_1, c_2, \ldots, c_N]$ is sent over a binary-input channel with outputs $y_1, \ldots, y_N$. We assume that the bits $c_n$, $n = 1, \ldots, N$, can be well modeled as independent identically distributed random variables $C_n$ with uniform distribution, i.e., $P_{C_n}(c_n = 1) = P_{C_n}(c_n = 0) = \frac{1}{2}$. All the codewords in the code $\mathcal{C}$ are then equiprobable.

Upon reception of $y_1, \ldots, y_N$, in order to minimize the probability of detection error, the decoder decides in favour of the codeword that maximizes the likelihood of the observation, i.e.,

$$
\hat{\mathbf{c}} = \arg\max_{\mathbf{c}\in\mathcal{C}} \log p_{Y_1,\ldots,Y_N|C_1,\ldots,C_N}(y_1, \ldots, y_N|c_1, \ldots, c_N), \tag{2}
$$

which, assuming that the channel is memoryless,

$$
p_{Y_1,\ldots,Y_N|C_1,\ldots,C_N}(y_1, \ldots, y_N|c_1, \ldots, c_N) = \prod_{n=1}^{N} p_{Y_n|C_n}(y_n|c_n), \tag{3}
$$

becomes

$$
\hat{\mathbf{c}} = \arg\max_{\mathbf{c}\in\mathcal{C}} \sum_{n=1}^{N} \log p_{Y_n|C_n}(y_n|c_n). \tag{4}
$$

Applying Bayes' formula

$$
p_{Y_n|C_n}(y_n|c_n) = P_{C_n|Y_n}(c_n|y_n)\, p_{Y_n}(y_n)/P_{C_n}(c_n) \tag{5}
$$

in (1), and using the uniform distribution property $P_{C_n}(c_n) = \frac{1}{2}$, we obtain an alternative expression of the a posteriori probability, $P_{C_n|Y_n}(c_n|y_n) = e^{l_n \cdot c_n}/(1 + e^{l_n})$, where we used the relationship $P_{C_n|Y_n}(c_n|y_n) = 1 - P_{C_n|Y_n}(1 - c_n|y_n)$. The expression for $P_{C_n|Y_n}(c_n|y_n)$ applied in (5) transforms (4) into decoding based on the L-values

$$
\hat{\mathbf{c}} = \arg\max_{\mathbf{c}\in\mathcal{C}} \sum_{n=1}^{N} l_n c_n \tag{6}
$$

where the terms independent of $\mathbf{c}$ were removed from the maximization. Since the decoding (6) is carried out in the logarithmic domain, the implementation is simplified, as additions are used instead of the multiplications required in (2); that is why it is preferable to operate on the L-values $l_n$ instead of the likelihoods $p_{Y_n|C_n}(y_n|c_n)$.

The simple case of binary phase-shift keying (BPSK) modulation, where the channel is indeed memoryless ($y_n$ is affected solely by $c_n$), is a useful example in which the above decoding principle is optimal. In the general case of multilevel modulation, a group of $m$ consecutive bits $c_{m\cdot k+1}, c_{m\cdot k+2}, \ldots, c_{m\cdot k+m}$, $k = 0, \ldots, N/m - 1$ (a symbol) is sent over the channel. As there are more input bits than observations, we logically “replicate” the latter as $y_{m\cdot k+1} \equiv y_{m\cdot k+2} \equiv \ldots \equiv y_{m\cdot k+m}$, which allows us to use the indexing $n = 1, \ldots, N$, so the likelihood for each bit $c_n$ is still calculated using the output $y_n$. Since the replicated outputs are identical, the channel is not memoryless anymore, i.e., (3) does not hold. Then, the decoding (6) is not equivalent to (2), but since the former is simple to implement when using binary encoders/decoders, (6) is often used in practice despite its suboptimality. This is the decoding principle of the so-called bit-interleaved coded modulation (BICM) [14].
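For the memoryless BPSK case, the equivalence of the likelihood-based rule (4) and the L-value-based rule (6) can be checked on a toy example. The sketch below is only illustrative: the (3,2) single-parity-check code, the channel y = (2c - 1) + noise with variance 1/(2·snr), the observations, and all function names are assumptions, not part of the original text.

```python
import math

def log_likelihood(y, c, snr):
    """log p(y|c) for the assumed channel y = (2c - 1) + noise, noise variance 1/(2*snr)."""
    var = 1.0 / (2.0 * snr)
    return -(y - (2 * c - 1)) ** 2 / (2.0 * var) - 0.5 * math.log(2.0 * math.pi * var)

def decode_likelihood(y, code, snr):
    """Rule (4): maximize the sum of per-bit log-likelihoods over the codebook."""
    return max(code, key=lambda c: sum(log_likelihood(yn, cn, snr) for yn, cn in zip(y, c)))

def decode_l_values(l, code):
    """Rule (6): maximize the sum of l_n * c_n over the codebook."""
    return max(code, key=lambda c: sum(ln * cn for ln, cn in zip(l, c)))

# Hypothetical toy codebook: the (3,2) single-parity-check code.
code = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
snr = 1.5
y = [0.4, -0.9, 0.2]                    # hypothetical channel outputs
l = [4.0 * snr * yn for yn in y]        # L-values of this channel: l_n = 4 * snr * y_n

c_hat_4 = decode_likelihood(y, code, snr)
c_hat_6 = decode_l_values(l, code)
```

Both rules select the same codeword here because the terms dropped when going from (4) to (6) do not depend on the candidate codeword.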

The L-values $l_n$ are modeled as random variables $L_n$ and, provided they are calculated exactly as defined in (1), their pdf satisfies the so-called consistency condition [15, Sec. III]

$$
\frac{p_{L_n|C_n}(l|1)}{p_{L_n|C_n}(l|0)} = e^{l}. \tag{7}
$$

The so-called symmetry condition [15, Sec. III]

$$
p_{L_n|C_n}(l|c) = p_{L_n|C_n}(-l|1-c) \tag{8}
$$

simplifies the analysis, even if it does not always have to be satisfied¹, as it depends on the conditional pdf $p_{Y_n|C_n}(y_n|c_n)$. However, it may be forced by a pseudo-random scrambling of the bits $c_n$ prior to modulation, followed by a change of the sign of the L-values $l_n$ if the bit was negated [14]; in such a case the “symmetrized” pdf has the form

$$
p^{\mathrm{s}}_{L|C}(l|c) = \frac{1}{2}\left[p_{L|C}(l|c) + p_{L|C}(-l|1-c)\right] \tag{9}
$$

which satisfies (8). Thus, from now on, we assume that the condition (8) is always satisfied.

Rewriting (7) as $p_{L_n|C_n}(l|1)e^{-l/2} = e^{l/2}p_{L_n|C_n}(l|0)$ and using (8) yields what we call a consistency-symmetry condition

$$
p_{L_n|C_n}(l|0)\,e^{l/2} = e^{-l/2}\,p_{L_n|C_n}(-l|0). \tag{10}
$$
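The consistency-symmetry condition (10) is easy to test numerically for a candidate pdf. The following sketch assumes Gaussian pdfs of the L-value given C = 0; the specific means and variances, and the helper names, are hypothetical and serve only to contrast a pdf that satisfies (10) with one that does not.

```python
import math

def gaussian_pdf(x, mean, var):
    """Gaussian density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def consistency_gap(pdf0, l):
    """|log(p(-l|0) / p(l|0)) - l|, which is zero when (10) holds at the point l."""
    return abs(math.log(pdf0(-l) / pdf0(l)) - l)

snr = 1.0
# Assumed matched case: Gaussian whose variance is twice the magnitude of its mean.
matched = lambda l: gaussian_pdf(l, -4.0 * snr, 8.0 * snr)
# Assumed mismatched case: same mean, but a variance that breaks that relation.
mismatched = lambda l: gaussian_pdf(l, -4.0 * snr, 16.0 * snr)

points = [-3.0, -0.5, 1.0, 2.5]
gap_matched = max(consistency_gap(matched, l) for l in points)
gap_mismatched = max(consistency_gap(mismatched, l) for l in points)
```

For the second pdf the log-ratio equals l/2 instead of l, so the gap grows with |l|, which is one simple way to detect a mismatch from the pdf alone.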

A. Mismatched Decoding and Correction of L-values

In practice, the calculation of some L-values via (1) may be inexact because i) the model $p_{Y_n|C_n}(y_n|c_n)$ is not accurate, ii) its parameters are not well estimated, or iii) the likelihood is calculated using simplifications introduced to diminish the computational requirements. In general, these effects may be represented as if a “mismatched” likelihood $q(y_n, c_n) \neq p_{Y_n|C_n}(y_n|c_n)$ was used in (1), yielding the “mismatched” L-values [16][12]

$$
\tilde{l}_n = \log\frac{q(y_n, 1)}{q(y_n, 0)}. \tag{11}
$$

If the mismatch is ignored, that is, $\tilde{l}_n$ is incorrectly assumed to be identical with $l_n$, the receiver will operate in a suboptimal fashion because $\tilde{l}_n$ cannot be transformed into the likelihood $p_{Y_n|C_n}(y_n|c_n)$. Nevertheless, if the conditional distribution $p_{\tilde{L}_n|C_n}(l|c)$ of $\tilde{L}_n$ (that models the mismatched metrics $\tilde{l}_n$) is known, we may calculate a post-processing or “correction” function [8][11]

$$
f^{\mathrm{c}}(l) = \log\frac{p_{\tilde{L}_n|C_n}(l|1)}{p_{\tilde{L}_n|C_n}(l|0)} \tag{12}
$$

and then calculate the “corrected” L-value as $l^{\mathrm{c}}_n = f^{\mathrm{c}}(\tilde{l}_n)$.

¹An example is shown in Section IV-B.

In general, the effect of the mismatch cannot be eliminated, i.e., $l^{\mathrm{c}}_n \neq l_n$. However, using $l^{\mathrm{c}}_n$ instead of $\tilde{l}_n$ should improve the performance of the decoder, because $l^{\mathrm{c}}_n$ does represent the likelihood of the observation $\tilde{l}_n$ conditioned on the bit $c_n$. This means that the L-values $l^{\mathrm{c}}_n$ satisfy (7). We also immediately conclude that if the L-value is matched, i.e., its pdf satisfies (7), no correction is necessary as we obtain $f^{\mathrm{c}}(l) = l$, that is, $l^{\mathrm{c}}_n = \tilde{l}_n$.

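Example 1: If we assume a Gaussian form of the pdf, $p_{\tilde{L}_n|C_n}(l|0) = \Psi(l + \mu; \sigma^2) = p_{\tilde{L}_n|C_n}(-l|1)$, where

$$
\Psi(l; \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{l^2}{2\sigma^2}\right) \tag{13}
$$

with $\mu = -E_{\tilde{L}_n|C_n}\{\tilde{L}_n|0\}$ and $\sigma^2 = E_{\tilde{L}_n|C_n}\{\tilde{L}_n^2|0\} - E^2_{\tilde{L}_n|C_n}\{\tilde{L}_n|0\}$, the correction function

$$
f^{\mathrm{c}}(l) = \frac{2\mu}{\sigma^2}\cdot l = \alpha^{\mathrm{Gauss}}\cdot l \tag{14}
$$

is linear and the correction factor is determined by twice the ratio of the mean to the variance of the L-value.

The Gaussian model from the above example was used in [6] to justify the correction based on $\alpha^{\mathrm{Gauss}}$.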
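As a rough illustration of (14), the factor can be estimated from the first two sample moments of the mismatched L-values observed when the bit 0 is sent. The sketch below is not the procedure of [6]; the synthetic data (Gaussian with mean -2 and variance 8), the sample size, and the function name `alpha_gauss` are assumptions.

```python
import random

def alpha_gauss(samples_given_zero):
    """Estimate (14): 2 * mu / sigma^2, with mu = -mean and sigma^2 = variance of L given C = 0."""
    n = len(samples_given_zero)
    mean = sum(samples_given_zero) / n
    var = sum((x - mean) ** 2 for x in samples_given_zero) / n
    return 2.0 * (-mean) / var

# Hypothetical mismatched L-values: the variance (8) is not twice the magnitude of the mean (2).
rng = random.Random(0)
samples = [rng.gauss(-2.0, 8.0 ** 0.5) for _ in range(100_000)]

alpha = alpha_gauss(samples)   # theoretical value for this model: 2 * 2 / 8 = 0.5
```

Scaling these L-values by a factor near 0.5 gives a mean near -1 and a variance near 2, that is, a Gaussian whose variance is again twice the magnitude of its mean.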

The resulting correction $l^{\mathrm{c}} = \alpha\cdot\tilde{l}$ has an appealing simplicity, and in many cases (treated mostly in the area of iterative decoding) $f^{\mathrm{c}}(l)$ was observed to be relatively well approximated by a linear function [8][11][10]. Therefore, using $f(l) = \alpha l$ instead of the non-linear function $f^{\mathrm{c}}(l)$ often provides a satisfactory correction effect [8][11][10] and has the advantages of implementation simplicity (scaling only) and a relatively simple design (only one parameter needs to be found). Then, the main question is how to choose the correction factor $\alpha$.

The contributions in [9][10] attempted to answer this question by making $f(l) = \alpha\cdot l$ “close” to $f^{\mathrm{c}}(l)$. In particular, [9] finds the correction factor via the weighted least-squares fit (WLSF) to the function $f^{\mathrm{c}}(l)$

$$
\alpha^{\mathrm{WLSF}} = \arg\min_{\alpha} E_{\tilde{L}_n|C_n}\left\{|f^{\mathrm{c}}(\tilde{L}_n) - \alpha\tilde{L}_n|^2 \,\big|\, 0\right\} = \arg\min_{\alpha}\int_{-\infty}^{\infty} p_{\tilde{L}_n|C_n}(l|0)\,(f^{\mathrm{c}}(l) - \alpha\cdot l)^2\,\mathrm{d}l. \tag{15}
$$

This criterion, however, is not explicitly associated with the performance of the decoder. Moreover, since we use the function $f^{\mathrm{c}}(l)$, the form of the pdf $p_{\tilde{L}_n|C_n}(l|0)$ has to be known or explicitly estimated. This is a drawback because a reliable estimation of the pdf requires extensive simulations, which precludes on-line correction.

In the recent works [12][13], the correction factor was found through maximization of the GMI [16] between the mismatched L-values and the corresponding bits. Assuming (8), this criterion boils down to solving the following optimization problem

$$
\alpha^{\mathrm{GMI}} = \arg\max_{\alpha} I^{\mathrm{GMI}}(\alpha) \tag{16}
$$

$$
I^{\mathrm{GMI}}(\alpha) = 1 - E_{\tilde{L}_n|C_n}\left\{\log_2(1 + e^{\tilde{L}_n\alpha}) \,\big|\, 0\right\} = 1 - \int_{-\infty}^{\infty} p_{\tilde{L}_n|C_n}(l|0)\log_2(1 + e^{l\alpha})\,\mathrm{d}l. \tag{17}
$$

The maximum of (17) is reached when the derivative of the integral in (17) with respect to $\alpha$ goes to zero, i.e.,

$$
\frac{\mathrm{d}}{\mathrm{d}\alpha} I^{\mathrm{GMI}}(\alpha) = 0 \;\Longleftrightarrow\; \int_{-\infty}^{\infty} p_{\tilde{L}_n|C_n}(l|0)\,\frac{l\cdot e^{\alpha l}}{1 + e^{\alpha l}}\,\mathrm{d}l = 0 \tag{18}
$$

$$
\int_{-\infty}^{\infty} p_{\tilde{L}_n|C_n}(l|0)\,\frac{l\cdot e^{\alpha\frac{l}{2}}}{\cosh(\alpha\frac{l}{2})}\,\mathrm{d}l = 0. \tag{19}
$$

Solving (16) requires numerical quadrature, as the logarithm in (17) and the hyperbolic cosine in (19) resist analytical integration. However, it is simpler to implement than (15) because we do not need to know the function $f^{\mathrm{c}}(l)$.

While it is argued (and demonstrated on examples) in [12][13] that the maximization of the GMI should improve the performance of capacity-approaching codes, the correction criterion (17) does not relate directly to the performance of the ML decoder we are interested in.
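To give a feeling for the numerical work behind (16)-(19), the sketch below finds the zero of the integral in (18) by trapezoidal quadrature and bisection. It is not the implementation of [12][13]; the Gaussian pdf (mean -4·snr_est, variance 8·snr_est²/snr), the integration grid, the bracket [0, 10], and all function names are assumptions chosen for illustration.

```python
import math

def gaussian_pdf(x, mean, var):
    """Gaussian density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def sigmoid(x):
    """Numerically stable e^x / (1 + e^x)."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def gmi_derivative(alpha, mean, var, points=4001, span=10.0):
    """Trapezoidal estimate of the integral in (18) for a Gaussian pdf of the L-value given C = 0."""
    sigma = math.sqrt(var)
    lo = mean - span * sigma
    step = 2.0 * span * sigma / (points - 1)
    total = 0.0
    for i in range(points):
        l = lo + i * step
        weight = 0.5 if i in (0, points - 1) else 1.0
        total += weight * gaussian_pdf(l, mean, var) * l * sigmoid(alpha * l)
    return total * step

def alpha_gmi(mean, var, lo=0.0, hi=10.0, iters=60):
    """Bisection on alpha; the bracket [lo, hi] is assumed to contain the root."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gmi_derivative(mid, mean, var) < 0.0:
            lo = mid      # the integral in (17) is still decreasing: move right
        else:
            hi = mid
    return 0.5 * (lo + hi)

snr, snr_est = 2.0, 1.0   # hypothetical true SNR and wrong estimate
alpha = alpha_gmi(-4.0 * snr_est, 8.0 * snr_est ** 2 / snr)
```

Each bisection step costs one full quadrature, which illustrates why this criterion is heavier than a closed-form rule; for this particular Gaussian model the root lies at snr / snr_est.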

B. Pairwise Error Probability

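To describe the behaviour of the ML decoder (6) based on the corrected L-values²

$$
\hat{\mathbf{c}} = \arg\max_{\mathbf{c}\in\mathcal{C}} \sum_{n=1}^{N} \alpha_n \tilde{l}_n c_n \tag{20}
$$

we will use the pairwise error probability (PEP), defined as the probability of detecting the codeword $\hat{\mathbf{c}}$ when sending the codeword $\mathbf{c}$, the event we denote by $\mathbf{c} \to \hat{\mathbf{c}}$.

Assuming that the code $\mathcal{C}$ is linear and (8) holds, instead of calculating the PEP for all pairs $(\mathbf{c}, \hat{\mathbf{c}})$ it is enough to calculate the PEP for all $\hat{\mathbf{c}} \neq \mathbf{c}$ assuming that the all-zeros codeword $\mathbf{c} = \mathbf{0} = [0, \ldots, 0]$ was sent

\begin{align}
\Pr\{\mathbf{0} \to \hat{\mathbf{c}}\} &= \mathrm{PEP}(\{\alpha_n\}_{n=1}^{N}, \hat{\mathbf{c}}) \notag\\
&= \Pr\left\{\sum_{n=1}^{N} \alpha_n \tilde{L}_n \hat{c}_n > \sum_{n=1}^{N} \alpha_n \tilde{L}_n c_n\right\} \tag{21}\\
&= \Pr\left\{\sum_{n=1}^{N} L^{\mathrm{c}}_n \hat{c}_n > 0\right\} \tag{22}\\
&= \int_{0}^{\infty} [p_{L^{\mathrm{c}}_1|C_1}(l|0)]^{\hat{c}_1} * \ldots * [p_{L^{\mathrm{c}}_N|C_N}(l|0)]^{\hat{c}_N}\,\mathrm{d}l \tag{23}
\end{align}

where $*$ is the convolution operator.

The notation $\mathrm{PEP}(\{\alpha_n\}_{n=1}^{N}, \hat{\mathbf{c}})$ emphasizes that the PEP depends on the correction factors $\{\alpha_n\}_{n=1}^{N}$ and the codeword $\hat{\mathbf{c}}$.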

²Note that, in general, the correction may vary with $n$, that is, be different for each L-value.


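If we denote by $\{n_k\}_{k=1}^{d}$ the set of indices such that $\hat{c}_{n_k} = 1$, where $d$ is the Hamming weight of $\hat{\mathbf{c}}$, the PEP (23) can be written as

$$
\mathrm{PEP}(\{\alpha_n\}_{n=1}^{N}, \hat{\mathbf{c}}) = \int_{0}^{\infty} p_{L^{\mathrm{c}}_{n_1}|C_{n_1}}(l|0) * \ldots * p_{L^{\mathrm{c}}_{n_d}|C_{n_d}}(l|0)\,\mathrm{d}l \tag{24}
$$

that is, it depends solely on the pdfs of the L-values indexed by $\{n_k\}_{k=1}^{d}$.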

We also note that multiplying all the L-values $\tilde{l}_n$ in (20) by the same factor, $\alpha_n \equiv \alpha$, cannot change the decoding results, so in such a case the linear correction is useless. Nevertheless, it may still be useful if decoding other than ML is applied (e.g., iterative decoding of turbo or LDPC codes), as shown in [13][17].

III. PEP-MINIMIZING CORRECTION

We now want to answer the question: how should the correction factors $\{\alpha_n\}_{n=1}^{N}$ be chosen so that the probability of error of the decoder that uses the corrected L-values $l^{\mathrm{c}}_n = \alpha_n\cdot\tilde{l}_n$ is minimized?

From the previous discussion we conclude that, in order to improve the performance of the decoder, we should find $\{\alpha_n\}_{n=1}^{N}$ that minimizes $\mathrm{PEP}(\{\alpha_n\}_{n=1}^{N}, \hat{\mathbf{c}})$ in (23) for any codeword $\hat{\mathbf{c}}$. Thus, we have to solve the following optimization problem

$$
\{\hat{\alpha}_n\}_{n=1}^{N} = \arg\min_{\{\alpha_n\}_{n=1}^{N}} \mathrm{PEP}(\{\alpha_n\}_{n=1}^{N}; \hat{\mathbf{c}}) \tag{25}
$$

and its solution should be independent of $\hat{\mathbf{c}}$, because we want to apply the correction factors to all L-values prior to decoding and we do not know which error ($\mathbf{0} \to \hat{\mathbf{c}}$) will occur.

At first sight, the problem may appear intractable due to the dependence of the PEP on various $\hat{\mathbf{c}}$, each resulting in a different set of L-values indexed by $\{n_k\}_{k=1}^{d}$, whose pdfs are then convolved as per (24).

A. Two-state Mismatch

Before attacking the problem (25) we will analyze a simpler case of a two-state mismatch, where $N_1$ of the L-values are independent identically distributed (i.i.d.) and mismatched and the remaining $N - N_1$ are i.i.d. and matched. In this case, all the mismatched L-values will be multiplied by the same correction factor $\alpha$, that is, $\alpha_n = \alpha$, $n = 1, \ldots, N_1$, and the matched L-values will remain unaltered, that is, $\alpha_n = 1$, $n = N_1 + 1, \ldots, N$.

Since we do not know a priori the indices $\{n_k\}_{k=1}^{d}$, we do not know how many mismatched metrics will affect the PEP. We thus assume initially that among the L-values affecting the PEP calculation, $d_1$ are mismatched and $d_2$ are matched. Then (24) becomes

\begin{align}
\mathrm{PEP}(\{\alpha_n\}_{n=1}^{N}, \hat{\mathbf{c}}) &= \mathrm{PEP}(\alpha) \notag\\
&= \Pr\left\{\sum_{k=1}^{d_1} \alpha\cdot\tilde{L}_{n_k} + \sum_{k=d_1+1}^{d_1+d_2} L_{n_k} > 0\right\} \notag\\
&= \int_{0}^{\infty} \left[p_{L^{\mathrm{c}}|C}(l|0)\right]^{*d_1} * \left[p_{L|C}(l|0)\right]^{*d_2}\,\mathrm{d}l \tag{26}
\end{align}

where $[f(l)]^{*d}$ is a $d$-fold self-convolution of $f(l)$ and the notation $\mathrm{PEP}(\alpha)$ emphasizes that the PEP depends on only one parameter, $\alpha$.

We want to minimize $\mathrm{PEP}(\alpha)$ in (26) for any $d_1$ and $d_2$; thus, the solution of

$$
\hat{\alpha} = \arg\min_{\alpha} \mathrm{PEP}(\alpha) \tag{27}
$$

should be independent of $d_1$ and $d_2$.

Example 2: Assume that the bits $c_n$ are sent using binary phase-shift keying (BPSK) modulation, so

$$
p_{Y_n|C_n}(y_n|c_n) = \Psi\left(y_n - (2c_n - 1); \frac{1}{2\gamma}\right) \tag{28}
$$

where $\gamma$ has the meaning of the average signal-to-noise ratio (SNR) and $\Psi(\cdot)$ is given by (13).

To calculate the L-values via (1) using (28) we need to know the value of $\gamma$, and we assume that its estimate $\hat{\gamma} \neq \gamma$ is used for the first $N_1$ L-values. This results in $\tilde{l}_n = 4y_n\cdot\hat{\gamma}$, $n = 1, \ldots, N_1$, and these L-values are mismatched. The exact value of $\gamma$ is used for the remaining L-values, $l_n = 4y_n\cdot\gamma$, $n = N_1 + 1, \ldots, N$, so these L-values are matched. It is straightforward to see that the pdf of the mismatched L-values is given by $p_{\tilde{L}_n|C_n}(l|0) = \Psi(l + 4\hat{\gamma}; 8\hat{\gamma}^2/\gamma)$, while the pdf of the matched L-values is given by $p_{L_n|C_n}(l|0) = \Psi(l + 4\gamma; 8\gamma)$ [18].

Since all the L-values affecting the PEP are Gaussian, the result of their convolution is also Gaussian and we can write (26) as

$$
\mathrm{PEP}(\alpha) = Q\left(\sqrt{2}\,\frac{\alpha d_1\hat{\gamma} + d_2\gamma}{\sqrt{\alpha^2 d_1\hat{\gamma}^2/\gamma + d_2\gamma}}\right) \tag{29}
$$

where $Q(x) = \frac{1}{\sqrt{2\pi}}\int_{x}^{\infty}\exp(-t^2/2)\,\mathrm{d}t$.

Verifying that (29) has a single stationary point with respect to $\alpha$, which is a minimum, and setting its derivative to zero yields the global minimum of (27), given by $\hat{\alpha} = \gamma/\hat{\gamma}$.

Note that, as required, the correction factor is independent of $d_1$ and $d_2$; thus the PEP is minimized independently of the error event $\mathbf{0} \to \hat{\mathbf{c}}$.

We can also immediately see that $l^{\mathrm{c}}_n = \hat{\alpha}\cdot\tilde{l}_n = \tilde{l}_n\gamma/\hat{\gamma} = 4y_n\gamma$, that is, $l^{\mathrm{c}}_n = l_n$, and we recover the matched L-value. Of course, if we knew that $\gamma$ should be used, we would not use $\hat{\gamma}$ to calculate the L-values in the first place, but this example is shown only to illustrate the principle of correction.
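The conclusion of Example 2 can be checked numerically with a short sketch that evaluates (29) on a grid of correction factors for several pairs (d1, d2). The SNR values, the grid, and the function names are hypothetical choices made for illustration; a grid search is used here only because it is easy to verify, not because it is the recommended way to find the factor.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pep(alpha, d1, d2, snr, snr_est):
    """PEP (29) for d1 mismatched L-values (scaled by alpha) and d2 matched L-values."""
    num = alpha * d1 * snr_est + d2 * snr
    den = math.sqrt(alpha ** 2 * d1 * snr_est ** 2 / snr + d2 * snr)
    return q_function(math.sqrt(2.0) * num / den)

def best_alpha(d1, d2, snr, snr_est):
    """Grid search over alpha in (0, 5] with step 0.001."""
    grid = [k / 1000.0 for k in range(1, 5001)]
    return min(grid, key=lambda a: pep(a, d1, d2, snr, snr_est))

snr, snr_est = 2.0, 0.8   # true SNR and a hypothetical wrong estimate
results = {(d1, d2): best_alpha(d1, d2, snr, snr_est)
           for d1, d2 in [(1, 1), (2, 5), (6, 3)]}
```

For every pair the grid minimum falls at (or next to) snr / snr_est = 2.5, in agreement with the statement that the optimal factor does not depend on d1 and d2.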

B. Approximation of the PEP

To apply the PEP-minimization principle (27) in the general case, we must be able to find the PEP for arbitrary distributions of the L-values. Since, in general, this cannot be done exactly in analytical form, we will turn to approximations.

Defining

$$
L_{\Sigma} = \sum_{k=1}^{d_1} \alpha\tilde{L}_{n_k} + \sum_{k=d_1+1}^{d_1+d_2} L_{n_k} \tag{30}
$$

we can write (26) as $\mathrm{PEP}(\alpha) = \Pr\{L_{\Sigma} > 0\}$.

Then, the Bhattacharyya upper bound for the PEP is given by [14][19]

$$
\mathrm{PEP}(\alpha) \leq \mathrm{PEP}^{\mathrm{UB}}(\alpha) = e^{\kappa_{\Sigma}(\hat{s})} \tag{31}
$$

where $\kappa_{\Sigma}(s)$ is the cumulant generating function (CGF) of $L_{\Sigma}$

$$
\kappa_{\Sigma}(s) = d_1\kappa_{L^{\mathrm{c}}}(s) + d_2\kappa_{L}(s) = d_1\kappa_{\tilde{L}}(s\alpha) + d_2\kappa_{L}(s) \tag{32}
$$

and

$$
\kappa_{L}(s) = \log E_{L|C}\{e^{sL} \,|\, 0\} = \log\int_{-\infty}^{\infty} p_{L|C}(l|0)\,e^{sl}\,\mathrm{d}l \tag{33}
$$

$$
\kappa_{\tilde{L}}(s) = \log E_{\tilde{L}|C}\{e^{s\tilde{L}} \,|\, 0\} \tag{34}
$$

$$
\kappa_{L^{\mathrm{c}}}(s) = \log E_{\tilde{L}|C}\{e^{s\alpha\tilde{L}} \,|\, 0\} = \kappa_{\tilde{L}}(s\alpha). \tag{35}
$$

In (31), $\hat{s} = \arg\min_{s\in\mathbb{R}}\kappa_{\Sigma}(s)$ is the so-called saddlepoint [19] of $\kappa_{\Sigma}(s)$. The saddlepoint of any variable $L$ is unique because the CGF is always convex; thus it can be found by solving the so-called saddlepoint equation

$$
\frac{\mathrm{d}}{\mathrm{d}s}\kappa_{L}(s)\Big|_{s=\hat{s}} = 0 \;\Longleftrightarrow\; E_{L}\{e^{\hat{s}L}\cdot L\} = 0. \tag{36}
$$
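When only samples of the L-values are available, the saddlepoint equation (36) can be solved on the sample average. The sketch below uses bisection, which works because the left-hand side of (36) is increasing in s; the Gaussian test data, the bracket [0, 5], the sample size, and the function name `saddlepoint` are assumptions for illustration only.

```python
import math
import random

def saddlepoint(samples, lo=0.0, hi=5.0, iters=40):
    """Solve the sample version of (36), mean(L * exp(s * L)) = 0, by bisection on s.

    Assumes the samples (drawn given C = 0) have a negative mean and contain
    positive values, so that a root exists inside the bracket [lo, hi].
    """
    def g(s):
        return sum(l * math.exp(s * l) for l in samples) / len(samples)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(1)
# Hypothetical matched L-values: Gaussian with mean -2 and variance 4 given C = 0.
matched = [rng.gauss(-2.0, 2.0) for _ in range(20_000)]
# Hypothetical mismatched L-values: the same values computed with a halved scale.
mismatched = [0.5 * l for l in matched]

s_matched = saddlepoint(matched)        # expected to be close to 1/2
s_mismatched = saddlepoint(mismatched)  # twice s_matched, by the scaling property (35)
```

Halving the L-values doubles the saddlepoint, which is the scaling relation that the correction by a linear factor exploits.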

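Theorem 1: The upper bound for the PEP in (31) is minimized by setting the correction factor to $\hat{\alpha} = \hat{s}_1/\hat{s}_2$, where $\hat{s}_1$ and $\hat{s}_2$ are the saddlepoints of the mismatched and matched L-values, respectively, that is, $\kappa'_{\tilde{L}}(\hat{s}_1) = 0$ and $\kappa'_{L}(\hat{s}_2) = 0$.

Proof: We write (31) as

\begin{align}
\min_{\alpha}\mathrm{PEP}^{\mathrm{UB}}(\alpha) &= e^{\min_{\alpha,s}\kappa_{\Sigma}(s)} \notag\\
&= e^{\min_{\alpha,s}\left(d_1\kappa_{\tilde{L}}(s\alpha) + d_2\kappa_{L}(s)\right)} \tag{37}\\
&\geq e^{d_1\kappa_{\tilde{L}}(\hat{s}_1) + d_2\kappa_{L}(\hat{s}_2)} \tag{38}
\end{align}

where (38) is the global and unique minimum of (31). We can see that the exponent in (37), $d_1\kappa_{\tilde{L}}(s\alpha) + d_2\kappa_{L}(s)$, reaches its global minimum for $s\alpha = \hat{s}_1$ and $s = \hat{s}_2$, that is, when $\alpha = \hat{s}_1/\hat{s}_2$; this means that the saddlepoint of $\kappa_{\Sigma}(s)$ is $\hat{s} = \hat{s}_2$.

As required, the bound on the PEP is minimized independently of $d_1$ and $d_2$.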

The bound (31) was shown in [19] to be quite loose, and a much more accurate estimate of the PEP can be obtained using the so-called saddlepoint approximation (SPA) [19][20][21]

$$
\mathrm{PEP}(\alpha) \approx \mathrm{PEP}^{\mathrm{SPA}}(\alpha) = \frac{e^{\kappa_{\Sigma}(\hat{s})}}{|\hat{s}|\sqrt{2\pi\kappa''_{\Sigma}(\hat{s})}}. \tag{39}
$$

However, minimization of (39) is quite difficult due to the implicit dependence of $\hat{s}$ on $\alpha$; therefore, for simplicity, we opt for minimization of the upper bound (31). Even if the correction factors $\alpha$ minimizing $\mathrm{PEP}^{\mathrm{UB}}(\alpha)$ and $\mathrm{PEP}^{\mathrm{SPA}}(\alpha)$ are not the same, we expect them to be similar, as the exponential term $e^{\kappa_{\Sigma}(\hat{s})}$ dominates both expressions.
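The relative tightness of (31) and (39) can be illustrated for a case where the exact PEP is known. The sketch below assumes d matched Gaussian L-values with mean -4·snr and variance 8·snr given C = 0, for which the CGF of the sum is d·(-4·snr·s + 4·snr·s²) and the exact PEP is Q(sqrt(2·d·snr)); the parameter values and the function names are hypothetical.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pep_estimates(d, snr):
    """Exact PEP, upper bound (31), and SPA (39) for d matched Gaussian L-values (assumed model)."""
    s_hat = 0.5                                   # minimizer of the CGF of the sum
    kappa = d * (-4.0 * snr * s_hat + 4.0 * snr * s_hat ** 2)
    kappa_second = d * 8.0 * snr                  # second derivative of the CGF
    exact = q_function(math.sqrt(2.0 * d * snr))
    upper = math.exp(kappa)
    spa = upper / (abs(s_hat) * math.sqrt(2.0 * math.pi * kappa_second))
    return exact, upper, spa

exact, upper, spa = pep_estimates(d=5, snr=1.0)
```

In this setting the upper bound overestimates the exact value by several times, while the SPA stays within roughly ten percent of it; both share the same exponential term.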

C. Arbitrary Mismatch

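We are now ready to abandon the context of the two-state mismatch and extend the previous result to the case treated in (25).

Let $L_{\Sigma} = \sum_{n=1}^{N}\alpha_n\cdot\tilde{L}_n\cdot\hat{c}_n$ have the CGF given by $\kappa_{\Sigma}(s) = \sum_{n=1}^{N}\kappa_{\tilde{L}_n}(s\cdot\alpha_n)\cdot\hat{c}_n$, where $\kappa_{\tilde{L}_n}(s)$ is the CGF of the L-value $\tilde{L}_n$ conditioned on $C_n = 0$. Define the upper bound on the PEP in (25) as

$$
\mathrm{PEP}(\{\alpha_n\}_{n=1}^{N}; \hat{\mathbf{c}}) \leq \mathrm{PEP}^{\mathrm{UB}}(\{\alpha_n\}_{n=1}^{N}; \hat{\mathbf{c}}) = e^{\kappa_{\Sigma}(\hat{s})}. \tag{40}
$$

Proposition 1: The linear correction factors that minimize the upper bound on the PEP (40) are given by $\hat{\alpha}_n = \hat{s}_n/s_0$, where $\hat{s}_n$ is the saddlepoint of the CGF $\kappa_{\tilde{L}_n}(s)$, and $s_0 > 0$ is chosen arbitrarily.

Proof: As in the proof of Theorem 1 we write

\begin{align}
\min_{\{\alpha_n\}_{n=1}^{N}}\mathrm{PEP}^{\mathrm{UB}}(\{\alpha_n\}_{n=1}^{N}; \hat{\mathbf{c}}) &= e^{\min_{s,\{\alpha_n\}_{n=1}^{N}}\left(\sum_{n=1}^{N}\kappa_{\tilde{L}_n}(s\cdot\alpha_n)\cdot\hat{c}_n\right)} \tag{41}\\
&\geq e^{\sum_{n=1}^{N}\kappa_{\tilde{L}_n}(\hat{s}_n)\cdot\hat{c}_n}. \tag{42}
\end{align}

The global minimum is reached when $s\cdot\alpha_n = \hat{s}_n$, $n = 1, \ldots, N$. To satisfy these $N$ equalities with $N + 1$ variables ($N$ factors $\alpha_n$ and one parameter $s$) we can arbitrarily fix $s = s_0$. Since $\hat{s}_1, \ldots, \hat{s}_N$ are positive, we use $s_0 > 0$. Then, the condition for reaching the minimum is $\hat{\alpha}_n = \hat{s}_n/s_0$, $n = 1, \ldots, N$, which completes the proof.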

Remark 1: Although $s_0$ may be chosen arbitrarily (recall that the multiplication of all L-values by the same correction factor does not change the ML decoding results), the choice $s_0 = \frac{1}{2}$ is convenient because the saddlepoint of the matched metrics equals $\hat{s}_n = \frac{1}{2}$ [19] and their correction factor is then given by $\hat{\alpha}_n = 1$. That is, no correction is necessary, as we would expect. Thus, the simple rule consists in doubling the saddlepoint of the L-values' CGF

$$
\hat{\alpha}_n = 2\hat{s}_n. \tag{43}
$$

Remark 2: We recall that if we want to use the pdf conditioned on $C_n = 1$, $p_{\tilde{L}_n|C_n}(l|1)$, instead of $p_{\tilde{L}_n|C_n}(l|0)$, the saddlepoint is negative, $\hat{s}_n < 0$, but then also for the matched metrics $s_0 = -\frac{1}{2}$. Thus, to take both cases into account we might reformulate (43) as

$$
\hat{\alpha}_n = 2|\hat{s}_n|. \tag{44}
$$

Remark 3: Since the CGF of the L-values $L^{\mathrm{c}}_n$ (after correction) is equal to $\kappa_{L^{\mathrm{c}}_n}(s) = \kappa_{\tilde{L}_n}(s\hat{\alpha}_n)$, the saddlepoint of $\kappa_{L^{\mathrm{c}}_n}(s)$ is given by $\hat{s}_n/\hat{\alpha}_n = s_0$. That is, the saddlepoint of the CGF of each corrected L-value is equal to $s_0 = \frac{1}{2}$. In other words, the multiplication by $\hat{\alpha}_n$ allows us to “align” at $\frac{1}{2}$ the saddlepoints of all the L-values affecting the PEP.

Remark 4: Using (35) and Remark 3, an alternative formulation of Proposition 1 is the following

$$
\hat{\alpha}_n = \arg\min_{\alpha}\kappa_{L^{\mathrm{c}}_n}(0.5). \tag{45}
$$
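The rule (43) can be worked out in closed form for a simple case that is not taken from the paper: assume hypothetically that, given C = 0, the mismatched L-value takes only the two values -a (probability 1 - p) and +a (probability p), as would happen with hard decisions of error probability p labelled with an arbitrary magnitude a. The CGF is then log((1 - p)·e^(-s·a) + p·e^(s·a)) and its derivative vanishes at s = log((1 - p)/p)/(2a). The function names below are illustrative.

```python
import math

def cgf_two_point(s, a, p):
    """CGF of an L-value equal to -a with probability 1 - p and +a with probability p (given C = 0)."""
    return math.log((1.0 - p) * math.exp(-s * a) + p * math.exp(s * a))

def saddlepoint_two_point(a, p):
    """Closed-form root of the derivative of cgf_two_point with respect to s."""
    return math.log((1.0 - p) / p) / (2.0 * a)

a, p = 1.0, 0.1                               # hypothetical magnitude and error probability
alpha = 2.0 * saddlepoint_two_point(a, p)     # rule (43)
corrected_magnitude = alpha * a               # equals log((1 - p) / p)

def corrected_cgf(s):
    """CGF of the corrected L-value, using the scaling property of Remark 3."""
    return cgf_two_point(s * alpha, a, p)
```

In this toy case the corrected magnitude coincides with the log-likelihood ratio log((1 - p)/p) of a binary symmetric channel, and the CGF of the corrected L-value attains its minimum at s = 1/2, as Remark 3 states.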

D. Non-linear Correction

A natural question to ask is whether it is possible to extend the proposed correction approach to a non-linear correction of the L-values

$$
l^{\mathrm{c}}_n = \alpha_n\cdot f^{\mathrm{c}}_n(\tilde{l}_n; \mathbf{p}_n) \tag{46}
$$

where $\mathbf{p}_n$ is a vector of parameters defining the correction function that have to be optimized together with the linear correction term $\alpha_n$, whose presence links us to the previous results and simplifies the analysis.

In this case, the upper bound on the PEP is defined as

$$
\mathrm{PEP}(\{\alpha_n, \mathbf{p}_n\}_{n=1}^{N}; \hat{\mathbf{c}}) \leq \mathrm{PEP}^{\mathrm{UB}}(\{\alpha_n, \mathbf{p}_n\}_{n=1}^{N}; \hat{\mathbf{c}}) = e^{\kappa_{\Sigma}(\hat{s}_{\Sigma})} \tag{47}
$$

where, as before, $\hat{s}_{\Sigma} = \arg\min_{s}\kappa_{\Sigma}(s)$ with

$$
\kappa_{\Sigma}(s) = \sum_{n=1}^{N}\kappa_{L^{\mathrm{c}}_n}(s)\cdot\hat{c}_n. \tag{48}
$$

Proposition 2: The correction parameters that minimize the upper bound on the PEP (47) are given by $\hat{\alpha}_n$ and $\hat{\mathbf{p}}_n$

$$
\{\hat{\alpha}_n, \hat{\mathbf{p}}_n\} = \arg\min_{\alpha,\mathbf{p}}\kappa_{L^{\mathrm{c}}_n}(0.5). \tag{49}
$$

Proof: From Remark 3 and Remark 4 that follow Proposition 1, we observe that the multiplication by $\alpha_n$ allows us to align the saddlepoints of the transformed L-values $f^{\mathrm{c}}_n(\tilde{L}_n)$ with the saddlepoint of the matched L-values, while the parameters $\mathbf{p}_n$ are used to decrease the value of the CGF of the corrected L-value, $\kappa_{L^{\mathrm{c}}_n}(\hat{s}_n)$, which directly minimizes the PEP.

E. Relationship to the GMI-maximizing Correction

Let us now compare the correction factor defined using (43) to $\alpha_{\mathrm{GMI}}$ defined in (19), where the maximization is obtained by finding the zero of its derivative

$$\int_{-\infty}^{\infty} p_{L|C}(l|0)\, \frac{l \cdot e^{\alpha l/2}}{\cosh(\alpha l/2)}\, dl = 0 \tag{50}$$

$$\int_{-\infty}^{\infty} p_{L^c|C}(l|0)\, \frac{e^{l/2}\, l}{\cosh(l/2)}\, dl = 0 \tag{51}$$

where (51) is obtained from (50) after the change of variables using the pdf of the corrected L-value $L^c = \alpha L$, i.e., $p_{L^c|C}(l|0) = \alpha^{-1} \cdot p_{L|C}(\alpha^{-1} \cdot l|0)$.

On the other hand, the condition we derived in (43) states that the saddlepoint of $\kappa_{L^c}(s)$ should be equal to $\frac{1}{2}$, which may be written as follows

$$\frac{d}{ds}\kappa_{L^c}(s)\Big|_{s=\frac{1}{2}} = \frac{E_{L^c|C=0}\{L^c e^{L^c/2}\}}{E_{L^c|C=0}\{e^{L^c/2}\}} = 0 \tag{52}$$

$$\int_{-\infty}^{\infty} p_{L^c|C}(l|0)\, e^{l/2} \cdot l\, dl = 0. \tag{53}$$

Since $\cosh(\frac{l}{2})$ in the denominator of (51) is symmetric, we can see that if $p_{L^c|C}(l|0)\, e^{l/2}$ is symmetric, both (51) and (53) are satisfied. This is also the condition we derived in (10) and can be interpreted as follows: if the optimal correction function $f^c(l)$ is linear, i.e., after the linear correction the L-value satisfies the symmetry-consistency condition (10), both criteria yield the same correction factor. In general, however, they need not be the same.
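As a worked illustration of this symmetry argument, consider the hypothetical case (an assumption made only for this example) in which the corrected L-value is a consistent Gaussian, $L^c | C=0 \sim \mathcal{N}(-\mu, 2\mu)$. Completing the square shows that the weighted pdf is even in $l$, so the integrands of both (51) and (53) are odd and the two conditions hold simultaneously:

```latex
\begin{align*}
p_{L^c|C}(l|0)\, e^{l/2}
  &= \frac{1}{\sqrt{4\pi\mu}} \exp\!\left(-\frac{(l+\mu)^2}{4\mu} + \frac{l}{2}\right) \\
  &= \frac{1}{\sqrt{4\pi\mu}} \exp\!\left(-\frac{l^2}{4\mu} - \frac{l}{2} - \frac{\mu}{4} + \frac{l}{2}\right) \\
  &= \frac{e^{-\mu/4}}{\sqrt{4\pi\mu}} \exp\!\left(-\frac{l^2}{4\mu}\right),
\end{align*}
% The result is an even function of l; multiplying by l (odd) and dividing by
% cosh(l/2) (even) gives an odd integrand, hence both integrals vanish.
```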

As for the practical aspect of using both correction methods, the main difference lies in the complexity of implementing the search for the optimal scaling factor. We already noted in Section II-A that finding $\alpha_{\mathrm{GMI}}$ requires optimization of the results of numerical quadratures, which may be computationally demanding. On the other hand, the proposed method relies on the knowledge of the Laplace transform of the pdf, which in some cases may be calculated analytically. Consequently, finding the correction factor is simplified. This is illustrated by examples in Section IV.

As done in [12], [13], Monte-Carlo integration (simulations) may be used to calculate the integral in (17) or (19); the same approach can also be used to evaluate saddlepoint-related integrals, e.g., (53). However, such an approach precludes on-line (i.e., model-based) correction. Of course, if $p_{L_n|C_n}(l|0)$ is not known (and the correction factors can be found off-line), Monte-Carlo integration will provide the required solutions.
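The Monte-Carlo route can be sketched as follows. Given samples of the mismatched L-value conditioned on $C = 0$, the empirical counterpart of the saddlepoint condition $\kappa'_L(s) = 0$ is $\sum_i l_i e^{s l_i} = 0$, which is solved here by bisection. This is a minimal sketch: the function names, the bracketing interval, and the Gaussian test samples in the usage lines are assumptions of this example.

```python
import math
import random


def empirical_saddlepoint(samples, lo=0.0, hi=5.0, iters=40):
    """Solve sum(l * exp(s*l)) = 0 for s by bisection.

    The sum is increasing in s, so a sign change on [lo, hi] brackets the root.
    """
    def g(s):
        return sum(l * math.exp(s * l) for l in samples)

    if g(lo) > 0 or g(hi) < 0:
        raise ValueError("saddlepoint is not bracketed by [lo, hi]")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def empirical_correction(samples):
    """Correction factor alpha = 2|s| as in (44), from the empirical saddlepoint."""
    return 2.0 * abs(empirical_saddlepoint(samples))


if __name__ == "__main__":
    random.seed(1)
    # Hypothetical matched Gaussian L-values N(-1, 2): the saddlepoint should be near 1/2.
    data = [random.gauss(-1.0, math.sqrt(2.0)) for _ in range(20000)]
    print(empirical_saddlepoint(data), empirical_correction(data))
```

Because the estimate is built from samples, it suits the off-line setting mentioned above rather than model-based on-line correction.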

IV. EXAMPLES

We now present two examples of the introduced correction principle. The first one relates directly to the linear correction discussed throughout the paper, and the second takes advantage of the non-linear correction proposed in Section III-D.

A. Correcting the Interference Effects

Consider a BPSK transmission, where the sent symbols $x_n = 2c_n - 1$ pass through a channel corrupted with additive Gaussian noise and a BPSK interference

$$y_n = h_n \cdot x_n + z_n + g_n \cdot d_n \tag{54}$$

where $h_n \in \mathbb{R}$ is the known gain of the channel, $z_n$ is a zero-mean Gaussian signal with known variance $\sigma_z^2 = N_0/2$; $d_n \in \{-1, 1\}$ is the BPSK-modulated interference signal received with the gain $g_n \in \mathbb{R}$. We also define the SNR and the signal-to-interference ratio (SIR), respectively, as $\mathrm{SNR} = h_n^2/N_0$ and $\mathrm{SIR} = h_n^2/g_n^2$.

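Although, using (1), it is relatively simple to calculate the L-values in this case as

$$l_n = \log \frac{\exp\!\left(-\frac{(y_n - h_n - g_n)^2}{2\sigma_z^2}\right) + \exp\!\left(-\frac{(y_n - h_n + g_n)^2}{2\sigma_z^2}\right)}{\exp\!\left(-\frac{(y_n + h_n - g_n)^2}{2\sigma_z^2}\right) + \exp\!\left(-\frac{(y_n + h_n + g_n)^2}{2\sigma_z^2}\right)} \tag{55}$$

for the purpose of illustration, we assume that the receiver ignores the interference, thus uses $g_n = 0$, and then, from (55), we obtain the L-value

$$l_n = \frac{2 h_n \cdot y_n}{\sigma_z^2} \tag{56}$$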

which is mismatched due to the assumed absence of interference.

To apply the correction principle we derived, we need to calculate the CGF of $L = \frac{2h_n}{\sigma_z^2} \cdot Y$, conditioned on sending the bit $C_n = 0$, $\kappa_L(s) = \kappa_Y\!\left(\frac{2h_n}{\sigma_z^2} s\right)$, where

$$\kappa_Y(s) = -h_n s + \frac{1}{2}\sigma_z^2 s^2 + \log\!\left(\frac{1}{2} e^{s \cdot g_n} + \frac{1}{2} e^{-s \cdot g_n}\right). \tag{57}$$

Then, finding the saddlepoint $s$ of $\kappa_Y(s)$ from $\kappa'_Y(s) = 0$, the saddlepoint of $\kappa_L(s)$ equals $\frac{\sigma_z^2}{2h_n} s$; thus, the correction factor applied to $l_n$ is given by (43) as $\alpha = \frac{\sigma_z^2}{h_n} s$ and, after the correction, the L-value is calculated as

$$l^c_n = 2s \cdot y_n. \tag{58}$$

This means that instead of calculating the saddlepoint of $\kappa_L(s)$ and correcting $l_n$, we might directly calculate the saddlepoint of $\kappa_Y(s)$ and apply the resulting correction factor to the observation $y_n$. This is possible, of course, because the L-value $l_n$ is a scaled version of $y_n$, cf. (56).

To find the saddlepoint $s$ we differentiate (57) and obtain the following saddlepoint equation

$$h_n - \sigma_z^2 \cdot s = g_n \cdot \tanh(g_n \cdot s) \tag{59}$$


Fig. 1. Graphical interpretation of the solution of the saddlepoint equation (59), showing the LHS and the RHS of (59) as functions of $s$ together with the points $s_0$, $s_\infty$, and $h_n/\sigma_z^2$; the dashed line corresponds to the linearization $\tanh(x) \approx x$.

whose graphical interpretation as the intersection of the right-hand and left-hand sides is shown in Fig. 1.

While (59) cannot be solved in closed form, we may obtain approximations in particular cases. Namely, for $\mathrm{SNR} \to 0$ (i.e., when $\sigma_z^2 \to \infty$), we easily see that $s \to 0$ so, using the linearization $\tanh(x) \approx x$ (shown as a dashed line in Fig. 1), we obtain

$$s_0 = \frac{h_n}{\sigma_z^2 + g_n^2} = \frac{h_n}{\sigma^2_{N+I,n}} \tag{60}$$

where $\sigma^2_{N+I,n}$ is the variance of the noise and interference. The corresponding correction factor is given by $\alpha_0 = \frac{\sigma_z^2}{\sigma^2_{N+I,n}}$. Note that using $\mu = -E_{L_n|C_n=0} = \frac{h_n^2}{\sigma_z^2}$ and $\sigma^2 = \mathrm{Var}\{L_n\} = \frac{4h_n^2}{\sigma_z^4}$ in (14) yields exactly the same result, $\alpha_{\mathrm{Gauss}} = \alpha_0$.

This means that for low SNR, when the noise “dominates” the interference, the effect of the noise and the interference can be modeled as a Gaussian variable with variance $\sigma^2_{N+I,n}$; the corrected L-values, in this case, would be calculated as

$$l^{c,0}_n = \frac{2h_n \cdot y_n}{\sigma^2_{N+I,n}}. \tag{61}$$

For high SNR, i.e., when $\mathrm{SNR} \to \infty$ (i.e., $\sigma_z^2 \to 0$), we observe that $s \to \infty$ so, taking advantage in (59) of the “saturation” $\tanh(\infty) = 1$, the saddlepoint is given by³

$$s_\infty = \frac{h_n - g_n}{\sigma_z^2} \tag{62}$$

and the corresponding correction factor by

$$\alpha_\infty = 1 - g_n/h_n. \tag{63}$$

The L-values in this case would be calculated as

$$l^{c,\infty}_n = \frac{2(h_n - g_n) \cdot y_n}{\sigma_z^2} \tag{64}$$

that is, the interference-related term decreases the gain of the useful signal. Intuitively, this can be explained as follows: for high SNR, the interference can be “distinguished” from the noise and becomes part of the transmitted constellation, i.e., when sending bit $c_n = 1$, we will effectively be able to distinguish between $h_n - g_n$ and $h_n + g_n$. Moreover, for high SNR, the symbol that is the most likely to provoke the error is the one closest to the origin, that is, $h_n - g_n$. This leads to assuming that BPSK symbols are sent over a channel with gain $h_n - g_n$, cf. (64).

³ We assume $g_n < h_n$, i.e., the interference is weaker than the desired signal.

For $0 < \mathrm{SNR} < \infty$, the saddlepoint can be obtained by numerically solving (59). For example, we might use the Newton-Raphson method

$$s(i) = s(i-1) - \frac{\kappa'_Y\big(s(i-1)\big)}{\kappa''_Y\big(s(i-1)\big)}, \quad i = 1, 2, \ldots, I_{\max} \tag{65}$$

where

$$\kappa''_Y(s) = \sigma_z^2 + \frac{g_n^2}{\cosh^2(g_n \cdot s)} \tag{66}$$

and $s(0)$ is the appropriately chosen starting point for the recursion, e.g., $s(0) = \max\{s_\infty, s_0\}$. In this work we used $I_{\max} = 2$, so only a small computational load is incurred due to the on-line calculation of the correction factors; alternatively, the correction factors might be interpolated using a table precalculated for different values of SNR and SIR.
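The recursion (65)-(66) can be sketched in a few lines. The code below is an illustrative implementation under the model (54); the function names and the default `iters=2` (mirroring $I_{\max} = 2$) are choices of this example, and $1/\cosh^2(x)$ is evaluated as $1 - \tanh^2(x)$ to avoid overflow for large arguments.

```python
import math


def kappa_prime(s, h, g, sigma2):
    """First derivative of kappa_Y(s) in (57); its root is the saddlepoint of (59)."""
    return -h + sigma2 * s + g * math.tanh(g * s)


def kappa_second(s, h, g, sigma2):
    """Second derivative (66), using 1/cosh^2(x) = 1 - tanh^2(x)."""
    t = math.tanh(g * s)
    return sigma2 + g * g * (1.0 - t * t)


def saddlepoint(h, g, sigma2, iters=2):
    """Newton-Raphson recursion (65) started at max{s_inf, s_0}, cf. (60) and (62)."""
    s0 = h / (sigma2 + g * g)
    s_inf = (h - g) / sigma2
    s = max(s0, s_inf)
    for _ in range(iters):
        s -= kappa_prime(s, h, g, sigma2) / kappa_second(s, h, g, sigma2)
    return s


def correction_factor(h, g, sigma2, iters=2):
    """alpha = sigma_z^2 * s / h_n, the factor applied to the L-value (56)."""
    return sigma2 * saddlepoint(h, g, sigma2, iters) / h


if __name__ == "__main__":
    h = 1.0
    g = 10 ** (-6 / 20)          # SIR = 6 dB
    for snr_db in (-4, 0, 4, 8, 12):
        n0 = h * h / 10 ** (snr_db / 10)
        print(snr_db, correction_factor(h, g, n0 / 2))
```

Since both starting candidates lie to the left of the root and $\kappa'_Y$ is increasing and concave for $s > 0$, the iterates approach the saddlepoint from below, which is one reason a small number of iterations can suffice.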

Since $\frac{h_n}{\sigma_z^2} > s$ (cf. Fig. 1), we can also immediately conclude that the correction factor always satisfies $\alpha = \frac{\sigma_z^2}{h_n} s < 1$, that is, ignoring the interference, our reliability metric is too “optimistic” and must be scaled down. On the other hand, since $s > s_0$, if the mismatched L-value is calculated using the Gaussian approximation of the interference, that is, using (61), the correction would be $\alpha > 1$. That is, the Gaussian approximation is too “pessimistic”.

We show in Fig. 2 the values of the optimal correction factors for different values of SNR and SIR. For high SNR, as predicted by (63), $\alpha = \alpha_\infty = 1 - g_n/h_n = 1 - 10^{-\mathrm{SIR}/20}$ (with SIR in dB); for example, when SIR = 6 dB, $\alpha \approx 0.5$. We also show the value of $\alpha_{\mathrm{GMI}}$, cf. (17), and we see that it is quite close to $\alpha$. Both factors increase with SIR, as the case $\mathrm{SIR} \to \infty$ corresponds to the assumed absence of the interference, that is, $\alpha \to 1$.

It is also interesting to compare the correction factors to those that might be obtained by minimizing the actual PEP. To make this possible, we analyze again the two-state mismatch from Section III-A. That is, we assume that $d_1$ L-values affecting the PEP are mismatched as in our example, while $d_2$ L-values are matched, that is, no interference is present during their transmission.

From (56) we easily deduce the distribution of the mismatched L-values $L$ (first line) and of the matched ones (second line) as

$$\begin{aligned} p_{L|C}(l|0) &= \frac{1}{2}\left[\Psi(l + \mu_1; \sigma^2) + \Psi(l + \mu_2; \sigma^2)\right] \\ p_{L|C}(l|0) &= \Psi(l + \mu_0; \sigma^2) \end{aligned} \tag{67}$$

where $\mu_1 = \frac{2h \cdot (h - g)}{\sigma_z^2}$, $\mu_2 = \frac{2h \cdot (h + g)}{\sigma_z^2}$, $\mu_0 = \frac{2h^2}{\sigma_z^2}$, and $\sigma^2 = \frac{4h^2}{\sigma_z^2}$.

Since the convolution of $d_1$ corrected mismatched L-values $L^c$ with $d_2$ matched L-values $L$ is a mixture of Gaussian functions, we easily obtain the analytical expression for the PEP of the two-state mismatch (2SM)

$$\mathrm{PEP}_{\mathrm{2SM}}(\alpha) = \frac{1}{2^{d_1}} \sum_{k=0}^{d_1} \binom{d_1}{k}\, Q\!\left(\frac{(d_1 - k)\alpha\mu_1 + k\alpha\mu_2 + d_2\mu_0}{\sigma\sqrt{d_1\alpha^2 + d_2}}\right) \tag{68}$$

and we define

$$\alpha_{\mathrm{2SM}} = \arg\min_{\alpha} \mathrm{PEP}_{\mathrm{2SM}}(\alpha) \tag{69}$$

Fig. 2. Correction factors $\alpha_{\mathrm{GMI}}$, $\alpha$, and $\alpha_{\mathrm{WLSF}}$ obtained for various values of SNR and SIR (SIR = 3 dB, 6 dB, and 12 dB). The dotted lines show the value of $\alpha_{\mathrm{2SM}}$, cf. (69), for various combinations of $2 \le d_1, d_2 \le 8$.

which is shown in Fig. 2 for various combinations of $2 \le d_1, d_2 \le 8$ since $\alpha_{\mathrm{2SM}}$ now depends on $d_1$ and $d_2$. We can appreciate that the optimal values $\alpha_{\mathrm{2SM}}$ are very close to the $\alpha$ resulting from the saddlepoint-based criterion we proposed, which is obtained without restrictive assumptions on the two-state structure of the mismatch.
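The two-state PEP (68) and its minimizer (69) are straightforward to evaluate numerically. The following is a minimal sketch: the function names, the normalization by $2^{d_1}$ (the sum of the binomial weights), and the simple grid search over $\alpha$ are choices of this example rather than the procedure used to produce Fig. 2.

```python
import math


def q_func(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def pep_2sm(alpha, d1, d2, h, g, sigma2_z):
    """PEP of the two-state mismatch, cf. (68), for a linear correction alpha."""
    mu1 = 2 * h * (h - g) / sigma2_z
    mu2 = 2 * h * (h + g) / sigma2_z
    mu0 = 2 * h * h / sigma2_z
    sigma = 2 * h / math.sqrt(sigma2_z)       # sigma^2 = 4 h^2 / sigma_z^2
    den = sigma * math.sqrt(d1 * alpha ** 2 + d2)
    total = 0.0
    for k in range(d1 + 1):
        num = (d1 - k) * alpha * mu1 + k * alpha * mu2 + d2 * mu0
        total += math.comb(d1, k) * q_func(num / den)
    return total / 2 ** d1


def alpha_2sm(d1, d2, h, g, sigma2_z, grid=None):
    """Grid-search approximation of the argmin in (69)."""
    if grid is None:
        grid = [i / 1000 for i in range(1, 2001)]   # 0.001 ... 2.000
    return min(grid, key=lambda a: pep_2sm(a, d1, d2, h, g, sigma2_z))


if __name__ == "__main__":
    h, g = 1.0, 10 ** (-6 / 20)                     # SIR = 6 dB
    for snr_db in (0, 4, 8):
        sigma2_z = 0.5 * h * h / 10 ** (snr_db / 10)
        print(snr_db, alpha_2sm(4, 4, h, g, sigma2_z))
```

The values printed by the usage lines can be compared against the saddlepoint-based $\alpha$ obtained from (59) for the same SNR and SIR.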

For completeness we also show the results of WLSF defined in (15). We can expect it to provide reasonable results when $f^c(l)$ is almost linear, that is, when the pdf $p_{L_n|C_n}(l|0)$ is close to Gaussian. This happens when the interference is dominated by the noise, i.e., for low SNR and high SIR, and then, as we can see in Fig. 2, $\alpha_{\mathrm{WLSF}}$ is close to the PEP-minimizing values $\alpha_{\mathrm{2SM}}$. On the other hand, when SNR is high and SIR low, the results obtained are far from the optimal values.

Now, to take our solution beyond the PEP-related considerations and to verify how the correction affects the performance of a practical decoder, we consider a case where a block of $N_b = 1000$ bits is encoded using the convolutional code (CC) of rate $\frac{1}{2}$ with generating polynomials $\{15, 17\}_8$ [22] and the turbo code (TC) $\{1, 15/13\}_8$ [7] with rate $\frac{3}{4}$ (obtained via puncturing of the parity bits).

We recall that for identically distributed L-values, the performance of the ML decoder cannot be improved via linear correction. Thus, to show the potential advantage of the correction, we assume that the channel gains $h_n$ are unitary-energy Rayleigh variables, so the average SNR, $\overline{\mathrm{SNR}} = 1/N_0$, is used to characterize the channel. Then, the correction factor has to be found for each value of $h_n$, which is assumed perfectly known at the receiver. The average signal-to-interference ratio is set to SIR = 6 dB for CC and SIR = 8 dB for TC; for these values, the measurable bit-error rate (BER) results can be presented in the same range of SNR. The results of decoding (Viterbi decoding for the convolutional code and turbo decoding with five iterations for the turbo code) in terms of BER are shown in Fig. 3. We also show the results of the decoding using true L-values, i.e., L-values obtained via (55).

Fig. 3. Simulation results (BER versus SNR) for the rate-$\frac{1}{2}$ convolutional code (CC) and the rate-$\frac{3}{4}$ turbo code (TC) obtained using the L-values without correction (56) (i.e., $\alpha = 1$), the L-values with the optimal correction we propose (calculated by solving (59)), and the true L-values (55). The results obtained from L-values corrected using $\alpha_{\mathrm{GMI}}$ are shown with the filled markers. For CC we use ML decoding, so the results based on the Gaussian model of the interference ($\alpha_0$), (61), are identical to those obtained without the correction, as discussed at the end of Section IV.

The comparison with the GMI-based correction is in order even if, as shown in Fig. 2, the correction factors are similar to those obtained using our method. Note that, unlike in our method, solving the GMI-based problems (17) or (19) requires numerical integration, and the solutions $\alpha_{\mathrm{GMI}}$ turn out to be sensitive to the number of points of the numerical quadrature (we used the Gauss-Hermite method with 40-100 points). Due to these numerical issues, beyond SNR = 15 dB and particularly for large SIR we were not able to find the solution of (17) in the interval $\alpha \in (0, 1)$ (where it must belong). These practical aspects also speak in favour of the correction based on the saddlepoint equation (59), where no integration was necessary and the solution was readily obtained using (65). To get around these implementation problems, for instantaneous SNR $|h_n|^2/N_0$ larger than 15 dB (i.e., where $\alpha$ and $\alpha_{\mathrm{GMI}}$ are quite similar, cf. Fig. 2) we used $\alpha$ instead of $\alpha_{\mathrm{GMI}}$. We also opted for an off-line solution: we pre-calculate a table of $\alpha_{\mathrm{GMI}}$ for many values of SNR and SIR and, during the simulations, for each instantaneous SNR and instantaneous SIR $|h_n|^2/g_n^2$ the value of $\alpha_{\mathrm{GMI}}$ is interpolated from the table.

We can see that the correction results based on our method or on the GMI approach are similar and partially bridge the gap to the results based on the true L-values. The performance improvement is particularly notable for high average SNR, which is consistent with the results of Fig. 2, where the most notable corrections (small values of $\alpha$) are also obtained for high SNR.

In Fig. 3 we also show the results of the correction derived assuming that the interference is Gaussian, yielding the correction factor $\alpha_0 = \frac{\sigma_z^2}{\sigma^2_{N+I,n}}$. Since $\alpha_0$ is independent of the channel gains $h_n$, it is common to all the L-values and thus irrelevant to the performance of the ML decoder. For this reason, the results obtained with $\alpha_0$ and with $\alpha = 1$ are identical for CC, where the ML (Viterbi) decoder is used; they are thus not shown in Fig. 3. On the other hand, the turbo decoder, based on the iterative exchange of information between the constituent decoders, depends on the accurate representation of the a posteriori probabilities via the L-values [12]. It is, therefore, sensitive to the scaling, and the correction with $\alpha_0$ improves the decoding results.

B. Simplified Calculation of the L-values in BICM Transmission

Consider now a BICM system where the transmitted bits are grouped in pairs $[c_{2m+1}, c_{2m+2}]$, $m = 0, \ldots, N/2 - 1$, where $c_{2m+1}$ are the most significant bits (MSB) and $c_{2m+2}$ the least significant bits (LSB). These pairs are mapped onto the symbols from the quaternary pulse-amplitude modulation (4-PAM) $x_m = \Phi\{[c_{2m+1}, c_{2m+2}]\}$ obtained following the mapping rule $\Phi: \{00, 01, 11, 10\} \to \{a_1, a_2, a_3, a_4\}$, where $a_k = (2k - 5) \cdot \Delta$, $k = 1, \ldots, 4$, i.e., $\{-3\Delta, -\Delta, \Delta, 3\Delta\}$, with $\Delta = 1/\sqrt{5}$, which guarantees that $E\{x_m^2\} = 1$. The symbols are sent over the unitary gain channel corrupted by a zero-mean, white Gaussian noise $z_m$ with variance $\sigma_z^2$

$$y_m = x_m + z_m, \quad m = 0, \ldots, N/2 - 1. \tag{70}$$

The SNR is defined again as $\mathrm{SNR} = 1/(2\sigma_z^2)$.

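To obtain the L-values $l_{2m+1}$ and $l_{2m+2}$, the non-linear functions

$$l_{2m+1} = \log \frac{e^{-\mathrm{SNR} \cdot (y_m - a_3)^2} + e^{-\mathrm{SNR} \cdot (y_m - a_4)^2}}{e^{-\mathrm{SNR} \cdot (y_m - a_1)^2} + e^{-\mathrm{SNR} \cdot (y_m - a_2)^2}}, \tag{71}$$

$$l_{2m+2} = \log \frac{e^{-\mathrm{SNR} \cdot (y_m - a_2)^2} + e^{-\mathrm{SNR} \cdot (y_m - a_3)^2}}{e^{-\mathrm{SNR} \cdot (y_m - a_1)^2} + e^{-\mathrm{SNR} \cdot (y_m - a_4)^2}} \tag{72}$$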

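should be used. The well-known max-log approximation may be applied to ease the implementation

$$l_{2m+1} = \theta_1(y_m) = \mathrm{SNR} \cdot \left(\min\{|y_m - a_1|^2, |y_m - a_2|^2\} - \min\{|y_m - a_3|^2, |y_m - a_4|^2\}\right) \tag{73}$$

$$= \begin{cases} (y_m/\Delta - 1) \cdot 8\gamma, & \text{if } y_m > 2\Delta \\ (y_m/\Delta) \cdot 4\gamma, & \text{if } |y_m| < 2\Delta \\ (y_m/\Delta + 1) \cdot 8\gamma, & \text{if } y_m < -2\Delta \end{cases} \tag{74}$$

$$l_{2m+2} = \theta_2(y_m) = \mathrm{SNR} \cdot \left(\min\{|y_m - a_1|^2, |y_m - a_4|^2\} - \min\{|y_m - a_2|^2, |y_m - a_3|^2\}\right) \tag{75}$$

$$= (2 - |y_m|/\Delta) \cdot 4\gamma, \tag{76}$$

where $\gamma = \Delta^2 \cdot \mathrm{SNR}$. The order of the two $\min\{\cdot\}$ terms in (73) and (75) follows from (71) and (72), where the numerator collects the symbols labeled with bit 1, and agrees with the piece-wise linear forms (74) and (76).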
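The equivalence between the min-based max-log expressions and their piece-wise linear forms (74) and (76) can be verified numerically. In the sketch below the constellation is taken as $\{-3\Delta, -\Delta, \Delta, 3\Delta\}$, and the min-based functions are written with the sign convention that agrees with (71), (72) and with (74), (76); all function names are choices of this example.

```python
import math

DELTA = 1.0 / math.sqrt(5.0)
A = [-3 * DELTA, -DELTA, DELTA, 3 * DELTA]     # a1, a2, a3, a4


def theta1_min(y, snr):
    """Max-log MSB L-value: bit 1 is carried by a3, a4, bit 0 by a1, a2."""
    d = [(y - a) ** 2 for a in A]
    return snr * (min(d[0], d[1]) - min(d[2], d[3]))


def theta1_piecewise(y, snr):
    """Piece-wise linear form (74)."""
    gamma = DELTA ** 2 * snr
    if y > 2 * DELTA:
        return (y / DELTA - 1) * 8 * gamma
    if y < -2 * DELTA:
        return (y / DELTA + 1) * 8 * gamma
    return (y / DELTA) * 4 * gamma


def theta2_min(y, snr):
    """Max-log LSB L-value: bit 1 is carried by a2, a3, bit 0 by a1, a4."""
    d = [(y - a) ** 2 for a in A]
    return snr * (min(d[0], d[3]) - min(d[1], d[2]))


def theta2_piecewise(y, snr):
    """Piece-wise linear form (76)."""
    gamma = DELTA ** 2 * snr
    return (2 - abs(y) / DELTA) * 4 * gamma


if __name__ == "__main__":
    snr = 10 ** (5 / 10)                        # 5 dB, arbitrary illustration value
    for y in (-1.5, -0.5, 0.0, 0.3, 1.2):
        print(y, theta1_min(y, snr), theta1_piecewise(y, snr),
              theta2_min(y, snr), theta2_piecewise(y, snr))
```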

Here, to show the proposed principle of correction, and following the ideas of [13], we will simplify $\theta_1(y_m)$ by a linear operation on $y_m$

$$\tilde{l}_{2m+1} = \tilde{\theta}_1(y_m) = t_1 \cdot 4\gamma/\Delta \cdot y_m \tag{77}$$

and generalize the form of $\theta_2(y)$ so that the linear pieces it is built of depend on the value of SNR

$$\tilde{l}_{2m+2} = \tilde{\theta}_2(y_m) = q_2 \cdot 8\gamma - t_2 \cdot 4\gamma \cdot |y_m|/\Delta. \tag{78}$$

Note that for $t_2 = q_2 = 1$ we obtain $\tilde{\theta}_2(y) \equiv \theta_2(y)$. Our objective is then to find the “correction” factors $t_1, t_2, q_2 > 0$ so as to minimize the PEP.

To find $t_1$ we immediately see that the problem is equivalent to the linear correction problem we just treated in Section IV-A, because 4-PAM modulation may be seen as BPSK modulation transmitted over a channel with $h_n = 2\Delta$ corrupted by additive BPSK interference with gain $g_n = \Delta$. Using these parameters we have to solve (59) and, similarly to (58), apply the resulting saddlepoint $s$ to obtain the correction

$$t_1 = \frac{2s\Delta}{4\gamma} = \frac{s\sigma_z^2}{\Delta}. \tag{79}$$

In particular, for $\sigma_z^2 \to 0$, we obtain from (62) $s = (h_n - g_n)/\sigma_z^2 = \Delta/\sigma_z^2$, thus $t_1 = 1$ and $\tilde{\theta}_1(y_m)$ is identical to $\theta_1(y_m)$ for $y_m \approx 0$. This result is consistent with the observation of [23, Sec. III.C] according to which, for high SNR, the most relevant model of the L-values is obtained considering only the “zero-crossing” part of the function $\theta_1(y)$.

As for the function $\tilde{\theta}_2(y)$, the modification of the L-values with $t_2$ and $q_2$ may be seen as a particular case of the non-linear correction treated in Section III-D, where the linear correction $\alpha_n$ is included in both parameters. We thus have to minimize $\kappa_{\tilde{L}_{2m+2}}(0.5)$ with respect to $t_2$ and $q_2$.

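It is easy to see that $p_{\tilde{L}_{2m+2}|C_{2m+2}}(l|0) \ne p_{\tilde{L}_{2m+2}|C_{2m+2}}(-l|1)$, i.e., the symmetry condition (8) is not satisfied⁴. We thus enforce it via scrambling (9), which yields

$$p^{s}_{\tilde{L}_{2m+2}|C_{2m+2}}(l|0) = \frac{1}{4}\left(\sum_{a \in \{-3\Delta, 3\Delta\}} p_{\tilde{L}_{2m+2}|X_m}(l|a) + \sum_{a \in \{-\Delta, \Delta\}} p_{\tilde{L}_{2m+2}|X_m}(-l|a)\right) \tag{80}$$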

where we also introduced the conditioning on the transmitted symbol, which comes from the fact that the pdf conditioned on the bit $C_{2m+2}$ results from marginalization over the constellation symbols used to convey that bit.

⁴ This can be immediately concluded by noting that $\tilde{l}_{2m+2} = \tilde{\theta}_2(y) \le q_2 \cdot 8\gamma$, thus $p_{\tilde{L}_{2m+2}|C_{2m+2}}(l|c_{2m+2}) \equiv 0$ for $l > q_2 \cdot 8\gamma$, as illustrated, e.g., in [24, Fig. 3]. It is then obvious that a pdf with limited support cannot satisfy (8).


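Consequently,

$$E_{\tilde{L}_{2m+2}|C_{2m+2}}\left\{e^{\frac{1}{2}\tilde{L}_{2m+2}} \Big| 0\right\} = \frac{1}{4}\sum_{a \in \{-3\Delta, 3\Delta\}} E_{Y_m|X_m}\left\{e^{\frac{1}{2}\tilde{\theta}_2(Y_m)} \Big| a\right\} + \frac{1}{4}\sum_{a \in \{-\Delta, \Delta\}} E_{Y_m|X_m}\left\{e^{-\frac{1}{2}\tilde{\theta}_2(Y_m)} \Big| a\right\} \tag{81}$$

where

$$\begin{aligned} E_{Y_m|X_m}\left\{e^{\frac{1}{2}\tilde{\theta}_2(Y_m)} \Big| a\right\} &= \int_{-\infty}^{\infty} \Psi(y - a; \sigma_z^2)\, e^{q_2 \cdot 4\gamma - t_2 2\gamma |y|/\Delta}\, dy \\ &= e^{q_2 \cdot 4\gamma} \cdot \int_{0}^{\infty} \left(\Psi(y - a; \sigma_z^2) + \Psi(y + a; \sigma_z^2)\right) e^{-t_2 y \cdot 2\gamma/\Delta}\, dy. \end{aligned} \tag{82}$$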

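Thus, finally, (81) becomes

$$E_{\tilde{L}_{2m+2}|C_{2m+2}}\left\{e^{\frac{1}{2}\tilde{L}_{2m+2}} \Big| 0\right\} = \frac{1}{4}\left(\sum_{a \in \{-3\Delta, 3\Delta\}} e^{4\gamma q_2} \cdot \Omega_0(a, 2t_2\gamma/\Delta) + \sum_{a \in \{-\Delta, \Delta\}} e^{-4\gamma q_2} \cdot \Omega_0(a, -2t_2\gamma/\Delta)\right) \tag{83}$$

where

$$\Omega_0(a, \beta) = 2\int_{0}^{\infty} \Psi(y - a; \sigma_z^2) \cdot e^{-\beta \cdot y}\, dy = \exp\!\left(\frac{1}{2}\beta^2\sigma_z^2 - a\beta\right) \cdot \mathrm{erfc}\!\left(\frac{\beta\sigma_z^2 - a}{\sqrt{2}\sigma_z}\right). \tag{84}$$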
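The closed form in (84) can be cross-checked against a direct numerical evaluation of the integral. The sketch below compares the two using a plain trapezoidal rule; the integration range, the step count, and the function names are assumptions of this example.

```python
import math


def psi(x, sigma2):
    """Zero-mean Gaussian pdf with variance sigma2, evaluated at x."""
    return math.exp(-x * x / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)


def omega0_closed(a, beta, sigma2):
    """Closed-form expression on the right-hand side of (84)."""
    sigma = math.sqrt(sigma2)
    return (math.exp(0.5 * beta * beta * sigma2 - a * beta)
            * math.erfc((beta * sigma2 - a) / (math.sqrt(2.0) * sigma)))


def omega0_numeric(a, beta, sigma2, n=20000):
    """Trapezoidal approximation of 2 * int_0^inf psi(y - a) exp(-beta*y) dy."""
    sigma = math.sqrt(sigma2)
    # The integrand is a Gaussian bump centred at a - beta*sigma2; go 12 sigma past it.
    y_max = abs(a) + abs(beta) * sigma2 + 12.0 * sigma
    step = y_max / n

    def f(y):
        return psi(y - a, sigma2) * math.exp(-beta * y)

    total = 0.5 * (f(0.0) + f(y_max))
    for i in range(1, n):
        total += f(i * step)
    return 2.0 * step * total


if __name__ == "__main__":
    for a, beta in [(0.3, 1.0), (-0.3, 2.0), (0.9, -1.5)]:
        print(a, beta, omega0_closed(a, beta, 0.25), omega0_numeric(a, beta, 0.25))
```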

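In order to minimize $\kappa_{\tilde{L}_{2m+2}}(0.5) = \log E_{\tilde{L}_{2m+2}|C_{2m+2}}\left\{e^{\frac{1}{2}\tilde{L}_{2m+2}} \big| 0\right\}$, the derivatives with respect to the optimization parameters must be set to zero, i.e., $\frac{d}{dt_2}\kappa_{\tilde{L}_{2m+2}}(0.5) = 0$ and $\frac{d}{dq_2}\kappa_{\tilde{L}_{2m+2}}(0.5) = 0$; this yields the following two non-linear equations

$$\sum_{a \in \{-3\Delta, 3\Delta\}} e^{4\gamma q_2} \cdot \Omega_0(a, 2t_2\gamma/\Delta) - \sum_{a \in \{-\Delta, \Delta\}} e^{-4\gamma q_2} \cdot \Omega_0(a, -2t_2\gamma/\Delta) = 0 \tag{85}$$

$$\sum_{a \in \{-3\Delta, 3\Delta\}} e^{4\gamma q_2} \cdot \Omega_1(a, 2t_2\gamma/\Delta) - \sum_{a \in \{-\Delta, \Delta\}} e^{-4\gamma q_2} \cdot \Omega_1(a, -2t_2\gamma/\Delta) = 0 \tag{86}$$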

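where

$$\Omega_1(a, \beta) = \frac{d}{d\beta}\Omega_0(a, \beta) = \exp\!\left(\frac{1}{2}\beta^2\sigma_z^2 - a\beta\right) \cdot \left[(\beta\sigma_z^2 - a) \cdot \mathrm{erfc}\!\left(\frac{\beta\sigma_z^2 - a}{\sqrt{2}\sigma_z}\right) - 2\sigma_z^2\, \Psi\!\left(\beta\sigma_z^2 - a; \sigma_z^2\right)\right]. \tag{87}$$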
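One possible way to obtain $t_2$ and $q_2$ without a general-purpose equation solver is sketched below; it is an alternative illustration, not the procedure used to produce the results of this paper. Writing (83) as $\frac{1}{4}(e^{4\gamma q_2} A + e^{-4\gamma q_2} B)$, condition (85) gives $q_2 = \log(B/A)/(8\gamma)$ for any fixed $t_2$, so only a one-dimensional search over $t_2$ remains, done here on a simple grid. The function names and the grid are assumptions of this example.

```python
import math

DELTA = 1.0 / math.sqrt(5.0)


def omega0(a, beta, sigma2):
    """Closed form (84)."""
    sigma = math.sqrt(sigma2)
    return (math.exp(0.5 * beta * beta * sigma2 - a * beta)
            * math.erfc((beta * sigma2 - a) / (math.sqrt(2.0) * sigma)))


def terms(t2, snr):
    """Return gamma and the two sums A, B appearing in (83) for a given t2."""
    sigma2 = 1.0 / (2.0 * snr)
    gamma = DELTA ** 2 * snr
    beta = 2.0 * t2 * gamma / DELTA
    a_sum = sum(omega0(a, beta, sigma2) for a in (-3 * DELTA, 3 * DELTA))
    b_sum = sum(omega0(a, -beta, sigma2) for a in (-DELTA, DELTA))
    return gamma, a_sum, b_sum


def objective(t2, q2, snr):
    """Right-hand side of (83), i.e., exp(kappa(0.5)) of the scrambled LSB L-value."""
    gamma, a_sum, b_sum = terms(t2, snr)
    return 0.25 * (math.exp(4 * gamma * q2) * a_sum + math.exp(-4 * gamma * q2) * b_sum)


def best_q2(t2, snr):
    """q2 solving (85) for fixed t2: exp(8*gamma*q2) = B / A."""
    gamma, a_sum, b_sum = terms(t2, snr)
    return math.log(b_sum / a_sum) / (8.0 * gamma)


def optimize(snr, grid=None):
    """Grid search over t2 with q2 eliminated via (85); returns (t2, q2)."""
    if grid is None:
        grid = [i / 200 for i in range(1, 601)]     # 0.005 ... 3.000
    t2 = min(grid, key=lambda t: objective(t, best_q2(t, snr), snr))
    return t2, best_q2(t2, snr)


if __name__ == "__main__":
    for snr_db in (-5, 0, 5, 10):
        print(snr_db, optimize(10 ** (snr_db / 10)))
```

The grid resolution limits the accuracy of $t_2$; a root finder applied to (86) could refine the value if needed.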

The solutions of (85) and (86), obtained using a MATLAB equation solver, are shown in Fig. 4. For high SNR, the values of the optimal correction terms tend to one, which confirms that the max-log approximation (75) is an optimal piece-wise linear approximation to calculate the L-values $l_{2m+2}$ when the SNR grows.

Fig. 4. Correction factors $t_1$, $t_2$, and $q_2$ (versus SNR) obtained by solving (85) and (86).

The results of decoding when sending $N_b = 1000$ bits encoded with the rate-$\frac{1}{3}$ turbo code we already used in Section IV-A are shown in Fig. 5 for various correction options: the “exact” calculation means that (71) and (72) were used, and max-log means that (73) and (75) were used. We also consider the “mixed” case, in which we use $\tilde{l}_{2m+1}$ and $l_{2m+2}$ (the MSB's L-value is optimized while the LSB's one is obtained via the max-log approximation). Interestingly, even the mixed L-values outperform the max-log L-values. While the differences between the respective results are rather small, they are obtained “for free”, i.e., without additional computational effort when compared to the max-log simplifications. In fact, the L-values $\tilde{l}_{2m+1}$ are even slightly less complex to calculate than $l_{2m+1}$. Additional gains are achieved using both optimized L-values $\tilde{l}_{2m+2}$ and $\tilde{l}_{2m+1}$, which brings the results close to those obtained with the optimal L-value calculation $l_{2m+2}$ and $l_{2m+1}$.

V. CONCLUSIONS

In this work we propose a new method to find the linear correction factor of the mismatched L-values. Aiming at the minimization of the probability of errors made by the maximum-likelihood decoder, we found that the correction factor equals twice the value of the saddlepoint of the cumulant generating function (CGF) of the L-values. Our method is shown to bear similarities to the one based on generalized mutual information (GMI) proposed before but, in many cases, working in the domain of the CGF, we may avoid the numerical integration necessary in GMI. We also propose an extension of the correction principle to the non-linear correction of the L-values. Our findings are illustrated with numerical examples: we analyze the case of BPSK transmission in the presence of interference, and we consider the case of BICM transmission, where we improved the decoding results while at the same time simplifying the calculation of the L-values, compared to the max-log approximation often used to calculate L-values.

Fig. 5. Decoding results (BER versus SNR) for different L-value calculations: exact, max-log, mixed, and optimized.

Although this work was focused on the correction of the L-values calculated at the front-end of BICM receivers, the proposed correction method may be applied in any context where L-values are being used, e.g., for the correction of the L-values in iterative decoders.

ACKNOWLEDGEMENT

Many thanks to Alex Alvarado, Cambridge University, UK, for a careful and critical reading of the manuscript.

REFERENCES

[1] E. Zehavi, “8-PSK trellis codes for a Rayleigh channel,” IEEE Trans. Commun., vol. 40, no. 3, pp. 873–884, May 1992.

[2] C. Wengerter, A. von Elbwart, E. Seidel, G. Velev, and M. Schmitt, “Advanced hybrid ARQ technique employing a signal constellation rearrangement,” in Proc. 2002 IEEE Veh. Technol. Conf. – Fall, pp. 2002–2006.

[3] S. Benedetto, D. Divsalar, G. Montorsi, and F. Pollara, “A soft-input soft-output APP module for iterative decoding of concatenated codes,” IEEE Commun. Lett., vol. 1, no. 1, pp. 22–24, Jan. 1997.

[4] J. Chen and M. Fossorier, “Near optimum universal belief propagation based decoding of low-density parity check codes,” IEEE Trans. Commun., vol. 50, no. 3, pp. 406–414, 2002.

[5] B. Classon, K. Blankenship, and V. Desai, “Channel coding for 4G systems with adaptive modulation and coding,” IEEE Wireless Commun. Mag., vol. 9, no. 2, pp. 8–13, Apr. 2002.

[6] L. Papke, P. Robertson, and E. Villebrun, “Improved decoding with the SOVA in a parallel concatenated (turbo-code) scheme,” in Proc. 1996 IEEE Int. Conf. Commun., vol. 1, pp. 102–106.

[7] J. Vogt and A. Finger, “Improving the max-log-map turbo decoder,” IEEE Electron. Lett., vol. 36, no. 23, pp. 1937–1939, Nov. 2000.

[8] M. van Dijk, A. Janssen, and A. Koppelaar, “Correcting systematic mismatches in computed log-likelihood ratios,” Eur. Trans. Telecommun., vol. 14, no. 3, pp. 227–224, 2003.

[9] G. Lechner, “Efficient decoding techniques for LDPC codes,” Ph.D. dissertation, Vienna University of Technology, Austria, July 2007.

[10] A. Alvarado, V. Núñez, L. Szczecinski, and E. Agrell, “Correcting suboptimal metrics in iterative decoders,” in Proc. 2009 IEEE Int. Conf. Commun.

[11] G. Lechner and J. Sayir, “Improved sum-min decoding for irregular LDPC codes,” in Proc. 2006 Int. Symp. Turbo Codes Related Topics.

[12] T. Nguyen and L. Lampe, “Bit-interleaved coded modulation with mismatched decoding metrics,” IEEE Trans. Commun., vol. 59, no. 2, pp. 437–447, Feb. 2011.

[13] R. Yazdani and M. Ardakani, “Efficient LLR calculation for non-binary modulations over fading channels,” IEEE Trans. Commun., vol. 59, no. 5, pp. 1236–1241, May 2011.

[14] G. Caire, G. Taricco, and E. Biglieri, “Bit-interleaved coded modulation,” IEEE Trans. Inf. Theory, vol. 44, no. 3, pp. 927–946, May 1998.

[15] M. Tüchler, “Design of serially concatenated systems depending on the block length,” IEEE Trans. Commun., vol. 52, no. 2, pp. 209–218, Feb. 2004.

[16] A. Martinez, A. Guillén i Fàbregas, and G. Caire, “Bit-interleaved coded modulation revisited: a mismatched decoding perspective,” IEEE Trans. Inf. Theory, vol. 55, no. 6, pp. 2756–2765, June 2009.

[17] T. Nguyen and L. Lampe, “Mismatched bit-interleaved coded noncoherent orthogonal modulation,” IEEE Commun. Lett., vol. 15, no. 5, pp. 563–565, May 2011.

[18] A. Martinez, A. Guillén i Fàbregas, and G. Caire, “A closed-form approximation for the error probability of BPSK fading channels,” IEEE Trans. Wireless Commun., vol. 6, no. 6, pp. 2051–2054, June 2007.

[19] ——, “Error probability analysis of bit-interleaved coded modulation,” IEEE Trans. Inf. Theory, vol. 52, no. 1, pp. 262–271, Jan. 2006.

[20] L. Szczecinski, A. Alvarado, and R. Feick, “Distribution of max-log metrics for QAM-based BICM in fading channels,” IEEE Trans. Commun., vol. 57, no. 9, pp. 2558–2563, Sep. 2009.

[21] A. Kenarsari-Anhari and L. Lampe, “An analytical approach for performance evaluation of BICM transmission over Nakagami-m fading channels,” IEEE Trans. Commun., vol. 58, no. 4, pp. 1090–1101, Apr. 2010.

[22] P. Frenger, P. Orten, and T. Ottosson, “Convolutional codes with optimum distance spectrum,” IEEE Trans. Commun., vol. 3, no. 11, pp. 317–319, Nov. 1999.

[23] A. Alvarado, L. Szczecinski, R. Feick, and L. Ahumada, “Distribution of L-values in Gray-mapped $M^2$-QAM signals: closed-form approximations and applications,” IEEE Trans. Commun., vol. 57, no. 7, July 2009.

[24] M. Benjillali, L. Szczecinski, S. Aissa, and C. Gonzalez, “Evaluation of bit error rate for packet combining with constellation rearrangement,” Wiley J. Wireless Commun. Mobile Comput., pp. 831–844, Sep. 2008.

Leszek Szczecinski (M’98-SM’07) obtained the M.Eng. degree from the Technical University of Warsaw in 1992 and the Ph.D. from INRS-Telecommunications, Montreal, in 1997. From 1998 to 2001, he held the position of Assistant Professor at the Department of Electrical Engineering, the University of Chile. He is now an Associate Professor at INRS-EMT, the University of Quebec, Canada, and an Adjunct Professor at the Electrical and Computer Engineering Department of McGill University. In 2009–2010, he was a Marie-Curie Research Fellow with CNRS, Laboratory of Signals and Systems, Gif-sur-Yvette, France. His research interests are in the areas of modulation and coding, communication theory, wireless communications, and digital signal processing.