decoding algorithms for nonbinary ldpc codes over gf\u003cformula...

11
IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 4, APRIL 2007 633 Transactions Letters Decoding Algorithms for Nonbinary LDPC Codes Over GF David Declercq, Member, IEEE, and Marc Fossorier, Fellow, IEEE Abstract—In this letter, we address the problem of decoding nonbinary low-density parity-check (LDPC) codes over finite fields GF , with reasonnable complexity and good performance. In the first part of the letter, we recall the original belief propagation (BP) decoding algorithm and its Fourier domain implementation. We show that the use of tensor notations for the messages is very convenient for the algorithm description and understanding. In the second part of the letter, we introduce a simplified decoder which is inspired by the min-sum decoder for binary LDPC codes. We called this decoder extended min-sum (EMS). We show that it is possible to greatly reduce the computationnal complexity of the check-node processing by computing approximate relia- bility measures with a limited number of values in a message. By choosing appropriate correction factors or offsets, we show that the EMS decoder performance is quite good, and in some cases better than the regular BP decoder. The optimal values of the factor and offset correction are obtained asymptotically with simulated density evolution. Our simulations on ultra-sparse codes over very-high-order fields show that nonbinary LDPC codes are promising for applications which require low frame-error rates for small or moderate codeword lengths. The EMS decoder is a good candidate for practical hardware implementations of such codes. Index Terms—Complexity reduction, iterative decoder, nonbi- nary low-density parity-check (LDPC) codes. I. INTRODUCTION L OW-DENSITY parity-check (LDPC) codes designed over GF (also referred to as GF -LDPC codes) have been shown to approach Shannon-limit performance for and very long code lengths [1], [2], [4], [5]. On the other hand, for moderate code lengths, the error performance can be improved by increasing [6], [7]. However, this improvement is achieved at the expense of increased decoding complexity. A straightforward implementation of the belief propagation (BP) algorithm to decode GF -LDPC codes has computa- tional complexity dominated by operations for each check sum processing [6]. As a result, no field of order larger than was initially considered. Extending the ideas presented in [3], a more efficient approach using Fourier trans- forms over GF was presented in [8]. The description of this Paper approved by T.-K. Truong, the Editor for Coding Theory and Tech- niques of the IEEE Communications Society. Manuscript received April 10, 2005; revised April 14, 2006. D. Declercq is with the ETIS ENSEA/UCP/CNRS UMR-8051, 95014 Cergy Pontoise, France (e-mail: [email protected]). M. Fossorier is with the Department of Electrical Engineering, University of Hawaii, Honolulu, HI 96822 USA (e-mail: [email protected]). Digital Object Identifier 10.1109/TCOMM.2007.894088 algorithm in the log domain has been given in [9]. Note that the Fourier transform is easy to compute only when the Galois field is a binary extension field with order . In that case, this approach allows reducing the computational complexity of the BP algorithm to . Consequently, results for were reported with this method [7], [10]. In [10], the formula- tion of this algorithm was further elegantly and conveniently modified based on the introduction of a tensoral representation. With this representation, the generalization of BP decoding over GF(2) to any field of order becomes very natural. We present in detail the BP algorithm using tensoral notations in the first part of this letter. Simplified iterative decodings of GF -LDPC codes have also been investigated. For , the min-sum (MS) algorithm with proper modification has been shown to result in negli- gible performance degradation (less than 0.1 dB for regular LDPC codes) while performing additions only, and becoming independent of the channel conditions [11]. Extension of this approach to any value seems highly attractive. Unfortunately, such extensions are not straightforward, as many simplifications can not be realized in conjunction with Fourier transforms. In [9], the authors present a log-domain BP decoder combined with a fast Fourier transform (FFT) at the check-node input. However, combining log-values and FFT requires a lot of exponential and logarithm computations, which may not be very practical. To overcome this issue, the authors propose the use of a look-up table (LUT) to perform the required operations. Although simple, this approach is of limited interest for codes over high-order fields, since the number of LUT accesses grows in for a single message. As a result, for fields of high order, unless the LUT has a prohibitively large size, the performance loss induced by the LUT approx- imation is quite large. In [9], the authors present simulation results for LDPC codes over fields up to GF(16), in which case, the LUT approach remains manageable. In [12], the MS algorithm is extended to any finite field of order . Although only additions are performed and no channel information is necessary, its complexity remains . As a result, only small values of can be considered by this algorithm and for , a degradation of 0.5 dB over BP decoding is reported. Simplifications of BP decoding of GF -LDPC codes have also been considered in [13] for nonbinary signaling. In this letter, we develop a generalization of the MS algorithm which not only performs additions without the need of channel estimation, but also with the two following objectives: 1) a com- plexity much lower than so that finite fields of large order 0090-6778/$25.00 © 2007 IEEE

Upload: independent

Post on 18-Nov-2023

0 views

Category:

Documents


0 download

TRANSCRIPT

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 4, APRIL 2007 633

Transactions Letters

Decoding Algorithms for Nonbinary LDPC Codes Over GF(q)David Declercq, Member, IEEE, and Marc Fossorier, Fellow, IEEE

Abstract—In this letter, we address the problem of decodingnonbinary low-density parity-check (LDPC) codes over finite fieldsGF( ), with reasonnable complexity and good performance. Inthe first part of the letter, we recall the original belief propagation(BP) decoding algorithm and its Fourier domain implementation.We show that the use of tensor notations for the messages is veryconvenient for the algorithm description and understanding. Inthe second part of the letter, we introduce a simplified decoderwhich is inspired by the min-sum decoder for binary LDPC codes.We called this decoder extended min-sum (EMS). We show thatit is possible to greatly reduce the computationnal complexityof the check-node processing by computing approximate relia-bility measures with a limited number of values in a message.By choosing appropriate correction factors or offsets, we showthat the EMS decoder performance is quite good, and in somecases better than the regular BP decoder. The optimal values ofthe factor and offset correction are obtained asymptotically withsimulated density evolution. Our simulations on ultra-sparse codesover very-high-order fields show that nonbinary LDPC codes arepromising for applications which require low frame-error ratesfor small or moderate codeword lengths. The EMS decoder is agood candidate for practical hardware implementations of suchcodes.

Index Terms—Complexity reduction, iterative decoder, nonbi-nary low-density parity-check (LDPC) codes.

I. INTRODUCTION

LOW-DENSITY parity-check (LDPC) codes designed overGF (also referred to as GF -LDPC codes) have been

shown to approach Shannon-limit performance for andvery long code lengths [1], [2], [4], [5]. On the other hand, formoderate code lengths, the error performance can be improvedby increasing [6], [7]. However, this improvement is achievedat the expense of increased decoding complexity.

A straightforward implementation of the belief propagation(BP) algorithm to decode GF -LDPC codes has computa-tional complexity dominated by operations for eachcheck sum processing [6]. As a result, no field of order largerthan was initially considered. Extending the ideaspresented in [3], a more efficient approach using Fourier trans-forms over GF was presented in [8]. The description of this

Paper approved by T.-K. Truong, the Editor for Coding Theory and Tech-niques of the IEEE Communications Society. Manuscript received April 10,2005; revised April 14, 2006.

D. Declercq is with the ETIS ENSEA/UCP/CNRS UMR-8051, 95014 CergyPontoise, France (e-mail: [email protected]).

M. Fossorier is with the Department of Electrical Engineering, University ofHawaii, Honolulu, HI 96822 USA (e-mail: [email protected]).

Digital Object Identifier 10.1109/TCOMM.2007.894088

algorithm in the log domain has been given in [9]. Note that theFourier transform is easy to compute only when the Galois fieldis a binary extension field with order . In that case, thisapproach allows reducing the computational complexity of theBP algorithm to . Consequently, results forwere reported with this method [7], [10]. In [10], the formula-tion of this algorithm was further elegantly and convenientlymodified based on the introduction of a tensoral representation.With this representation, the generalization of BP decodingover GF(2) to any field of order becomes very natural.We present in detail the BP algorithm using tensoral notationsin the first part of this letter.

Simplified iterative decodings of GF -LDPC codes havealso been investigated. For , the min-sum (MS) algorithmwith proper modification has been shown to result in negli-gible performance degradation (less than 0.1 dB for regularLDPC codes) while performing additions only, and becomingindependent of the channel conditions [11]. Extension of thisapproach to any value seems highly attractive. Unfortunately,such extensions are not straightforward, as many simplificationscan not be realized in conjunction with Fourier transforms. In[9], the authors present a log-domain BP decoder combinedwith a fast Fourier transform (FFT) at the check-node input.However, combining log-values and FFT requires a lot ofexponential and logarithm computations, which may not bevery practical. To overcome this issue, the authors proposethe use of a look-up table (LUT) to perform the requiredoperations. Although simple, this approach is of limited interestfor codes over high-order fields, since the number of LUTaccesses grows in for a single message. As a result,for fields of high order, unless the LUT has a prohibitivelylarge size, the performance loss induced by the LUT approx-imation is quite large. In [9], the authors present simulationresults for LDPC codes over fields up to GF(16), in whichcase, the LUT approach remains manageable. In [12], the MSalgorithm is extended to any finite field of order . Althoughonly additions are performed and no channel information isnecessary, its complexity remains . As a result, onlysmall values of can be considered by this algorithm and for

, a degradation of 0.5 dB over BP decoding is reported.Simplifications of BP decoding of GF -LDPC codes havealso been considered in [13] for nonbinary signaling.

In this letter, we develop a generalization of the MS algorithmwhich not only performs additions without the need of channelestimation, but also with the two following objectives: 1) a com-plexity much lower than so that finite fields of large order

0090-6778/$25.00 © 2007 IEEE

634 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 4, APRIL 2007

can be considered; and 2) a small performance degradation com-pared with BP decoding. The first objective is achieved by in-troducing configuration sets, which allow keeping only a smallnumber of meaningful values at the check-node processing (notethat while the check-node processing is , that of the vari-able-node processing is only [12]). The second objectiveis achieved by applying the correction techniques of [11] to theproposed algorithm at the variable-node processing.

The letter is organized as follows. In Section II, the tensoralrepresentation of the messages for GF -LDPC codes [10] isfurther developed, and we discuss the importance of such no-tation for the simplification of the decoding algorithm in theFourier domain. The extended MS algorithms in the logarithmdomain are derived in Section III. We present the basic pro-posed algorithm [extended min-sum (EMS)], and its improvedcorrected versions using a scaling factor or an offset. The correc-tion values are obtained by minimizing the decoding thresholdscomputed with density evolution (DE) techniques. Simulationresults for small and moderate block lengths are discussed inSection IV, and concluding remarks are given in Section V.

II. DECODING GF -LDPC CODES WITH BP

In this section, we present the BP decoding algorithm of non-binary GF -LDPC codes. This decoding algorithm has beenalready presented in the literature [6], [9], [10], but the presen-tation proposed in this letter is original and could be useful to adeeper understanding and analysis of GF -LDPC decoders.The key points of a clear presentation of GF -LDPC BP arethe use of a tensoral representation of the messages along theedges in the factor graph representation of the code, and trans-formations of the usual LDPC factor graph so that the BP equa-tions can be written in a simple way.

A. Tensorial Representation of the Messages

In a GF -LDPC code, the parity-check matrix of sizedefining the kernel of the code is a sparse matrix,

such that

GFGF

(1)

As in the binary case, the rank of is with ,and the code rate is .

The factor graph representation of the code [14], [15] consistsof a set of variable nodes belonging to GF , fully connectedto a set of parity-check nodes. The edges connecting the twosets of nodes carry messages that represent probability densityfunctions (pdfs) of the codeword symbols. Since the codewordsymbols are random variables over GF , the messages arediscrete pdfs of size . When the Galois field is an extensionfield of GF(2), that is, when , the messages can beconveniently represented by tensors of size 2 and dimension

[10]. Let us briefly describe this tensoral structure of themessages.

In the Galois field GF , the elements of the field can berepresented by a polynomial of degree

with binary coefficients. Any set of binary valuesthereby determines a unique field element . In the following,we use these polynomial notations to indicate when the opera-tions (sums, products, etc) are performed in a Galois field.

Using this representation, a message on the edge connected toa variable node denoted is a tensorindexed by the binary coefficients of . For example,

corresponds to the probability inGF(8), conditionally to the random variables which depend on

in the factor graph.

B. BP Decoding Over GF

In order to simplify the notations used in the derivation ofthe equations, we omit temporal indices to describe the BP al-gorithm. The BP algorithm for nonbinary LDPC codes is nota direct generalization of the binary case, because the nonzerovalues of the matrix are not binary. There is, however, a wayto tightly link the nonbinary case to the binary case by modi-fying the factor graph. Let us take the example of a single row(check sum) with check degree . A parity node in a LDPCcode over GF represents the following parity equation:

(2)

where in the modulo operator is a degree primi-tive polynomial of GF . Equation (2) expresses that the vari-able nodes needed to perform the BP algorithm on a parity nodeare not the codeword symbols alone, but the codeword symbolsmultiplied by nonzeros values of the parity matrix . The cor-responding transformation of the graph is performed by addingvariable nodes corresponding to the multiplication of the code-word symbols by their associated nonzero values. Thetransformation is depicted in Fig. 1.

The function node that connects the two variable nodesand performs a permutation of the message values.The permutation that is used to update the message correspondsto the multiplication of the tensor indices by from node

to node , and to the division of the indices bythe other way. Using this transformation of the factor

graph, the parity-node update is indeed a convolution of all in-coming messages, just as in the binary case.

We use the following notations for the messages in the graph(see Fig. 1). Let be the set of messages enteringa variable node of degree , and be the outputmessages for this variable node. The index indicates that themessage comes from a permutation node to a variable node, and

is for the other direction. We define similarly the messages(resp. ) at the input (resp. output)

of a degree check node.The initialization of the decoder is achieved with the channel

likelihoods denoted . If the channel has binary inputand Gaussian additive noise, for example, then the symbol like-lihood has the form , where

with being the th bit of the GF symbol,and is a Gaussian noisy version of .

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 4, APRIL 2007 635

Fig. 1. Transformation of the factor graph in order to explicitly represent the nonzero values in the matrix H .

The three steps of one decoding iteration are then as follows:: product step, variable node update for a degree

node

or

(3)

where is defined as the term-by-term product of ten-sors, and is the tensor of likelihoods computed from thechannel output. Note that since the messages are pdfs, oneneeds to normalize the messages after the product in (3) sothat .

: permutation step (from variable to check nodes)

or

with (4)

where is a permutation matrix correspondingto , and collects in a column vector all thevalues of the tensor . Note that since finite Galois fieldsare cyclic fields, the permutation is actually a cyclicshift (rotation) of all the values in the message, except thevalue labeled by 0. As for the permutation step in the otherdirection, that is, from check node to variable node, it isperformed with the inverse permutation .

: sum-product step: parity-check node update for adegree nodeUsing the secondary nodes , all the parity nodeshave the same behavior, and no longer depend on thenonzeros entries of . The BP update for a parity nodeof degree is effectively a convolution of probabilitydensities on GF

or

(5)

where is the indicator function equal to 1 if and onlyif condition is fulfilled.

We could also express the sum in (5) without the help of anindicator function by using a configurations set. We will seein the presentation of the EMS algorithm (Section III) that theconcept of the configurations set is very useful. Let us considerthe following set:

(6)

Using (6), step becomes: sum-product step: parity-check node update for a

degree node

(7)

C. FFT-Based BP Decoding

In the BP algorithm presented in the previous section, thecomputational complexity of step can rapidly become pro-hibitively large. The number of basic operations required tocompute varies exponentially with both the field order andthe parity-check degree . This complexity can be reduced to

by performing (5) recursively [6], but this remains toocomplex to allow the decoding of LDPC codes in very-high-order fields, and in many works, the maximum field order isoften restricted to with this approach.

It has been proposed in [10] to perform step in the fre-quency domain, which transforms the convolution into a simpleproduct. The computation of step in the frequency domainhas also been pointed out in [8] and [9].

For densities of random variables in GF represented bytensors, the Fourier transform is formally very simple to explainsince it actually corresponds to second-order Fourier trans-forms, one in each dimension of the tensor.

Proposition: Let be a tensor of order 2 and dimen-sion , representing a density function of the random variable

GF . The Fourier transform of is given by

(8)

636 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 4, APRIL 2007

Fig. 2. Factor graph of an LDPC code over GF(2 ).

where represents the outer tensoral product in the th dimen-sion of the tensor, and is the 2 2 matrix of the second-orderFourier transform given by

As pointed out in [17], the Fourier transform with thecorrect symbol-bit labeling reduces to the Hadamard transform.For more details about Fourier transforms on finite sets (groups,rings, or fields), one can refer, e.g., to [25] and the referenceswithin.

The outer tensoral product is defined as, for

Using the Fourier transform (8), one can change theparity-check node into a product node in the factor graph of theGF -LDPC code. The whole factor graph in the case of aregular is depicted in Fig. 2. This leads to amodified version of step

: frequency-domain message update for a degreecheck node

(9)

The computational complexity of step is reducedto . Moreover, we can remark that the special form of

the Fourier transform in GF only involves additions and nomultiplication, which increases even more the decoding speed.Note that the use of tensoral notations has another advantagesince the Fourier transform (8) is already expressed in its fastimplementation form (FFT). This can be seen in the definitionof the outer tensoral product which represents exactly the equa-tions used in a butterfly implementation of the Fourier trans-form. This reduced-complexity BP decoding allows decodingnonbinary LDPC codes in very-high-order fields and for largevalues of , or equivalently, high code rates, since

.

III. EMS ALGORITHM IN THE LOGARITHM DOMAIN

In this section, we present a reduced-complexity decoding al-gorithm for LDPC codes over GF , based on a generalizationof the MS algorithm used for binary LDPC codes [11], [14],[16], [18], [26]. It is now well known that reduced-complexitydecoding algorithms are built in the logarithm domain, and thisis for two main reasons. First, it transforms the products in(3)–(5) into simple sums and the normalization of the messagesin (3) is no longer required. The second reason is that log-do-main algorithms are usually more robust to the quantization ef-fects when the messages are stored on a small number of bits[19], [24]. The practical decoding algorithms for turbo-codes(max-log-map, max*-log-map) and LDPC codes (MS, logBP)are all expressed in the logarithm domain.

The simplified decoding algorithm that we propose also useslog-density-ratio (LDR) representations of the messages. In thebinary case, only two probabilities define a message, and theLDR is defined as LDR .In the -ary case, the message is actually composed ofnonzero LDR values. By convention, we consider the followingLDR messages, for any tensorial message :

(10)Therefore, we get .

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 4, APRIL 2007 637

The purpose of the EMS algorithm is to simplify the messageupdate only at the check-node output by performing stepwith a reduced number of operations. By replacing the FFT in

with another message update rule, it is no longer im-portant that the field order is a power of 2.1 Therefore, the EMSalgorithm could be applied to any Galois field order as longas is a prime number or a power of a prime number. We havechosen to keep the tensoral notations used for GF fields inorder to have coherent notations with Section II. It is straight-forward to adapt the notations used in this section to any fieldorder.

As in the previous section, we omit the temporal indices inthe equations. Also, for ease of presentation and without loss ofgenerality, we assume that the output message is labeled by(the last branch of the check), which means that the incomingmessages of a degree check node are labeled

, and the output message is denoted .

A. Configuration Sets

Similar to the binary case, the goal of any message-passingalgorithm is to assign a reliability measure to each and everyentry in the ouput message , using the values of the in-coming messages. In the logBP algorithm, this reliability mea-sure is the local a posteriori probability, but it is hard to computeefficiently. With a suboptimal EMS algorithm, we aim at com-puting a suboptimal reliability measure, with low computationalcomplexity. To make it happen, we try not only to avoid anycomplex operations (multiplications, exponentials, logarithms,etc.), but also to compute the output reliability using only a lim-ited number of incoming values (obviously, the largest of suchvalues).

We start by selecting in each incoming message thelargest values that we denote , . We use thefollowing notation for the associated field elements , sothat we have [according to (10)]:

In principle, the number of values selected by the EMS algo-rithm could differ from one message to another, depending, forexample, on the dynamic of the reliability. This issue is not dis-cussed in this letter, and for simplicity of the presentation, weselect exactly values in each and every incoming message.

With these largest values, we build the following set of con-figurations:

1We recall that the Fourier transform has the form (8) only for densities overGF(2 ), and this motivated the choice of q = 2 in Section II.

Any vector of field elements in this set is called a con-figuration. The set corresponds to the set of configu-rations built from the largest probabilities in each incomingmessage. Hence, its cardinality is

Note that Conf(1) contains only one configuration that we callthe order-0 configuration.

In some cases, such as for large values of or , thenumber of considered configurations in remains toolarge. In order to further reduce the number of chosen config-urations, we consider the following subset of for

:

where is the subset of configurations that differfrom the order-0 configuration in exactly entries. The set

is, therefore, the subset of configurationsbelonging to that differ at most in entriesfrom the order-0 configuration. The number of elements in

is

(11)The idea of using instead of is thatit is much smaller in many cases, and it is built from configu-rations with large probabilities, since the order-0 configurationis the one with the highest probability. We finally remark that

.To each configuration, we need to assign a reliability that is

straightforward to compute. We have chosen the following func-tion as reliability measure:

(12)

Note that this choice is arbitrary, but is based on an analogywith the binary case. Moreover, this choice of reliability seemsnatural, since the EMS with parameters

is similar to the algorithm based on the approximation ofthe operator by a operator presented in [12].With these notations, we are now able to build an EMS al-gorithm from the reliabilities of the configurations in

. To this aim, we denote thesubset of defined by the check-node constraint

638 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 4, APRIL 2007

Finally, we remark that for some choices of the parame-ters , the set could be empty. If

is empty for some values of , thiscould create a problem of convergence for the EMS algorithm,since each and every entry in the output message has to befilled by a reliability. This statement has been verified by sim-ulations. However, this problem is readily overcome by usingthe subsets which are nonempty for any valueof GF .

B. EMS Algorithm

Using the notations of the preceding section, the completeEMS algorithm is given as follows.

: sum step, variable node update for a degreenode

or

(13)

: permutation step (from variable to check nodes)

or

with (14)

The permutation step from check to variable nodes is per-formed using .

: message update for a degree check node• from the incoming messages , build the sets

, then

(15)

• postprocessing

(16)

The postprocessing in step is necessary for nu-merical reasons, to ensure the nondivergence of the algorithm.Without this postprocessing, the values of the LDR messageswould converge to the highest achievable numerical value in afew iterations. Subtracting the value of the messages at index0 has also another meaning, since it forces the output of thecheck nodes to have the same LDR structure as defined in (10).

It is readily seen that the EMS algorithm reduces to the MSalgorithm when it is applied to GF(2) (the proof is given inAppendix A). This is why we named our algorithm the EMSalgorithm. Another way to see it is to express the check-nodeupdate with the help of operators, and then approxi-mate the expressions with operators, just as it is done inthe MS algorithm [12].

C. Computation Complexity

For and , the EMS algorithm becomesthe MS algorithm proposed in [12]. As a result, the algorithmproposed in [12] has complexity per iteration at thevariable nodes and at the check nodes, assuming therecursive updating proposed in [6]. In our approach, the maindifferences in complexity are the additional cost to determinethe configuration sets and the resulting reducedcost to evaluate the values .

At each node, all configuration sets can be determined byidentifying the largest values for each of the incomingbranches. Based on a binary tree representation, the resultingcost becomes .

The recursive approach of [6] can still be implemented in con-junction with configuration sets. For simplicity, we consider allrepresentations in . Both the forward and backwardrecursions start with values, and at their first step, evaluateup to intermediary reliability values (in which case, no oper-ations are needed). However, as soon as all possible interme-diary reliability values have been evaluated, the number of op-erations becomes for the remaining steps. As a result, thecomplexity of this approach can be evaluated as .

This recursive approach requires serial steps. Another pos-sible approach with a higher level of parallelism ifis to compute the values in from those in

for . Since, given a representationin , it is always possible to find a representation in

which differs in at most one position, the re-sulting complexity becomes based on (11): .In general, the complexity of this direct nonrecursive approachis slightly higher than that of the previous recursive one, but thenumber of serial steps is reduced from to .

We observe that both approaches significantly reduce thecomplexity of the algorithm proposed in [12]. Infact, for values of and which allow approaching theperformance of BP decoding (with the enhancements discussedin the next section), these complexities are of the same orderof which corresponds to the implementation ofBP with FFTs. However, only additions are performed in oursimplified approach.

D. Scaled and Offset EMS Algorithms

The EMS algorithm is suboptimal, and naturally introducesa performance degradation compared with the BP algorithm.The main reason (although not necessarily the only reason) forthis degradation is that the reliabilities computed in the EMSalgorithm are overestimated. This causes the suboptimal algo-rithm to converge too rapidly to a local minimum, which ismost often a pseudocodeword. This behavior has also been ob-served for binary LDPC codes decoded with MS, as well asfor turbo codes decoded with max*-log-map (either parallel orblock turbo codes). For binary LDPC codes, a simple techniquethat is used to compensate for this overestimation is to reduce themagnitude of the messages at the variable-node input by meansof a factor or an offset [19]. We propose to apply these correctiontechniques to the EMS algorithm, either using a factor correc-tion (17) or an offset correction (18).

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 4, APRIL 2007 639

: sum step with factor correction, variable-node update for a degree node

or

(17)

: sum step with offset correction, variable-node update for a degree node

if

if

if

(18)

Of course, the values of the factor (or offset) have to be chosencarefully, by optimizing an ad hoc cost function. In this letter, wehave chosen the values of the correction factor by minimizing thedecoding threshold of the LDPC code, computed with estimatedDE. The DE is simulated by means of Monte Carlo estimations ofthedensities,usingasufficiently largenumberofsamples inorderto get a low variance of estimation on the threshold value. In oursimulations, a number of message samples wassufficient togetgood thresholdvariances.Note that thisnumber israther small, since the LDPC codes that weconsider all have

, which means that they are the sparsest LDPC codes possible,and for ultra-sparse LDPC codes with the asymptotic be-haviorof thedecodingalgorithms is reachedvery rapidly.Thede-codingthreshold isdefinedas theminimumsignal-to-noiseratio(SNR) such that the bit-error probability goes to zeroas the number of decoding iterations goes to infinity. In practice,wechooseamaximumof200decoding iterations, andwestop theDEwhenthebit-errorprobability is lower than .Notethatweuse DE techniques only to choose the correction values and not tocompute accurate thresholds. In Fig. 3, the decoding thresholds

aredepictedasa functionof thecorrection factor(or offset), for the LDPC code over GF(256),and for two different EMS complexities, and

. We notice that the threshold is actually agood cost function in order to choose the correction values, sincethe curves are convex- and have a unique global minimum. Weobserve, as in the binary case, that the optimum values for theoffset correction are slightly worse than for the factor correction[11]. The thresholds and the corresponding correction values forvarious codes and complexities are tabulated and discussed in thenext section.

Fig. 3. Decoding threshold of a regular (2,4) LDPC code over GF(256) versuscorrection factors.

IV. PERFORMANCE COMPARISON

A. Code Thresholds and Correction Factors

The optimum threshold values of the scaled and offset EMSalgorithms with and anddifferent field orders have been reported in Table I for the rate

code and the ratecode. For comparison purpose, the capacity limit

is 0.18 dB for and1.63 dB for . As expected, the threshold values improveas and increase, and can approach the threshold of BPclosely with much less complexity. For ,we observe that the best values are obtained for inboth cases. This can be explained by the fact that as increases,the gap between BP decoding and the simpliest corrected EMSalgorithm obtained for increases, too. For the

code and , the thresholdvalues of the factor correction (respectively, offset correction)EMS are 1.58 dB (respectively, 2.18 dB) for ,

1.69 dB (respectively, 2.43 dB) for , and1.81 dB (respectively, 2.69 dB) for . This explainswhy, as increases, it is necessary to increase also inorder to stay close to the BP threshold.

The optimum threshold value for the scaled and offset EMSalgorithms with and hasbeen represented as a function of and compared with that ofBP decoding in Fig. 4. We observe that for fields of small order(say ), the corrected EMS algorithms with

are sufficient to closely approach BP decoding, and theperformance improves as increases. For and

, the corrected EMS algorithms with arenecessary to approach BP, with a performance still improvingas increases. However, for , larger values ofare required to approach BP. However, for these large fields,the potential performance gain achieved by increasing is verysmall. We observe that this trend is similar to that reported in[7] with respect to the weight profile of the random ensemblesof GF -LDPC codes.

640 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 4, APRIL 2007

TABLE ICORRECTION FACTORS AND OFFSETS FOR THE (d = 2; d = 4)-LDPC AND

THE (d = 2; d = 8)-LDPC CODES, FOR THREE DIFFERENT FIELDS

Fig. 4. Threshold value � = (E =N ) versus the field order for the (d =2; d = 4)-LDPC code.

B. Frame-Error Rate (FER) Performance for Small CodewordLengths

In this section, we present the simulation results of the dif-ferent EMS algorithms for various LDPC codes and variousfield orders. Motivated by the asymptotical threshold analysispresented in the last section, we focus on codes over high-orderfield, that is GF(64), GF(128), and GF(256). For those high or-ders, it has been shown that the best LDPC codes decoded withBP should be ultra-sparse [7], [8], that is, with the minimumvariable-node connection . We have, therefore, simu-lated LDPC codes for two different rates and with : rate

-LDPC codes, and rate-LDPC codes. These codes have been de-

signed with the PEG method of [20]. In all cases, the correctionvalues have been chosen as those reported in Table I, assumingthe code lengths are long enough to validate results obtained

Fig. 5. Comparison between BP and EMS decodings for a regular (2,4)-LDPCcode over GF(256) (R = 1=2, N = 106 GF(256) symbols, N = 848 equiv-alent bits). 50 decoding iterations have been used for all decoders.

via DE. In our simulations, we have more extensively studiedthe impact of the field order and the codeword length on the(2,4)-LDPC code.

Figs. 5 and 6 show the FER of the (2,4)-LDPC code for asmall codeword size corresponding to a typical ATM frame( information bytes), for two fields, GF(256) andGF(64), respectively. We observe that for GF(256), with asignificant complexity reduction, the EMS algorithm with

and offset correction performs within0.15 dB of BP decoding. Hence, the threshold value obtainedvia DE provides a good estimate in that case. The EMS algo-rithm with and factor correction performseven slightly better at low SNR values, but suffers from anerror floor. This behavior with scaling correction factor hasalso been observed for irregular binary LDPC codes and inthat case, can be compensated [21]. The sphere packing bound(SPB) of [22] has also been represented in Fig. 5. We observethat the EMS algorithm with correction performs within 1.0 dBof the SPB with relatively low complexity. Hence, the codeconsidered is quite powerful, considering that the SPB doesnot take into account the losses due to binary phase-shiftkeying (BPSK) modulation (roughly evaluated at 0.2 dB) andto iterative decoding (especially for such a relatively short codelength). In fact, such codes have lengths for which very fewcompetitive codes close to the SPB and decodable in real timecan be found (for short lengths, Bose–Chaudhuri–Hocquengem(BCH) codes are close to the SPB and for long lengths, turboor LDPC codes are close to the SPB) [23]. These observationsconfirm that medium-length nonbinary LDPC codes over largefinite fields are good codes [6], [7], and our results provideefficient decoding algorithms for fast VLSI implementations ofthese codes.

The performance gaps between the same algorithms becomemuch smaller for the GF(64) code represented in Fig. 6, mostlydue to the fact that the relative difference between the field order

and the algorithm complexity is much smaller com-pared with the GF(256) code. We can also observe a poorer per-formance achieved by BP decoding in that case, with an errorfloor starting at the FER . This can be attributed to both

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 4, APRIL 2007 641

Fig. 6. Comparison between BP and EMS decodings for a regular (2,4)-LDPCcode over GF(64) (R = 1=2,N = 142GF(64) symbols,N = 852 equivalentbits). 50 decoding iterations have been used for all decoders.

Fig. 7. Comparison between BP and EMS decodings for a regular (2,8)-LDPCcode over GF(128) (R = 3=4,N = 84 GF(128) symbols, N = 588 equiva-lent bits). 50 decoding iterations have been used for all decoders.

the smaller field order which worsens the low-weight profile [7]and the suboptimality of iterative decoding. For indoor applica-tions with those packet sizes, an FER of about(very high) is usually desired. For other applications, still withthose packet sizes (such as satellite communications), a FERbelow (very low) needs to be achieved. Conse-quently, the results reported in Figs. 5 and 6 are relevant to ei-ther case.

In order to show that our approach is general and also worksfor large values of (and especially with much smaller than

), the results for the (2,8)-LDPC code over GF(128) are de-picted in Fig. 7. The EMS still performs very well comparedwith BP while or is very small compared with

. The same remarks made for the previous figures applyto this figure.

Figs. 8 and 9 depict the performance of the (2,4)-LDPC codesfor a larger codeword length corresponding to a typical MPEGframe size of information bytes. In both cases, abetter error performance has been achieved due to the increase

Fig. 8. Comparison between BP and EMS decodings for a regular (2,4)-LDPCcode over GF(256) (R = 1=2, N = 376 GF(256) symbols, N = 3008

equivalent bits). 50 decoding iterations have been used for all decoders.

Fig. 9. Comparison between BP and EMS decodings for a regular (2,4)-LDPCcode over GF(64) (R = 1=2,N = 500 GF(64) symbols, N = 3000 equiva-lent bits). 50 decoding iterations have been used for all decoders.

in code length, and the same remarks as for the shorter codescan be made. Moreover, we observe that the gap between theSPB and the offset correction EMS has not increased comparedto Figs. 5 and 6. This is important since it shows that our re-duced-complexity algorithm is robust to the length of the LDPCcodes.

Finally, we observe a phenomenon that has been alreadypointed out for irregular binary LDPC codes; the MS (in ourcase, the EMS) corrected by an offset value could performbetter than the BP algorithm at low FER. This can be seen inFigs. 6, 7, and 9. Although this could be surprising, we mustrecall that the BP algorithm is not optimal, and that the effect ofpseudocodewords due to the presence of cycles in finite-lengthcodes could be different on the decoding algorithms.

V. CONCLUSION

We have presented several decoding algorithms for nonbi-nary LDPC codes over GF which aim at reducing the com-putational complexity compared with the regular BP decoder.

642 IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 4, APRIL 2007

In the probability domain, the complexity of BP can be greatlyreduced by performing the check-node updates in the frequencydomain. Although the use of FFTs has already been proposedin the literature for nonbinary codes, we show in this letter thatthe use of a tensoral representation of the messages is useful fora comprehensive description of the decoder. The complexity ofthe BP-FFT decoder is reduced to per degree

check node but still suffers from the fact that multiplicationsand divisions (for message normalization) are required. In orderto avoid these complex operations, we have also proposed re-duced-complexity algorithms in the log domain using LDR asmessages. The proposed EMS algorithm is a generalization ofthe MS algorithm used for binary codes, and has the advantageof performing additions only, while using only a limited number

of messages for the check-node update. We stress the factthat we focus on a local simplification of the check-node pro-cessing. This means, in particular, that the EMS algorithm applyboth for regular and irregular nonbinary LDPC codes, althoughall the simulations in this letter were done with regular LDPCcodes.

The complexity of the EMS algorithm is percheck node. For values of providing near-BP error per-formance, this complexity is roughly the same as that of theBP-FFT decoder, but without multiplications or divisions. Wehave also shown that by correcting the messages along thedecoding process, the EMS algorithm can approach the perfor-mance of the BP-FFT decoder, and even in some cases slightlyoutperform the BP-FFT decoder. The correction techniquesthat we developed are based on an application of a factor or anoffset to the messages at the input of a variable node, and thefactor/offset has been optimally chosen with the help of simu-lated DE. The EMS algorithm then becomes a good candidatefor hardware implementation of nonbinary LDPC decoders,since its complexity has been greatly reduced compared withthat of other decoding algorithms for nonbinary LDPC codesand the performance degradation is small or negligible.

APPENDIX AEMS REDUCES TO MS IN GF(2)

In this appendix, it is shown that the proposed suboptimalalgorithm is based on assumptions that are coherent with thebinary MS algorithm. More precisely, when we apply EMS tothe binary case, it reduces to the MS algorithm. Since one cancompute the check-node output message recursively, taking theinput messages pairwise, we can limit our discussion to the caseof only two incoming messages and one ouput message.

A. MS Algorithm

In the binary case, the messages in their tensoral representa-tion are simple vectors

(19)

Let and bethe two input messages in their usual LDR form, and let

be the output message. The MS algorithm is

given by

sign sign (20)

In order to compare the two algorithms (MS, EMS), we considerfour cases:Case 1: ;Case 2: ;Case 3: ;Case 4:

.Note that these four cases are sufficient because the other fourpossible cases are symmetrical only by exchanging and .

B. EMS Algorithm in GF(2)

The messages used in the EMS algorithm are defined by [see(10)]

(21)

where , and .We have performed step of the EMS algorithm in

each of the four cases:

Case 1:after postprocessing

;

Case 2:after postprocessing

;

Case 3:after postprocessing

;

Case 4:after postprocessing

.The results obtained by the EMS algorithm are the same than forthe MS algorithm. The two algorithms are, therefore, equivalentin GF(2).

REFERENCES

[1] R. G. Gallager, Low-Density Parity-Check Codes. Cambridge, MA:MIT Press, 1963.

[2] T. J. Richardson, M. A. Shokrollahi, and R. L. Urbanke, “Design of ca-pacity-approaching low-density parity check codes,” IEEE Trans. Inf.Theory, vol. 47, no. 2, pp. 619–637, Feb. 2001.

[3] T. J. Richardson and R. L. Urbanke, “The capacity of low-densityparity-check codes under message-passing decoding,” IEEE Trans.Inf. Theory, vol. 47, no. 2, pp. 599–618, Feb. 2001.

[4] M. G. Luby, M. Mitzenmacher, M. A. Shokrollahi, and D. A. Spielman,“Improved low-density parity-check codes using irregular graphs,”IEEE Trans. Inf. Theory, vol. 47, no. 2, pp. 585–598, Feb. 2001.

[5] S. Y. Chung, G. D. Forney, T. J. Richardson, and R. L. Urbanke, “Onthe design of low-density parity check codes within 0.0045 dB of theShannon limit,” IEEE Commun. Lett., vol. 5, no. 2, pp. 58–60, Feb.2001.

[6] M. Davey and D. J. C. MacKay, “Low density parity check codes overGF(q),” IEEE Commun. Lett., vol. 2, no. 6, pp. 165–167, Jun. 1998.

[7] X.-Y. Hu and E. Eleftheriou, “Binary representation of cycle Tanner-graph GF(2 ) codes,” in Proc. IEEE Int. Conf. Commun., Paris, France,Jun. 2004, pp. 528–532.

[8] D. J. C. MacKay and M. Davey, “Evaluation of Gallager codes for shortblock length and high rate applications,” in Proc. IMA Workshop Codes,Syst., Graphical Models, 1999, CD-ROM.

[9] H. Song and J. R. Cruz, “Reduced-complexity decoding of Q-aryLDPC codes for magnetic recording,” IEEE Trans. Magn., vol. 39, no.3, pp. 1081–1087, Mar. 2003.

IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. 55, NO. 4, APRIL 2007 643

[10] L. Barnault and D. Declercq, “Fast decoding algorithm for LDPC overGF(2 ),” in Proc. Inf. Theory Workshop, Paris, France, Mar. 2003, pp.70–73.

[11] J. Chen and M. Fossorier, “Density evolution for two improvedBP-based decoding algorithms of LDPC codes,” IEEE Commun. Lett.,vol. 6, no. 5, pp. 208–210, May 2002.

[12] H. Wymeersch, H. Steendam, and M. Moeneclaey, “Log-domain de-coding of LDPC codes over GF(q),” in Proc. IEEE Int. Conf. Commun.,Paris, France, Jun. 2004, pp. 772–776.

[13] A. P. Hekstra, Reduced-complexity decoding of LDPC codes overGF(q), preprint, 2003.

[14] F. Kschischang, B. Frey, and H.-A. Loeliger, “Factor graphs and thesum product algorithm,” IEEE Trans. Inf. Theory, vol. 47, no. 2, pp.498–519, Feb. 2001.

[15] R. M. Tanner, “A recursive approach to low complexity codes,” IEEETrans. Inf. Theory, vol. IT-27, no. 5, pp. 533–547, Sep. 1981.

[16] N. Wiberg, “Codes and decoding on general graphs,” Ph.D. disserta-tion, Linköping Univ., Linköping, Sweden, 1996.

[17] X. Li and M. R. Soleymani, “A proof of the Hadamard transform de-coding of the belief propagation decoding for LDPCC over GF(q),” inProc. Fall Veh. Technol. Conf., Sep. 2004, vol. 4, pp. 2518–2519.

[18] M. Fossorier, M. Mihaljevic, and H. Imai, “Reduced complexity it-erative decoding of LDPC codes based on belief propagation,” IEEETrans. Commun., vol. 47, no. 5, pp. 673–680, May 1999.

[19] J. Chen, A. Dholakia, E. Eleftheriou, M. Fossorier, and X.-Y. Hu, “Re-duced-complexity decoding of LDPC codes,” IEEE Trans. Commun.,to be published.

[20] X.-Y. Hu, E. Eleftheriou, and D. M. Arnold, “Regular and irregularprogressive edge-growth tanner graphs,” IEEE Trans. Inf. Theory, vol.51, no. 1, pp. 386–398, Jan. 2005.

[21] J. Chen, R. M. Tanner, C. Jones, and Y. Li, “Improved min-sum de-coding algorithms for irregular LDPC codes,” in Proc. IEEE Int. Symp.Inf. Theory, Adelaide, Australia, Sep. 2005, pp. 449–453.

[22] C. E. Shannon, “Probability of error for optimal codes in a Gaussianchannel,” Bell Syst. Tech. J., vol. 38, pp. 611–656, 1959.

[23] S. Dolinar, D. Dolinar, and F. Pollara, “Code performance as a functionof block size,” Jet Propulsion Lab., Pasadena, CA, TMO Progr. Rep.42-133, May 1998.

[24] H. Wymeersch, H. Steendam, and M. Moeneclaey, “Computationalcomplexity and quantization effects of decoding algorithms of LDPCcodes over GF(q),” in Proc. ICASSP, Montreal, QC, Canada, May2004, pp. 772–776.

[25] A. Goupil, M. Colas, G. Gelle, and D. Declercq, “On BP decoding ofLDPC codes over groups,” in Proc. Int. Symp. Turbo Codes, Munich,Germany, Apr. 2006, CD-ROM.

[26] J. Zhao, F. Zarkeshvari, and A. H. Banihashemi, “On implementationof min-sum algorithm and its modifications for decoding LDPC codes,”IEEE Trans. Commun., vol. 53, no. 4, pp. 549–554, Apr. 2005.