
Long Erasure Correcting Codes: the New Frontier for

Zero Loss in Space Applications?

Enrico Paolini∗ and Marco Chiani†

D.E.I.S., WiLAB, University of Bologna, Cesena, Italy

Gian Paolo Calzolari‡

European Space Agency, D/OPS, ESOC, Darmstadt, Germany

In space communications, traditional error correction/detection techniques deliver to the upper layers of the communication stack only the data units for which integrity can be guaranteed. The uncorrectable data units are then “lost”, and the upper layers typically have to face data unit erasures. Hence, the packet erasure channel (PEC) is the most appropriate channel model from the point of view of the upper layers. Automatic repeat request (ARQ) is the “traditional” solution implemented at the upper layers in order to face data unit erasures. However, ARQ is not always possible for space or satellite communications. In such situations, forward error correction (FEC) must be used. Nowadays new techniques for FEC are available also for application at the upper layers. Long erasure correcting (LEC) codes represent a new and very promising proposal for packet erasure FEC. They are able to overcome the complexity limitations of other types of codes, while preserving very good erasure correction capability. They are currently under investigation within the CCSDS (Consultative Committee for Space Data Systems) long erasure codes Birds of a Feather (LEC-BOF) group, where a leading role has so far been played by ESA/ESOC and NASA/JPL. In this paper, the activity of the LEC-BOF is illustrated. More in detail, the basic ideas behind LEC codes are presented, as well as the possible code structures, the encoding and decoding rules, and some theoretical properties. Some numerical results are presented, showing the performance of LEC codes on both memory-less and burst erasure channels.

I. Introduction

Space communications exploit several techniques at different layers of the communication stack in order to guarantee the correct delivery of information to the receiver. As stated in the Consultative Committee for Space Data Systems (CCSDS) recommendations, the space link protocols can be described according to a five-layer model, as depicted in Fig. 1.1 One of the tasks of the upper layers is to verify that all the data units are correctly delivered to the receiver. Traditional error correction and detection techniques only deliver the data units for which integrity can be guaranteed; on the contrary, the uncorrectable data units are “lost”. For this reason, the upper layers typically have to face data unit erasures, and the packet erasure channel (PEC), where whole packets of bits are either correctly received or lost, is the most appropriate channel model from the point of view of the upper layers. Packet losses can occur as a consequence of brief outage conditions due to weather, shadowing, or loss of frame synchronization.

We explicitly remark that the term “packet” used in this paper has a very general meaning, and does not refer to the data units of a specific protocol within the model presented in Fig. 1. In other words, a “packet” may represent a transfer frame, a space packet, or any data unit properly defined by the user. The expression long erasure code (LEC) packet will sometimes be used in the paper, in order to avoid confusion.

Packet erasures due to the aforementioned causes are usually correlated, and bursts of erasures can take place.

∗E-mail: [email protected]
†E-mail: [email protected]
‡E-mail: [email protected]


Figure 1. Space link protocols model. (Protocols shown include: CCSDS File Delivery Protocol (CFDP), SCPS-FP, FTP, Application Specific Protocols, and Lossless Data Compression at the Application Layer; SCPS-TP, TCP, UDP, and SCPS-SP at the Transport Layer; Space Packet Protocol, SCPS-NP, IP Version 4, and IP Version 6 at the Network Layer; TM/TC/AOS Space Data Link Protocols and the Proximity-1 Space Link Protocol in the Data Link Protocol Sublayer, with TM/TC Sync. and Channel Coding in the Sync. and Channel Coding Sublayer of the Data Link Layer; RF and Modulation Systems at the Physical Layer.)

This correlation is rarely considered in the channel model, and the memory-less PEC, where packets are lost independently with equal erasure probability, is usually assumed. This choice can be correct if sufficiently long packet interleaving is present in the system. Packet interleaving consists of performing a packet permutation at the transmitter, before transmission, and the inverse permutation at the receiver. This procedure can properly spread the erasure bursts over the sequence of received packets. However, when the interleaving procedure (if available) only involves packets belonging to the same codeword, and the erasure burst length is not small with respect to the codeword length, the memory-less assumption cannot provide an accurate model of the communication system. (As explained in Section II, in this context a codeword is represented by a set of n encoded packets, generated by a set of k information packets.)

Automatic repeat request (ARQ) is the “traditional” solution implemented at the upper layers in order to face data unit erasures. However, ARQ is not always possible for space or satellite communications. This typically happens in deep space missions, where the long round-trip delay introduces drawbacks with respect to both the time needed to request and receive the lost data unit(s) and the increase of the memory requirements needed to ensure a very long persistence. It also happens when a feedback channel is not available, in satellite broadcast, where the satellite is not able to manage several retransmission requests, or when the on-board memory is limited and persistence of the data cannot be guaranteed. In these cases, exploiting packet-oriented forward error correction (FEC) techniques is the only possibility to face packet erasures. This upper layer FEC has no impact on the traditional FEC techniques implemented at the synchronization and channel coding sublayer of the data link layer (see Fig. 1). It is also worth mentioning that, when ARQ is possible, hybrid ARQ/FEC solutions can be considered.2 In this paper, however, only the pure FEC approach is considered.

Recall that an (n, k) linear block code with minimum distance dmin can successfully recover with probability 1 from any pattern of dmin − 1 or fewer erasures, and that the minimum distance always satisfies the Singleton bound dmin ≤ n − k + 1. When bounded distance decoding is used, if the initial number of erasures is less than dmin decoding is performed; otherwise a decoding failure is declared. More powerful erasure correcting algorithms exist, but none of them can successfully recover with probability 1 from a number of erasures greater than dmin − 1. Reed-Solomon codes achieve the Singleton bound with equality (they are maximum distance separable codes). Hence, a Reed-Solomon code can always recover from any pattern of at most n − k erasures. However, the performance of Reed-Solomon codes is limited by practical constraints on the codeword length n, which are imposed by the decoding complexity. A typical value of the codeword length is n = 255. These limitations on the codeword length bring other drawbacks, such as the need to encode a long file using several codewords, and the impossibility of facing long erasure bursts.
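As a concrete illustration, a worked instance of the Singleton bound for the (255, 223) Reed-Solomon code adopted in the CCSDS channel coding recommendations:

\[
d_{\min} = n - k + 1 = 255 - 223 + 1 = 33 ,
\]

so bounded distance erasure decoding recovers any pattern of at most n − k = 32 erased symbols per codeword, and no decoding algorithm can guarantee more.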

Nowadays new techniques for forward error correction are available also for application at the upper layers. In particular, long erasure correcting (LEC) codes represent a new and very promising proposal for packet erasure FEC. This proposal has been formulated on the basis of the excellent performance that can be obtained by iteratively decoded, long codes based on sparse graphs. In general, they are able to overcome the complexity limitations of maximum distance separable codes, while preserving good or very good erasure correction capability. Their linear encoding and decoding complexity enables long codeword lengths, thus achieving extremely good performance, outperforming the maximum distance separable codes of practical use. This also permits encoding long files as a single codeword, and facing long bursts of erasures. LEC codes are currently under investigation within the CCSDS, where a long erasure codes Birds of a Feather (LEC-BOF) group has been created to determine the potential and the applicability of this kind of codes. In this CCSDS group a leading role has been played by ESA/ESOC and NASA/JPL. The aim of the LEC-BOF was set to investigate possible benefits of these codes for future CCSDS missions, to study the proper channel model, and to formulate the proper requirements. In this paper, the activity of the LEC-BOF is illustrated and the basic ideas behind LEC codes are clarified. Possible code structures, encoding and decoding rules, some theoretical properties, and technical proposals presented by members during the CCSDS meetings are also presented. A possible algorithm for improving the performance of LEC codes on burst erasure channels is also described.

II. Long Erasure Correcting Codes

Suppose that q packets of bits x1, x2, . . . , xq are combined together in order to generate a new packet xq+1. Suppose that packet xq+1 is generated such that, if any of the packets in the set {x1, . . . , xq, xq+1} is unknown, it can always be reconstructed when the other q packets are known. If, as usual, x1, x2, . . . , xq have the same length L (intended as the number of bits), then xq+1 can be generated by a simple bitwise XOR of x1, x2, . . . , xq, with resulting length L. In order to indicate that xq+1 is generated by x1, x2, . . . , xq, the graphical notation of Fig. 2 will be adopted. An equivalent graphical notation is depicted in Fig. 3, whose meaning is that the bitwise XOR of x1, . . . , xq, xq+1 is equal to the all-zero packet.

Suppose that these q + 1 packets are transmitted over a PEC and that all but one of them are correctly received. In such a situation the missing packet can always be reconstructed at the receiver. If xq+1 has been generated as the bitwise XOR of x1, x2, . . . , xq, then the missing packet xi can be obtained as

xi = ⊕j≠i xj ,

i.e. as the bitwise XOR of the q received packets. If more than one packet is lost, there is no chance to recover from the erasure pattern.

In this example, a parity constraint exists on x1, . . . , xq, xq+1. Specifically, the q + 1 bits in the same position of each packet must satisfy a parity constraint, since their XOR must be 0. For this reason, the square node in Fig. 3 is called a parity-check node (or simply a check node), and the nodes x1, x2, . . . , xq, xq+1 are called variable nodes. A parity-check node enables the correction of one erased packet.
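A minimal sketch of this parity mechanism (illustrative Python; packets are modelled as byte strings of equal length, and all names are illustrative):

```python
def xor_packets(packets):
    """Bitwise XOR of a list of equal-length packets (byte strings)."""
    out = bytearray(len(packets[0]))
    for p in packets:
        for i, byte in enumerate(p):
            out[i] ^= byte
    return bytes(out)

# Generate the parity packet x_{q+1} from q = 3 information packets.
x = [bytes([1, 2, 3]), bytes([4, 5, 6]), bytes([7, 8, 9])]
parity = xor_packets(x)  # x_{q+1}

# Any single erased packet is the XOR of the q remaining ones.
erased = 1
received = [p for i, p in enumerate(x + [parity]) if i != erased]
assert xor_packets(received) == x[erased]
```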

A. Low-Density Parity-Check Codes and their Iterative Decoding

Consider the code structure of Fig. 4, where each packet xi (i = 1, . . . , n) to be transmitted is involved in multiple parity constraints. The code graph is supposed to be sparse, i.e. the number of edges is small with respect to the product nm (the maximum possible number of edges in the absence of multiple connections between two nodes), and the overall code is called a low-density parity-check (LDPC) code.3 The sequence x1, x2, . . . , xn is called the codeword, and n is the codeword length.a The codeword is generated at the transmitter by an encoding operation, consisting in the generation of the n encoded packets x1, x2, . . . , xn from the k information packets u1, u2, . . . , uk. The ratio k/n is called the code rate, denoted by R. The code graph is a bipartite graph, since no connections are allowed between two variable nodes or between two check nodes. After the transmission of the n packets on the PEC, some of the packets will be known at the decoder, while the others will be lost. The technique for decoding the missing packets is to iteratively apply the procedure described above.

a In this context, each encoded symbol is represented by a packet of bits.


Figure 2. Generation of xq+1 from x1, x2, . . . , xq.

Figure 3. Parity constraint satisfied by x1, x2, . . . , xq, xq+1.

Figure 4. LDPC code structure. (Variable nodes x1, x2, x3, . . . , xn connected to check nodes c1, c2, c3, . . . , cm; one edge is labelled e.)

For each check node connected to exactly one missing packet, that missing packet can be successfully reconstructed: as explained in the example above, the erased packet is obtained simply as the bitwise XOR of the known packets. If the number of unknown packets connected to a check node is greater than or equal to 2, no packet can be recovered by that check node. As soon as the last packet connected to it has been received, each check node can be processed for recovery of missing packet(s): in principle, for some check nodes recovery will be possible, for others it will not. When the last check node has been considered, a new iteration on the check nodes is started. If no packet is recovered at a certain iteration ℓ, then surely no packet will be recovered at iteration ℓ + 1. Hence, the iterative algorithm is stopped as soon as there are no more unknown packets, or as soon as no packet is corrected at a certain iteration while unknown packets still exist. In the former case decoding is successful; in the latter a decoding failure is declared.b However, even in this case, some of the packets originally missing may have been successfully reconstructed.
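A minimal sketch of this peeling procedure (illustrative Python, reusing xor_packets from the previous sketch; the graph is assumed to be given as a list of check-node neighborhoods, with every check node of degree at least 2):

```python
def peeling_decode(check_neighborhoods, received):
    """Iterative (peeling) erasure decoding sketch.

    check_neighborhoods: list of index lists; entry c holds the variable
        nodes connected to check node c.
    received: dict {variable index: packet} holding the packets not erased
        by the channel; recovered packets are added to it in place.
    """
    progress = True
    while progress:
        progress = False
        for neighbors in check_neighborhoods:
            missing = [v for v in neighbors if v not in received]
            if len(missing) == 1:
                # Exactly one unknown packet: it is the XOR of the others.
                known = [received[v] for v in neighbors if v in received]
                received[missing[0]] = xor_packets(known)
                progress = True
    return received  # decoding failed if some variable nodes are still absent
```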

This iterative decoding algorithm for erasure correction is characterized by a decoding complexity that increases linearly with the codeword length.4 In fact, the decoding complexity, intended as the number of XORs performed to successfully decode the erasure pattern, is proportional to the total number of edges; since the bipartite graph is sparse by hypothesis, the decoding complexity is proportional to the number of variable nodes. We also observe that it is possible to give an equivalent description of the same decoding algorithm as an instance of the belief propagation algorithm,5 i.e. of the traditional iterative decoding algorithm of LDPC codes used for bit-oriented error correction.

B. IRA Codes

The decoding procedure for LDPC codes on the PEC is extremely efficient, even in the case where the connections between variable and check nodes are chosen at random. However, this is not the case for encoding. The encoding procedure consists in generating n encoded packets from k < n information packets. The complexity of the encoding procedure, intended as the number of operations needed to generate the encoded packets, is in general a quadratic function of n. This quadratic complexity imposes a severe constraint on the codeword length n. Thus, the possibility to exploit long codes is bound by the capability to perform the encoding operation with low complexity. This task is achieved by carefully choosing the connections between variable and check nodes in the bipartite graph.

Several promising solutions exist for the efficient encoding of LDPC codes. In the following we describe one of these solutions, namely IRA codes.6

b Actually, if the code is systematic, a decoding failure is declared when, at the end of the decoding algorithm, at least one systematic packet of bits is still erased.


Figure 5. Structure of the bipartite graph of an IRA code. (Information nodes u1, . . . , uk, check nodes, and redundant nodes p1, . . . , pn−k.)

Figure 6. Structure of a Tornado code. (Information packets u1, . . . , uk in the first layer; the remaining layers hold the redundant packets.)

Figure 7. Tornado code as an instance of an LDPC code. (Information nodes u1, . . . , uk and check nodes c1, . . . , cn−k, of which l2 generate the l2 redundant packets of the second layer and l3 generate the l3 redundant packets of the third layer.)

IRA codes have already been considered for implementation as transport layer erasure correcting codes in satellite broadcast applications.7

IRA encoding is systematic. This means that the first k encoded packets x1, x2, . . . , xk are equal to the information packets u1, u2, . . . , uk. The last n − k encoded packets p1 = xk+1, p2 = xk+2, . . . , pn−k = xn are called redundant packets. In IRA encoding, a first encoded packet p1 is generated as the bitwise XOR of a subset of the information packets. Then, the other n − k − 1 encoded packets are generated according to the following rule: packet pi, i = 2, . . . , n − k, is generated as the bitwise XOR of a subset of the information packets and of packet pi−1, as depicted in Fig. 5. In this figure, the redundant packets have been represented on the right of the check nodes; however, the meaning of the connections between the nodes remains the same as for Fig. 3. The resulting systematic codeword is [u1, . . . , uk, p1, . . . , pn−k]. The complexity of the IRA encoder, intended as the number of XOR operations, is linear in the codeword length n. Moreover, the systematic nature of the code allows immediate use of the information packets not erased by the channel.
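A minimal sketch of the encoder (illustrative Python, reusing xor_packets; the index subsets selecting which information packets enter each redundant packet are assumed to be given by the code design):

```python
def ira_encode(info_packets, subsets):
    """Systematic IRA encoding sketch.

    info_packets: the k information packets u_1, ..., u_k (byte strings).
    subsets: n-k index lists; subsets[i] selects the information packets
        XORed into the redundant packet p_{i+1}.
    Returns the systematic codeword [u_1, ..., u_k, p_1, ..., p_{n-k}].
    """
    redundant = []
    prev = None
    for idx in subsets:
        terms = [info_packets[j] for j in idx]
        if prev is not None:
            terms.append(prev)  # accumulator: p_i also depends on p_{i-1}
        prev = xor_packets(terms)
        redundant.append(prev)
    return list(info_packets) + redundant
```

Each redundant packet costs a bounded number of XORs, which makes the linear overall complexity evident.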

C. Tornado Codes

A variation with respect to the LDPC code structure is represented by Tornado codes.4 The structure of a Tornado code is depicted in Fig. 6. This structure consists of a cascade of a certain number of packet layers. The k packets u1, u2, . . . , uk belonging to the first layer are the information packets, while the n − k packets belonging to all other layers are the redundant packets; the overall code is then systematic.

A sparse bipartite graph connects each layer to the next one, where the connections have the same meaning as in Fig. 2: if a packet in layer j + 1 is connected to q packets in layer j, then it is generated as the bitwise XOR of these q packets. Therefore, the encoding algorithm for Tornado codes is very simple: packets in the second layer are generated from the information packets through bitwise XORs, packets in the third layer are generated from packets in the second layer, and so on, up to the packets in the last layer. Note that the complexity of this encoding rule is linear in the codeword length, independently of the structure of the connections in the cascade.
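A minimal sketch of this layered encoder (illustrative Python, again reusing xor_packets; the per-layer connection lists are assumed given by the code design):

```python
def tornado_encode(info_packets, layer_connections):
    """Tornado encoding sketch: each layer is XOR-generated from the previous.

    layer_connections[j] holds one index list per packet of layer j+2,
    listing the positions in layer j+1 of its XOR inputs.
    Returns the list of layers; all layers after the first are redundant.
    """
    layers = [list(info_packets)]
    for connections in layer_connections:
        prev = layers[-1]
        layers.append([xor_packets([prev[i] for i in idx]) for idx in connections])
    return layers
```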

The decoding algorithm of Tornado codes exploits the same simple rule as LDPC codes, illustrated at the beginning of this section. Consider q packets in layer j connected to one packet in layer j + 1. If one of the q packets in layer j is unknown, while the other q − 1 packets and the packet in layer j + 1 are known, then the missing packet can be reconstructed as the bitwise XOR of the known packets. This decoding rule is applied in the same recursive way as for LDPC codes, for each pair of layers in the cascade. If the cascade has S layers of packets, the iterative decoding rule is first performed at layers S and S − 1, then at layers S − 1 and S − 2, and so on, up to the first layer. Hence, the direction of the decoding process is opposite to the direction of encoding. It is worth mentioning that, if the number of packets involved in layers S − 1 and S is sufficiently small, then it is possible to use a more powerful algorithm at the first step of decoding, instead of iterative decoding. For instance, maximum a posteriori (MAP) decoding can be used, combined with a linear code structure with good minimum distance properties. This can produce a small improvement in the overall code performance.

The structure of a Tornado code can be seen as a special instance of an LDPC code. This concept is explained in Fig. 7, where a Tornado code with three layers of nodes (one layer of information packets and two layers of redundant packets) is represented as an LDPC code. The information nodes are connected to a number of check nodes equal to the number l2 of redundant packets in the second layer. Each of these check nodes has an edge towards a specific redundant node in the second layer. Again, the variable nodes in the second layer are connected to a number of check nodes equal to the number l3 of redundant nodes in the third layer, and each such check node has one connection towards a specific redundant node in the third layer. The total number of check nodes is n − k. This observation suggests the interpretation of Tornado codes as special LDPC codes, where constraints are imposed on the connections between variable and check nodes in order to achieve efficient encoding.

D. Protograph Codes

Protograph codes8 are a subclass of LDPC codes. The bipartite graph of a protograph code is obtained starting from a bipartite graph with a small number of edges and nodes, called the protograph. Typically, puncturing is applied to the variable nodes: this means that some of the variable nodes in the protograph are transmitted, while others are not. To obtain the final bipartite graph, a certain number of copies of the protograph are first generated, so as to achieve the desired codeword length n. In this phase, the bipartite graph is composed of a certain number of unconnected graphs. Next, an edge permutation operation is performed in order to make the overall graph connected. More specifically, each edge in the graph connecting a check node c and a variable node x before the permutation will connect a copy of c to a copy of x after the edge permutation. The code rate of the overall bipartite graph is the same as the code rate of the protograph. Two examples of good protographs for the erasure channel are shown in Fig. 8 and Fig. 9.9 In these protographs, the black nodes represent punctured variable nodes.
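A minimal sketch of the copy-and-permute expansion (illustrative Python; the protograph edge list, the lifting factor T, and the independent per-edge random permutations are assumptions of this sketch, not the construction of any specific code):

```python
import random

def expand_protograph(proto_edges, T, seed=0):
    """Copy-and-permute expansion sketch.

    proto_edges: list of (check, variable) pairs of the protograph.
    T: number of protograph copies.
    Returns the edge list of the expanded graph: copy t of each check node
    is connected to copy pi(t) of the variable node, for a permutation pi
    drawn independently for every protograph edge.
    """
    rng = random.Random(seed)
    expanded = []
    for (c, v) in proto_edges:
        pi = list(range(T))
        rng.shuffle(pi)
        for t in range(T):
            expanded.append(((c, t), (v, pi[t])))
    return expanded
```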

E. Generalized Low-Density Parity-Check Codes c

The square node in Fig. 3 represents a bitwise parity constraint on the packets x1, x2, . . . , xq+1. This simple constraint is a single parity-check code of length q + 1: since its minimum distance is dmin = 2 (independently of the length), it is able to correct any single erasure. Next, suppose that the same square node represents a binary linear block code of length q + 1 and minimum distance dmin > 2 (actually, it represents L such codes working in parallel, where L is the packet length). In this case, the check node has a higher erasure correction capability than a single parity-check code. Several decoding algorithms can be performed at the check node. For example, bounded distance decoding can be used: if the number e of erasures satisfies e < dmin, decoding is successfully performed; otherwise a decoding failure is declared. The most powerful decoding algorithm is MAP decoding. MAP decoding is always successful when e < dmin, it can be successful when dmin ≤ e ≤ n − k, and it is never successful when e > n − k. Although in the last case decoding is always unsuccessful, some missing packets may still be correctable.

c Generalized LDPC codes have never been considered within the CCSDS LEC-BOF. They are included in this paper for completeness.


Figure 8. A R = 1/2 protograph achieving a threshold p∗ = 0.4776.

Figure 9. A R = 1/2 protograph achieving a threshold p∗ = 0.4825.

The idea behind generalized LDPC (GLDPC) codes, sometimes called Tanner codes,10 is to use more powerful check codes instead of single parity-check codes. Using only linear block codes with dmin > 2 as check nodes has been shown to produce a rate loss that makes this solution unattractive for high rate codes (code rate R = k/n > 1/2).11 However, hybrid check structures composed of both single parity-check codes and linear block codes have been shown to be able to outperform LDPC codes in terms of asymptotic performance.12

III. Long Erasure Correcting Codes Performance

A. Asymptotic Performance

The performance of the iterative decoder described in subsection II-A heavily depends on the degree distribution of variable and check nodes. The degree of a variable or check node is the number of edges connected to that node, while the variable (check) degree of an edge is the degree of the variable (check) node the edge is connected to. For instance, variable node x1 in Fig. 4 has degree 2, check node c1 in the same figure has degree 3, and the edge labelled e has variable degree 2 and check degree 3. It is usual to refer to the edge-oriented degree distributions instead of the node-oriented degree distributions, because most of the equations concerning the asymptotic performance of long erasure codes can be easily formulated in terms of edge-oriented distributions. For an LDPC code, or for each cascading graph in a Tornado code, the fraction of edges with variable degree i is usually denoted by λi, the fraction of edges with check degree i by ρi, and the edge degree distributions are defined as λ(x) = Σi λi x^(i−1) and ρ(x) = Σi ρi x^(i−1) (x is a dummy variable).

The maximum fraction of erasures that a random LDPC code (i.e. an LDPC code with randomly generated bipartite graph) can correct on a memory-less PEC can be expressed as a function of just the degree distribution pair (λ, ρ), if infinite codeword length is assumed. More specifically, for any distribution pair (λ, ρ), a maximum value of the channel erasure probability p exists, denoted by p∗ = p∗(λ, ρ), such that for p < p∗ a random (λ, ρ) LDPC code has vanishing post-decoding erasure probability, in the limit where n tends to infinity. This value is also referred to as the decoding threshold. For unstructured (i.e. random) codes, the threshold has been shown to be equal to the supremum of the values of p such that the inequality ρ(1 − p λ(x)) > 1 − x holds ∀x ∈ (0, 1].13 This inequality is recognized to be a direct consequence of density evolution14 for the memory-less erasure channel. The threshold p∗ of a code with code rate R cannot exceed 1 − R: this is a direct consequence of the channel coding theorem. Since for an LDPC code the code rate is R = k/n = 1 − (Σi ρi/i)/(Σi λi/i), the threshold can never exceed (Σi ρi/i)/(Σi λi/i).
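This condition lends itself to a direct numerical search for p∗ (illustrative Python; λ and ρ are assumed given as {degree: coefficient} maps, and the inequality is checked on a finite grid, so the result is an approximation):

```python
def poly_eval(poly, x):
    """Evaluate an edge-perspective polynomial {i: coeff} at x: sum_i coeff * x**(i-1)."""
    return sum(c * x ** (i - 1) for i, c in poly.items())

def threshold(lam, rho, grid=5000, tol=1e-6):
    """Bisection for p* = sup{p : rho(1 - p*lam(x)) > 1 - x for all x in (0, 1]}."""
    def succeeds(p):
        return all(poly_eval(rho, 1.0 - p * poly_eval(lam, k / grid)) > 1.0 - k / grid
                   for k in range(1, grid + 1))
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if succeeds(mid) else (lo, mid)
    return lo

# Example: the (3,6)-regular pair lam(x) = x**2, rho(x) = x**5
# gives p* close to the value 0.4294 reported in the literature.
print(threshold({3: 1.0}, {6: 1.0}))
```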

The task when considering infinite length performance is to find the degree distribution pair (λ, ρ) with the best threshold p∗, under a number of constraints. Typical constraints are the code rate R, the minimum and maximum variable degree, and the minimum and maximum (or, alternatively, the average) check degree. If vmax and vmin (cmax and cmin) denote the maximum and minimum variable (check) degrees, then the total number of independent variables involved in the optimization problem is vmax − vmin + cmax − cmin − 3. This is because the three constraints Σi λi = 1, Σi ρi = 1, and 1 − R = (Σi ρi/i)/(Σi λi/i) must be satisfied.

Several numerical techniques can be used in order to solve this optimization problem. Among them, one ofthe most effective is the differential evolution algorithm.15

It has been proved that the capacity of the memory-less PEC is achievable by the iterative decoder.4 A sequence of degree distributions (λm, ρm), all with the same code rate R, is said to be capacity achieving of rate R when, for any ε > 0, there exists an m0 such that |(1 − R) − p∗m| < ε for any m > m0, where p∗m is the threshold of (λm, ρm). Some capacity achieving distributions have been analytically evaluated.4,16 However, these distributions are of limited practical interest because of the high error floors (see the next subsection) of codes designed according to them.17,18

The concept of threshold, presented for unstructured LDPC codes, can be extended to IRA codes, Tornado codes, LDPC codes based on protographs, and GLDPC codes as well. In each case, the asymptotic decoding threshold can be computed by exploiting either techniques based on density evolution or techniques based on EXIT charts.19 For protograph codes, density evolution can be directly performed on the protograph (i.e. on a relatively simple graph) in order to evaluate the asymptotic threshold of the overall code. This threshold evaluation is usually very accurate. This also implies that the search for good codes can be performed as a search for good protographs. For instance, the rate 1/2 protograph depicted in Fig. 8 has a threshold p∗ = 0.4776, while the rate 1/2 protograph depicted in Fig. 9 has a threshold p∗ = 0.4825.9 Protographs with higher asymptotic thresholds are also known.

B. Finite Length Performance

The performance of a finite-length LDPC code under iterative decoding depends on particular structures in the bipartite graph, namely stopping sets20 and cycles. Suppose that a subset V of the variable nodes satisfies the following condition: any check node connected to this subset is connected to it at least twice. Recall that a parity-check node in an LDPC code can correct just one erasure: then, if the starting erasure pattern generated by the channel includes V, the iterative decoder will not be able to correct any of the variable nodes in V. Subsets of variable nodes satisfying the aforementioned condition are called stopping sets. If decoding is unsuccessful, the set of the residual erased variable nodes at the end of decoding is the union of all the stopping sets included in the starting erasure pattern. For GLDPC codes, the concept of stopping set is replaced by the concept of generalized stopping set.11 A cycle is defined as a closed path in the bipartite graph, starting and ending on the same node, and the girth of the bipartite graph is the minimum length of a cycle. There is a close relationship between stopping sets and cycles: specifically, it has been proved that any stopping set contains cycles.21 Hence, constructing the bipartite graph according to girth optimizing algorithms or stopping set removal algorithms can improve the code performance.21,22
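The defining condition can be tested directly (illustrative Python, using the same check-node-neighborhood representation as the decoder sketch of Section II):

```python
def is_stopping_set(check_neighborhoods, V):
    """True if the variable-node set V is a stopping set, i.e. no check
    node is connected to V exactly once (0 or >= 2 connections allowed)."""
    V = set(V)
    return all(len(V.intersection(neighbors)) != 1
               for neighbors in check_neighborhoods)
```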

It is important to remark that more powerful (and complex) decoding algorithms like maximum likelihoodcan correct erasure patterns even if they contain stopping sets.

In the performance curve of a finite length long erasure code under iterative decoding, two regions can typically be distinguished. The first region is known as the waterfall: for values of the channel erasure probability p slightly smaller than the asymptotic threshold, a small reduction of p corresponds to a large gain in terms of post-decoding packet erasure probability or decoding failure probability. The second region is called the error floor, and corresponds to smaller values of p. In the error floor region, small performance improvements require relatively large reductions of p. This well known phenomenon is typical of iterative decoders on a wide range of channels. In general, codes with good threshold and waterfall performance have a poor error floor, and vice versa.

For unstructured LDPC codes on the erasure channel, the compromise between waterfall and error floor performance can be partly explained by combining a number of results. First, the error floor depends on small stopping sets: in fact, when p is small, the starting number of unknown variable nodes is usually small as well. Second, if a code has minimum distance dmin, then it must have stopping sets of size dmin.21 Third, if (λm, ρm) is a capacity achieving sequence, then for sufficiently large m and sufficiently large codeword length n the minimum distance of an unstructured (i.e. random) code with distribution (λm, ρm) is proportional to log(n), and hence small.23 Thus, unstructured finite length codes with very good thresholds have poor minimum distance, and therefore stopping sets of small size. This generates the high error floor. Though proved for unstructured codes, this behavior is typically shared by partially structured or structured codes as well. Finally, it has been proved that, with high probability, the minimum distance of an unstructured code with degree distribution pair (λ, ρ) is a linear function of n when λ′(0)ρ′(1) < 1, otherwise it is a logarithmic function of n.23 This result will be recalled in Section V.

IV. Optimization of Long Erasure Codes on Burst Erasure Channels

In most realistic scenarios, packet erasures are correlated and they often occur in bursts. In this section an algorithm recently proposed for the optimization of long erasure codes on burst erasure channels is recalled.24 Though the description of the algorithm is focused on LDPC codes, the basic concepts are applicable to all classes of long erasure codes.


Figure 10. Performance on the memory-less PEC of several (2048, 1024) long erasure correcting codes (R = 1/2) in terms of decoding failure rate (FaR) versus channel erasure probability p. (Curves shown: C1, JPL AR4A, C2, C3, IRA with dv = 5, and irregular IRA; FaR ranges from 1 down to 10−6 for p between 0.32 and 0.48.)

By definition, the maximal guaranteed resolvable burst length of an LDPC code, denoted by Lmax, is the maximal erasure burst length such that the iterative decoder is always able to successfully recover from the burst, independently of its position within the codeword. For large codeword length n and a proper permutation of the variable nodes, Lmax ≈ p∗ n, where p∗ is the threshold over the memory-less erasure channel. Since in most realistic environments any codeword is affected prevalently by just one erasure burst, the parameter Lmax should be maximized in order to improve the performance of long erasure codes in such environments. From an equivalent point of view, concentrated stopping sets (i.e. stopping sets made up of variable nodes which are close to each other in the graph) should be removed from the code graph by increasing their dispersion (the maximal distance between two variable nodes belonging to the stopping set). The proposed optimization algorithm only executes variable node permutations, without modifying their connections towards the check nodes. It receives in input a specific LDPC code with maximum resolvable burst length L′max and returns a new LDPC code with the same degree distributions and with L′′max > L′max.

If L′max is the maximum resolvable length for an LDPC code, then at least one erasure burst of length L′max + 1 exists which is non-resolvable. The basic observation behind the algorithm is the following: if this non-resolvable burst begins on variable node xj (see Fig. 4), then variable nodes xj and xj+L′max must belong to the maximal stopping set comprised in the burst. The optimization algorithm is then based on the following rule: if burst length L1 is non-resolvable due to a burst beginning on encoded packet xj, look for a variable node xi in the set {x1, . . . , xj−1, xj+L1, . . . , xn} such that permuting xj (or xj+L1−1) and xi makes the erasure burst of length L1 resolvable.

Despite its extreme simplicity, this optimization algorithm has been shown to be surprisingly effective. In many cases, it is able to generate an LDPC code with L′′max quite close to the asymptotic value p∗ n; usually, the generated LDPC code has an L′′max substantially higher than the L′max of the input code. For instance, when applied to an irregular LDPC code with codeword length n = 2000 and asymptotic threshold p∗ = 0.4556 (p∗ n ≈ 911), it has been shown to produce an LDPC code with Lmax = 904.24 Other features of the algorithm are its flexibility (it is applicable to any code rate and codeword length) and the possibility to be used within pure FEC schemes, Type I / Type II hybrid ARQ protocols, and parity-on-demand schemes.
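The underlying resolvability test, and a brute-force measurement of Lmax, can be sketched as follows (illustrative Python; an actual optimizer would apply the targeted permutation rule above rather than this exhaustive scan):

```python
def burst_resolvable(check_neighborhoods, n, L, j):
    """True if iterative decoding resolves the burst erasing x_j, ..., x_{j+L-1}."""
    erased = set(range(j, j + L))
    progress = True
    while progress and erased:
        progress = False
        for neighbors in check_neighborhoods:
            missing = erased.intersection(neighbors)
            if len(missing) == 1:          # one unknown packet: peel it
                erased.discard(missing.pop())
                progress = True
    return not erased

def max_resolvable_burst(check_neighborhoods, n):
    """Brute-force L_max: the largest L such that every burst of length L,
    wherever it starts, is resolvable by the iterative decoder."""
    L = 0
    while all(burst_resolvable(check_neighborhoods, n, L + 1, j)
              for j in range(n - L)):
        L += 1
    return L
```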


Figure 11. Performance on the CLBuEC with burst length L = 810 packets of code C1 (2048, 1024) and its optimized version, in terms of decoding failure rate (FaR) versus parameter b. (Curves shown: C1 and optimized C1; FaR ranges from 1 down to 10−6 for b between 10−6 and 10−2.)

Figure 12. Constant length burst erasure channel (CLBuEC) model. (A good state, kept with probability 1 − b and left with probability b towards a chain of L bad states.)

V. Simulation Results

In this section some performance curves for long erasure correcting codes are illustrated and discussed. Consider first Fig. 10, where the performance of several long erasure codes on the memory-less PEC is shown in terms of decoding failure rate (FaR) versus the channel erasure probability p. Each code has code rate R = 1/2 and information block length k = 1024 packets (n = 2048 packets). The code denoted by C1 is a random LDPC code whose distribution has been obtained by running differential evolution subject to the constraints of a minimum variable node degree equal to 3, a maximum variable degree equal to 30, and a maximum check degree equal to 14. The condition λ2 = 0 implies good properties in terms of minimum distance and error floor (λ′(0)ρ′(1) = 0), but does not permit obtaining capacity approaching distributions. This code does not exhibit an error floor down to FaR = 10−6, and its asymptotic threshold p∗ = 0.46296, though not excellent, is rather good. However, the waterfall performance of this code is very poor: the asymptotic threshold does not accurately describe the waterfall performance of code C1 at the finite length n = 2048. Consider now the JPL AR4A protograph code.25 It is characterized by an asymptotic threshold p∗ ≈ 0.44, worse than that of code C1. However, it outperforms code C1 since, for codeword length n = 2048, it exhibits a smaller gap between finite length and asymptotic performance, as well as very good error floor properties.

Next, consider the random LDPC codes denoted by C2 and C3. Their distributions have been obtained by running differential evolution with maximum variable degree equal to 30, minimum variable degree equal to 2, and with an upper bound on the parameter λ′(0)ρ′(1): more specifically, λ′(0)ρ′(1) < 0.3 for C2 and λ′(0)ρ′(1) < 0.4 for C3. The threshold of code C2 is p∗ = 0.44985, while the threshold of C3 is p∗ = 0.45300.


They both outperform the AR4A code in terms of asymptotic threshold and waterfall performance, and they do not exhibit an error floor down to FaR < 10−6. Efficient encoding of the codes C2 and C3 can be performed with linear complexity by exploiting a generalization of the concept of IRA code, namely GeIRA codes.26 The IRA code in the same figure is characterized by a uniform degree dv = 5 for the systematic nodes, and by a threshold p∗ = 0.43795. Its performance is quite similar to that of the AR4A protograph code, with a small gain for FaR < 10−3. It is possible to generate IRA codes with the same codeword length and code rate, and with an optimized distribution. For instance, the irregular IRA code in Fig. 10 has a threshold p∗ = 0.480353. However, it presents a poor error floor that makes it unattractive in applications requiring failure rates lower than 10−3.

Next, consider two examples of application of the optimization algorithm on burst erasure channels; more specifically, consider its application to the codes C1 and irregular IRA presented above. Code C1 has a maximum resolvable length L′max = 788, quite far from the asymptotic limit. From the performance curve of this code on the memory-less channel (Fig. 10) we observe that the decoding failure rate is about 3 · 10−2 for p = 0.385 ≈ 788/2048. Thus, the probability that a randomly chosen erasure pattern of length 788 includes a stopping set is relatively small. Hence, the number of burst positions of length 788 including stopping sets is small, and we can expect that L′max will be improved. On the contrary, the decoding failure rate at p = 0.395 for the same code is about 10−1. Reasoning in the same way, there is a relatively high number of burst positions of length 809 ≈ 0.395 · 2048 including stopping sets, so we expect that the algorithm can improve Lmax only a little beyond this value. Indeed, the optimized code is characterized by L′′max = 822. The decoding failure rates of both C1 and the optimized code are plotted in Fig. 11, on a constant length burst erasure channel (CLBuEC) with burst length L = 810 packets. This channel (see Fig. 12) has a good state with erasure probability pG = 0, and L bad states, each with erasure probability pB = 1. The channel moves from the good state to the bad states with transition probability b, generating a burst of packet erasures of length L. After the generation of the last erasure in the burst, the channel returns to the good state. We observe a gain of about one order of magnitude in terms of b at FaR = 10−4.

For the irregular IRA code, we found a maximum resolvable length L′max = 766 packets. This value has been improved by the algorithm up to L′′max = 912, close to p∗ n.
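For reference, a minimal sketch of the CLBuEC of Fig. 12 (illustrative Python; the interpretation that each good-state step starts a burst of L consecutive erasures with probability b is an assumption matching the description above):

```python
import random

def clbuec_erasures(n, L, b, rng=random):
    """Erased packet positions for one codeword of n packets on the CLBuEC."""
    erased, t = set(), 0
    while t < n:
        if rng.random() < b:                        # good -> bad transition
            erased.update(range(t, min(t + L, n)))  # burst of L erasures
            t += L                                  # back to the good state
        else:
            t += 1                                  # packet correctly delivered
    return erased
```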

VI. Conclusion

In this paper long erasure correcting codes for packet-oriented FEC have been presented. These codes are currently under investigation within the CCSDS (Consultative Committee for Space Data Systems) Long Erasure Codes Birds of a Feather (LEC-BOF) group. The possible code structures, the encoding and decoding algorithms, and some theoretical properties concerning asymptotic and finite length performance have been discussed. Code structures exist which permit efficient encoding. Furthermore, the low complexity iterative decoding algorithm, which can asymptotically achieve the erasure channel capacity and offers good finite length performance, allows for long codeword lengths (up to thousands of packets). Long erasure correcting codes can in principle be implemented at different layers of the protocol stack illustrated in Fig. 1, depending on the transmission protocol and on the specific application. The LEC packet can then assume different meanings in different contexts, such as a transfer frame, a (constant length) space packet, a UDP packet, or any data unit properly defined by the user. Thus, these codes represent an attractive solution for packet-oriented FEC when ARQ is not available or within hybrid FEC/ARQ schemes, when long files need to be transmitted, or when the channel can produce relatively long erasure bursts.

References

1. CCSDS, “Overview of space link protocols,” Green Book 130.0-G-1, June 2001.
2. Shambayati, S., Jones, C., and Divsalar, D., “Maximizing throughput for satellite communication in a hybrid FEC/ARQ scheme using LDPC codes,” IEEE MILCOM 2005, Oct. 2005.
3. Gallager, R., Low-Density Parity-Check Codes, Cambridge, Massachusetts: M.I.T. Press, 1963.
4. Luby, M., Mitzenmacher, M., Shokrollahi, M., and Spielman, D., “Efficient erasure correcting codes,” IEEE Trans. Inform. Theory, Vol. 47, Feb. 2001, pp. 569–584.
5. Pearl, J., Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, San Mateo, CA: Morgan Kaufmann, 1988.
6. Jin, H., Khandekar, A., and McEliece, R., “Irregular repeat-accumulate codes,” Int. Symp. on Turbo Codes and Related Topics, Sept. 2000.
7. Di, C., Ernst, H., Paolini, E., Coletto, S., and Chiani, M., “Low-density parity-check codes for the transport layer of satellite broadcast,” 23rd AIAA ICSSC, Sept. 2005.
8. Divsalar, D., Dolinar, S., Thorpe, J., and Jones, C., “Constructing LDPC codes from simple loop-free encoding modules,” IEEE ICC 2005, Seoul, Korea, May 2005.
9. Divsalar, D., “Long erasure correcting codes,” CCSDS Spring Meeting 2005, Athens, Greece, April 2005.
10. Tanner, R. M., “A recursive approach to low complexity codes,” IEEE Trans. Inform. Theory, Vol. 27, Sept. 1981, pp. 533–547.
11. Miladinovic, N. and Fossorier, M., “Generalized LDPC codes with Reed-Solomon and BCH codes as component codes for binary channels,” IEEE GLOBECOM 2005, St. Louis, USA, Dec. 2005.
12. Paolini, E., Fossorier, M., and Chiani, M., “Analysis of generalized LDPC codes with random component codes for the binary erasure channel,” submitted for publication.
13. Luby, M. G., Mitzenmacher, M., Shokrollahi, M. A., Spielman, D. A., and Stemann, V., “Practical loss-resilient codes,” Proc. 29th Annual ACM Symposium on Theory of Computing, 1997, pp. 150–159.
14. Richardson, T., Shokrollahi, M., and Urbanke, R., “Design of capacity-approaching irregular low-density parity-check codes,” IEEE Trans. Inform. Theory, Vol. 47, Feb. 2001, pp. 619–637.
15. Shokrollahi, M. A. and Storn, R., “Design of efficient erasure codes with differential evolution,” IEEE ISIT 2000, Sorrento, Italy, June 2000.
16. Oswald, P. and Shokrollahi, M. A., “Capacity-achieving sequences for the erasure channel,” IEEE Trans. Inform. Theory, Vol. 48, Dec. 2002, pp. 364–373.
17. Paolini, E. and Chiani, M., “Performance evaluation of capacity approaching distributions,” IEEE SoftCOM 2005, Split, Croatia, Sept. 2005.
18. Chiani, M., Liva, G., and Paolini, E., “Investigating packet erasure correction for CCSDS communications protocols,” CCSDS Fall Meeting 2004, Toulouse, France, Nov. 2004.
19. Ashikhmin, A., Kramer, G., and ten Brink, S., “Extrinsic information transfer functions: Model and erasure channel properties,” IEEE Trans. Inform. Theory, Vol. 50, Nov. 2004, pp. 2657–2673.
20. Di, C., Proietti, D., Telatar, I. E., Richardson, T. J., and Urbanke, R. L., “Finite-length analysis of low-density parity-check codes on the binary erasure channel,” IEEE Trans. Inform. Theory, Vol. 48, 2002.
21. Tian, T., Jones, C., Villasenor, J., and Wesel, R., “Construction of irregular LDPC codes with low error floors,” IEEE ICC 2003, Anchorage, Alaska, May 2003.
22. Arnold, D. M., Eleftheriou, E., and Hu, X. Y., “Progressive edge growth Tanner graphs,” IEEE GLOBECOM 2001, San Antonio, TX, Nov. 2001.
23. Di, C., Urbanke, R., and Richardson, T., “Weight distributions: How deviant can you be?” IEEE ISIT 2001, Washington, DC, June 2001.
24. Paolini, E. and Chiani, M., “Improved low-density parity-check codes for burst erasure channels,” IEEE ICC 2006, Istanbul, Turkey, June 2006.
25. JPL, “Long erasure codes BOF – JPL update,” CCSDS Fall Meeting 2005, Atlanta, USA, Sept. 2005.
26. Liva, G., Paolini, E., and Chiani, M., “Simple reconfigurable low-density parity-check codes,” IEEE Communications Letters, March 2005.
