
The University of Western Australia

School of Electrical, Electronic and Computer Engineering

Crawley, WA 6009

On A Posteriori Probability Decoding of

Linear Block Codes over Discrete

Channels

Wayne Bradley Griffiths

BCM(Hons), Dip.Mod.Lang.

This thesis is presented for the degree of

Doctor of Philosophy

of

The University of Western Australia.

June 2008


Abstract

One of the facets of the mobile or wireless environment is that errors quite often

occur in bursts. Thus, strong codes are required to provide protection against such

errors. This in turn motivates the employment of decoding algorithms which are simple to implement, yet still take the dependence, or memory, of the channel model into account in order to give optimal decoding estimates.

Furthermore, such algorithms should be able to be applied for a variety of channel

models and signalling alphabets.

The research presented within this thesis describes a number of algorithms which

can be used with linear block codes. Given the received word, these algorithms

determine the symbol which was most likely transmitted, on a symbol-by-symbol

basis. Due to their relative simplicity, a collection of algorithms for memoryless

channels is reported first. This is done to establish the general style and principles

of the overall collection. The concept of matrix diagonalisation may or may not

be applied, resulting in two different types of procedure. Ultimately, it is shown

that the choice between them should be motivated by whether storage space or

computational complexity has the higher priority. As with all other procedures

explained herein, the derivation is first performed for a binary signalling alphabet

and then extended to fields of prime order.

These procedures form the paradigm for algorithms used in conjunction with

finite state channel models, where errors generally occur in bursts. In such cases,

the necessary information is stored in matrices rather than as scalars. Finally,

by analogy with the weight polynomials of a code and its dual as characterised

by the MacWilliams identities, new procedures are developed for particular types

of Gilbert-Elliott channel models. Here, the calculations are derived from three

parameters which profile the occurrence of errors in those models. The decoding

is then carried out using polynomial evaluation rather than matrix multiplication.

Complementing this theory are several examples detailing the steps required to

perform the decoding, as well as a collection of simulation results demonstrating the

practical value of these algorithms.


Acknowledgements

As one can imagine, submitting oneself to the rigours of a higher degree by research

is not an easy task, and it would become even more arduous if one had to face that

task alone. Fortunately, there were a few people and organisations which provided

me with assistance, in different ways, during my period of candidature. I would like

to take this opportunity to thank them.

Firstly, I must thank my principal discipline supervisor Professor Dr.-Ing. Hans-

Jurgen Zepernick. If he had not accepted me as one of his students, this thesis

would not exist. He has been my compass from the beginning, always there to

direct me onto the next task. The time and resources he contributed show him to be a wealth of knowledge, someone who was supportive when things went badly, and an excellent aid in making my written work more polished.

He was also instrumental in permitting me to study for six months at Blekinge

Tekniska Hogskola in Ronneby, Sweden. Without this opportunity, I would never

have experienced many different things, nor would I have grown as a person as much

as I believe I have done. Speaking of BTH, I wish to thank all the staff and doctoral

students from Avdelningen for Signalbehandling for accepting yet another Aussie

through their automatic doors.

I am also indebted to my co-supervisor Dr Manora Caldera. She spent countless

hours reading through my work and was always providing suggestions to aid the

readability of my written output. For that, I thank her. Thanks also to my principal

administrative supervisor Professor Sven Nordholm who ensured all the necessary

paperwork was filled out in a timely fashion and thus allowed me to optimise my

time spent researching.

There are a few organisations to which I must express my gratitude. Firstly, I

thank The University of Western Australia for offering me a scholarship position.

Their monetary assistance allowed me to have a life outside of study. In a similar

vein, I am grateful to the Australian Telecommunications Cooperative Research

Centre for providing additional financial and travel aid for me, as well as organising


a number of conferences at which I was able to gain valuable experience in presenting

my work.

Thank you also to the Western Australian Telecommunications Research Insti-

tute for furnishing me with a pleasant research environment over the past four years,

providing monetary assistance when other income sources expired, and supplying

the necessary equipment on which to carry out my simulations. Additionally, I wish

to acknowledge the various fellow students who have been cohabitants of “the of-

fice” during my time at WATRI. They have given support when it was needed and

provided a welcome source of joviality in my life.

Finally, I thank my parents Colleen and Raymond, whose boundless support,

love and understanding have sustained me through these years and instilled in me

the determination to succeed at everything to which I set my mind.

Wayne Griffiths

2007.


Table of Contents

Abstract iii

Acknowledgements v

List of Acronyms xi

List of Common Symbols xiii

List of Figures xix

List of Tables xxiii

1 Introduction 1

1.1 Thesis Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.2 Summary of Major Contributions . . . . . . . . . . . . . . . . . . . 5

1.3 List of Relevant Publications . . . . . . . . . . . . . . . . . . . . . . 8

2 Foundations 11

2.1 Communication System Model . . . . . . . . . . . . . . . . . . . . . 12

2.2 Channel Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.2.1 Memoryless channels . . . . . . . . . . . . . . . . . . . . . . . 15

2.2.2 Channels with memory . . . . . . . . . . . . . . . . . . . . . . 19

2.3 Error Control Coding . . . . . . . . . . . . . . . . . . . . . . . . . . 35

2.3.1 Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

2.3.2 Decoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

2.3.3 Trellis representations of linear block codes . . . . . . . . . . 45

3 APP Decoding on Discrete Channels without Memory 53

3.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

3.2 Original Domain Matrix Representations of Linear Block Code Trellises 56

3.2.1 Matrix representation for APP decoding on BSCs . . . . . . . 56

3.2.2 Matrix representation for APP decoding on DMCs . . . . . . 61

3.3 Spectral Domain Matrix Representations of Linear Block Code Trellises 64

3.4 Instructive Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 70

3.4.1 Example of decoding in the original domain . . . . . . . . . . 70

3.4.2 Example of decoding in the spectral domain . . . . . . . . . . 72

3.5 Numerical Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 79


3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

4 APP Decoding on Binary Channels with Memory 83

4.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

4.2 Binary APP Decoding . . . . . . . . . . . . . . . . . . . . . . . . . . 85

4.2.1 Original domain . . . . . . . . . . . . . . . . . . . . . . . . . . 85

4.2.2 Spectral domain . . . . . . . . . . . . . . . . . . . . . . . . . . 89

4.3 Complexity Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

4.3.1 Computational complexity . . . . . . . . . . . . . . . . . . . . 94

4.3.2 Storage requirements . . . . . . . . . . . . . . . . . . . . . . . 96

4.4 Instructive Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

4.4.1 Example of decoding in the original domain . . . . . . . . . . 98

4.4.2 Example of decoding in the spectral domain . . . . . . . . . . 100

4.5 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

4.5.1 Description of parameter values in these simulations . . . . . . 103

4.5.2 Observations from the simulations . . . . . . . . . . . . . . . . 103

4.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

5 APP Decoding on Non-binary Channels with Memory 109

5.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

5.2 Non-binary APP Decoding . . . . . . . . . . . . . . . . . . . . . . . . 111

5.2.1 Original domain . . . . . . . . . . . . . . . . . . . . . . . . . . 112

5.2.2 Spectral domain . . . . . . . . . . . . . . . . . . . . . . . . . . 114

5.3 Properties of the Conditional Spectral Coefficients . . . . . . . . . . . 120

5.4 Complexity Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

5.4.1 Computational complexity . . . . . . . . . . . . . . . . . . . . 124

5.4.2 Storage requirements . . . . . . . . . . . . . . . . . . . . . . . 125

5.5 Instructive Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

5.5.1 Example of decoding in the original domain . . . . . . . . . . 126

5.5.2 Example of decoding in the spectral domain . . . . . . . . . . 129

5.6 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131

5.6.1 Non-binary Hamming codes . . . . . . . . . . . . . . . . . . . 131

5.6.2 The ISBN-10 code . . . . . . . . . . . . . . . . . . . . . . . . 135

5.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

6 Generalised Weight Polynomials for Binary Restricted GECs 141

6.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

6.2 Burst-error Characteristics . . . . . . . . . . . . . . . . . . . . . . . . 143

6.3 Derivation of APPs using Generalised Weight Polynomials . . . . . . 145

6.4 Instructive Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 156

6.5 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159

6.5.1 (16,8) cyclic code . . . . . . . . . . . . . . . . . . . . . . . . . 159

6.5.2 (22,13) Chen code . . . . . . . . . . . . . . . . . . . . . . . . . 160

6.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161


7 Generalised Weight Polynomials for Non-binary Restricted GECs 163

7.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . 164

7.2 The Channel Reliability Factor for a Non-binary GEC . . . . . . . . 165

7.3 Derivation of Non-binary APPs using Generalised Weight Polynomials 167

7.4 Instructive Example . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

7.5 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177

7.5.1 (4,2) Hamming code over GF(3) . . . . . . . . . . . . . . . . . 177

7.5.2 (26,22) BCH code over GF(3) . . . . . . . . . . . . . . . . . . 178

7.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181

8 Conclusion and General Discussion 183

8.1 Summary of Major Findings and Contributions . . . . . . . . . . . . 183

8.2 Future Research . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186

Appendices 191

Appendix A Proof of (3.44) 193

Appendix B Proof of Lemma 5.3.1 195

Bibliography 197


List of Acronyms

APP A posteriori probability

BCJR Bahl, Cocke, Jelinek and Raviv

BER Bit error rate

BSC Binary symmetric channel

DMC Discrete memoryless channel

GEC Gilbert-Elliott channel

GWP Generalised weight polynomial

HMM Hidden Markov model

ISBN International Standard Book Number

LDPC Low-density parity-check

LLR Log likelihood ratio

MAP Maximum a posteriori probability

ML Maximum likelihood

SER Symbol error rate


List of Common Symbols

arg(·) Argument function

bin(·) Function which gives the binary representation in vector form

of its input

c(λ) Characteristic polynomial in terms of an indeterminate λ. For a p × p matrix K, this equals det(λI_p − K)

circ(l1, l2, . . . , lp) Circulant p× p matrix consisting of all p cyclic permutations

of a list of entries l1, l2, . . . lp

d Coset leader used in syndrome decoding

d(C) Hamming distance of a code C

d(u1,u2) Hamming distance between two codewords u1 and u2

dec(·) Function which gives the base 10 scalar representation of its

input

det(K) Determinant of a matrix K

diag{f(i)} Diagonal matrix with the ith diagonal element given by a func-

tion f(i)

e All-ones column vector

ei ith standard basis (row) vector

f ∗(D) Monic irreducible polynomial in indeterminate D used in

defining a field of polynomials

gi ith row of generator matrix G for a block code

gj(D) jth polynomial constraint in indeterminate D for a convolu-

tional code

hi ith column of parity check matrix H

i Index within the set of k information symbol positions

i Vector of information symbols to be encoded and transmitted

over a channel

i Vector of decoded information symbols

j Index within the set of n codeword symbol positions

Square root of -1. The imaginary unit


k Number of information symbols per codeword in a code C

logp(·) Base p logarithm

max(·) Maximisation function

n Length of codewords in a code C

p Positive prime integer, order of the Galois field GF (p)

pb Average bit error probability for a binary GEC

ps Average symbol error probability for a non-binary GEC

p_{si} Crossover probability within a GEC for the DMC corresponding to the state si

p_{si si+1} Crossover probability within a binary general two-state Markov channel for the DMC corresponding to a transition from state si at time instant i to state si+1 at time instant i+1

s Index within the set of p^(n−k) dual codewords

s p-ary vector representation of the decimal number s

si State of a stochastic sequential machine at time instant i

si Partial syndrome at depth i

t Index for the original domain used to refer to the cosets of a

linear block code C

tr(K) Trace of a matrix K

u Transmitted codeword of a code C

ui ith transmitted symbol of a code C

ui Estimate of the ith transmitted symbol ui obtained by decod-

ing

u⊥s sth codeword of a dual code C⊥

u⊥s,j jth symbol of the sth codeword of a dual code C⊥

v Vector of received symbols

vi ith received symbol of received word v

vj Invert or logical NOT of the jth received bit vj

vecp(·) Function which outputs the base p representation in vector

form of its input

w Complex pth root of unity

wi ith row of the Walsh-Hadamard transformation matrix

x Average fade to connection time ratio for a GEC

y Burst factor for a GEC

z Channel reliability factor for a GEC


Aj Number of words of weight j in a code C

A(z) Weight polynomial for a code C

B ‘Bad’ state of a GEC

B Branch set of a trellis

Bj Number of words of weight j in a dual code C⊥

B^{(ui)}(x, y, z) Generalised weight polynomial in terms of burst-error characteristics x, y and z for the ith transmitted symbol ui

B(z) Weight polynomial for a dual code C⊥

C Linear block code

C Field of complex numbers

C Capacity of a channel

C⊥ Dual of the code C

D Stochastic sequential machine

D Stochastic automaton for a channel model with memory

D Stochastic state transition matrix for a channel with memory

D0 Matrix probability corresponding to correct transmission over

a channel with memory

D1 Matrix probability for binary codes corresponding to incorrect

transmission over a channel with memory

Dǫ Matrix probability for non-binary codes corresponding to in-

correct transmission over a channel with memory

D_{fi} Matrix probability for non-binary codes corresponding to receiving the symbol which is fi greater (mod p) than the transmitted symbol ui over a channel with memory

E Set of nodes in a trellis at depth zero

F Set of nodes in a trellis at depth n

G ‘Good’ state of a GEC

G Generator matrix for a linear block code C

GF (p) Galois field of order p

H Parity check matrix for a linear block code C

I_{p^(n−k)} Identity matrix of order p^(n−k)

KH Hermitian of a matrix K

KT Transpose of a matrix K

K∗ Complex conjugate of a matrix K

K−1 Inverse of a matrix K

[K](l) lth principal leading submatrix of a matrix K


M Set of matrix probabilities using the spectral domain for a restricted GEC. Equal to {D, δ}

M_{hj}(uj) Matrix representation of trellis section corresponding to column hj for transmitted symbol uj

Mh(u) Elementary trellis matrix

N Set of non-negative integers

N Node set of a trellis

Nt Set of nodes of a trellis at depth t which can be reached from

the set E of nodes at depth zero

O Big-O notation for the asymptotic upper bound on complexity

OC(u) Orthogonal complement of a vector u

P State transition probability from ‘good’ state G to ‘bad’ state

B in a GEC

Pt Coset probability of the coset Vt in syndrome decoding

Pt(ui|v) APP for transmitted symbol ui given received word v when

encoding using the tth coset

P(ui|v) Vector of APPs for transmitted symbol ui given received word

v

Q State transition probability from ‘bad’ state B to ‘good’ state

G in a GEC

Q(ui|v) Vector of conditional spectral coefficients for ith transmitted

symbol ui and received word v

Qs(ui|v) sth conditional spectral coefficient for ith transmitted symbol

ui and received word v

Qs(ui|v) sth conditional spectral coefficient matrix for ith transmitted

symbol ui and received word v

Q_s^{(ui)}(x, y, z) sth conditional spectral polynomial for the ith transmitted symbol ui, in terms of burst-error characteristics x, y and z

R Code rate of a code C. Equal to k/n

S Number of states in a stochastic sequential machine

S Set of states in a stochastic sequential machine

U Alphabet of symbols which can be transmitted on the channel

U_{hj} Weighted trellis section matrix for column hj of H regardless of transmitted symbol

U_{hj}(uj) Weighted trellis section matrix for column hj of H and transmitted symbol uj


UH(ui) Matrix representation for entire weighted trellis described by

parity check matrix H for ith transmitted symbol ui

V Alphabet of symbols which can be received on the channel

Vt tth coset of a code C

W Spectrum of eigenvalues of the elementary trellis matrices

W_{p^(n−k)} Walsh-Hadamard transformation matrix of order p^(n−k)

Y Storage requirement for implementation of an algorithm in

terms of the size in memory of a single real number

Z+ Set of positive integers

Z[x, y, z] Ring of polynomials with integer coefficients and in indeter-

minates x, y and z

δ Difference matrix for a restricted GEC

δ_{a,b} Kronecker delta function, equal to 1 if a = b and 0 otherwise

δorig Homomorphism resulting in a circulant matrix representation

δspec Homomorphism resulting in a diagonal matrix representation

ǫ Crossover probability for a BSC or DMC

ι_0 First row of the Walsh-Hadamard matrix W_{p^(n−k)}

σ_0 Stationary state distribution for a stochastic sequential machine

σ_{si} Stationary state probability for state si

τ_0 Vector post-multiplied by the representation of a decoding trellis in the original domain in order to select paths commencing at the 0th node

∆ Difference matrix for a channel with memory

∆0,∆1 Difference weights on a weighted diagonal trellis for decoding

over a BSC

Λhj(uj) Spectral matrix representation of trellis section corresponding

to column hj for transmitted symbol uj

Λh(u) Elementary spectral matrix

ΘhjWeighted spectral matrix for column hj of H regardless of

transmitted symbol

Θhj(uj) Weighted spectral matrix for column hj of H and transmitted

symbol uj


ΘH(ui) Spectral matrix representation for entire weighted diagonal

trellis described by parity check matrix H for ith transmitted

symbol ui

Θs,i(ui) Diagonal weightings used in the weighted spectral matrix

ΘH(ui) for ith transmitted symbol ui


List of Figures

2.1 Block diagram of the considered digital communication system. . . . 13

2.2 Transition probability diagram of a BSC. . . . . . . . . . . . . . . . . 16

2.3 Transition probability diagram of the p-ary symmetric DMC model: (a) standard model, (b) alternative model. . . . . . . . . . . . . . . . 18

2.4 Structure of the binary general two-state Markov model. . . . . . . . 23

2.5 Structure of the binary GEC model. . . . . . . . . . . . . . . . . . . . 25

2.6 Structure of the standard non-binary GEC model. . . . . . . . . . . . 26

2.7 Structure of the binary Gilbert channel model. . . . . . . . . . . . . . 29

2.8 Structure of the non-binary Gilbert channel model using the standard p-ary DMC model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

2.9 Structure of the binary restricted GEC model. . . . . . . . . . . . . . 32

2.10 Structure of the standard non-binary restricted GEC model. . . . . . 33

2.11 Structure of the alternative non-binary restricted GEC model. . . . . 34

2.12 A (2,1,3) convolutional encoder constructed with generator polynomials g1(D) = D^2 + D + 1 and g2(D) = D^2 + 1. . . . . . . . . . . . 38

2.13 A basic trellis with eight nodes and seven branches. . . . . . . . . . . 46

2.14 Trellis representations of the binary (4,2) linear block code C: (a) standard syndrome trellis, (b) minimal trellis (Dashed: si+1 = si, Solid: si+1 = si ⊕ h^T_{i+1}). . . . . . . . . . . . . . . . . . . . . . . 48

2.15 Trellis representations of the binary (4,2) linear block code C suitable for computing APPs: (a) P(u2 = 0|v), (b) P(u2 = 1|v) (Dashed: si+1 = si, Solid: si+1 = si ⊕ h^T_{i+1}). . . . . . . . . . . . . . . . . 51

3.1 Original domain APP decoding trellises for the binary (4,2) linear block code C which allow for computation of the conditional probabilities (a) P(u2 = 0|v) and (b) P(u2 = 1|v). (Dashed: sj+1 = sj, Solid: sj+1 = sj ⊕ h^T_{j+1}.) . . . . . . . . . . . . . . . . . . . . . . 73

3.2 Illustration of the relationship between the codewords u⊥s, s = 0, 1, 2, 3, of the dual code C⊥ and the spectral section matrices Λ_{hj}(uj), j = 1, 2, 3, 4, uj = 1. . . . . . . . . . . . . . . . . . . . . 75

3.3 Weighted diagonal trellises of the binary (4, 2) linear block code C used for computing the conditional spectral coefficients (a) Qs(u2 = 0|v) and (b) Qs(u2 = 1|v); s = 0, 1, 2, 3. . . . . . . . . . . . . . . . . 78

3.4 BER performance of some binary block codes on a BSC. . . . . . . . 80

3.5 SER performance of some block codes over GF(3) on a ternary DMC. 81


4.1 Original domain weighted APP decoding trellises for the binary (4,2) linear block code used to compute (a) P(u2 = 0|v) and (b) P(u2 = 1|v). (Dashed: sj+1 = sj, Solid: sj+1 = sj ⊕ h^T_{j+1}.) . . . . . . . . 99

4.2 Weighted diagonal trellises of the binary (4, 2) linear block code used for computing the spectral coefficients (a) Qs(u2 = 0|v) and (b) Qs(u2 = 1|v); s = 0, 1, 2, 3. . . . . . . . . . . . . . . . . . . . . . . . 102

4.3 Performance of the (7,4) Hamming code on a GEC with Q = 0.01 and (a) pB = 0.1, (b) pB = 0.5. . . . . . . . . . . . . . . . . . . . . . 104

4.4 Performance of the (7,4) Hamming code on a GEC with Q = 0.3 and (a) pB = 0.1, (b) pB = 0.5. . . . . . . . . . . . . . . . . . . . . . . . . 105

5.1 Weighted diagonal trellis of the (4, 2) linear block code C over GF(3) used to compute spectral coefficients Qs(u2 = 0|v); s = 0, 1, . . . , 8. . 132

5.2 Weighted diagonal trellis of the (4, 2) linear block code C over GF(3) used to compute spectral coefficients Qs(u2 = 1|v); s = 0, 1, . . . , 8. . 133

5.3 Weighted diagonal trellis of the (4, 2) linear block code C over GF(3) used to compute spectral coefficients Qs(u2 = 2|v); s = 0, 1, . . . , 8. . 134

5.4 Performance of the (4,2) Hamming code over GF(3) on a GEC with state transition probabilities P = 0.05 and Q = 0.2. . . . . . . . . . . 136

5.5 Performance of the (6,4) Hamming code over GF(5) on a GEC with state transition probabilities P = 0.05 and Q = 0.2. . . . . . . . . . . 136

5.6 Performance of the (8,6) Hamming code over GF(7) on a GEC with state transition probabilities P = 0.05 and Q = 0.2. . . . . . . . . . . 137

5.7 Performance of the ISBN-10 code on a GEC with state transition probabilities P = 0.05 and Q = 0.2. . . . . . . . . . . . . . . . . . . . 137

6.1 Weighted diagonal trellises of the binary (4, 2) linear block code C used to compute spectral polynomials (a) Q_s^{(u2=0)}(x, y, z) and (b) Q_s^{(u2=1)}(x, y, z); s = 0, 1, 2, 3 for a binary restricted GEC. . . . . 158

6.2 Performance of the (16,8) block code on a binary restricted GEC and pb = 1%. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

6.3 Performance of the (22,13) Chen code on a binary restricted GEC with average fade to connection time ratio x = 0.004. . . . . . . . . . 161

7.1 The relationship between the channel reliability function z and the mean SER, ps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166

7.2 Weighted diagonal trellis of the (4, 2) linear block code C over GF(3) used to compute conditional spectral polynomials Q_s^{(u2=0)}(x, y, z); s = 0, 1, . . . , 8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

7.3 Weighted diagonal trellis of the (4, 2) linear block code C over GF(3) used to compute conditional spectral polynomials Q_s^{(u2=1)}(x, y, z); s = 0, 1, . . . , 8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175

7.4 Weighted diagonal trellis of the (4, 2) linear block code C over GF(3) used to compute conditional spectral polynomials Q_s^{(u2=2)}(x, y, z); s = 0, 1, . . . , 8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176


7.5 Performance of the (4,2) Hamming code over GF(3) on restricted GECs with pB = 1/3, ps = 1%. . . . . . . . . . . . . . . . . . . . . . 179

7.6 Performance of the (26,22) BCH code over GF(3) on restricted GECs with pB = 1/3, ps = 1%. . . . . . . . . . . . . . . . . . . . . . . . . . 179


List of Tables

2.1 Addition and multiplication table of the Galois field GF(2). . . . . . 37

8.1 Elements of GF(3^2) and their ternary vector images. . . . . . . . . 187


Chapter 1

Introduction

Human beings have always been searching for ways in which their lives can be made

easier and/or more efficient. For example, the development of wireless communi-

cations through the 20th century and beyond has made life more convenient, since

it is becoming increasingly possible to communicate with anyone or any machine,

anywhere, and at any time. There no longer needs to be a wired connection to a

network in order to communicate.

As attested to by [1], the demand for wireless communications has undergone

exponential growth since the 1990s. Not only are people talking wirelessly across

the globe in increasing numbers, but there has also been a boom in wireless internet

usage. The fact which must be taken into account is that wireless communications are usually less reliable than wired ones. Imagine, for example, the following

common scenario. A mobile telephone user wishes to receive a call from another

person. Whether that other person is using a landline or mobile phone is irrelevant.

The signal will need to be transmitted from a base station to the mobile handset.

Even though base stations are often in raised locations, there will not necessarily

be a line of sight path for the signal. There will however usually be a number of

paths between the transmitter and receiver which result from reflections of the signal

off obstacles which occupy the space between them. The effects of these multiple

paths include, but are not limited to, a time delay between different versions of

the received signal, and differences in phase or amplitude. The signals received by

the handset can interfere with each other, sometimes constructively and sometimes

destructively. All of this is further complicated by the fact that the receiver will

usually be moving in space. There are times when the signal will be good, such

that its strength is above a certain threshold and not many errors will occur. By

contrast, there will be times when the signal is not good and many errors will occur.

On these occasions, the signal is said to be experiencing fading.


The wireless communication environment is also extremely complex. Probabili-

ties of correct reception are highly time- and space-dependent. In order to simplify

concepts and testing, channel models are used. One of the simplest models for this,

as evidenced by [2], is a Gilbert-Elliott channel (GEC) model [3]. A GEC model

consists of two distinct states, and the channel occupies exactly one of them at any time. Transitions between states occur as the result of a random variable. One state

corresponds to times when there are relatively few errors, and the other corresponds

to times of relatively many errors. It is also important to note that it is difficult to

determine exactly how many states are required to model the mobile environment

and the figure may vary with the actual data [4].
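By way of illustration, the following sketch simulates transmission of a binary sequence over such a two-state model. The transition probabilities P (good to bad) and Q (bad to good) and the bad-state crossover probability follow the notation used later in this thesis, but the function name, the parameter p_G and the numerical defaults are purely illustrative.

```python
# A minimal sketch of a binary Gilbert-Elliott channel, assuming each
# state acts as a BSC with its own crossover probability; parameter
# values below are illustrative only.
import random

def gec_transmit(bits, P=0.05, Q=0.2, p_G=0.001, p_B=0.5):
    state = 'G'                      # start in the 'good' state
    received = []
    for bit in bits:
        p = p_G if state == 'G' else p_B
        received.append(bit ^ (random.random() < p))   # flip with probability p
        # Random state transition after each transmitted symbol.
        if state == 'G' and random.random() < P:
            state = 'B'
        elif state == 'B' and random.random() < Q:
            state = 'G'
    return received
```

Runs of errors arise whenever the channel lingers in the bad state, which is precisely the burstiness that the decoding algorithms of the later chapters seek to exploit.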

One method of protecting data from channel errors, or rather, retrieving the

data after such errors have occurred, is to employ error control coding. Redundancy

is used to assist in determining the transmitted symbols. There is also a decision

to be made over which signalling alphabet to use. The traditional approach is to

use binary symbols, which permit only yes-or-no decisions. If a higher resolution of

decisions is required, a logical choice is to use the symbols from a non-binary field.

In particular, ternary is looked upon the most favourably in terms of economy of

information representation [5]. However, as the amount of information contained in

each symbol is increased, the more one stands to lose if that symbol is corrupted in

the transmission process. Extension fields, which have an order given by a prime

raised to a power of two or more, are also an option. Here, several symbols may be

used to represent a single element of the extension field.
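To make the choice of signalling alphabet concrete, the following sketch encodes an information vector over GF(3) by multiplying it with a generator matrix and reducing modulo 3. The generator matrix shown is hypothetical and serves only to illustrate the modular arithmetic involved.

```python
# A toy block encoder over the prime field GF(3); the generator matrix G
# is a hypothetical example, not one taken from this thesis.
import numpy as np

p = 3                                    # prime field order
G = np.array([[1, 0, 1, 2],              # k x n generator matrix over GF(3)
              [0, 1, 2, 1]])

def encode(info):
    """Encode a length-k information vector: u = i G (mod p)."""
    return np.mod(np.array(info) @ G, p)

print(encode([2, 1]))                    # -> [2 1 1 2]
```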

If error control coding is used, then a way of obtaining an estimate of the transmitted data is required; in other words, decoding. For the past 60

years, there has been a search for codes which will meet the so-called Shannon

limit [6]. With recent advances such as the discovery of turbo codes [7], there

is renewed interest in decoding algorithms. In particular, a posteriori probability

(APP) decoding algorithms make use of soft information, resulting in performances

which approach the Shannon limit. Considering the above discussion points, there is

thus a need to investigate APP decoding algorithms which can be easily implemented

with a variety of codes over a field of any size, and on a variety of channels described

by a finite number of states.
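For later reference, the symbol-by-symbol decision that such algorithms make can be stated as follows; this is the standard symbol-wise APP formulation, written in the notation adopted in this thesis:

```latex
\hat{u}_i \;=\; \arg\max_{u \in \mathrm{GF}(p)} \; P(u_i = u \mid \mathbf{v}),
\qquad i = 1, 2, \ldots, n,
```

where v is the received word. The chapters that follow are concerned with computing the a posteriori probabilities P(ui = u | v) efficiently for various channel models.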

Several APP decoding algorithms have been developed over the years assuming

discrete memoryless channels (DMCs) or perfectly interleaved flat-fading channels

in the presence of additive white Gaussian noise. The first major advance in the de-

velopment of favourable decoding algorithms was the trellis-based decoding scheme

by Bahl, Cocke, Jelinek and Raviv [8] referred to as the BCJR algorithm. Other


research in the area of APP decoding has focussed on reducing the computational

complexity of the decoding to a single-sweep algorithm [9,10], use of general input-

output hidden Markov models [11], or on exploiting features of the code itself [12,13],

or the dual code [14]. There has been increased recent interest in non-binary com-

puting [15], promoting the use of codes over such fields [9, 16, 17] and over rings

in [18]. However, the implementation of APP decoding algorithms can often be too

complex in practice. This is especially true for codes over large fields because more

probabilities need to be calculated. It is also interesting to note the most recent re-

search on decoding of general low-density parity-check (LDPC) codes over Abelian

groups [19] and non-binary LDPC codes over prime and extension fields [20]. This

recent research is motivated by the observation that the error performance of LDPC

codes can be improved for moderate code length by increasing the order of the

associated group or field.

It is also noted that due to the increasing demand for efficient utilisation of

the available bandwidth, especially in wireless communications and mobile radio

systems, higher rate codes are becoming more desirable, which promotes the use

of block codes. Although there has been some research into turbo decoding of

convolutional codes over a Markov channel [21,22] and a GEC [23], and some recent

work on the analysis of LDPC codes for the GEC [24], little can be found on APP

decoding of linear block codes over a GEC.

This lack of APP decoding methods for linear block codes over channels with

memory has motivated the research presented in this thesis. In particular, APP de-

coding algorithms for binary and non-binary linear block codes over discrete channels

with and without memory are developed. The proposed algorithms can be classified

into the field of single-sweep APP decoding techniques that use matrix multiplica-

tions and exploit the concepts of ‘dual APP’ decoding [9, 10, 14]. By formulating

suitable matrix representations for the different channel-decoder cascades, the tools

of linear algebra are used to develop APP decoding algorithms in an ‘original’ do-

main and a ‘spectral’ domain. The relationship between these two domains can

be formulated by a similarity transformation. In this way, it is possible to develop

efficient APP decoding algorithms for linear block codes on discrete memoryless

channels as well as on channels with memory that can be modelled by stochastic

sequential machines. As far as channels with memory are concerned, the focus of

this thesis will be the GEC.

By fixing the crossover probability in the ‘bad’ state of the GEC such that for

a given transmitted symbol, all symbols are equally likely to be received, an APP

decoding decision can efficiently be reached by evaluating trivariate polynomials.


It should also be stressed that this particular class of GEC is described by three

variables, namely, the average fade to connection time ratio, the burst factor, and

the channel reliability factor. As these variables can be deduced from error sequence

measurements, relevance to practical digital communication scenarios is readily pro-

vided. This contribution of the thesis is aesthetically pleasing as it reveals similarities

to the way that the well-known MacWilliams identities [25] use polynomials to de-

scribe the relationship between the weight distribution of a code and its dual. In this

sense, a relationship between the APPs of the original domain and the coefficients

of the spectral domain is presented. By analogy with the MacWilliams identities,

the polynomial expressions derived in this thesis, for binary and non-binary codes

respectively, are termed generalised weight polynomials.
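For orientation, the binary MacWilliams identity referred to here can be written, in one common normalisation and using the weight polynomials A(z) and B(z) of a code C and its dual C⊥ as defined in the List of Common Symbols, as

```latex
B(z) \;=\; \frac{1}{2^{k}}\,(1+z)^{n}\,A\!\left(\frac{1-z}{1+z}\right),
```

with the analogous p-ary form B(z) = p^{-k} (1 + (p-1)z)^n A((1-z)/(1+(p-1)z)). The generalised weight polynomials of Chapters 6 and 7 play a comparable bridging role between the original and spectral domains.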

It should be noted, however, that the thesis is not intended as a database of

a set of possible codes and their corresponding error correction performances over

different channels. The purpose of the thesis is instead to demonstrate the means

by which one could determine such capabilities, by reporting the necessary theory

and developing the analytical framework.

1.1 Thesis Structure

It is intended that the chapters of this thesis be read in the order of their presenta-

tion, with each one building on information from its predecessors. The thesis begins

with some of the basic concepts about channel models and error control coding.

Then, the middle part of the thesis deals with APP decoding algorithms for mem-

oryless channels and some standard channels with memory. In the later chapters,

some exciting polynomial-based APP decoding algorithms are reported for a par-

ticular type of finite state channel. Finally, there is a brief discussion on the future

directions in which this research could progress.

The following is a breakdown of the main content of each chapter.

Chapter 2 presents the necessary background material to ensure easy under-

standing of the concepts concerning channel models, coding and decoding which

occur later in the thesis. Specifically, it covers symmetric DMC models, as well

as channels with memory constructed using a hidden Markov model (HMM). In

addition, the basics of coding theory are covered, beginning with finite field arith-

metic. This progresses through explanations of convolutional and block encoding,

including some required information on dual codes. Some decoding strategies are

also discussed, including syndrome decoding, sequence estimation decoding, symbol

estimation decoding and the corresponding APP decoding formulations. All of this


leads up to the final section on trellis decoding, which provides a foundation for the

decoding procedures presented in later chapters.

The first of the APP decoding algorithms are presented in Chapter 3. They are

designed to function with memoryless channels. Initially, the matrix representation

of the decoding trellis is constructed, based entirely on the structure of the code.

The matrices must then be weighted by the probabilities of transmission error or

non-error which form part of the channel model. Memoryless channels may be

viewed as degenerate examples of channels with memory, having one state rather than multiple states, and so these weightings are matrices of dimension one, that is, scalars. It

is possible to perform the decoding using the representation of the original trellis,

or alternatively to diagonalise the matrices representing the code and work in the

spectral domain.

The adaptation of the original and spectral domain algorithms for use with

channels described by a stochastic sequential machine, such as a GEC, is covered

in the subsequent two chapters. To be specific, Chapter 4 reports the procedure

as applied to binary codes, while Chapter 5 expands the idea to include non-

binary fields and analyses the conditional spectral coefficients of the spectral domain.

The computational complexity and storage requirements of these procedures are

examined in both chapters.

Specialisations to the restricted GEC models are the focus of Chapters 6 and

7. Such models contain two states, one of which has the property that for a given

transmitted symbol, each symbol of the signalling alphabet is equally likely to be

received. Polynomial representations of the conditional spectral coefficients from

Chapters 4 and 5 for this specific channel type are derived. Instead of the basic error

and state transition probabilities which form the matrix elements that are used in

the procedures of the preceding chapters, the three variables of these polynomial

expressions reflect the nature of the error bursts on the channel. The paradigm of

generalised weight polynomials for binary and non-binary codes is also derived in

these Chapters 6 and 7, respectively.

Finally, Chapter 8 summarises the contributions of the thesis and explores some

of the areas in which future research related to this topic could be undertaken.

1.2 Summary of Major Contributions

A list of the significant items contained within the body of this thesis which represent

additions to scholarship is given below. These items have each been categorised

into one of four areas, which are channel modelling, APP decoding for discrete


channels without memory, APP decoding for discrete channels with memory, and

APP decoding using generalised weight polynomials for restricted GECs.

Channel modelling

Symmetric discrete channels without memory can be described with the aid of a

parameter called the crossover probability. It was noted early in the research period

that two unequal descriptions of the non-binary varieties of such a class of channel

could be made. In order to emphasise that both viewpoints are equally correct,

much of the research in this thesis is presented using both models. Non-binary

symmetric DMCs are vital to the structure of non-binary restricted GEC models.

It is possible to describe such models using three parameters, one of which is known

as the channel reliability factor. Although the definition of this parameter has

already been established for a GEC [26], its derivation for a non-binary restricted

GEC represents new material. Thus, the achievements of this thesis in the area of

channel modelling are:

• Characterisation of non-binary symmetric DMCs using two different view-

points, depending on how the crossover probability is defined.

• Application of the concept of the channel reliability factor to non-binary re-

stricted GECs.

APP decoding for discrete channels without memory

In order to discuss the proposed APP decoding methods for channels which possess

memory, an important first step is the development of similar procedures for use

when the channel is without memory. Such theory has already been researched [27]; however, the author's consideration of such schemes has revealed a new way of interpreting the matrix operations involved. Additionally, the implementation of

these methods using computer simulation of the communication system resulted in

the collection of performance data for several binary and non-binary linear block

codes. Therefore, the contributions of the thesis in this area are:

• Use of the trace of the weighted spectral matrix of the full weighted trellis to deliver the APPs which are required for the decoding procedure (see the sketch after this list).

• Numerical examples obtained by computer simulations which demonstrate the

range of performance analysis options that are supported by the proposed APP

decoding procedures for DMCs.
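Schematically, the first of these contributions amounts to obtaining the required APPs as

```latex
P(u_i \mid \mathbf{v}) \;\propto\; \operatorname{tr}\!\big(\Theta_H(u_i)\big),
```

evaluated once for each candidate value of the symbol ui, where ΘH(ui) is the spectral matrix representation of the entire weighted trellis (see the List of Common Symbols); the normalisation constant and the precise construction of ΘH(ui) are developed in Chapter 3.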


APP decoding for discrete channels with memory

One of the exciting breakthroughs of this thesis is the novel application of the APP

decoding algorithms for memoryless channels to a channel with memory. This is

chiefly handled by adapting the weighting mechanism for the multi-dimensional

weights which are a feature of these channels. This ideology follows through in both

the original and spectral domains, thus making possible the development of a suite of

APP decoding algorithms. The usefulness of such a suite is supported by an analysis

of both the computational complexity and the storage requirements of the reported

algorithms, in terms of the parameters of the linear block code and the number

of states of the channel. In particular, the storage benefits of the spectral domain

approach are highlighted. These methods are also dramatically more efficient than

some other known approaches, allowing speedy recovery of corrupted data. Whilst

doing this research, the author was also able to prove an interesting fact about

the coefficients of the spectral domain, which are counterparts of the APPs of the

original domain. Hence, the main contributions of this thesis to scholarship in the

area of APP decoding for discrete channels with memory are:

• Development of an APP decoding procedure using the original domain for

binary linear block codes over a binary channel described by a stochastic au-

tomaton.

• Adaptation of the above algorithm to non-binary linear block codes over a

non-binary channel described by a stochastic automaton.

• Employing matrix diagonalisation to deduce spectral domain alternatives to

the preceding two algorithms.

• Derivation of a result concerning the conditional spectral coefficients which is

of interest to probability theorists.

• Complexity analysis for the implementation of the above four algorithms, as

well as an examination of their storage requirements. From this investigation,

the benefits of the spectral domain approach for high rate codes become clear.

• Instructive examples which showcase the differences in the presented APP

decoding approaches between channels with and without memory.

• Numerical examples displaying a variety of performance analysis options avail-

able when investigating Hamming codes over GECs in conjunction with these

APP decoding procedures.


APP decoding using generalised weight polynomials for restricted GECs

Another advancement presented in this thesis involves using generalised weight poly-

nomials on restricted GECs. Previously, APP decoding methods oriented to this

specific channel model had not been published, as the closest contribution [26] con-

sidered a different type of decoding. The other main achievement in this area of the

thesis is the extension of this method for use with non-binary codes over non-binary

restricted GECs. Not only is a proof provided here of a conjecture in [26] regarding

the expression of conditional spectral coefficients in terms of the burst-error charac-

teristics of a restricted GEC, but the result is also extended to non-binary restricted

GECs. Thus, the major contributions of this thesis to the area of APP decoding for

restricted GECs are:

• Derivation of the conditional spectral coefficients used in APP decoding on

both binary and non-binary restricted GECs, in terms of the burst-error char-

acteristics of the channel.

• A new proof of the conjecture in [26] regarding the definition of the condi-

tional spectral coefficients in terms of burst-error characteristics for a binary

restricted GEC, as well as the generalisation of this theory for a non-binary

restricted GEC.

• Discussion of the link between the conditional spectral polynomials and the

binary and non-binary MacWilliams identities.

• Formulation of APP decoding procedures for both binary and non-binary re-

stricted GEC models using generalised weight polynomials.

• Numerical examples highlighting some of the many possible applications of

the theoretical framework developed in this thesis.

1.3 List of Relevant Publications

The following publications were authored or co-authored by the author of this thesis during his candidature and are relevant to the material contained herein.

(P.1) H.-J. Zepernick, W. Griffiths and M. Caldera, “APP decoding of binary

linear block codes on Gilbert-Elliott channels,” in Proc. 2nd IEEE Int. Symp.

on Wireless Commun. Systems, Siena, Italy, Sept. 2005, pp. 434-437.


The objective of (P.1) was to demonstrate the existence of APP decoding

algorithms which could be applied to binary linear block codes when commu-

nicating over a GEC. Although two algorithms were developed, one using the

original domain and one using the spectral domain, only the latter is described

explicitly. The paper is augmented by an example and simulation results. This

research forms the basis of Chapter 4.

(P.2) W. Griffiths, “APP decoding of linear block codes on Fritchman channels,”

in Proc. 5th Australian Telecommun. Cooperative Research Centre Workshop,

Melbourne, Australia, Nov. 2005, pp. 50-53.

This paper has a similar goal to (P.1), however it is adapted for use with a

single error state Fritchman channel model instead of a GEC. Such a channel

model can have a large number of states in order to perhaps model the bursty

nature of the wireless channel more closely. This increase in states comes

with a corresponding increase in computational complexity when performing

APP decoding. The overall strategy in (P.1) and (P.2) is however the same.

Chapter 4 presents the general algorithms for the spectral and also the original

domain, regardless of the state structure of the finite state Markov model.

(P.3) W. Griffiths, H.-J. Zepernick and M. Caldera, “APP decoding of block

codes over Gilbert-Elliott channels using generalized weight polynomials,” in

Proc. IEEE Veh. Technol. Conf., Melbourne, Australia, May 2006, pp. 1998-

2002.

The majority of the material in Chapter 6 has its origins in (P.3). Here, the

channel model used is a particular type of binary GEC. This model requires

only three parameters in order to describe its behaviour. The paper demon-

strates how this fact can be used to produce an algorithm based on polynomials

rather than matrices. The polynomials are related to the weight polynomials

of the MacWilliams identity [25], since they describe a relationship between

the dual and original domains. A short example and some numerical results

are also provided in (P.3).

(P.4) W. Griffiths, H.-J. Zepernick and M. Caldera, “On APP decoding of

non-binary block turbo codes over discrete channels,” in Proc. Int. Symp. on

Inf. Theory and its Appl., Parma, Italy, Oct. 2004, pp. 362-366.

Finally, the concept of schemes like the one presented in (P.4) is alluded to

in Chapter 8, under the heading of Future Research. This thesis is predom-


inantly concerned with maximum likelihood (ML) algorithms which assume

equally likely a priori information and do not reuse reliability information.

Nevertheless, this information is available for use when dealing with iterative

decoding or concatenated codes. These schemes are slightly more computationally complex than those presented here; however, they can give better error-correcting performance. The paper (P.4) presents one possibility for deploying iterative decoding techniques, as it concerns memoryless channels only. Additional

possible research areas are discussed in Chapter 8.


Chapter 2

Foundations

When modelling a system, it is beneficial to use a model which is simple to under-

stand, yet is still sufficiently complex to include all of its intricacies. Furthermore,

the problem to be solved, as well as any additional devices used in combating such a problem, must be expressible using the model. The purpose of this chap-

ter is to set up the framework used in this thesis for the discussion of transmission

and reception of information over discrete channels. In particular, the framework is

oriented towards using a posteriori probability decoding algorithms to attempt to

retrieve information which has been encoded using linear block codes.

One objective of this thesis is to design decoding algorithms for use over a variety

of communication channels. In order to do this, a method of modelling the channels

is established. Such models must reflect the different symbol alphabets which can

be used for transmission, as well as the way in which the errors occur. A model

for a channel where the errors occur independently is likely to be simpler than one

where there is dependence in the error generation process.

A common strategy to protect data from the effect of errors is to employ channel

encoding. As this thesis concerns linear block codes, the focus will be on this type

of code and related concepts such as the dual code and Hamming distance. Equally

important in the overall model is the channel decoding step, as this allows the

transmitted symbols to be estimated. Many methods exist for achieving this, and

they may be classified in different ways. Such classifications include whether or not

reliability information is used, and whether decoding estimates are made symbol-by-

symbol or word-by-word. Some decoding algorithms are trellis-based. Trellises can

display information about the code and the likelihoods of receiving particular symbol

sequences using such a code. Later chapters of this thesis will use representation

theory to map the information given in a trellis into matrix form.

This chapter is organised as follows. A general overview of the communication system used in this thesis is presented in Section 2.1. Section 2.2 provides

an overview and discussion of some of the discrete channel models for which the

decoding algorithms presented in the following chapters can be used. These chan-

nel models can be divided into two broad categories, those with memory and those

without. The focus of Section 2.3 is how coding can be used to attempt to recover

information which has been corrupted upon arrival at the receiver. The first part of

Section 2.3 deals with the different kinds of encoding processes which are available.

Next, a brief overview of the main types of decoding methods is given. Finally,

it is shown how trellis representations of linear block codes may assist with both

syndrome and APP decoding.

The main contributions to the body of knowledge in communications theory to

be found in this chapter are:

• Characterisation of a non-binary discrete memoryless channel in two distinct

ways, depending on the definition of the term “crossover probability”.

• Definition of the channel reliability factor for non-binary restricted Gilbert-

Elliott channels.

2.1 Communication System Model

The fundamental task in communications is the transmission of information from

a source to a sink. Besides the source, the transmitter may comprise units such as

a source encoder, channel encoder and a modulator. The objective of the source

encoder is to reduce redundancy which may be contained in the information released from the source, in order to reduce the total amount of data that needs to be

transmitted. In this way, system resources such as power and bandwidth may be

conserved. Given the increased susceptibility of information that has been com-

pressed by source encoding to channel errors, some form of channel encoding is

commonly employed in communication systems to protect the information prior to

transmission. In particular, error control coding is used, through which redundancy

is added to the source encoded information in a systematic way. This will enable

the receiver to detect and, if possible, correct transmission errors to some degree.

The receiver would then include a demodulator, channel decoder, and source de-

coder to perform the inverse operations. In this way, an estimate of the sequence of

transmitted symbols is produced for release to the sink.

The spatial separation between transmitter and receiver is bridged by a transmis-

sion channel. This channel may be associated with a wired or a wireless environment

and can thus possess a variety of stochastic characteristics. Obtaining an accurate description of how information is passed from source to sink requires the generation of a channel model which is as precise as possible with respect to the physical channel under study. Parameters of the channel model include specifications such as the size of the signalling alphabet and the degree of statistical dependencies between error events that usually impair the transmission of information. The resulting errors in the received sequence of information may originate from a number of causes, such as signal attenuation, multipath propagation, interference and additive noise at the receiver. These errors cause information degradation at the receiver, so that unless action is taken, the information will arrive incorrectly at the sink.

Figure 2.1: Block diagram of the considered digital communication system (Source, Channel Encoder, Modulator, Channel, Demodulator, Channel Decoder, Sink; the modulator, channel and demodulator together constitute the discrete channel).

A block diagram of the components of the communication system model considered in this thesis is given in Fig. 2.1. As the main focus of the work presented is

related to the channel-decoder cascade, the functionalities of source encoding and

decoding are not considered. Instead it is assumed that perfect source encoding

has been achieved, which supports the concept of the source releasing symbols or

sequences of given length with equal probability. In addition, the involved signals

are assumed to be discrete in both time and value. Such signals are then referred to

as digital signals. The considered communication system is therefore also known as

a digital communication system over which digital transmission is performed. The

components of the considered digital communication system that will be adopted

for this thesis are then as follows:


Source: The discrete source produces sequences of given length k comprised of

information symbols that are taken from a finite signalling alphabet. The

result is also referred to as an information word and will be denoted by

$$i = [i_1, i_2, \ldots, i_k]. \qquad (2.1)$$

Channel encoding: The information words i are processed by a channel encoder,

which performs a mapping of the input words i of length k onto output words

u of length n. The resulting word is referred to as a codeword and also consists

of symbols from a finite alphabet. Codewords will be denoted by

$$u = [u_1, u_2, \ldots, u_n]. \qquad (2.2)$$

In this thesis, the paradigms of binary and non-binary linear block codes will

be considered to perform the task of systematically introducing redundancy

to the information words.

Discrete channel: The transmission medium will be modelled as a discrete chan-

nel and can be considered as an abstraction of the modulator, the physical

channel, and the demodulator. As such, the discrete channel processes code-

words u and releases received words

$$v = [v_1, v_2, \ldots, v_n]. \qquad (2.3)$$

As a consequence, the particular modulation and demodulation schemes to-

gether with the complex description of the statistical properties of the physical

channel are replaced by a discrete model and the related pattern of error se-

quences. In the case that the error events are statistically independent for

each discrete time instant, the channel is considered as being memoryless,

otherwise it is said to possess memory. The concept of a discrete channel is

particularly suited to conducting a performance assessment of potential chan-

nel coding schemes within a given application and an early design phase of a

digital communication system. It has been adopted, for example, in analysing

the performance of linear block codes on channels with and without memory

when using syndrome decoding [28], the analysis of turbo codes on channels

with memory [23], and the analysis of LDPC codes for the GEC [24].

Channel decoder: The channel decoder processes the received word v to produce

an estimate $\hat{u}$ of the transmitted codeword u and eventually an estimate $\hat{i}$ of the information word i released by the source. In this thesis, symbol-by-

symbol APP decoding schemes for binary and non-binary linear block codes in

conjunction with different classes of discrete channel models with and without

memory are considered.

2.2 Channel Models

A sequence transmitted in a digital communication system is usually subject to

channel noise and other impairments. These may result in the received sequence

being different to the transmitted sequence. There can be differences in the lengths

and/or values of the sequences. In the latter case, the channel causes errors to occur.

In terms of modelling the behaviour of an arbitrary channel, there are two pos-

sibilities for describing the relationship between these errors. Firstly, errors could

occur independently of the errors that have occurred in all previous time periods.

In this situation, the channel model is termed memoryless, because the model is

not required to keep a history of its error events. An example of this model type

is the binary symmetric channel (BSC) model. Alternatively, the pattern of errors

in previous time instants could affect the probability of there being an error in the

current time instant. The corresponding type of channel model needs to retain a

description of past events and is said to have memory. One such channel model is

the GEC model.

In order to obtain the truest reconstruction of real communication channels,

choosing the correct type of channel model is paramount. For example, wireless

channels often experience fades due to multipath fading as well as Doppler shifts

due to the movement of the mobile station or objects in the environment. The

result is that the errors may occur in bursts during some time periods. This type

of behaviour is more naturally modelled by a channel with memory, as the bursts

typically last for more than one time instant.

This section describes some of the channels in conjunction with which the de-

coding methods as presented in Chapters 3 to 7 can be used. Firstly, a selection of

memoryless channels designed for different signalling alphabets will be presented.

This is followed by the descriptions of several channels with memory.

2.2.1 Memoryless channels

Figure 2.2: Transition probability diagram of a BSC.

Let the alphabet of symbols which may be transmitted and received be denoted U and V, respectively. The alphabets U and V are also called input alphabet or input set, and output alphabet or output set, respectively. In this thesis, input and output

alphabets are assumed to be of equal size, i.e., |U| = |V|. Furthermore, for discrete

time instant i ∈ Z+, denote the ith transmitted and received symbols by ui ∈ U and

vi ∈ V, respectively.

A memoryless channel is characterised by the property that for each input se-

quence or input word u of length n, and output sequence or output word v of length

n, the conditional probability P (v|u) can be written as the product

$$P(v|u) = \prod_{i=1}^{n} P(v_i|u_i) \qquad (2.4)$$

of the so-called transition probabilities P (vi|ui). In other words, each output symbol

vi ∈ V depends only on the corresponding input symbol ui ∈ U at discrete time

instant i, but not on the history of input symbols prior to the current time instant.

In terms of the error process, the property of being memoryless means that the

underlying noise affects each input symbol independently.
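To make (2.4) concrete, the following sketch (illustrative Python, not part of the original text; the transition function and parameter values are assumed for demonstration) evaluates the word-level conditional probability as a product of per-symbol transition probabilities:

```python
# Sketch: word-level probability P(v|u) of a memoryless channel, cf. eq. (2.4).
def word_probability(u, v, trans_prob):
    """Return P(v|u) as the product of per-symbol transition probabilities."""
    assert len(u) == len(v)
    prob = 1.0
    for ui, vi in zip(u, v):
        prob *= trans_prob(vi, ui)
    return prob

# Hypothetical example: a BSC with crossover probability 0.1 (see below).
eps = 0.1
bsc = lambda vi, ui: 1 - eps if vi == ui else eps
print(word_probability([0, 1, 1, 0], [0, 1, 0, 0], bsc))  # (1-eps)^3 * eps = 0.0729
```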

Binary symmetric channel

The input alphabet U and output alphabet V of a BSC consist of two elements. For

convenience, denote these binary elements as 0 and 1. The conditional probabilities

of a BSC are symmetric and are given by the transition probabilities

$$P(v_i|u_i) = \begin{cases} 1-\epsilon & \text{if } v_i = u_i, \\ \epsilon & \text{if } v_i \neq u_i. \end{cases} \qquad (2.5)$$

As can be seen from (2.5), a BSC is completely described by the parameter $\epsilon$, which

is referred to as the crossover probability. Also, the structure of a BSC may be

concisely represented by a transition probability diagram as shown in Fig. 2.2. A

binary channel model would be suitable for systems that incorporate binary channel

codes, where the BSC is a rather idealistic but simple model.

In this context, the operations on pairs of elements in the considered binary


alphabet {0, 1} are defined as addition and multiplication. These two operations

are performed modulo 2 with respect to the algebra defined by a Galois field or

finite field (see also Section 2.3)

$$GF(2) = \langle \{0,1\}, \oplus, \cdot \rangle. \qquad (2.6)$$

In brief, a Galois field is a finite set equipped with two binary operations which

can be applied to its elements. These operations are usually referred to as addi-

tion and multiplication as mentioned above, and must possess certain properties.

Namely, each operation must have a different identity element, and be closed and

associative. Each element is required to have an additive inverse and addition must

be commutative. A set satisfying these properties is called a ring. If all elements

besides the additive identity possess a unique multiplicative inverse, then the set is

a field. In view of the properties of the Galois field GF (2), a transition from an

input symbol ui ∈ U to an output symbol vi ∈ V may be formulated as

$$v_i = u_i \oplus f_i, \qquad (2.7)$$

where ⊕ denotes addition modulo 2 and fi ∈ {0, 1} represents the absence or pres-

ence of an error event.
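As a brief illustration of the error mechanism in (2.7), the sketch below (illustrative Python; the crossover probability and seed are assumed values) draws error symbols $f_i$ and adds them modulo 2 to the transmitted word:

```python
import random

def bsc_transmit(u, eps):
    """Pass a binary word through a BSC: v_i = u_i XOR f_i, cf. eq. (2.7)."""
    # f_i = 1 (an error event) with probability eps, otherwise f_i = 0.
    return [ui ^ (1 if random.random() < eps else 0) for ui in u]

random.seed(1)
u = [0, 1, 1, 0, 1, 0, 0, 1]
v = bsc_transmit(u, eps=0.2)
f = [ui ^ vi for ui, vi in zip(u, v)]  # recovered error word
print(v, f)
```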

An ultimate measure for the maximum rate of information in terms of bits per

transmission over a channel is given by the channel capacity C. As a consequence of

Shannon’s landmark 1948 paper [6], for all transmission rates less than the channel

capacity C it is possible to transmit with arbitrarily small error probability by using

sufficiently strong channel coding. In the case of a BSC with crossover probability

$\epsilon$, the channel capacity in bits is given by
$$C = 1 + (1-\epsilon)\log_2(1-\epsilon) + \epsilon \log_2 \epsilon. \qquad (2.8)$$
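A direct numerical evaluation of (2.8) can be sketched as follows (illustrative Python; the edge cases $\epsilon \in \{0, 1\}$ are handled separately to avoid taking the logarithm of zero):

```python
from math import log2

def bsc_capacity(eps):
    """Channel capacity of a BSC in bits per channel use, cf. eq. (2.8)."""
    if eps in (0.0, 1.0):
        return 1.0
    return 1 + (1 - eps) * log2(1 - eps) + eps * log2(eps)

print(bsc_capacity(0.11))  # approximately 0.5 bits per channel use
```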

Discrete memoryless channel

The BSC is a special case of a class of channel models known as DMCs, which allow

for non-binary input and output symbols. In this thesis, the non-binary input and

output alphabets U and V , respectively, are assumed to be of size |U| = |V| = p,

where p is a prime. This generalisation allows investigation of transmission systems

that use non-binary codes such as the ternary Golay code [29], non-binary Hamming

codes, and many others.

In this case, the involved alphabets U and V can be considered as the set of

integers {0, 1, . . . , p−1}. The operations of addition and multiplication on pairs of

elements of this set are performed modulo p according to the algebra defined by a Galois field (see also Section 2.3), which is denoted for brevity as

$$GF(p) = \langle \{0, 1, \ldots, p-1\}, \oplus, \cdot \rangle. \qquad (2.9)$$

Figure 2.3: Transition probability diagram of the p-ary symmetric DMC model: (a) standard model, (b) alternative model.

In addition, the considered p-ary DMC models are assumed to be symmetric

and differentiate only between error-free and erroneous transmission. The actual

value of the discrete error is not taken into account. The only parameter required

to describe such a type of p-ary DMC is the crossover probability. As the definition

of the crossover probability is somewhat ambiguous for a p-ary DMC, the following

two forms of model may be described.

Standard model: The transition probability diagram of a p-ary symmetric DMC

in standard form is shown in Fig. 2.3(a). For clarity, only the branches ema-

nating from the input node for an element g ∈ GF (p) have been shown. There

is a symmetrical set of p branches emanating from each of the other p−1 input

nodes. Given the crossover probability $\epsilon_1$, the weights of the branches on the transition probability diagram are defined by
$$P(v_i|u_i) = \begin{cases} 1-(p-1)\epsilon_1 & \text{if } v_i = u_i, \\ \epsilon_1 & \text{if } v_i \neq u_i. \end{cases} \qquad (2.10)$$

Alternative model: The alternative form of the p-ary symmetric DMC model is

shown in Fig. 2.3(b). In this alternative model, the probability $\epsilon_2$ is defined as
$$\epsilon_2 = \sum_{\substack{v_i \in GF(p) \\ v_i \neq u_i}} P(v_i|u_i), \qquad (2.11)$$
and the weights of the transition probability diagram are given by
$$P(v_i|u_i) = \begin{cases} 1-\epsilon_2 & \text{if } v_i = u_i, \\ \dfrac{\epsilon_2}{p-1} & \text{if } v_i \neq u_i. \end{cases} \qquad (2.12)$$

Although $\epsilon_1 \neq \epsilon_2$ holds for $p > 2$, the relationship between the parameters $\epsilon_1$ and $\epsilon_2$ can be easily deduced from (2.10) and (2.12) as
$$\epsilon_2 = (p-1)\epsilon_1. \qquad (2.13)$$

The channel capacities C1 and C2 of the standard and alternative p-ary symmetric

DMC models, respectively, are given as

$$C_1 = 1 + [1-(p-1)\epsilon_1]\log_p[1-(p-1)\epsilon_1] + (p-1)\epsilon_1 \log_p \epsilon_1, \qquad (2.14)$$
$$C_2 = 1 + (1-\epsilon_2)\log_p(1-\epsilon_2) + \epsilon_2 \log_p\!\left(\frac{\epsilon_2}{p-1}\right). \qquad (2.15)$$

Applying (2.13) in (2.15) shows that the two models have equal channel capacities.

Unless otherwise stated, the convention in this document will be to use the

standard p-ary DMC model as presented in Fig. 2.3(a). However, there will be some

cases where calculations will be done using both models, to emphasise the equality

of the two models.
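The equality of the two capacities can also be checked numerically, as in the following sketch (illustrative Python; p and $\epsilon_1$ are assumed example values):

```python
from math import log

def log_p(x, p):
    """Logarithm to base p."""
    return log(x) / log(p)

def c1(eps1, p):
    """Capacity of the standard p-ary symmetric DMC, cf. eq. (2.14)."""
    good = 1 - (p - 1) * eps1
    return 1 + good * log_p(good, p) + (p - 1) * eps1 * log_p(eps1, p)

def c2(eps2, p):
    """Capacity of the alternative p-ary symmetric DMC, cf. eq. (2.15)."""
    return 1 + (1 - eps2) * log_p(1 - eps2, p) + eps2 * log_p(eps2 / (p - 1), p)

p, eps1 = 5, 0.03
eps2 = (p - 1) * eps1            # eq. (2.13)
print(c1(eps1, p), c2(eps2, p))  # the two values coincide
```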

2.2.2 Channels with memory

Many practical communication channels, particularly wireless channels, do not pro-

duce single independent errors. Instead, the errors are likely to occur in bursts [30].

It is for this reason that such a channel is referred to as one having memory. Each er-

ror or non-error inherits a statistical dependence on the previous channel behaviour.

As such, the memoryless channel models as described in Section 2.2.1 are not com-

plex enough to represent such a dependence.

In order to furnish a model with memory, it is usually assumed that the channel

must reside in one of a finite number of states at each discrete time instant. These

states are indicators of different error likelihoods with respect to the symbol trans-

mission. It is then the transference of the model between these states which results

in periods with different clusterings of transmission errors. This may be observed

on a physical discrete channel with memory as error bursts and gaps between these

bursts. In this section, prominent models of channels with memory are reported and

discussed to the extent needed for the formulation of APP decoding algorithms in

later chapters.


Stochastic sequential machines

One of the first discussions of a finite state channel model, also known as a stochastic

sequential machine, was in Shannon’s watershed paper [6]. The introduction of a

finite number of states into the channel model has proven to be a relatively simple

method of incorporating memory. The following is the interpretation of a stochastic

sequential machine which is adopted in this thesis as a means of describing channels

with memory.

A stochastic sequential machine D is a discrete time device consisting of an input

set U of cardinality |U|, an output set V of cardinality |V|, a finite set S of S states

and a set of up to |U|·|V| matrix probabilities {D(vi|ui)} of size S×S. The notation

adopted is stated as

$$D = (\mathcal{U}, \mathcal{V}, \mathcal{S}, \{D(v_i|u_i)\}). \qquad (2.16)$$

At each time instant i ∈ Z+, a symbol ui ∈ U is transmitted and a symbol vi ∈ V is

received. The channel is in a state si ∈ S during this symbol transmission, and it will

subsequently assume a state si+1 ∈ S at time i+1. The stochastic process associated

with crossovers of symbols and state transitions is contained in the elements of the

matrix probabilities D(vi|ui) and will be detailed later in this section for a number

of major channels with memory. In order to complete the channel model, an initial

state distribution is required which will be described by a row vector σ0 of length

S. The stochastic sequential machine D together with the initial state distribution

vector σ0 forms a stochastic automaton

$$\mathcal{D} = (D, \sigma_0). \qquad (2.17)$$

It should be mentioned that the marginalisation of the matrix probabilities

D(vi|ui) over all output symbols vi ∈ V results in the so-called state transition

matrix

$$D = \sum_{v_i \in \mathcal{V}} D(v_i|u_i). \qquad (2.18)$$

In view of (2.18), the stationary state distribution is used in this thesis to initialise

the sequence of state transitions over discrete time rather than randomly choosing

an arbitrary initial state distribution. Given the constraint that the sum of the

stationary state probabilities represented by the elements of the stationary state

distribution vector σ0 must be one as

$$\sigma_0 e = 1, \qquad (2.19)$$


where e denotes an all-ones column vector of length S, the stationary state distri-

bution can be easily obtained by solving the following equation:

$$\sigma_0 D = \sigma_0. \qquad (2.20)$$
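In practice, (2.19) and (2.20) can be solved together as one overdetermined linear system, as the following sketch shows (illustrative Python using NumPy; the example matrix is an assumed two-state case with P = 0.05 and Q = 0.20):

```python
import numpy as np

def stationary_distribution(D):
    """Solve sigma0 D = sigma0 subject to sigma0 e = 1, cf. eqs. (2.19)-(2.20)."""
    S = D.shape[0]
    # Stack the fixed-point condition (D^T - I) sigma0^T = 0 with normalisation.
    A = np.vstack([D.T - np.eye(S), np.ones(S)])
    b = np.zeros(S + 1)
    b[-1] = 1.0
    sigma0, *_ = np.linalg.lstsq(A, b, rcond=None)
    return sigma0

D = np.array([[0.95, 0.05],
              [0.20, 0.80]])
print(stationary_distribution(D))  # [0.8, 0.2]
```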

Hidden Markov models

It remains to be shown how the matrix probabilities D(vi|ui) of a stochastic sequen-

tial machine facilitate the modelling of channel memory. In essence, the error and

state sequences form a HMM, and as such, they result in two layers of stochasticity.

The error sequence representing the difference between the transmitted and received

symbol sequences is the observable result of a random process. However, the under-

lying sequence of states which in part produced the error sequence is hidden.

More specifically, if a discrete channel model with a memory of size l ∈ Z+ is re-

quired, then the state sequence must form a Markov chain of order l. In other words,

the state sequence obeys the lth order Markovian property that if a sequence of dis-

crete random variables S1, S2, . . . , Si produces a sequence of realisations s1, s2, . . . , si,

where sj ∈ S, ∀j ∈ {1, 2, . . . , i}, then

$$P(S_{i+1} = s_{i+1} \mid S_i = s_i, S_{i-1} = s_{i-1}, \ldots, S_1 = s_1) = P(S_{i+1} = s_{i+1} \mid S_i = s_i, S_{i-1} = s_{i-1}, \ldots, S_{i-l+1} = s_{i-l+1}). \qquad (2.21)$$

Accordingly, the probability that the channel assumes state Si+1 = si+1 at time

instant i + 1 is conditioned only on the sequence of l preceding states but not the

entire history of states. As only channels with a memory of size l = 1 will be

considered in this thesis, (2.21) simplifies to

$$P(S_{i+1} = s_{i+1} \mid S_i = s_i, S_{i-1} = s_{i-1}, \ldots, S_1 = s_1) = P(S_{i+1} = s_{i+1} \mid S_i = s_i). \qquad (2.22)$$

By analogy, conditional probabilities that account for both the error sequence

and the state sequence can be formulated. Suppose the input symbol ui ∈ U and

the output symbol vi ∈ V as realisations of the random variables Ui and Vi at time

instant i, respectively, and a channel model with memory of size l = 1 are given.

Then, an element $d_{s_i,s_{i+1}}(V_i = v_i \mid U_i = u_i)$ of the matrix probability $D(v_i|u_i)$ of size $S \times S$ for each pair of the current state $S_i = s_i \in \mathcal{S}$ and subsequent state $S_{i+1} = s_{i+1} \in \mathcal{S}$ can be expressed as
$$d_{s_i,s_{i+1}}(V_i = v_i \mid U_i = u_i) = P(V_i = v_i, S_{i+1} = s_{i+1} \mid S_i = s_i, U_i = u_i). \qquad (2.23)$$


Simplifying notation, applying Bayes’ rule, and assuming independence between the

underlying source and state processes, the expression in (2.23) may be written as

$$\begin{aligned} d_{s_i,s_{i+1}}(v_i|u_i) &= P(v_i, s_{i+1} \mid s_i, u_i) \\ &= P(v_i \mid s_{i+1}, s_i, u_i)\,P(s_{i+1} \mid s_i, u_i) \\ &= P(v_i \mid s_{i+1}, s_i, u_i)\,P(s_{i+1} \mid s_i). \end{aligned} \qquad (2.24)$$

For each combination of si ∈ S and si+1 ∈ S, assume that there is a p-ary sym-

metric DMC where the probabilities of correct and erroneous symbol transmission

are known. For the binary case, two matrix probabilities may be defined as

$$D_{u_i \oplus v_i} = D(v_i|u_i) = \left[\, d_{s_i,s_{i+1}}(v_i|u_i) \,\right]_{S \times S} \in \{D_0, D_1\}. \qquad (2.25)$$

Thus, the probabilities P (vi|si+1, si, ui) and P (si+1|si) are required in order to de-

scribe the channel model. A more specific description of some prominent two-state

binary and non-binary channel models with memory follows.

Binary general two-state Markov channel

Consider a two-state Markov channel comprising a ‘good’ state G and a ‘bad’ state

B. The terms “good” and “bad” are used to indicate favourable and difficult channel

conditions in terms of providing correct transmission, respectively. The related error

processes can be quantified by probabilities P (vi|si+1, si, ui), which, for the general

binary channel with ui, vi ∈ {0, 1}, depend on the state si at time instant i, as

well as the subsequent state si+1 at time instant i+ 1. Assuming that a BSC model

applies for each of the possible pairs of the current state si ∈ {G,B} and subsequent

state si+1 ∈ {G,B}, the corresponding crossover probabilities may be denoted as

$$P(v_i \neq u_i \mid S_{i+1} = G, S_i = G, u_i) = p_{GG}, \qquad (2.26)$$
$$P(v_i \neq u_i \mid S_{i+1} = B, S_i = G, u_i) = p_{GB}, \qquad (2.27)$$
$$P(v_i \neq u_i \mid S_{i+1} = G, S_i = B, u_i) = p_{BG}, \qquad (2.28)$$
$$P(v_i \neq u_i \mid S_{i+1} = B, S_i = B, u_i) = p_{BB}. \qquad (2.29)$$

In addition, define the time-invariant state transition probabilities for the con-

sidered class of two-state channel models with S = {G,B} as

$$P(S_{i+1} = G \mid S_i = G) = 1 - P, \qquad (2.30)$$
$$P(S_{i+1} = B \mid S_i = G) = P, \qquad (2.31)$$
$$P(S_{i+1} = G \mid S_i = B) = Q, \qquad (2.32)$$
$$P(S_{i+1} = B \mid S_i = B) = 1 - Q. \qquad (2.33)$$


Figure 2.4: Structure of the binary general two-state Markov model.

The structure of the error process and state transitions along with the related

parameters can be summarised as shown in Fig. 2.4. The state transition diagram

shows the connection between the ‘good’ state G and the ‘bad’ state B, where each

branch is labelled with its corresponding state transition probability. As far as the

four BSC models are concerned, these may also be considered as being associated

with one of the four branches in the state transition diagram.

For the considered symmetric channel with memory, the parameters given in

(2.26)-(2.33) may be used to concisely describe the general two-state Markov channel

by matrix probabilities. As the considered channel is symmetric with respect to

transmission error, the two possible matrix probabilities may be written as

$$D_0 = D(v_i = u_i \mid u_i) = \begin{bmatrix} (1-p_{GG})(1-P) & (1-p_{GB})P \\ (1-p_{BG})Q & (1-p_{BB})(1-Q) \end{bmatrix}, \qquad (2.34)$$
$$D_1 = D(v_i \neq u_i \mid u_i) = \begin{bmatrix} p_{GG}(1-P) & p_{GB}P \\ p_{BG}Q & p_{BB}(1-Q) \end{bmatrix}. \qquad (2.35)$$

Then, the state transition matrix for the general two-state Markov channel model

can be reported as

$$D = D_0 + D_1 = \begin{bmatrix} 1-P & P \\ Q & 1-Q \end{bmatrix}. \qquad (2.36)$$

In addition, the difference matrix ∆ between the matrix probabilities D0 and

D1 will be frequently used in the following chapters for the proposed APP decoding

algorithms. Such a difference matrix can be represented for the general two-state


Markov channel as

$$\Delta = D_0 - D_1 = \begin{bmatrix} (1-2p_{GG})(1-P) & (1-2p_{GB})P \\ (1-2p_{BG})Q & (1-2p_{BB})(1-Q) \end{bmatrix}. \qquad (2.37)$$

To complete the channel model, the stationary state distribution vector σ0 can

be calculated by solving (2.20) using (2.36) and subject to the constraint in (2.19)

as

$$\sigma_0 = [\sigma_G, \sigma_B] = \left[ \frac{Q}{P+Q},\ \frac{P}{P+Q} \right]. \qquad (2.38)$$

Binary Gilbert-Elliott channel

If the crossover probabilities of the binary general two-state Markov channel model

were to depend only on the current state Si of the channel, rather than both the

current state Si and the subsequent state Si+1, then

$$P(v_i \mid S_{i+1} = s_{i+1}, S_i = s_i, u_i) = P(v_i \mid S_i = s_i, u_i). \qquad (2.39)$$

Only two different crossover probabilities would then need to be specified for the

characterisation of the error process, that is

$$p_G = p_{GG} = p_{GB}, \qquad (2.40)$$
$$p_B = p_{BG} = p_{BB}, \qquad (2.41)$$

where pG and pB refer to the crossover probabilities in the ‘good’ state G and ‘bad’

state B, respectively. The condition pG ≤ pB could be applied to identify the states

G and B.

The resulting channel model is known as the GEC model [3] and may be illus-

trated as shown in Fig. 2.5. Its two matrix probabilities are given by

$$D_0 = D(v_i = u_i \mid u_i) = \begin{bmatrix} 1-p_G & 0 \\ 0 & 1-p_B \end{bmatrix} \begin{bmatrix} 1-P & P \\ Q & 1-Q \end{bmatrix}, \qquad (2.42)$$
$$D_1 = D(v_i \neq u_i \mid u_i) = \begin{bmatrix} p_G & 0 \\ 0 & p_B \end{bmatrix} \begin{bmatrix} 1-P & P \\ Q & 1-Q \end{bmatrix}. \qquad (2.43)$$

The stationary state distribution vector σ0 and state transition matrix D remain

unchanged from (2.38) and (2.36), respectively, while the difference matrix ∆ may

be expressed as

$$\Delta = D_0 - D_1 = \begin{bmatrix} 1-2p_G & 0 \\ 0 & 1-2p_B \end{bmatrix} D. \qquad (2.44)$$
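As a concrete instance of (2.42)-(2.44), the sketch below (illustrative Python with assumed parameter values) assembles the GEC matrix probabilities and confirms that marginalisation recovers the state transition matrix:

```python
import numpy as np

P, Q = 0.05, 0.20       # assumed state transition probabilities
pG, pB = 0.01, 0.30     # assumed crossover probabilities in states G and B

D = np.array([[1 - P, P],
              [Q, 1 - Q]])                     # eq. (2.36)
D0 = np.diag([1 - pG, 1 - pB]) @ D             # eq. (2.42)
D1 = np.diag([pG, pB]) @ D                     # eq. (2.43)
Delta = np.diag([1 - 2 * pG, 1 - 2 * pB]) @ D  # eq. (2.44)

print(np.allclose(D0 + D1, D))      # True: marginalisation recovers D
print(np.allclose(D0 - D1, Delta))  # True
```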


Figure 2.5: Structure of the binary GEC model.

Channel models usually do not have a unique parametrisation. The binary GEC

model has been defined above in terms of four parameters, which are connected

to the theoretical concept of Markov chains. Specifically, these parameters were

the state transition probabilities P and Q, and the crossover probabilities pG and

pB. Alternatively, a parametrisation more strongly related to the distribution of the

bursts of transmission errors may be considered. These types of parameters could

be extracted by measuring error patterns from an actual channel. The following

three alternative channel parameters have been reported in [26]:

Average fade to connection time ratio: The average fade to connection time

ratio x describes how often on average the channel is in the ‘good’ state G

compared to the ‘bad’ state B. It is related to the state transition probabilities

P and Q of the GEC model by

$$x = \frac{P}{Q}. \qquad (2.45)$$

Burst factor: The burst factor y relates to the correlation function of the error

process and indicates how clustered the discrete error events may appear on

the channel. A burst factor of y = 0 is obtained for the case of statistically

independent errors, such as for a DMC. As y increases, the errors occur more

frequently in bursts, as each becomes increasingly dependent on the existence

or non-existence of previous errors. A burst factor of y = 1 then indicates max-

imal error dependency. The relationship to the state transition probabilities

P and Q is given by

$$y = 1 - P - Q. \qquad (2.46)$$


Figure 2.6: Structure of the standard non-binary GEC model.

Channel reliability factor: This parameter is directly related to the average

bit error rate (BER) of the channel, which is a quantity of greater practical

significance compared to the individual crossover probabilities in each channel

state. Accordingly, the average BER $p_b$ of a binary GEC can be written as
$$p_b = p_G\sigma_G + p_B\sigma_B = p_G\frac{Q}{P+Q} + p_B\frac{P}{P+Q}, \qquad (2.47)$$

where the subscript b signifies BER rather than symbol error rate (SER). Given

the average BER in (2.47), the channel reliability factor z for the binary GEC

has been defined in [26] as

$$z = 1 - 2p_b = (1-2p_G)\frac{Q}{P+Q} + (1-2p_B)\frac{P}{P+Q}. \qquad (2.48)$$

A channel reliability factor of z = 0 is obtained for a totally unreliable channel

(pb = 0.5). On the other hand, an error-free channel (pb = 0) gives a channel

reliability factor of z = 1.
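The mapping from the Markov-chain parameters to the burst-error characteristics can be summarised in a few lines (illustrative Python; the parameter values are assumptions for demonstration):

```python
def burst_parameters(P, Q, pG, pB):
    """Burst-error characteristics of a binary GEC, cf. eqs. (2.45)-(2.48)."""
    x = P / Q                                  # average fade to connection time ratio
    y = 1 - P - Q                              # burst factor
    pb = pG * Q / (P + Q) + pB * P / (P + Q)   # average BER, eq. (2.47)
    z = 1 - 2 * pb                             # channel reliability factor
    return x, y, z

print(burst_parameters(P=0.05, Q=0.20, pG=0.01, pB=0.30))  # (0.25, 0.75, 0.864)
```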

Non-binary Gilbert-Elliott channel

It is straightforward to extend the binary GEC model to accommodate non-binary

input and output symbols. In this situation, the BSCs associated with the two states

of the binary GEC model are replaced by p-ary DMCs. The state transition process

between the two channel states and the related state transition probabilities remain


the same. The structure of a non-binary GEC model may then be illustrated as

shown in Fig. 2.6. Note that the p-ary DMC model in standard form as shown in

Fig. 2.3(a) has been used here, although either p-ary DMC model may be applied.

The non-binary GEC model is able to be described by a stochastic automaton

$\mathcal{D} = (D, \sigma_0)$ as defined in (2.17), with stationary state distribution vector $\sigma_0$ as given in (2.38), and stochastic sequential machine D defined by
$$D = (\mathcal{U} = \{0,1,\ldots,p-1\},\ \mathcal{V} = \{0,1,\ldots,p-1\},\ \mathcal{S} = \{G,B\},\ \{D(v_i|u_i)\}). \qquad (2.49)$$

The elements of the matrix probabilities D(vi|ui) can be specified by noting that

the following property holds for the p-ary symmetric DMCs in each state Si = si ∈ S:

$$P(v_i = u_i \oplus f_i \mid s_i, u_i) = P(v_i = u_i' \oplus g_i \mid s_i, u_i') \quad \forall\, u_i, u_i' \in GF(p), \qquad (2.50)$$
whenever both symbols $f_i, g_i \in GF(p)\setminus\{0\}$ or if $f_i = g_i = 0$. Adopting the notation presented in [31], the set of matrix probabilities may then be represented as
$$\{D(v_i|u_i)\} = \{\, D_{f_i} := D(v_i = u_i \oplus f_i \mid u_i) \;|\; f_i, u_i \in GF(p) \,\}. \qquad (2.51)$$

In view of (2.51), the set of matrix probabilities is of size p and its elements can be

identified through the value of the error symbols fi ∈ GF (p). Using the notation

$$v_i \ominus u_i \equiv v_i - u_i \pmod{p}, \qquad (2.52)$$
the set of matrix probabilities may be related to the events of correct and erroneous symbol transmission as
$$D(v_i|u_i) = D_{v_i \ominus u_i} = D_{f_i} = \begin{cases} D_0 & \text{if } f_i = 0, \\ D_\epsilon & \text{if } f_i \in GF(p)\setminus\{0\}. \end{cases} \qquad (2.53)$$

As for the binary GEC, only two 2 × 2 matrix probabilities are required to de-

scribe this discrete non-binary channel with memory. The two matrix probabilities

accounting for correct and erroneous transmission can be written using (2.24) and

(2.39) in terms of the parameters of a p-ary symmetric DMC as

$$D_0 = \begin{bmatrix} 1-(p-1)p_G & 0 \\ 0 & 1-(p-1)p_B \end{bmatrix} \begin{bmatrix} 1-P & P \\ Q & 1-Q \end{bmatrix}, \qquad (2.54)$$
$$D_\epsilon = \begin{bmatrix} p_G & 0 \\ 0 & p_B \end{bmatrix} \begin{bmatrix} 1-P & P \\ Q & 1-Q \end{bmatrix}. \qquad (2.55)$$


The marginalisation of the matrix probabilities with respect to the output symbol

vi ∈ GF (p) leads to the state transition matrix D, while the concept of the difference

between correct and erroneous transmission results in the difference matrix ∆. These

may be expressed respectively as

$$D = D_0 + (p-1)D_\epsilon = \begin{bmatrix} 1-P & P \\ Q & 1-Q \end{bmatrix}, \qquad (2.56)$$
$$\Delta = D_0 - D_\epsilon = \begin{bmatrix} 1-pp_G & 0 \\ 0 & 1-pp_B \end{bmatrix} D. \qquad (2.57)$$

It may also be beneficial to report an alternative parametrisation for the non-binary

GEC model in terms of the average fade to connection time ratio x, the burst factor

y, and the channel reliability factor z. The former two parameters depend only on

the state transition probabilities, and are therefore given by the same expressions

(2.45) and (2.46) as for the binary GEC. However, the calculation of the channel reli-

ability factor z also involves the crossover probabilities pG and pB of the constituent

DMCs, and can be derived as follows.

Without loss of generality, consider the standard non-binary GEC model shown

in Fig. 2.6. Assuming the input symbols $u_i \in GF(p)$ appear with equal probability, that is
$$\forall\, u_i \in \mathcal{U}: \quad P(u_i) = \frac{1}{p}, \qquad (2.58)$$

then the average SER for this model can be calculated as

$$\begin{aligned} p_s &= \sum_{s_i \in \{G,B\}} \sum_{u_i \in GF(p)} P(f_i \neq 0 \mid s_i, u_i)\,P(s_i)\,P(u_i) \\ &= \frac{1}{p} \sum_{u_i \in GF(p)} (p-1)(p_G\sigma_G + p_B\sigma_B) \\ &= (p-1)(p_G\sigma_G + p_B\sigma_B). \end{aligned} \qquad (2.59)$$

By analogy with the difference matrix ∆ defined in (2.57), the channel reliability

factor z can be regarded as an average difference in the probabilities for correct and

erroneous transmission for a given transmission error fi = vi ⊖ ui = g ∈ GF (p).

As averaging is being performed with respect to the channel states G and B, the


channel reliability factor z may be calculated as
$$\begin{aligned} z &= \sum_{s_i \in \{G,B\}} \sum_{u_i \in GF(p)} \left[ P(f_i = 0 \mid s_i, u_i) - P(f_i = g \neq 0 \mid s_i, u_i) \right] P(s_i)\,P(u_i) \\ &= \frac{1}{p} \sum_{u_i \in GF(p)} \left[ (1-pp_G)\sigma_G + (1-pp_B)\sigma_B \right] \\ &= (1-pp_G)\sigma_G + (1-pp_B)\sigma_B \\ &= 1 - \frac{p}{p-1}\,p_s. \end{aligned} \qquad (2.60)$$

Figure 2.7: Structure of the binary Gilbert channel model.
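The closed form in (2.60) can be verified against the direct state average, as in the sketch below (illustrative Python; p and the channel parameters are assumed values):

```python
def nonbinary_gec_z(P, Q, pG, pB, p):
    """Channel reliability factor of a p-ary GEC, cf. eqs. (2.59)-(2.60)."""
    sG, sB = Q / (P + Q), P / (P + Q)     # stationary distribution, eq. (2.38)
    ps = (p - 1) * (pG * sG + pB * sB)    # average SER, eq. (2.59)
    z_direct = (1 - p * pG) * sG + (1 - p * pB) * sB
    z_from_ps = 1 - p / (p - 1) * ps
    return z_direct, z_from_ps

print(nonbinary_gec_z(P=0.05, Q=0.20, pG=0.01, pB=0.10, p=3))  # both forms give 0.916
```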

Binary Gilbert channel

There are special cases of the GEC model which deserve particular consideration.

Firstly, suppose that the BSC in the ‘good’ state G is perfect in the sense that

it does not cause transmission errors and therefore has a crossover probability of

pG = 0. This channel model is known as the Gilbert model [32], named after its

developer Edgar N. Gilbert. It has the advantage of fairly simple analytical error

probability calculations [33]. Figure 2.7 shows a schematic diagram of the binary

Gilbert model.

In terms of the matrix probabilities, the only change from the binary GEC model

is the restriction of pG = 0 and therefore

$$D_0 = \begin{bmatrix} 1 & 0 \\ 0 & 1-p_B \end{bmatrix} \begin{bmatrix} 1-P & P \\ Q & 1-Q \end{bmatrix}, \qquad (2.61)$$
$$D_1 = \begin{bmatrix} 0 & 0 \\ 0 & p_B \end{bmatrix} \begin{bmatrix} 1-P & P \\ Q & 1-Q \end{bmatrix}. \qquad (2.62)$$


Figure 2.8: Structure of the non-binary Gilbert channel model using the standard p-ary DMC model.

The state transition matrix and difference matrix are then calculated as

$$D = \begin{bmatrix} 1-P & P \\ Q & 1-Q \end{bmatrix}, \qquad (2.63)$$
$$\Delta = \begin{bmatrix} 1 & 0 \\ 0 & 1-2p_B \end{bmatrix} D. \qquad (2.64)$$

Non-binary Gilbert channel

The non-binary Gilbert channel is obtained from the non-binary GEC model by

setting the crossover probability pG of the channel in the ‘good’ state G to zero. A

diagram of the non-binary Gilbert channel model using the standard p-ary DMC

model as in Fig. 2.3(a) is shown in Fig. 2.8.

It follows that the state transition matrix D is defined as in (2.63), while the

remaining three matrices are given as

$$D_0 = \begin{bmatrix} 1 & 0 \\ 0 & 1-(p-1)p_B \end{bmatrix} D, \qquad (2.65)$$
$$D_\epsilon = \begin{bmatrix} 0 & 0 \\ 0 & p_B \end{bmatrix} D, \qquad (2.66)$$


and

$$\Delta = \begin{bmatrix} 1 & 0 \\ 0 & 1-pp_B \end{bmatrix} D. \qquad (2.67)$$

Binary restricted Gilbert-Elliott channel

In the same way that the Gilbert channel considers the extreme case of the channel

being error-free in the ‘good’ state G, another special case would be to consider a

channel that is totally unreliable in the ‘bad’ state B. In other words, with the

crossover probability pB = 0.5, the channel capacity of the BSC in the ‘bad’ state

B is zero. This case will be considered below for the binary GEC model and will

be referred to as the binary restricted GEC model. The structure of this restricted

GEC is displayed in Fig. 2.9.

Firstly consider the matrix probabilities D0 and D1 for a binary restricted GEC

model (pB = 0.5), which may be expressed as

$$D_0 = \begin{bmatrix} 1-p_G & 0 \\ 0 & 0.5 \end{bmatrix} D, \qquad (2.68)$$
$$D_1 = \begin{bmatrix} p_G & 0 \\ 0 & 0.5 \end{bmatrix} D, \qquad (2.69)$$

where the state transition matrix D remains unchanged from that given in (2.36).

However, the difference matrix for the binary restricted GEC model has a particu-

larly sparse form and is therefore given the special designation

$$\delta = \Delta(p_B = 0.5) = \begin{bmatrix} 1-2p_G & 0 \\ 0 & 0 \end{bmatrix} D. \qquad (2.70)$$

As the restricted GEC model is defined by a set of three parameters, either

the mathematical parameters P , Q, and pG, or the more measurement-related pa-

rameters x, y, and z may be used. In the latter case, it is possible to formulate

APP decoding based on polynomials in the three variables x, y, and z as will be

shown in Chapter 6. In addition, duality concepts similar to those contained in the

MacWilliams identity [25] can be revealed for the proposed APP decoding algorithms

on restricted GEC models by using the polynomials in x, y, and z.

While the average fade to connection time ratio x and the burst factor y do

not depend on crossover probabilities and therefore are given in (2.45) and (2.46),

respectively, the channel reliability factor z for the binary restricted GEC is obtained


by applying the constraint $p_B = 0.5$ to (2.48), resulting in
$$z = (1-2p_G)\frac{Q}{P+Q}. \qquad (2.71)$$

Figure 2.9: Structure of the binary restricted GEC model.

This allows the stationary state distribution vector $\sigma_0$, state transition matrix $D$, and difference matrix $\delta = \Delta(p_B = 0.5)$ given in (2.38), (2.36) and (2.70), respectively, to be expressed in terms of the burst-error characteristics x, y, and z as
$$\sigma_0(x) = \left[ \frac{1}{1+x},\ \frac{x}{1+x} \right], \qquad (2.72)$$
$$D(x,y) = \frac{1}{1+x} \begin{bmatrix} 1+xy & x-xy \\ 1-y & x+y \end{bmatrix}, \qquad (2.73)$$
$$\delta(x,y,z) = z(1+x) \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} D(x,y). \qquad (2.74)$$
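A cross-check of the (x, y, z) parametrisation against the underlying (P, Q, pG) parameters can be sketched as follows (illustrative Python using NumPy; the parameter values are assumptions):

```python
import numpy as np

def restricted_gec_xyz(P, Q, pG):
    """(P, Q, pG) -> (x, y, z) for a binary restricted GEC, cf. (2.45), (2.46), (2.71)."""
    return P / Q, 1 - P - Q, (1 - 2 * pG) * Q / (P + Q)

def restricted_gec_matrices(x, y, z):
    """sigma0(x), D(x, y) and delta(x, y, z), cf. eqs. (2.72)-(2.74)."""
    sigma0 = np.array([1 / (1 + x), x / (1 + x)])
    D = np.array([[1 + x * y, x - x * y],
                  [1 - y, x + y]]) / (1 + x)
    delta = z * (1 + x) * np.diag([1.0, 0.0]) @ D
    return sigma0, D, delta

P, Q, pG = 0.05, 0.20, 0.02
sigma0, D, delta = restricted_gec_matrices(*restricted_gec_xyz(P, Q, pG))
D_ref = np.array([[1 - P, P], [Q, 1 - Q]])
print(np.allclose(D, D_ref))                                   # True
print(np.allclose(delta, np.diag([1 - 2 * pG, 0.0]) @ D_ref))  # True
```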

Non-binary restricted Gilbert-Elliott channel

In order to obtain a channel capacity of zero in the p-ary symmetric DMC for the

‘bad’ state B of the non-binary GEC, a crossover probability of $p_B = \frac{1}{p}$ is required if using the standard model shown in Fig. 2.3(a). In the case of the alternative model shown in Fig. 2.3(b), a crossover probability of $p_B = \frac{p-1}{p}$ is required. The resulting non-binary restricted GEC models are shown in Figs. 2.10 and 2.11, respectively. In both cases, the crossover probabilities in the ‘bad’ state B are identical.

Figure 2.10: Structure of the standard non-binary restricted GEC model.

The matrix probabilities $D_0$ and $D_\epsilon$ under the standard DMC model can be written as

$$D_0 = \begin{bmatrix} 1-(p-1)p_G & 0 \\ 0 & \frac{1}{p} \end{bmatrix} D, \qquad (2.75)$$
$$D_\epsilon = \begin{bmatrix} p_G & 0 \\ 0 & \frac{1}{p} \end{bmatrix} D, \qquad (2.76)$$
where the state transition matrix D remains as given in (2.36). The difference matrix $\delta$ can be represented as
$$\delta = \Delta\!\left(p_B = \tfrac{1}{p}\right) = \begin{bmatrix} 1-pp_G & 0 \\ 0 & 0 \end{bmatrix} D. \qquad (2.77)$$

Under the alternative model, the state transition matrix D remains as in (2.36),

while the other matrix probabilities can be reported as

$$D_0 = \begin{bmatrix} 1-p_G & 0 \\ 0 & \frac{1}{p} \end{bmatrix} D, \qquad (2.78)$$
$$D_\epsilon = \begin{bmatrix} \frac{p_G}{p-1} & 0 \\ 0 & \frac{1}{p} \end{bmatrix} D, \qquad (2.79)$$
$$\delta = \Delta\!\left(p_B = \tfrac{p-1}{p}\right) = \begin{bmatrix} 1-\frac{p}{p-1}p_G & 0 \\ 0 & 0 \end{bmatrix} D. \qquad (2.80)$$

Figure 2.11: Structure of the alternative non-binary restricted GEC model.

In order to formulate APP decoding algorithms for non-binary restricted GEC

models using polynomials in terms of the burst-error characteristics x, y, and z (see

Chapter 7), the same rationale as outlined for the binary restricted GEC model

applies. Clearly, the stationary state distribution vector σ0(x) and state transition

matrix D(x, y) in x and y are identical to those for the binary restricted GEC, as

these are only dependent on the state transitions and are unrelated to the error

process specified by the crossover probabilities. It is also straightforward to show

that the difference matrix δ(x, y, z) is the same for both the standard and alternative

DMC model formulations of the non-binary restricted GEC. The only difference is

the channel reliability factor z, which is defined differently for each model. For

the standard DMC model, substituting pB = 1p

into (2.60) results in the channel

reliability factor

z = (1 − ppG)σG, (2.81)

while under the alternative DMC model, the expression

z =

(

1 − p

p− 1pG

)

σG (2.82)

for the channel reliability factor can be derived. Both models result in the same for-

mulation of the difference matrix δ(x, y, z) as given in (2.74) for the binary restricted

GEC.


Other channel models

The number of states in the HMM for the channel can also be increased. In order to

limit the complexity of such a model, assume binary transmission and that the error

probabilities in each state are deterministic. That is, let there be one or more ‘good’

states where there are no errors and the crossover probability is zero, as well as one

or more ‘bad’ states which always produce errors and where the crossover probability

is one. Additionally, assume that state transitions between any two different error

states and any two different error-free states are not permitted. There are thus only

alternating variable-length periods of correct and incorrect reception. Observe that

these are semi-hidden Markov models as the current state of the channel cannot

be determined completely from the error sequence, however some information can

be gathered. These models were proposed by Fritchman in [34]. More information

concerning them can be found in [4, 35,36,37].

A common task in telecommunications is to find the best model for an observed

sample of a real channel. This is usually referred to as the parameter estimation

problem and a tool used to solve it is the Baum-Welch algorithm [38]. In [39], it is

demonstrated that the three-state Fritchman model is equivalent to the two-state

Gilbert model. This equivalence of models highlights the fact that there may not

be a unique solution to the parameter estimation problem.

Alternatively, assume non-deterministic error probabilities, but suppose that state transitions are allowed only when errors occur. This is the premise of the McCullough model, also

referred to as the binary regenerative channel [40]. Although its state diagram is

more complex than that of the Gilbert model, its parameters are better related to the

statistical properties of the noise involved. Another model sometimes used is that of

Swoboda [41], which is particularly useful for pulse code modulation channels [42].

There are thus many different models to which the decoding procedures derived in

Chapters 4 and 5 may be applied.

2.3 Error Control Coding

The second major functional block of the digital communication system model con-

sidered in this thesis deals with channel coding. A review of some fundamentals of

error control coding is therefore given in the following sections. The ideas behind

finite fields, convolutional codes and linear block codes are provided. With a focus

on block codes, the concepts of syndrome decoding, sequence estimation decoding,

and symbol estimation decoding are described. Then, the syndrome trellis is introduced as a systematic framework for supporting symbol-by-symbol APP decoding.

As the trellis plays an important role in the derivation of APP decoding algorithms

in this thesis, an instructive example of constructing different types of trellises for

a simple linear block code is provided.

2.3.1 Encoding

In the context of error control coding, encoding refers to the process by which a

sequence of information symbols created by the source may gain protection from

errors as it is transmitted over a channel. This is achieved by introducing some

form of redundancy to the original sequence of information symbols.

In this section, the arithmetic of finite fields will be briefly reviewed, in order to

understand how redundancy may be produced from the information symbols. Then,

the two major paradigms of convolutional and block encoding will be discussed.

Finite fields

A code is a subset of a vector space, which is a mathematical object taken over a

set of scalars. This set could for example be a group, a ring or a field. Examples

of codes over rings and cyclic groups can be found in [43] and [44], respectively.

Although fields are the most restrictive of these three, vector spaces over a finite

field will be the assumed environments for codes in this thesis. The set of field

elements corresponds to the input and output signalling alphabets of the channel.

It is a famous result of Galois Theory that it is only possible to construct fields

of order $p^a$, where p is a prime and $a \in \mathbb{Z}^+$. To determine how addition and

multiplication are performed, polynomials are used as the field elements. For a field

of order $p^a$, consider all possible polynomials in indeterminate D having the form
$$f(D) = \sum_{i=0}^{a-1} f_i D^i, \qquad (2.83)$$
where each coefficient $f_i \in \mathbb{Z}_p$ and $\mathbb{Z}_p$ denotes the set of integers $\{0, 1, \ldots, p-1\}$ modulo p. This results in $p^a$ different polynomials which can be added and multiplied. However, this set is not closed under multiplication. The problem is solved by

choosing one particular monic irreducible polynomial of degree a, say f ∗(D). Thus,

it is impossible to write f ∗(D) as the product of two or more polynomials of degree

d, where 1 ≤ d ≤ a−1. This ensures each nonzero element has a unique multiplica-

tive inverse. Multiplication in the field is then performed modulo f ∗(D), so that a

product has degree at most a−1, and thus the set is closed under multiplication.


Table 2.1: Addition and multiplication table of the Galois field GF(2).

⊕ | 0 1        · | 0 1
--+-----      --+-----
0 | 0 1        0 | 0 0
1 | 1 0        1 | 0 1

Fields for which a ≥ 2 are sometimes called extension fields, since they in essence

extend the concept of the ground field GF (p). Examining the case a = 1 in detail, by

(2.83), the field elements are simply the integers {0, 1, . . . , p−1}. As the irreducible

polynomial $f^*(D)$ must be monic and of degree one,
$$f^*(D) = D. \qquad (2.84)$$

In other words, addition and multiplication are calculated using modulo p arithmetic.

For example, the addition and multiplication operations for the Galois field GF (2)

using modulo 2 arithmetic are shown in Table 2.1. For more information on finite

fields, the interested reader is directed to [45] or [46].
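For a ground field GF(p), the two operations reduce to integer arithmetic modulo p, so the tables of Table 2.1 can be generated mechanically (a minimal illustrative Python sketch, not from the original text):

```python
def gf_tables(p):
    """Addition and multiplication tables of the ground field GF(p), cf. Table 2.1."""
    add = [[(a + b) % p for b in range(p)] for a in range(p)]
    mul = [[(a * b) % p for b in range(p)] for a in range(p)]
    return add, mul

add2, mul2 = gf_tables(2)
print(add2)  # [[0, 1], [1, 0]]  -- the XOR table
print(mul2)  # [[0, 0], [0, 1]]  -- the AND table
```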

Convolutional codes

A convolutional encoder can be modelled as a deterministic finite state machine.

The different states correspond to the contents of a series of memory registers. The

term “convolutional” is used because the input symbols are convolved with the

impulse responses of the machine in order to produce the output symbols. The

complete structure is referred to as an (n, k,m) encoder, where n is the number

of output symbols, k is the number of input symbols, and m is the number of

memory registers. The definition of the encoder for the field GF (p) also requires

stipulation of a set of n p-ary generator polynomials g1(D), g2(D), . . . , gn(D), each

in indeterminate D and of degree k ·m.

According to [47], the encoder consists of a horizontal array of m groups of k

memory registers. Initially, the rightmost k(m − 1) registers are filled with zeroes.

These registers define the state of the encoder. Each register contains one of p symbols, resulting in p^{k(m−1)} possible states. Information symbols are introduced from

the left in batches of size k. The modulo p addition circuitry defined by generator

polynomials g1(D), g2(D), . . . , gn(D) is then used to output encoded symbols u1 to

un. At each time instant, the contents of all registers are transferred k registers

to the right. After the final symbols are flushed out of the encoder by inputting

a tail of zeroes, the collection of output symbols at the righthand end of the en-

coder forms the encoded sequence. Convolutional codes are able to operate with a

stream of information symbols of almost any length, and the asymptotic rate tends to k/n as more and more symbols are encoded. A diagram of a (2, 1, 3) convolutional encoder with operations in GF(2) and binary inputs is given in Fig. 2.12.

[Figure 2.12: A (2,1,3) convolutional encoder constructed with generator polynomials g_1(D) = D² + D + 1 and g_2(D) = D² + 1.]

More information

regarding convolutional codes can be found in [48].
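A short sketch makes the encoding operation concrete. The Python fragment below mimics the (2,1,3) encoder of Fig. 2.12 under the assumptions that the flushing tail consists of m−1 = 2 zero bits and that the newest input bit occupies the leftmost register; both are plausible readings of the description above rather than details fixed by the text.

```python
# A minimal sketch of the (2,1,3) binary convolutional encoder of Fig. 2.12,
# with generator polynomials g1(D) = D^2 + D + 1 and g2(D) = D^2 + 1.
def conv_encode(bits, taps1=(1, 1, 1), taps2=(1, 0, 1)):
    """Encode a bit list; taps give the coefficients of g1 and g2."""
    state = [0, 0, 0]                  # register contents, newest bit first
    out = []
    for b in bits + [0, 0]:            # assumed tail of m - 1 zeroes
        state = [b] + state[:-1]       # shift the register contents
        u1 = sum(t * s for t, s in zip(taps1, state)) % 2
        u2 = sum(t * s for t, s in zip(taps2, state)) % 2
        out += [u1, u2]
    return out

# Rate tends to k/n = 1/2 as the input grows; 4 input bits -> 12 output bits.
print(conv_encode([1, 0, 1, 1]))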

Block codes

By contrast, block codes encode words of a particular length individually. There is

a fixed number k of information symbols per word, and additional parity symbols

are produced in the encoding process, resulting in codewords of a fixed length n.

This is referred to as an (n, k) block code C, where the code rate R of such a block code is given by

R = k/n. (2.85)

Generator matrix: Suppose that C is a set of codewords over the Galois field

GF (p). If C ≠ span(C), where span(C) denotes the linear span of a set of

vectors, then C is a non-linear block code. Otherwise, an (n, k) block code C

is linear and is a k-dimensional subspace of the finite vector space [GF (p)]n

spanned by a basis of k linearly independent p-ary vectors of length n. Let

these vectors be denoted gi, 1 ≤ i ≤ k, and form the k × n generator matrix

G = \begin{bmatrix} g_1 \\ g_2 \\ \vdots \\ g_k \end{bmatrix}. (2.86)

Thus, G can be viewed as a transformation from the vector space [GF(p)]^k to the subspace C of [GF(p)]^n. A row vector i of k information symbols is


transformed into a codeword u of length n using the equation

u = i · G. (2.87)

For a k× (n−k) matrix A, a standard generator matrix G for an (n, k) linear

block code C satisfies the form

G = [I_k | A]. (2.88)

The ith row of the identity matrix Ik of order k is also known as the ith

standard basis vector ei, 1 ≤ i ≤ k. If the columns of a generator matrix

G contain all k transposes of the standard basis vectors of order k, but not

necessarily consecutively or in order, then G is systematic. Linear block codes

with a systematic generator matrix are advantageous because in the decoding

process, the information symbols are simply read off in the same order as the

k columns of Ik appear in G.

Denote the addition operation for the n-dimensional vector space over GF (p)

by ⊕. Since the rows of G are a basis for the vector space C, performing

elementary row operations of the form

f_1 g_i ⊕ f_2 g_j → g_i, (2.89)

where f1, f2 ∈ GF (p) and gi and gj are row vectors of G, can give another

generator matrix for the same code. In most cases, performing these elemen-

tary row operations can deliver a standard generator matrix for C. Even if it is not possible to obtain a standard generator matrix, there must still exist a set of k columns in a generator matrix which are linearly independent, since its rows form a basis for a k-dimensional vector space. It is always possible to perform ele-

mentary row operations so that the k standard basis vectors appear in those

columns. Then, permuting the order of the columns of the matrix, it can be

transformed into a standard generator matrix for a code which is equivalent

to the original code. Therefore, the following result is true.

Theorem 2.3.1. Every linear block code either has a standard generator ma-

trix which can be found by performing elementary row operations, or it is

equivalent to another code of the same size which does possess a standard gen-

erator matrix.

Dual code: The orthogonal complement OC of a codeword u ∈ [GF(p)]^n is the set


of all vectors u^⊥ ∈ [GF(p)]^n which have an inner product of 0 with u. That is,

OC(u) = {u^⊥ ∈ [GF(p)]^n | ⟨u, u^⊥⟩ = 0}, (2.90)

where for u = [u_1, u_2, . . . , u_n] and u^⊥ = [u^⊥_1, u^⊥_2, . . . , u^⊥_n], the inner product is defined as

⟨u, u^⊥⟩ = \sum_{i=1}^{n} u_i u^⊥_i (mod p). (2.91)

The dual C⊥ of a code C is defined as the set of all vectors which are in the

orthogonal complement of every element of C. In other words,

C^⊥ = {u^⊥ ∈ [GF(p)]^n | u^⊥ ∈ OC(u), ∀u ∈ C}. (2.92)

It is a simple task to show that C⊥ is also a linear block code over GF (p),

of dimension n−k. Any potential generator matrix H for the dual code C⊥

must satisfy certain conditions. Namely, it must be an (n−k) × n matrix of

full rank, since its rows must form a basis for the dual code. Also, to ensure

the rows of H are in OC(C), a potential matrix H must satisfy the constraint

G H^T = 0, (2.93)

where 0 is the k× (n−k) zero matrix. If G is a standard generator matrix for

C, a generator matrix for C⊥ is easily found by consideration of (2.93) and

the independence of the standard basis vectors.

Theorem 2.3.2. If a linear block code C has a standard generator matrix as

defined by (2.88), then H = [−A^T | I_{n−k}] is a generator matrix for C^⊥.
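Theorem 2.3.2 is easy to check numerically. The sketch below, using numpy, builds G = [I_k | A] and H = [−A^T | I_{n−k}] for the parity part A of the (4, 2) code that appears in Example 2.1 later in this chapter, and verifies G H^T = 0 as required by (2.93).

```python
import numpy as np

# A minimal numerical check of Theorem 2.3.2 over GF(2).
A = np.array([[1, 0],                 # k x (n-k) parity part, k = 2, n = 4
              [1, 1]])
k, r = A.shape                        # r = n - k
G = np.hstack([np.eye(k, dtype=int), A])           # G = [I_k | A]
H = np.hstack([(-A.T) % 2, np.eye(r, dtype=int)])  # H = [-A^T | I_{n-k}]

print((G @ H.T) % 2)      # the k x (n-k) zero matrix, confirming (2.93)

# Encoding per (2.87): every information word i maps to codeword u = i G.
for i0 in range(2):
    for i1 in range(2):
        i_vec = np.array([i0, i1])
        print(i_vec, '->', (i_vec @ G) % 2)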

Hamming distance: The Hamming distance d(u_1, u_2) between codewords u_1 and u_2 of a code C is defined as the number of positions in which u_1 and u_2 differ. However, the Hamming distance d(C) of a code C is defined as the minimum Hamming distance over all distinct pairs of codewords of C. That is,

d(C) = min{d(u_1, u_2) | u_1, u_2 ∈ C, u_1 ≠ u_2}. (2.94)

Consider a code with Hamming distance d where hyperspheres of radius ⌊(d−1)/2⌋ centred at each codeword partition the vector space [GF(p)]^n such that all p^n vectors lie in exactly one hypersphere. Such a code is called perfect. Hamming


codes are one of the few examples of perfect linear block codes. Furthermore,

they can be easily described in terms of their parity check matrices.
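Since a linear code is closed under subtraction, d(C) also equals the minimum weight over the nonzero codewords, which permits a direct brute-force check. The following sketch computes d(C) for the (4, 2) code of Example 2.1 later in this chapter; it enumerates all p^k codewords and so is feasible only for small codes.

```python
from itertools import product

# A minimal brute-force computation of d(C) from a generator matrix.
G = [[1, 0, 1, 0], [0, 1, 1, 1]]
p, k, n = 2, len(G), len(G[0])

codewords = set()
for info in product(range(p), repeat=k):
    # Encode the information word by u = i G over GF(p).
    u = tuple(sum(info[i] * G[i][j] for i in range(k)) % p for j in range(n))
    codewords.add(u)

# For a linear code, d(C) is the minimum weight over nonzero codewords.
d = min(sum(c != 0 for c in u) for u in codewords if any(u))
print('d(C) =', d)        # -> 2 for this code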

2.3.2 Decoding

Once a word has been received after transmission through the channel, the objective

is to decode that word and reconstruct the information symbols. The following de-

coding schemes can be performed with all types of linear block codes. In particular,

syndrome decoding, sequence estimation decoding, and symbol estimation decoding

are considered. In addition, the concept of a syndrome trellis and how it can be

used for APP decoding is described.

Syndrome decoding

The method of syndrome decoding in conjunction with a decoding array takes ad-

vantage of properties of the parity check matrix to decode a received word. It is a

relatively simple scheme which aims to find the nearest codeword to the received

vector. Assume C is an (n, k) linear block code over GF (p) with generator matrix

G and parity check matrix H. Firstly, note that (2.93) is in essence a system of k

linear equations having the form

g_i H^T = 0 (2.95)

for each row vector g_i of G, i ∈ {1, 2, . . . , k}, and where 0 is a row vector of n−k zeroes. Furthermore,

f_i g_i H^T ⊕ f_j g_j H^T = 0, (2.96)

where f_i, f_j ∈ GF(p) and i, j ∈ {1, 2, . . . , k}. In other words,

u H^T = 0, (2.97)

where u is any codeword in C. On the other hand, the received word v may or may

not be a codeword and may therefore be expressed as

v = u ⊕ d, (2.98)

where u ∈ C and d is the displacement of v from u. It then follows that

v H^T = (u ⊕ d) H^T = u H^T ⊕ d H^T = 0 ⊕ d H^T = d H^T = s, (2.99)


where s is a p-ary vector of length n−k called the syndrome of the received word v.

For each of the p^{n−k} possible values of s, there are p^k received words which result in that syndrome s. This partitions the n-dimensional vector space [GF(p)]^n into cosets V_t, t ∈ {0, 1, . . . , p^{n−k}−1}, where the code C is V_0 and each of the other cosets

consists of words with the same syndrome. In many cases, particularly if the channel

is memoryless, the displacement d for each coset is chosen as a vector with minimal

weight in that coset. These particular displacements are known as the coset leaders

d. The one-to-one correspondence between coset leaders and syndromes leads to

the following decoding procedure.

Procedure 2.1. Syndrome decoding of an (n, k) linear block code C over GF (p)

using the coset leader - syndrome correspondence.

Step 1. Construct a table with two columns, listing all p^{n−k} syndromes in one column and a minimum-weight word of length n from the coset corresponding to each syndrome in the other column.

Step 2. For a received word v, calculate its syndrome s = v H^T.

Step 3. Obtain the estimate d̂ of the displacement d as the coset leader corresponding to the syndrome s from Step 2.

Step 4. By (2.98), decode û = v ⊖ d̂ as the estimate of the codeword u. Here, ⊖ denotes element-by-element subtraction modulo p.
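A minimal Python sketch of Procedure 2.1 is given below for the binary (4, 2) code of Example 2.1 later in this chapter; the table of Step 1 is built by exhaustively scanning all 2^n words, which is practical only for short codes.

```python
import numpy as np
from itertools import product

# A minimal sketch of Procedure 2.1 over GF(2).
H = np.array([[1, 1, 1, 0],
              [0, 1, 0, 1]])
r, n = H.shape                                   # r = n - k

# Step 1: for every word of length n, keep the lightest one per syndrome.
leader = {}
for w in product(range(2), repeat=n):
    s = tuple((np.array(w) @ H.T) % 2)
    if s not in leader or sum(w) < sum(leader[s]):
        leader[s] = w

def decode(v):
    s = tuple((np.array(v) @ H.T) % 2)           # Step 2: syndrome of v
    d_hat = np.array(leader[s])                  # Step 3: coset leader
    return (np.array(v) - d_hat) % 2             # Step 4: u_hat = v - d_hat

print(decode([1, 0, 1, 1]))   # one bit flipped in codeword [1, 0, 1, 0]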

Procedure 2.1 is capable of correcting only the error patterns given by the set

of coset leaders. However, there are several options for how coset leaders may be

chosen, such as the following:

• Choose the coset leader d at random,

• Choose the coset leader d as a word with minimal weight in that coset,

• Choose the coset leader d as the first vector when the set of valid coset leaders

is ordered lexicographically, either left-to-right or right-to-left, or

• Choose the coset leader d in that coset as the most likely error pattern for a

given channel.

Decoding based on Procedure 2.1 may be either correct or erroneous. An erroneous decoding occurs if the coset leader d̂ obtained by Procedure 2.1 is not

the actual displacement imposed by the transmission channel. In general, decoding


of linear block codes using other algebraic approaches may also result in the detection

of an error pattern that cannot be corrected. In this case, the received word may

be discarded or its retransmission may be requested if a feedback channel from the

receiver to the transmitter is available.

Sequence estimation decoding

In the context of sequence estimation decoding, an APP is the conditional probabil-

ity P (u|v) that a sequence or word u was transmitted, given the sequence or word

v was received. In addition, a decoder may or may not use information about the

reliability of the decisions it makes.

Suppose a codeword u ∈ C was transmitted over the channel, and a word v was

received. In performing the decoding with respect to sequences that form a word of

length n, the aim is to maximise the probability that the estimate û is equal to the transmitted word u, given the received word v. This objective may be stated as

û = arg max_{u∈C} {P(u|v)}. (2.100)

In the unlikely event that two codewords are equally most likely, one is chosen

at random. Under the formulation in (2.100), only correct or incorrect decoding can

result. Using Bayes’ rule, (2.100) can be rewritten as

û = arg max_{u∈C} \left\{ \frac{P(v|u)P(u)}{P(v)} \right\}. (2.101)

As the maximisation operation is independent of the probability P (v) of a given

received word v, the calculation can be expressed as the maximum a posteriori

probability (MAP) sequence estimation equation of

û = arg max_{u∈C} {P(v|u)P(u)}, (2.102)

where P (u) is also known as the a priori probability of the word u. Examples of

algorithms developed to solve this equation are the Viterbi algorithm [49, 50, 51],

and other related sequential decoding algorithms such as [52].

Further simplification to the MAP criterion is obtained when the a priori prob-

ability is equal for all codewords. The related decoding schemes are called ML

algorithms. In this case, (2.102) simplifies to

û = arg max_{u∈C} {P(v|u)}. (2.103)


Symbol estimation decoding

The other strategy for decoding is to estimate one symbol at a time, rather than

a complete word. Assuming an (n, k) linear block code C in standard form, an

information symbol can be decoded for each position i ∈ {1, 2, . . . , k}. In symbol-

by-symbol MAP decoding, the aim is to find the a posteriori probabilities P(u_i|v) of each transmitted symbol u_i for each position i given the received vector v. On the basis of these APPs, the most likely symbol is selected as the estimate û_i for the transmitted symbol u_i at time instant i.

A frequently used approach to perform symbol-by-symbol MAP decoding for

binary codes is based on log likelihood algebra, which has been extended in [17] to

non-binary codes using log likelihood ratio (LLR) vectors. However, the complexity

of performing the exact calculations in the log domain is high. For that reason, algorithms using approximations which could be determined efficiently were developed. For example, the Max-Log-MAP algorithm [53] could be used. This was faster than the standard LLR MAP algorithm, but its approximations were too coarse, significantly degrading performance. As a compromise, the Log-

MAP algorithm [54] was suggested. This method involves an alternative exact

calculation, but stores some commonly used values in a lookup table. Thus it is more

efficient than LLR MAP. Perhaps the most famous of the symbolwise MAP decoding algorithms, however, is the BCJR algorithm [8], which has regained significant

attention with the advent of turbo codes and iterative decoding.

Alternatively, under an ML scheme, a uniform a priori probability distribution

is assumed. The estimate for the ith symbol is the field element which maximises

the APP, so that by Bayes’ rule,

û_i = arg max_{g∈GF(p)} {P(u_i = g | v)}
    = arg max_{g∈GF(p)} { \sum_{u∈C, u_i=g} P(u|v) }
    = arg max_{g∈GF(p)} { \sum_{u∈C, u_i=g} P(v|u)P(u) }
    = arg max_{g∈GF(p)} { \sum_{u∈C, u_i=g} P(v|u) }. (2.104)

The values for conditional probabilities P (v|u) can be obtained from the channel

44

Page 69: On A Posteriori Probability Decoding of Linear Block Codes ... · On A Posteriori Probability Decoding of Linear Block Codes over Discrete Channels Wayne Bradley Griffiths BCM(Hons),

2.3. ERROR CONTROL CODING

model being employed. For example, if the channel is memoryless, then

P(v|u) = \prod_{i=1}^{n} P(v_i|u_i). (2.105)

Thus, by substituting (2.105) into (2.104), the symbolwise ML estimates for a mem-

oryless channel can be calculated as

û_i = arg max_{g∈GF(p)} { \sum_{u∈C, u_i=g} \prod_{j=1}^{n} P(v_j|u_j) }. (2.106)
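For a small code, (2.106) can be evaluated by direct enumeration. The sketch below does this on a BSC with an assumed crossover probability eps = 0.1 for the (4, 2) code of Example 2.1 later in this chapter; the received word is likewise an assumed example.

```python
from itertools import product

# A minimal brute-force sketch of the symbolwise ML rule (2.106) on a BSC.
eps = 0.1
code = [(0, 0, 0, 0), (0, 1, 1, 1), (1, 0, 1, 0), (1, 1, 0, 1)]

def symbol_ml(v, i):
    """Estimate u_i from received word v by maximising (2.106)."""
    score = {0: 0.0, 1: 0.0}
    for u in code:
        likelihood = 1.0
        for uj, vj in zip(u, v):
            likelihood *= (1 - eps) if uj == vj else eps
        score[u[i]] += likelihood      # accumulate per symbol value g
    return max(score, key=score.get)

v = (1, 0, 1, 1)
print([symbol_ml(v, i) for i in range(4)])   # -> [1, 0, 1, 0]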

However, if the channel model is not memoryless then (2.105) is not applicable. For

the models presented in Section 2.2.2, both the hidden sequence of state transitions

and the symbol error processes within each state must be accounted for. This is

handled precisely by the matrix probabilities D(v_j|u_j) ∈ {D_0, D_ǫ} in (2.53). For initial state distribution σ_0 and all-ones column vector e as defined in (2.19), the symbolwise ML estimates can be expressed as

û_i = arg max_{g∈GF(p)} { σ_0 \Big[ \sum_{u∈C, u_i=g} \prod_{j=1}^{n} D(v_j|u_j) \Big] e }. (2.107)
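To make the matrix form of (2.107) concrete, the following minimal sketch evaluates σ_0 [Σ Π D(v_j|u_j)] e by brute force for the codewords of the (4, 2) code used in Example 2.1 below. The two-state matrices D0 and De are invented stand-ins for the D_0 and D_ǫ of (2.53), chosen only so that D0 + De is a valid state transition matrix; they are assumptions for illustration, not values from the thesis.

```python
import numpy as np

# A minimal brute-force sketch of (2.107) with hypothetical two-state
# matrix probabilities; D0 + De is stochastic, as a transition matrix
# decomposed by error/no-error reception must be.
sigma0 = np.array([0.8, 0.2])     # assumed initial state distribution
D0 = np.array([[0.85, 0.05],      # assumed joint state-transition and
               [0.10, 0.60]])     # error-free reception probabilities
De = np.array([[0.08, 0.02],      # assumed joint state-transition and
               [0.10, 0.20]])     # erroneous reception probabilities

code = [(0, 0, 0, 0), (0, 1, 1, 1), (1, 0, 1, 0), (1, 1, 0, 1)]
v = (1, 0, 1, 1)                  # assumed received word

def score(u):
    """sigma0 . prod_j D(v_j|u_j) . e for one candidate codeword u."""
    P = np.eye(2)
    for uj, vj in zip(u, v):
        P = P @ (D0 if uj == vj else De)
    return sigma0 @ P @ np.ones(2)

# Symbolwise ML estimate (2.107) at each position i:
for i in range(4):
    best = max(range(2), key=lambda g: sum(score(u) for u in code if u[i] == g))
    print('estimate at position', i + 1, ':', best)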

2.3.3 Trellis representations of linear block codes

According to [49], a trellis T = (N ,B) is a directed graph with node set N and

branch set B where each element of B possesses a label. Branches in a trellis of

length n may only join a node at depth i ∈ {0, 1, . . . , n−1} to a node at depth

i+1. Let E and F be the sets of nodes at depths zero and n, respectively. A path

of length l in a trellis T is defined for b_1, b_2, . . . , b_l ∈ B as a sequence of branches (b_1, b_2, . . . , b_l) which is traversable in T and extends from depth η to depth η+l for some η ∈ ℕ. An example of a trellis defined by the sets

N = {1, 2, . . . , 8}, B = {b1, b2, . . . , b7}, E = {1, 2}, F = {8} (2.108)

is shown in Fig. 2.13. This trellis contains the 13 paths (b1), (b2), . . . , (b7), (b1, b2),

(b3, b6), (b4, b6), (b6, b7), (b3, b6, b7) and (b4, b6, b7).

A code trellis for an (n, k) linear block code C over GF (p) is a trellis containing

a path of length n corresponding to each codeword of C. This definition of a code

trellis is used in the construction of a syndrome trellis for C. However, in order for


[Figure 2.13: A basic trellis with eight nodes and seven branches.]

trellis decoding algorithms to be efficient, it may be beneficial in some applications

to construct a trellis of minimal size, usually requiring some branches and nodes to

be removed. Trellises can also be used to represent the symbolwise ML estimates in

(2.106) and (2.107). However in this case, a different set of branches is removed as

will be illustrated below.

Syndrome trellis

Consider an (n, k) linear block code C over GF (p) and suppose C has parity check

matrix H defined by its n columns of length n−k:

H = [h_1, h_2, . . . , h_n]. (2.109)

Also, let the syndrome s of a given received word v = [v_1, v_2, . . . , v_n] be calculated recursively with the reception of the symbols v_i ∈ GF(p) over discrete time as

s_{i+1} = s_i + v_{i+1} h^T_{i+1}. (2.110)

Each of the p^{n−k} levels of the syndrome trellis is associated with a different syndrome and, by the arguments presented in Section 2.3.2, a corresponding coset V_t, t ∈ {0, 1, . . . , p^{n−k} − 1}, where V_0 = C. The actual value of the subscript t of a coset V_t may be calculated as the decimal representation of the syndrome s = [s_{n−k−1}, . . . , s_1, s_0] as

t = dec(s) = \sum_{j=0}^{n−k−1} s_j p^j. (2.111)


The set of nodes corresponding to partial syndromes s_i at depth i lying on a path from E is denoted N_i. To begin, set

E = N_0 = {[0, 0, . . . , 0]} (2.112)

as only codewords will result in an all-zero syndrome. A branch joins a node in N_i at the level corresponding to the partial syndrome s_i at depth i to a node at depth i+1 at the level corresponding to the partial syndrome s_{i+1} = s_i ⊕ v_{i+1} h^T_{i+1}, ∀ v_{i+1} ∈ GF(p). Set

N_{i+1} = {s_i ⊕ v_{i+1} h^T_{i+1}}, (2.113)

where all partial syndromes s_i represented by nodes in N_i are considered. Repeating this process for i ∈ {0, 1, . . . , n − 1}, the syndrome trellis of the code can be constructed. Clearly, the syndrome trellis provides a systematic way of decomposing the n-dimensional vector space [GF(p)]^n into the different cosets.

The construction of a code trellis that is minimal in size with respect to compris-

ing only paths of codewords through the trellis can be achieved in at least two ways.

One of these is through the elimination of illegal branches from the syndrome trellis constructed with the parity check matrix. It is shown in [49] that the removal of these

illegal paths ensures that the resulting trellis is minimal in terms of the size |N| of the node set N, that is, the total number of nodes in the trellis. Another method of minimising the

code trellis with respect to the overall number of nodes uses the generator matrix in

a special form, however this method will not be considered in detail here. Basically,

the trellis is built from the Shannon product [55, 56] of trellises from each row of

the generator matrix, but it requires the LR algorithm [49] to format the generator

matrix in order to produce the minimal trellis. The method involving the parity

check matrix is however simpler and more practical.

Example 2.1. Consider the (4, 2) linear block code C over GF (2) defined by the

parity check matrix

H = [h_1, h_2, h_3, h_4] = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix}. (2.114)

The four codewords of this binary block code are given as the rows of the set

C = \begin{Bmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \\ 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{Bmatrix}. (2.115)


[Figure 2.14: Trellis representations of the binary (4,2) linear block code C: (a) standard syndrome trellis, (b) minimal trellis (Dashed: s_{i+1} = s_i, Solid: s_{i+1} = s_i ⊕ h^T_{i+1}). In (a), the four terminating levels correspond to the cosets V_0 = C = {0000, 0111, 1010, 1101}, V_1 = {0001, 0110, 1011, 1100}, V_2 = {0010, 0101, 1000, 1111} and V_3 = {0100, 0011, 1110, 1001}.]


The structure of the syndrome trellis for this code is shown in Fig. 2.14(a) along

with the corresponding syndromes on the four nodes from the set E of originating

nodes, and cosets on the four nodes from the set F of terminating nodes.

The minimal trellis for the considered (4, 2) linear block code C defined by the

parity check matrix H is presented in Fig. 2.14(b). In this case, only the paths

through the trellis taken by codewords are considered and the sets E and F contain

one node each corresponding to the all-zeroes syndrome:

E = N_0 = {[0, 0]}, (2.116)
F = N_4 = {[0, 0]}. (2.117)

Trellises for APP decoding

A trellis used to perform symbol-by-symbol APP decoding provides a direct graph-

ical representation of the summands of the decision rule given in (2.106) or (2.107).

Its construction is similar to the full syndrome trellis, see Fig. 2.14(a) for example.

However, there are two notable changes to be imposed on the construction of a trellis

for APP decoding. Firstly, from the summation indices, the symbol ui in the ith

position of interest is required to have a particular value. Thus, in order to make

comparisons using the maximisation operation, a total of p different trellises are

constructed. The ith section of the trellis for u_i = g ∈ GF(p), lying between depths i−1 and i, only allows the at most p^{n−k} branches from the level corresponding to a partial syndrome s_{i−1} ∈ N_{i−1} to the level corresponding to the partial syndrome s_i = s_{i−1} ⊕ g h^T_i ∈ N_i. This section will be much sparser than the other n−1

sections due to this constraint. Secondly, trellis paths may start at a level corresponding to any of the p^{n−k} syndrome states in order to support a description of

the trellis structure by trellis matrices.

Example 2.2. To illustrate the features involved in the structure of an APP decoding trellis, the two trellises for estimating the second symbol u_2 in a codeword u = [u_1, u_2, u_3, u_4] of the (4, 2) linear block code C of Example 2.1 are displayed in Fig. 2.15. Accordingly, the trellis shown in Fig. 2.15(a) provides a systematic way of arranging those words in the 4-dimensional vector space [GF(2)]^4 into sets of words that result in the same syndrome and in addition have the symbol u_2 = 0 at position i = 2. On the other hand, the trellis shown in Fig. 2.15(b) also structures words in the 4-dimensional vector space [GF(2)]^4 into sets of words that result in the same syndrome but have symbol u_2 = 1 at position i = 2. In this example, the following


unions of subsets may be performed to produce the code and different cosets:

C = \begin{Bmatrix} 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 \end{Bmatrix} ∪ \begin{Bmatrix} 0 & 1 & 1 & 1 \\ 1 & 1 & 0 & 1 \end{Bmatrix}, (2.118)

V_1 = \begin{Bmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 1 & 1 \end{Bmatrix} ∪ \begin{Bmatrix} 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 \end{Bmatrix}, (2.119)

V_2 = \begin{Bmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{Bmatrix} ∪ \begin{Bmatrix} 0 & 1 & 0 & 1 \\ 1 & 1 & 1 & 1 \end{Bmatrix}, (2.120)

V_3 = \begin{Bmatrix} 0 & 0 & 1 & 1 \\ 1 & 0 & 0 & 1 \end{Bmatrix} ∪ \begin{Bmatrix} 0 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \end{Bmatrix}. (2.121)


[Figure 2.15: Trellis representations of the binary (4,2) linear block code C suitable for computing APPs: (a) P(u_2 = 0|v), (b) P(u_2 = 1|v) (Dashed: s_{i+1} = s_i, Solid: s_{i+1} = s_i ⊕ h^T_{i+1}).]


Chapter 3

APP Decoding on Discrete Channels without Memory

The BSC is one of the simplest channel models developed to describe the error

characteristics of a transmission channel assuming errors occur independently. It

is specified by a single parameter referred to as the crossover probability, which

quantifies the probability of error for every bit transmitted. The generalisation of

this concept to an alphabet of size p is given by the p-ary DMC. In order to provide

a comprehensive assembly of APP decoding algorithms for a variety of classes of

discrete channels, these two memoryless channels form the starting point. These

idealistic discrete channel models can also be deployed in the formulation of models

for channels with memory as outlined in Chapter 2. As such, the fundamental

concepts behind the APP decoding strategies in this chapter can subsequently be

adapted to be employed for APP decoding on discrete channels with memory.

One approach for APP decoding is to work directly on the code trellis. Some of

the more prominent trellis-based APP decoding approaches include the BCJR algo-

rithm [8] and the Viterbi algorithm [51]. Wolf [57] improved on Viterbi’s approach

by minimising the trellis. Johansson and Zigangirov [9] made a significant contribu-

tion to APP decoding over memoryless channels by reducing the BCJR algorithm to

one requiring only a single sweep of the trellis. Other algorithms are predominantly

algebra-based. One such example is an algorithm involving generating functions and

Fourier transforms given in [14]. Clearly, the aim is to derive algorithms which are

as efficient to execute and as simple to implement as possible.

The methods described in this chapter, however, use representation theory and

linear algebra as suggested in [27] to perform the decoding. That is, the information

about the algebraic characteristics of the linear block code contained in the trellis

is mapped into a matrix group by a homomorphism. Two sorts of homomorphism


are covered here. One is a direct mapping into a cyclic subgroup of the group of

permutation matrices, whilst the other uses a diagonalisation technique to map the

necessary information into diagonal matrices. The calculations are then performed

using matrix algebra, and the required APPs are extracted from the resulting ma-

trix representations. Since the trellises need to be weighted according to the error

characteristics of the channel, the same applies to the matrices used in the calcu-

lations. For memoryless channels, this is achieved by scalar multiplication, so that

weighting does not increase the size of the matrices used. With this strategy, a de-

coding decision can be made efficiently using either of the homomorphisms, although

the diagonal matrix approach clearly has merit due to the simplicity of performing

arithmetic with such matrices.

This chapter is structured as follows. Section 3.1 formulates the APP decoding

problem which will be solved in this chapter. Then, Section 3.2 demonstrates how

a matrix representation of a code trellis in the original domain can be constructed

for both binary and non-binary linear block codes. This leads to elementary trellis

matrices, trellis section matrices, and the trellis matrix of the full code trellis. On

this basis, the stochastic characteristics of the discrete channel without memory are

included in terms of the crossover probability. The derivations result in weighted

trellis section matrices and the related weighted trellis matrix of the full weighted

trellis. Eventually, the APP decoding procedures for BSCs and DMCs are formu-

lated in the original domain. Having established matrix representations of the code

trellis and the weighted code trellis, the tools of linear algebra with respect to eigen-

values and eigenvectors of matrices are used in Section 3.3 to perform a similarity

transformation of the matrices in the original domain into diagonal matrices in the

spectral domain. It should be mentioned that the term “spectral domain” has been

chosen because the spectrum of a transformation on a finite dimensional vector space

is given by the set of all its eigenvalues. Such eigenvalues appear in the matrix rep-

resentations of the spectral domain, namely, elementary spectral matrices, spectral

section matrices, and spectral matrices corresponding to a full code trellis, as well

as the different weighted spectral matrices. The ultimate outcome in the algorith-

mically much simpler spectral domain are conditional spectral coefficients. These

coefficients can be related to the a posteriori probabilities needed for deriving an

APP decoding decision by a straightforward inverse transformation. In Section 3.4,

instructive examples are provided to demonstrate the algorithmic components in-

volved in the original domain and the spectral domain. In particular, it is shown

how the dual code and the elements of the received word control the APP decoding

in the spectral domain. Numerical examples are contained in Section 3.5, describing


the performance of APP decoding for several selected codes on DMCs using com-

puter simulations. It should be noted that these performance investigations are used

to indicate options for the applications of the presented APP decoding approach,

but they are not considered to provide an exhaustive investigation into good code

and channel combinations. Finally, the chapter is summarised in Section 3.6.

The principal contributions of this chapter are:

• A formulation of the a posteriori probabilities required for decoding in terms

of the trace of the weighted spectral matrix of the full weighted trellis.

• Instructive examples showing the algorithmic differences between the original

and spectral domain approaches to APP decoding for a memoryless channel.

• A selection of numerical examples obtained by computer simulations which

demonstrate the range of performance analysis options that are supported by

the proposed APP decoding on discrete channels without memory.

3.1 Problem Statement

Consider a systematic (n, k) linear block code C in standard form over GF (p).

Furthermore, it is assumed that the linear block code C is used on a DMC, which

is characterised by conditional probabilities P (vj|uj), j = 1, 2, . . . , n. In view of

(2.106) under the assumption of a uniform a priori probability distribution, the APP

decoding problem of finding an estimate û_i, i = 1, 2, . . . , k, of the ith transmitted symbol u_i of codeword u can then be formulated as

û_i = arg max_{g∈GF(p)} { \sum_{u∈C, u_i=g} \prod_{j=1}^{n} P(v_j|u_j) }. (3.1)

In the binary case, (3.1) may be rewritten as

û_i = \begin{cases} 0 & \text{if } \sum_{u∈C, u_i=0} \prod_{j=1}^{n} P(v_j|u_j) \geq \sum_{u∈C, u_i=1} \prod_{j=1}^{n} P(v_j|u_j), \\ 1 & \text{otherwise}, \end{cases} (3.2)

or in terms of difference probabilities as

û_i = \begin{cases} 0 & \text{if } \sum_{u∈C, u_i=0} \prod_{j=1}^{n} P(v_j|u_j) − \sum_{u∈C, u_i=1} \prod_{j=1}^{n} P(v_j|u_j) \geq 0, \\ 1 & \text{otherwise}. \end{cases} (3.3)


In both the binary and non-binary problem settings, a total of p^{k−1} products of conditional probabilities

P(v|u) = \prod_{j=1}^{n} P(v_j|u_j) (3.4)

have to be calculated and summed in an order such that the involved sequences of symbols u_j establish codewords u = [u_1, u_2, . . . , u_n] subject to the symbol u_i at the

position i of interest being a given value g ∈ GF (p). This ordering of conditional

probabilities can be performed and evaluated either in the original domain using

the code trellis and its associated matrix representation, or it may be transformed

into a spectral domain using the corresponding spectral matrix representation. The

analytical framework for both domains will be presented in the next two sections.

3.2 Original Domain Matrix Representations of Linear Block Code Trellises

A trellis representation of a linear block code provides a systematic means of decom-

posing an n-dimensional vector space into the actual code and its other cosets (see

Section 2.3.3). In order to develop an analytical framework for APP decoding, the

structure of a code trellis needs to be described by mathematical expressions. This

can be achieved by using the concepts of linear algebra and matrix representations.

In particular, trellis matrices can be derived which account for the algebraic proper-

ties of the code, while weighted trellis matrices include the stochastic properties of

the channel in the representation. Full details on the development of these matrix

representations can be found in [27], and the main concepts are summarised in this

chapter. As the mathematical framework described in the following sections relates

directly to the code trellis without the imposition of additional modifications, it is

referred to as being examined in the original domain.

3.2.1 Matrix representation for APP decoding on BSCs

Consider a binary (n, k) linear block code C defined by a parity check matrix

H = [h_1, h_2, . . . , h_n], (3.5)

where h_j, j ∈ {1, 2, . . . , n}, is a binary column vector of length n − k. Without

loss of generality, it is assumed that the block code C is given in standard form by


the related generator matrix G. Accordingly, k information bits are mapped onto

codewords of length n such that they appear as the first k elements of a codeword

prior to transmission over a BSC with crossover probability ǫ.

The key component in developing a mathematical framework that allows for

the construction of a code trellis is given by the partial syndrome, as will now be

explained. As mentioned in Section 2.3.3, the syndrome s of a word u is given by

s = u H^T (3.6)

and can be considered as a state of the trellis. Instead of computing the syndrome

s using (3.6), it is beneficial to produce the syndrome recursively such that the

so-called partial syndromes

s_j = s_{j−1} ⊕ u_j h^T_j , j = 1, 2, . . . , n, (3.7)

are generated step-by-step with the processing of the bits u_j. In this way, it is possible to determine the transitions from all possible partial syndromes or states s_{j−1} to the related subsequent partial syndromes or states s_j for u_j = 0 and u_j = 1.

Clearly, this reveals the structure of the jth section of the code trellis. An analytical

construction of the sections of a code trellis may then be derived using a matrix

representation for the partial syndrome computation of (3.7) as presented in [27].

To further break this problem down into smaller components, it is instructive to

first consider only the scalar components

s′ ≡ s+ t (mod 2) with t ≡ u · h (mod 2) (3.8)

of the partial syndrome calculation (3.7) and formulate a corresponding matrix

representation. The resulting matrices are referred to as elementary trellis matrices

and constitute the fundamental entities from which the trellis section matrices are

constructed. In this way it is possible to analytically describe the code trellis by

matrix representations and then combine this description with the characteristics

of the considered discrete channel without memory to produce an APP decoding

decision.

Elementary trellis matrices and trellis section matrices

In view of the above, firstly consider the elementary trellis matrices as the building

blocks for the formulation of trellis section matrices. For this purpose, it is noted

that a matrix representation of a group G_1 of order p is a homomorphism from G_1 into a group G_2 of p × p matrices. In the context of the considered problem setting


of deriving a matrix representation for the operation given in (3.8), G_1 is taken to be the additive group of GF(p) and G_2 is a group of transformations from a finite p-dimensional vector space V over the field ℂ of complex numbers to itself. Then, define the particular homomorphism for the original domain as

δ_orig : t ↦ M(t) = circ(0, . . . , 0, 1, 0, . . . , 0), (3.9)

where the circulant matrix circ(0, . . . , 0, 1, 0, . . . , 0) has t zeroes preceding the single

one entry and p−t−1 zeroes following that one entry. The first row of a circulant

matrix is the same as its arguments in order from left to right, and the successive

rows are cyclic shifts of this vector, one position to the right per row [58].

Noting that the argument t of the circulant matrix M(t) is given in the form of

t ≡ u · h (mod p), an elementary trellis matrix can be defined as

M_h(u) = M(u · h). (3.10)

As the Galois field GF (p) is closed under multiplication and the case of binary linear

block codes C is being considered here, there are in fact only two distinct elementary

trellis matrices:

M_h(u) = \begin{cases} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} & \text{for } h · u = 0, \\[6pt] \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} & \text{for } h · u = 1. \end{cases} (3.11)

A matrix representation for each column of a parity check matrix H can then be

constructed from the elementary trellis matrices defined in (3.11). It has been

demonstrated in [27] that a representation for the jth column h_j of H, under the assumption that the jth transmitted symbol was u_j, is given by the 2^{n−k} × 2^{n−k} trellis section matrix

M_{h_j}(u_j) = \bigotimes_{μ=1}^{n−k} M_{h_{n−k−μ,j}}(u_j), (3.12)

where the operator ⊗ denotes the Kronecker or tensor product. Given a matrix A

of size c×d and a matrix B of size l×m, then the related Kronecker product A⊗B

is a matrix of size cl × dm and may be written as

A ⊗ B = \begin{bmatrix} a_{1,1}B & a_{1,2}B & \cdots & a_{1,d}B \\ a_{2,1}B & a_{2,2}B & \cdots & a_{2,d}B \\ \vdots & \vdots & & \vdots \\ a_{c,1}B & a_{c,2}B & \cdots & a_{c,d}B \end{bmatrix}. (3.13)


Weighted trellis matrices

Having established an analytical expression for the trellis sections in terms of trellis

section matrices, it is now possible to include the characteristics of the discrete

channel into the description to support the APP decision rule given in (3.2). This

can be simply done by weighting the trellis section matrices by the conditional

probabilities

P(v_j|u_j) = \begin{cases} 1 − ǫ & \text{if } u_j = v_j, \\ ǫ & \text{if } u_j ≠ v_j, \end{cases} (3.14)

as defined in (2.5). The resulting weighted trellis section matrix U_{h_j}(u_j) for the jth section assuming that the symbol u_j was transmitted is then given as

U_{h_j}(u_j) = P(v_j|u_j) · M_{h_j}(u_j) = \begin{cases} (1 − ǫ) · M_{h_j}(u_j) & \text{if } u_j = v_j, \\ ǫ · M_{h_j}(u_j) & \text{if } u_j ≠ v_j, \end{cases} (3.15)

whilst the weighted trellis section matrix U_{h_j} for the jth trellis section irrespective of the bit transmitted at that position is given by

U_{h_j} = U_{h_j}(0) + U_{h_j}(1). (3.16)

In order to include all valid paths in the trellis that are required for deriving an

APP decision û_i for the ith transmitted bit u_i according to (3.2), it is necessary to calculate the matrix sum over the transitions caused by both u_j = 0 and u_j = 1 for all trellis sections except for the ith section. At the position of interest i, only one transition type, caused by either u_i = 0 or u_i = 1, is considered. A matrix representation of the entire weighted trellis for calculating the conditional probability P(u_i|v) that the ith transmitted bit was u_i ∈ {0, 1} given the received word v, can

therefore be formulated as

U_H(u_i) = \prod_{j=1}^{i−1} U_{h_j} · U_{h_i}(u_i) · \prod_{j=i+1}^{n} U_{h_j}. (3.17)

Determining the a posteriori probabilities for a BSC

There are two more conditions which must be met in order that only paths through

the weighted trellis corresponding to codewords are considered. The first of these

conditions is that according to (3.2), the summation of probabilities must only be

taken over paths which begin in the all-zero state. This condition can be accounted

for by pre-multiplying the weighted trellis matrix U_H(u_i) by a row vector τ_0 of


length 2^{n−k} such that only the first row of the weighted trellis matrix is selected.

The suitable row vector is given by

τ_0 = [1, 0, . . . , 0]. (3.18)

The suggested vector-matrix product resulting in a vector P(u_i|v) of APPs P_t(u_i|v), t = 0, 1, . . . , 2^{n−k} − 1, for the ith transmitted bit u_i given the received word v, is then obtained as

P(u_i|v) = [P_0(u_i|v), P_1(u_i|v), . . . , P_{2^{n−k}−1}(u_i|v)] = τ_0 U_H(u_i). (3.19)

Due to the operation of the considered type of code trellis as a device that

organises the elements of the n-dimensional vector space [GF(p)]^n into the code C = V_0 and the related cosets V_t, t = 1, 2, . . . , 2^{n−k}−1, (3.19) gives the corresponding conditional probabilities P_0(u_i|v) and P_t(u_i|v), t = 1, 2, . . . , 2^{n−k} − 1. However,

the second condition of (3.2) is that trellis paths must only correspond to valid

codewords. Therefore, only encodings that correspond to paths through the code

trellis which originate from the all-zero state and terminate at the all-zero state

have to be considered. As such, only the element P_0(u_i|v) of the probability vector P(u_i|v) is of interest for deriving the APP decision as will be formulated below.

APP decoding procedure for a BSC

The fundamental steps of APP decoding when using a binary (n, k) linear block

code C in standard form on a BSC can now be summarised as a formal procedure.

As this procedure is based directly on the weighted code trellis, the procedure is

referred to as being performed in the original domain.

Procedure 3.1. Given is a binary (n, k) linear block code C in standard form to

be used on a BSC with crossover probability ǫ. The linear block code C shall be

defined by a parity check matrix H. A codeword u is transmitted over the channel

and a word v is received. APP decoding in the original domain is comprised of the

following steps.

Step 1. ∀j ∈ {1, 2, . . . , n} and ∀u_j ∈ {0, 1}, compute the trellis section matrix M_{h_j}(u_j) for column h_j of parity check matrix H and jth transmitted symbol u_j using (3.11) and (3.12).

Step 2. ∀j ∈ {1, 2, . . . , n} and ∀u_j ∈ {0, 1}, compute the weighted trellis section matrix U_{h_j}(u_j) using (3.14) and (3.15).


Step 3. ∀i ∈ {1, 2, . . . , k} and ∀u_i ∈ {0, 1}, compute the weighted trellis matrix U_H(u_i) for the full weighted trellis using (3.17) with (3.15) and (3.16).

Step 4. ∀i ∈ {1, 2, . . . , k} and ∀u_i ∈ {0, 1}, calculate the a posteriori probability P_0(u_i|v) using (3.19).

Step 5. Derive an estimate û_i for the transmitted symbol u_i of codeword u at each position i ∈ {1, 2, . . . , k} using

û_i = \begin{cases} 0 & \text{if } P_0(u_i = 0|v) ≥ P_0(u_i = 1|v), \\ 1 & \text{if } P_0(u_i = 0|v) < P_0(u_i = 1|v). \end{cases} (3.20)
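The complete procedure can be traced end-to-end. The sketch below, which reuses section_matrix from the previous sketch, evaluates (3.15), (3.17) and (3.19) for the binary (4, 2) code of Example 2.1 under an assumed crossover probability ǫ = 0.1 and received word v = [1, 0, 1, 1]; the resulting estimates agree with a brute-force evaluation of (2.106).

```python
import numpy as np

# A minimal end-to-end sketch of Procedure 3.1 on a BSC.  Entry [0, 0]
# of the full weighted trellis matrix equals index 0 of tau0 U_H(u_i),
# i.e. the APP contribution of paths starting and ending in state zero.
H = np.array([[1, 1, 1, 0],
              [0, 1, 0, 1]])
eps, v = 0.1, [1, 0, 1, 1]        # assumed example values
r, n = H.shape                    # r = n - k

def U(j, u):
    """Weighted trellis section matrix (3.15) for section j, symbol u."""
    w = (1 - eps) if u == v[j] else eps
    return w * section_matrix(H[:, j], u)

def P0(i, ui):
    """P_0(u_i|v) from (3.17) and (3.19)."""
    M = np.eye(2 ** r)
    for j in range(n):
        M = M @ (U(j, ui) if j == i else (U(j, 0) + U(j, 1)))
    return M[0, 0]

u_hat = [0 if P0(i, 0) >= P0(i, 1) else 1 for i in range(2)]
print(u_hat)     # -> [1, 0], the information bits of codeword [1, 0, 1, 0]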

3.2.2 Matrix representation for APP decoding on DMCs

The same APP decoding approach in the original domain as presented above for a

binary linear block code C may be adapted for use with a linear block code C over

GF (p). Accordingly, the corresponding discrete channel without memory is given by

a non-binary DMC model. The following exposition will be based on the standard

model of a p-ary DMC model as defined in Fig. 2.3(a). It is noted that using the

alternative p-ary DMC model shown in Fig. 2.3(b) would change only the weighting

of the code trellis but not the actual APP decoding procedure. The major change

compared to the binary case is seen in a broader definition of the elementary trellis

matrices and the related trellis section matrices.

Elementary trellis matrices and trellis section matrices

The generalisation of an elementary trellis matrix M_h(u) for a binary linear block code to an (n, k) linear block code C over GF(p) is achieved by expanding the range of the regular representation δ_orig from the two circulant matrices of order 2 to the set of p circulant matrices of order p. Given the definition in (3.9), the set of elementary trellis matrices for the considered class of non-binary linear block codes comprises the p × p matrices

s′ ≡ s + u·h (mod p) ↦ M_h(u) = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & & \vdots \\ \vdots & & & \ddots & 0 \\ 0 & & & & 1 \\ 1 & 0 & 0 & \cdots & 0 \end{bmatrix}^{u·h}. (3.21)


As for the case of a binary linear block code, the elementary trellis matrices

are the building blocks for the construction of the trellis section matrices of a non-

binary linear block code. A trellis section matrix M_{h_j}(u_j), assuming symbol u_j was transmitted, captures the impact of column h_j of the parity check matrix H on the state transitions in the jth trellis section and is given by the Kronecker product

M_{h_j}(u_j) = \bigotimes_{μ=1}^{n−k} M_{h_{n−k−μ,j}}(u_j) (3.22)

of size p^{n−k} × p^{n−k}. Again, the arrangement of elements h ∈ GF(p) in a column h_j defines the order in which elementary trellis matrices M_h(u) appear in the construction of a trellis section matrix M_{h_j}(u) for a given transmitted symbol u ∈ GF(p).

Weighted trellis matrices

The stochastic characteristics of a p-ary DMC in terms of crossover probabilities can

now easily be combined with the trellis section matrices to generate weighted trellis

matrices. For this purpose, assume that the standard model of the p-ary DMC given

in Fig. 2.3(a) is used. It then follows that (3.15) applies, but now with the weights

defined as

P(v_j|u_j) = \begin{cases} 1 − (p−1)ǫ & \text{if } u_j = v_j, \\ ǫ & \text{if } u_j ≠ v_j. \end{cases} (3.23)

This results in the weighted trellis section matrices

U_{h_j}(u_j) = P(v_j|u_j) · M_{h_j}(u_j) = \begin{cases} [1 − (p−1)ǫ] · M_{h_j}(u_j) & \text{if } u_j = v_j, \\ ǫ · M_{h_j}(u_j) & \text{if } u_j ≠ v_j. \end{cases} (3.24)

Additionally, the weighted trellis section matrix U_{h_j} for the jth trellis section irrespective of the symbol transmitted at that position is given by

U_{h_j} = \sum_{u_j=0}^{p−1} U_{h_j}(u_j). (3.25)

It follows that the entire weighted trellis for calculating the conditional probability P(u_i|v) that the ith transmitted symbol was u_i ∈ GF(p) for a received word v, is given by

U_H(u_i) = \prod_{j=1}^{i−1} U_{h_j} · U_{h_i}(u_i) · \prod_{j=i+1}^{n} U_{h_j}. (3.26)


Determining the a posteriori probabilities for a DMC

Clearly, a vector P(u_i|v) of APPs P_t(u_i|v), t = 0, 1, . . . , p^{n−k} − 1, can be calculated using the same rationale as in the binary case as

P(u_i|v) = [P_0(u_i|v), P_1(u_i|v), . . . , P_{p^{n−k}−1}(u_i|v)] = τ_0 U_H(u_i). (3.27)

Although this vector is of length p^{n−k}, only the APP P_0(u_i|v) is needed because this APP relates to the paths through the trellis that originate in the all-zero state, and after processing n symbols, also terminate in the all-zero state.

APP decoding procedure for a DMC

With the above presented analytical framework, it is thus possible to describe an

APP decoding procedure in the original domain for (n, k) linear block codes over

GF (p) on a discrete channel modelled by a p-ary DMC as follows.

Procedure 3.2. Given is an (n, k) linear block code C in standard form over GF (p)

to be used on a p-ary DMC also given in standard form. The linear block code C

shall be defined by a parity check matrix H. A codeword u is transmitted over the

channel and a word v is received. APP decoding in the original domain comprises

the following steps.

Step 1. ∀j ∈ {1, 2, . . . , n} and ∀u_j ∈ GF(p), compute the trellis section matrix M_{h_j}(u_j) for column h_j of parity check matrix H and jth transmitted symbol u_j using (3.21) and (3.22).

Step 2. ∀j ∈ {1, 2, . . . , n} and ∀u_j ∈ GF(p), compute the weighted trellis section matrix U_{h_j}(u_j) using (3.24).

Step 3. ∀i ∈ {1, 2, . . . , k} and ∀u_i ∈ GF(p), compute the weighted trellis matrix U_H(u_i) of the full weighted trellis using (3.26) with (3.24) and (3.25).

Step 4. ∀i ∈ {1, 2, . . . , k} and ∀u_i ∈ GF(p), calculate the a posteriori probability P_0(u_i|v) using (3.27).

Step 5. Derive an estimate û_i for the transmitted symbol u_i of codeword u at each position i ∈ {1, 2, . . . , k} using

û_i = arg max_{u_i∈GF(p)} {P_0(u_i|v)}. (3.28)


3.3 Spectral Domain Matrix Representations of Linear Block Code Trellises

It is possible to find another set of p homomorphisms to form the matrix representa-

tion of the decoding trellis. From (3.17) and (3.26), one of the most computationally

expensive tasks in Procedures 3.1 and 3.2 is the multiplication of matrices. Storage

of the matrix elements is also an issue, since the trellis matrices in the original do-

main are relatively dense. Using the spectral domain, a product of diagonal matrices

can be used instead. Matrix multiplication of two n × n matrices is approximately an O(n^{2.41}) operation [59], whereas it becomes O(n) when the two matrices are diagonal. Additionally, only n values need to be stored for a diagonal n × n matrix.

A method for obtaining this representation for p-ary transmission is shown in this

section.

Elementary spectral matrices and spectral section matrices

Given a suitable transformation matrix W_p of order p, a similarity transformation of each elementary trellis matrix M(t) = M_h(u) onto a diagonal matrix Λ(t) = Λ_h(u), ∀t ∈ GF(p), can be formulated as

Λ(t) = W_p^{−1} M(t) W_p. (3.29)

It is a basic result from linear algebra that the composition of the diagonal matrix Λ(t) may be given by the eigenvalues λ_s, s = 0, 1, . . . , p−1, of the matrix M(t) in some order, whilst the rows of the transformation matrix W_p represent the corresponding eigenvectors w_s. In particular, the set of eigenvalues can be obtained by calculating the roots of the characteristic polynomial c(λ) for λ ∈ ℂ as

c(λ) = det[λI_p − M(t)], (3.30)

where det(K) denotes the determinant of a matrix K. As M(t) is a diagonal matrix

itself for t = 0, consider the cases t = 0 and t > 0 individually as follows:

t = 0: In this case, M(0) is equal to an identity matrix I_p of order p and the characteristic polynomial c(λ) for λ ∈ ℂ is obtained as

det[λI_p − M(t)] = (λ − 1)^p. (3.31)

The roots of this characteristic polynomial are λ_s = 1, s = 0, 1, . . . , p − 1.


t > 0: In this case, the non-zero entries of the matrix which is the argument to the determinant operation are given in a form such as

λI_p − M(t) = \begin{bmatrix} λ & & & −1 & & \\ & λ & & & \ddots & \\ & & \ddots & & & −1 \\ −1 & & & λ & & \\ & \ddots & & & \ddots & \\ & & −1 & & & λ \end{bmatrix}, (3.32)

where the placement of the −1 in the first row depends on the value t ∈ GF(p) and then follows the circulant structure of M(t). Using Gaussian elimination,

(3.32) can be transformed into an upper triangular matrix where the main

diagonal comprises p−1 entries of λ and one entry of λ − λ^{1−p}. Noting that

determinants of matrices are invariant under the row-addition operations of Gaussian elimination and the fact that the determinant of an upper triangular matrix is the product of its

main diagonal entries, the characteristic polynomial can be calculated as

c(λ) = det[λI_p − M(t)] = λ^p − 1. (3.33)

Solving c(λ) = 0 reveals the eigenvalues of M(t) for t > 0 as being the pth roots of unity

λ_s = w^s; w = e^{−ȷ2π/p}, (3.34)

where s = 0, 1, . . . , p − 1 and ȷ = √−1. The spectrum of eigenvalues of the matrix M(t) is therefore given by the set

W = {1, w^1, w^2, . . . , w^{p−1}}. (3.35)

Having established the spectrum W of eigenvalues λ for the cases of t = 0 and t > 0, a suitable transformation matrix W_p of order p that supports the desired diagonalisation of M(t) needs to be reported. It can easily be shown that an eigenvector w_s corresponding to an eigenvalue λ_s = w^s, s = 0, 1, . . . , p−1, may be represented as

w_s = [w^{s·0}, w^{s·1}, w^{s·2}, . . . , w^{s·(p−1)}]. (3.36)


These eigenvectors can then be used as rows of the transformation matrix such that

W_p =
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & w & w^2 & \cdots & w^{p-1} \\
1 & w^2 & w^4 & \cdots & w^{p-2} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & w^{p-1} & w^{p-2} & \cdots & w
\end{bmatrix}.    (3.37)

Using the transformation matrix W_p as defined in (3.37), the similarity transformation formulated in (3.29) results in a diagonal matrix

Λ(t) = diag{w^0, w^t, w^{2t}, . . . , w^{(p−1)t}} = diag{w^{st}}.    (3.38)

As the diagonal matrix Λ(t) contains the spectrum of eigenvalues of M(t) in its main diagonal, it is referred to as a spectral matrix. Since the transformation matrix W_p is kept fixed, the arrangement of eigenvalues in the spectral matrix Λ(t) corresponding to the matrix M(t) depends on the value t ∈ GF(p). In view of this property, the set of homomorphisms for the spectral domain may be specified as

δ_spec : t ↦ Λ(t) = diag{w^{st}}.    (3.39)
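As a concreteness check, the diagonalisation (3.29)-(3.39) is easy to reproduce numerically. The following is a minimal sketch, not part of the thesis; it assumes Python with numpy, and the circulant layout of M(t) is an assumption consistent with (3.32):

```python
import numpy as np

def M(t, p):
    # Assumed circulant layout of the elementary trellis matrix M(t):
    # a single 1 per row, shifted by t positions (consistent with (3.32)).
    out = np.zeros((p, p))
    for s in range(p):
        out[s, (s + t) % p] = 1.0
    return out

def W(p):
    # Transformation matrix W_p of (3.37); exponents reduce modulo p
    # automatically since w is a pth root of unity.
    w = np.exp(-2j * np.pi / p)
    s, t = np.meshgrid(np.arange(p), np.arange(p), indexing="ij")
    return w ** (s * t)

p, t = 5, 2
w = np.exp(-2j * np.pi / p)
Lam = np.linalg.inv(W(p)) @ M(t, p) @ W(p)                 # similarity transform (3.29)
assert np.allclose(Lam, np.diag(w ** (t * np.arange(p))))  # diag{w^{st}} of (3.38)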

Analogously to the elementary trellis matrices in the original domain and noting that t ≡ u · h (mod p), elementary spectral matrices can be defined for the spectral domain as

Λ_h(u) = Λ(uh) = diag{w^{s·uh}} = diag{w^{u·sh}} = diag{w^{u·u⊥}},    (3.40)

where u⊥ = s · h is used to indicate the relationship to the dual code C⊥. Following simple arguments of linear algebra, the spectral section matrix Λ_{h_j}(u_j) for an (n, k) linear block code C over GF(p) defined by a parity check matrix H and a given transmitted symbol u_j for j ∈ {1, 2, . . . , n} can then be deduced from the similarity transformation

Λ_{h_j}(u_j) = W_{p^{n−k}}^{-1} M_{h_j}(u_j) W_{p^{n−k}}.    (3.41)

Such a matrix for each trellis section is constructed from the Kronecker product of elementary spectral matrices, given by

Λ_{h_j}(u_j) = ⊗_{µ=1}^{n−k} Λ_{h_{n−k−µ,j}}(u_j) = diag{w^{u_j · u⊥_{s,j}}},    (3.42)


where u⊥_{s,j} is the jth symbol of the sth dual codeword u⊥_s = sH, and s = dec(s) denotes the decimal representation of the p-ary vector s. It should also be mentioned that the transformation matrix W_{p^{n−k}} can be constructed recursively using the elementary transformation matrix W_p, as shown in [27], by

W_{p^{n−k}} = W_{p^{n−k−1}} ⊗ W_p.    (3.43)

For p = 2, the transformation matrix W_{2^{n−k}} of order 2^{n−k} is known as a Walsh-Hadamard matrix, while for a prime p > 2, the obtained matrix W_{p^{n−k}} of order p^{n−k} may be considered as a generalised Walsh-Hadamard matrix. As far as the inverse matrix W_{p^{n−k}}^{-1} of the matrix W_{p^{n−k}} is concerned, it is easy to show that

W_{p^{n−k}}^{-1} = (1/p^{n−k}) W_{p^{n−k}}^H,    (3.44)

where (·)^H denotes the Hermitian transpose of the argument (see Appendix A).
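Equations (3.43) and (3.44) translate directly into a few lines of code. The following is my own illustrative sketch (assuming numpy); it builds W_{p^{n−k}} recursively and checks the Hermitian-based inverse:

```python
import numpy as np

def W(p):
    # Elementary transformation matrix W_p of (3.37).
    w = np.exp(-2j * np.pi / p)
    s, t = np.meshgrid(np.arange(p), np.arange(p), indexing="ij")
    return w ** (s * t)

def W_big(p, m):
    # W_{p^m} built recursively via Kronecker products, as in (3.43).
    out = W(p)
    for _ in range(m - 1):
        out = np.kron(out, W(p))
    return out

p, m = 3, 2                  # e.g. p = 3 and n - k = 2
Wm = W_big(p, m)
N = p ** m
# Inverse via the Hermitian transpose, cf. (3.44): W^{-1} = (1/p^{n-k}) W^H.
assert np.allclose(Wm @ (Wm.conj().T / N), np.eye(N))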

Weighted spectral matrices

With the above results, it is also possible to perform a similarity transformation of the weighted trellis matrices of the original domain into weighted spectral matrices of the spectral domain. This operation can be formulated to produce a weighted spectral matrix Θ_H(u_i) for the whole weighted trellis and a given argument u_i at position i ∈ {1, 2, . . . , k} of interest as

Θ_H(u_i) = W_{p^{n−k}}^{-1} U_H(u_i) W_{p^{n−k}}
         = ∏_{j=1}^{i−1} [W_{p^{n−k}}^{-1} U_{h_j} W_{p^{n−k}}] · [W_{p^{n−k}}^{-1} U_{h_i}(u_i) W_{p^{n−k}}] · ∏_{j=i+1}^{n} [W_{p^{n−k}}^{-1} U_{h_j} W_{p^{n−k}}]
         = ∏_{j=1}^{i−1} Θ_{h_j} · Θ_{h_i}(u_i) · ∏_{j=i+1}^{n} Θ_{h_j}.    (3.45)

The individual factors in the product (3.45) shall be referred to as weighted spectral section matrices Θ_{h_j}(u_j) and are given, for the jth section assuming that symbol u_j was transmitted over a p-ary DMC in standard form, as

Θ_{h_j}(u_j) = W_{p^{n−k}}^{-1} U_{h_j}(u_j) W_{p^{n−k}} = P(v_j|u_j) Λ_{h_j}(u_j).    (3.46)


Incorporating (3.23), (3.46) may be further specified as

Θ_{h_j}(u_j) =
{ [1 − (p−1)ǫ] · Λ_{h_j}(u_j)   if u_j = v_j,
  ǫ · Λ_{h_j}(u_j)              if u_j ≠ v_j,
=
{ diag{Θ_{s,j}(u_j) = [1 − (p−1)ǫ] · w^{u_j·u⊥_{s,j}}}   if u_j = v_j,
  diag{Θ_{s,j}(u_j) = ǫ · w^{u_j·u⊥_{s,j}}}              if u_j ≠ v_j.    (3.47)

Similarly, the weighted spectral section matrices Θ_{h_j} for the jth section regardless of the transmitted symbol u_j can be expressed as

Θ_{h_j} = W_{p^{n−k}}^{-1} U_{h_j} W_{p^{n−k}} = ∑_{u_j=0}^{p−1} Θ_{h_j}(u_j) = diag{Θ_{s,j} = ∑_{u_j=0}^{p−1} P(v_j|u_j) · w^{u_j·u⊥_{s,j}}}.    (3.48)

By substituting (3.47) and (3.48) into (3.45), the weighted spectral matrix Θ_H(u_i) with focus on symbol u_i at position i ∈ {1, 2, . . . , k} can be reformulated as

Θ_H(u_i) = diag{Q_s(u_i|v) = Θ_{s,i}(u_i) ∏_{j=1, j≠i}^{n} Θ_{s,j}},    (3.49)

where the so-called conditional spectral coefficients Q_s(u_i|v), s = 0, 1, . . . , p^{n−k} − 1, of the spectral domain represent the counterpart to the conditional probabilities P_t(u_i|v), t = 0, 1, . . . , p^{n−k} − 1, of the original domain. The conditional spectral coefficients may be expressed as

Q_s(u_i|v) = P(v_i|u_i) · w^{u_i·u⊥_{s,i}} · ∏_{j=1, j≠i}^{n} [ ∑_{u_j=0}^{p−1} P(v_j|u_j) · w^{u_j·u⊥_{s,j}} ].    (3.50)

Determining the a posteriori probabilities in the spectral domain

With the proposed similarity transformations as outlined above, it is straightforward to determine the APPs using the conditional spectral coefficients of the spectral domain as follows. In the first step, the transformation matrix W_{p^{n−k}} is applied to rewrite the vector P(u_i|v) of conditional probabilities P_t(u_i|v), t = 0, 1, . . . , p^{n−k}−1, for transmitted symbol u_i, i ∈ {1, 2, . . . , k}, as

P(u_i|v) = τ_0 W_{p^{n−k}} · W_{p^{n−k}}^{-1} U_H(u_i) W_{p^{n−k}} · W_{p^{n−k}}^{-1}
         = τ_0 W_{p^{n−k}} · Θ_H(u_i) · W_{p^{n−k}}^{-1}
         = ι_0 Θ_H(u_i) · W_{p^{n−k}}^{-1}
         = Q(u_i|v) W_{p^{n−k}}^{-1}.    (3.51)

In other words, instead of calculating the vector of conditional probabilities directly in the original domain, the initial vector τ_0 and the weighted trellis matrix U_H(u_i) may be transformed into the spectral domain, resulting in the vector of conditional spectral coefficients

Q(u_i|v) = [Q_0(u_i|v), Q_1(u_i|v), . . . , Q_{p^{n−k}−1}(u_i|v)] = ι_0 Θ_H(u_i),    (3.52)

where

ι_0 = τ_0 W_{p^{n−k}} = [1, 1, . . . , 1].    (3.53)

Exploring the simple computational structure of the spectral domain and then performing an inverse transformation to return to the original domain eventually leads to the required a posteriori probability

P_0(u_i|v) = (1/p^{n−k}) tr[Θ_H(u_i)] = (1/p^{n−k}) ∑_{s=0}^{p^{n−k}−1} Q_s(u_i|v),    (3.54)

where tr(K) denotes the trace of a matrix K. The remaining elements of the vector of conditional probabilities may be obtained using the inverse transform

P_t(u_i|v) = (1/p^{n−k}) ∑_{s=0}^{p^{n−k}−1} Q_s(u_i|v) w^{−⟨s,t⟩},    (3.55)

where the operator ⟨s, t⟩ denotes the scalar product in modulo p arithmetic between s = vec_p(s) and t = vec_p(t), which are p-ary vectors representing the decimal numbers s and t, respectively.
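Viewed computationally, (3.55) is an inverse generalised Walsh-Hadamard transform over the p-ary digit vectors of s and t. A small sketch follows (illustrative Python, not thesis code; the least-significant-digit ordering of vec_p is my assumption):

```python
import numpy as np

def inverse_spectral_transform(Q, p):
    """Recover P_t(u_i|v), t = 0..p^{n-k}-1, from the conditional spectral
    coefficients Q_s(u_i|v) via (3.55)."""
    N = len(Q)                                   # N = p^{n-k}
    m = round(np.log(N) / np.log(p))             # digits in the p-ary vectors
    w = np.exp(-2j * np.pi / p)
    digits = lambda x: [(x // p**d) % p for d in range(m)]
    P = np.zeros(N, dtype=complex)
    for t in range(N):
        for s in range(N):
            dot = sum(a * b for a, b in zip(digits(s), digits(t))) % p  # <s,t>
            P[t] += Q[s] * w ** (-dot)
    return P / N
```

For p = 2 this reduces to the ordinary inverse Walsh-Hadamard transform, and P_0(u_i|v) of (3.54) is simply the t = 0 entry.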

APP decoding procedure in the spectral domain

The following procedure formulates APP decoding in the spectral domain and out-

puts the estimated sequence of information symbols for a linear block code over a

BSC or DMC.


Procedure 3.3. Given is an (n, k) linear block code C in standard form over GF (p)

to be used on a p-ary DMC also given in standard form. The linear block code C

shall be defined by a parity check matrix H. The codeword u is transmitted over

the channel and the received word is obtained as v. APP decoding in the spectral

domain comprises the following steps.

Step 1. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ GF (p), compute the spectral section matrix

Λhj(uj) for column hj of parity check matrix H and jth transmitted symbol

uj using (3.40) and (3.42).

Step 2. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ GF (p), compute the weighted spectral section

matrix Θhj(uj) using (3.23) and (3.47).

Step 3. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ GF (p), compute the weighted spectral matrix

ΘH(ui) relating to the full weighted trellis using (3.45) with (3.47) and (3.48).

Step 4. Derive an estimate û_i for the transmitted symbol u_i of codeword u at each position i ∈ {1, 2, . . . , k} using

û_i = arg max_{u_i ∈ GF(p)} {tr[Θ_H(u_i)]}.    (3.56)
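A compact end-to-end sketch of Procedure 3.3 for the binary case (p = 2) is given below. It is illustrative rather than definitive: the function name is mine, the BSC is assumed to be in standard form, and the scalar weights of (3.47)-(3.48) are used directly so that each parallel path of the diagonal trellis contributes one conditional spectral coefficient.

```python
import numpy as np
from itertools import product

def spectral_app_decode(H, v, eps):
    """APP decoding of a binary (n, k) linear block code with parity check
    matrix H (standard form) over a BSC with crossover probability eps,
    following Procedure 3.3; decisions are made via (3.56)."""
    H = np.asarray(H) % 2
    r, n = H.shape                      # r = n - k redundancy bits
    k = n - r
    duals = [np.array(s) @ H % 2 for s in product([0, 1], repeat=r)]
    estimates = []
    for i in range(k):
        score = {}
        for ui in (0, 1):
            total = 0.0
            for up in duals:            # one parallel path per dual codeword
                q = 1.0
                for j in range(n):
                    sign = -1.0 if up[j] else 1.0
                    if j == i:          # weighted section Theta_{s,i}(u_i), (3.47)
                        q *= (1 - eps if v[j] == ui else eps) * (sign if ui else 1.0)
                    else:               # section irrespective of u_j, (3.48)
                        q *= (1 - eps) + sign * eps if v[j] == 0 else eps + sign * (1 - eps)
                total += q              # accumulates tr[Theta_H(u_i)], cf. (3.54)
            score[ui] = total
        estimates.append(max(score, key=score.get))
    return estimates
```

For instance, with the (7,4) Hamming parity check matrix of (3.104), a call such as spectral_app_decode(H, [1, 0, 0, 1, 1, 0, 1], 0.05) returns estimates for the k = 4 information bits.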

3.4 Instructive Examples

Here, an example is provided to illustrate the algorithmic components involved in

APP decoding of linear block codes on discrete channels without memory using

the original domain and spectral domain as formulated in Procedures 3.1 and 3.3,

respectively. For this purpose, consider the binary (4, 2) linear block code C defined

by the parity check matrix H given in (2.114) (see Example 2.1). Accordingly, the

related discrete channel without memory shall be modelled by a BSC with crossover

probability ǫ. Furthermore, without loss of generality, assume that the objective is

to obtain an APP decoding decision for the symbol u2 at position i = 2 of codeword

u, given the received word

v = [1, 1, 1, 0]. (3.57)

3.4.1 Example of decoding in the original domain

Using the parity check matrix H defined in (2.114), the trellis section matrices M_{h_j}(u_j) for the four sections of the trellis and argument u_j = 0, j = 1, 2, 3, 4, can be calculated using (3.12) as

M_{h_1}(0) = M_{h_2}(0) = M_{h_3}(0) = M_{h_4}(0) = I_4.    (3.58)

Similarly, for argument u_j = 1, the trellis section matrices M_{h_j}(u_j), j = 1, 2, 3, 4, are obtained using (3.12) as

M_{h_1}(1) = M_1(1) ⊗ M_0(1) =
[0 0 1 0
 0 0 0 1
 1 0 0 0
 0 1 0 0],    (3.59)

M_{h_2}(1) = M_1(1) ⊗ M_1(1) =
[0 0 0 1
 0 0 1 0
 0 1 0 0
 1 0 0 0],    (3.60)

M_{h_3}(1) = M_1(1) ⊗ M_0(1) =
[0 0 1 0
 0 0 0 1
 1 0 0 0
 0 1 0 0],    (3.61)

M_{h_4}(1) = M_0(1) ⊗ M_1(1) =
[0 1 0 0
 1 0 0 0
 0 0 0 1
 0 0 1 0].    (3.62)

Given the received vector v = [1, 1, 1, 0], the corresponding weighted trellis section matrices U_{h_j}(u_j) for argument u_j = 0, j = 1, 2, 3, 4, can be calculated with (3.15) and (3.58) as

U_{h_j}(0) = { ǫI_4         for j = 1, 2, 3,
               (1 − ǫ)I_4   for j = 4.    (3.63)

On the other hand, the weighted trellis section matrices U_{h_j}(u_j) for argument u_j = 1, j = 1, 2, 3, 4, can be calculated with (3.15) and the trellis section matrices given by (3.59)-(3.62) as

U_{h_j}(1) = { (1 − ǫ)M_{h_j}(1)   for j = 1, 2, 3,
               ǫM_{h_j}(1)         for j = 4.    (3.64)

Then using (3.16), a weighted matrix representation of the whole trellis in order to determine the likelihood of u_2 = 0 and u_2 = 1, respectively, can be obtained by computing the following matrix products (see also Fig. 3.1):

U_H(u_2 = 0) = U_{h_1} U_{h_2}(0) U_{h_3} U_{h_4},    (3.65)
U_H(u_2 = 1) = U_{h_1} U_{h_2}(1) U_{h_3} U_{h_4}.    (3.66)

After some elementary algebra, the related probabilities can be determined from (3.65) and (3.66) by (3.19) as

P_0(u_2 = 0|v) = ǫ(1 − ǫ)(2ǫ^2 − 2ǫ + 1),    (3.67)
P_0(u_2 = 1|v) = 2ǫ^2(1 − ǫ)^2.    (3.68)

The APP decoding decision can be deduced from the following expression:

û_2 = { 0   if ǫ(1 − ǫ)(2ǫ^2 − 2ǫ + 1) ≥ 2ǫ^2(1 − ǫ)^2,
        1   if ǫ(1 − ǫ)(2ǫ^2 − 2ǫ + 1) < 2ǫ^2(1 − ǫ)^2.    (3.69)

Noting that the difference of probabilities

P_0(u_2 = 0|v) − P_0(u_2 = 1|v) = ǫ(1 − ǫ)(2ǫ − 1)^2    (3.70)

is non-negative for all possible crossover probabilities ǫ ∈ [0, 1], the decision rule given by (3.3) produces the estimate û_2 for the second symbol u_2 of a codeword u = [u_1, u_2, u_3, u_4] ∈ C as

û_2 = 0.    (3.71)

3.4.2 Example of decoding in the spectral domain

Alternatively, the same APP decoding problem can be solved using the Walsh-Hadamard matrix of order four to diagonalise the trellis matrices given in Section 3.4.1. This allows direct consideration of the related spectral section matrices as defined in (3.42). The elements u⊥_{s,j}, j = 1, 2, 3, 4, of the dual codewords u⊥_s = [u⊥_{s,1}, u⊥_{s,2}, u⊥_{s,3}, u⊥_{s,4}], s = 0, 1, 2, 3, needed to evaluate (3.42) are given as the rows of the set

C⊥ =
[0 0 0 0
 0 1 0 1
 1 1 1 0
 1 0 1 1].    (3.72)


[Figure 3.1: Original domain APP decoding trellises for the binary (4,2) linear block code C which allow for computation of the conditional probabilities (a) P(u_2 = 0|v) and (b) P(u_2 = 1|v). (Dashed: s_{j+1} = s_j, solid: s_{j+1} = s_j ⊕ h_{j+1}^T.)]


The spectral section matrices Λ_{h_j}(u_j) for u_j = 0, j = 1, 2, 3, 4, are given by (3.42) as

Λ_{h_1}(0) = Λ_{h_2}(0) = Λ_{h_3}(0) = Λ_{h_4}(0) = I_4,    (3.73)

while for u_j = 1, the four spectral section matrices obtained may be expressed as

Λ_{h_1}(1) = Λ_1(1) ⊗ Λ_0(1) = diag{+1, +1, −1, −1},    (3.74)
Λ_{h_2}(1) = Λ_1(1) ⊗ Λ_1(1) = diag{+1, −1, −1, +1},    (3.75)
Λ_{h_3}(1) = Λ_1(1) ⊗ Λ_0(1) = diag{+1, +1, −1, −1},    (3.76)
Λ_{h_4}(1) = Λ_0(1) ⊗ Λ_1(1) = diag{+1, −1, +1, −1}.    (3.77)

It may be instructive to visualise the structure of the spectral section matrices for u_j = 1, j = 1, 2, 3, 4, given in (3.74)-(3.77) by a diagonal trellis as shown in Fig. 3.2. Comparing the set of codewords given in (3.72) and the weights of the diagonal trellis in Fig. 3.2, the following relationship can be seen:

C⊥ =
[0 0 0 0        [+1 +1 +1 +1
 0 1 0 1    ↔    +1 −1 +1 −1
 1 1 1 0         −1 −1 −1 +1
 1 0 1 1]        −1 +1 −1 −1].    (3.78)

In other words, the algebraic characteristics of the linear block code C under consideration are represented in the spectral domain by the corresponding dual code C⊥. As such, state transitions in the trellis of the original domain are transformed into a pattern of +1 and −1 weights in the diagonal trellis of the spectral domain. Then, the eight spectral matrices Λ_{h_j}(u_j) given in (3.73)-(3.77) must be weighted by the conditional probabilities P(v_j|u_j), j = 1, 2, 3, 4. Given the received word v = [1, 1, 1, 0], these probabilities may be expressed as

[P(1|u_1), P(1|u_2), P(1|u_3), P(0|u_4)] = { [ǫ, ǫ, ǫ, 1−ǫ]          for u_j = 0, j = 1, 2, 3, 4,
                                             [1−ǫ, 1−ǫ, 1−ǫ, ǫ]      for u_j = 1, j = 1, 2, 3, 4.    (3.79)

The resulting weighted spectral matrices Θ_{h_j}(u_j) for a transmitted symbol of u_j = 0, j = 1, 2, 3, 4, can then be expressed using (3.47) as

Θ_{h_1}(0) = diag{ǫ, ǫ, ǫ, ǫ},    (3.80)
Θ_{h_2}(0) = diag{ǫ, ǫ, ǫ, ǫ},    (3.81)
Θ_{h_3}(0) = diag{ǫ, ǫ, ǫ, ǫ},    (3.82)
Θ_{h_4}(0) = diag{1−ǫ, 1−ǫ, 1−ǫ, 1−ǫ},    (3.83)


[Figure 3.2: Illustration of the relationship between the codewords u⊥_s, s = 0, 1, 2, 3, of the dual code C⊥ and the spectral section matrices Λ_{h_j}(u_j), j = 1, 2, 3, 4, u_j = 1.]

whilst those for a transmitted symbol of uj = 1, j = 1, 2, 3, 4, can be expressed as

Θh1(1) = diag{(+1)(1 − ǫ), (+1)(1 − ǫ), (−1)(1 − ǫ), (−1)(1 − ǫ)}, (3.84)

Θh2(1) = diag{(+1)(1 − ǫ), (−1)(1 − ǫ), (−1)(1 − ǫ), (+1)(1 − ǫ)}, (3.85)

Θh3(1) = diag{(+1)(1 − ǫ), (+1)(1 − ǫ), (−1)(1 − ǫ), (−1)(1 − ǫ)}, (3.86)

Θh4(1) = diag{(+1)(ǫ), (−1)(ǫ), (+1)(ǫ), (−1)(ǫ)}. (3.87)

In a similar fashion to the relationship between the dual code and weights illustrated in (3.78), the products shown in (3.84)-(3.87) relate to the elements of the codewords in the dual code, but now include information about the error process induced by the discrete channel in terms of the crossover probability.

Using (3.48) and (3.49), the matrix descriptions of the complete weighted spectral trellis in the two cases u_2 = 0 and u_2 = 1 can be formulated as

Θ_H(u_2 = 0) = Θ_{h_1} Θ_{h_2}(0) Θ_{h_3} Θ_{h_4} = diag{Q_s(u_2 = 0|v)},    (3.88)
Θ_H(u_2 = 1) = Θ_{h_1} Θ_{h_2}(1) Θ_{h_3} Θ_{h_4} = diag{Q_s(u_2 = 1|v)}.    (3.89)

After performing the multiplications given in (3.88) and (3.89), respectively, the


conditional spectral coefficients Qs(0|v), s = 0, 1, 2, 3, for u2 = 0 are obtained as

Q_0(u_2 = 0|v) = ǫ,    (3.90)
Q_1(u_2 = 0|v) = ǫ(1 − 2ǫ),    (3.91)
Q_2(u_2 = 0|v) = ǫ(2ǫ − 1)^2,    (3.92)
Q_3(u_2 = 0|v) = ǫ(2ǫ − 1)^2(1 − 2ǫ),    (3.93)

while for the case of u2 = 1, the results may be expressed as

Q_0(u_2 = 1|v) = 1 − ǫ,    (3.94)
Q_1(u_2 = 1|v) = (1 − ǫ)(1 − 2ǫ),    (3.95)
Q_2(u_2 = 1|v) = (1 − ǫ)(2ǫ − 1)^2,    (3.96)
Q_3(u_2 = 1|v) = (1 − ǫ)(2ǫ − 1)^2(1 − 2ǫ).    (3.97)

The weights of the diagonal trellis for u2 = 0 and their relationship with the con-

ditional spectral coefficients Qs(0|v), s = 0, 1, 2, 3 as found in (3.90)-(3.93) can be

seen in Fig. 3.3(a). However, in order to illustrate the impact of the dual code C⊥

and the received word v on the composition of the conditional spectral coefficients

Qs(ui|v), it may be instructive to consider the weights of the diagonal trellis for

u2 = 1 as shown in Fig. 3.3(b). For ease of exposition, define the difference weights

∆0 = 1 − 2ǫ and ∆1 = 2ǫ− 1. (3.98)

The impact of the received word v and the dual code C⊥ on the order of weights in the diagonal trellis can then be deduced from the following relationship:

C⊥ =
[0 0 0 0        [ 1    +(1−ǫ)   1    1
 0 1 0 1    ↔     1    −(1−ǫ)   1    ∆_0
 1 1 1 0          ∆_1  −(1−ǫ)   ∆_1  1
 1 0 1 1]         ∆_1  +(1−ǫ)   ∆_1  ∆_0 ].    (3.99)

For positions j = 1, 3, 4, it can be seen from (3.99) that the element u⊥_j = 0 in a codeword u⊥ of the dual code C⊥ relates to the weight +1, while the element u⊥_j = 1 relates to the weight ∆_0 or ∆_1 depending on the elements of the received word v. In this context, it is noted that the difference value ∆_0 is used when v_j = 0, whereas the difference value ∆_1 is used when v_j = 1. For i = 2, the position at which the APP decision is to be established in this example, the elements u⊥_{s,2} = 0 and u⊥_{s,2} = 1, s = 0, 1, 2, 3, relate to the factors +1 and −1 in the weights, respectively.

weights, respectively. The properties of the discrete channel are accounted for by the

conditional probability P (v2|u2), which for the given transmitted symbol u2 = 1 is

determined by the element v2 = 1 of the received word v as P (1|1) = 1− ǫ. Clearly,

these structural characteristics can be used to efficiently implement APP decoding

over discrete channels without memory. In particular, only the set of weights of the

diagonal trellis has to be produced and the order of their appearance in the product

leading to the conditional spectral coefficients is determined by the dual code C⊥

and the received word v. It is to be noted that similar findings extend to the case

of discrete channels with memory, subject to the modification that the set of scalar

weights is replaced by corresponding matrices.

Having computed the related conditional spectral coefficients Q_s(u_2 = 0|v) and Q_s(u_2 = 1|v), respectively, the mapping to the conditional probabilities P_0(u_2 = 0|v) and P_0(u_2 = 1|v) can be derived using the inverse transform (3.54) as

P_0(u_2 = 0|v) = (1/4) ∑_{s=0}^{3} Q_s(u_2 = 0|v) = ǫ(1 − ǫ)(2ǫ^2 − 2ǫ + 1),    (3.100)
P_0(u_2 = 1|v) = (1/4) ∑_{s=0}^{3} Q_s(u_2 = 1|v) = 2ǫ^2(1 − ǫ)^2.    (3.101)

Clearly, the APP decoding decision deduced from the spectral domain characteristics in terms of conditional spectral coefficients through the expression

û_2 = { 0   if ∑_{s=0}^{3} Q_s(u_2 = 0|v) ≥ ∑_{s=0}^{3} Q_s(u_2 = 1|v),
        1   if ∑_{s=0}^{3} Q_s(u_2 = 0|v) < ∑_{s=0}^{3} Q_s(u_2 = 1|v),    (3.102)

then leads to the same outcome as in the original domain. It produces the estimate û_2 for the transmitted symbol u_2 at position i = 2 of a codeword u as

û_2 = 0.    (3.103)
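The closed forms (3.100)-(3.103) can be double-checked by brute force. In the throwaway sketch below (my own verification code; H is reconstructed from the dual code listed in (3.72), since (2.114) lies outside this excerpt), the unnormalised APPs are summed directly over the codewords:

```python
import numpy as np
from itertools import product

# H reconstructed from (3.72): its rows generate exactly the dual words shown.
H = np.array([[1, 1, 1, 0],
              [0, 1, 0, 1]])
codewords = [u for u in product([0, 1], repeat=4) if not (H @ u % 2).any()]
v = (1, 1, 1, 0)

for eps in (0.01, 0.1, 0.3):
    def brute(bit):   # sum of P(v|u) over codewords u with u_2 = bit
        return sum(np.prod([1 - eps if uj == vj else eps
                            for uj, vj in zip(u, v)])
                   for u in codewords if u[1] == bit)
    assert np.isclose(brute(0), eps * (1 - eps) * (2 * eps**2 - 2 * eps + 1))  # (3.100)
    assert np.isclose(brute(1), 2 * eps**2 * (1 - eps)**2)                     # (3.101)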


[Figure 3.3: Weighted diagonal trellises of the binary (4, 2) linear block code C used for computing the conditional spectral coefficients (a) Q_s(u_2 = 0|v) and (b) Q_s(u_2 = 1|v); s = 0, 1, 2, 3.]


3.5 Numerical Examples

Computer simulations were carried out for several binary and non-binary linear

block codes to examine the BER performance of these codes over BSCs and DMCs,

respectively. In particular, APP decoding in the spectral domain as formulated in

Procedure 3.3 was used. The linear block codes were chosen such that they have code

rates of R ≥ 0.5, which ensures the complexity benefits of the spectral domain can

be utilised. It must be noted that these simulations are not intended to provide an

exhaustive performance investigation of APP decoding on discrete channels without

memory but rather to verify the applicability of the derived theoretical framework.

Simulation results for binary linear block codes on BSCs

The particulars of the considered binary (n, k) linear block codes C used together

with the BSC model are given below.

(7,4) Hamming code: This one-error correcting Hamming code of rate R = 0.57

can be defined by the parity check matrix

H =
[0 1 1 1 1 0 0
 1 0 1 1 0 1 0
 1 1 0 1 0 0 1].    (3.104)

(16,8) Cyclic code: This two-error correcting binary block code of rate R = 0.5

is defined by a generator polynomial with coefficients given as the vector

g = [1, 1, 1, 0, 1, 0, 1, 1, 1]. Accordingly, the equivalent standard form of the

parity check matrix for this code can be obtained as

H =
[1 0 1 0 0 1 0 1 1 0 0 0 0 0 0 0
 0 1 1 1 0 1 1 1 0 1 0 0 0 0 0 0
 0 0 0 1 1 1 1 0 0 0 1 0 0 0 0 0
 1 0 0 0 1 1 1 1 0 0 0 1 0 0 0 0
 1 1 1 0 0 0 1 0 0 0 0 0 1 0 0 0
 1 1 1 1 0 0 0 1 0 0 0 0 0 1 0 0
 1 1 0 1 1 1 0 1 0 0 0 0 0 0 1 0
 0 1 0 0 1 0 1 1 0 0 0 0 0 0 0 1].    (3.105)

(22,13) Chen code: This two-error correcting binary (22,13) linear block code of rate R = 0.59 reported by Chen, Fan, and Jin has been defined in [60], and is henceforth referred to as the Chen code. The code can be defined by the parity check matrix

H =
[0 0 0 0 0 0 0 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0
 0 0 0 0 1 1 1 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0
 0 0 1 1 0 0 1 0 0 1 1 1 0 0 0 1 0 0 0 0 0 0
 0 1 0 1 0 1 0 0 1 0 1 1 1 0 0 0 1 0 0 0 0 0
 0 1 1 0 1 0 0 1 0 1 0 1 1 0 0 0 0 1 0 0 0 0
 1 0 0 1 0 1 1 0 1 1 0 1 0 0 0 0 0 0 1 0 0 0
 1 0 1 0 1 1 1 1 0 1 1 0 1 0 0 0 0 0 0 1 0 0
 1 1 0 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 1 0
 1 1 1 0 0 0 1 1 1 0 1 1 1 0 0 0 0 0 0 0 0 1].    (3.106)

[Figure 3.4: BER performance of some binary block codes on a BSC; BER versus crossover probability ǫ for uncoded transmission and the (7,4) Hamming, (22,13) Chen, and (16,8) cyclic codes.]

The BER performance of these block codes on BSCs was obtained through com-

puter simulations and is shown in Fig. 3.4. The performance for transmitting data

without coding is shown for comparison. As expected, the channel codes improve

the performance compared to an uncoded transmission. The (22,13) Chen code and

the (16,8) cyclic code outperform the weaker (7,4) Hamming code due to their supe-

rior error-correcting capabilities. In all cases, the BER increases with the crossover

probability ǫ.


[Figure 3.5: SER performance of some block codes over GF(3) on a ternary DMC; SER versus crossover probability ǫ for uncoded transmission, the (4,2) Hamming code, and the (11,6) Golay code.]

Simulation results for ternary linear block codes on DMCs

The specifications of the considered (n, k) linear block codes C over GF (3) used on

the ternary DMC model are as follows.

(4,2) Hamming code over GF (3): This one-error correcting ternary Hamming

code of rate R = 0.5 is a perfect code and defined by the parity check matrix

H =
[2 2 1 0
 1 2 0 1].    (3.107)

(11,6) Golay code over GF (3): The parity check matrix of this two-error cor-

recting perfect block code of rate R = 0.54 is given by

H =
[2 0 2 1 1 2 1 0 0 0 0
 2 2 0 2 1 1 0 1 0 0 0
 2 1 2 0 2 1 0 0 1 0 0
 2 1 1 2 0 2 0 0 0 1 0
 2 2 1 1 2 0 0 0 0 0 1].    (3.108)

The simulated SER performance obtained for these linear block codes over GF(3) on ternary DMCs is displayed in Fig. 3.5. It should be mentioned that the standard DMC model shown in Fig. 2.3(a) has been used in the simulations. Again, as

expected, it is observed that channel coding produces a gain over the uncoded sce-

nario. In this case, however, the lower rate of the Hamming code does not result in

a better performance when compared with the Golay code. Both are perfect codes,

but the Golay code can correct two errors whilst the Hamming code is single-error

correcting.

3.6 Summary

There are many different trellis-based decoding algorithms available. However, work-

ing with trellises can be cumbersome, especially if the length n of the codewords or

order p of the considered Galois field is large. It is therefore convenient to be able

to create a matrix representation of the trellis. In this way, trellis operations are

converted to addition and multiplication of matrices, which can be handled easily

by digital signal processors and computers.

The solution to the APP decoding problem in (3.1) involves forming the ma-

trix representation in either the original or the spectral domain, based solely on the

structure of the code or dual code. APP decoding algorithms were given for both

the original and spectral domains. In the original domain, the matrices are usually

non-sparse due to the many intersecting paths in the trellis. However, it has the

advantage that the APPs can be obtained directly at the terminating end of the

weighted trellis. On the other hand, the spectral domain approach involves diago-

nalising the matrix descriptions of the code trellis. This results in a diagonal trellis

with all paths parallel. The calculation of the conditional spectral coefficients can

be done relatively fast, as it requires only addition and multiplication of diagonal

matrices. These coefficients must be transformed back to APPs in order to make the

decoding decision. The involved APP decoding concepts were demonstrated in an

instructive example, verifying that the same answer is obtained when using either

domain. Some performance examples for selected linear block codes on channels

without memory were obtained by computer simulation to illustrate potential areas

of application of the presented APP decoding approach.


Chapter 4

APP Decoding on Binary Channels with Memory

The errors which occur in physical wireless channels are not usually independent and

therefore memoryless models provide too coarse an approximation [61]. One solution

to the problem of obtaining an accurate model is to force the current behaviour of

the model to be dependent upon its recent behaviour and thus endow the model with

a memory. An information theoretic argument [62] demonstrates that consideration

of the behaviour of the model during only the previous symbol-period usually results

in an acceptable approximation to a mobile channel experiencing fading.

The behaviour of such a model can be described using a hidden Markov model

which represents the different concentrations of errors in a received sequence. Since

the probabilities of all possible state transitions must be incorporated into an APP

decoding algorithm, matrices are used instead of the scalar crossover probabilities

of Chapter 3, and thus the complexity increases with the number of states.

Turbo decoding algorithms [23, 63] have been developed for convolutional codes

over channels with memory, but their block code counterparts are less prevalent.

This motivates the exposition in this chapter of two APP decoding algorithms for

binary channels described by a hidden Markov model. As in the memoryless case,

one operates in the original domain and the other uses the spectral domain.

This chapter is organised as follows. Firstly, Section 4.1 defines the main prob-

lem to be solved in this chapter, which concerns APP decoding of binary linear

block codes over channels described by stochastic automata. Section 4.2 develops

and describes two procedures for performing such decoding. The solution to the

APP decoding problem is first formulated using the original domain and then the

equivalent procedure is derived in the spectral domain. In Section 4.3, the computa-

tional complexity and storage requirements of both procedures are examined. The


theory involved in the two procedures can be made more tangible by demonstration

in examples. Section 4.4 contains two such examples, specifically APP decoding on

a binary GEC using each domain. Section 4.5 displays some performance results

obtained by computer simulation for several codes with the spectral domain APP

decoding procedure developed in Section 4.2. In particular, it is shown how the BER

performance of the decoder is affected by changing the parameters of the channel

model. Finally, Section 4.6 summarises the chapter.

The main contributions to research of this chapter are:

• Development of an APP decoding procedure using the original domain for

binary linear block codes over channels described by stochastic automata.

• Through diagonalisation, the development of an alternative procedure using

the spectral domain to perform the same task.

• Demonstration of the benefits of the spectral domain approach for high rate

codes in terms of the storage space required.

• Numerical examples showing a variety of available options when investigating

the performance of Hamming codes with APP decoding using these proce-

dures, through computer simulation of transmission of information over GECs.

4.1 Problem Statement

Suppose that C is a systematic binary (n, k) linear block code which is to be used on a channel described by a stochastic automaton

D = (D, σ_0).    (4.1)

Here, the stochastic sequential machine

D = (U, V, S, {D(v_j|u_j)})    (4.2)

has binary input and output sets U = {0, 1} and V = {0, 1} and a set S of S states. In general, there are four S × S matrix probabilities D(v_j|u_j) for u_j ∈ U and v_j ∈ V.

Additionally, σ0 is a row vector of length S representing the initial or stationary state

distribution of the automaton D. For an all-ones column vector e, the equations to

be solved for APP decoding of binary linear block codes over finite state channel

models can be expressed as

û_i = arg max_{g ∈ GF(2)} { σ_0 [ ∑_{u∈C, u_i=g} ∏_{j=1}^{n} D(v_j|u_j) ] e }.    (4.3)


However, if the channel in each state is a BSC, then the four matrix probabilities

D(vj|uj) can be replaced by a set of two matrix probabilities using the equation

D(v_j|u_j) = D_{u_j ⊕ v_j} ∈ {D_0, D_1}.    (4.4)

It follows from (4.3) and (4.4) that an APP decoding rule can be formulated as

û_i = { 0   if σ_0 [ ∑_{u∈C, u_i=0} ∏_{j=1}^{n} D_{u_j⊕v_j} ] e ≥ σ_0 [ ∑_{u∈C, u_i=1} ∏_{j=1}^{n} D_{u_j⊕v_j} ] e,
        1   otherwise.    (4.5)

If it is further assumed that the block code C is in standard form, then the problem to be solved is to find an estimate û_i of the transmitted bit u_i for i = 1, 2, . . . , k. This is done by comparing the two matrix products for all information bits, using either the original or the spectral domain.

4.2 Binary APP Decoding

The foundations of trellis representations using matrices were presented in Section

3.2. Furthermore, the elementary trellis matrices Mh(u) were introduced for a block

code C over GF (p). In this section, it is shown how to obtain a description of an

entire weighted trellis for a channel with memory by means of these elementary

matrices. As before, let C be defined by an (n−k) × n parity check matrix

H = [h_1, h_2, . . . , h_n],    (4.6)

where the jth column, 1 ≤ j ≤ n, is given by

h_j = [h_{n−k−1,j}, h_{n−k−2,j}, . . . , h_{0,j}]^T.    (4.7)

A solution to (4.5) will first be presented using the original domain. This will be

followed by an analogous procedure in the spectral domain.

4.2.1 Original domain

The decoding procedure using the original domain is developed by first weighting

the trellis section matrices by the appropriate matrix probabilities for bit error and

non-error. The APPs are calculated from the matrix representation of the trellis

and a decoding decision is then reached.


Weighted trellis matrices

The matrix representation for a trellis section must be weighted according to the

state transition and error probabilities of the channel. This is achieved by taking

the Kronecker product of the matrix for that trellis section with the required matrix

probability. For u_j, v_j ∈ GF(2), a binary channel produces four matrix probabilities D(v_j|u_j). If each state contains a BSC, then there are only two matrix probabilities

D_0 = D(v_j = u_j | u_j),    (4.8)
D_1 = D(v_j ≠ u_j | u_j),    (4.9)

and the overall structure forms a GEC model. In this case, two possible weighted

trellis section matrices exist for each column of the parity check matrix H. One

corresponds to correct reception of the transmitted bit, and one corresponds to the

situation where an error has occurred. Incorporating (4.8) and (4.9) as extensions

of (3.15), the two (2^{n−k}S) × (2^{n−k}S) weighted trellis section matrices can be written as

U_{h_j}(u_j = 0) = M_{h_j}(0) ⊗ D(v_j|0) = I_{2^{n−k}} ⊗ D_{v_j},
U_{h_j}(u_j = 1) = M_{h_j}(1) ⊗ D(v_j|1) = M_{h_j} ⊗ D_{v̄_j},    (4.10)

where v̄_j = v_j ⊕ 1, I_{2^{n−k}} is the identity matrix of order 2^{n−k}, and the trellis section matrices M_{h_j}(u_j) are described in (3.11) and (3.12). Suppose that the ith bit is being decoded. For all sections j ≠ i of an APP decoding trellis, one horizontal and one oblique branch leave each node. These correspond to transmitted bits 0 and 1, respectively. Given that all paths through the trellis must be considered, the weighted trellis section matrices U_{h_j} irrespective of the transmitted bit are used for all but the ith section. It is possible to represent the complete trellis section in these n−1 cases as

U_{h_j} = U_{h_j}(u_j = 0) + U_{h_j}(u_j = 1) = I_{2^{n−k}} ⊗ D_{v_j} + M_{h_j} ⊗ D_{v̄_j}.    (4.11)
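In code, (4.10) and (4.11) are direct Kronecker products. A minimal sketch follows (illustrative Python/numpy; the argument names M_hj, D0 and D1 follow the notation above, but the helper itself is my own):

```python
import numpy as np

def weighted_section(M_hj, vj, D0, D1):
    """U_{h_j} of (4.10)-(4.11): M_hj is the 2^{n-k} x 2^{n-k} trellis section
    matrix M_{h_j}(1); D0, D1 are the matrix probabilities of (4.8)-(4.9)."""
    Dv, Dvbar = (D0, D1) if vj == 0 else (D1, D0)   # D_{v_j} and D_{v_j + 1}
    I = np.eye(M_hj.shape[0])
    U0 = np.kron(I, Dv)          # branch for u_j = 0
    U1 = np.kron(M_hj, Dvbar)    # branch for u_j = 1
    return U0 + U1               # section irrespective of u_j, (4.11)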

Assigning a probability to each path through the trellis is accomplished by multiplication of the weighted trellis section matrices in order, from the first to the nth section. However, the summation bound “u ∈ C, u_i = g” in (2.107) stipulates that the matrix U_{h_i}(g) is the ith multiplicand in the overall product. The matrix representation of the entire trellis for the ith transmitted bit u_i is thus given by

U_H(u_i) = ∏_{j=1}^{i−1} U_{h_j} · U_{h_i}(u_i) · ∏_{j=i+1}^{n} U_{h_j}.    (4.12)


Determining the a posteriori probabilities in the original domain

A vector P(u_i|v) of 2^{n−k} APPs must be extracted from the matrix product of size (2^{n−k}S) × (2^{n−k}S) in (4.12). This is done by calculating the expected value of the product of trellis branch values over all possible initial states. The entries of the stationary state distribution vector σ_0 are used as the values of the probability distribution for the initial state of the finite state channel. More formally, as shown in [26] and [27], a vector P(u_i|v) of APPs may be calculated using the equation

P(u_i|v) = (τ_0 ⊗ σ_0) · U_H(u_i) · (I_{2^{n−k}} ⊗ e),    (4.13)

where e is a length-S column vector of ones, and

τ_0 = [1, 0, . . . , 0]    (4.14)

represents a vector of length 2^{n−k}. The row vector τ_0 has this form because all paths through a decoding trellis in the original domain must commence at the 0th node. Paths commencing at any of the other nodes are not allowed. The zeroes in (4.14) show that these paths would not contribute to the APP. The resulting vector P(u_i|v) consists of 2^{n−k} entries P_t(u_i|v) for t = 0, 1, . . . , 2^{n−k}−1, which denote the probability that the ith transmitted bit was u_i, given that a word v was received and that the encoding was performed by mapping a binary vector of k information bits onto length-n words of the tth coset V_t. In this description, as was the case for memoryless channels, the 0th coset V_0 corresponds to the code C. Hence, the required APP is P_0(u_i|v).

Although all rows and columns of each matrix U_{h_j} or U_{h_i}(u_i) are used in the calculation of the matrix representation U_H(u_i) of the entire trellis for the ith transmitted bit u_i, once this has been calculated, only the upper S rows of U_H(u_i) are used in calculating the product (τ_0 ⊗ σ_0) · U_H(u_i). This is due to the zeroes in all positions of τ_0 except the first. Since only the first element of the resulting product of (τ_0 ⊗ σ_0) · U_H(u_i) with I_{2^{n−k}} ⊗ e is required to find P_0(u_i|v), it follows that only the first S columns of U_H(u_i) are ultimately relevant.

For a matrix K and l ∈ Z^+, define the lth principal leading submatrix [K]_{(l)} as the square submatrix consisting of the intersection of the first l rows and l columns of K. For example, if K = [k_{i,j}]_{3×4}, then

[K]_{(2)} =
[k_{1,1} k_{1,2}
 k_{2,1} k_{2,2}].    (4.15)


With this notation, (4.13) can be rewritten in order to calculate P_0(u_i|v) as

P_0(u_i|v) = σ_0 · [U_H(u_i)]_{(S)} · e.    (4.16)
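Computationally, taking the Sth principal leading submatrix in (4.16) is plain array slicing plus two small matrix-vector products; a hypothetical helper (my naming, assuming numpy) might read:

```python
import numpy as np

def app_from_trellis_product(UH, sigma0):
    """P_0(u_i|v) of (4.16): slice the Sth principal leading submatrix out of
    the full trellis product UH and contract it with sigma0 and e."""
    S = len(sigma0)
    return sigma0 @ UH[:S, :S] @ np.ones(S)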

APP decoding procedure for the original domain

It is now possible to report a procedure which performs APP decoding for a binary

linear block code using the original domain. For a binary channel described by a

stochastic automaton, the procedure is as follows.

Procedure 4.1. Given is a binary (n, k) linear block code C in standard form,

to be used on a channel which is described by a stochastic automaton containing

a stochastic sequential machine which has S <∞ states. The linear block code C

shall be defined by parity check matrix H. A codeword u is transmitted over the

channel and a word v is received. APP decoding in the original domain comprises

the following steps.

Step 1. Use the state transition and crossover probabilities to find the stationary

state distribution vector σ0 and the two S×S matrix probabilities D0 and D1.

Step 2. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ {0, 1}, compute the matrix representation

Mhj(uj) for column hj and jth transmitted symbol uj using (3.12).

Step 3. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ {0, 1}, compute the weighted trellis section

matrix Uhj(uj) using (4.10).

Step 4. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ {0, 1}, compute the matrix representation of

the trellis UH(ui) using (4.12).

Step 5. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ {0, 1}, let [UH(ui)](S) be the Sth principal

leading submatrix of UH(ui) and calculate P0(ui|v) using (4.16).

Step 6. Derive an estimate û_i for each position i ∈ {1, 2, . . . , k} using

û_i = { 0   if P_0(u_i = 0|v) ≥ P_0(u_i = 1|v),
        1   if P_0(u_i = 0|v) < P_0(u_i = 1|v).    (4.17)
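Putting Procedure 4.1 together for a two-state Gilbert-Elliott channel gives a sketch like the following (illustrative Python/numpy; the GEC parameterisation, the convention that the error is generated before the state transition, and all function names are my assumptions). Enumerating the codewords realises the sums in (4.5) directly:

```python
import numpy as np
from itertools import product

def gec_matrices(g2b, b2g, eps_G, eps_B):
    """Assumed two-state GEC: g2b/b2g are the good->bad and bad->good
    transition probabilities, eps_* the per-state crossover probabilities."""
    A = np.array([[1 - g2b, g2b], [b2g, 1 - b2g]])   # states ordered (G, B)
    sigma0 = np.array([b2g, g2b]) / (g2b + b2g)      # solves sigma0 @ A = sigma0
    D0 = np.diag([1 - eps_G, 1 - eps_B]) @ A         # correct reception, (4.8)
    D1 = np.diag([eps_G, eps_B]) @ A                 # bit error, (4.9)
    return sigma0, D0, D1

def original_domain_decode(H, v, sigma0, D0, D1):
    """Decision rule (4.5): accumulate sigma0 . prod_j D_{u_j xor v_j} . e over
    codewords with u_i = 0 and with u_i = 1, then compare, for each i."""
    H = np.asarray(H) % 2
    n = H.shape[1]
    k = n - H.shape[0]
    e = np.ones_like(sigma0)
    codewords = [u for u in product([0, 1], repeat=n) if not (H @ u % 2).any()]
    estimates = []
    for i in range(k):
        score = [0.0, 0.0]
        for u in codewords:
            row = sigma0
            for uj, vj in zip(u, v):
                row = row @ (D0 if uj == vj else D1)
            score[u[i]] += row @ e
        estimates.append(int(score[1] > score[0]))
    return estimates
```

For example, sigma0, D0, D1 = gec_matrices(0.02, 0.2, 0.001, 0.1) followed by original_domain_decode(H, v, sigma0, D0, D1) with the (7,4) Hamming H of (3.104) yields the four information-bit estimates.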

An algorithm for APP decoding of binary linear block codes over a finite state

channel has been provided. In essence, it is a generalisation of the procedure shown

in Chapter 3 for memoryless channel models, but now the sequence of state transi-

tions plays an additional role. Procedure 4.1 uses the columns of the parity check


matrix H to construct a matrix representation of the decoding trellis, but another

procedure with strong connections to the dual code may also be developed.

4.2.2 Spectral domain

A decoding procedure for the spectral domain can be developed in a similar way

to the original domain approach by using diagonalised versions of the trellis section

matrices. After weighting these diagonal matrices by the appropriate matrix prob-

abilities of bit error and non-error, a transformation back to the original domain in

order to obtain the APPs is made.

Spectral section matrices

As outlined in Section 3.3, a set of elementary spectral matrices can be constructed, each representing a different section of the trellis for a particular assumed transmitted bit in that position. For column h_j of the parity check matrix and transmitted bit u_j, the spectral matrix Λ_{h_j}(u_j), defined in terms of a variable s which takes values between 0 and 2^{n−k}−1, may be expressed as

Λ_{h_j}(u_j) = diag{(−1)^{u_j · u⊥_{s,j}}},    (4.18)

where u⊥_{s,j} denotes the jth symbol of the sth dual codeword u⊥_s = sH and s = dec(s) is the decimal representation of the binary vector s.

Weighted spectral matrices

The spectral section matrices must be weighted by the state transition and error probabilities of the channel, as was the case in the original domain. One method of achieving this is to apply a similarity transformation directly to the weighted trellis section matrices derived in the original domain. When considered for a transmitted bit u_j, this relationship may be expressed as

Θ_{h_j}(u_j) = T^{-1} U_{h_j}(u_j) T    (4.19)

for a transformation matrix T. The Walsh-Hadamard transformation is applied, as was the case for memoryless models. However, here it must be applied over all S states and so the matrix T in (4.19) can be expressed as

T = W_{2^{n−k}} ⊗ I_S.    (4.20)


Since distinct rows (and likewise distinct columns) of W_{2^{n−k}} are mutually orthogonal, it follows that

W_{2^{n−k}}^2 = 2^{n−k} I_{2^{n−k}}    (4.21)

and therefore the inverse of the Walsh-Hadamard matrix can be written as

W_{2^{n−k}}^{-1} = (1/2^{n−k}) W_{2^{n−k}}.    (4.22)

Substituting (4.10), (4.20) and (4.22) into (4.19) produces the (2^{n−k}S) × (2^{n−k}S) weighted spectral matrix

Θ_{h_j}(u_j) = (1/2^{n−k}) W_{2^{n−k}} M_{h_j}(u_j) W_{2^{n−k}} ⊗ D(v_j|u_j)
             = Λ_{h_j}(u_j) ⊗ D(v_j|u_j)
             = diag{(−1)^{u_j · u⊥_{s,j}} D_{u_j ⊕ v_j}},    (4.23)

which is a block diagonal matrix. Considering the weighted spectral matrix in (4.23) for a specific input u_j gives

Θ_{h_j}(u_j) = { diag{D_{v_j}}                    if u_j = 0,
                 diag{(−1)^{u⊥_{s,j}} D_{v̄_j}}    if u_j = 1,

             = diag{ū_j D_{v_j} + u_j (−1)^{u⊥_{s,j}} D_{v̄_j}},    (4.24)

where ū_j = u_j ⊕ 1.

Weighted spectral matrices for the jth (j ≠ i) diagonal trellis sections, irrespective of the transmitted symbol u_j, are given by the sum of the Θ_{h_j}(u_j) matrices over all u_j ∈ U. Thus,

Θ_{h_j} = Θ_{h_j}(0) + Θ_{h_j}(1) = diag{D_{v_j} + (−1)^{u⊥_{s,j}} D_{v̄_j}}.    (4.25)

Assuming that an estimate of the probability that the ith bit is equal to u_i is required, a weighted (2^{n−k}S) × (2^{n−k}S) matrix Θ_H(u_i) for the entire diagonal trellis, which consists of 2^{n−k} parallel paths, is calculated by multiplying the weighted spectral matrices for each trellis section in order, from the first to the nth section. Due to the definition of the APPs, each of the weighted spectral matrices will be taken irrespective of the transmitted symbol, apart from the ith factor, where the weighted spectral matrix for an input of u_i will be used. That is,

Θ_H(u_i) = ∏_{j=1}^{i−1} Θ_{h_j} · Θ_{h_i}(u_i) · ∏_{j=i+1}^{n} Θ_{h_j}.    (4.26)


Each factor of Θ_H(u_i) is a block diagonal matrix with square, invertible submatrices on the main diagonal. Additionally, block diagonal matrices over C with the same square block structure of invertible submatrices form a group under matrix multiplication. Therefore, Θ_H(u_i) is also a block diagonal matrix, and it is possible to write

Θ_H(u_i) = diag{\mathbf{Q}_s(u_i|v)},    (4.27)

where

\mathbf{Q}_s(u_i|v) = ∏_{j=1}^{i−1} [D_{v_j} + (−1)^{u⊥_{s,j}} D_{v̄_j}] × [ū_i D_{v_i} + u_i (−1)^{u⊥_{s,i}} D_{v̄_i}] × ∏_{j=i+1}^{n} [D_{v_j} + (−1)^{u⊥_{s,j}} D_{v̄_j}].    (4.28)

Observing the structure of Θ_{h_j} in (4.25), when n ≥ 3 there are more instances of D_0 ± D_1 than of either D_0 or D_1 alone. For most codes, it is thus beneficial to rewrite (4.28) using the notation

D = D_0 + D_1,
∆ = D_0 − D_1.    (4.29)

Firstly, for columns j ≠ i,

D_{v_j} + (−1)^{u⊥_{s,j}} D_{v̄_j} = (−1)^{v_j · u⊥_{s,j}} (ū⊥_{s,j} D + u⊥_{s,j} ∆),    (4.30)

where ū⊥_{s,j} = u⊥_{s,j} ⊕ 1. Secondly, for the ith column,

ū_i D_{v_i} + u_i (−1)^{u⊥_{s,i}} D_{v̄_i} = (1/2) [ (−1)^{u_i · u⊥_{s,i}} D + (−1)^{(u_i · u⊥_{s,i}) + (u_i ⊕ v_i)} ∆ ].    (4.31)

Using (4.30) and (4.31) in (4.28) yields the conditional spectral coefficient matrices

\mathbf{Q}_s(u_i|v) = ∏_{j=1}^{i−1} [ (−1)^{v_j · u⊥_{s,j}} (ū⊥_{s,j} D + u⊥_{s,j} ∆) ]
                    × (1/2) [ (−1)^{u_i · u⊥_{s,i}} D + (−1)^{(u_i · u⊥_{s,i}) + (u_i ⊕ v_i)} ∆ ]
                    × ∏_{j=i+1}^{n} [ (−1)^{v_j · u⊥_{s,j}} (ū⊥_{s,j} D + u⊥_{s,j} ∆) ].    (4.32)

In particular, note the correspondence between the zeroes and ones of the dual codewords and the arrangement of D and ∆ matrix probabilities within the matrices \mathbf{Q}_s(u_i|v). The conditional spectral coefficients Q_s(u_i|v), which factor in the relative likelihoods of the model commencing in each of the states, can be obtained from the conditional spectral coefficient matrices \mathbf{Q}_s(u_i|v) using the conversion

Q_s(u_i|v) = σ_0 · \mathbf{Q}_s(u_i|v) · e.    (4.33)

Together, the 2^{n−k} conditional spectral coefficients Q_s(u_i|v) form the vector of conditional spectral coefficients

Q(u_i|v) = [Q_0(u_i|v), Q_1(u_i|v), . . . , Q_{2^{n−k}−1}(u_i|v)].    (4.34)

Determining the a posteriori probabilities in the spectral domain

The conditional spectral coefficient matrices cannot be used directly to perform the APP decoding. These matrices are constructed from the spectral coefficients and, in order to calculate the required APPs, coefficients from the original domain must be used. The two sets of coefficients are related by the Walsh-Hadamard matrix W_{2^{n−k}} of order 2^{n−k}. The spectral domain equivalent of (4.13) can be determined using properties of the Kronecker product. Firstly, substituting (4.12) into (4.13) produces

P(u_i|v) = (τ_0 ⊗ σ_0) · ∏_{j=1}^{i−1} U_{h_j} · U_{h_i}(u_i) · ∏_{j=i+1}^{n} U_{h_j} · (I_{2^{n−k}} ⊗ e).    (4.35)

Then, application of (4.19), (4.20) and (4.26) results in

P(u_i|v) = (τ_0 ⊗ σ_0)(W_{2^{n−k}} ⊗ I_S) · Θ_H(u_i) · (W_{2^{n−k}}^{-1} ⊗ I_S)(I_{2^{n−k}} ⊗ e).    (4.36)

It is also important to note that

(τ_0 ⊗ σ_0)(W_{2^{n−k}} ⊗ I_S) = ι_0 ⊗ σ_0,    (4.37)

where ι_0 is the all-ones vector of length 2^{n−k}, which is equal to the first row of W_{2^{n−k}} as defined in (3.53). A further simplification may be made using the properties of identity matrices, so that

(W_{2^{n−k}}^{-1} ⊗ I_S)(I_{2^{n−k}} ⊗ e) = W_{2^{n−k}}^{-1} ⊗ e.    (4.38)

The substitution of (4.27), (4.37) and (4.38) into (4.36) results in

P(u_i|v) = (ι_0 ⊗ σ_0) · diag{\mathbf{Q}_s(u_i|v)} · (W_{2^{n−k}}^{-1} ⊗ e).    (4.39)


Applying (4.22) and considering the structure of the vector of conditional spectral coefficients defined in (4.33) and (4.34) produces

P(u_i|v) = (1/2^{n−k}) Q(u_i|v) W_{2^{n−k}}.    (4.40)

Then, the required APP is the first element of the vector P(u_i|v), that is

P_0(u_i|v) = (1/2^{n−k}) ∑_{s=0}^{2^{n−k}−1} Q_s(u_i|v).    (4.41)

Since the estimate û_i for the transmitted bit u_i is the binary symbol that maximises P_0(u_i|v), (4.41) gives

û_i = arg max_{u_i ∈ GF(2)} { ∑_{s=0}^{2^{n−k}−1} Q_s(u_i|v) }.    (4.42)

APP decoding procedure for the spectral domain

Assuming the same conditions as those for Procedure 4.1, an alternative APP de-

coding procedure using the spectral domain is summarised below.

Procedure 4.2. Given is a binary (n, k) linear block code C in standard form,

to be used on a channel which is described by a stochastic automaton containing a

stochastic sequential machine which has S<∞ states. The linear block code C shall

be defined by parity check matrix H. A codeword u is transmitted over the channel

and a word v is received. APP decoding in the spectral domain is comprised of the

following main steps.

Step 1. Use the state transition and crossover probabilities to find the stationary

state distribution vector σ0 and the matrix probabilities D0 and D1. Formu-

late D and ∆ in terms of D0 and D1 using (4.29).

Step 2. ∀s = dec(s) ∈ {0, 1, . . . , 2^{n−k} − 1}, compute the dual codewords

u⊥_s = s · H = [u⊥_{s,1}, u⊥_{s,2}, . . . , u⊥_{s,n}] ∈ C⊥,    (4.43)

which are used in defining the arrangement of D and ∆ matrices in (4.32).

Step 3. ∀i ∈ {1, 2, . . . , k} and ∀u_i ∈ {0, 1}, compute the conditional spectral coefficients Q_s(u_i|v) using (4.32) and (4.33).


Step 4. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ {0, 1}, accumulate these coefficients to com-

pute the APPs P0(ui|v) using (4.41).

Step 5. Derive an estimate û_i of the ith transmitted bit u_i of codeword u at each position i ∈ {1, 2, . . . , k} using

û_i = { 0   if ∑_{s=0}^{2^{n−k}−1} Q_s(u_i = 0|v) ≥ ∑_{s=0}^{2^{n−k}−1} Q_s(u_i = 1|v),
        1   otherwise.    (4.44)
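The spectral counterpart, again for the assumed Gilbert-Elliott set-up of the earlier original-domain sketch, accumulates one conditional spectral coefficient at a time via (4.32)-(4.33) and decides by (4.44) (illustrative Python/numpy; names are mine):

```python
import numpy as np
from itertools import product

def spectral_domain_decode(H, v, sigma0, D0, D1):
    """Procedure 4.2 sketch: one pass per dual codeword, following (4.32),
    the conversion (4.33) and the decision rule (4.44)."""
    H = np.asarray(H) % 2
    r, n = H.shape
    k = n - r
    S = len(sigma0)
    e = np.ones(S)
    D, Delta = D0 + D1, D0 - D1                    # cf. (4.29)
    duals = [np.array(s) @ H % 2 for s in product([0, 1], repeat=r)]
    estimates = []
    for i in range(k):
        total = [0.0, 0.0]
        for ui in (0, 1):
            for up in duals:
                Q = np.eye(S)
                for j in range(n):
                    if j == i:                     # middle factor of (4.32)
                        sgn = (-1) ** (ui * up[i])
                        F = 0.5 * (sgn * D + sgn * (-1) ** (ui ^ v[i]) * Delta)
                    else:                          # outer factors of (4.32)
                        F = (-1) ** (v[j] * up[j]) * (Delta if up[j] else D)
                    Q = Q @ F
                total[ui] += sigma0 @ Q @ e        # conversion (4.33)
        estimates.append(int(total[1] > total[0]))
    return estimates
```

On the same inputs this agrees with the original-domain sketch given after Procedure 4.1, which provides a useful consistency check between the two domains.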

4.3 Complexity Analysis

There are two common methods of analysing the complexity of an algorithm. Firstly,

one can investigate how much computation time is required for the execution of the

algorithm in terms of the sizes of its input arguments. Secondly, one may also

look at the amount of computer memory which would be required in order to run

the algorithm. This section provides such an analysis of Procedures 4.1 and 4.2 in

comparison to each other and to alternative approaches.

4.3.1 Computational complexity

A reasonable approximation to the execution time requirements of the procedures

may be made by determining the number of multiplications which would be needed

to decode a single word. The addition operations are ignored, since they are of

lower computational complexity. An analysis of the number of operations is made

in terms of the sizes of three main input arguments. These arguments are the total

number n of bits in a codeword, the number k of information bits per codeword,

and the number S of states in the channel model. This analysis is provided using

O notation. Specifically, if f(x) and g(x) are real-valued functions of one variable,

then f(x) is O(g(x)) if and only if there exists x0 ∈ R and a positive constant c such

that

|f(x)| ≤ c · |g(x)| (4.45)

for all x values greater than the threshold x0. Multivariate extensions are similar. In

this way, O notation provides an asymptotic upper bound to a function and permits

a comparison of functions as the magnitude of their arguments tend to infinity.


Computational complexity of the original domain approach

Procedure 4.1 is oriented only towards calculation of the APPs P_0(u_i|v). Although alternative analyses could be made for determining P_s(u_i|v) ∀s ∈ {1, 2, . . . , 2^{n−k}−1}, as is considered in [64] for linear unequal error protection codes, a similar conclusion as to the computational complexity of such an approach would be reached and this scenario will not be considered here. Therefore, assume the APP P_0(u_i|v) needs to be calculated for both u_i = 0 and u_i = 1, for all values of i ∈ {1, 2, . . . , k}. To calculate each such APP, a sum of 2^{k−1} matrix products of length n needs to be created. The exponent is k−1 because there are 2^k choices for the information bits of the codewords, however only half of these will satisfy the condition that u_i has a specific value. The parity check matrix H then ensures that no more choices can be made for the remaining n−k bit positions. Calculation of each matrix product of length n, when pre-multiplied by the stationary state distribution vector σ_0 and post-multiplied by the all-ones column vector e, requires O(nS^2) operations when the channel model comprises S states. The overall complexity for decoding a single word using the original domain is therefore O(2^k k n S^2).

Computational complexity of the spectral domain approach

Suppose a model has S states and a word is to be decoded using Procedure 4.2. The

number of multiplications needed is approximately n × 2n−k × S2 for each of the

two possible binary symbols and for each position i ∈ {1, 2, . . . , k}. This is because

there are 2n−k rows in the diagonal trellis, each requiring a vector of length S to be

multiplied by an S × S matrix n times. Therefore the overall complexity per word

is O(2n−k+1knS2).

Whilst storage requirements also need to be considered, a major factor in the decision whether the original or spectral domain should be used is determining which of them results in a lower computational complexity. From the discussions in the previous two paragraphs, the only difference between these complexities is the exponent to which the base 2 is raised, namely k versus n−k+1. Simple arithmetic shows that the spectral domain approach of Procedure 4.2 is preferred whenever

k > (n + 1)/2 (4.46)

and the original domain should be used if this is not the case. In summary, the original domain is more suited to lower rate codes and the spectral domain is more appropriate for codes which have a high rate.
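Since both operation counts share the common factor knS², the comparison reduces to the two powers of two. The following Python fragment is a minimal sketch of this selection rule, given only for illustration; the function name is not part of any implementation described in this thesis.

    # Minimal sketch of the rule in (4.46), assuming the operation counts
    # O(2^k * k*n*S^2) and O(2^(n-k+1) * k*n*S^2) derived above.
    def preferred_domain(n: int, k: int) -> str:
        original = 2 ** k            # original domain: 2^k matrix products
        spectral = 2 ** (n - k + 1)  # spectral domain: 2^(n-k+1) products
        return "spectral" if original > spectral else "original"

    # Example: a high-rate (7,5) code favours the spectral domain, while a
    # low-rate (7,2) code favours the original domain.
    assert preferred_domain(7, 5) == "spectral"
    assert preferred_domain(7, 2) == "original"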


Computational complexity of other known approaches

Some APP decoding algorithms which use a trellis to perform their calculations have been published. For example, the BCJR algorithm [8] uses both a forward and a backward recursion along the decoding trellis. It requires O(2^{n−k+1} kn) operations per codeword of a binary (n, k) linear block code C, since each of the k information bits to be decoded requires two passes along a trellis of size 2^{n−k} × n. The one-sweep algorithm by Johansson and Zigangirov, presented in [9], is a significant improvement over the BCJR algorithm since it reduces the number of passes per trellis required from two to one. Thus, its complexity is approximately half that of the algorithm in [8]. However, algorithms such as those described in [8] and [9] were developed for memoryless channels, without the concept of states. The advantages of Procedures 4.1 and 4.2 are that they were developed specifically for channel models which possess states and, when compared to the BCJR algorithm, that they require only a forward recursion.

4.3.2 Storage requirements

The other main factor in determining the cost of an algorithm is the amount of data it needs to store in order to carry out its tasks. An algorithm may have a low computational complexity; however, its desirability is decreased if it has large storage requirements. In the following analysis, let Y be the quantity of real number storage spaces required for the execution of an algorithm. The O notation again indicates the asymptotic size of this storage as the parameters n, k and S increase.

Storage requirements of the original domain approach

Even though the original domain approach of Procedure 4.1 is based upon calculating the S×S matrix [UH(ui)]^{(S)}, the entire matrix UH(ui) must in essence be determined. This requires storage of (2^{n−k} S)² real numbers. Calculation of P0(ui|v) using (4.16) requires a temporary vector of length S, and the space for two real numbers is used to store the results P0(ui|v) for ui = 0 and ui = 1. Therefore Procedure 4.1 requires space to store 4^{n−k} S² + S + 2 real numbers and hence

Y = O(4^{n−k} S²). (4.47)


Storage requirements of the spectral domain approach

There is a definite advantage in using the spectral domain when compared to the original domain because the conditional spectral coefficients Qs(ui|v) can be calculated one at a time. Such calculations involve multiplications of a row vector of length S by an S×S matrix. After multiplication by the column vector e, the final result is a real number. As each of the 2^{n−k} conditional spectral coefficients is calculated, it is added to the previous tally. Thus, only the storage space of a single real number is required to determine P0(ui|v). For the binary case, only two of these are required in order to arrive at a decoding decision for position i. Therefore, ignoring the space required to store vectors σ0 and v, as well as matrices D and ∆, Procedure 4.2 needs storage space of approximately S² + S + 2 real numbers. Thus, for the spectral domain,

Y = O(S²). (4.48)

Storage requirements of other known approaches

To provide a comparison to these analyses, the BCJR algorithm requires storage of 2^{n−k} vectors, each of length n, and Johansson and Zigangirov's improvement reduces this figure to just 2^{n−k} real numbers [9]. The high storage costs of the BCJR algorithm have prompted many alterations to be made, such as in [65], where the amount of storage required is reduced to

Y = O(2^{n−k}) (4.49)

at the expense of a 1/3 increase in the computational complexity. This analysis shows the attractiveness of using Procedure 4.2 if storage requirements are a factor.

4.4 Instructive Examples

To demonstrate the calculation of the a posteriori probabilities required in Procedures 4.1 and 4.2, a decoding example in both the original and the spectral domain

for communications on a GEC will be provided. The code used in this example

is the same binary (4,2) linear block code as for the example in Section 3.4, thus

making clear the effect of altering the channel model from BSC to GEC. Define the

stationary state distribution vector σ0, the all-ones vector e, the state transition

probabilities P and Q and the crossover probabilities pG and pB for the GEC as

done in Section 2.2.2.


4.4.1 Example of decoding in the original domain

Suppose a codeword from the binary (4,2) linear block code

C = {[0, 0, 0, 0], [0, 1, 1, 1], [1, 0, 1, 0], [1, 1, 0, 1]} (4.50)

of Example 2.1 is sent over a GEC. Assume v = [1, 1, 1, 0] is received and the aim

is to use the original domain to decode the second bit transmitted. The first step

is to calculate the trellis matrices Mhj(uj) for each of the four columns hj of the

parity check matrix H and for each possible transmitted bit uj. These were given in

(3.58) to (3.62). The trellis matrices must be weighted by the matrix probabilities

D0 and D1, as defined in (2.42) and (2.43), respectively, in order to determine the

weighted trellis section matrices with respect to a specific transmitted bit uj. The

matrix probability D0 is used when uj = vj, as there has not been a transmission

error. When uj ≠ vj, a transmission error has occurred, and D1 is used instead. The

weighted matrices representing the first trellis section for transmitted bits u1 = 0

and u1 = 1 can be calculated as

Uh1(0) = Mh1(0) ⊗ D1 =
[ D1  0   0   0
  0   D1  0   0
  0   0   D1  0
  0   0   0   D1 ] , (4.51)

Uh1(1) = Mh1(1) ⊗ D0 =
[ 0   0   D0  0
  0   0   0   D0
  D0  0   0   0
  0   D0  0   0 ] . (4.52)

A weighted matrix representation of the first trellis section regardless of transmitted

bit u1 is given by

Uh1 = Uh1(0) + Uh1(1) =
[ D1  0   D0  0
  0   D1  0   D0
  D0  0   D1  0
  0   D0  0   D1 ] . (4.53)

Weighted matrix representations of the third and fourth trellis sections are calculated

similarly. It then follows from (4.16) that the APPs required in order to decode the second transmitted bit are given by

P0(u2 = 0|v) = σ0 · (D1D1D1D0 + D0D1D0D0) · e,
P0(u2 = 1|v) = σ0 · (D0D0D1D1 + D1D0D0D1) · e. (4.54)



Figure 4.1: Original domain weighted APP decoding trellises for the binary (4,2) linear block code used to compute (a) P(u2 = 0|v) and (b) P(u2 = 1|v). (Dashed: s_{j+1} = s_j; solid: s_{j+1} = s_j ⊕ h_{j+1}^T.)


Figure 4.1 shows how the components of the complete set of APPs Ps(u2|v), s ∈ {0, 1, 2, 3}, are derived from the two trellises for u2 = 0 and u2 = 1. The final decoding decision is made by substituting the values of P, Q, pG and pB into (2.38), (2.42) and (2.43), and subsequently into (4.54). For example, if the channel parameters are given as

P = 0.01, Q = 0.2, pG = 0.001, and pB = 0.3, (4.55)

then

P0(u2 = g|v) = { 8.34 × 10⁻³ if g = 0,
                 3.25 × 10⁻³ if g = 1. (4.56)

Therefore P0(u2 = 0|v) > P0(u2 = 1|v) and so by (4.17), û2 = 0.
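These numbers are straightforward to reproduce. The following NumPy fragment is a minimal sketch of the calculation in (4.54), written under the assumption that the GEC matrix probabilities take the form D0 = diag(1−pG, 1−pB)·T and D1 = diag(pG, pB)·T, with T the state transition matrix, following the pattern of (2.38), (2.42) and (2.43):

    import numpy as np

    # GEC parameters from (4.55); T is the state transition matrix.
    P, Q, pG, pB = 0.01, 0.2, 0.001, 0.3
    T = np.array([[1 - P, P], [Q, 1 - Q]])
    D0 = np.diag([1 - pG, 1 - pB]) @ T             # no transmission error
    D1 = np.diag([pG, pB]) @ T                     # transmission error
    sigma0 = np.array([Q / (P + Q), P / (P + Q)])  # stationary distribution
    e = np.ones(2)

    # (4.54): one matrix product per codeword with the postulated value of u2.
    app0 = sigma0 @ (D1 @ D1 @ D1 @ D0 + D0 @ D1 @ D0 @ D0) @ e
    app1 = sigma0 @ (D0 @ D0 @ D1 @ D1 + D1 @ D0 @ D0 @ D1) @ e
    print(app0, app1)   # approximately 8.34e-3 and 3.25e-3, as in (4.56)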

4.4.2 Example of decoding in the spectral domain

The elements u⊥s,j, j = 1, 2, 3, 4, of the dual codewords u⊥s = [u⊥s,1, u⊥s,2, u⊥s,3, u⊥s,4], s = 0, 1, 2, 3, which are needed to evaluate (4.18) can be expressed as the rows of the set

C⊥ =
[ 0 0 0 0
  0 1 0 1
  1 1 1 0
  1 0 1 1 ] . (4.57)

Equation (4.32) gives

Qs(u2|v) = (−1)^{v1·u⊥s,1}[(1−u⊥s,1)D + u⊥s,1∆] × (1/2)(−1)^{u2·u⊥s,2}[D + (−1)^{u2⊕v2}∆]
           × ∏_{j=3}^{4} (−1)^{vj·u⊥s,j}[(1−u⊥s,j)D + u⊥s,j∆], (4.58)

where D and ∆ are defined in (2.36) and (2.44), respectively. In total, eight conditional spectral coefficient matrices need to be calculated. One is required for each value of s ∈ {0, 1, 2, 3}, and both u2 = 0 and u2 = 1 must be treated in each of these four cases. Substituting the necessary values of the transmitted bit u2, the received word v = [1, 1, 1, 0] and the entries u⊥s,j of the dual codewords yields

Q0(u2 = 0|v) = D ((D−∆)/2) D D,
Q1(u2 = 0|v) = D ((D−∆)/2) D ∆,
Q2(u2 = 0|v) = (−∆) ((D−∆)/2) (−∆) D,
Q3(u2 = 0|v) = (−∆) ((D−∆)/2) (−∆) ∆, (4.59)


for u2 = 0 and

Q0(u2 = 1|v) = D ((D+∆)/2) D D,
Q1(u2 = 1|v) = D (−(D+∆)/2) D ∆,
Q2(u2 = 1|v) = (−∆) (−(D+∆)/2) (−∆) D,
Q3(u2 = 1|v) = (−∆) ((D+∆)/2) (−∆) ∆, (4.60)

for u2 = 1. The arrangement of the plus and minus signs preceding the factors of

(4.59) and (4.60) is a reflection of the pattern of zeroes and ones in C⊥ as defined

in (4.57) and in the received word v. The two diagonal trellises of Fig. 4.2 (a) and

(b) demonstrate how the conditional spectral coefficient matrices are used in the

calculation of the scalars Qs(u2|v). Namely, in each row of the trellis, the stationary

state distribution vector σ0 is successively multiplied by each of the four matrices

of the corresponding Qs(u2|v) matrix product and then finally multiplied by the

column vector e to arrive at a scalar result.

To calculate the two APPs P0(u2|v), it is necessary to find the mean of these four scalars. The required calculations may be written as

P0(u2 = 0|v) = (1/4) Σ_{s=0}^{3} Qs(u2 = 0|v)
             = (1/8) σ0 (D⁴ − D∆D² + D³∆ − D∆D∆ + ∆D∆D − ∆³D + ∆D∆² − ∆⁴) e, (4.61)

P0(u2 = 1|v) = (1/4) Σ_{s=0}^{3} Qs(u2 = 1|v)
             = (1/8) σ0 (D⁴ + D∆D² − D³∆ − D∆D∆ − ∆D∆D − ∆³D + ∆D∆² + ∆⁴) e. (4.62)

Once the values P, Q, pG and pB of a particular parameter set have been substituted into D, ∆ and σ0, in (2.36), (2.44) and (2.38) respectively, the results can be used in (4.61) and (4.62) to determine the APPs. Assuming the same parameter set as for the original domain, as given in (4.55), it follows that

P0(u2 = g|v) = { 8.34 × 10⁻³ if g = 0,
                 3.25 × 10⁻³ if g = 1. (4.63)

This is the same result as for the original domain, and therefore û2 = 0. Procedures 4.1 and 4.2 are two different ways of obtaining the same answer.
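The equality of the two domains can also be checked numerically. Continuing the NumPy sketch from Section 4.4.1, and using D = D0 + D1 and ∆ = D0 − D1 (the binary case of the sum and difference matrices), the spectral products (4.59) and (4.60) reproduce the APPs in (4.63):

    # Continuing the sketch above: binary sum and difference matrices.
    D, Dl = D0 + D1, D0 - D1

    # (4.59)/(4.60): the four conditional spectral coefficients per symbol.
    q0 = [D @ ((D - Dl) / 2) @ D @ D,
          D @ ((D - Dl) / 2) @ D @ Dl,
          (-Dl) @ ((D - Dl) / 2) @ (-Dl) @ D,
          (-Dl) @ ((D - Dl) / 2) @ (-Dl) @ Dl]
    q1 = [D @ ((D + Dl) / 2) @ D @ D,
          D @ (-(D + Dl) / 2) @ D @ Dl,
          (-Dl) @ (-(D + Dl) / 2) @ (-Dl) @ D,
          (-Dl) @ ((D + Dl) / 2) @ (-Dl) @ Dl]

    # (4.61)/(4.62): average the coefficients to recover the APPs.
    app0_spec = sum(sigma0 @ q @ e for q in q0) / 4
    app1_spec = sum(sigma0 @ q @ e for q in q1) / 4
    print(app0_spec, app1_spec)   # matches the original domain values (4.63)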

Figure 4.2: Weighted diagonal trellises of the binary (4,2) linear block code used for computing the spectral coefficients (a) Qs(u2 = 0|v) and (b) Qs(u2 = 1|v); s = 0, 1, 2, 3.

4.5 Simulation Results

To provide a sample of the performance of codes used for transmission over a channel described by a stochastic automaton in conjunction with Procedure 4.2, computer


simulations were carried out using MATLAB®. As there are many combinations of parameter values which can be chosen, and many binary linear block codes in existence, only some are presented here. Note that Procedure 4.1 would have made the same decoding decisions and hence produced the same results. However, as was discussed in Section 4.3, more storage space would be required if the original domain were used.

4.5.1 Description of parameter values in these simulations

In these simulations, the binary (7,4) Hamming code in systematic form as described in (3.104) is used for transmission over a GEC. The model of this channel has four parameters which may vary independently. A complete analysis of APP decoding of this code would then require an investigation in four different dimensions. A small subset of the parameter space is selected for simulation, defined by the constraints

P ∈ {10⁻⁷, 10⁻⁶, 10⁻⁵, 3×10⁻⁵, 10⁻⁴, 3×10⁻⁴, . . . , 1},
Q ∈ {0.01, 0.3},
pG ∈ {10⁻⁴, 10⁻³, 10⁻², 10⁻¹},
pB ∈ {0.1, 0.5}. (4.64)

After decoding a large number of received words for each choice of the four parameter values, the BER was calculated. The results are displayed in Fig. 4.3 for the case Q = 0.01, and Fig. 4.4 for the case Q = 0.3. To further distinguish the results, the BERs for situations where pB is set to 0.1 are presented in subfigure (a); those for situations where pB is set to 0.5 are presented in subfigure (b).
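For concreteness, the error patterns behind such a simulation can be generated directly from the channel parameters. The fragment below is a minimal sketch of a GEC error-sequence generator under the conventions used here (state G with crossover pG, state B with crossover pB, transitions P from G to B and Q from B to G); the function name and sampling details are illustrative only and do not reproduce the exact simulation code used for these results.

    import numpy as np

    def gec_errors(n, P, Q, pG, pB, rng):
        # Draw the initial state from the stationary distribution sigma0.
        bad = rng.random() < P / (P + Q)
        errors = np.zeros(n, dtype=int)
        for j in range(n):
            errors[j] = rng.random() < (pB if bad else pG)     # crossover?
            bad = (rng.random() >= Q) if bad else (rng.random() < P)
        return errors

    # A received word is then v = (u + gec_errors(...)) mod 2, which is
    # decoded with Procedure 4.2 and compared against u to estimate the BER.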

4.5.2 Observations from the simulations

As expected, for each value of the crossover probability pG in the ‘good’ state G, the

performance degrades as the value of the state transition probability P is increased,

since this corresponds to an increased probability that the channel is in the ‘bad’

state B. In addition, the BER decreases as pG decreases.

For each value of pG, an error floor is reached at a different value of P. This is because as P approaches zero, the GEC is in essence a BSC with crossover probability pG, which produces a particular BER when used in conjunction with this code and decoding procedure. The differing error floors exist due to the different values of pG relative to P. The curves for pG = pB = 0.1 in Figs. 4.3(a) and 4.4(a) are horizontal: since the crossover probability is equal in both states, the BER is independent of P, and the channel is in fact a BSC.


Figure 4.3: Performance of the (7,4) Hamming code on a GEC with Q = 0.01 and (a) pB = 0.1, (b) pB = 0.5.


Figure 4.4: Performance of the (7,4) Hamming code on a GEC with Q = 0.3 and (a) pB = 0.1, (b) pB = 0.5.


independent of P . The channel is in fact a BSC. Finally, note that the channel

models in Fig. 4.3, where the state transition probability Q = 0.3, reach their error

floor at higher values of P compared to the corresponding models in Fig. 4.4, where

Q = 0.01. This supports the idea that the average fade to connection time ratio x

provides a better idea of the channel’s behaviour than P or Q individually.

In summary, Figs. 4.3 and 4.4 demonstrate that Procedure 4.2 produces the

expected results in terms of the parameters of binary GECs when used with the

(7,4) Hamming code. The link to a BSC as a degenerate case of a GEC was noted,

as was the importance of the ratio between state transition probabilities P and Q.

4.6 Summary

This chapter has provided two solutions to the problem of APP decoding for binary linear block codes over channels described by stochastic automata. As in the memoryless channel model situation from Chapter 3, one solution used the original domain, whilst the other used the spectral domain. However, these solutions are more advanced than their predecessors because they use matrix probabilities to consider all possible state transitions introduced by the discrete channel with memory.

In the original domain approach, the trellis section matrices were weighted by

one of four possible matrix probabilities representing the stochastic properties of

the channel. For a GEC, there are only two such matrix probabilities due to the

symmetry of the constituent BSCs. The result was a collection of weighted trellis

section matrices that could be multiplied together in the right combinations to

produce matrix representations of complete weighted trellises. It was shown how

principal leading submatrices could be used to calculate the necessary APPs from

which the decoding decisions could be made.

After application of the Walsh-Hadamard transform, expressions for weighted

spectral matrices were obtained. A spectral domain trellis can be constructed from

these weighted spectral matrices, however it is easier to work with the matrices

themselves. In this case, strong ties with the dual codewords were observed, and

a change in notation involving the sum and difference of the matrix probabilities

was made to reflect these ties. The APPs for the spectral domain procedure were

calculated by accumulating the conditional spectral coefficients. A decoding decision

in each information bit position was made by selecting the bit which optimised the

APP.

The computational complexity and storage requirements of both approaches were


then analysed. It was shown that the spectral domain method required less storage

space for its execution compared to that of the original domain, and was ideal for

use with block codes of high rate, since its computational complexity was related

to the dimension of the dual code. Many other APP decoding algorithms have

been developed for memoryless channels, whereas the procedures presented in this

chapter were designed for channels with memory.

A decoding example was also provided in both domains to illustrate the various

calculations which are required, and to demonstrate the equality of the two solutions

obtained. Finally, simulations of the transmission of information encoded by a

Hamming code over various GEC models were carried out. The BER performance

observed suggested that increases in either of the crossover probabilities or in the

probability of transition from the ‘good’ state G to the ‘bad’ state B would degrade

the performance of the code. By contrast, increasing the probability of transition

from the ‘bad’ state B to the ‘good’ state G appeared to improve the performance.


Chapter 5

APP Decoding on Non-binary Channels with Memory

The formulation of APP decoding given in Chapter 4 only works for situations where

binary data is being transmitted on the channel. Once the order of the field from

which the symbols are to be selected for transmission rises above two, that decoding

methodology will cease to function. Since more branches on the original domain and

diagonal trellises are required due to the larger number of choices for each of the

transmitted symbols, the elementary trellis and spectral matrices must accordingly

be made larger in size. Additionally, the bipartite “error” or “non-error” model of

the matrix probabilities is no longer adequate. However, if the channel is symmetric,

this problem can be dealt with in a simple way.

Although codes over GF (p), for p an odd prime, are not as common in practice as

binary codes, they do promote the use of symbols containing a higher resolution of

information. For example, the International Standard Book Number (ISBN) system

for library items is based on a code over GF (11). Also, the (11,6) Golay code [66]

as described in (3.108) is one of the few perfect linear codes and is constructed over

GF (3). In addition, Hamming codes constructed over GF (p) are perfect. There is therefore a need to find good decoding algorithms for non-binary codes. However, most of the algorithms already developed have been for memoryless channels. For example, [67] presents MAP decoding methods for non-binary block and convolutional codes on a time-discrete memoryless channel. There is a definite lack of powerful decoding methods for channels with memory such as the GEC.

This chapter is, in a sense, a non-binary analogue of Chapter 4. Section 5.1 gives

the description of the problem which is solved in this chapter. The majority of the

theory is presented in Section 5.2. Section 5.2.1 defines the weighted trellis matrices

for a code over GF (p) on a channel with memory, which leads to the specification of


an APP decoding algorithm for the original domain. Similarly, in Section 5.2.2 the

formulation of the weighted spectral matrices leads to an APP decoding algorithm

for the spectral domain. Further information is given in Section 5.3 about some

of the probabilities involved in the necessary calculations for the spectral domain

algorithm. In Section 5.4, the computational complexity and storage requirements

of the algorithms are discussed. Examples of applying the algorithms to specific

non-binary codes over a finite state channel are given in Section 5.5. Additionally, Section 5.6 provides results of simulations of the algorithms developed in this

chapter. Finally, Section 5.7 summarises the major findings of this study into APP

decoding of non-binary codes for channels with memory.

Therefore, the major contributions of this chapter are as follows:

• Description of an APP decoding procedure for linear block codes over GF (p)

in conjunction with a channel with memory, using the original domain.

• The spectral domain equivalent of the aforementioned procedure.

• Proof of a result regarding probability theory and the conditional spectral

coefficients.

• An analysis of the requirements of both procedures in terms of execution time

and memory usage.

• Computer-based simulations of the spectral domain procedure as applied to

several non-binary linear block codes over a non-binary GEC.

5.1 Problem Statement

Assume C is a systematic (n, k) linear block code over GF (p) defined by parity

check matrix

H = [h1, h2, . . . , hn], (5.1)

where the jth column of H is given by the vector

hj = [h_{n−k−1,j}, h_{n−k−2,j}, . . . , h_{0,j}]^T ∈ [GF (p)]^{n−k}. (5.2)

Let the channel over which the data encoded by C is transmitted be described by

the stochastic automaton

D = (D,σ0), (5.3)


where the stochastic sequential machine D has identical input and output sets equal to {0, 1, . . . , p−1}, a set S of S states, and p² matrix probabilities D(vj|uj) of size S×S for a transmitted symbol uj and received symbol vj. Furthermore, σ0 is a vector of length S which represents the stationary state distribution of the automaton. According to (2.107), for a received word v = [v1, v2, . . . , vn] the APP decoding decision ûi for the ith symbol of the transmitted codeword u is given by

ûi = arg max_{g∈GF(p)} { σ0 [ Σ_{u∈C, ui=g} ∏_{j=1}^{n} D(vj|uj) ] e }, (5.4)

for i ∈ {1, 2, . . . , k}, where e is an all-ones column vector of length S. If the DMC corresponding to each of the S states is symmetric, then it was shown in Section 2.2.2 that the matrix probabilities D(vj|uj) depend only on whether uj and vj are equal or not. Specifically, they may be written as

D(vj|uj) = D_{vj⊖uj} ∈ {D0, Dǫ}. (5.5)

The value of ûi is found for each position i ∈ {1, 2, . . . , k} by comparing the p different APP values which result from the matrix arithmetic within (5.4) and selecting the estimate which results in the largest APP value. Again, this may be achieved in either the original or spectral domain.

5.2 Non-binary APP Decoding

The decoding methods presented in this section combine two aspects. One is

the theory of matrix representations for non-binary codes, which was presented

in Chapter 3. The other is the two decoding strategies for channels with memory,

which were presented in Chapter 4. Firstly, in order to reach a decoding decision for

each information symbol position, matrix representations of all viable paths through

the original domain trellis for all values of information symbol are found. After each

matrix representation of a viable path is weighted by its probability of occurring,

a decoding decision for each information symbol position is reached. Then in the

spectral domain, the matrix representations are diagonalised using a similarity transformation. After calculation of the conditional spectral coefficients, conversion to

the original domain APPs is simple.


5.2.1 Original domain

The first step in APP decoding using the original domain for a non-binary linear

block code when the channel has memory is the creation of a weighted matrix

representation of each trellis section. The correct combinations of these matrix

representations must then be multiplied in order to arrive at values for the APPs,

from which decoding decisions can be made.

Weighted trellis matrices

If the jth transmitted symbol is uj, then a matrix representation of the jth trellis section is the p^{n−k} × p^{n−k} matrix Mhj(uj), as defined in (3.21) and (3.22). However, the trellis section matrices must be weighted by the relevant matrix probabilities for the channel model. Suppose the set of possible states for the stochastic sequential machine D is

S = {Sm | m ∈ {1, 2, . . . , S}}, (5.6)

with a symmetric DMC corresponding to each state. A weighted trellis matrix for the jth trellis section is then obtained by taking the Kronecker product of the trellis matrix representation for column hj and transmitted symbol uj with the relevant matrix probability based on the sent symbol uj and received symbol vj. That is,

Uhj(uj) = Mhj(uj) ⊗ D_{vj⊖uj}, (5.7)

where by (2.53),

D_{vj⊖uj} = { D0 if uj = vj,
              Dǫ if uj ≠ vj. (5.8)

By analogy with the binary case, a weighted trellis section matrix irrespective of symbol uj is given by the sum of the weighted trellis section matrices Uhj(uj) over all uj ∈ GF (p). This relationship may be expressed as

Uhj = Σ_{g=0}^{p−1} Mhj(g) ⊗ D_{vj⊖g}. (5.9)

Using the same methodology for APP decoding as presented in Section 4.2.1 for

binary codes, the representation of the entire weighted trellis for the ith transmitted

symbol ui is the square matrix UH(ui) of dimension p^{n−k}S. This matrix may be


written as

UH(ui) = ∏_{j=1}^{i−1} Uhj · Uhi(ui) · ∏_{j=i+1}^{n} Uhj

       = ∏_{j=1}^{i−1} [ Σ_{uj=0}^{p−1} Mhj(uj) ⊗ D_{vj⊖uj} ] × [ Mhi(ui) ⊗ D_{vi⊖ui} ] × ∏_{j=i+1}^{n} [ Σ_{uj=0}^{p−1} Mhj(uj) ⊗ D_{vj⊖uj} ]. (5.10)

Determining the a posteriori probabilities in the original domain

In order to obtain a length-p^{n−k} vector of APPs, the p-ary analogue of (4.13) is used:

P(ui|v) = (τ0 ⊗ σ0) · UH(ui) · (I_{p^{n−k}} ⊗ e), (5.11)

where

τ0 = [1, 0, . . . , 0] (5.12)

denotes a vector of length p^{n−k}. In this way, only paths commencing at the 0th node of the trellis are used. The vector P(ui|v) consists of entries Pt(ui|v) for t = 0, 1, . . . , p^{n−k}−1, with the tth entry denoting the probability that the ith transmitted symbol was ui, given that a word v was received and that the encoding was performed by mapping a p-ary vector of k information symbols onto length-n words of the tth coset Vt. The required APP for encoding onto the 0th coset, which is C itself, is thus P0(ui|v). This is the only situation which will be considered here. Then (5.11) can be rewritten in order to calculate P0(ui|v) as

P0(ui|v) = σ0 · [UH(ui)]^{(S)} · e. (5.13)

The estimate ûi for the ith transmitted symbol is the element ui of GF (p) resulting in the highest value of P0(ui|v), so that

ûi = arg max_{ui∈GF(p)} {P0(ui|v)}. (5.14)

APP decoding procedure in the original domain

The following procedure can be used to perform APP decoding in the original domain

for an (n, k) linear block code over GF (p) and a non-binary channel described by a

stochastic automaton.


Procedure 5.1. Given is an (n, k) linear block code C in standard form over GF (p), to be used on a channel which is described by a stochastic automaton containing a stochastic sequential machine which has S < ∞ states. A p-ary DMC in standard form corresponds to each of the states. The linear block code C shall be defined by parity check matrix H. A codeword u is transmitted over the channel and a word v is received. APP decoding in the original domain consists of the following steps.

Step 1. Use the state transition and crossover probabilities to find the stationary state distribution vector σ0 and the two S×S matrix probabilities D0 and Dǫ.

Step 2. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ GF (p), compute the trellis section matrix Mhj(uj) for column hj and transmitted symbol uj using (3.22).

Step 3. ∀j ∈ {1, 2, . . . , n} and ∀uj ∈ GF (p), compute the weighted trellis section matrix Uhj(uj) using (5.7).

Step 4. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ GF (p), compute the weighted trellis matrix UH(ui) for the full weighted trellis using (5.10).

Step 5. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ GF (p), let [UH(ui)]^{(S)} be the Sth principal leading submatrix of UH(ui) and calculate the a posteriori probability P0(ui|v) using (5.13).

Step 6. Derive an estimate ûi for the transmitted symbol ui of codeword u at each position i ∈ {1, 2, . . . , k} using (5.14).
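For small codes, the quantity computed by Procedure 5.1 can be cross-checked against the defining sum (5.4) directly. The following Python fragment is a brute-force sketch of that sum, assuming a systematic generator matrix G over GF (p) and the two matrix probabilities D0 and Dǫ of a symmetric channel; it enumerates the p^k codewords instead of building the weighted trellis matrices, so it is suitable for verification only, not as an implementation of the procedure itself.

    import itertools
    import numpy as np

    def app_decode_original(G, p, v, D0, De, sigma0):
        k, n = G.shape
        e = np.ones(D0.shape[0])
        apps = np.zeros((k, p))
        for info in itertools.product(range(p), repeat=k):
            u = np.array(info) @ G % p            # codeword for this message
            prob = sigma0.copy()
            for j in range(n):                    # matrix product of (5.4)
                prob = prob @ (D0 if u[j] == v[j] else De)
            prob = prob @ e
            for i in range(k):                    # u[i] = info[i], systematic code
                apps[i, u[i]] += prob
        return apps.argmax(axis=1), apps          # decisions (5.14), APPs P0(ui|v)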

Procedure 5.1 performs the same task as Procedure 4.1 for a code over an alphabet of size p, rather than an alphabet of size 2. The major differences are that the matrix probability D1 corresponding to an error event has been replaced by the matrix probability Dǫ, and the elementary trellis matrices are now of size p × p, causing the overall trellis structure to be correspondingly larger. As was shown for binary codes in Chapter 4, the decoding process can also be carried out in the spectral domain. This approach is described in the next subsection.

5.2.2 Spectral domain

Here, the APPs are calculated through the intermediate step of determining the

conditional spectral coefficients. That is, diagonalised matrix representations of the

original domain are weighted by the necessary probabilities of symbol error and

non-error. Products of combinations of these diagonal matrices, corresponding to

the entire trellis, are then calculated. Finally, a matrix transformation back to the


original domain delivers the APPs, through which the decoding decisions may be

reached.

Weighted spectral matrices

From Chapter 3, the spectral representation of the trellis section corresponding to

the jth column hj of H under the postulate that uj is the jth transmitted symbol is

given by

Λhj(uj) = diag{w^{uj·u⊥s,j}}, (5.15)

where w = e^{−2πi/p} is a complex pth root of unity. Note that (5.15) corresponds to

the original domain representation of the jth trellis section after application of the similarity transformation W_{p^{n−k}}, resulting in a diagonal matrix. In other words,

Λhj(uj) = W⁻¹_{p^{n−k}} Mhj(uj) W_{p^{n−k}}, (5.16)

where W_{p^{n−k}} is the complex generalisation of the binary Walsh-Hadamard transformation matrix, and is defined by (3.37) and (3.43). Recall that the inverse of the matrix W_{p^{n−k}} is (1/p^{n−k}) W^H_{p^{n−k}}, as given in (3.44). The spectral matrices for each

trellis section must then be weighted by the relevant matrix probabilities D(vj|uj).

This can be seen by considering the transformed versions of the weighted trellis section matrices Uhj(uj) of the original domain. Following the same strategy as was

adopted in Section 4.2.2 for binary channels with memory, applying the generalised

Walsh-Hadamard transform to the weighted trellis section matrix for the jth trellis

section and transmitted symbol uj results in the weighted spectral section matrix

Θhj(uj) = (W⁻¹_{p^{n−k}} ⊗ IS) · Uhj(uj) · (W_{p^{n−k}} ⊗ IS)
        = W⁻¹_{p^{n−k}} Mhj(uj) W_{p^{n−k}} ⊗ D_{vj⊖uj}
        = Λhj(uj) ⊗ D_{vj⊖uj}
        = diag{w^{uj·u⊥s,j} D_{vj⊖uj}}. (5.17)

A weighted spectral section matrix irrespective of the transmitted symbol is given by the sum, over all uj ∈ GF (p), of the weighted trellis section matrices defined in (5.17). Therefore,

Θhj = Σ_{uj∈GF(p)} Θhj(uj) = diag{ Σ_{uj=0}^{p−1} w^{uj·u⊥s,j} D_{vj⊖uj} }. (5.18)

The complete spectral matrix is then constructed by multiplying each of the weighted

spectral section matrices in sequence to create a matrix product. When decoding


the ith symbol, the ith weighted spectral section matrix used in this product must be taken with respect to transmitted symbol ui. The matrix representation of the entire diagonal trellis can be written as

ΘH(ui) = ∏_{j=1}^{i−1} Θhj · Θhi(ui) · ∏_{j=i+1}^{n} Θhj = diag{Qs(ui|v)}, (5.19)

where the block diagonal submatrices of ΘH(ui) can be expressed as

Qs(ui|v) = ∏_{j=1}^{i−1} ( Σ_{uj=0}^{p−1} w^{uj·u⊥s,j} D_{vj⊖uj} ) × ( w^{ui·u⊥s,i} D_{vi⊖ui} ) × ∏_{j=i+1}^{n} ( Σ_{uj=0}^{p−1} w^{uj·u⊥s,j} D_{vj⊖uj} ). (5.20)

Some simplifications to (5.20) can be made by considering (5.8). Firstly,

Σ_{uj=0}^{p−1} w^{uj·u⊥s,j} D_{vj⊖uj} = w^{vj·u⊥s,j} D0 + [ w^{0·u⊥s,j} + w^{1·u⊥s,j} + . . . + w^{(p−1)·u⊥s,j} − w^{vj·u⊥s,j} ] Dǫ. (5.21)

The expression in (5.21) has a different value depending on whether u⊥s,j is zero or nonzero. For the cases where u⊥s,j = 0, the result can be expressed as

Σ_{uj=0}^{p−1} w^{uj·u⊥s,j} D_{vj⊖uj} = D0 + (p − 1)Dǫ. (5.22)

Before considering the case of u⊥s,j ≠ 0, observe the following lemma concerning complex roots of unity.

Lemma 5.2.1. For a complex pth root of unity w, Σ_{g=1}^{p−1} w^g = −1.

Proof. The sum may be written explicitly as

Σ_{g=1}^{p−1} w^g = w + w² + . . . + w^{p−1}. (5.23)

Then, multiplication by (1 − w) results in

(1 − w) Σ_{g=1}^{p−1} w^g = w + w² + . . . + w^{p−1} − w² − w³ − . . . − w^p = w − w^p = w − 1, (5.24)

which in turn implies that

Σ_{g=1}^{p−1} w^g = (w − 1)/(1 − w) = −1. (5.25)
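A quick numeric sanity check of this identity, given purely for illustration, for instance with p = 5:

    import numpy as np

    p = 5
    w = np.exp(-2j * np.pi / p)              # complex pth root of unity
    print(sum(w ** g for g in range(1, p)))  # approximately -1 + 0j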

For the case u⊥s,j ≠ 0, applying Lemma 5.2.1 to evaluate (5.21) results in

Σ_{uj=0}^{p−1} w^{uj·u⊥s,j} D_{vj⊖uj} = w^{vj·u⊥s,j} D0 + [ (w⁰ + w¹ + . . . + w^{p−1}) − w^{vj·u⊥s,j} ] Dǫ = w^{vj·u⊥s,j} (D0 − Dǫ), (5.26)

since the bracketed sum over all pth roots of unity vanishes by Lemma 5.2.1.

Reviewing the structure of (5.20) with the benefit of (5.22) and (5.26) reveals that, for n ≥ 3, changing the notation to reflect the connection with the dual code makes it more compact. This alternative notation was first discussed in (2.56) and (2.57). Explicitly, set

D = D0 + (p−1)Dǫ,
∆ = D0 − Dǫ. (5.27)

It then follows from (5.22), (5.26) and (5.27) that

Σ_{uj=0}^{p−1} w^{uj·u⊥s,j} D_{vj⊖uj} = { D if u⊥s,j = 0,
                                          w^{vj·u⊥s,j} ∆ if u⊥s,j ≠ 0,
                                      = w^{vj·u⊥s,j} [ δ_{u⊥s,j,0} D + (1 − δ_{u⊥s,j,0}) ∆ ], (5.28)

where δ_{a,b} denotes the Kronecker delta, which has value 1 if a and b are equal, and value 0 otherwise.

The central term w^{ui·u⊥s,i} D_{vi⊖ui} of (5.20) has two distinct values depending on whether ui and vi are equal or not. Using the notation in (5.27), the term may be written as

w^{ui·u⊥s,i} D_{vi⊖ui} = { (w^{ui·u⊥s,i}/p) [D + (p − 1)∆] if ui = vi,
                           (w^{ui·u⊥s,i}/p) (D − ∆) if ui ≠ vi,
                       = (w^{ui·u⊥s,i}/p) [D + (δ_{ui,vi} p − 1)∆]. (5.29)

Finally, substituting (5.28) and (5.29) into (5.20) produces the conditional spectral

coefficient matrices


Qs(ui|v) = ∏_{j=1}^{i−1} w^{vj·u⊥s,j} [ δ_{u⊥s,j,0} D + (1 − δ_{u⊥s,j,0}) ∆ ]
           × (w^{ui·u⊥s,i}/p) [ D + (δ_{ui,vi} p − 1) ∆ ]
           × ∏_{j=i+1}^{n} w^{vj·u⊥s,j} [ δ_{u⊥s,j,0} D + (1 − δ_{u⊥s,j,0}) ∆ ]. (5.30)

These are converted to the conditional spectral coefficients Qs(ui|v), as a weighted average over the S initial states, using the relationship

Qs(ui|v) = σ0 · Qs(ui|v) · e. (5.31)

The set of p^{n−k} such scalars is collected together to form the vector of conditional spectral coefficients

Q(ui|v) = [ Q0(ui|v), Q1(ui|v), . . . , Q_{p^{n−k}−1}(ui|v) ]. (5.32)

Determining the a posteriori probabilities in the spectral domain

In order to convert between the APPs of the original domain and the spectral domain coefficients, the first row of the complex generalisation of the Walsh-Hadamard matrix is required.

Lemma 5.2.2. The first row or column of the complex Walsh-Hadamard transform matrix W_{p^{n−k}} consists entirely of ones, and hence the sum of the entries in this row or column is p^{n−k}.

Proof. This may be proved inductively. The base case is given by the definition of W_p in (3.37). For d ∈ Z⁺, assume that the first row or column of W_{p^d} consists entirely of ones. Then, (3.43) shows that the first row or column of W_{p^{d+1}} contains only ones. Thus, by the Principle of Mathematical Induction, for any value of n−k, the first row or column of W_{p^{n−k}} consists entirely of ones, and the sum of its entries is p^{n−k}.

Rearranging (5.17) and (5.19) before substituting into (5.11) gives

P(ui|v) = (τ0 ⊗ σ0)(W_{p^{n−k}} ⊗ IS) · ΘH(ui) · (W⁻¹_{p^{n−k}} ⊗ IS)(I_{p^{n−k}} ⊗ e). (5.33)

By properties of the Kronecker product, it follows that

(τ0 ⊗ σ0)(W_{p^{n−k}} ⊗ IS) = ι0 ⊗ σ0, (5.34)

where

ι0 = [1, 1, . . . , 1]. (5.35)

Additionally, the rightmost two products of (5.33) may be simplified as

(W⁻¹_{p^{n−k}} ⊗ IS)(I_{p^{n−k}} ⊗ e) = W⁻¹_{p^{n−k}} ⊗ e. (5.36)

Further consideration of (5.19), (5.34) and (5.36) in relation to (5.33) reveals that

P(ui|v) = (ι0 ⊗ σ0) · diag{Qs(ui|v)} · (W⁻¹_{p^{n−k}} ⊗ e)
        = (1/p^{n−k}) (ι0 ⊗ σ0) · diag{Qs(ui|v)} · (W^H_{p^{n−k}} ⊗ e) (5.37)

via application of (3.44). Looking at the structure of the vector of conditional spectral coefficients described in (5.31) and (5.32) allows (5.37) to be rewritten as

P(ui|v) = (1/p^{n−k}) Q(ui|v) W^H_{p^{n−k}}. (5.38)

Since the APPs which are required for decoding are P0(ui|v), for ui ∈ GF (p), the first element of the vector in (5.38) is extracted with the assistance of (5.35), resulting in

P0(ui|v) = (1/p^{n−k}) Σ_{s=0}^{p^{n−k}−1} Qs(ui|v). (5.39)

A decision rule to determine the estimate ûi of the ith transmitted symbol ui in the codeword u can be formulated by comparing the sums of the conditional spectral coefficients. This decision rule may be expressed as

ûi = arg max_{g∈GF(p)} { Σ_{s=0}^{p^{n−k}−1} Qs(ui = g|v) }. (5.40)

APP decoding procedure in the spectral domain

Procedure 5.2 is an alternative to Procedure 5.1 which uses the spectral domain. It

requires knowledge of the parity check matrix H, the received word v, and the state

transition and crossover probabilities of the channel model.

Procedure 5.2. Given is an (n, k) linear block code C in standard form over GF (p), to be used on a channel which is described by a stochastic automaton containing a stochastic sequential machine which has S < ∞ states. A p-ary DMC in standard form corresponds to each of the states. The linear block code C shall be defined by


parity check matrix H. A codeword u is transmitted over the channel and a word v

is received. APP decoding in the spectral domain consists of the following steps.

Step 1. Use the state transition and crossover probabilities to find the stationary state distribution vector σ0 and the matrix probabilities D0 and Dǫ. Formulate the state transition matrix D and the difference matrix ∆ in terms of the matrix probabilities D0 and Dǫ using (5.27).

Step 2. ∀s = dec(s) ∈ {0, 1, . . . , p^{n−k} − 1}, compute the dual codewords

u⊥s = s · H = [u⊥s,1, u⊥s,2, . . . , u⊥s,n] ∈ C⊥, (5.41)

which define the arrangement of the D and ∆ matrices in (5.30).

Step 3. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ GF (p), compute the complete set of conditional spectral coefficients Qs(ui|v) using (5.30) and (5.31).

Step 4. ∀i ∈ {1, 2, . . . , k} and ∀ui ∈ GF (p), accumulate all p^{n−k} of these coefficients to compute the APPs P0(ui|v) using (5.39).

Step 5. Use the decision rule in (5.40) to determine an estimate ûi of the ith transmitted symbol ui, for each position i ∈ {1, 2, . . . , k}.
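As with the original domain, a compact sketch helps to make the data flow of these steps concrete. The Python fragment below follows Steps 1 to 5 directly, assuming the same D0, Dǫ and σ0 conventions as before; it trades efficiency for readability, rebuilding the section factors (5.28) for every dual codeword, and is illustrative rather than an implementation used in this thesis.

    import itertools
    import numpy as np

    def app_decode_spectral(H, p, v, D0, De, sigma0):
        r, n = H.shape                       # r = n - k rows of H
        k = n - r
        w = np.exp(-2j * np.pi / p)          # complex pth root of unity
        D, Dl = D0 + (p - 1) * De, D0 - De   # Step 1, via (5.27)
        e = np.ones(D0.shape[0])
        apps = np.zeros((k, p))
        for s in itertools.product(range(p), repeat=r):
            dual = np.array(s) @ H % p       # Step 2: dual codeword (5.41)
            # Section factors (5.28) for the positions j != i.
            factor = [w ** (v[j] * dual[j]) * (D if dual[j] == 0 else Dl)
                      for j in range(n)]
            for i in range(k):
                for g in range(p):
                    # Central factor (5.29) for position i and symbol g.
                    centre = (w ** (g * dual[i]) / p) * (
                        D + (p * (g == v[i]) - 1) * Dl)
                    M = sigma0.astype(complex)
                    for j in range(n):       # Step 3: product (5.30)
                        M = M @ (centre if j == i else factor[j])
                    # Imaginary parts cancel in the sum (5.39).
                    apps[i, g] += (M @ e).real
        return apps.argmax(axis=1), apps / p ** r   # Step 5 (5.40), APPs (5.39)

The returned decisions should agree with those of the brute-force original domain sketch given after Procedure 5.1.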

5.3 Properties of the Conditional Spectral Coefficients

In order to provide more information concerning the properties of the vectors of

conditional spectral coefficients Q(ui|v), for ui ∈ GF (p), it is shown in this section

that the first entries Q0(ui|v) of these vectors form a probability distribution.

Given (5.38), it is sensible to examine the complex Walsh-Hadamard transform matrix W_{p^{n−k}} in more detail. The following results give information about each of

the rows of the complex conjugate of this matrix. Initially, the first row is considered.

Then, the sum of the entries in all other rows is investigated.

Corollary 5.3.1. The first row or column of W*_{p^{n−k}}, the complex conjugate of W_{p^{n−k}}, consists entirely of ones and the sum of its entries is p^{n−k}.

Proof. Note that 1 = 1 + 0i, so each one is its own complex conjugate. The result follows in the same way as Lemma 5.2.2.

Lemma 5.3.1. The sum of the entries in any row or column except the first of the complex Walsh-Hadamard transform matrix W_{p^{n−k}} is zero.


Proof. This can be proved in several ways. One proof is given in Appendix B.
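Both Lemma 5.2.2 and Lemma 5.3.1 are easy to confirm numerically, under the assumption that the base matrix is W_p = [w^{rc}] as in (3.37) and that W_{p^{d+1}} = W_p ⊗ W_{p^d} as in (3.43):

    import numpy as np

    p, d = 3, 2                                  # build W for p^d = 9
    w = np.exp(-2j * np.pi / p)
    Wp = np.array([[w ** (r * c) for c in range(p)] for r in range(p)])
    W = Wp
    for _ in range(d - 1):
        W = np.kron(Wp, W)
    print(np.allclose(W[0, :], 1))               # Lemma 5.2.2: first row all ones
    print(np.allclose(W[1:, :].sum(axis=1), 0))  # Lemma 5.3.1: other rows sum to 0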

Another minor result concerning the sum of the entries in the complex conjugate

of a vector is required.

Lemma 5.3.2. Let x ∈ C^{p^{n−k}} be a vector such that the sum of its entries is zero. Then the sum of the entries in x* is also zero.

Proof. Let the entries of x be x_r = a_r + i b_r, for 1 ≤ r ≤ p^{n−k}. Then

Σ_{r=1}^{p^{n−k}} x_r = 0 ⇒ Σ_{r=1}^{p^{n−k}} a_r = 0 and Σ_{r=1}^{p^{n−k}} b_r = 0
                        ⇒ Σ_{r=1}^{p^{n−k}} a_r − i Σ_{r=1}^{p^{n−k}} b_r = 0
                        ⇒ Σ_{r=1}^{p^{n−k}} x*_r = 0. (5.42)

Using Lemma 5.3.2 in conjunction with Lemma 5.3.1 shows that the sum of the entries in every row and column except the first of W*_{p^{n−k}} is zero. This leads to a proof of an important result.

Theorem 5.3.1. The first elements Q0(ui = g|v) of the conditional spectral coefficient vectors Q(ui = g|v), for g ∈ GF (p), form a probability distribution.

Proof. It is first shown that the sum of the first entries of the conditional spectral

coefficient vectors Q(ui = g|v) over ui ∈ GF (p) is one. Then it is proven that

0 ≤ Q0(ui = g|v) ≤ 1 for each g ∈ GF (p).

Clearly, considering the ith transmitted symbol ui,

P (ui = 0|v) + P (ui = 1|v) + . . . + P (ui = p−1|v) = 1. (5.43)

In addition, each of the APPs P (ui = g|v) assumes an encoding using exactly one of the cosets. This is normally, but not necessarily, performed using the code itself. Therefore,

P (ui = g|v) = Σ_{s=0}^{p^{n−k}−1} Ps(ui = g|v). (5.44)


Then, combining (5.43) and (5.44) gives

Σ_{g=0}^{p−1} Σ_{s=0}^{p^{n−k}−1} Ps(ui = g|v) = 1. (5.45)

Examining the vector P(ui|v) of APPs, it follows from (5.38) and the symmetry of W^H_{p^{n−k}} that

P(ui = g|v) = (1/p^{n−k}) Q(ui = g|v) · W*_{p^{n−k}}. (5.46)

Access the entry in the rth row and cth column of the complex conjugate of the generalised Walsh-Hadamard matrix using the notation

W*_{p^{n−k}} = [φ_{r,c}]_{p^{n−k}×p^{n−k}}. (5.47)

If (5.46) is expanded and substituted into (5.45), the result may be stated as

(1/p^{n−k}) Σ_{g=0}^{p−1} Σ_{c=1}^{p^{n−k}} Σ_{r=1}^{p^{n−k}} Q_{r−1}(ui = g|v) φ_{r,c} = 1. (5.48)

Expanding and then simplifying (5.48) produces

(1/p^{n−k}) [ Σ_{g=0}^{p−1} Q0(ui = g|v) Σ_{c=1}^{p^{n−k}} φ_{1,c} + Σ_{r=2}^{p^{n−k}} ( Σ_{g=0}^{p−1} Q_{r−1}(ui = g|v) Σ_{c=1}^{p^{n−k}} φ_{r,c} ) ] = 1. (5.49)

By Corollary 5.3.1, Lemma 5.3.1 and Lemma 5.3.2, it follows that

Σ_{c=1}^{p^{n−k}} φ_{r,c} = { p^{n−k} if r = 1,
                              0 if r ≠ 1. (5.50)

Applying (5.50) to (5.49) implies that

(1/p^{n−k}) [ Σ_{g=0}^{p−1} Q0(ui = g|v) p^{n−k} + Σ_{r=2}^{p^{n−k}} ( Σ_{g=0}^{p−1} Q_{r−1}(ui = g|v) · 0 ) ] = Σ_{g=0}^{p−1} Q0(ui = g|v) = 1. (5.51)

Thus, the sum of the first elements Q0(ui|v) of the conditional spectral coefficient

vectors Q(ui|v) over all ui ∈ GF (p) is one. The other condition which must be

checked is whether Q0(ui|v) lies between 0 and 1 inclusive for all values of ui. This

will hold if none of the coefficients are negative. Consider (5.30) for the case s = 0.


The result can be expressed as

Q0(ui|v) = (1/p) D^{i−1} [ D + (δ_{ui,vi} p − 1) ∆ ] D^{n−i}. (5.52)

Therefore by (5.27) and (5.31),

Q0(ui|v) = { σ0 D^{i−1} D0 D^{n−i} e if ui = vi,
             σ0 D^{i−1} Dǫ D^{n−i} e if ui ≠ vi. (5.53)

All values in the matrices in (5.53) are non-negative, because they are probabilities.

The same applies to the vector σ0, since its values correspond to the stationary state

probability distribution. Post-multiplication by e is simply a summation of these

non-negative values, and hence the overall product is a non-negative real number,

regardless of the value of ui. Combining

Q0(ui|v) ≥ 0 ∀ui ∈ GF (p) (5.54)

with (5.51) implies that

Q0(ui|v) ≤ 1 ∀ui ∈ GF (p). (5.55)

Therefore {Q0(ui|v) | ui ∈ GF (p)} constitutes a probability distribution.

In (5.45), it is important to note that the summation occurs over all s values from 0 to p^{n−k} − 1. It follows that

Σ_{g=0}^{p−1} P0(ui = g|v) ≤ 1. (5.56)

Although equality in (5.56) is technically possible, in general the set of APPs {P0(ui|v) | ui ∈ GF (p)} corresponding to the coset V0 does not sum to one and thus does not form a probability distribution. Therefore, decoding decisions can only be made by comparing the APPs P0(ui|v) for each and every ui ∈ GF (p).

5.4 Complexity Analysis

It is possible to analyse the requirements of Procedures 5.1 and 5.2 in terms of execution time and memory needed. Again the O notation will be used to provide an asymptotic upper bound on these requirements, as the values of the considered parameters approach +∞. Since these two procedures for non-binary codes are generalisations of their counterparts in Chapter 4 for binary codes, their computational complexities are generalisations of those derived in Section 4.3.


5.4.1 Computational complexity

An idea of the execution time for Procedures 5.1 and 5.2 is obtained by considering the number of multiplications required in terms of four parameters. These parameters are the number, n, of symbols per codeword, the number, k, of information symbols per codeword, the number, S, of states in the model, and the size, p, of the Galois field.

Computational complexity of the original domain approach

Assume that the decoding is performed in the original domain using only the APPs P0(ui|v). To decode a single word, one such probability must be calculated for all p values of ui, for each value of i ∈ {1, 2, . . . , k}. Each of these k · p APPs requires p^{k−1} matrix products to be summed. The structure of the parity check matrix H allows a choice of p possible symbols for the first k positions. Once these are chosen, there is only one choice for the codeword symbols in the remaining n−k positions. There would thus be p^k possible codewords or matrix products per trellis. However, the condition "u ∈ C, ui = g" forces a particular value to occur at one of these k positions, and thus the total is p^{k−1} matrix products. The computation of each such matrix product requires O(nS²) multiplications, since a row vector of length S is multiplied by an S×S matrix n times. Thus, the total complexity of Procedure 5.1 per codeword is O(knp^k S²).

Computational complexity of the spectral domain approach

The decoding of a word using Procedure 5.2 would require the equivalent of constructing k · p trellises. That is, p different trellises are needed for the decoding at each of the k information symbol positions. One scalar conditional spectral coefficient must be calculated for each of the p^{n−k} rows of each trellis. In turn, calculation of each scalar requires O(nS²) multiplications, since a row vector of length S is multiplied by an S×S matrix n times. Note that these numbers are complex rather than real. However, supposing these are stored as a real and an imaginary part, each complex multiplication is really a series of four multiplications of real numbers, which still requires O(nS²) multiplications per trellis overall. The total number of multiplications is therefore O(knp^{n−k+1} S²). Basic arithmetic then demonstrates the preference towards the spectral domain for high-rate codes, where

k > (n + 1)/2, (5.57)


whereas the original domain approach will be more efficient when the rate of the

code is low.

5.4.2 Storage requirements

The quantity Y of real number storage spaces needed for execution of Procedures

5.1 and 5.2 can be determined. These analyses are extrapolations of those for the

binary case in Section 4.3.

Storage requirements of the original domain approach

Calculation of the weighted trellis matrix UH(ui) for the ith transmitted symbol ui is required in order to extract the Sth principal leading submatrix [UH(ui)]^{(S)} of (5.13). Thus, a block of (p^{n−k} S)² real number storage spaces must be allocated, along with a temporary vector of length S whilst the multiplications are performed, as well as p spaces for the APPs P0(ui|v) before choosing the maximum of these as the decoded symbol. Therefore for Procedure 5.1,

Y = O(p^{2(n−k)} S²). (5.58)

Storage requirements of the spectral domain approach

When using the spectral domain with non-binary codes, it must be remembered that there will be times when multiplication of complex numbers is required. This can be accommodated by a vector of length 2S and a matrix holding 2S² real entries. Each of the p^{n−k} conditional spectral coefficients is added to the previous tally, so that two real numbers will be sufficient for storage of P0(ui|v) for each of the p possible values of ui. It follows that Procedure 5.2 requires memory to store

Y = O(S² + p) (5.59)

real numbers. This analysis advocates the use of the spectral domain over the original domain, particularly if the size of the field and/or the dual code is large.

5.5 Instructive Examples

Consider the task of using APP decoding to estimate the second transmitted symbol u2 when the word v = [1, 2, 2, 0] is received after transmission of a codeword u from the (4,2) linear block code over GF (3) described in (3.107). Assume a ternary GEC model is used, with state transition probabilities P and Q and crossover probabilities pG and pB in the 'good' and 'bad' states G and B, respectively. Suppose that

P = 10⁻⁴, Q = 10⁻², pG = 10⁻³ and pB = 10⁻². (5.60)

This results in matrix probabilities D0 and Dǫ, which have values

D0 = [ 1 − 2pG   0       ] [ 1 − P   P     ]  =  [ 0.9979   0.0000998 ]
     [ 0         1 − 2pB ] [ Q       1 − Q ]     [ 0.0098   0.9702    ] , (5.61)

Dǫ = D1 = D2 = [ pG   0  ] [ 1 − P   P     ]  =  [ 0.0009999   0.0000001 ]
               [ 0    pB ] [ Q       1 − Q ]     [ 0.0001      0.0099    ] . (5.62)

By (5.27), the matrix probabilities D and ∆ are calculated as

D = [ 0.9999   0.0001 ]      ∆ = [ 0.9969003   0.0000997 ]
    [ 0.01     0.99   ] ,        [ 0.0097      0.9603    ] . (5.63)

The stationary state distribution vector for this model may be expressed as

σ0 =[

Q

P+Q, P

P+Q

]

=[

100101, 1

101

]

, (5.64)

and, as the channel model consists of two states,

e =[

1 1]T

. (5.65)
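Before working through the two procedures by hand, it may help to see this setup numerically. The sketch below (NumPy is assumed) builds the matrices of (5.61)-(5.65) from the parameters in (5.60); the combinations $D = D_0 + 2D_\epsilon$ and $\Delta = D_0 - D_\epsilon$ reproduce the values quoted in (5.63), consistent with the cube-root-of-unity weighting of the matrix probabilities used in the spectral domain ($1 + w + w^2 = 0$).

```python
import numpy as np

# GEC parameters from (5.60).
P, Q = 1e-4, 1e-2
pG, pB = 1e-3, 1e-2

# Underlying state transition matrix of the two-state Markov chain.
T = np.array([[1 - P, P],
              [Q, 1 - Q]])

# Matrix probabilities (5.61)-(5.62) for the ternary GEC: D0 weights
# correct reception, D_eps either of the two error symbols.
D0 = np.diag([1 - 2 * pG, 1 - 2 * pB]) @ T      # (5.61)
D_eps = np.diag([pG, pB]) @ T                   # (5.62): D1 = D2 = D_eps

# Spectral-domain matrices (5.63): root-of-unity combinations.
D = D0 + 2 * D_eps        # equals T, cf. (5.63)
Delta = D0 - D_eps

# Stationary state distribution (5.64) and all-ones vector (5.65).
sigma0 = np.array([Q / (P + Q), P / (P + Q)])
e = np.ones(2)

print(D)       # [[0.9999, 0.0001], [0.01, 0.99]]
print(Delta)   # [[0.9969003, 0.0000997], [0.0097, 0.9603]]
```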

The calculations involved in performing this task using Procedures 5.1 and 5.2 are

shown in the following two subsections. It is demonstrated that both cases result in

the same answer.

5.5.1 Example of decoding in the original domain

Using (3.21), the set of nine elementary trellis matrices $M_h(u)$ for $h, u \in GF(3)$ may be listed in terms of the $3 \times 3$ circulant matrices as

$$M_0(0) = M_1(0) = M_2(0) = M_0(1) = M_0(2) = I_3, \qquad (5.66)$$
$$M_1(1) = M_2(2) = \mathrm{circ}(0, 1, 0), \qquad (5.67)$$
$$M_2(1) = M_1(2) = \mathrm{circ}(0, 0, 1). \qquad (5.68)$$


There are $n \cdot p = 12$ trellis section matrices to consider for the $j$th transmitted symbol $u_j$. The three such matrices relating to the first transmitted symbol $u_1$ may be determined as

$$M_{h_1}(0) = M_2(0) \otimes M_1(0) = I_9, \qquad (5.69)$$

$$M_{h_1}(1) = M_2(1) \otimes M_1(1) = \begin{bmatrix} 0&0&0&0&0&0&0&1&0 \\ 0&0&0&0&0&0&0&0&1 \\ 0&0&0&0&0&0&1&0&0 \\ 0&1&0&0&0&0&0&0&0 \\ 0&0&1&0&0&0&0&0&0 \\ 1&0&0&0&0&0&0&0&0 \\ 0&0&0&0&1&0&0&0&0 \\ 0&0&0&0&0&1&0&0&0 \\ 0&0&0&1&0&0&0&0&0 \end{bmatrix}, \qquad (5.70)$$

and

$$M_{h_1}(2) = M_2(2) \otimes M_1(2) = \begin{bmatrix} 0&0&0&0&0&1&0&0&0 \\ 0&0&0&1&0&0&0&0&0 \\ 0&0&0&0&1&0&0&0&0 \\ 0&0&0&0&0&0&0&0&1 \\ 0&0&0&0&0&0&1&0&0 \\ 0&0&0&0&0&0&0&1&0 \\ 0&0&1&0&0&0&0&0&0 \\ 1&0&0&0&0&0&0&0&0 \\ 0&1&0&0&0&0&0&0&0 \end{bmatrix}. \qquad (5.71)$$

Trellis matrices for the remaining sections may be calculated similarly.

The weighted trellis section matrices for codes over $GF(3)$ are given by

$$\begin{aligned}
U_{h_j}(u_j = 0) &= M_{h_j}(0) \otimes D(v_j|0) = I_9 \otimes D_{v_j}, \\
U_{h_j}(u_j = 1) &= M_{h_j}(1) \otimes D(v_j|1) = M_{h_j}(1) \otimes D_{v_j \ominus 1}, \\
U_{h_j}(u_j = 2) &= M_{h_j}(2) \otimes D(v_j|2) = M_{h_j}(2) \otimes D_{v_j \ominus 2}.
\end{aligned} \qquad (5.72)$$

Applying (5.72) to (5.69)-(5.71) by setting $j = 1$ results in

$$U_{h_1}(0) = I_9 \otimes D_{v_1} = \mathrm{diag}\{D_1, D_1, D_1, D_1, D_1, D_1, D_1, D_1, D_1\}, \qquad (5.73)$$


$$U_{h_1}(1) = M_{h_1}(1) \otimes D_{v_1 \ominus 1} = \begin{bmatrix} 0&0&0&0&0&0&0&D_0&0 \\ 0&0&0&0&0&0&0&0&D_0 \\ 0&0&0&0&0&0&D_0&0&0 \\ 0&D_0&0&0&0&0&0&0&0 \\ 0&0&D_0&0&0&0&0&0&0 \\ D_0&0&0&0&0&0&0&0&0 \\ 0&0&0&0&D_0&0&0&0&0 \\ 0&0&0&0&0&D_0&0&0&0 \\ 0&0&0&D_0&0&0&0&0&0 \end{bmatrix}, \qquad (5.74)$$

and

$$U_{h_1}(2) = M_{h_1}(2) \otimes D_{v_1 \ominus 2} = \begin{bmatrix} 0&0&0&0&0&D_2&0&0&0 \\ 0&0&0&D_2&0&0&0&0&0 \\ 0&0&0&0&D_2&0&0&0&0 \\ 0&0&0&0&0&0&0&0&D_2 \\ 0&0&0&0&0&0&D_2&0&0 \\ 0&0&0&0&0&0&0&D_2&0 \\ 0&0&D_2&0&0&0&0&0&0 \\ D_2&0&0&0&0&0&0&0&0 \\ 0&D_2&0&0&0&0&0&0&0 \end{bmatrix}. \qquad (5.75)$$

Weighted trellis matrices for the other sections can also be derived using (5.72).

In order to decode the second symbol, the matrix representations of the first,

third and fourth trellis sections irrespective of the sent symbol are required. The

representation of the first section may be calculated as

$$U_{h_1} = U_{h_1}(0) + U_{h_1}(1) + U_{h_1}(2) = \begin{bmatrix} D_1&0&0&0&0&D_2&0&D_0&0 \\ 0&D_1&0&D_2&0&0&0&0&D_0 \\ 0&0&D_1&0&D_2&0&D_0&0&0 \\ 0&D_0&0&D_1&0&0&0&0&D_2 \\ 0&0&D_0&0&D_1&0&D_2&0&0 \\ D_0&0&0&0&0&D_1&0&D_2&0 \\ 0&0&D_2&0&D_0&0&D_1&0&0 \\ D_2&0&0&0&0&D_0&0&D_1&0 \\ 0&D_2&0&D_0&0&0&0&0&D_1 \end{bmatrix}. \qquad (5.76)$$

According to (5.10), the entire weighted trellis matrices are of size 18 × 18 and are


given by

$$U_H(u_2) = U_{h_1} U_{h_2}(u_2) U_{h_3} U_{h_4}. \qquad (5.77)$$

The complete structures of these three matrices are irrelevant, because only the 2nd principal leading submatrices $[U_H(u_2)]_{(2)}$ need to be considered when calculating the APPs. Therefore, by (5.13), the first element of the vector of APPs in the three cases can be expressed as

$$\begin{aligned}
P_0(u_2 = 0|v) &= \sigma_0 \cdot \left[ D_1(D_2)^2 D_0 + D_0 D_2 (D_1)^2 + (D_2)^2 D_0 D_2 \right] \cdot e, \\
P_0(u_2 = 1|v) &= \sigma_0 \cdot \left[ D_2 D_1 D_2 D_1 + D_0 D_1 (D_0)^2 + (D_1)^3 D_2 \right] \cdot e, \\
P_0(u_2 = 2|v) &= \sigma_0 \cdot \left[ D_2 D_0 D_1 D_0 + D_1 (D_0)^2 D_1 + (D_0)^2 (D_2)^2 \right] \cdot e.
\end{aligned} \qquad (5.78)$$

Then, substituting (5.61), (5.62), (5.64) and (5.65) into (5.78) gives

$$\begin{aligned}
P_0(u_2 = 0|v) &= 3.15 \times 10^{-8}, \\
P_0(u_2 = 1|v) &= 1.08 \times 10^{-3}, \\
P_0(u_2 = 2|v) &= 5.77 \times 10^{-6}.
\end{aligned} \qquad (5.79)$$

Evaluating (5.14) with (5.79), it follows that u2 = 1 under these channel conditions.
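This calculation is also easy to verify numerically. Continuing the sketch begun after (5.65), the helper below (name is illustrative) forms each of the nine matrix-product terms of (5.78), one per codeword of C with the stated value of $u_2$:

```python
from functools import reduce

D1 = D2 = D_eps   # per (5.62)

def chain(*Ms):
    """Evaluate sigma0 * M1 * ... * M4 * e for 2x2 matrix probabilities."""
    return sigma0 @ reduce(np.matmul, Ms) @ e

# The three APPs of (5.78) for the received word v = [1, 2, 2, 0].
P0 = chain(D1, D2, D2, D0) + chain(D0, D2, D1, D1) + chain(D2, D2, D0, D2)
P1 = chain(D2, D1, D2, D1) + chain(D0, D1, D0, D0) + chain(D1, D1, D1, D2)
P2 = chain(D2, D0, D1, D0) + chain(D1, D0, D0, D1) + chain(D0, D0, D2, D2)

print(P0, P1, P2)   # approx. 3.15e-8, 1.08e-3, 5.77e-6, matching (5.79)
```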

5.5.2 Example of decoding in the spectral domain

The calculations in the spectral domain tend to be more concise, because only

diagonal and block-diagonal matrices are involved. As the code used in this example

is over $GF(3)$, let

$$w = e^{-2\pi i/3}. \qquad (5.80)$$

Since C is self-dual, the dual codewords are simply those of C. However, their order

is important. Applying (3.107) to (5.41) gives

$$\begin{aligned}
u^\perp_0 &= [0, 0, 0, 0], & u^\perp_1 &= [1, 2, 0, 1], & u^\perp_2 &= [2, 1, 0, 2], \\
u^\perp_3 &= [2, 2, 1, 0], & u^\perp_4 &= [0, 1, 1, 1], & u^\perp_5 &= [1, 0, 1, 2], \\
u^\perp_6 &= [1, 1, 2, 0], & u^\perp_7 &= [2, 0, 2, 1], & u^\perp_8 &= [0, 2, 2, 2].
\end{aligned} \qquad (5.81)$$


Equation (5.30) produces

$$\begin{aligned}
Q_s(u_2|v) = {} & w^{v_1 \cdot u^\perp_{s,1}} \left[ \delta_{u^\perp_{s,1},0} D + (1 - \delta_{u^\perp_{s,1},0}) \Delta \right] \\
& \times \frac{w^{u_2 \cdot u^\perp_{s,2}}}{3} \left[ D + (3\delta_{u_2,v_2} - 1) \Delta \right] \\
& \times \prod_{j=3}^{4} w^{v_j \cdot u^\perp_{s,j}} \left[ \delta_{u^\perp_{s,j},0} D + (1 - \delta_{u^\perp_{s,j},0}) \Delta \right],
\end{aligned} \qquad (5.82)$$

where, for $j \in \{1, 2, 3, 4\}$ and $s \in \{0, 1, \ldots, 8\}$, $u^\perp_{s,j}$ is the $j$th entry in the $s$th dual codeword $u^\perp_s$ as defined in (5.81). In total, there are 27 conditional spectral coefficient matrices to calculate. Substituting the values of $u_2$, $u^\perp_{s,j}$ and $v = [1, 2, 2, 0]$ in (5.82) gives

$$\begin{aligned}
Q_0(u_2 = 0|v) &= D \left( \tfrac{D-\Delta}{3} \right) D D, &
Q_1(u_2 = 0|v) &= w\Delta \left( \tfrac{D-\Delta}{3} \right) D \Delta, \\
Q_2(u_2 = 0|v) &= w^2\Delta \left( \tfrac{D-\Delta}{3} \right) D \Delta, &
Q_3(u_2 = 0|v) &= w^2\Delta \left( \tfrac{D-\Delta}{3} \right) w^2\Delta D, \\
Q_4(u_2 = 0|v) &= D \left( \tfrac{D-\Delta}{3} \right) w^2\Delta \Delta, &
Q_5(u_2 = 0|v) &= w\Delta \left( \tfrac{D-\Delta}{3} \right) w^2\Delta \Delta, \\
Q_6(u_2 = 0|v) &= w\Delta \left( \tfrac{D-\Delta}{3} \right) w\Delta D, &
Q_7(u_2 = 0|v) &= w^2\Delta \left( \tfrac{D-\Delta}{3} \right) w\Delta \Delta, \\
Q_8(u_2 = 0|v) &= D \left( \tfrac{D-\Delta}{3} \right) w\Delta \Delta,
\end{aligned} \qquad (5.83)$$

$$\begin{aligned}
Q_0(u_2 = 1|v) &= D \left( \tfrac{D-\Delta}{3} \right) D D, &
Q_1(u_2 = 1|v) &= w\Delta \left( w^2 \tfrac{D-\Delta}{3} \right) D \Delta, \\
Q_2(u_2 = 1|v) &= w^2\Delta \left( w \tfrac{D-\Delta}{3} \right) D \Delta, &
Q_3(u_2 = 1|v) &= w^2\Delta \left( w^2 \tfrac{D-\Delta}{3} \right) w^2\Delta D, \\
Q_4(u_2 = 1|v) &= D \left( w \tfrac{D-\Delta}{3} \right) w^2\Delta \Delta, &
Q_5(u_2 = 1|v) &= w\Delta \left( \tfrac{D-\Delta}{3} \right) w^2\Delta \Delta, \\
Q_6(u_2 = 1|v) &= w\Delta \left( w \tfrac{D-\Delta}{3} \right) w\Delta D, &
Q_7(u_2 = 1|v) &= w^2\Delta \left( \tfrac{D-\Delta}{3} \right) w\Delta \Delta, \\
Q_8(u_2 = 1|v) &= D \left( w^2 \tfrac{D-\Delta}{3} \right) w\Delta \Delta,
\end{aligned} \qquad (5.84)$$

$$\begin{aligned}
Q_0(u_2 = 2|v) &= D \left( \tfrac{D+2\Delta}{3} \right) D D, &
Q_1(u_2 = 2|v) &= w\Delta \left( w \tfrac{D+2\Delta}{3} \right) D \Delta, \\
Q_2(u_2 = 2|v) &= w^2\Delta \left( w^2 \tfrac{D+2\Delta}{3} \right) D \Delta, &
Q_3(u_2 = 2|v) &= w^2\Delta \left( w \tfrac{D+2\Delta}{3} \right) w^2\Delta D, \\
Q_4(u_2 = 2|v) &= D \left( w^2 \tfrac{D+2\Delta}{3} \right) w^2\Delta \Delta, &
Q_5(u_2 = 2|v) &= w\Delta \left( \tfrac{D+2\Delta}{3} \right) w^2\Delta \Delta, \\
Q_6(u_2 = 2|v) &= w\Delta \left( w^2 \tfrac{D+2\Delta}{3} \right) w\Delta D, &
Q_7(u_2 = 2|v) &= w^2\Delta \left( \tfrac{D+2\Delta}{3} \right) w\Delta \Delta, \\
Q_8(u_2 = 2|v) &= D \left( w \tfrac{D+2\Delta}{3} \right) w\Delta \Delta.
\end{aligned} \qquad (5.85)$$


Calculation of the conditional spectral coefficients Qs(u2|v) by (5.31) is depicted in

the trellises of Figs. 5.1 - 5.3, with the terms in (5.83) - (5.85) appearing on their

branches. In sections 1, 3 and 4 of the trellises, note how the zeroes in the dual

codewords of (5.81) correspond with D matrices, and how the nonzero elements

produce ∆ matrices. The values of the three APPs $P_0(u_2|v)$ are found by (5.39) to be

$$\begin{aligned}
P_0(u_2 = 0|v) &= \tfrac{1}{9} \textstyle\sum_{s=0}^{8} Q_s(u_2 = 0|v) = 3.15 \times 10^{-8}, \\
P_0(u_2 = 1|v) &= \tfrac{1}{9} \textstyle\sum_{s=0}^{8} Q_s(u_2 = 1|v) = 1.08 \times 10^{-3}, \\
P_0(u_2 = 2|v) &= \tfrac{1}{9} \textstyle\sum_{s=0}^{8} Q_s(u_2 = 2|v) = 5.77 \times 10^{-6},
\end{aligned} \qquad (5.86)$$

which match the original domain APPs in (5.79). It can thus again be concluded

that u2 = 1. Procedures 5.1 and 5.2 are therefore two ways of arriving at the same

decoding estimate.
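As a check on the above, the spectral-domain computation can be scripted directly from the general expression (5.82) instead of transcribing the 27 coefficients of (5.83)-(5.85). The sketch below reuses D, Delta, sigma0 and e from the earlier sketches and accumulates the coefficients as in (5.86):

```python
w = np.exp(-2j * np.pi / 3)   # primitive cube root of unity, cf. (5.80)
v = [1, 2, 2, 0]              # received word

# Dual codewords (5.81), indexed s = 0, ..., 8; C is self-dual.
duals = [[0, 0, 0, 0], [1, 2, 0, 1], [2, 1, 0, 2],
         [2, 2, 1, 0], [0, 1, 1, 1], [1, 0, 1, 2],
         [1, 1, 2, 0], [2, 0, 2, 1], [0, 2, 2, 2]]

def Q(s, u2):
    """Conditional spectral coefficient Qs(u2|v) via (5.82), with i = 2."""
    M = np.eye(2, dtype=complex)
    for j in range(4):
        us = duals[s][j]
        if j == 1:   # trellis section of the symbol being decoded
            F = w ** (u2 * us) / 3 * (D + (3 * (u2 == v[j]) - 1) * Delta)
        else:
            F = w ** (v[j] * us) * (D if us == 0 else Delta)
        M = M @ F
    return sigma0 @ M @ e

for u2 in (0, 1, 2):
    app = sum(Q(s, u2) for s in range(9)).real / 9   # (5.86)
    print(u2, app)   # approx. 3.15e-8, 1.08e-3, 5.77e-6
```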

5.6 Simulation Results

An analysis of the performance of large classes of codes when used over a large

number of possible channels and decoded using the two methods presented in this

chapter would take an inordinate amount of time and computing power. Therefore,

only a select few have been chosen for simulation using an implementation of Pro-

cedure 5.2 in MATLAB®. In particular, some non-binary Hamming codes and the

ISBN-10 code are investigated.

5.6.1 Non-binary Hamming codes

The details of the considered non-binary (n, k) Hamming codes used over a non-

binary GEC are as follows:

(4,2) Hamming code over GF (3): As defined in (3.107).

(6,4) Hamming code over GF (5): The parity check matrix for this one-error

correcting Hamming code of order n−k = 2 and rate R = 0.67 is given as

$$H = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 0 \\ 1 & 2 & 3 & 4 & 0 & 1 \end{bmatrix}. \qquad (5.87)$$

(8,6) Hamming code over GF (7): Similarly, this one-error correcting Hamming

code of order two and rate R = 0.75 is defined by the parity check matrix

$$H = \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 \\ 1 & 2 & 3 & 4 & 5 & 6 & 0 & 1 \end{bmatrix}. \qquad (5.88)$$


Figure 5.1: Weighted diagonal trellis of the (4,2) linear block code C over GF(3) used to compute spectral coefficients $Q_s(u_2 = 0|v)$; $s = 0, 1, \ldots, 8$.


Figure 5.2: Weighted diagonal trellis of the (4,2) linear block code C over GF(3) used to compute spectral coefficients $Q_s(u_2 = 1|v)$; $s = 0, 1, \ldots, 8$.


Figure 5.3: Weighted diagonal trellis of the (4,2) linear block code C over GF(3) used to compute spectral coefficients $Q_s(u_2 = 2|v)$; $s = 0, 1, \ldots, 8$.


Figures 5.4-5.6 display the SER performance obtained after data was transmitted

over a GEC model and decoded using Procedure 5.2, which operates in the spectral

domain. The results for the (4,2) Hamming code over GF (3) are depicted in Fig.

5.4, those for the (6,4) Hamming code over GF (5) are given in Fig. 5.5, and the SER

performance obtained with the (8,6) Hamming code over GF (7) is displayed in Fig.

5.6. In all three cases, the state transition probabilities of the channel model were

fixed at P = 0.05 and Q = 0.2. Simulations were carried out for pairs of crossover

probability values taken from the sets

$$p_G \in \{10^{-4}, 10^{-3}, 10^{-2}\}, \quad p_B \in \{0.01, 0.02, \ldots, 0.1\}. \qquad (5.89)$$

A vertical scale incorporating both SER and BER is provided in order to aid

performance comparisons between the different codes. There are a number of ways

that the conversion from SER to an equivalent BER may be carried out; however,

the one used here to convert from an error rate in p-ary units of information to one

in terms of bits is given by

BER =SER

log2(p). (5.90)

As would be expected, in all three cases, the SER decreases as the crossover prob-

ability pB in the ‘bad’ state B decreases. The SER also decreases as the crossover

probability $p_G$ in the ‘good’ state G decreases within the range given; however, further decreases in this crossover probability did not produce curves which were appreciably discernible from the curves for $p_G = 10^{-4}$. To improve legibility, such results

have been omitted from the figures. In addition, the BER increases as the order

of the field increases. Although each code is designed to correct one transmission

error, the codes are between four and eight symbols in length. In particular, the

code over GF (3) is able to correct one out of the four symbols, the code over GF (5)

can correct one out of six symbols, while the code over GF (7) is only capable of

correcting one out of eight symbols. This explains the differences in the error rates.

5.6.2 The ISBN-10 code

Let C be the (10,9) single parity check code over GF (11) where the parity symbol

u10 is defined by the equation

$$u_1 + 2u_2 + 3u_3 + 4u_4 + 5u_5 + 6u_6 + 7u_7 + 8u_8 + 9u_9 + 10u_{10} \equiv 0 \pmod{11}. \qquad (5.91)$$


Figure 5.4: Performance of the (4,2) Hamming code over GF(3) on a GEC with state transition probabilities P = 0.05 and Q = 0.2.

Figure 5.5: Performance of the (6,4) Hamming code over GF(5) on a GEC with state transition probabilities P = 0.05 and Q = 0.2.


Figure 5.6: Performance of the (8,6) Hamming code over GF(7) on a GEC with state transition probabilities P = 0.05 and Q = 0.2.

Figure 5.7: Performance of the ISBN-10 code on a GEC with state transition probabilities P = 0.05 and Q = 0.2.


This produces the parity check matrix

$$H = \begin{bmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & X \end{bmatrix}, \qquad (5.92)$$

where “X” is used to represent “10”, thus avoiding any confusion between “10” and

“1 0”. However, this parity check matrix is not yet in systematic form. To fix this,

multiply H by 10 to obtain

$$H = \begin{bmatrix} X & 9 & 8 & 7 & 6 & 5 & 4 & 3 & 2 & 1 \end{bmatrix}. \qquad (5.93)$$

Since the ISBN-10 code has Hamming distance d = 2, it is only of real use for error

detection as it detects d− 1 = 1 error but is not guaranteed to correct any errors.

For example, if the barcode is scanned and it is not recognised as a valid ISBN by

the scanner, then an error is detected and the barcode may be scanned again. Note

that the above encoding process refers to the now-obsolete ISBN-10 code. From

2007, the decimal (13,12) ISBN-13 code [68] is used in practice. The transition

occurred because the number of individual codewords available for issue with new

publications was being depleted. There are three more information symbol positions

in the new ISBN-13 code compared to the ISBN-10 code. The parity symbol $u_{13}$ for an ISBN-13 codeword can be calculated using the modulo-10 equation

$$u_1 + u_3 + u_5 + u_7 + u_9 + u_{11} + u_{13} + 3(u_2 + u_4 + u_6 + u_8 + u_{10} + u_{12}) \equiv 0 \pmod{10}. \qquad (5.94)$$

This produces the parity check matrix

$$H = \begin{bmatrix} 1 & 3 & 1 & 3 & 1 & 3 & 1 & 3 & 1 & 3 & 1 & 3 & 1 \end{bmatrix}, \qquad (5.95)$$

and as ISBN-13 is a decimal code, the “X” symbol is not required.
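Both check equations translate directly into code. The sketch below defines illustrative helpers (the names are assumptions for illustration) which validate a word against (5.91) and (5.94), and solve (5.91) for the ISBN-10 check symbol using $10 \equiv -1 \pmod{11}$:

```python
def isbn10_valid(u):
    """Test (5.91): the weighted sum of all ten symbols is 0 mod 11.
    u is a list of ten integers in {0, ..., 10}; 10 is printed as 'X'."""
    return sum(i * ui for i, ui in enumerate(u, start=1)) % 11 == 0

def isbn10_check_symbol(u9):
    """Solve (5.91) for u10: since 10*u10 = -u10 (mod 11), the check
    symbol equals the weighted sum of the first nine symbols mod 11."""
    return sum(i * ui for i, ui in enumerate(u9, start=1)) % 11

def isbn13_valid(u):
    """Test (5.94): odd positions weighted by 1, even positions by 3."""
    return (sum(u[0::2]) + 3 * sum(u[1::2])) % 10 == 0

digits = [0, 1, 9, 8, 5, 3, 8, 0, 3]          # nine information symbols
print(isbn10_check_symbol(digits))            # -> 0
print(isbn10_valid(digits + [0]))             # -> True
```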

A notion of the performance of the ISBN-10 code may be obtained from Fig.

5.7. In this simulation, Procedure 5.2 was again used to decode transmissions over

a GEC with parameters P = 0.05 and Q = 0.2. Pairs of crossover probability values

were taken from the sets

$$p_G \in \{10^{-4}, 10^{-3}, 10^{-2}\}, \quad p_B \in \{0.01, 0.02, \ldots, 0.09\}. \qquad (5.96)$$

It can be shown that an 11-ary symmetric DMC which has crossover probability


$p_B$ within the half-closed interval $(\tfrac{1}{11}, \tfrac{1}{10}]$ has the same capacity as a DMC with crossover probability less than $\tfrac{1}{11}$, and thus crossover probabilities greater than 0.09 are not considered in the simulations. The results suggest that, in the same way

are not considered in the simulations. The results suggest that, in the same way

as for the Hamming codes, the performance degrades when either of the crossover

probabilities increases. Additionally, the BER is much higher for the ISBN-10 code

than for the Hamming codes. This occurs for two main reasons. Firstly, the rate of

the ISBN-10 code is higher than those of the Hamming codes, at 0.9. Secondly, the

ISBN-10 code is only designed to detect errors. It is not designed to correct them,

as its Hamming distance is only two.

5.7 Summary

The transition from a memoryless model to one with memory requires the decoder

to factor in all possible state sequences. This is the major difference between the

decoding problems of Chapter 3 compared to those of Chapters 4 and 5. The

APP decoding procedures for channels described by stochastic automata employ

the same underlying algorithms as their memoryless channel counterparts, but the

scalar weightings of crossover probabilities must be replaced by matrix probabilities.

In general, there is one such matrix probability for each pair of possible transmitted

and received symbols. However, the number of such matrix probabilities is reduced

to two when the DMC in each state is symmetric. For a GEC model, there is one

matrix probability corresponding to correct reception of a symbol, considering all

possible state transitions, and another for incorrect reception.

The change from a binary signalling alphabet to one of higher order is handled

in the same way as for the memoryless case. That is, the circulant matrices of

the same order as the size of the signalling alphabet are used to define the matrix

representation of a trellis section in the original domain. Alternatively, the complex

generalisation of the Walsh-Hadamard matrix can be used to perform a similarity

transformation, which produces the diagonal matrices of the spectral domain. In

this case the conditional spectral coefficient matrices are strongly connected to the

structure of the dual code. Capitalising on these ties in the case of a GEC means

converting from the two standard matrix probabilities in the original domain to

the state transition and difference matrices which are representative of the spectral

domain. Either domain may be used to perform the necessary calculations. The

choice of domain should be motivated by the code rate.

For a systematic non-binary linear block code in conjunction with a channel de-

scribed by a stochastic automaton which includes a stochastic sequential machine


with finitely many states, an analysis of the execution time and memory require-

ments for the APP decoding procedures using the original and spectral domains

developed in this chapter was provided. The computational complexity of the spec-

tral domain approach was favoured for codes with a high code rate, whilst it was

more efficient to use the original domain when the code rate was low. The diag-

onalisation process allowed the overall large-sized matrix multiplications that were

necessary in the original domain to be performed in individual chains of multipli-

cation of smaller-sized matrices. As the conditional spectral coefficients could be

accumulated one at a time, this led to a significant reduction in the amount of

memory used by Procedure 5.2 compared to Procedure 5.1.

It was also proved that the set of the first elements of the conditional spectral

coefficient vectors over all possible transmitted symbols forms a probability distri-

bution. The fact that the first element of each vector of APPs, when considered over

all values of transmitted symbol, do not form a probability distribution means that

APPs for the transmission of each possible symbol, given the received vector, must

be calculated in order to reach a decision for each information symbol position.

An instructive example of APP decoding for a self-dual code over GF (3) was

given using both domains in order to demonstrate the types of calculations which

are necessary in the two procedures. The equality of the two answers to the same

decoding task was verified. Finally, a selection of numerical results from computer

simulations of decoding over various fields was provided. It was observed by con-

version to equivalent BERs that for Hamming codes of order two, raising the order

of the field is likely to increase the error rate. However, decreasing either of the

crossover probabilities of a GEC appears to improve the performance. The poor er-

ror correcting capabilities of the ISBN-10 code were also observed. It must be noted

that these simulations are only a small portion of the possibilities for analysis. A

complete investigation into the performance of non-binary linear block codes on a

GEC is beyond the scope of this thesis.


Chapter 6

Generalised Weight Polynomials for Binary Restricted GECs

A GEC model does not have a unique parametrisation. That is, there is more than

one way to describe the sequence of state transitions and the error generation process

for that model. Perhaps the simplest paradigm discussed in previous chapters was

one dominated by the probabilities of error and state transition events at each

discrete time instant. This chapter examines another paradigm for modelling a

binary GEC, which is instead based on the distribution of error bursts. These

characteristics may in fact be easier to measure, making the method described herein

more useful in a practical sense.

Additionally, the procedures for APP decoding developed in Chapters 4 and 5

depend heavily on vector and matrix multiplication. Four parameters were required

to fully describe the channel. However, restricting the channel by fixing the value

of one parameter allows the model to be described by three variables. By examination of all possible matrix products which could be involved in the APP decoding calculations, the matrix multiplication can then be replaced by the evaluation of trivariate

polynomials. This is the motivation for the study of restricted GECs as introduced

in Section 2.2.2.

The expression of the APPs in terms of these polynomials in three variables is

pleasing from an aesthetic point of view. It allows a more direct connection with the

fading profile of the channel. Furthermore, these polynomials provide information

about probabilities of codeword symbols as a function of artifacts of the dual code.

This is a similar purpose to the single-variable weight polynomials of a code and

its dual as related by the MacWilliams identity [25]. Since the polynomials derived

in this chapter have three variables, as opposed to the weight enumerators of one

variable, they shall be referred to as generalised weight polynomials (GWPs). This


concept of GWPs has been introduced in [26] for use with syndrome decoding and is

adopted in this thesis for APP decoding. Thus, the APP decoding method developed

in this chapter is based on a generalisation of one of the most profound results in

coding theory, the MacWilliams identity.

This chapter is structured as follows. The statement of the APP decoding task

for binary restricted GECs using GWPs is formulated in Section 6.1. Section 6.2

discusses the alternative method of parameterising the channel model for binary

GECs in general using burst-error characteristics. In particular, the relationship

between the channel reliability factor and the matrix probabilities of the spectral

domain is highlighted. It is then possible to describe the APPs in terms of the

trivariate GWPs. This method is outlined in Section 6.3. An example of using

these polynomials to perform APP decoding is given in Section 6.4, while simulation

results for two binary linear block codes are shown in Section 6.5. Finally, Section

6.6 concludes the chapter.

The principal contributions of this chapter are:

• Formulation of the relationship between the channel reliability factor z and

the state transition matrix D and difference matrix ∆ for a binary GEC.

• Derivation of the conditional spectral coefficients, which are necessary for APP

decoding on a binary restricted GEC, in terms of the burst-error characteris-

tics.

• Discussion of the connection between the binary MacWilliams identity and

the polynomial form of the conditional spectral coefficients.

• Description of an APP decoding algorithm using burst-error characteristics

and polynomial evaluation rather than matrix multiplication, for a binary

restricted GEC.

• Numerical examples which highlight some of the many possible applications

of this theory.

6.1 Problem Statement

Assume that C is a binary (n, k) linear block code in standard form which is used

to protect data transmitted over a channel described by a stochastic automaton D,

where D is a binary restricted GEC as discussed in Section 2.2.2 together with an


initial state distribution σ0. That is, the binary input, binary output channel has

state set S = {G,B} with the crossover probability in state B given by

$$p_B = 0.5. \qquad (6.1)$$

Since D also falls under the broader classification of a GEC, the APP decoding de-

cisions for each information bit position are given by (4.42). However, that equation

is, at its most elementary level, a statement in terms of four parameters P , Q, pG

and pB. It is also a statement about matrix products. Given (6.1), the task to be

completed in order to perform APP decoding is to find a closed form polynomial

expression for $\hat{u}_i$ involving at most three variables. That is, to find 2k polynomials

$$f^{(u_i)}(x_1, x_2, x_3) = \sum_{s=0}^{2^{n-k}-1} Q_s(u_i|v), \qquad (6.2)$$

for i ∈ {1, 2, . . . , k} and ui ∈ GF (2), and where x1, x2 and x3 are variables which

completely define the particular binary restricted GEC model being used. It is then

a consequence of (4.42) that the decoding decisions can be made according to

$$\hat{u}_i = \begin{cases} 0 & \text{if } f^{(0)}(x_1, x_2, x_3) \geq f^{(1)}(x_1, x_2, x_3), \\ 1 & \text{otherwise.} \end{cases} \qquad (6.3)$$

6.2 Burst-error Characteristics

One possible set of parameters for describing a binary GEC is the set of two state

transition probabilities P and Q, combined with the two crossover probabilities

pG and pB. This description is adequate if the behaviour of the channel has been

described in terms of the underlying theoretical Markov chain. However, if the

error patterns from a physical channel are being measured, it may be easier or more

practical to use other parameters which are more closely related to the distribution

of the bursts of transmission errors.

Another set of parameters for describing a GEC suggested in Chapter 2 was the

so-called burst-error characteristics. The likelihood of the model being in the ‘good’

state G or the ‘bad’ state B can be retrieved from the average fade to connection

time ratio x and the burst factor y as defined in (2.45) and (2.46) respectively as

$$x = \frac{P}{Q}, \qquad (6.4)$$

$$y = 1 - P - Q. \qquad (6.5)$$


However, these two parameters give no information about the likelihood of transmis-

sion errors. This aspect of the channel is described by the parameter conventionally

known as z. It is called the channel reliability factor and is directly related to the

average symbol error rate of the channel, which is an easier quantity to measure than

the crossover probabilities. Hence, in the discussion that follows it will be necessary

to recall the definition of the average BER for a GEC as given in (2.47) by

$$p_b = p_G\sigma_G + p_B\sigma_B. \qquad (6.6)$$

From [26], it is known that the relationship between the channel reliability factor

and the average BER for any binary GEC can be expressed as

$$z = 1 - 2p_b, \qquad (6.7)$$

and in Chapter 2 it was defined as an expected value over both states G and B of

the difference between the probabilities of correct and erroneous transmission. The

following derivation of the channel reliability factor will be based upon the spectral

representation of a trellis, given that calculations are often simpler to perform in

that domain. Instead of the conventional APP decoding trellis, the summation

condition “u ∈ C, ui = g” in (4.3) is relaxed to “u ∈ C”, to give a description of

the syndrome trellis of Section 2.3.3. Such a trellis is not biased toward determining

probabilities for any particular position or bit. It is a particular case of a result

in [31], or alternatively it can be derived from Section 4.2, that for a binary code of

length n, a description of the syndrome trellis in terms of spectral matrices is given

by

$$\Theta_H = \prod_{j=1}^{n} \mathrm{diag}\left\{ \Theta_{s h_j} \right\}, \qquad (6.8)$$

where

$$\Theta_{s h_j} = D_0 + D_1(-1)^{\langle s, h_j \rangle} = \begin{cases} D & \text{if } \langle s, h_j \rangle = 0, \\ \Delta & \text{if } \langle s, h_j \rangle = 1. \end{cases} \qquad (6.9)$$

Here, $\mathbf{s} = \mathrm{bin}(s)$ and $\mathrm{bin}(\cdot)$ denotes the function which gives the binary representation of its input in vector form. Furthermore, $\langle s, h_j \rangle$ refers to the dot product of the vectors $\mathbf{s}$ and $h_j$, where $0 \leq s \leq 2^{n-k}-1$ and $1 \leq j \leq n$. In the spectral domain, the two matrix probabilities D and ∆ are the only two choices available to describe the reliability of the channel and each is examined individually. For a binary GEC, the difference matrix ∆ may be written as

$$\Delta = \begin{bmatrix} 1-2p_G & 0 \\ 0 & 1-2p_B \end{bmatrix} \cdot D. \qquad (6.10)$$


For all four possible state transitions, ∆ supplies the probability of correct transmission reduced by the probability of incorrect transmission. By

contrast, D does not completely reflect the behaviour of the channel as it is entirely

described by the parameters x and y as shown in (2.73), and is thus independent of

transmission errors. Therefore, the prime candidate for expressing the channel reli-

ability is ∆, which can be converted to a scalar variable by calculating the weighted

average over the two states of the channel in the stationary state distribution. In

other words,

$$z = \sigma_0 \Delta e. \qquad (6.11)$$

Simplifying,

$$\begin{aligned}
z &= \left[ \sigma_G, \; \sigma_B \right] \begin{bmatrix} 1-2p_G & 0 \\ 0 & 1-2p_B \end{bmatrix} \begin{bmatrix} 1-P & P \\ Q & 1-Q \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \\
&= \sigma_G(1-2p_G) + \sigma_B(1-2p_B) \\
&= 1 - 2p_b. \qquad (6.12)
\end{aligned}$$

The average BER pb of the channel is easier to deduce from measurements than the

two individual crossover probabilities, because the state is hidden for this channel

model. Given the stationary state distribution σ0, the model may be described

by the parameters x, y, z, plus either pB or pG. Fixing one of these two crossover

probabilities could then allow the channel to be described by the three burst-error

characteristics. This approach would allow a reformulation of the conditional spec-

tral coefficient matrices, the details of which are shown in the next section.
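This conversion amounts to a few lines of arithmetic, as the following sketch shows; the function name is illustrative, and the sample values are those used later in Section 6.4.

```python
def burst_error_characteristics(P, Q, pG, pB):
    """Map (P, Q, pG, pB) to (x, y, z) via (6.4), (6.5), (6.6) and (6.7)."""
    x = P / Q                          # fade-to-connection time ratio (6.4)
    y = 1.0 - P - Q                    # burst factor (6.5)
    sigmaG = Q / (P + Q)               # stationary state distribution
    sigmaB = P / (P + Q)
    pb = pG * sigmaG + pB * sigmaB     # average BER (6.6)
    return x, y, 1.0 - 2.0 * pb        # channel reliability factor (6.7)

# Restricted GEC (pB = 0.5) with the parameters of Section 6.4:
print(burst_error_characteristics(1.68e-3, 3.28e-2, 5.7e-3, 0.5))
# -> approximately (0.05122, 0.96552, 0.94043), cf. (6.60)
```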

6.3 Derivation of APPs using Generalised Weight Polynomials

Examining in detail Procedure 4.2, which is an APP decoding algorithm for a

GEC using the spectral domain, one becomes aware of its dependence on the non-

commutative multiplication of D and ∆ matrices. A desirable situation would

therefore arise if the pre-multiplication of ∆ by a 2 × 2 matrix K could render one

of the two columns of K unnecessary to be considered for the remainder of the mul-

tiplications. The possibility of this occurring for pG, pB ∈ [0, 1] is now examined.

The maximum value of crossover probability which needs to be considered on a

BSC is 0.5, since a model with crossover probability 0.5 + ε is equivalent to one with crossover probability 0.5 − ε where the received bits are inverted before decoding.


Assuming that pG < pB so that the ‘good’ state G is indeed better than the ‘bad’

state B, only the pB = 0.5 scenario as stated in (6.1) will guarantee that a column

of any such matrix K can effectively be ignored in the matrix multiplication. Under

such an assumption, the binary restricted GEC as shown in Fig. 2.9 results. Recall

the definitions of the state transition matrix D and the difference matrix δ, given

respectively in (2.36) and (2.70) as

D =

[

1 − P P

Q 1 −Q

]

, (6.13)

δ =

[

1 − 2pG 0

0 0

]

D. (6.14)

Then the expression for the conditional spectral coefficients Qs(ui|v) in (4.32) and

(4.33) under the additional assumption of (6.1) may be rewritten as

$$\begin{aligned}
Q_s(u_i|v) = {} & \sigma_0 \prod_{j=1}^{i-1} \left[ (-1)^{v_j \cdot u^\perp_{s,j}} \left( (1-u^\perp_{s,j})D + u^\perp_{s,j}\delta \right) \right] \\
& \times \frac{1}{2} \left[ (-1)^{u_i \cdot u^\perp_{s,i}} D + (-1)^{(u_i \cdot u^\perp_{s,i}) + v_i} \delta \right] \\
& \times \prod_{j=i+1}^{n} \left[ (-1)^{v_j \cdot u^\perp_{s,j}} \left( (1-u^\perp_{s,j})D + u^\perp_{s,j}\delta \right) \right] e \qquad (6.15) \\
= {} & c_1 \sigma_0 A D B e + c_2 \sigma_0 A \delta B e,
\end{aligned}$$

where

$$A = \prod_{j=1}^{i-1} \left[ (1-u^\perp_{s,j})D + u^\perp_{s,j}\delta \right], \qquad (6.16)$$

$$B = \prod_{j=i+1}^{n} \left[ (1-u^\perp_{s,j})D + u^\perp_{s,j}\delta \right], \qquad (6.17)$$

$$c_1 = \frac{1}{2} (-1)^{u_i \cdot u^\perp_{s,i}} \prod_{\substack{j=1 \\ j \neq i}}^{n} (-1)^{v_j \cdot u^\perp_{s,j}}, \qquad (6.18)$$

$$c_2 = \frac{1}{2} (-1)^{(u_i \cdot u^\perp_{s,i}) + v_i} \prod_{\substack{j=1 \\ j \neq i}}^{n} (-1)^{v_j \cdot u^\perp_{s,j}}. \qquad (6.19)$$

It is important that each matrix in both of the products A and B is either the

state transition matrix D or the difference matrix δ, as certain patterns of D and

δ sequences may be able to be replaced by simpler expressions when the conversion

to the burst-error characteristics x, y and z is performed. This conversion for σ0,


D and δ was explained in (2.72), (2.73) and (2.74). Let

$$\mathcal{M} = \{D, \delta\} \qquad (6.20)$$

represent the set of possible matrices from which each matrix in the products A and

B is taken. Define a function g for a length n binary code used over a restricted

GEC with a given stationary state distribution σ0 by

$$\begin{aligned}
g &: \mathcal{M} \times \mathcal{M} \times \cdots \times \mathcal{M} \to \mathbb{Z}[x, y, z], \\
g&(K^{(1)}, K^{(2)}, \ldots, K^{(n)}) = \sigma_0 \prod_{j=1}^{n} K^{(j)} e,
\end{aligned} \qquad (6.21)$$

where $K^{(j)} \in \mathcal{M}$ $\forall j \in \{1, 2, \ldots, n\}$ and $\mathbb{Z}[x, y, z]$ denotes the set of polynomials in indeterminates x, y and z with integer coefficients. In this notation, x, y and z represent the three burst-error characteristics. One matrix probability corresponds to each section of the spectral domain trellis as it is traversed from left to right.

A series of lemmas will quickly establish the polynomials output by g for all $2^n$ possible inputs $K^{(j)}$. As reported in [26] for the restricted GEC and proven in [69] for a simplified Gilbert channel, the following result concerning powers of the state

transition matrix D is true.

Lemma 6.3.1. Dn(x, y) =1

1 + x

[

1 + xyn x− xyn

1 − yn x+ yn

]

∀n ∈ N.

Proof. The proof is by induction on n. Firstly, the lemma holds for n = 0 because

$$D^0(x, y) = I_2 = \frac{1}{1+x} \begin{bmatrix} 1 + xy^0 & x - xy^0 \\ 1 - y^0 & x + y^0 \end{bmatrix}. \qquad (6.22)$$

Also note the case n = 1 is true by the formulation of the state transition matrix D(x, y) given in (2.73). Assume the lemma holds for some $n \in \mathbb{N}$. Examining the (n+1)th power of D(x, y) reveals that

$$\begin{aligned}
D^{n+1}(x, y) &= \frac{1}{1+x} \begin{bmatrix} 1 + xy^n & x - xy^n \\ 1 - y^n & x + y^n \end{bmatrix} \frac{1}{1+x} \begin{bmatrix} 1 + xy & x - xy \\ 1 - y & x + y \end{bmatrix} \\
&= \frac{1}{(1+x)^2} \begin{bmatrix} (1+x)(1 + xy^{n+1}) & (x + x^2)(1 - y^{n+1}) \\ (1+x)(1 - y^{n+1}) & (1+x)(x + y^{n+1}) \end{bmatrix} \\
&= \frac{1}{1+x} \begin{bmatrix} 1 + xy^{n+1} & x - xy^{n+1} \\ 1 - y^{n+1} & x + y^{n+1} \end{bmatrix}. \qquad (6.23)
\end{aligned}$$

Therefore, if the lemma is true for an exponent of n, it is also true for an exponent

of n+1. By the Principle of Mathematical Induction, the lemma holds ∀n ∈ N.
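The closed form is easily confirmed numerically by comparing it against a direct matrix power, as in the following sketch (NumPy assumed; the test values of x and y are arbitrary):

```python
import numpy as np

def D_xy(x, y):
    """State transition matrix D(x, y), as in (2.73) (the n = 1 case)."""
    return np.array([[1 + x * y, x - x * y],
                     [1 - y, x + y]]) / (1 + x)

def D_power(x, y, n):
    """Closed form of Lemma 6.3.1 for D^n(x, y)."""
    return np.array([[1 + x * y ** n, x - x * y ** n],
                     [1 - y ** n, x + y ** n]]) / (1 + x)

x, y = 0.05, 0.9
for n in range(8):
    assert np.allclose(np.linalg.matrix_power(D_xy(x, y), n), D_power(x, y, n))
print("Lemma 6.3.1 holds numerically for n = 0, ..., 7")
```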


The vector σ0 is unaffected by post-multiplication by the state transition matrix

D to the power of any non-negative integer. This is because the stationary state

distribution vector σ0 is an eigenvector of $D^n$ corresponding to the eigenvalue 1.

Lemma 6.3.2. $\sigma_0 = \sigma_0 D^n \;\; \forall n \in \mathbb{N}$.

Proof. This proof is also by induction on the exponent n. The lemma can be shown

to hold for n = 0 by considering

$$\sigma_0 D^0 = \sigma_0 I_2 = \sigma_0. \qquad (6.24)$$

Additionally, the lemma is true for n = 1 because

$$\begin{aligned}
\sigma_0 D &= \left[ \tfrac{1}{1+x}, \; \tfrac{x}{1+x} \right] \frac{1}{1+x} \begin{bmatrix} 1 + xy & x - xy \\ 1 - y & x + y \end{bmatrix} \\
&= \frac{1}{(1+x)^2} \left[ 1 + xy + x - xy, \;\; x - xy + x^2 + xy \right] \\
&= \frac{1}{1+x} \left[ 1, \; x \right] = \sigma_0. \qquad (6.25)
\end{aligned}$$

Assume the lemma is true for some $n \in \mathbb{N}$. By this assumption and (6.25),

$$\sigma_0 D^{n+1} = \sigma_0 D^n D = \sigma_0 D = \sigma_0. \qquad (6.26)$$

Hence the lemma holds for n+1 and, by the Principle of Mathematical Induction, $\sigma_0 = \sigma_0 D^n \;\forall n \in \mathbb{N}$.

Since $D^n$ is a stochastic matrix, the sum of the entries in both of its rows is one.

This concept can also be expressed using multiplication by the column vector e.

Lemma 6.3.3. $D^n e = e \;\; \forall n \in \mathbb{N}$.

Proof.

$$D^n e = \frac{1}{1+x} \begin{bmatrix} 1 + xy^n & x - xy^n \\ 1 - y^n & x + y^n \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \frac{1}{1+x} \begin{bmatrix} 1 + x \\ 1 + x \end{bmatrix} = e. \qquad (6.27)$$


A consequence of Lemmas 6.3.2 and 6.3.3 is that g can be calculated for an input of $K^{(j)} = D$, $\forall j \in \{1, 2, \ldots, n\}$.

Corollary 6.3.1. g(D,D, . . . ,D) = 1 ∀n ∈ N.

Proof. By the definition of g given in (6.21),

$$g(D, D, \ldots, D) = \sigma_0 D^n e = \sigma_0 e = 1. \qquad (6.28)$$

It is possible to treat more of the $2^n$ possible sequences of matrices in $\mathcal{M}^n$ with

the assistance of the following lemmas. In the next case to consider, the initial and

final matrix probabilities in the matrix product are both δ, while the rest are D.

Lemma 6.3.4. $g(\delta, D^{(1)}, D^{(2)}, \ldots, D^{(n-2)}, \delta) = z^2(1 + xy^{n-1}) \;\; \forall n \in \mathbb{N}$.

Proof. By Lemma 6.3.3, the output of the function g in this situation is found to be

$$\begin{aligned}
\sigma_0 \delta D^{n-2} \delta e &= \frac{1}{1+x}\left[1, \; x\right] (1+x)z \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} D^{n-1} (1+x)z \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} D e \\
&= (1+x)z^2 \left[1, \; 0\right] \frac{1}{1+x} \begin{bmatrix} 1+xy^{n-1} & x-xy^{n-1} \\ 1-y^{n-1} & x+y^{n-1} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} D e \\
&= z^2 \left[1 + xy^{n-1}, \; x - xy^{n-1}\right] \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} e \\
&= z^2 (1 + xy^{n-1}). \qquad (6.29)
\end{aligned}$$

Lemma 6.3.4 is easily generalised to inputs of multiple instances of a sequence

of D matrices enclosed between two δ matrices.

Lemma 6.3.5. For $r \in \mathbb{Z}^+$ and each value $c_i \in \mathbb{N}$, where $i \in \{1, 2, \ldots, r\}$,

$$g(\delta, D^{(1)}, D^{(2)}, \ldots, D^{(c_1)}, \delta, D^{(1)}, \ldots, D^{(c_2)}, \delta, \ldots, \delta, D^{(1)}, \ldots, D^{(c_r)}, \delta) = z^{r+1}(1 + xy^{c_1+1})(1 + xy^{c_2+1}) \cdots (1 + xy^{c_r+1}).$$

Proof. The proof is by induction on r. Cases where $c_i = 0$ must be considered here, as two or more consecutive δ matrices are possible. Firstly, g is evaluated for r = 1 as

$$g(\delta, D^{(1)}, D^{(2)}, \ldots, D^{(c_1)}, \delta) = \sigma_0 \delta D^{c_1} \delta e = z^2(1 + xy^{c_1+1}), \qquad (6.30)$$


which holds by Lemma 6.3.4. Assume that the current lemma is true for some $r \in \mathbb{Z}^+$. That is, it is possible to write

$$\sigma_0 \delta D^{c_1} \delta D^{c_2} \delta \cdots \delta D^{c_r} = \left[ f_G(x, y, z), \; f_B(x, y, z) \right] \qquad (6.31)$$

for some polynomials fG(x, y, z), fB(x, y, z) ∈ Z[x, y, z] satisfying

$$\begin{aligned}
\left[ f_G(x, y, z), \; f_B(x, y, z) \right] \delta e
&= \left[ f_G(x, y, z), \; f_B(x, y, z) \right] (1+x)z \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \frac{1}{1+x} \begin{bmatrix} 1+xy & x-xy \\ 1-y & x+y \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \\
&= z \left[ f_G(x, y, z) \cdot (1+xy), \; f_G(x, y, z) \cdot (x-xy) \right] \begin{bmatrix} 1 \\ 1 \end{bmatrix} \\
&= (1+x)z f_G(x, y, z). \qquad (6.32)
\end{aligned}$$

Therefore, the Inductive Hypothesis may be formulated as

$$(1+x)z f_G(x, y, z) = z^{r+1}(1 + xy^{c_1+1})(1 + xy^{c_2+1}) \cdots (1 + xy^{c_r+1}). \qquad (6.33)$$

By Lemma 6.3.3 and (6.33), extending the examination to the case for r + 1 gives

$$\begin{aligned}
\sigma_0 \delta D^{c_1} \cdots \delta D^{c_r} \delta D^{c_{r+1}} \delta e
&= \left[ f_G(x, y, z), \; f_B(x, y, z) \right] \delta D^{c_{r+1}} \delta e \\
&= (1+x)z \left[ f_G(x, y, z), \; 0 \right] \frac{1}{1+x} \begin{bmatrix} 1 + xy^{c_{r+1}+1} & x - xy^{c_{r+1}+1} \\ 1 - y^{c_{r+1}+1} & x + y^{c_{r+1}+1} \end{bmatrix} \delta e \\
&= z^2 f_G(x, y, z) \cdot (1+x) \left[ 1 + xy^{c_{r+1}+1}, \; x - xy^{c_{r+1}+1} \right] \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} e \\
&= (1+x)z f_G(x, y, z) \cdot z (1 + xy^{c_{r+1}+1}) \\
&= z^{r+1}(1 + xy^{c_1+1}) \cdots (1 + xy^{c_r+1}) \cdot z (1 + xy^{c_{r+1}+1}). \qquad (6.34)
\end{aligned}$$

Thus it follows that

$$\sigma_0 \delta D^{c_1} \delta D^{c_2} \delta \cdots \delta D^{c_{r+1}} \delta e = z^{r+2}(1 + xy^{c_1+1})(1 + xy^{c_2+1}) \cdots (1 + xy^{c_{r+1}+1}), \qquad (6.35)$$

and the lemma is true for r+1 sequences of state transition matrices enclosed between difference matrices in the parameter list for g. By the Principle of Mathematical Induction, the lemma is true for all choices of $\{c_i \mid 1 \leq i \leq r\}$ and all $r \in \mathbb{Z}^+$.

An extension of the previous case is where there are additional state transition


matrices D at the start and/or the end of the list of inputs to the function g. The

following lemma shows they can effectively be ignored when evaluating g.

Lemma 6.3.6. For $n_1, n_2 \in \mathbb{N}$,

$$g(D^{(1)}, D^{(2)}, \ldots, D^{(n_1)}, \delta, \ldots, \delta, D^{(1)}, D^{(2)}, \ldots, D^{(n_2)}) = g(\delta, \ldots, \delta),$$

where there is some pattern of state transition and/or difference matrices in the list of inputs to g between the two δ matrices given.

Proof. Firstly, by application of the definition of g and Lemma 6.3.2, it can be established that

$$\begin{aligned}
g(D^{(1)}, \ldots, D^{(n_1)}, \delta, \ldots, \delta, D^{(1)}, \ldots, D^{(n_2)}) &= \sigma_0 D^{n_1} \delta \cdots \delta D^{n_2} e \\
&= \sigma_0 \delta \cdots \delta D^{n_2} e \\
&= \left[ f_G(x, y, z), \; f_B(x, y, z) \right] D^{n_2} e,
\end{aligned} \qquad (6.36)$$

for polynomials $f_G(x, y, z), f_B(x, y, z) \in \mathbb{Z}[x, y, z]$ where, by Lemma 6.3.5,

$$f_G(x, y, z) + f_B(x, y, z) = g(\delta, \ldots, \delta). \qquad (6.37)$$

It then follows from Lemma 6.3.1 that

$$\begin{aligned}
g(D^{(1)}, D^{(2)}, \ldots, D^{(n_1)}, \delta, \ldots, \delta, D^{(1)}, D^{(2)}, \ldots, D^{(n_2)})
&= \left[ f_G(x, y, z), \; f_B(x, y, z) \right] \frac{1}{1+x} \begin{bmatrix} 1 + xy^{n_2} & x - xy^{n_2} \\ 1 - y^{n_2} & x + y^{n_2} \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \\
&= f_G(x, y, z) + f_B(x, y, z) \\
&= g(\delta, \ldots, \delta). \qquad (6.38)
\end{aligned}$$

It is now possible to prove the conjecture in [26], which is summarised by the

following theorem.

Theorem 6.3.1. Let g be defined as in (6.20) and (6.21). Then

$$g\left( K^{(1)}, K^{(2)}, \ldots, K^{(n)} \right) = \begin{cases} z^\beta & \text{if } \beta = 0, 1, \\ z^\beta \prod_{r=1}^{n-1} (1 + xy^r)^{\gamma_r} & \text{if } 2 \leq \beta \leq n, \end{cases} \qquad (6.39)$$

where the expression $\prod_{j=1}^{n} K^{(j)}$ contains β matrices of the δ variety and $\gamma_r$ occurrences of $D^{r-1}$ embedded between two δ matrices.


Proof. Firstly, if $K^{(j)} = D$, $\forall j \in \{1, 2, \ldots, n\}$, then β = 0 and $g = 1 = z^0$ by directly applying Corollary 6.3.1.

Secondly, if $\prod_{j=1}^{n} K^{(j)}$ contains exactly one δ matrix, in the $j$th position where $j \in \{1, 2, \ldots, n\}$, then β = 1. By Lemmas 6.3.2 and 6.3.3, it follows that

$$\begin{aligned}
g(D^{(1)}, D^{(2)}, \ldots, D^{(j-1)}, \delta, D^{(j+1)}, \ldots, D^{(n)}) &= \sigma_0 \delta e \\
&= \frac{1}{1+x} \left[ 1, \; x \right] (1+x)z \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} D e \\
&= z \left[ 1, \; x \right] \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = z. \qquad (6.40)
\end{aligned}$$

If neither of these two cases hold, then $\prod_{j=1}^{n} K^{(j)}$ must contain two or more δ matrices. Assume that there are $\gamma_\alpha$ instances of $D^\alpha$ which occur between two δ matrices, $0 \leq \alpha \leq n-2$. For $l \in \mathbb{N}$ and $c_0, c_1, \ldots, c_{l+1} \in \mathbb{N}$, application of Lemmas 6.3.5 and 6.3.6 produces

$$\begin{aligned}
g(D^{(1)}, \ldots, D^{(c_0)}, \delta, D^{(1)}, \ldots, D^{(c_1)}, \delta, \ldots, \delta, D^{(1)}, \ldots, D^{(c_l)}, \delta, D^{(1)}, \ldots, D^{(c_{l+1})})
&= \sigma_0 \delta D^{c_1} \delta \cdots \delta D^{c_l} \delta e \\
&= z^{l+1}(1 + xy^{c_1+1}) \cdots (1 + xy^{c_l+1}) \\
&= z^{l+1} \prod_{\alpha=0}^{n-2} (1 + xy^{\alpha+1})^{\gamma_\alpha}. \qquad (6.41)
\end{aligned}$$

The result in (6.39) follows since there are β = l+1 matrices of the δ variety. There are no other possible products $\prod_{j=1}^{n} K^{(j)}$, since they must contain either zero, one or at least two δ matrices. Hence the theorem is proved.
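The theorem can likewise be checked mechanically: evaluate g literally as the matrix product (6.21) and compare with the closed form (6.39) read off from the pattern of D and δ matrices. The sketch below encodes a sequence as a string of 'D' and 'd' characters and reuses D_xy from the sketch after Lemma 6.3.1; $\delta(x, y, z) = (1+x)z E D(x, y)$ with $E = \mathrm{diag}(1, 0)$, as in the proofs above.

```python
def delta_xyz(x, y, z):
    """Difference matrix delta(x, y, z) = (1+x) z E D(x, y), E = diag(1, 0)."""
    return (1 + x) * z * np.diag([1.0, 0.0]) @ D_xy(x, y)

def g_direct(seq, x, y, z):
    """g evaluated literally as sigma0 * K(1) ... K(n) * e, per (6.21)."""
    sigma0 = np.array([1.0, x]) / (1 + x)
    M = np.eye(2)
    for K in seq:
        M = M @ (D_xy(x, y) if K == 'D' else delta_xyz(x, y, z))
    return sigma0 @ M @ np.ones(2)

def g_closed(seq, x, y, z):
    """Closed form (6.39): z^beta times (1 + x y^(c+1)) per run of c
    state transition matrices enclosed between two delta matrices."""
    beta = seq.count('d')
    val = z ** beta
    if beta >= 2:
        for run in seq.split('d')[1:-1]:   # D-runs strictly between deltas
            val *= 1 + x * y ** (len(run) + 1)
    return val

x, y, z = 0.05, 0.9, 0.95
for seq in ['DDDD', 'DDdD', 'dDDd', 'dDdDDd', 'ddd', 'DdddD']:
    assert np.isclose(g_direct(seq, x, y, z), g_closed(seq, x, y, z))
print("Theorem 6.3.1 verified on sample sequences")
```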

The conditional spectral coefficients $Q_s(u_i|v)$ can now be written as polynomials in x, y and z using Theorem 6.3.1. Define the notation

$$Q_s(u_i|v) = \begin{cases} Q_s^{(0)}(x, y, z) & \text{for } u_i = 0, \\ Q_s^{(1)}(x, y, z) & \text{for } u_i = 1. \end{cases} \qquad (6.42)$$

Also, let $K_A^{(m)}$ denote the $m$th matrix in the product A in (6.16) and let $K_B^{(m+i)}$ denote the $m$th matrix in the product B in (6.17). Then by (6.15) and (6.42), the conditional spectral polynomials can be expressed in terms of the three burst-error


characteristics x, y and z as

$$\begin{aligned}
Q_s^{(u_i)}(x, y, z) &= c_1 g(K_A^{(1)}, K_A^{(2)}, \ldots, K_A^{(i-1)}, D, K_B^{(i+1)}, K_B^{(i+2)}, \ldots, K_B^{(n)}) \\
&\quad + c_2 g(K_A^{(1)}, K_A^{(2)}, \ldots, K_A^{(i-1)}, \delta, K_B^{(i+1)}, K_B^{(i+2)}, \ldots, K_B^{(n)}) \\
&= \begin{cases} c_1 + c_2 z & \text{for } \beta = 0, \\ c_1 z^\beta \displaystyle\prod_{l=1}^{n-1} (1 + xy^l)^{\gamma_l} + c_2 z^{\beta+1} \displaystyle\prod_{r=1}^{n-1} (1 + xy^r)^{\gamma_r} & \text{for } \beta \geq 1, \end{cases}
\end{aligned} \qquad (6.43)$$

where β is the number of δ(x, y, z) matrices in A D(x, y) B, $\gamma_l$ is the multiplicity of $D^{l-1}(x, y)$ embedded between two δ(x, y, z) matrices in A D(x, y) B, and $\gamma_r$ is the multiplicity of $D^{r-1}(x, y)$ embedded between two δ(x, y, z) matrices in A δ(x, y, z) B.

MacWilliams identity

Some of the procedures detailed in this thesis have used the spectral domain trellis,

which corresponds to the dual code, rather than to the code itself. It has been

shown that for certain codes, the use of such procedures is advantageous in terms

of computational complexity and/or storage requirements. There are similarities

between the relationship of the APPs to the conditional spectral polynomials and

the weight distribution of a code compared to its dual. Suppose that a systematic

binary (n, k) linear block code C contains $A_j$ codewords of weight j and that its $(n, n-k)$ dual code $C^\perp$ contains $B_j$ codewords of weight j, where $0 \leq j \leq n$. Define the weight polynomial for the code C as

$$A(z) = \sum_{j=0}^{n} A_j z^j, \qquad (6.44)$$

and the weight polynomial for the dual code $C^\perp$ as

$$B(z) = \sum_{j=0}^{n} B_j z^j. \qquad (6.45)$$

A concise way to describe the relationship between these two weight polynomials is the binary version of the MacWilliams identity [25]:

$$B(z) = 2^{-k}(1 + z)^n A\left( \frac{1-z}{1+z} \right). \qquad (6.46)$$

Identity (6.46) is particularly useful when investigating the performance of a

binary code over a BSC using syndrome decoding [26, 28]. In general, the weight


polynomial for a coset $V_t$, $0 \leq t \leq 2^{n-k}-1$, is given by

$$B_t(z) = \sum_{j=0}^{n} B_{jt} z^j, \qquad (6.47)$$

where Vt contains Bjt words of weight j. Syndrome decoding involves calculation

of coset probabilities Pt based on the displacement vector d of the received vector

from the transmitted codeword. The coset probabilities are defined as

$$P_t = P(d \in V_t). \qquad (6.48)$$

In [26] it is shown that for a BSC with crossover probability ε,

$$P_t = \frac{1}{2^{n-k}} B_t(1 - 2\varepsilon), \qquad (6.49)$$

and when t = 0, the above equation becomes the MacWilliams identity (6.46).

Such a derivation only applies for a memoryless channel and there is only one

variable involved. The work presented in [26] considered the case of the restricted

GEC and derived generalised weight polynomials for linear block codes used on this

particular type of channel with focus on calculating the performance of syndrome

decoding.

It is also possible to deploy the concept of GWPs for APP decoding and with

a channel which has memory. The generalisation does not provide weight distribu-

tions or probabilities relating to decodability of received words, rather it is used to

directly calculate the necessary APPs. In addition, it is a function of three vari-

ables. Rephrasing the original/dual relationship in (4.41) and (4.42), which determines each APP decoding decision, in terms of the three burst-error characteristics, the result may be stated as

$$B^{(u_i)}(x, y, z) = \sum_{s=0}^{2^{n-k}-1} Q_s^{(u_i)}(x, y, z), \qquad (6.50)$$

where $u_i \in \{0, 1\}$. Since the polynomials $B^{(u_i)}(x, y, z)$ generalise the concept of

weight polynomials, they shall be referred to by analogy as generalised weight poly-

nomials.

APP decoding procedure using generalised weight polynomials

Given (6.50), an APP decoding procedure for a code used on a binary restricted GEC

defined by burst-error characteristics x, y and z can be described in the following

way.


Procedure 6.1. Given is a binary (n, k) linear block code C in standard form, to

be used on a binary restricted GEC defined by burst-error characteristics x, y and z.

The linear block code C shall be defined by parity check matrix H. A codeword u is

transmitted over the channel and a word v is received. APP decoding using GWPs

can be performed using the following steps.

Step 1. $\forall s \in \{0, 1, \ldots, 2^{n-k}-1\}$, compute the dual codeword

$$u^\perp_s = sH = \left[ u^\perp_{s,1}, u^\perp_{s,2}, \ldots, u^\perp_{s,n} \right] \in C^\perp. \qquad (6.51)$$

Step 2. $\forall s \in \{0, 1, \ldots, 2^{n-k}-1\}$, $\forall i \in \{1, 2, \ldots, k\}$ and $\forall u_i \in GF(2)$, compute coefficients $c_1$ and $c_2$ using (6.18) and (6.19).

Step 3. $\forall s \in \{0, 1, \ldots, 2^{n-k}-1\}$, $\forall i \in \{1, 2, \ldots, k\}$ and $\forall u_i \in GF(2)$, compute conditional spectral polynomials $Q_s^{(u_i)}(x, y, z)$ using (6.43).

Step 4. $\forall i \in \{1, 2, \ldots, k\}$ and $\forall u_i \in GF(2)$, compute the generalised weight polynomials $B^{(u_i)}(x, y, z)$ by accumulating the $2^{n-k}$ conditional spectral polynomials $Q_s^{(u_i)}(x, y, z)$ as in (6.50).

Step 5. For each position $i \in \{1, 2, \ldots, k\}$, derive an APP decoding decision as

$$\hat{u}_i = \begin{cases} 0 & \text{if } B^{(0)}(x, y, z) \geq B^{(1)}(x, y, z), \\ 1 & \text{if } B^{(0)}(x, y, z) < B^{(1)}(x, y, z). \end{cases} \qquad (6.52)$$

Note that if the channel is described in terms of state transition and crossover

probabilities, the burst-error characteristics can be quickly found using (2.45), (2.46),

(2.47) and (6.12). Procedure 6.1 is only applicable for a GEC with the imposed re-

striction of pB = 0.5. This occurs because GEC models with other restrictions on

that parameter do not give the correct form of matrix probabilities which are easily

rendered into the three-variable polynomials in (6.43). Nevertheless, whenever a

GEC model is used and the BSC corresponding to the ‘bad’ state has minimal ca-

pacity, Procedure 6.1 provides an alternative to the matrix multiplication-dependent

Procedure 4.2. An example demonstrating how it can be used for decoding is given

in the next section.
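A numerical rendering of Procedure 6.1 is sketched below; it follows Steps 1-5 exactly, but evaluates each conditional spectral polynomial at a given point (x, y, z) through the function g_direct of the previous sketch rather than manipulating the polynomials symbolically. The helper name and data layout (H as a NumPy 0/1 array in standard form) are assumptions for illustration.

```python
def app_decode_gwp(H, v, x, y, z):
    """Numeric sketch of Procedure 6.1 for a binary (n, k) code."""
    m, n = H.shape          # m = n - k parity checks
    k = n - m
    u_hat = []
    for i in range(k):      # information positions (0-based)
        B = [0.0, 0.0]      # accumulators for B^(0), B^(1) of (6.50)
        for s in range(2 ** m):
            s_vec = [(s >> b) & 1 for b in range(m)]
            # Step 1: dual codeword u_s = s H (6.51).
            dual = [sum(sb * hb for sb, hb in zip(s_vec, H[:, j])) % 2
                    for j in range(n)]
            sign = (-1) ** sum(v[j] * dual[j] for j in range(n) if j != i)
            # Steps 2-3: coefficients (6.18)-(6.19) and the two g-values
            # of (6.43), with D/delta selected by the dual codeword bits.
            outer = ['D' if dual[j] == 0 else 'd' for j in range(n)]
            for ui in (0, 1):
                c1 = 0.5 * (-1) ** (ui * dual[i]) * sign
                c2 = 0.5 * (-1) ** (ui * dual[i] + v[i]) * sign
                seq_D = ''.join(outer[:i] + ['D'] + outer[i + 1:])
                seq_d = ''.join(outer[:i] + ['d'] + outer[i + 1:])
                # Step 4: accumulate into the GWP value (6.50).
                B[ui] += c1 * g_direct(seq_D, x, y, z) \
                       + c2 * g_direct(seq_d, x, y, z)
        # Step 5: decision rule (6.52).
        u_hat.append(0 if B[0] >= B[1] else 1)
    return u_hat
```

Applied to the (4,2) code of the next section with v = [1, 0, 0, 1] and the burst-error characteristics of (6.60), this sketch should reproduce the decision obtained there.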


6.4 Instructive Example

Consider the binary (4,2) linear block code C described in Example 2.1. However,

this time assume that v = [1, 0, 0, 1] is received on a binary restricted GEC and the

goal is to find an estimate $\hat{u}_2$ for the transmitted symbol $u_2$ at position i = 2. The

diagonal weighted trellises as calculated using (6.18), (6.19) and (6.43) for u2 = 0

and u2 = 1 are shown in Fig. 6.1 (a) and (b), respectively. The correspondence

between the zeroes and ones of the codewords of the dual code C⊥ as listed in

(4.57), and the positions of the state transition matrices D and difference matrices

δ in the first, third and fourth trellis sections is clear. The conditional spectral

coefficients can be calculated as a sum of products involving matrix probabilities D

and δ. Firstly for u2 = 0, the four coefficients for s = 0, 1, 2, 3 may be expressed as

$$\begin{aligned}
Q_0(u_2 = 0|v) &= \tfrac{1}{2}\sigma_0 D^4 e + \tfrac{1}{2}\sigma_0 D \delta D^2 e, \\
Q_1(u_2 = 0|v) &= -\tfrac{1}{2}\sigma_0 D^3 \delta e - \tfrac{1}{2}\sigma_0 D \delta D \delta e, \\
Q_2(u_2 = 0|v) &= -\tfrac{1}{2}\sigma_0 \delta D \delta D e - \tfrac{1}{2}\sigma_0 \delta^3 D e, \\
Q_3(u_2 = 0|v) &= \tfrac{1}{2}\sigma_0 \delta D \delta^2 e + \tfrac{1}{2}\sigma_0 \delta^4 e.
\end{aligned} \qquad (6.53)$$

On the other hand, for $u_2 = 1$, the resulting four coefficients may be reported as

$$\begin{aligned}
Q_0(u_2 = 1|v) &= \tfrac{1}{2}\sigma_0 D^4 e - \tfrac{1}{2}\sigma_0 D \delta D^2 e, \\
Q_1(u_2 = 1|v) &= \tfrac{1}{2}\sigma_0 D^3 \delta e - \tfrac{1}{2}\sigma_0 D \delta D \delta e, \\
Q_2(u_2 = 1|v) &= \tfrac{1}{2}\sigma_0 \delta D \delta D e - \tfrac{1}{2}\sigma_0 \delta^3 D e, \\
Q_3(u_2 = 1|v) &= \tfrac{1}{2}\sigma_0 \delta D \delta^2 e - \tfrac{1}{2}\sigma_0 \delta^4 e.
\end{aligned} \qquad (6.54)$$

These eight sums or differences of matrix products can then be converted to polynomials in x, y, and z using (6.43). In the case for $u_2 = 0$, the four conditional spectral polynomials are obtained as

$$\begin{aligned}
Q_0^{(0)}(x, y, z) &= \tfrac{1}{2}(1 + z), \\
Q_1^{(0)}(x, y, z) &= -\tfrac{1}{2}\left[ z + (1 + xy^2)z^2 \right], \\
Q_2^{(0)}(x, y, z) &= -\tfrac{1}{2}\left[ (1 + xy^2)z^2 + (1 + xy)^2 z^3 \right], \\
Q_3^{(0)}(x, y, z) &= \tfrac{1}{2}\left[ (1 + xy)(1 + xy^2)z^3 + (1 + xy)^3 z^4 \right].
\end{aligned} \qquad (6.55)$$


If $u_2 = 1$, a set of four slightly different conditional spectral polynomials results. These polynomials can be listed as

$$\begin{aligned}
Q_0^{(1)}(x, y, z) &= \tfrac{1}{2}(1 - z), \\
Q_1^{(1)}(x, y, z) &= \tfrac{1}{2}\left[ z - (1 + xy^2)z^2 \right], \\
Q_2^{(1)}(x, y, z) &= \tfrac{1}{2}\left[ (1 + xy^2)z^2 - (1 + xy)^2 z^3 \right], \\
Q_3^{(1)}(x, y, z) &= \tfrac{1}{2}\left[ (1 + xy)(1 + xy^2)z^3 - (1 + xy)^3 z^4 \right].
\end{aligned} \qquad (6.56)$$

The two generalised weight polynomials are then given by

$$B^{(0)}(x, y, z) = \tfrac{1}{2}\left[ 1 - 2(1 + xy^2)z^2 + xy(y-1)(1 + xy)z^3 + (1 + xy)^3 z^4 \right], \qquad (6.57)$$

$$B^{(1)}(x, y, z) = \tfrac{1}{2}\left[ 1 + xy(y-1)(1 + xy)z^3 - (1 + xy)^3 z^4 \right]. \qquad (6.58)$$

Assume the channel has the same values for the parameters P, Q and $p_G$ as given in the example in [70]. These three values are listed as

$$P = 1.68 \times 10^{-3}, \quad Q = 3.28 \times 10^{-2}, \quad p_G = 5.7 \times 10^{-3}. \qquad (6.59)$$

Clearly the value for $p_B$ in [70] of $2.19 \times 10^{-1}$ cannot be used here, as it is required that $p_B = 0.5$. The burst-error characteristics (to 5 d.p.) for this channel model can thus be calculated as

$$x = 0.05122, \quad y = 0.96552, \quad z = 0.94043. \qquad (6.60)$$

Substituting the above three values into (6.57) and (6.58), it follows that the values of the two relevant GWPs for this decoding decision and this channel can be expressed as

$$B^{(0)}(x, y, z) = 2.46 \times 10^{-2}, \quad B^{(1)}(x, y, z) = 4.72 \times 10^{-2}. \qquad (6.61)$$

Comparing these two values according to (6.52) means that the decoded bit can be determined as

$$\hat{u}_2 = 1. \qquad (6.62)$$
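The final arithmetic step is easily reproduced: evaluating the two GWPs (6.57) and (6.58) at the point (6.60) requires only a few lines, as in the sketch below.

```python
x, y, z = 0.05122, 0.96552, 0.94043   # burst-error characteristics (6.60)

# Generalised weight polynomials (6.57) and (6.58) for position i = 2.
B0 = 0.5 * (1 - 2 * (1 + x * y**2) * z**2
            + x * y * (y - 1) * (1 + x * y) * z**3
            + (1 + x * y)**3 * z**4)
B1 = 0.5 * (1 + x * y * (y - 1) * (1 + x * y) * z**3
            - (1 + x * y)**3 * z**4)

print(B0, B1)                  # approx. 2.46e-2, 4.72e-2, as in (6.61)
print(0 if B0 >= B1 else 1)    # decision rule (6.52): decoded bit is 1
```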


Figure 6.1: Weighted diagonal trellises of the binary (4,2) linear block code C used to compute spectral polynomials (a) $Q_s^{(u_2=0)}(x, y, z)$ and (b) $Q_s^{(u_2=1)}(x, y, z)$; $s = 0, 1, 2, 3$, for a binary restricted GEC.


6.5 Simulation Results

A description of the performance of some binary linear block codes when used over

binary restricted GECs and decoded using Procedure 6.1 is provided in this sec-

tion. The performance to be expected on channel models with a range of parameter
values was investigated using simulations run in MATLAB. Not all possible parameter
sets have been simulated; the results shown are a sample of the possible applications
of Procedure 6.1, intended to indicate the variety of its uses. Here, two codes which can correct two transmission errors per

word are examined.

6.5.1 (16,8) cyclic code

Computer simulations were carried out for the (16,8) cyclic block code described

in (3.105) over a GEC with pB = 0.5. The structure of this code is defined by the

generator polynomial coefficient vector

g = [1, 1, 1, 0, 1, 0, 1, 1, 1], (6.63)

and the columns of the generator matrix are then permuted into standard form.

The BER performance is shown in Fig. 6.2 for an average bit error probability

    pb = pG σG + 0.5 σB = 1%.        (6.64)

This value of pb is typical for a mobile radio channel. By (6.12), it follows that the

channel reliability factor is given by

z = 0.98. (6.65)

Simulations were performed for pairs of x and y values taken from the sets

x ∈ {0.004, 0.008, . . . , 0.02},

y ∈ {0, 0.05, . . . , 1}.        (6.66)

As expected, the performance improves as x decreases, since the channel has a higher

probability of being in the ‘good’ state G where the crossover probability pG is lower.

The performance also improves with a lower burst factor y, which corresponds to

increased statistical independence of errors.
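The channel half of such a simulation can be sketched as follows. This is a minimal Python sketch, not the MATLAB code actually used; it assumes the same parameter relations as above (x = P/Q, y = 1 − P − Q and z = σG(1 − 2pG) with pB = 0.5), and the function name is illustrative:

    import numpy as np

    def gec_error_stream(n_bits, x, y, z, rng=None):
        # one realisation of the error process of a binary restricted GEC
        if rng is None:
            rng = np.random.default_rng()
        Q = (1 - y) / (1 + x)        # Pr(B -> G), from y = 1 - P - Q and x = P/Q
        P = x * Q                    # Pr(G -> B)
        pG = (1 - z * (1 + x)) / 2   # from z = sigma_G (1 - 2 pG), sigma_G = 1/(1 + x)
        pB = 0.5                     # restricted channel
        errors = np.empty(n_bits, dtype=np.int8)
        bad = rng.random() < x / (1 + x)      # stationary sigma_B = x/(1 + x)
        for t in range(n_bits):
            errors[t] = rng.random() < (pB if bad else pG)
            bad = rng.random() < ((1 - Q) if bad else P)
        return errors

Each received word is then v = (codeword + errors) mod 2 and is passed to the decoder of Procedure 6.1.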


[Figure 6.2 (plot omitted): Performance of the (16,8) block code on a binary restricted GEC with pb = 1%; BER versus burst factor y, with curves for x = 0.004, 0.008, 0.012, 0.016 and 0.020.]

6.5.2 (22,13) Chen code

Another example of a code capable of correcting two errors is the quasi-perfect code

by Chen, Fan and Jin [60]. The 9 × 22 parity check matrix for this code is given

in (3.106). In this case, computer simulations were run for values of burst-error

characteristics taken from the sets

    x = 0.004,
    y ∈ {0, 0.1, . . . , 1},
    z ∈ {0.95, 0.98, 0.99},        (6.67)

thus corresponding to average bit error probabilities of 2.5%, 1% and 0.5%.

It can be observed from the plots of the BER performance of this code given in

Fig. 6.3 that for each of the channel reliability factor z values investigated, a decrease

in the burst factor y produces a decrease in the post-decoding BER. Additionally, it

is observed that a decrease in the channel reliability factor z results in an increase in

the post-decoding BER. This happens because a lower z value for a binary restricted

GEC corresponds to an increased crossover probability pG in the ‘good’ state G, and

thus more errors are likely to occur.


[Figure 6.3 (plot omitted): Performance of the (22,13) Chen code on a binary restricted GEC with average fade to connection time ratio x = 0.004; BER versus burst factor y, with curves for z = 0.95, 0.98 and 0.99.]

There is relatively little change in the performance of the Chen code over a

binary restricted GEC with z = 0.95, from y = 0 to y = 1. In these cases, the

crossover probability pG in the ‘good’ state has value 2.31 × 10−2. Whether the

channel is experiencing errors in bursts or independently, the crossover probabilities

are high and the code is not effective in correcting errors. The difference between
the cases y = 0 and y = 1 is more pronounced when z = 0.99. Here,

the corresponding value of the crossover probability pG is 3.02 × 10−3. The code is

not able to correct more than two errors per word, and thus it does not perform

particularly well when the burst factor y is high. However, the low value of pG

means that when errors occur more independently as y approaches zero, such errors

can be corrected and a low BER is observed.
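The quoted crossover probabilities can be recovered directly from x and z. A small Python check, assuming pB = 0.5 so that the ‘bad’ state contributes nothing and z = σG(1 − 2pG) with σG = 1/(1 + x):

    for z in (0.95, 0.99):
        pG = (1 - z * (1 + 0.004)) / 2
        print(z, round(pG, 5))       # 0.95 -> 0.0231, 0.99 -> 0.00302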

6.6 Summary

A second and more practically-oriented system of parameters for a binary restricted

GEC has been the focus of this chapter. This system has been designed to consider

the distribution of the error bursts rather than the crossover and state transition

probabilities.


An expression for the channel reliability factor was given in terms of the average

error rate over the two states of the channel. This was established by considering the

representation of a syndrome trellis, that is, one which is not used in the calculation

of a particular APP. The objective was then to take advantage of the restriction

on pB in order to determine expressions for the conditional spectral coefficients. A

large portion of the chapter presented a proof of the structure of this expression in

terms of any possible spectral domain trellis.

The binary MacWilliams identity was discussed as an entity providing infor-

mation as a function of one variable about the weight distribution of a code by

consideration of the weight distribution of its dual. Similarly, information about

how a received word should be decoded, as a function of the three burst-error char-

acteristics, is related through the GWPs to the conditional spectral coefficients. The

structure of these coefficients in terms of matrix probabilities depends on the ele-

ments of the dual codewords. These matrix probabilities in turn are dependent on

the burst-error characteristics.

As a result of this correspondence, an APP decoding procedure for a binary

restricted GEC was obtained. An example demonstrating the tasks involved in de-

coding a received word was included in order to reinforce the concepts involved in

this procedure. In particular, the steps of converting the conditional spectral co-

efficients in matrix form to an alternative description in terms of the burst-error

characteristics, and then construction of the GWPs leading to a decoding decision

were outlined. Finally, simulations of the decoding procedure on some binary linear

block codes were carried out, where it was observed that the resulting BER appears

to grow in proportion to both the average fade to connection time ratio and the

burst factor. By contrast, an increase in the value of the channel reliability factor

appears to result in a decreased BER, since the channel reliability factor is inversely

proportional to the average probability of a transmission error occurring. This sam-

ple of simulation results indicates the large number of options for further analysis of

the performance of binary linear block codes on a restricted GEC when combined

with APP decoding.


Chapter 7

Generalised Weight Polynomials for Non-binary Restricted GECs

It is also possible to perform APP decoding using GWPs when the code used is non-

binary. As in the binary case, the behaviour of a non-binary restricted GEC can be

described in terms of the burst-error characteristics. The channel reliability factor

does however need to be calculated differently when non-binary data is transmitted

over the channel. The GWPs again provide a method of calculating APPs using

polynomial evaluation, rather than matrix multiplication. A simple and familiar

relationship between the spectral domain trellis entries and the structure of the

GWPs is established. It is noted that this relationship has similarities with the

non-binary version of the MacWilliams identity.

This chapter is organised as follows. Firstly, the problem to be solved in this

chapter is stated in Section 7.1. In Section 7.2, the channel reliability factor for a

non-binary GEC is examined in further detail. Then, the three burst-error char-

acteristics are used to express the conditional spectral coefficients for a non-binary

restricted GEC in Section 7.3. A resemblance to the MacWilliams identity is dis-

cussed, after which the decoding algorithm can be described with reference to the

GWPs. An example of the decoding process is given in Section 7.4, and Section

7.5 contains a discussion of simulation results. Finally, the chapter is concluded in

Section 7.6.

The major contributions of this chapter are:

• The relationship between the channel reliability factor z and the matrix prob-

abilities D and ∆ for a non-binary GEC using the standard DMC model is

established.

• Derivation of the conditional spectral coefficients for APP decoding on a non-binary restricted GEC is given in terms of the three burst-error characteristics.

• Similarities between these conditional spectral polynomials and the non-binary

MacWilliams identity are noted.

• Description of an APP decoding algorithm for a non-binary restricted GEC

which is based on the evaluation of polynomials is given.

• Numerical examples which provide an indication of the possible uses of this

APP decoding algorithm are reported.

7.1 Problem Statement

Let D be a non-binary restricted GEC with a p-ary DMC in both states of its state

set S = {G,B}, together with an initial state distribution vector σ0. Since D is a

restricted channel, the probability of receiving any symbol whilst in state B given

a transmitted symbol must be identical for all possible received symbols. That is,

under the standard DMC model in Fig. 2.3(a), the crossover probability in the ‘bad’

state can be written as

    pB = 1/p.        (7.1)

If the alternative model in Fig. 2.3(b) was used, then the value of pB would be

slightly different. This scenario will not be considered in as much detail as the

standard model.

Suppose that C is an (n, k) linear block code in standard form over GF(p). The
linear block code C can be used to encode data prior to transmission over D. As D is a particular type of GEC, the APP decoding decisions ûi, for i ∈ {1, 2, . . . , k}, can

be found by computing (5.40). This equation is a statement about matrix products,

the entries of which are constructed from four channel parameters P , Q, pG and

pB. Incorporating (7.1), the challenge is to determine a closed-form polynomial
expression for ûi which instead involves the three burst-error characteristics. That is,

the task is to find a method of expressing the sums of conditional spectral coefficients

Qs(ui|v) in (5.40), not in terms of σ0, D, δ and e, but in terms of the average fade

to connection time ratio x, the burst factor y, and the channel reliability factor z.

Decoding a received word then involves determining k · p polynomials B(ui)(x, y, z),

one for each of the k values of i ∈ {1, 2, . . . , k} and each of the p values of ui ∈ GF (p).

By analogy with (5.40), it follows that the decoding decision for each information


symbol can be made by evaluating

    ûi = arg max_{ui ∈ GF(p)} { B^(ui)(x, y, z) }.        (7.2)

7.2 The Channel Reliability Factor for a Non-binary GEC

Two sets of parameters have been discussed for the non-binary GEC model. The

matrix multiplication algorithms given in Chapter 5 involved matrices containing

elements which were composed of the parameters P,Q, pG and pB. However in

Chapter 2, it was also explained that the behaviour of the channel could be de-

scribed by burst-error characteristics. Since the system by which the state changes

is identical to that of the binary GEC, the average fade to connection time ratio x

and the burst factor y are defined as in (2.45) and (2.46). The channel reliability

factor z is, however, defined slightly differently.

Here the results of [31] are applied directly. For a code of length n, a description

of the syndrome trellis, which does not consider the likelihood of any specific trans-

mitted symbol in any specific position and uses matrices of the spectral domain, is

given by

    Θ_H = ∏_{j=1}^{n} diag{ Θ_{s,hj} },        (7.3)

where

    Θ_{s,hj} = Σ_{g ∈ GF(p)} D_g w^{<s, g·hj>} = { D   if <s, hj> = 0,
                                                 { ∆   if <s, hj> ≠ 0.        (7.4)

In this formulation, w is a complex pth root of unity, s = vec_p(s), and vec_p(·) denotes
the p-ary vector representation of its decimal input. Additionally, <s, hj> refers
to the dot product of the vectors s and hj, where 0 ≤ s ≤ p^{n−k} − 1 and 1 ≤ j ≤ n.

The result in (7.4) can be derived using Lemma 5.2.1 and (5.27). By the same

arguments presented in Section 6.2 for binary codes, selecting D as the definition of

the channel reliability factor is also implausible for non-binary codes because of its

complete independence from the crossover probabilities of the channel. Hence the

definition using the difference matrix ∆ is selected. The matrix to scalar conversion

is again performed using the stationary state distribution σ0 and the column vector

e of all ones, so that the channel reliability factor for a non-binary GEC can be

expressed as

z = σ0∆e. (7.5)


[Figure 7.1 (graph omitted): The relationship between the channel reliability factor z and the mean SER ps; by (7.6), z falls linearly from 1 at ps = 0 to 0 at ps = (p − 1)/p.]

This confirms the expression for z in terms of the average SER for a GEC using the

standard DMC model as given in Section 2.2. Explicitly, using the average SER of

that non-binary GEC model as given in (2.59), the relationship may be expressed

as

    z = [σG, σB] [ 1 − p·pG      0     ] [ 1 − P     P   ] [ 1 ]
                 [    0      1 − p·pB  ] [   Q     1 − Q ] [ 1 ]

      = σG(1 − p·pG) + σB(1 − p·pB)

      = 1 − (p/(p − 1)) · ps.        (7.6)

Although not explicitly derived here, it is to be noted that the expression for the

channel reliability factor z in the case of a non-binary GEC using the alternative

DMC model is also that of (7.6).
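The two forms of (7.6) can be checked numerically. A short Python sketch follows, reusing the transition probabilities of (6.59) purely as illustrative inputs; the expression ps = (p − 1)(σG pG + σB pB) for the mean SER follows from equating the last two lines of (7.6):

    import numpy as np

    p = 3
    P, Q = 1.68e-3, 3.28e-2                  # illustrative transition probabilities
    pG, pB = 5.7e-3, 1.0 / p                 # pB = 1/p on the restricted channel
    sigma0 = np.array([Q, P]) / (P + Q)      # stationary [sigma_G, sigma_B]
    z_mat = sigma0 @ np.diag([1 - p * pG, 1 - p * pB]) \
                   @ np.array([[1 - P, P], [Q, 1 - Q]]) @ np.ones(2)
    ps = (p - 1) * (sigma0 @ np.array([pG, pB]))   # mean SER of the two-state model
    assert np.isclose(z_mat, 1 - p / (p - 1) * ps) # the closed form of (7.6)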

As shown in the graph of the channel reliability factor z as a function of the

average SER ps in Fig. 7.1, the maximum value of z is one, which occurs when ps

is zero. The minimum value of z is zero and occurs when ps reaches its maximum

value of (p − 1)/p. The mean SER ps of the channel may be practically easier to obtain

than the two individual crossover probabilities, due to the channel being described

as a Markov model where the current state is hidden.

The crossover probabilities for a non-binary GEC are restricted to be at most 1/p,
since all possible capacities of symmetric DMCs can be observed by limiting the
crossover probability to [0, 1/p]. This can be shown in the following way.

Lemma 7.2.1. The function representing the channel capacity of a symmetric DMC
over GF(p), for a positive prime p, assumes all possible values in [0, 1] when restricting
its domain to [0, 1/p].

Proof. The capacity function in p-ary units of information for a standard DMC
model with crossover probability ε is defined in (2.14) as

    f(ε) = 1 + [1 − (p − 1)ε] log_p[1 − (p − 1)ε] + (p − 1)ε log_p ε.        (7.7)

The result in (7.7) can be verified with the alternative DMC model in [71]. Although
f(ε) is undefined for ε < 0, f is considered continuous on [0, 1/p] as f consists of
products and sums of logarithms, which are themselves continuous on (0, 1/p]. Furthermore,

    lim_{ε→0+} f(ε) = 1        (7.8)

is sufficient to ensure continuity at the endpoint of the domain of f. A DMC model
can be defined for 0 ≤ ε ≤ 1/(p − 1); however, for all primes p, the inequality

    1/p < 1/(p − 1)        (7.9)

holds, and thus it can be said that f is continuous on the shorter interval [0, 1/p].

The values of f at the endpoints of this interval are

    f(0) = 1,     f(1/p) = 0.        (7.10)

Then by the Intermediate Value Theorem, for all capacities c ∈ (0, 1), there exists

a crossover probability ε ∈ (0, 1/p) such that f(ε) = c. Combining this fact with
(7.10) means that restriction of the domain of f to [0, 1/p] is sufficient to result in the
maximal range of f, which is [0, 1].
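The proof can be exercised numerically. The Python sketch below implements (7.7) and inverts it by bisection; the bisection additionally assumes that f is strictly decreasing on [0, 1/p], which holds for the symmetric DMC but is not required by the Intermediate Value Theorem argument above:

    import math

    def f(eps, p):
        # capacity (7.7) of a symmetric DMC over GF(p), in p-ary units
        if eps == 0.0:
            return 1.0                      # limit value (7.8)
        a = 1 - (p - 1) * eps
        return 1 + a * math.log(a, p) + (p - 1) * eps * math.log(eps, p)

    def crossover_for_capacity(c, p, tol=1e-12):
        # invert f on [0, 1/p]: f(0) = 1 and f(1/p) = 0 by (7.10)
        lo, hi = 0.0, 1.0 / p
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            lo, hi = (mid, hi) if f(mid, p) > c else (lo, mid)
        return 0.5 * (lo + hi)

    print(f(1.0 / 3, 3))                    # ~0, confirming (7.10) for p = 3
    print(crossover_for_capacity(0.5, 2))   # ~0.1100 for the binary case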

Thus for any GEC model, only crossover probabilities up to and including 1/p need
to be considered. The behaviour of a GEC may then be described by the parameters
x, y, z and pB, where 0 ≤ pB ≤ 1/p. It is also possible to express the conditional spectral

coefficient matrices as developed in Chapter 5 in terms of these four parameters.

7.3 Derivation of Non-binary APPs using Generalised Weight Polynomials

In Chapter 6, it was demonstrated how the ability to describe a binary restricted

GEC in terms of three parameters produced an alternative to expressing the condi-

tional spectral coefficients Qs(ui|v) as matrix products. The alternative was to write


the conditional spectral coefficients as polynomials with a structure determined by

the distribution of the state transition matrices D and difference matrices δ in each

matrix product. The same task is performed here for the non-binary restricted GEC.

The similarity of the binary and non-binary results will be shown.

For brevity, only the non-binary restricted GEC using the standard DMC model

will be discussed. Application of the restriction with the standard model means

    pB = 1/p.        (7.11)

With this restriction in place, the channel can be described by the three burst-

error characteristics x, y and z. As this is a GEC, the definitions of the burst-error

characteristics x and y in terms of P andQ as given in (2.45) and (2.46), respectively,

are applicable here. Also, (2.38) and (2.81) combine to produce the expression

z = (1 − ppG)Q

P +Q(7.12)

for the channel reliability factor. Equation (7.11) implies the use of the difference

matrix δ to describe the conditional spectral coefficients Qs(ui|v). Consequently,

the spectral domain is appropriate to use and the notation of the state transition

matrix D and the difference matrix δ as given in (2.36) and (2.77), respectively,

will be adopted. The subsequent combination of the expression for the conditional

spectral coefficient matrices Qs(ui|v) given in (5.30) with (5.31) and (7.11) gives a

new way of writing the conditional spectral coefficients as

    Q_s(ui|v) = σ0 ∏_{j=1}^{i−1} { w^{vj·u⊥_{s,j}} [ δ_{u⊥_{s,j},0} D + (1 − δ_{u⊥_{s,j},0}) δ ] }

                × (w^{ui·u⊥_{s,i}} / p) [ D + (δ_{ui,vi} p − 1) δ ]

                × ∏_{j=i+1}^{n} { w^{vj·u⊥_{s,j}} [ δ_{u⊥_{s,j},0} D + (1 − δ_{u⊥_{s,j},0}) δ ] } e        (7.13)

              = c1 σ0 A D B e + c2 σ0 A δ B e,

where

    A = ∏_{j=1}^{i−1} [ δ_{u⊥_{s,j},0} D + (1 − δ_{u⊥_{s,j},0}) δ ],        (7.14)

    B = ∏_{j=i+1}^{n} [ δ_{u⊥_{s,j},0} D + (1 − δ_{u⊥_{s,j},0}) δ ],        (7.15)


and

    c1 = (w^{ui·u⊥_{s,i}} / p) ∏_{j=1, j≠i}^{n} w^{vj·u⊥_{s,j}},        (7.16)

    c2 = (w^{ui·u⊥_{s,i}} / p) (δ_{ui,vi} p − 1) ∏_{j=1, j≠i}^{n} w^{vj·u⊥_{s,j}}.        (7.17)

In addition, let

M = {D, δ} (7.18)

and note that each matrix within the products A and B is a member of the set

M. It is now possible to reformulate (7.13) in terms of the burst-error characteristic

definitions for the stationary state distribution vector σ0(x), the state transition

matrix D(x, y) and the difference matrix δ(x, y, z) as given in (2.72), (2.73) and

(2.74), respectively.

Define a function g for a linear block code of length n over GF (p) being used on

a restricted GEC with stationary state distribution vector σ0 as

    g : M × M × · · · × M → Z[x, y, z],
    g(K^(1), K^(2), . . . , K^(n)) = σ0 ∏_{j=1}^{n} K^(j) e,        (7.19)

where

    K^(j) ∈ M   ∀ j ∈ {1, 2, . . . , n}.        (7.20)

Then, Theorem 6.3.1 also applies in the non-binary case and

    g(K^(1), K^(2), . . . , K^(n)) = { z^β                                   if β = 0, 1,
                                     { z^β ∏_{r=1}^{n−1} (1 + x y^r)^{γr}    if 2 ≤ β ≤ n,        (7.21)

where the expression ∏_{j=1}^{n} K^(j) contains β instances of the difference matrix δ
and γr occurrences of D^{r−1}, that is, of r − 1 consecutive instances of the state
transition matrix D embedded between two δ matrices.

It then becomes possible to rewrite the conditional spectral coefficients Qs(ui|v)
as polynomials in terms of the burst-error characteristics x, y and z. Let this polynomial
be denoted Q_s^(ui)(x, y, z). To reference each matrix in (7.14) and (7.15), define
the notation

    A = K_A^(1) K_A^(2) · · · K_A^(i−1),        (7.22)
    B = K_B^(i+1) K_B^(i+2) · · · K_B^(n).        (7.23)


Then, (7.13) can be rewritten using (7.21) as

    Q_s^(ui)(x, y, z) = c1 g(K_A^(1), . . . , K_A^(i−1), D, K_B^(i+1), . . . , K_B^(n))
                        + c2 g(K_A^(1), . . . , K_A^(i−1), δ, K_B^(i+1), . . . , K_B^(n))

                      = { c1 + c2 z                                                                   for β = 0,
                        { c1 z^β ∏_{l=1}^{n−1} (1 + x y^l)^{γl} + c2 z^{β+1} ∏_{r=1}^{n−1} (1 + x y^r)^{γr}   for β ≥ 1,        (7.24)

where A, B, c1 and c2 are defined in (7.14)-(7.17). Furthermore, β is the number
of difference matrices δ(x, y, z) in A D(x, y) B, γl is the multiplicity of D^{l−1}(x, y)
embedded between two δ(x, y, z) matrices in A D(x, y) B, and γr is the multiplicity
of D^{r−1}(x, y) embedded between two δ(x, y, z) matrices in A δ(x, y, z) B.
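In computational terms, (7.21) reduces to a run-length count over the sequence of matrices. A minimal Python sketch (the function name g_eval and the label encoding are illustrative):

    def g_eval(labels, x, y, z):
        # labels: 'D' for the state transition matrix, 'd' for the difference
        # matrix delta; returns the value of (7.21)
        pos = [j for j, lab in enumerate(labels) if lab == 'd']  # delta positions
        val = z ** len(pos)                                      # z^beta
        for a, b in zip(pos, pos[1:]):
            # (b - a - 1) D's between two deltas give a factor (1 + x y^(b-a))
            val *= 1 + x * y ** (b - a)
        return val

    # sigma0 delta D delta delta e: beta = 3 with D-runs of lengths 1 and 0,
    # so g = z^3 (1 + x y^2)(1 + x y)
    assert g_eval(['d', 'D', 'd', 'd'], 2.0, 3.0, 5.0) == 5.0**3 * (1 + 2 * 3**2) * (1 + 2 * 3)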

Non-binary MacWilliams identity

There is also a non-binary version of the MacWilliams identity [25]. For a systematic

(n, k) linear block code C over GF (p) containing Aj codewords of weight j, and its

(n, n− k) dual code C⊥ containing Bj codewords of weight j, 0≤j≤n, the identity

may be expressed in terms of an indeterminate z as

    A( (1 − z) / (1 + (p − 1)z) ) = ( p^k / [1 + (p − 1)z]^n ) B(z).        (7.25)

In this formulation, the weight polynomial A(z) for the linear block code C is

    A(z) = Σ_{j=0}^{n} Aj z^j        (7.26)

and the weight polynomial B(z) for the dual code C⊥ is

    B(z) = Σ_{j=0}^{n} Bj z^j.        (7.27)

There is thus a transformation involved in obtaining the weight polynomial A(z) for

the code C from the weight polynomial B(z) for the dual code C⊥. In a similar way,

(5.38) demonstrated that the vector P(ui|v) of a posteriori probabilities is related to

the vector Q(ui|v) of conditional spectral coefficients through a transformation ma-

trix W_{p^{n−k}}. Specifically, the relationship between the original and spectral domains
was given by

    P(ui|v) = (1/p^{n−k}) Q(ui|v) W^H_{p^{n−k}}.        (7.28)


As shown in Chapter 5, the APP decoding decisions are made by comparing the first

elements P0(ui|v) of the vectors P(ui|v) for the different values of ui. In (7.2), the

polynomials B(ui)(x, y, z) were conceived as generalisations of these first elements

P0(ui|v) to trivariate polynomials. Since (7.24) demonstrates how to express the

conditional spectral coefficients Qs(ui|v) as functions Q_s^(ui)(x, y, z) of the three burst-

error characteristics x, y and z, consideration of the structure of (7.2) and (7.28)

reveals that

    B^(ui)(x, y, z) = Σ_{s=0}^{p^{n−k}−1} Q_s^(ui)(x, y, z)        (7.29)

provides the necessary link between the original domain polynomials B(ui)(x, y, z)

and the spectral domain polynomials Q(ui)s (x, y, z). The weight polynomials of (7.25)

possess a similar dual relationship, but since (7.29) has generalised the relationship

of the polynomials B(ui)(x, y, z) with their spectral domain counterparts to three

variables, the polynomials B(ui)(x, y, z) shall be referred to as generalised weight

polynomials. It therefore follows that the final decoding decision for each position

i ∈ {1, 2, . . . , k} is given by

    ûi = arg max_{ui ∈ GF(p)} { B^(ui)(x, y, z) }.        (7.30)

APP decoding procedure using generalised weight polynomials

It is now possible to describe a procedure which performs APP decoding of non-

binary linear block codes over a restricted GEC.

Procedure 7.1. Given is an (n, k) linear block code C over GF (p). The code is

in standard form, defined by parity check matrix H, and is to be used on a p-ary

restricted GEC defined by burst-error characteristics x, y and z. A codeword u is

transmitted over the channel and a word v is received. APP decoding using GWPs

consists of the following steps.

Step 1. ∀ s ∈ {0, 1, . . . , p^{n−k} − 1}, compute the dual codeword

    u⊥_s = s H = [u⊥_{s,1}, u⊥_{s,2}, . . . , u⊥_{s,n}] ∈ C⊥.        (7.31)

Step 2. ∀ s ∈ {0, 1, . . . , p^{n−k} − 1}, ∀ i ∈ {1, 2, . . . , k} and ∀ ui ∈ GF(p), compute coefficients c1 and c2 using (7.16) and (7.17).

Step 3. ∀ s ∈ {0, 1, . . . , p^{n−k} − 1}, ∀ i ∈ {1, 2, . . . , k} and ∀ ui ∈ GF(p), compute conditional spectral polynomials Q_s^(ui)(x, y, z) using (7.24).


Step 4. ∀ i ∈ {1, 2, . . . , k} and ∀ ui ∈ GF(p), compute the generalised weight polynomials B^(ui)(x, y, z) by accumulating the p^{n−k} conditional spectral polynomials Q_s^(ui)(x, y, z) as in (7.29).

Step 5. Derive an APP decoding decision ûi for the ith transmitted symbol ui for each position i ∈ {1, 2, . . . , k} using (7.30).
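A compact numerical rendering of Procedure 7.1 is sketched below (Python, 0-based indexing). It evaluates the GWPs directly at a given (x, y, z) rather than constructing them symbolically; all names are illustrative, and g_eval is the helper from the sketch following (7.24):

    import itertools
    import numpy as np

    def g_eval(labels, x, y, z):                       # (7.21), as sketched earlier
        pos = [j for j, lab in enumerate(labels) if lab == 'd']
        val = z ** len(pos)
        for a, b in zip(pos, pos[1:]):
            val *= 1 + x * y ** (b - a)
        return val

    def decode_gwp(H, v, x, y, z, p):
        # Steps 1-5 of Procedure 7.1 for a standard-form (n, k) code over GF(p)
        n_k, n = H.shape
        k = n - n_k
        w = np.exp(-2j * np.pi / p)                    # complex p-th root of unity
        duals = [np.array(s) @ H % p                   # Step 1: (7.31)
                 for s in itertools.product(range(p), repeat=n_k)]
        u_hat = []
        for i in range(k):
            B = np.zeros(p)                            # GWP value for each ui
            for ui in range(p):
                for d in duals:
                    phase = np.prod([w ** (v[j] * d[j]) for j in range(n) if j != i])
                    c1 = w ** (ui * d[i]) / p * phase            # Step 2: (7.16)
                    c2 = c1 * (p * (ui == v[i]) - 1)             #         (7.17)
                    lab = ['D' if dj == 0 else 'd' for dj in d]
                    B[ui] += (c1 * g_eval(lab[:i] + ['D'] + lab[i+1:], x, y, z)
                              + c2 * g_eval(lab[:i] + ['d'] + lab[i+1:], x, y, z)).real
                    # Steps 3-4: (7.24), accumulated as in (7.29)
            u_hat.append(int(np.argmax(B)))            # Step 5: (7.30)
        return u_hat

With H taken from (3.107), decode_gwp(H, np.array([1, 2, 2, 0]), 1.2e-2, 5e-2, 9.85e-1, 3) would be expected to reproduce the decision û2 = 1 of the worked example in Section 7.4.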

7.4 Instructive Example

Let C be the (4,2) linear block code over GF(3) as given in (3.107). Assume that
v = [1, 2, 2, 0] is received through a ternary restricted GEC with pB = 1/3 and it is
required that the second transmitted symbol u2 be estimated using APP decoding.
The estimate û2 of this symbol can be found by applying Procedure 7.1. Diagonal
weighted trellises for the three possibilities u2 = 0, u2 = 1 and u2 = 2 are similar to
those in Figs. 5.1-5.3. However, ∆ has been replaced by δ since the channel is
restricted. Additionally, the conditional spectral polynomials Q_s^(ui)(x, y, z)

are given in terms of the burst-error characteristics x, y and z. The results are

presented in Figs. 7.2-7.4. Referring to the codewords of C⊥ in (5.81), note the

correspondence in the first, third and fourth trellis sections between the locations

of the state transition matrices D(x, y) and the positions of the zero entries of the

dual codewords. Furthermore, the locations of the difference matrices δ(x, y, z) in

trellis sections one, three and four correspond to the positions of nonzero entries of

the dual codewords. Since p = 3, w is fixed at e−2π3

. The nine conditional spectral

polynomials in each of the three cases are obtained by multiplying across each row of

the trellis and using (7.16), (7.17) and (7.24). That is, considering the case u2 = 0,

the conditional spectral polynomials can be listed as

    Q_0^(0)(x, y, z) = (1/3)(1 − z),
    Q_1^(0)(x, y, z) = (1/3) w [z^2(1 + xy^3) − z^3(1 + xy)(1 + xy^2)],
    Q_2^(0)(x, y, z) = (1/3) w^2 [z^2(1 + xy^3) − z^3(1 + xy)(1 + xy^2)],
    Q_3^(0)(x, y, z) = (1/3) w [z^2(1 + xy^2) − z^3(1 + xy)^2],
    Q_4^(0)(x, y, z) = (1/3) w^2 [z^2(1 + xy) − z^3(1 + xy)^2],
    Q_5^(0)(x, y, z) = (1/3) [z^3(1 + xy)(1 + xy^2) − z^4(1 + xy)^3],
    Q_6^(0)(x, y, z) = (1/3) w^2 [z^2(1 + xy^2) − z^3(1 + xy)^2],
    Q_7^(0)(x, y, z) = (1/3) [z^3(1 + xy)(1 + xy^2) − z^4(1 + xy)^3],
    Q_8^(0)(x, y, z) = (1/3) w [z^2(1 + xy) − z^3(1 + xy)^2].        (7.32)


If u2 = 1, then the conditional spectral polynomials can be reported as

    Q_0^(1)(x, y, z) = (1/3)(1 − z),
    Q_1^(1)(x, y, z) = (1/3)[z^2(1 + xy^3) − z^3(1 + xy)(1 + xy^2)],
    Q_2^(1)(x, y, z) = (1/3)[z^2(1 + xy^3) − z^3(1 + xy)(1 + xy^2)],
    Q_3^(1)(x, y, z) = (1/3)[z^2(1 + xy^2) − z^3(1 + xy)^2],
    Q_4^(1)(x, y, z) = (1/3)[z^2(1 + xy) − z^3(1 + xy)^2],
    Q_5^(1)(x, y, z) = (1/3)[z^3(1 + xy)(1 + xy^2) − z^4(1 + xy)^3],
    Q_6^(1)(x, y, z) = (1/3)[z^2(1 + xy^2) − z^3(1 + xy)^2],
    Q_7^(1)(x, y, z) = (1/3)[z^3(1 + xy)(1 + xy^2) − z^4(1 + xy)^3],
    Q_8^(1)(x, y, z) = (1/3)[z^2(1 + xy) − z^3(1 + xy)^2].        (7.33)

Finally, under the supposition of u2 = 2, the conditional spectral polynomials can

be calculated as

    Q_0^(2)(x, y, z) = 1/3 + (2/3)z,
    Q_1^(2)(x, y, z) = (1/3) w^2 z^2(1 + xy^3) + (2/3) w^2 z^3(1 + xy)(1 + xy^2),
    Q_2^(2)(x, y, z) = (1/3) w z^2(1 + xy^3) + (2/3) w z^3(1 + xy)(1 + xy^2),
    Q_3^(2)(x, y, z) = (1/3) w^2 z^2(1 + xy^2) + (2/3) w^2 z^3(1 + xy)^2,
    Q_4^(2)(x, y, z) = (1/3) w z^2(1 + xy) + (2/3) w z^3(1 + xy)^2,
    Q_5^(2)(x, y, z) = (1/3) z^3(1 + xy)(1 + xy^2) + (2/3) z^4(1 + xy)^3,
    Q_6^(2)(x, y, z) = (1/3) w z^2(1 + xy^2) + (2/3) w z^3(1 + xy)^2,
    Q_7^(2)(x, y, z) = (1/3) z^3(1 + xy)(1 + xy^2) + (2/3) z^4(1 + xy)^3,
    Q_8^(2)(x, y, z) = (1/3) w^2 z^2(1 + xy) + (2/3) w^2 z^3(1 + xy)^2.        (7.34)

The three generalised weight polynomials which will be used in order to determine

û2 are obtained by adding each set of nine polynomials. After performing these

additions, the GWPs can be determined as

    B^(0)(x, y, z) = [1 − z − z^2 m(x, y) + z^3(5 + 2xy + 3xy^2)(1 + xy) − 2z^4(1 + xy)^3] / 3,        (7.35)

    B^(1)(x, y, z) = [1 − z + 2z^2 m(x, y) − 4z^3(1 + xy)^2 − 2z^4(1 + xy)^3] / 3,        (7.36)

    B^(2)(x, y, z) = [1 + 2z − z^2 m(x, y) − 4z^3(1 + xy)^2 + 4z^4(1 + xy)^3] / 3,        (7.37)

where

    m(x, y) = 3 + xy(1 + y + y^2)        (7.38)

denotes an expression which is common to all three GWPs. The values for the burst-error characteristics x, y and z can now be substituted into (7.35)-(7.38).


[Figure 7.2 (trellis diagram omitted): Weighted diagonal trellis of the (4,2) linear block code C over GF(3) used to compute conditional spectral polynomials Q_s^(u2=0)(x, y, z); s = 0, 1, . . . , 8.]


[Figure 7.3 (trellis diagram omitted): Weighted diagonal trellis of the (4,2) linear block code C over GF(3) used to compute conditional spectral polynomials Q_s^(u2=1)(x, y, z); s = 0, 1, . . . , 8.]


[Figure 7.4 (trellis diagram omitted): Weighted diagonal trellis of the (4,2) linear block code C over GF(3) used to compute conditional spectral polynomials Q_s^(u2=2)(x, y, z); s = 0, 1, . . . , 8.]


According to (7.30), the three polynomials B^(u2)(x, y, z) must be evaluated and then
compared in magnitude. The estimate û2 for the transmitted symbol u2 is the element of GF(3)

which produces the highest evaluation amongst the three GWPs after substitution

of the three burst-error characteristics x, y and z. For example, if the GEC model

has burst-error characteristics

    x = 1.2 × 10^−2,
    y = 5 × 10^−2,
    z = 9.85 × 10^−1,        (7.39)

then the values of the three GWPs can be determined as

    B^(0)(x, y, z) = 3.57 × 10^−5,
    B^(1)(x, y, z) = 4.14 × 10^−2,
    B^(2)(x, y, z) = 1.19 × 10^−3.        (7.40)

Therefore the result of the decoding can be calculated as û2 = 1.
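These values can be reproduced by direct evaluation of (7.35)-(7.38); a short Python check:

    x, y, z = 1.2e-2, 5e-2, 9.85e-1          # the burst-error characteristics (7.39)
    xy = x * y
    m = 3 + xy * (1 + y + y**2)              # (7.38)
    B0 = (1 - z - z**2 * m + z**3 * (5 + 2*xy + 3*x*y**2) * (1 + xy)
          - 2 * z**4 * (1 + xy)**3) / 3
    B1 = (1 - z + 2 * z**2 * m - 4 * z**3 * (1 + xy)**2 - 2 * z**4 * (1 + xy)**3) / 3
    B2 = (1 + 2*z - z**2 * m - 4 * z**3 * (1 + xy)**2 + 4 * z**4 * (1 + xy)**3) / 3
    print(B0, B1, B2)   # approx 3.57e-5, 4.14e-2, 1.19e-3, matching (7.40)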

7.5 Simulation Results

Some computer simulations of APP decoding using Procedure 7.1 were carried out

using MATLAB. The SER performance for a selection of different burst-error
characteristic values was obtained for two codes. The objective of the simulations is
not to provide a complete performance analysis of the codes; rather, some observations
can be made and the possibilities for further analysis can be seen.

7.5.1 (4,2) Hamming code over GF(3)

Computer simulations of decoding using Procedure 7.1 were first carried out for the
(4,2) Hamming code over GF(3) described in (3.107). The channel is a ternary

restricted GEC. A plot of the SER values obtained is given in Fig. 7.5. The value

of the channel reliability factor z was fixed at 9.85 × 10−1, corresponding to a mean

symbol error rate of ps = 1%. Pairs of values for the average fade to connection

time ratio x and the burst factor y were chosen from the sets

    x ∈ {0.003, 0.006, . . . , 0.015},
    y ∈ {0, 0.05, . . . , 1}.        (7.41)

As can be observed from Fig. 7.5, a decrease in the average fade to connection time

ratio x means the GEC is more likely to be in the ‘good’ state G, where the crossover


probability is lower. Thus, the simulation results are consistent with the hypothesis

that post-decoding SER increases with the average fade to connection time ratio x,

as was observed in Section 6.5.

There is a marked improvement in the performance of this code as the burst

factor y decreases, particularly for high values of the average fade to connection

time ratio x as the burst factor y tends toward zero from above. In this case, the

error bursts are both rare and decreasing in duration. Thus, it becomes increasingly

probable that at most one transmission error per received word has occurred. So

the code is increasingly likely to be able to correct such errors and the resulting SER

decreases sharply.

It is also to be observed that the five values of the average fade to connection time

ratio x produce the same post-decoding SER when the burst factor y is zero. It can

be seen that this is the expected behaviour by considering (7.24) for y = 0. Suppose

the ith transmitted symbol ui of a codeword from a code C is being estimated and

let β be the number of nonzero entries in the sth codeword of the dual code C⊥,

0 ≤ s ≤ p^{n−k} − 1, other than in the ith position of that word. It can then be shown
that the conditional spectral polynomial Q_s^(ui)(x, y, z) for this situation is given by

    Q_s^(ui)(x, 0, z) = z^β (c1 + c2 z).        (7.42)

Thus by (7.29) and (7.30), when the burst factor y is zero, the APP decoding deci-

sions are independent of the average fade to connection time x, which explains the

intersection of the five curves in Fig. 7.5. A similar argument involving the evalua-

tion of Q_s^(ui)(0, y, z) can be used to show that when the average fade to connection

time x is zero, the SER achieved is independent of the burst factor y.

7.5.2 (26,22) BCH code over GF(3)

The Bose-Chaudhuri-Hocquenghem (BCH) codes were one of the major advances in

the history of coding theory [72]. Although better codes have since been discovered,

at the time they represented a class of codes with good error correction capabilities.

Their design was general enough so that they could be implemented over any field.

In fact, the Hamming codes are a subclass of the BCH codes. Gorenstein and

Zierler [73] noted the codes’ ability for correction of errors which occurred in bursts,

rather than independently. In this example, a code over GF (3) is constructed. The

BCH codes are cyclic, so if the coefficients of the generator polynomial are given as

g = [1, 2, 1, 2, 2] (7.43)


[Figure 7.5 (plot omitted): Performance of the (4,2) Hamming code over GF(3) on restricted GECs with pB = 1/3, ps = 1%; SER versus burst factor y, with curves for x = 0.003, 0.006, 0.009, 0.012 and 0.015.]

[Figure 7.6 (plot omitted): Performance of the (26,22) BCH code over GF(3) on restricted GECs with pB = 1/3, ps = 1%; SER versus burst factor y, with curves for x = 0.005, 0.01 and 0.015.]


and n is set as 3^3 − 1 = 26, then the result is a 22 × 26 generator matrix with zeroes

in its upper right and lower left regions, and g cyclically shifted one position to the

right in each row. This is however not in standard form. By performing elementary

row operations, a generator matrix in standard form for an equivalent code C is

found as

G = [I22 | K], (7.44)

where

K =

2 1 1 1 1 0 1 2 0 1 1 2 1 1 2 0 2 0 1 0 0 2

1 1 0 0 0 1 2 2 2 2 0 2 1 0 2 2 1 2 2 1 0 1

2 2 2 1 1 0 2 1 2 0 0 2 0 2 2 2 1 1 0 2 1 2

1 1 1 1 0 1 2 0 1 1 2 1 1 2 0 2 0 1 0 0 2 2

T

. (7.45)

This results in a 4 × 26 parity check matrix of

    H = [ 1 2 2 2 2 0 2 1 0 2 2 1 2 2 1 0 1 0 2 0 0 1 1 0 0 0
          2 2 0 0 0 2 1 1 1 1 0 1 2 0 1 1 2 1 1 2 0 2 0 1 0 0
          1 1 1 2 2 0 1 2 1 0 0 1 0 1 1 1 2 2 0 1 2 1 0 0 1 0
          2 2 2 2 0 2 1 0 2 2 1 2 2 1 0 1 0 2 0 0 1 1 0 0 0 1 ].        (7.46)

It is clear from (7.44) and (7.45) that d(C) = 3 and so this code of rate R = 0.85

uses its four parity symbols to correct one error.
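The construction just described can be sketched in code. The Python sketch below (function name illustrative) builds the cyclic generator matrix, performs Gauss-Jordan elimination over GF(3), and applies the standard-form relation H = [−K^T | I_{n−k}] for G = [I_k | K]; pivots always exist here because each row of the cyclic matrix leads with the coefficient 1:

    import numpy as np

    def standard_form_matrices(g, n, p):
        k = n - (len(g) - 1)
        G = np.zeros((k, n), dtype=int)
        for r in range(k):                   # g shifted right one position per row
            G[r, r:r + len(g)] = g
        for c in range(k):                   # Gauss-Jordan elimination over GF(p)
            piv = next(r for r in range(c, k) if G[r, c] % p)
            G[[c, piv]] = G[[piv, c]]
            G[c] = G[c] * pow(int(G[c, c]), p - 2, p) % p     # scale pivot to 1
            for r in range(k):
                if r != c and G[r, c]:
                    G[r] = (G[r] - G[r, c] * G[c]) % p
        K = G[:, k:]
        H = np.hstack([(-K.T) % p, np.eye(n - k, dtype=int)]) # H = [-K^T | I]
        return G, H

    G, H = standard_form_matrices([1, 2, 1, 2, 2], 26, 3)     # cf. (7.44)-(7.46)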

Computer simulations of the transmission of data encoded using the (26,22) BCH

code over GF (3) as described above were also carried out. The ternary restricted

GEC was again used with

z = 0.985, (7.47)

or an average symbol error rate ps of one percent. The values for the average fade

to connection time ratio x and the burst factor y were selected from the sets

    x ∈ {0.005, 0.01, 0.015},
    y ∈ {0, 0.1, . . . , 1}.        (7.48)

The results of these simulations are displayed in Fig. 7.6. As with the Hamming

code above, this BCH code is capable of correcting one symbol error. However,

this code is much longer and its code rate is much higher than the (4,2) Hamming

code. This explains the higher post-decoding SER in almost all cases for the BCH

code compared to the Hamming code when both have an average transmission SER

of 1%. The probability of having an error pattern which can be detected but not


corrected with 26 symbols is higher than that with four symbols, assuming the same

frequency of transmission errors. However, the same relationships between the SER

and both x and y as for the (4,2) Hamming code are observed.

The gradients of the curves in Fig. 7.6 are lower than those of Fig. 7.5, particularly

when the burst factor y is small and the channel more closely resembles one with

independent errors. In such a situation, the value of the channel reliability factor z

is low enough, and the code length n = 26 is large enough, that in many cases, a

detectable error pattern will occur which cannot be corrected. This contributes to

a relatively high post-decoding SER and thus the high gradients at the left of Fig.

7.5 are not observed in Fig. 7.6.

It can be observed that the same intersection of the curves at y = 0 occurs in

Fig. 7.6, as occurred in Fig. 7.5. The reason for this is given in (7.42). That is,

for any linear block code and a fixed value of the channel reliability factor z, the

post-decoding SER obtained on a restricted GEC which has a burst factor y of zero

appears to be independent of the value of the average fade to connection time ratio

x.

7.6 Summary

The focus of this chapter has been the design of a procedure to perform APP de-

coding for linear block codes over a non-binary restricted GEC. This was achieved

through a number of steps. Firstly, the channel reliability factor was discussed for

a general non-binary GEC and a formula was found for this parameter in terms

of the average transmission symbol error rate of the channel. The expression was

found by consideration of the syndrome trellis. A proof showing that only crossover

probabilities of at most 1/p need to be considered in order to deal with all possible

channel capacities was also given.

Then, the next section dealt specifically with how to obtain conditional spectral

polynomials from the conditional spectral coefficients. Expressions for each row of

the spectral domain trellis could be obtained principally by consideration of the

structure of the dual code along with the received vector. The theorem developed

in Chapter 6 for a binary restricted GEC also applies in the non-binary case to give

the conditional spectral polynomials in terms of the burst-error characteristics of

the average fade to connection time ratio x, the burst factor y, and the channel

reliability factor z.

The relationship between the weight polynomial for a systematic non-binary code

and that of its dual is given by the MacWilliams identity and its instantiation given


in this chapter was in terms of one variable. Similarly, the generalised weight poly-

nomials in this chapter give a posteriori probabilities of the original domain based on

probabilities which are derived from the spectral domain. The weight polynomials

are termed “generalised” because they are functions of three variables, specifically

the three burst-error characteristics which describe a non-binary restricted GEC.

An APP decoding procedure was developed using the evaluation of these GWPs.

A decoding example for a ternary linear block code was then given in order

to illustrate the steps involved in the procedure. Firstly, the conditional spectral

polynomials were calculated, and these were used to determine the GWPs. Once the

values of the three burst-error characteristics were substituted, a decoding decision

could be made. Finally, simulation results were obtained for two ternary codes

over restricted GECs with various burst-error characteristics and some observations

were made. In particular, increases in either the average fade to connection time

ratio x or the burst factor y appear to increase the post-decoding SER and hence

degrade the performance of the code. It was also noted that if the burst factor y

for a restricted GEC is zero, then for a fixed value of the channel reliability factor

z, the post-decoding SER appears to be the same for all values of the average fade

to connection time ratio x.


Chapter 8

Conclusion and General Discussion

Wireless communication systems were a step forward in man’s quest to improve

communications over large distances. However, relative to a memoryless system, this

more complex scheme brought its own set of problems with it. Such channels are
prone to error bursts. This necessitated the use of error correcting

codes to protect the information from such errors. However, hand-in-hand with this,

effective and efficient decoding algorithms are required. The research presented in

this thesis has been a step towards achieving that goal.

This chapter is structured as follows. In Section 8.1, the principal ideas and

findings of this thesis are summarised in order to demonstrate what has been ac-

complished, and for which channels the methods are applicable. However, this is

by no means a closed subject matter. There are still many more problems to be

solved and there are ways in which the strategies may be improved. For this reason,

Section 8.2 explores some of the directions which future research in this exciting and

pertinent area may take.

8.1 Summary of Major Findings and Contributions

Given a channel model and a code for transmission of data over that channel, it is also

necessary to have a method of retrieving the encoded information after reception.

More specifically, there is a need to develop decoding algorithms for the mobile

environment. It is desirable that these algorithms fulfil certain criteria. Namely,

that they are able to correct some patterns of transmission errors, that they are


practical for a large number of codes, and that they are based on a simple concept.

The errors which occur with a wireless system do not occur independently. In-

stead they occur in bursts and this adds additional complexity to the model. So to

simplify things, the basic method of the algorithms was first developed for a memo-

ryless channel. After presentation of the necessary background material in Chapter

2, Chapter 3 revealed that the evaluation of the APPs could be carried out in either

of two domains. In both the original and spectral domains, the computations could

be performed using a trellis or alternatively a representation of the trellis in matrix

form could be found. This was a succinct way to represent the information, and

also provided a link between the two domains with the use of a similarity matrix

transformation.

In the case of the original domain, the parity check matrix was used in the

creation of a trellis for the code. The entries on the branches of this trellis were then

weighted by the error probabilities of the channel model and the details of obtaining

the matrix representation of such a trellis were summarised. The steps required

to perform APP decoding with this matrix representation were then given. For a

systematic code, the possibility of each information symbol position taking on each

value in the signalling alphabet must be investigated. It should be noted that each of

the algorithms developed in this thesis can be extended for use with non-systematic

codes. In such cases, the weighted trellis matrices, conditional spectral coefficients,

conditional spectral polynomials, or generalised weight polynomials are calculated

for all positions of the code, producing APPs for each and every transmitted symbol.

The complexity of such methods is greater than those presented here for systematic

codes, as a higher number of APPs must be determined in order to perform the

decoding.

It was next shown how to obtain a spectral domain matrix representation of

the trellis. This involved diagonalising the matrices which were used in the original

domain and the resulting trellis had a very simple structure. After weighting the

individual components according to the error probabilities induced by the channel,

they could be summed and the same decoding decisions as for the original domain

could be made.

The above methodology for memoryless channels formed the basis for the fol-

lowing two chapters. The phenomenon of errors occurring in bursts in the mobile

environment is however modelled better by a finite state channel model. In other

words, it is the nature of wireless communications that there will be times when

errors are fairly likely and other times when they are not. The different states are

designed to correspond to different error likelihoods and the model transfers between


these states according to the probabilities in the state transition matrix. However,

it is in general not possible to ascertain which state the model is in at any time. It

is only possible to observe the received sequence of symbols.

Chapters 4 and 5 demonstrated the possibility of APP decoding for this type

of channel model. The error probabilities were represented by matrices instead of

scalars, with one entry for each pair of current and successive states. This has two

chief drawbacks in terms of the calculations required. Namely, matrix multiplication

is in general non-commutative, and is more computationally complex than scalar

multiplication. Hence the procedures developed for a finite state channel model will

not be as efficient as those for a memoryless channel. The tradeoff is being better

equipped to deal with burst errors.

Procedures were developed for both binary and non-binary codes operating in

either the original or the spectral domain. It was also noted that some of the con-

ditional spectral coefficients formed a probability distribution. However, there is an

underlying question as to which domain to use. An analysis of the computational

complexities and storage requirements of the two approaches concluded that the

spectral domain is suitable for codes of high rate. Additionally, the spectral domain

approach benefits from the lower amount of storage space required, but the original

domain suffers from requiring more storage by comparison in order to perform its

calculations. However, if storage space is not a primary concern, then the original

domain is preferred when the code has a low rate. It should be noted that the meth-

ods presented herein are comparable with multiple-state extensions of algorithms

which have been developed for memoryless channels.

Simulation results have indicated that increases in the order of the field, the

crossover probabilities and the probability of transferring to the ‘bad’ state of a GEC

are all factors which increase the SER. On the other hand, an increase in the prob-

ability of transferring to the ‘good’ state, which has a lower crossover probability,

will usually decrease the SER.

Chapters 6 and 7 of this thesis focussed on representing the behaviour of a special

type of GEC in terms of burst-error characteristics. Although this has already been

achieved for the binary case, the description for the non-binary restricted GEC

is novel, and in particular the treatment of the channel reliability factor for the

non-binary restricted GEC is new. The value of this burst-error characteristic is

increased with higher probabilities of correct symbol reception, but it must also be

handicapped by higher probabilities of incorrect reception.

This thesis discussed the concept of a restricted GEC, which has the property

that when in the ‘bad’ state, given a transmitted symbol, all symbols are equally


likely to be received. A proof was given of a statement presented in [26] concern-

ing the definition of the conditional spectral coefficients in terms of the burst-error

characteristics of a binary restricted GEC, and thus the conditional spectral polyno-

mials were obtained. Using the same strategy of proof, this result was extended to

non-binary restricted GECs. The structure of the conditional spectral polynomials

depended heavily on the positions of the zero and nonzero elements of the codewords

of the dual code. A similarity was noted between the weight polynomials described
in the MacWilliams identities, which are often functions of a single variable, and
the expressions derived for the conditional spectral coefficients in this thesis, as
both concern a relationship between a code and its dual. Since the polynomials in

terms of the burst-error characteristics are functions of three variables rather than

one, they have been named generalised weight polynomials.

This theory permitted the development of two additional APP decoding algo-

rithms, specifically for restricted GECs. One algorithm handled the binary case

and the other was the non-binary extension. Ultimately, these methods gave more

pleasing results than those of Chapters 4 and 5. This is because the calculations of

the a posteriori probabilities are constructed in terms of burst-error characteristics,

which provide a more useful description of a wireless channel since they focus on the

bursts themselves rather than the states of the model. Additionally, the similarities

between these trivariate polynomial expressions for the APPs and the weight poly-

nomials of the MacWilliams identity are aesthetically pleasing. The error correction

capabilities of a sample of linear block codes using these methods for decoding on

a restricted GEC were shown. Decreases in either the average fade to connection

time ratio or the burst factor appeared to result in a lower SER. Additionally, for

a fixed value of the channel reliability factor and a burst factor of zero, the same

post-decoding SER was obtained for each of the average fade to connection time

ratio values simulated. This was verified by setting the burst factor to zero in the

expression for the conditional spectral polynomials.

8.2 Future Research

There are at least three directions along which future research in this area of APP

decoding of linear block codes over discrete channels may proceed. These advances

can be developed independently: an extension in one direction from the material
presented here will not hinder advances in the other directions. The possible
extensions concern the signalling alphabet, the use of

reliability information, and the channel model.


Table 8.1: Elements of GF(3^2) and their ternary vector images.

Exponential   Polynomial   [GF(3)]^2 image
0             0            [0,0]
D^0           1            [0,1]
D^1           D            [1,0]
D^2           2D + 1       [2,1]
D^3           2D + 2       [2,2]
D^4           2            [0,2]
D^5           2D           [2,0]
D^6           D + 2        [1,2]
D^7           D + 1        [1,1]

Signalling alphabet

Firstly, the procedures reported here have been developed for linear block codes over

GF(p), where p is either 2 (Chapters 3, 4 and 6) or an odd prime (Chapters 3, 5 and
7). There are, however, many codes used in practice which are defined over a field
GF(p^a) for a > 1. Examples include BCH codes and Reed-Solomon codes, which

have applications such as compact discs and Digital Versatile Discs and are resilient

against burst errors.

One method of processing elements of GF(p^a) is to use their p-ary image as
a vector. Recall from Chapter 2 that the elements of GF(p^a) can be regarded as
polynomials modulo a monic irreducible polynomial of degree a. These polynomials
have a coefficients chosen from GF(p), and thus there is a bijection ϑ between them
and the set of p-ary vectors of length a. If the polynomials are in terms of an
indeterminate D, then define

$$\vartheta : GF(p^a) \to [GF(p)]^a, \qquad c_{a-1}D^{a-1} + \cdots + c_1 D + c_0 \mapsto \left[ c_{a-1}, \ldots, c_1, c_0 \right]. \tag{8.1}$$

The following example should clarify these concepts. Knowing that the nonzero
elements of a Galois field form a cyclic multiplicative group, a field of order nine can
be constructed by taking the elements $\{0, 1, D^1, \ldots, D^7\}$ defined modulo the monic
polynomial

$$f^*(D) = D^2 + D + 2, \tag{8.2}$$

which is irreducible over the base field GF(3). The elements in exponential form are

listed in Table 8.1 along with their polynomial equivalents and ternary vector images

as defined by (8.1). Using p-ary symbols instead of bits can be advantageous because

it permits the transmission of more information per time unit. On the other hand,


if such a symbol is decoded incorrectly, then correspondingly more information is
lost. For this reason, coding over the base field is sometimes preferred over p^a-ary
symbols, particularly if the channel conditions are harsh. In the example in Table 8.1,
each ternary symbol carries log2(3) ≈ 1.58 bits, as opposed to log2(9) ≈ 3.17 bits if
9-ary symbols were used. In this way, the non-binary procedures reported in this
research are capable of decoding symbols in GF(p^a), hence a code over any field can
be used. Ideally, groups of a p-ary symbols would be combined into a single signal
during the modulation process, and then demodulated back into a symbols after
transmission through the channel. The performance of such schemes, however, has
not been investigated.
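As an illustration of (8.1), the following Python sketch (not part of the thesis) enumerates GF(3^2) by repeated multiplication by D modulo f*(D) = D^2 + D + 2 over GF(3), reproducing the ternary vector images in Table 8.1.

```python
# Illustrative sketch only: enumerate GF(3^2) as powers of D modulo
# f*(D) = D^2 + D + 2, printing each element's ternary image [c_1, c_0]
# as defined in (8.1).
p, a = 3, 2
f_star = [1, 1, 2]                    # coefficients of D^2 + D + 2

def times_D(c):
    """Multiply the coefficient vector c = [c_1, c_0] by D, reduce mod f*(D)."""
    shifted = c[1:] + [0]             # c_1*D^2 + c_0*D before reduction
    overflow = c[0]                   # coefficient of D^a to be reduced
    # reduce using D^a = -(lower-order terms of f*(D))
    return [(shifted[i] - overflow * f_star[i + 1]) % p for i in range(a)]

print("0    ", [0] * a)               # the zero element
elem = [0] * (a - 1) + [1]            # D^0 = 1
for e in range(p**a - 1):
    print(f"D^{e}  ", elem)           # matches the rows of Table 8.1
    elem = times_D(elem)
```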

Reliability information

It is important to remember that the values being calculated with the procedures

described in this thesis are probabilities. As such, they are small in magnitude. This

is especially the case when many are multiplied together, or when the number of

possible received symbols is large, meaning the individual error probabilities are tiny.

This phenomenon is discussed in [74]. When the size of the code and/or the size of

the signalling alphabet is large, implementing these APP decoding procedures as
stated will, on most computers, result in underflow. That is, all symbols will be

decoded as zero and the SER will be intolerably high. This is further motivation to

refrain from working directly with elements of GF(p^a). One solution given in [74],

at least for memoryless channels, is to normalise the APPs by dividing by the sum

of all APPs found for each information symbol position. This will also work for

channels with memory.
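A minimal sketch of this per-position normalisation is given below; the near-underflow values are purely illustrative, not simulation output.

```python
# Illustrative sketch of the normalisation suggested in [74]: rescale the
# unnormalised APPs of one information symbol position so they sum to one,
# which keeps the values in a floating-point-friendly range without changing
# the arg-max decision.
def normalise(apps):
    """Rescale unnormalised APPs, guarding the already-underflowed case."""
    total = sum(apps)
    if total == 0.0:        # already underflowed; nothing can be recovered
        return apps
    return [x / total for x in apps]

unnormalised = [1.2e-300, 3.6e-300, 7.2e-300]   # near-underflow APPs over GF(3)
print(normalise(unnormalised))                   # [0.1, 0.3, 0.6]
```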

The underflow problem in [74] resulted from the consideration of concatenated

codes. Concatenated codes and iterative decoding are two ways of reusing the
reliability information. As discussed in Section 2.3.2, this implies the use of a MAP
decoding algorithm.

In basic iterative decoding, the same word is decoded multiple times. For the first

iteration, the a priori probabilities of all symbols are equal. However for all subse-

quent iterations, the APP derived from the previous iteration becomes the a priori

probability for the next. Thus the algorithm will converge to a solution over time.

This solution may still not be the correct one. When sufficient iterations have been

performed, a hard decision is made using the ‘arg max’ operation. Note that iterative

decoding generally lowers the SER at the expense of increasing the amount of time

or energy required to perform the decoding. It is possible to adapt the procedures

reported in this research to accommodate iterative decoding. It would require the

use of the penultimate rather than the final line of (2.104), which would then lead

to different problem statements in (2.106) for memoryless channels and (2.107) for


channels with memory. In essence, each path through the trellis is weighted by the

relevant APPs obtained in the previous decoding iteration. These weightings are

now a priori information. The summation over sets of trellis paths is performed in

the same way as for the non-iterative case to calculate APPs to be used as a priori

information for the following iteration, and so forth.
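A minimal sketch of this iterative loop follows; `app_decode` is a hypothetical stand-in for any of the APP decoders described in this thesis, taking the received word and per-position a priori distributions over GF(p) and returning per-position APP distributions.

```python
# Sketch of the basic iterative loop described above; `app_decode` is a
# hypothetical callable, not a function defined in this thesis.
def iterative_decode(received, app_decode, p, k, iterations=4):
    # first iteration: all a priori probabilities equal
    priors = [[1.0 / p] * p for _ in range(k)]
    apps = priors
    for _ in range(iterations):
        apps = app_decode(received, priors)
        priors = apps            # APPs become the next a priori probabilities
    # final hard decision via the 'arg max' operation in each position
    return [max(range(p), key=lambda sym: apps[i][sym]) for i in range(k)]
```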

Reliability information can also be reused when decoding parallel concatenated

block turbo codes, also known as product codes. Suppose there exist linear block

codes $C_1, C_2, \ldots, C_l$, each over GF(p), where for $1 \le i \le l$, $C_i$ is an $(n_i, k_i)$ code
in standard form with Hamming distance $d_i$. To encode, the $\prod_{i=1}^{l} k_i$ information
symbols are arranged into an $l$-dimensional hypermatrix. All vectors in the first
dimension are encoded using $C_1$. The resulting vectors are encoded in the second
dimension using $C_2$. This process continues until the encoding of all vectors in the
$l$th dimension produces a codeword of $\prod_{i=1}^{l} n_i$ symbols. The code has a Hamming
distance of $\prod_{i=1}^{l} d_i$. Decoding is carried out in the reverse order of the encoding.

The APP of an information symbol when decoding in one dimension becomes the a

priori probability of that symbol in the following dimension. The viability of such a

scheme has been investigated for two-dimensional single parity check product codes

in [74] and [75], though only for memoryless channels. The APP decoding schemes

in this research have yet to be considered for product codes over channels with

memory.
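As a sketch of the two-dimensional single parity check case studied in [74] and [75], the encoder below is a textbook-style illustration of the construction, not code from those papers: each row and then each column is extended with a parity symbol making its sum zero modulo p.

```python
# Illustrative sketch: two-dimensional single parity check (SPC) product
# code over GF(p). C1 acts on rows, C2 on columns, as described above.
import numpy as np

def spc_product_encode(info, p):
    """info: k1 x k2 array over GF(p) -> (k1+1) x (k2+1) codeword array."""
    rows = np.hstack([info, (-info.sum(axis=1, keepdims=True)) % p])  # C1 on rows
    return np.vstack([rows, (-rows.sum(axis=0, keepdims=True)) % p])  # C2 on columns

word = spc_product_encode(np.array([[1, 2], [0, 1]]), p=3)
print(word)                       # every row and column sums to 0 modulo 3
```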

Channel model

Another theme discussed briefly in [74] is using the structure of GF(p) for p > 2 to

perform the necessary calculations in a different way which may lower the compu-

tational complexity of the decoding procedures. It may be possible to implement a

similar scheme for channels with memory so that the time needed to retrieve the in-

formation symbols is shortened. As reported in Section 5.4.1, increasing the number

of states of the channel model by just one leads to large increases in the time taken
for the procedures to be executed.

This research has only considered channel models with a memory of zero or one

symbol duration. In order to consider more complex models with greater memory,

it would be necessary to work with matrix probabilities which are different from

$D_0$, $D_\epsilon$, $D$ and $\Delta$. It may also be possible to find expressions for the conditional

spectral coefficients in terms of burst-error characteristics as discussed in Chapters

6 and 7 for channel models other than a restricted GEC. Thus, the concepts of this

research could be applied to a wide variety of practical situations.


Appendices


Appendix A

Proof of (3.44)

It is claimed in (3.44) that $W_{p^{n-k}}^{-1} = \frac{1}{p^{n-k}} W_{p^{n-k}}^{H}$. A proof of this result is given below.

Proof. Firstly note that the complex conjugate of a $p$th root of unity is that same root with a negative exponent. That is, if $w^{\gamma}$ is a $p$th root of unity then

$$(w^{\gamma})^{*} = w^{-\gamma}. \tag{A.1}$$

If the $i$th row of the symmetric matrix $W_{p^{n-k}}$ is given by $\left[ w^{i,1}, w^{i,2}, \ldots, w^{i,p^{n-k}} \right]$, for $1 \le i \le p^{n-k}$, where each exponent is an integer, then the entry in row $i$ and column $j$ of the matrix product $W_{p^{n-k}} \cdot W_{p^{n-k}}^{H} = \big[ \psi^{(p^{n-k})}_{i,j} \big]_{p^{n-k} \times p^{n-k}}$ of $W_{p^{n-k}}$ and its Hermitian may be determined as

$$\psi^{(p^{n-k})}_{i,j} = \left\langle \left[ w^{i,1}, \ldots, w^{i,p^{n-k}} \right], \left[ (w^{j,1})^{*}, \ldots, (w^{j,p^{n-k}})^{*} \right] \right\rangle = \sum_{m=1}^{p^{n-k}} w^{i,m}\, w^{-j,m} \quad \text{by (A.1)}. \tag{A.2}$$

Then for the diagonal entries, $i = j$ and

$$\psi^{(p^{n-k})}_{i,i} = \sum_{m=1}^{p^{n-k}} w^{i,m}\, w^{-i,m} = \sum_{m=1}^{p^{n-k}} 1 = p^{n-k}. \tag{A.3}$$

For a non-diagonal element in row $i$ and column $j \neq i$, it can be shown by induction that $\sum_{m=1}^{p^{n-k}} w^{i,m}\, w^{-j,m}$ is zero. Thus, the value of $\psi^{(p^{d})}_{i,j}$ is first examined for $d = 1$. The $i$th row of $W_p$ can be given as $\left[ w^{i \cdot 0}, w^{i \cdot 1}, w^{i \cdot 2}, \ldots, w^{i \cdot (p-1)} \right]$, and similarly the $j$th row as $\left[ w^{j \cdot 0}, w^{j \cdot 1}, w^{j \cdot 2}, \ldots, w^{j \cdot (p-1)} \right]$. Then

$$\psi^{(p)}_{i,j} = \sum_{m=0}^{p-1} w^{(i-j) \cdot m} = 0, \tag{A.4}$$

where (A.4) follows because, for $i \neq j$, the exponents $(i-j) \cdot m$ run over all residues modulo $p$, and the sum of all complex $p$th roots of unity is zero. So it has been shown that any two different rows of $W_p$ have a dot product of zero. Now assume $\psi^{(p^{d})}_{i,j}$ equals zero for a positive integer $d$. By definition,

$$W_{p^{d+1}} = W_p \otimes W_{p^{d}} = \begin{bmatrix} w^{0} W_{p^{d}} & w^{0} W_{p^{d}} & \cdots & w^{0} W_{p^{d}} \\ w^{0} W_{p^{d}} & w^{1} W_{p^{d}} & \cdots & w^{p-1} W_{p^{d}} \\ \vdots & \vdots & & \vdots \\ w^{0} W_{p^{d}} & w^{p-1} W_{p^{d}} & \cdots & w^{(p-1)(p-1)} W_{p^{d}} \end{bmatrix}. \tag{A.5}$$

For notational efficiency, let $\mathbf{a}$ and $\mathbf{b}$ be the $r$th and $s$th rows of $W_{p^{d}}$, respectively. The dot product of the $(h \cdot p^{d} + r)$th and the $(i \cdot p^{d} + s)$th rows of $W_{p^{d+1}}$, where $0 \le h, i \le p-1$ and $1 \le r, s \le p^{d}$, may be calculated as

$$\left\langle \left[ w^{0}\mathbf{a}, w^{h}\mathbf{a}, w^{2h}\mathbf{a}, \ldots, w^{(p-1)h}\mathbf{a} \right], \left[ w^{0}\mathbf{b}, w^{i}\mathbf{b}, w^{2i}\mathbf{b}, \ldots, w^{(p-1)i}\mathbf{b} \right] \right\rangle = \langle \mathbf{a}, \mathbf{b} \rangle \left[ w^{0} + w^{h-i} + w^{2(h-i)} + \ldots + w^{(p-1)(h-i)} \right] = 0, \tag{A.6}$$

since if $r \neq s$ then $\langle \mathbf{a}, \mathbf{b} \rangle = 0$ by the Inductive Hypothesis, while if $r = s$ then the two rows are distinct only when $h \neq i$, in which case the bracketed sum of $p$th roots of unity is zero as in (A.4). Thus $\psi^{(p^{d+1})}_{i,j}$ equals zero, and by the Principle of Mathematical Induction, $\psi^{(p^{d})}_{i,j}$ equals zero for all positive integers $d$. In summary, it has been shown that

$$W_{p^{n-k}} \cdot W_{p^{n-k}}^{H} = p^{n-k} I_{p^{n-k}}, \tag{A.7}$$

or alternatively that

$$W_{p^{n-k}}^{-1} = \frac{1}{p^{n-k}} W_{p^{n-k}}^{H}. \tag{A.8}$$
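The identity (A.7) is also easy to confirm numerically. The sketch below is illustrative only, taking $W_p$ as the matrix of powers $[w^{ij}]$ with $w = e^{2\pi\sqrt{-1}/p}$, and checks the result for p = 3 and d = 2.

```python
# Numerical check of (A.7): build W_{p^d} by Kronecker powers of
# W_p = [w^{ij}], then verify W W^H = p^d I.
import numpy as np

p, d = 3, 2
w = np.exp(2j * np.pi / p)
Wp = np.array([[w ** (i * j) for j in range(p)] for i in range(p)])

W = Wp
for _ in range(d - 1):
    W = np.kron(Wp, W)            # W_{p^{d+1}} = W_p (x) W_{p^d}, as in (A.5)

print(np.allclose(W @ W.conj().T, p**d * np.eye(p**d)))   # True
```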


Appendix B

Proof of Lemma 5.3.1

Lemma 5.3.1. The sum of the entries in any row or column, except the first, of the complex Walsh-Hadamard transform matrix $W_{p^{n-k}}$ is zero.

Proof. This proof is given by induction on $d = n-k$. Let $\Sigma(r, d)$ stand for the sum of the entries in the $r$th row of $W_{p^{d}}$. It is required to show that $\Sigma(r, d) = 0$, $\forall r \in \{2, 3, \ldots, p^{d}\}$. Firstly,

$$\Sigma(r, 1) = w^{(r-1) \cdot 0} + w^{(r-1) \cdot 1} + \ldots + w^{(r-1) \cdot (p-1)},$$

so that

$$(1 - w^{r-1})\, \Sigma(r, 1) = \left[ w^{0(r-1)} + w^{1(r-1)} + \ldots + w^{(p-1)(r-1)} \right] - \left[ w^{r-1} + w^{2(r-1)} + \ldots + w^{p(r-1)} \right] = w^{0} - w^{p(r-1)} = 0. \tag{B.1}$$

Since $2 \le r \le p$, it follows that $1 - w^{r-1}$ is nonzero. Therefore (B.1) implies that $\Sigma(r, 1)$ equals zero and the lemma is true for $n-k = 1$. Now assume that the lemma is true for a positive integer $n-k = d$. This means the sum of the entries in any row except the first row of $W_{p^{d}}$ is zero. Examine the $(i \cdot p^{d} + s)$th row of $W_{p^{d+1}}$ for some $i \in \{0, 1, \ldots, p-1\}$ and some $s \in \{1, 2, \ldots, p^{d}\}$, but excluding the case where $i = 0$ and $s = 1$, as this was covered in Lemma 5.2.2. Let $\mathbf{b}$ be the $s$th row of $W_{p^{d}}$. Then the structure of the $(i \cdot p^{d} + s)$th row of $W_{p^{d+1}}$ can be reported as $\left[ w^{0}\mathbf{b}, w^{i}\mathbf{b}, w^{2i}\mathbf{b}, \ldots, w^{(p-1)i}\mathbf{b} \right]$. There are two cases to consider. Firstly, assume $s = 1$ and $i \neq 0$. Then by Lemma 5.2.2, the row vector $\mathbf{b}$ can be expressed as

$$\mathbf{b} = \left[ 1, 1, \ldots, 1 \right], \tag{B.2}$$

and using the base case in (B.1), the sum of the entries in the $(i \cdot p^{d} + 1)$th row may be calculated as

$$\Sigma(i \cdot p^{d} + 1, d+1) = p^{d} w^{0} + p^{d} w^{i} + p^{d} w^{2i} + \ldots + p^{d} w^{(p-1)i} = p^{d} \left[ w^{0} + w^{i} + w^{2i} + \ldots + w^{(p-1)i} \right] = p^{d} \cdot 0 = 0. \tag{B.3}$$

The other case to consider is where $s \neq 1$. By the Inductive Hypothesis, the sum of the entries in the $s$th row of $W_{p^{d}}$ is zero. Then

$$\Sigma(i \cdot p^{d} + s, d+1) = w^{0} \Sigma(s, d) + w^{i} \Sigma(s, d) + w^{2i} \Sigma(s, d) + \ldots + w^{(p-1)i} \Sigma(s, d) = \left[ w^{0} + w^{i} + w^{2i} + \ldots + w^{(p-1)i} \right] \Sigma(s, d) = 0. \tag{B.4}$$

Therefore, the lemma is true for $n-k = d+1$ and, by the Principle of Mathematical Induction, the lemma is true for all positive integers $d$. Since $W_{p^{n-k}}$ is symmetric, the result also holds for all of its columns. Thus, the sum of the entries in all rows and columns of $W_{p^{n-k}}$ except the first is zero.
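As with (A.7), the lemma can be confirmed numerically; the short sketch below (illustrative only) reuses the Kronecker construction from the Appendix A check.

```python
# Numerical check of Lemma 5.3.1 for p = 3, d = 2: all row and column sums
# of W vanish except the first, which equals p**d.
import numpy as np

p, d = 3, 2
w = np.exp(2j * np.pi / p)
Wp = np.array([[w ** (i * j) for j in range(p)] for i in range(p)])
W = Wp
for _ in range(d - 1):
    W = np.kron(Wp, W)

print(np.allclose(W.sum(axis=1)[1:], 0))   # True: all rows but the first
print(np.allclose(W.sum(axis=0)[1:], 0))   # True: all columns but the first
```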


Bibliography

[1] Q. Bi, G. I. Zysman, and H. Menkes, “Wireless mobile communications at the

start of the 21st century,” IEEE Commun. Mag., vol. 39, no. 1, pp. 110–116,

Jan. 2001.

[2] J. Chen and R. M. Tanner, “A hybrid coding scheme for the Gilbert-Elliott

channel,” IEEE Trans. Commun., vol. 54, no. 10, pp. 1787–1796, Oct. 2006.

[3] E. O. Elliott, “Estimates of error rates for codes on burst-noise channels,” Bell

System Technical Journal, vol. 42, pp. 1977–1997, Sept. 1963.

[4] J.-Y. Chouinard, M. Lecours, and G. Y. Delisle, “Simulation of error sequences

in a mobile communications channel with Fritchman’s error generation model,”

in Proc. IEEE Pacific Rim Conf. on Commun., Computers and Signal Process.,

Victoria, Canada, June 1989, pp. 134–137.

[5] B. Hayes, “Third base,” American Scientist, vol. 89, no. 6, pp. 489–494, Nov.-

Dec. 2001.

[6] C. E. Shannon, “A mathematical theory of communication,” Bell System Tech-

nical Journal, vol. 27, no. 3 and 4, pp. 379–423 and 623–656, July and Oct.

1948.

[7] C. Berrou, A. Glavieux, and P. Thitimajshima, “Near Shannon limit error-

correcting coding and decoding: Turbo-codes (1),” in Proc. IEEE Int. Conf.

on Commun., vol. 2, Geneva, Switzerland, May 1993, pp. 1064–1070.

[8] L. R. Bahl, J. Cocke, F. Jelinek, and J. Raviv, “Optimal decoding of linear

codes for minimizing symbol error rate,” IEEE Trans. Inf. Theory, vol. 20,

no. 2, pp. 284–287, Mar. 1974.

[9] T. Johansson and K. Zigangirov, “A simple one-sweep algorithm for optimal

APP symbol decoding of linear block codes,” IEEE Trans. Inf. Theory, vol. 44,

no. 7, pp. 3124–3129, Nov. 1998.


[10] H.-J. Zepernick, “A forward-only recursion algorithm for MAP decoding of

linear block codes,” Int. J. Adaptive Control and Signal Process., vol. 16, no. 8,

pp. 577–588, Sept. 2002.

[11] W. Turin, “MAP decoding in channels with memory,” IEEE Trans. Commun.,

vol. 48, no. 5, pp. 757–763, May 2000.

[12] L. Ping and K. L. Yeung, “Symbol-by-symbol APP decoding of the Golay code

and iterative decoding of concatenated Golay codes,” IEEE Trans. Inf. Theory,

vol. 45, no. 7, pp. 2558–2562, Nov. 1999.

[13] Y. Kaji, R. Shibuya, T. Fujiwara, T. Kasami, and S. Lin, “MAP and LogMAP

decoding algorithms for linear block codes using a code structure,” IEICE

Trans. on Fund., vol. E83-A, no. 10, pp. 1884–1890, Oct. 2000.

[14] C. R. Hartmann and L. D. Rudolph, “An optimum symbol-by-symbol decoding

rule for linear codes,” IEEE Trans. Inf. Theory, vol. 22, no. 5, pp. 514–517, Sept.

1976.

[15] E. Dubrova, Y. Jamal, and J. Mathew, “Non-silicon non-binary computing:

Why not?” in Proc. 1st Workshop on Non-Silicon Computation, Boston, USA,

2002, pp. 23–29.

[16] J. Berkmann, “On turbo decoding of nonbinary codes,” IEEE Commun. Lett.,

vol. 2, no. 4, pp. 94–96, Apr. 1998.

[17] A. C. Reid, D. P. Taylor, and T. A. Gulliver, “Non-binary turbo codes,” in

Proc. Int. Symp. on Inf. Theory, Lausanne, Switzerland, July 2002, p. 57.

[18] J. Berkmann, “A symbol-by-symbol MAP decoding rule for linear codes over

rings using the dual code,” in Proc. Int. Symp. on Inf. Theory, Cambridge,

USA, Aug. 1998, p. 90.

[19] A. Goupil, M. Colas, G. Gelle, and D. Declercq, “FFT-based BP decoding of

general LDPC codes over Abelian groups,” IEEE Trans. Commun., vol. 55,

no. 4, pp. 644–649, Apr. 2007.

[20] D. Declercq and M. Fossorier, “Decoding algorithms for nonbinary LDPC codes

over GF (q),” IEEE Trans. Commun., vol. 55, no. 4, pp. 633–643, Apr. 2007.

[21] J. Garcia-Frias and J. D. Villasenor, “Combining hidden Markov source models

and parallel concatenated codes,” IEEE Commun. Lett., vol. 1, no. 4, pp. 111–

113, July 1997.


[22] ——, “Turbo codes for continuous Markov channels with unknown parameters,”

in Proc. IEEE Global Telecommun. Conf., Rio de Janeiro, Brazil, Dec. 1999.

[23] ——, “Turbo decoding of Gilbert-Elliot channels,” IEEE Trans. Commun.,

vol. 50, no. 3, pp. 357–363, Mar. 2002.

[24] A. W. Eckford, F. R. Kschischang, and S. Pasupathy, “Analysis of low-density

parity-check codes for the Gilbert-Elliott channel,” IEEE Trans. Inf. Theory,

vol. 51, no. 11, pp. 3872–3889, Nov. 2005.

[25] F. J. MacWilliams, “A theorem on the distribution of weights in a systematic

code,” Bell System Technical Journal, vol. 42, pp. 79–94, Jan. 1963.

[26] L. Kittel and H.-J. Zepernick, “Generalized weight polynomials for linear binary

block codes used on a burst error channel,” in Proc. Int. Symp. on Inf. Theory

and its Appl., Honolulu, USA, Nov. 1990, pp. 175–178.

[27] H.-J. Zepernick, “A posteriori probability decoding of linear block codes over

prime fields,” in Proc. Int. Conf. Optimization Techniques and Appl., Hong

Kong, China, Dec. 2001, pp. 1497–1504.

[28] ——, “On computing the performance of linear block codes in nonindependent

channel errors,” in Proc. IEEE Int. Conf. on Commun., vol. 2, Dallas, USA,

June 1996, pp. 989–994.

[29] M. J. Golay, “Notes on digital coding,” Proc. IRE, vol. 37, no. 6, p. 657, June

1949.

[30] L. N. Kanal and A. R. K. Sastry, “Models for channels with memory and their

applications to error control,” Proc. IEEE, vol. 66, no. 7, pp. 724–744, July

1978.

[31] H.-J. Zepernick, “Modal analysis of linear nonbinary block codes used on

stochastic finite state channels,” in Proc. IEEE Int. Symp. on Inf. Theory,

Whistler, Canada, Sept. 1995, p. 287.

[32] E. N. Gilbert, “Capacity of a burst-noise channel,” Bell System Technical Jour-

nal, vol. 39, pp. 1253–1265, Sept. 1960.

[33] B. Wong and C. Leung, “On computing undetected error probabilities on the

Gilbert channel,” IEEE Trans. Commun., vol. 43, no. 11, pp. 2657–2661, Nov.

1995.


[34] B. D. Fritchman, “A binary channel characterization using partitioned Markov

Chains,” IEEE Trans. Inf. Theory, vol. 13, no. 2, pp. 221–227, Apr. 1967.

[35] J.-Y. Chouinard, M. Lecours, and G. Y. Delisle, “Estimation of Gilbert’s and

Fritchman’s models parameters using the Gradient Method for digital mobile

radio channels,” IEEE Trans. Veh. Technol., vol. 37, no. 3, pp. 158–166, Aug.

1988.

[36] A. Semmar, M. Lecours, J.-Y. Chouinard, and J. Ahern, “Characterization

of error sequences in UHF digital mobile radio channels,” IEEE Trans. Veh.

Technol., vol. 40, no. 4, pp. 769–776, Nov. 1991.

[37] W. Griffiths, “APP decoding of linear block codes on Fritchman channels,”

in Proc. 5th Australian Telecommun. Cooperative Research Centre Workshop,

Melbourne, Australia, Nov. 2005, pp. 50–53.

[38] L. E. Baum, T. Petrie, G. Soules, and N. Weiss, “A maximization technique

occurring in the statistical analysis of probabilistic functions of Markov chains,”

Annals of Mathematical Statistics, vol. 41, no. 1, pp. 164–171, Feb. 1970.

[39] W. Turin, Performance Analysis and Modeling of Digital Transmission Systems,

3rd ed. New York, USA: Kluwer Academic/Plenum Publishers, 2004.

[40] R. H. McCullough, “The binary regenerative channel,” Bell System Technical

Journal, vol. 47, pp. 1713–1735, Oct. 1968.

[41] J. Swoboda, “Ein statistisches Modell für die Fehler bei binärer
Datenübertragung auf Fernsprechkanälen (in German),” AEÜ, vol. 23, pp. 313–
332, 1969.

[42] C. White, P. Farrell, J. Hagan, M. Reimean, H. Rudin, A. Goldstein, and

H. Ohnsorge, “Meeting reports,” IEEE Commun. Mag., vol. 17, no. 4, pp.

28–34, June 1979.

[43] I. F. Blake, “Codes over integer residue rings,” Information and Control, vol. 29,

no. 4, pp. 295–300, Dec. 1975.

[44] G. Caire and E. Biglieri, “Linear block codes over cyclic groups,” IEEE Trans.

Inf. Theory, vol. 41, no. 5, pp. 1246–1256, Sept. 1995.

[45] J. B. Fraleigh, A First Course in Abstract Algebra, 7th ed. Addison Wesley,

2002.


[46] R. Lidl and H. Niederreiter, Introduction to finite fields and their applications.

New York, USA: Cambridge University Press, 1986.

[47] C. Langton, “Coding and decoding with convolutional codes,” http://www.

complextoreal.com/convo.htm, 1999.

[48] R. E. Blahut, Theory and Practice of Error Control Codes. Reading, USA:

Addison-Wesley Publishing Company, 1983.

[49] M. Bossert, Channel Coding for Telecommunications. Chichester, England:

John Wiley & Sons, Ltd, 1999.

[50] A. J. Viterbi, “Convolutional codes and their performance in communication

systems,” IEEE Trans. Commun. Technol., vol. COM-19, no. 5, pp. 751–772,

Oct. 1971.

[51] G. D. Forney, “The Viterbi algorithm,” Proc. IEEE, vol. 61, no. 3, pp. 268–278,

Mar. 1973.

[52] F. Jelinek, “Fast sequential decoding algorithm using a stack,” IBM J. Res.

Develop., vol. 13, no. 6, pp. 675–685, Nov. 1969.

[53] J. Erfanian and S. Pasupathy, “Low-complexity parallel-structure symbol-by-

symbol detection for ISI channels,” in Proc. IEEE Pacific Rim Conf. on Com-

mun., Computers and Signal Process., Victoria, Canada, June 1989, pp. 350–

353.

[54] P. Robertson, E. Villebrun, and P. Hoeher, “A comparison of optimal and sub-

optimal MAP decoding algorithms operating in the log domain,” in Proc. IEEE

Int. Conf. on Commun., vol. 2, Seattle, USA, June 1995, pp. 1009–1013.

[55] C. E. Shannon, “The zero error capacity of a noisy channel,” IRE Trans. Inf.

Theory, vol. IT-2, no. 3, pp. S8–S19, Sept. 1956.

[56] V. Sidorenko, G. Markarian, and B. Honary, “Minimal trellis design for linear

codes based on the Shannon product,” IEEE Trans. Inf. Theory, vol. 42, no. 6,

Part 1, pp. 2048–2053, Nov. 1996.

[57] J. K. Wolf, “Efficient Maximum Likelihood decoding of linear block codes using

a trellis,” IEEE Trans. Inf. Theory, vol. 24, no. 1, pp. 76–80, Jan. 1978.


[58] D. Geller, I. Kra, S. Popescu, and S. Simanca, “On circu-

lant matrices,” State University of New York at Stony Brook,

http://www.math.sunysb.edu/~sorin/eprints/circulant.pdf, 2002.

[59] D. Coppersmith and S. Winograd, “Matrix multiplication via arithmetic pro-

gressions,” in Proc. 19th Annual ACM Conf. on Theory of Computing, New

York, USA, May 1987, pp. 1–6.

[60] Z. Chen, P. Fan, and F. Jin, “On a new binary [22, 13, 5] code,” IEEE Trans.

Inf. Theory, vol. 36, no. 1, pp. 228–229, Jan. 1990.

[61] L. N. Kanal and A. R. K. Sastry, “Models for channels with memory and their

applications to error control,” Proc. IEEE, vol. 66, no. 7, pp. 724–744, July

1978.

[62] M. Zorzi, R. R. Rao, and L. B. Milstein, “On the accuracy of a first-order

Markov model for data transmission on fading channels,” in Proc. Fourth IEEE

Int. Conf. on Universal Personal Commun. Record, vol. 1, Tokyo, Japan, Nov.

1995, pp. 211–215.

[63] J. Garcia-Frias and J. D. Villasenor, “Turbo decoders for Markov channels,”

IEEE Commun. Lett., vol. 2, no. 9, pp. 257–259, Sept. 1998.

[64] H.-J. Zepernick and B. Rohani, “On symbol-by-symbol MAP decoding of linear

UEP codes,” in Proc. IEEE Global Telecommun. Conf., vol. 3, San Francisco,

USA, Nov. 2000, pp. 1621–1626.

[65] A. Trofimov and T. Johansson, “A memory-efficient optimal APP symbol-

decoding algorithm for linear block codes,” IEEE Trans. Commun., vol. 52,

no. 9, pp. 1429–1434, Sept. 2004.

[66] J. H. van Lint, “A survey of perfect codes,” Rocky Mountain Journal of Math-

ematics, vol. 5, no. 2, pp. 199–224, 1975.

[67] J. Berkmann, “Symbol-by-symbol MAP decoding of nonbinary codes,” in Proc.

ITG Fachtagung: Codierung für Quelle, Kanal und Übertragung, Aachen, Ger-

many, Mar. 1998, pp. 95–100.

[68] Z. Wykes, ISBN-13 For Dummies®, Special Edition. Indianapolis, USA: Wiley

Publishing Inc., 2005.


[69] J. R. Yee and E. J. Weldon, Jr., “Evaluation of the performance of error-

correcting codes on a Gilbert channel,” in Proc. IEEE Int. Conf. on Commun.,

vol. 2, New Orleans, USA, May 1994, pp. 655–659.

[70] L. Wilhelmsson and L. B. Milstein, “On the effect of imperfect interleaving

for the Gilbert-Elliott channel,” IEEE Trans. Commun., vol. 47, no. 5, pp.

681–688, May 1999.

[71] B. Dronma, “Codes over different alphabets and signal sets,” Master’s thesis,

Department of Mathematics, University of Bergen, Bergen, Norway, May 2004.

[72] E. R. Berlekamp, Key Papers in The Development of Coding Theory. New

York: Institute of Electrical and Electronics Engineers, 1974.

[73] D. Gorenstein and N. Zierler, “A class of error-correcting codes in pm symbols,”

Journal of the Society for Industrial and Applied Mathematics, vol. 9, no. 2, pp.

207–214, June 1961.

[74] W. Griffiths, H.-J. Zepernick, and M. Caldera, “On APP decoding of non-

binary block turbo codes over discrete channels,” in Proc. Int. Symp. on Inf.

Theory and its Appl., Parma, Italy, Oct. 2004, pp. 362–366.

[75] M. Caldera and H.-J. Zepernick, “APP decoding of nonbinary SPC product

codes over discrete memoryless channels,” in Proc. 10th Int. Conf. on Telecom-

mun., vol. 2, Papeete, French Polynesia, Feb. 2003, pp. 1167–1170.

203