Mutual Information as a Tool for the Design, Analysis, and Testing of Modern Communication Systems

June 8, 2007

Matthew Valenti
Associate Professor
West Virginia University
Morgantown, WV 26506-6109
[email protected]

[Figure: BER vs. Eb/No in dB for the turbo code, with curves for 1, 2, 3, 6, 10, and 18 decoder iterations.]

Motivation: Turbo Codes

Berrou et al. 1993:
– Rate ½ code.
– 65,536-bit message.
– Two K=5 RSC encoders.
– Random interleaver.
– Iterative decoder.
– BER = 10^-5 at 0.7 dB.

Comparison with Shannon capacity:
– Unconstrained: 0 dB.
– With BPSK: 0.2 dB.


Key Observations and Their Implications

Key observations:
– Turbo-like codes closely approach the channel capacity.
– Such codes are complex and can take a long time to simulate.

Implications:
– If we know that we can find a code that approaches capacity, why waste time simulating the actual code?
– Instead, let's devote our design effort towards determining capacity and optimizing the system with respect to capacity.
– Once we are done with the capacity analysis, we can design (select?) and simulate the code.


Challenges

How to efficiently find capacity under the constraints of:
– Modulation.
– Channel.
– Receiver formulation.

How to optimize the system with respect to capacity:
– Selection of free parameters, e.g. code rate, modulation index.
– Design of the code itself.

Dealing with nonergodic channels:
– Slow and block fading.
– Hybrid-ARQ systems.
– Relaying networks and cooperative diversity.
– Finite-length codewords.


Overview of Talk

The capacity of AWGN channels:
– Modulation-constrained capacity.
– Monte Carlo methods for determining constrained capacity.
– CPFSK: a case study on capacity-based optimization.

Design of binary codes:
– Bit-interleaved coded modulation (BICM) and off-the-shelf codes.
– Custom code design using the EXIT chart.

Nonergodic channels:
– Block fading and information outage probability.
– Hybrid-ARQ.
– Relaying and cooperative diversity.
– Finite-length codeword effects.


Noisy Channel Coding Theorem (Shannon 1948)

Consider a memoryless channel with input X and output Y.
– The channel is completely characterized by p(y|x).

The capacity C of the channel is given below, where I(X;Y) is the (average) mutual information between X and Y.

The channel capacity is an upper bound on the information rate r.
– There exists a code of rate r < C that achieves reliable communications.
– "Reliable" means an arbitrarily small error probability.

C = \max_{p(x)} I(X;Y) = \max_{p(x)} \left\{ \iint p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)} \, dx \, dy \right\}

Source p(x) → X → Channel p(y|x) → Y → Receiver
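The definition above can be checked numerically. Below is a minimal Python sketch (mine, not from the talk) that computes I(X;Y) from a discrete joint pmf; the binary symmetric channel with crossover probability 0.1 is a made-up example, for which the equiprobable input attains capacity C = 1 - H2(0.1).

```python
import numpy as np

def mutual_information(pxy):
    """I(X;Y) in bits, computed from a discrete joint pmf matrix p(x,y)."""
    px = pxy.sum(axis=1, keepdims=True)   # marginal p(x)
    py = pxy.sum(axis=0, keepdims=True)   # marginal p(y)
    mask = pxy > 0                        # convention: 0 * log 0 = 0
    return float((pxy[mask] * np.log2(pxy[mask] / (px @ py)[mask])).sum())

# Hypothetical example: binary symmetric channel, crossover 0.1,
# equiprobable input (the maximizing p(x) for this channel).
eps = 0.1
pxy = 0.5 * np.array([[1 - eps, eps],
                      [eps, 1 - eps]])
I = mutual_information(pxy)   # equals 1 - H2(0.1)
```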


Capacity of the AWGN Channel with Unconstrained Input

Consider the one-dimensional AWGN channel.

The capacity is given below.

The X that attains capacity is Gaussian distributed.
– Strictly speaking, a Gaussian X is not practical.

C = \max_{p(x)} I(X;Y) = \frac{1}{2} \log_2\!\left(1 + \frac{2E_s}{N_0}\right) \text{ bits per channel use}

The input X is drawn from any distribution with average energy E[X^2] = E_s. The noise N is zero-mean white Gaussian with energy E[N^2] = N_0/2, and the output is Y = X + N.
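Setting C(Es/No) = r with Es = r·Eb and solving for Eb/No recovers the unconstrained Shannon limits quoted in the turbo-code comparison. A small sketch (my own, not from the talk):

```python
import math

def shannon_limit_db(r):
    """Minimum Eb/No in dB for reliable rate-r signaling over the
    one-dimensional AWGN channel: solve r = (1/2)*log2(1 + 2*r*Eb/No),
    which gives Eb/No = (2**(2r) - 1) / (2r)."""
    return 10 * math.log10((2**(2 * r) - 1) / (2 * r))

# Rate-1/2 unconstrained limit is 0 dB, matching the turbo-code slide;
# as r -> 0 the limit approaches 10*log10(ln 2) = -1.59 dB.
limit = shannon_limit_db(0.5)
```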


Capacity of the AWGN Channel with a Modulation-Constrained Input

Suppose X is drawn with equal probability from the finite set S = {X_1, X_2, …, X_M},
– where f(Y|X_k) = κ p(Y|X_k) for any κ common to all X_k.

Since p(x) is now fixed,

C = \max_{p(x)} I(X;Y) = I(X;Y)

– i.e. calculating capacity boils down to calculating mutual information.

Modulator: pick X_k at random from S = {X_1, X_2, …, X_M} → X_k → (add noise N_k) → Y → ML receiver: compute f(Y|X_k) for every X_k ∈ S.


Entropy and Conditional Entropy

Mutual information can be expressed as:

I(X;Y) = H(X) - H(X|Y)

where the entropy of X is

H(X) = E[h(X)] = \int p(x)\, h(x) \, dx, \quad h(x) = \log \frac{1}{p(x)} = -\log p(x)

(h(x) is the self-information), and the conditional entropy of X given Y is

H(X|Y) = E[h(X|Y)] = \iint p(x,y)\, h(x|y) \, dx \, dy, \quad h(x|y) = -\log p(x|y)


Calculating Modulation-Constrained Capacity

To calculate

I(X;Y) = H(X) - H(X|Y)

we first need to compute H(X). Since p(x) = 1/M,

H(X) = E[h(X)] = E\!\left[\log \frac{1}{p(X)}\right] = E[\log M] = \log M

Next, we need to compute H(X|Y) = E[h(X|Y)].
– This is the "hard" part.
– In some cases, it can be done through numerical integration.
– Instead, let's use Monte Carlo simulation to compute it.


Step 1: Obtain p(x|y) from f(y|x)

Modulator: pick X_k at random from S → X_k → (add noise N_k from the noise generator) → Y → Receiver: compute f(Y|X_k) for every X_k ∈ S.

Since

\sum_{x' \in S} p(x'|y) = 1

we can get p(x|y) from

p(x|y) = \frac{p(x|y)}{\sum_{x' \in S} p(x'|y)} = \frac{p(y|x)\,p(x)/p(y)}{\sum_{x' \in S} p(y|x')\,p(x')/p(y)} = \frac{f(y|x)}{\sum_{x' \in S} f(y|x')}
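As a quick illustration of why the common factor κ is irrelevant, here is a tiny Python sketch (mine, with made-up likelihood values for a hypothetical 4-ary alphabet) normalizing f(y|x) into p(x|y):

```python
import numpy as np

def posterior(f):
    """p(x|y) over the symbol set from likelihoods f(y|x); because we
    divide by the sum, any scale factor common to all symbols cancels."""
    f = np.asarray(f, dtype=float)
    return f / f.sum()

f = [0.80, 0.15, 0.04, 0.01]   # hypothetical f(y|x) values, one per symbol
p = posterior(f)
# Scaling every likelihood by the same kappa leaves p(x|y) unchanged:
assert np.allclose(p, posterior(3.7 * np.asarray(f)))
```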


Step 2: Calculate h(x|y)

Modulator: pick X_k at random from S → X_k → (add noise N_k from the noise generator) → Y → Receiver: compute f(Y|X_k) for every X_k ∈ S.

Given a value of x and y (from the simulation), compute

p(x|y) = \frac{f(y|x)}{\sum_{x' \in S} f(y|x')}

Then compute

h(x|y) = -\log p(x|y) = -\log f(y|x) + \log \sum_{x' \in S} f(y|x')


Step 3: Calculating H(X|Y)

Modulator: pick X_k at random from S → X_k → (add noise N_k from the noise generator) → Y → Receiver: compute f(Y|X_k) for every X_k ∈ S.

Since

H(X|Y) = E[h(X|Y)] = \iint p(x,y)\, h(x|y) \, dx \, dy

and the simulation is ergodic, H(X|Y) can be found by taking the sample mean:

H(X|Y) = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} h\!\left(X^{(n)} | Y^{(n)}\right)

where (X^{(n)}, Y^{(n)}) is the nth realization of the random pair (X,Y),
– i.e. the result of the nth simulation trial.


Example: BPSK

Suppose that S = {+1, -1} and N has variance N_0/(2E_s).

Then:

\log f(y|x) = -\frac{E_s}{N_0}(y - x)^2

Modulator: pick X_k at random from S = {+1, -1} → X_k → (add noise N_k from the noise generator) → Y → Receiver: compute log f(Y|X_k) for every X_k ∈ S.
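Steps 1–3 can be wired together in a few lines. The Python sketch below is my own re-implementation (not the talk's software): it estimates the BPSK-constrained capacity at Eb/No = 0.2 dB, and with enough trials the estimate settles near C = 0.5, matching the convergence figure that follows.

```python
import numpy as np

rng = np.random.default_rng(0)

def bpsk_capacity(EbNo_dB, r=0.5, N=200_000):
    """Monte Carlo estimate of C = log2(M) - H(X|Y) for BPSK in AWGN."""
    M = 2
    S = np.array([+1.0, -1.0])
    EsNo = r * 10**(EbNo_dB / 10)               # Es = r * Eb
    x = rng.choice(S, size=N)                   # random transmitted symbols
    y = x + rng.normal(0.0, np.sqrt(1 / (2 * EsNo)), size=N)

    # Steps 1-2: log f(y|x) = -(Es/No)(y - x)^2; h(x|y) in bits
    logf = -EsNo * (y[:, None] - S[None, :])**2     # N x M likelihood table
    logf_tx = -EsNo * (y - x)**2                    # transmitted-symbol term
    h = (np.log(np.exp(logf).sum(axis=1)) - logf_tx) / np.log(2)

    # Step 3: H(X|Y) by sample mean; H(X) = log2(M) since p(x) = 1/M
    return np.log2(M) - h.mean()

C = bpsk_capacity(0.2)   # close to 0.5 at Eb/No = 0.2 dB
```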

BPSK Capacity as a Function of Number of Simulation Trials

[Figure: capacity vs. number of trials (10^1 to 10^6) at Eb/No = 0.2 dB. As N gets large, the capacity estimate converges to C = 0.5.]

Unconstrained vs. BPSK Constrained Capacity

[Figure: code rate r (spectral efficiency) vs. Eb/No in dB, showing the Shannon capacity bound and the BPSK capacity bound. Annotations mark the region where it is theoretically possible to operate and the region where it is theoretically impossible to operate.]

Power Efficiency of Standard Binary Channel Codes

[Figure: code rate r (spectral efficiency) vs. Eb/No in dB against the Shannon and BPSK capacity bounds (arbitrarily low BER). Code points at P_b = 10^-5: uncoded BPSK, Odenwalder convolutional codes (1976), IS-95 (1991), turbo code (1993), and the LDPC code of Chung, Forney, Richardson, and Urbanke (2001).]

Software to Compute Capacity
www.iterativesolutions.com

Capacity of PSK and QAM in AWGN

[Figure: capacity (bits per symbol) vs. Eb/No in dB for BPSK, QPSK, 8PSK, 16PSK, 16QAM, 64QAM, and 256QAM, against the 2-D unconstrained capacity.]

Capacity of Noncoherent Orthogonal FSK in AWGN

W. E. Stark, "Capacity and cutoff rate of noncoherent FSK with nonselective Rician fading," IEEE Trans. Commun., Nov. 1985.
M. C. Valenti and S. Cheng, "Iterative demodulation and decoding of turbo coded M-ary noncoherent orthogonal modulation," IEEE JSAC, 2005.

[Figure: minimum Eb/No (in dB) vs. rate R (symbols per channel use) for M = 2, 4, 16, 64, showing the noncoherent combining penalty. The minimum Eb/No of 6.72 dB occurs at r = 0.48.]

Capacity of Nonorthogonal CPFSK

[Figure: minimum Eb/No (in dB) vs. modulation index h (MSK at h = 0.5; orthogonal at h = 1), with no bandwidth constraint and with a bandwidth constraint of 2 Hz/bps. For h = 1, the minimum Eb/No is 6.72 dB at r = 0.48.]

S. Cheng, R. Iyer Sehshadri, M. C. Valenti, and D. Torrieri, "The capacity of noncoherent continuous-phase frequency shift keying," in Proc. Conf. on Info. Sci. and Sys. (CISS), (Baltimore, MD), Mar. 2007.


Overview of Talk

The capacity of AWGN channels:
– Modulation-constrained capacity.
– Monte Carlo methods for determining constrained capacity.
– CPFSK: a case study on capacity-based optimization.

Design of binary codes:
– Bit-interleaved coded modulation (BICM).
– Code design using the EXIT chart.

Nonergodic channels:
– Block fading: information outage probability.
– Hybrid-ARQ.
– Relaying and cooperative diversity.
– Finite-length codeword effects.


BICM (Caire 1998)

Coded modulation (CM) is required to attain the aforementioned capacity.
– Channel coding and modulation are handled jointly.
– Alphabets of code and modulation are matched.
– e.g. trellis coded modulation (Ungerboeck); coset codes (Forney).

Most off-the-shelf capacity-approaching codes are binary. A pragmatic system would use a binary code followed by a bitwise interleaver and an M-ary modulator.
– Bit-interleaved coded modulation (BICM).

u_l → Binary encoder → c_n → Bitwise interleaver → c'_n → Binary-to-M-ary mapping → x_k


BICM Receiver

The symbol likelihoods must be transformed into bit log-likelihood ratios (LLRs):

\lambda_n = \log \frac{\sum_{X_k \in S_n^{(1)}} f(Y|X_k)}{\sum_{X_k \in S_n^{(0)}} f(Y|X_k)}

– where S_n^{(1)} represents the set of symbols whose nth bit is a 1,
– and S_n^{(0)} is the set of symbols whose nth bit is a 0.

Modulator: pick X_k ∈ S from (c_1 … c_μ) → X_k → (add noise N_k) → Y → Receiver: compute f(Y|X_k) for every X_k ∈ S → Demapper: compute λ_n from the set of f(Y|X_k); c_n arrives from the encoder and λ_n goes to the decoder.

[Figure: 8PSK constellation labeled 000, 001, 011, 010, 110, 111, 101, 100, with the subset S_3^{(1)} highlighted.]
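A minimal sketch of the demapper equation in Python (my own; the 4-ary alphabet, natural labeling, and likelihood values are made up for illustration):

```python
import numpy as np

def bit_llrs(logf, labels):
    """lambda_n = log sum_{S_n^(1)} f(Y|X_k) - log sum_{S_n^(0)} f(Y|X_k),
    from symbol log-likelihoods logf[k] and a bit matrix labels[k, n]."""
    def lse(a):                          # numerically stable log-sum-exp
        m = a.max()
        return m + np.log(np.exp(a - m).sum())
    mu = labels.shape[1]
    return np.array([lse(logf[labels[:, n] == 1]) -
                     lse(logf[labels[:, n] == 0]) for n in range(mu)])

# Hypothetical example: 4 symbols (mu = 2 bits), natural labeling.
labels = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
logf = np.log(np.array([0.7, 0.1, 0.1, 0.1]))   # made-up f(Y|X_k)
lam = bit_llrs(logf, labels)   # both LLRs favor bit = 0 here
```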


BICM Capacity

BICM can be viewed as μ = log2 M binary parallel channels, each with capacity

C_n = I(\lambda_n; c_n)

Capacity over parallel channels adds:

C = \sum_{n=1}^{\mu} C_n

As with the CM case, Monte Carlo integration may be used.

Modulator: pick X_k ∈ S from (c_1 … c_μ) → X_k → (add noise N_k) → Y → Receiver: compute f(Y|X_k) for every X_k ∈ S → Demapper: compute λ_n from the set of f(Y|X_k).

CM vs. BICM Capacity for 16QAM

[Figure: capacity vs. Es/No in dB for CM, BICM with SP labeling, and BICM with gray labeling.]


BICM-ID (Li & Ritcey 1997)

A SISO decoder can provide side information to the demapper in the form of a priori symbol likelihoods p(X_k).
– BICM with iterative detection (BICM-ID).

The demapper's output then becomes

\lambda_n = \log \frac{\sum_{X_k \in S_n^{(1)}} f(Y|X_k)\, p(X_k)}{\sum_{X_k \in S_n^{(0)}} f(Y|X_k)\, p(X_k)}

Modulator: pick X_k ∈ S from (c_1 … c_μ) → X_k → (add noise N_k) → Y → Receiver: compute f(Y|X_k) for every X_k ∈ S → Demapper: compute λ_n from the set of f(Y|X_k) and the p(X_k) fed back from the decoder.


Information Transfer Function (ten Brink 1998)

The decoder output v_n passes through the interleaver, is converted from LLRs to symbol likelihoods p(X_k), and enters the demapper, which computes λ_n from the set of f(Y|X_k) and p(X_k); the demapper output z_n returns to the SISO decoder.

Assume that v_n is Gaussian and that

I(c_n, v_n) = I_v

For a particular channel SNR Es/No, randomly generate a priori LLRs with mutual information I_v. Measure the resulting capacity:

C = \sum_{n=1}^{\mu} I(c_n; \lambda_n) = \mu I_z
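One common way to realize "randomly generate a priori LLRs with mutual information I_v" is to draw consistent Gaussian LLRs v | c ~ N(c·σ²/2, σ²) and sweep σ; the mapping from σ to mutual information is ten Brink's J-function. A Python sketch (my own; the σ values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)

def gaussian_llrs(sigma, N):
    """Consistent Gaussian a priori LLRs: v | c ~ N(c*sigma^2/2, sigma^2)."""
    c = rng.choice([-1.0, +1.0], size=N)          # bipolar code bits
    v = c * sigma**2 / 2 + sigma * rng.normal(size=N)
    return c, v

def measure_I(c, llr):
    """Mutual information (bits) between equiprobable bits and their LLRs."""
    return 1.0 - np.mean(np.log2(1.0 + np.exp(-c * llr)))

# J(sigma) is monotone: larger sigma means more reliable a priori input.
I_lo = measure_I(*gaussian_llrs(0.5, 200_000))
I_mid = measure_I(*gaussian_llrs(2.0, 200_000))
I_hi = measure_I(*gaussian_llrs(4.0, 200_000))
```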

Information Transfer Function

[Figure: I_z vs. I_v for 16-QAM in AWGN at Es/N0 = 6.8 dB, with gray, SP, and MSEW labelings. Inset: the corresponding BICM capacity curve vs. Es/No in dB.]


Information Transfer Function for the Decoder

Similarly, generate a simulated Gaussian decoder input z_n with mutual information I_z, and measure the resulting mutual information at the decoder output:

I_v = I(c_n, v_n)

EXIT Chart

[Figure: I_z vs. I_v for 16-QAM in AWGN at 6.8 dB, with gray, SP, and MSEW demapper curves and the curve for a K=3 convolutional code. Adding the curve for a FEC code makes this an extrinsic information transfer (EXIT) chart.]

Code Design by Matching EXIT Curves

[Figure: matched EXIT curves, from M. Xiao and T. Aulin, "Irregular repeat continuous-phase modulation," IEEE Commun. Letters, Aug. 2005. The coherent MSK EXIT curve is at 0.4 dB; capacity is at 0.2 dB.]


Overview of Talk

The capacity of AWGN channels:
– Modulation-constrained capacity.
– Monte Carlo methods for determining constrained capacity.
– CPFSK: a case study on capacity-based optimization.

Design of binary codes:
– Bit-interleaved coded modulation (BICM).
– Code design using the EXIT chart.

Nonergodic channels:
– Block fading: information outage probability.
– Hybrid-ARQ.
– Relaying and cooperative diversity.
– Finite-length codeword effects.


Ergodicity vs. Block Fading

Up until now, we have assumed that the channel is ergodic.
– The observation window is large enough that the time-average converges to the statistical average.

Often, the system might be nonergodic. Example: block fading.

b=1: γ_1 | b=2: γ_2 | b=3: γ_3 | b=4: γ_4 | b=5: γ_5

– The codeword is broken into B equal-length blocks.
– The SNR changes randomly from block to block.
– The channel is conditionally Gaussian.
– The instantaneous Es/No for block b is γ_b.


Accumulating Mutual Information

The SNR γ_b of block b is random. Therefore, the mutual information I_b for the block is also random.
– With a complex Gaussian input, I_b = log(1+γ_b).
– Otherwise, the modulation-constrained capacity can be used for I_b.

b=1: I_1 = log(1+γ_1) | b=2: I_2 | b=3: I_3 | b=4: I_4 | b=5: I_5

The blocks are conditionally Gaussian, and the mutual information of the entire codeword is the per-block average (code combining):

I_B = \frac{1}{B} \sum_{b=1}^{B} I_b


Information Outage

An information outage occurs after B blocks if

I_B < R

– where R ≤ log2 M is the rate of the coded modulation.

An outage implies that no code can be reliable for the particular channel instantiation. The information outage probability is

P_0 = P[I_B < R]

– This is a practical bound on FER for the actual system.
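A Monte Carlo sketch of P_0 under Rayleigh block fading with an unconstrained Gaussian input (my own; the SNR, rate, and block counts are illustrative, not the figure's exact settings):

```python
import numpy as np

rng = np.random.default_rng(2)

def outage_prob(EsNo_dB, B, R, trials=200_000):
    """P0 = P[I_B < R] with I_B = (1/B) * sum_b log2(1 + gamma_b) and
    gamma_b = (Es/No)*|h_b|^2, |h_b|^2 ~ Exp(1) (Rayleigh block fading)."""
    EsNo = 10**(EsNo_dB / 10)
    gamma = EsNo * rng.exponential(1.0, size=(trials, B))
    I_B = np.log2(1.0 + gamma).mean(axis=1)
    return float(np.mean(I_B < R))

# Splitting the codeword over more independently faded blocks adds diversity:
p1 = outage_prob(10.0, 1, 2.0)
p4 = outage_prob(10.0, 4, 2.0)   # lower outage than p1 at the same SNR
```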

[Figure: information outage probability vs. Es/No in dB for 16-QAM, R=2, in Rayleigh block fading, with B = 1, 2, 3, 4, 10, for both modulation-constrained and unconstrained Gaussian inputs. Notice the loss of diversity (see Guillén i Fàbregas and Caire 2006). As B → ∞, the curve becomes vertical at the ergodic Rayleigh fading capacity bound.]


Hybrid-ARQ (Caire and Tuninetti 2001)

Once I_B > R, the codeword can be decoded with high reliability. Therefore, why continue to transmit any more blocks?

With hybrid-ARQ, the idea is to request retransmissions until I_B > R.
– With hybrid-ARQ, outages can be avoided.
– The issue then becomes one of latency and throughput.

b=1: I_1 = log(1+γ_1) (NACK) | b=2: I_2 (NACK) | b=3: I_3 (ACK) | b=4: I_4, b=5: I_5 {wasted transmissions}


Latency and Throughput of Hybrid-ARQ

With hybrid-ARQ, B is now a random variable.
– The average latency is proportional to E[B].
– The average throughput is inversely proportional to E[B].

Often, there is a practical upper limit on B.
– Rateless coding (e.g. Raptor codes) can allow Bmax → ∞.

An example:
– HSDPA: high-speed downlink packet access.
– 16-QAM and QPSK modulation.
– UMTS turbo code.
– HSET-1/2/3 from TS 25.101.
– Bmax = 4.
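The latency-throughput tradeoff can be sketched with a renewal-reward style Monte Carlo. The Python below is my own simplification (unconstrained Gaussian input, per-block mutual information accumulated until it exceeds the first transmission's rate R), not the HSDPA simulation from the figure:

```python
import numpy as np

rng = np.random.default_rng(3)

def harq_throughput(EsNo_dB, R, Bmax, trials=100_000):
    """Throughput of mutual-information-accumulating hybrid-ARQ in Rayleigh
    block fading: transmit blocks until sum_b I_b >= R (success) or Bmax is
    reached, then approximate throughput as R * P[success] / E[blocks used]."""
    EsNo = 10**(EsNo_dB / 10)
    I = np.log2(1.0 + EsNo * rng.exponential(1.0, size=(trials, Bmax)))
    acc = I.cumsum(axis=1)
    success = acc[:, -1] >= R
    # First block (1-based) at which the accumulated MI crosses R, else Bmax.
    B_used = np.where(success, np.argmax(acc >= R, axis=1) + 1, Bmax)
    return float(R * success.mean() / B_used.mean())

eta_hi = harq_throughput(30.0, 2.0, 4)   # high SNR: one block usually enough
eta_lo = harq_throughput(0.0, 2.0, 4)    # low SNR: more retransmissions
```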

[Figure: normalized throughput vs. Es/No in dB for QPSK and 16-QAM, comparing the unconstrained Gaussian input, the modulation-constrained input, and simulated HSDPA performance. R = 3202/2400 for QPSK, R = 4664/1920 for QAM, Bmax = 4.]

T. Ghanim and M. C. Valenti, "The throughput of hybrid-ARQ in block fading under modulation constraints," in Proc. Conf. on Info. Sci. and Sys. (CISS), Mar. 2006.


Hybrid-ARQ and Relaying

Now consider the following ad hoc network:

Source → Relays → Destination

We can generalize the concept of hybrid-ARQ.
– The retransmission could be from any relay that has accumulated enough mutual information.
– "HARBINGER" protocol: Hybrid ARq-Based INtercluster GEographic Relaying.
– B. Zhao and M. C. Valenti, "Practical relay networks: A generalization of hybrid-ARQ," IEEE JSAC, Jan. 2005.


HARBINGER: Overview

– Amount of fill is proportional to the accumulated entropy.
– Once a node is filled, it is admitted to the decoding set D.
– Any node in D can transmit.
– Nodes keep transmitting until the Destination is in D.

HARBINGER: Initial Transmission

[Diagram: hop I, Source → Destination.]

Now D contains three nodes. Which one should transmit? Pick the one closest to the destination.

HARBINGER: 2nd Transmission

[Diagram: hop II, Source → Relay → Destination.]

HARBINGER: 3rd Transmission

[Diagram: hop III, Source → Relay → Destination.]

HARBINGER: 4th Transmission

[Diagram: hop IV, Source → Relay → Destination.]

HARBINGER: Results

[Figure: cumulative transmit SNR Eb/No (dB) vs. average delay for Direct Link, Multihop, and HARBINGER with 1 to 10 relays.]

Topology: relays on a straight line; S-D separated by 10 m.
Coding parameters: per-block rate R=1; no limit on M; code combining.
Channel parameters: n = 3 path-loss exponent; 2.4 GHz; d0 = 1 m reference distance; unconstrained modulation.

B. Zhao and M. C. Valenti, "Practical relay networks: A generalization of hybrid-ARQ," IEEE JSAC, Jan. 2005.

Relaying has a better energy-latency tradeoff than conventional multihop.

Finite Length Codeword Effects

[Figure (left): capacity vs. codeword length, with the outage region marked. Figure (right): FER vs. Eb/No in dB, comparing the information outage probability for a (1092,360) code with BPSK in AWGN against the FER of the (1092,360) UMTS turbo code with BPSK in AWGN.]


Conclusions

When designing a system, first determine its capacity.
– Only requires a slight modification of the modulation simulation.
– Does not require the code to be simulated.
– Allows for optimization with respect to free parameters.

After optimizing with respect to capacity, design the code.
– BICM with a good off-the-shelf code.
– Optimize the code with respect to the EXIT curve of the modulation.

Information outage analysis can be used to characterize:
– Performance in slow fading channels.
– Delay and throughput of hybrid-ARQ retransmission protocols.
– Performance of multihop routing and relaying protocols.
– Finite codeword lengths.


Thank You

For more information and publications:
– http://www.csee.wvu.edu/~mvalenti

Free software:
– http://www.iterativesolutions.com
– Runs in MATLAB but implemented mostly in C.
– Modulation-constrained capacity.
– Information outage probability.
– Throughput of hybrid-ARQ.
– Standardized codes: UMTS, cdma2000, and DVB-S2.