
Page 1: Computing and Communications 2. Information Theory


Computing and Communications
2. Information Theory
– Channel Capacity

Ying Cui

Department of Electronic Engineering

Shanghai Jiao Tong University, China

2017, Autumn


Page 2: Computing and Communications 2. Information Theory

Outline

• Communication system

• Examples of channel capacity

• Symmetric channels

• Properties of channel capacity

• Definitions

• Channel coding theorem

• Source-channel coding theorem


Page 3: Computing and Communications 2. Information Theory

Reference

• T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley


Page 4: Computing and Communications 2. Information Theory

CHANNEL CAPACITY


Page 5: Computing and Communications 2. Information Theory

Communication System

– map source symbols from a finite alphabet into some sequence of channel symbols, i.e., the input sequence of the channel

– the output sequence of the channel is random but has a distribution that depends on the input sequence of the channel
  • two different input sequences may give rise to the same output sequence, i.e., the inputs are confusable
  • choose a “nonconfusable” subset of input sequences so that with high probability there is only one highly likely input that could have caused the particular output

– attempt to recover the transmitted message from the output sequence of the channel
  • reconstruct the input sequence with a negligible probability of error


Page 6: Computing and Communications 2. Information Theory

Channel Capacity
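For reference, the information channel capacity of a discrete memoryless channel is the maximum mutual information over all input distributions,

  $C = \max_{p(x)} I(X; Y)$,

which is the quantity written “C = max I(X; Y)” on the following slides.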


Page 7: Computing and Communications 2. Information Theory

EXAMPLES OF CHANNEL CAPACITY


Page 8: Computing and Communications 2. Information Theory

Noiseless Binary Channel

• Binary input is reproduced exactly at output

• C = max I(X; Y) = 1 bit, achieved using p(x) = (1/2, 1/2)

– one error-free bit can be transmitted per channel use


Page 9: Computing and Communications 2. Information Theory

Noisy Channel with Nonoverlapping Outputs

• Two possible outputs corresponding to each of the two inputs
  – appears to be noisy, but really is not

• C = max I(X; Y) = 1 bit, achieved using p(x) = (1/2, 1/2)
  – the input can be determined from the output

– every transmitted bit can be recovered without error


Page 10: Computing and Communications 2. Information Theory

Noisy Typewriter

• Channel input is either unchanged with probability 1/2 or is transformed into the next letter with probability 1/2

• If the input has 26 symbols and we use every alternate input symbol, we can transmit one of 13 symbols without error with each transmission

• C = max I(X; Y) = max (H(Y) − H(Y|X)) = max H(Y) − 1 = log 26 − 1 = log 13
  – achieved using p(x) = (1/26, …, 1/26)
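As a quick numerical check of these two capacities, the sketch below evaluates I(X; Y) for a uniform input; the helper function and the explicit transition matrices are illustrative constructions, not taken from the slides.

import numpy as np

def mutual_information(p_x, W):
    """I(X;Y) in bits, for input distribution p_x and channel matrix W[x, y] = p(y|x)."""
    p_xy = p_x[:, None] * W                      # joint distribution p(x, y)
    p_y = p_xy.sum(axis=0)                       # output marginal
    mask = p_xy > 0                              # skip zero-probability pairs
    return float((p_xy[mask] * np.log2(p_xy[mask] / (p_x[:, None] * p_y[None, :])[mask])).sum())

# Noiseless binary channel: the output equals the input.
W_noiseless = np.eye(2)
print(mutual_information(np.array([0.5, 0.5]), W_noiseless))    # -> 1.0 bit

# Noisy typewriter: each of 26 inputs stays put or moves to the next letter, w.p. 1/2 each.
W_typewriter = np.zeros((26, 26))
for x in range(26):
    W_typewriter[x, x] = 0.5
    W_typewriter[x, (x + 1) % 26] = 0.5
print(mutual_information(np.full(26, 1 / 26), W_typewriter))    # -> log2(13) ≈ 3.70 bits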


Page 11: Computing and Communications 2. Information Theory

Binary Symmetric Channel

• Input symbols are complemented with probability p

• I(X; Y) = H(Y) − H(Y|X) = H(Y) − Σx p(x) H(Y|X = x) = H(Y) − H(p) ≤ 1 − H(p), so C = 1 − H(p) bits
  – equality is achieved when the input distribution is uniform
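A minimal numeric check of C = 1 − H(p); the crossover values are chosen only for illustration.

import numpy as np

def binary_entropy(p):
    """H(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

for p in (0.0, 0.1, 0.25, 0.5):
    print(f"p = {p:4.2f}   C = 1 - H(p) = {1 - binary_entropy(p):.3f} bits")
# p = 0.5 gives C = 0: the output is then independent of the input.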


Page 12: Computing and Communications 2. Information Theory

Binary Erasure Channel

• Two inputs and three outputs; a fraction α of the bits are erased

• I(X; Y) = H(X) − H(X|Y) = H(π) − α H(π) = (1 − α) H(π), where π = Pr(X = 1)

• C = max I(X; Y) = 1 − α bits, achieved when π = 1/2

• Recover at most a fraction 1 − α of the bits, as a fraction α of the bits are lost
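A quick grid check (with an arbitrary illustrative erasure probability α) that I(X; Y) = (1 − α)H(π) is largest at the uniform input π = 1/2:

import numpy as np

H = lambda q: -q * np.log2(q) - (1 - q) * np.log2(1 - q)   # binary entropy in bits

alpha = 0.3                                    # erasure probability (illustrative)
grid = np.linspace(0.01, 0.99, 99)             # candidate values of pi = Pr(X = 1)
I = [(1 - alpha) * H(pi) for pi in grid]
best = int(np.argmax(I))
print(f"maximizing pi = {grid[best]:.2f}, I = {I[best]:.3f} bits, 1 - alpha = {1 - alpha:.3f}")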

Page 13: Computing and Communications 2. Information Theory

SYMMETRIC CHANNELS


Page 14: Computing and Communications 2. Information Theory

Symmetric

– example of symmetric channel
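For a symmetric channel, the capacity is C = log|Y| − H(r), where r is any row of the transition matrix, and it is achieved by the uniform input distribution (the standard result from Cover and Thomas). A minimal sketch with an illustrative transition matrix, not necessarily the one pictured on the slide:

import numpy as np

# Illustrative symmetric channel: every row (and column) is a permutation of every other.
W = np.array([[0.3, 0.2, 0.5],
              [0.5, 0.3, 0.2],
              [0.2, 0.5, 0.3]])
row = W[0]
H_row = float(-(row * np.log2(row)).sum())     # entropy of a row, in bits
C = np.log2(W.shape[1]) - H_row                # C = log|Y| - H(row)
print(f"C = {C:.3f} bits")                     # ≈ 0.1 bits for this matrix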


Page 15: Computing and Communications 2. Information Theory

Proof


Page 16: Computing and Communications 2. Information Theory

PROPERTIES OF CHANNEL CAPACITY


Page 17: Computing and Communications 2. Information Theory

Properties of Channel Capacity

• C ≥ 0, since I(X; Y) ≥ 0

• C ≤ log |X|, since C = max I(X; Y) ≤ max H(X) = log |X|

• C ≤ log |Y|, since C = max I(X; Y) ≤ max H(Y) = log |Y|

• I(X; Y) is a continuous function of p(x)

• I(X; Y) is a concave function of p(x)

• The problem of computing channel capacity is a convex problem
  – maximization of a bounded concave function over a closed convex set
  – the maximum can then be found by standard nonlinear optimization techniques such as gradient search (a numerical sketch follows below)
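The slides point to gradient search; another standard numerical method is the Blahut–Arimoto algorithm (not mentioned on the slides), sketched here and checked against the closed-form capacities derived earlier. The channel matrices are illustrative.

import numpy as np

def blahut_arimoto(W, iters=500):
    """Capacity in bits of a DMC with transition matrix W[x, y] = p(y|x),
    via the standard Blahut-Arimoto alternating update of the input distribution."""
    n_in = W.shape[0]
    p = np.full(n_in, 1.0 / n_in)              # start from the uniform input
    for _ in range(iters):
        py = p @ W                             # output distribution (pW)(y)
        ratio = np.divide(W, py, out=np.ones_like(W), where=W > 0)
        d = (W * np.log2(ratio)).sum(axis=1)   # D(W(.|x) || pW) in bits, per input x
        c = 2.0 ** d
        cap = np.log2((p * c).sum())           # lower bound on C; converges to C
        p = p * c / (p * c).sum()              # multiplicative update of p(x)
    return float(cap)

# Sanity checks against the closed forms (illustrative parameter values):
bsc = lambda q: np.array([[1 - q, q], [q, 1 - q]])
print(blahut_arimoto(bsc(0.1)))                # -> about 1 - H(0.1) ≈ 0.531
bec = np.array([[0.7, 0.3, 0.0],               # binary erasure channel, alpha = 0.3
                [0.0, 0.3, 0.7]])
print(blahut_arimoto(bec))                     # -> about 1 - alpha = 0.7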

Page 18: Computing and Communications 2. Information Theory

DEFINITIONS


Page 19: Computing and Communications 2. Information Theory

Discrete Memoryless Channel (DMC)


Page 20: Computing and Communications 2. Information Theory

Code


Page 21: Computing and Communications 2. Information Theory

Probability of Error


Page 22: Computing and Communications 2. Information Theory

Rate and Capacity

– write (2𝑛𝑅, 𝑛) codes to mean (⌈2𝑛𝑅⌉, 𝑛) codes to simplify the notation
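For reference, the surrounding definitions, stated following Cover and Thomas:

The rate of an $(M, n)$ code is $R = \frac{\log_2 M}{n}$ bits per transmission.
A rate $R$ is \emph{achievable} if there exists a sequence of $(\lceil 2^{nR} \rceil, n)$ codes
such that the maximal probability of error $\lambda^{(n)} \to 0$ as $n \to \infty$.
The capacity of a channel is the supremum of all achievable rates.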


Page 23: Computing and Communications 2. Information Theory

CHANNEL CODING THEOREM (SHANNON’S SECOND THEOREM)


Page 24: Computing and Communications 2. Information Theory

Basic Idea

• For large block lengths, every channel has a subset of inputs producing disjoint sequences at the output

• Ensure that no two input X sequences produce the same output Y sequence, to determine which X sequence was sent


Page 25: Computing and Communications 2. Information Theory

Basic Idea

• Total number of possible output Y sequences is ≈ 2𝑛𝐻(𝑌)

• Divide into sets of size 2𝑛𝐻(𝑌|𝑋) corresponding to the different input X sequences

• Total number of disjoint sets is less than or equal to 2𝑛(𝐻(𝑌)−𝐻(𝑌|𝑋)) = 2𝑛𝐼(𝑋;𝑌)

• Send at most ≈ 2𝑛𝐼(𝑋;𝑌) distinguishable sequences of length n
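To put rough numbers on this counting argument, take a purely illustrative case: a BSC with crossover p = 0.11 and uniform input, so that H(Y) = 1 and H(Y|X) = H(p).

import numpy as np

p, n = 0.11, 100                               # illustrative crossover and block length
H = lambda q: float(-q * np.log2(q) - (1 - q) * np.log2(1 - q))
print(f"output sequences       ~ 2^{n * 1.0:.0f}")         # 2^{nH(Y)}
print(f"outputs per input      ~ 2^{n * H(p):.0f}")         # 2^{nH(Y|X)}
print(f"distinguishable inputs ~ 2^{n * (1 - H(p)):.0f}")   # 2^{nI(X;Y)}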


Page 26: Computing and Communications 2. Information Theory

Channel Coding Theorem
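For reference, the statement of the theorem, following Cover and Thomas:

\textbf{Theorem (channel coding theorem).} For a discrete memoryless channel, all rates
below capacity are achievable: for every rate $R < C$ there exists a sequence of
$(2^{nR}, n)$ codes with maximal probability of error $\lambda^{(n)} \to 0$.
Conversely, any sequence of $(2^{nR}, n)$ codes with $\lambda^{(n)} \to 0$ must have $R \le C$.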


Page 27: Computing and Communications 2. Information Theory

New Ideas in Shannon’s Proof

• Allowing an arbitrarily small but nonzero probability of error

• Using the channel many times in succession, so that the law of large numbers comes into effect

• Calculating the average of the probability of error over a random choice of codebooks

– this symmetrizes the probability, and can then be used to show the existence of at least one good code

• Shannon’s proof outline was based on the idea of typical sequences, but was not made rigorous until much later


Page 28: Computing and Communications 2. Information Theory

Current Proof

• Use the same essential ideas
  – random code selection, calculation of the average probability of error for a random choice of codewords, and so on

• Main difference is in the decoding rule: decode by joint typicality
  – look for a codeword that is jointly typical with the received sequence
  – if a unique codeword satisfying this property is found, declare that word to be the transmitted codeword
  – properties of joint typicality
    • with high probability the transmitted codeword and the received sequence are jointly typical, since they are probabilistically related
    • the probability that any other codeword looks jointly typical with the received sequence is ≈ 2−𝑛𝐼
    • thus, if we have fewer than 2𝑛𝐼 codewords, then with high probability there will be no other codewords that can be confused with the transmitted codeword, and the probability of error is small (a small simulation sketch follows below)
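A minimal simulation sketch of random coding with joint-typicality decoding over a BSC; all parameter values are illustrative, and joint typicality is measured simply by the empirical crossover fraction between a codeword and the received sequence.

import numpy as np

rng = np.random.default_rng(0)

def h2(q):
    """Binary entropy in bits."""
    return float(-q * np.log2(q) - (1 - q) * np.log2(1 - q))

# Illustrative parameters: a BSC with crossover p, block length n, and a rate R
# chosen well below capacity C = 1 - H(p).
p, n, R, eps = 0.11, 100, 0.1, 0.1
M = 2 ** int(n * R)                            # number of codewords, 2^{nR}

# Random codebook: each codeword drawn i.i.d. Bernoulli(1/2).
codebook = rng.integers(0, 2, size=(M, n))

errors, trials = 0, 500
for _ in range(trials):
    w = rng.integers(M)                        # message index
    x = codebook[w]
    y = x ^ (rng.random(n) < p).astype(int)    # pass the codeword through the BSC

    # Joint typicality for the BSC: a codeword is declared jointly typical with y
    # if its empirical crossover fraction is within eps of p.
    frac = (codebook != y).mean(axis=1)
    typical = np.flatnonzero(np.abs(frac - p) <= eps)

    # Error unless exactly one codeword is typical and it is the transmitted one.
    if len(typical) != 1 or typical[0] != w:
        errors += 1

print(f"C = {1 - h2(p):.3f} bits, R = {R}, empirical error rate = {errors / trials:.3f}")

With R well below C the measured error rate is essentially zero; as R approaches C, other codewords increasingly fall inside the typicality window and decoding errors appear.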


Page 29: Computing and Communications 2. Information Theory

SOURCE–CHANNEL SEPARATION THEOREM (SHANNON’S THIRD THEOREM)


Page 30: Computing and Communications 2. Information Theory

Two Main Basic Theorems

• Data compression: R > H

• Data transmission: R < C

• Is condition H < C necessary and sufficient for sending a source over a channel?


Page 31: Computing and Communications 2. Information Theory

Example

• Consider two methods for sending digitized speech over a discrete memoryless channel

– one-stage method: design a code to map the sequence of speech samples directly into the input of the channel

– two-stage method: compress the speech into its most efficient representation, then use the appropriate channel code to send it over the channel

• Do we lose something by using the two-stage method?

– data compression does not depend on the channel

– channel coding does not depend on the source distribution


Page 32: Computing and Communications 2. Information Theory

Joint vs. Separate Channel Coding

• Joint source and channel coding

• Separate source and channel coding


Page 33: Computing and Communications 2. Information Theory

Source–Channel Coding Theorem

– consider the design of a communication system as a combination of two parts
  • source coding: design source codes for the most efficient representation of the data
  • channel coding: design channel codes appropriate for the channel (combat the noise and errors introduced by the channel)

– the separate encoders can achieve the same rates as the joint encoder
  • this holds for the situation where one transmitter communicates to one receiver
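For reference, the statement of the source–channel coding theorem, following Cover and Thomas:

\textbf{Theorem (source--channel coding theorem).} If $V_1, V_2, \ldots$ is a
finite-alphabet stochastic process satisfying the AEP and $H(\mathcal{V}) < C$, then there
exists a source--channel code with probability of error $\Pr\{\hat{V}^n \neq V^n\} \to 0$.
Conversely, if $H(\mathcal{V}) > C$, the probability of error is bounded away from zero:
the process cannot be sent over the channel with arbitrarily low probability of error.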


Page 34: Computing and Communications 2. Information Theory

Summary


Page 35: Computing and Communications 2. Information Theory

Summary
