
Page 1: Computing and Communications 2. Information Theory


Computing and Communications
2. Information Theory
– Channel Capacity

Ying Cui

Department of Electronic Engineering

Shanghai Jiao Tong University, China

2017, Autumn


Page 2: Computing and Communications 2. Information Theory

Outline

• Communication system

• Examples of channel capacity

• Symmetric channels

• Properties of channel capacity

• Definitions

• Channel coding theorem

• Source-channel coding theorem


Page 3: Computing and Communications 2. Information Theory

Reference

• T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley


Page 4: Computing and Communications 2. Information Theory

CHANNEL CAPACITY


Page 5: Computing and Communications 2. Information Theory

Communication System

– map source symbols from a finite alphabet into some sequence of channel symbols, i.e., the input sequence of the channel

– the output sequence of the channel is random but has a distribution that depends on the input sequence of the channel
  • two different input sequences may give rise to the same output sequence, i.e., the inputs are confusable
  • choose a “nonconfusable” subset of input sequences so that with high probability there is only one highly likely input that could have caused the particular output

– attempt to recover the transmitted message from the output sequence of the channel
  • reconstruct the input sequence with a negligible probability of error


Page 6: Computing and Communications 2. Information Theory

Channel Capacity
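For reference, the information channel capacity of a discrete memoryless channel is the maximum mutual information over all input distributions,

  $C = \max_{p(x)} I(X; Y)$,

which is the quantity written “C = max I(X; Y)” on the following slides.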


Page 7: Computing and Communications 2. Information Theory

EXAMPLES OF CHANNEL CAPACITY


Page 8: Computing and Communications 2. Information Theory

Noiseless Binary Channel

• Binary input is reproduced exactly at output

• C = max I(X; Y) = 1 bit, achieved using p(x) = (1/2, 1/2)

– one error-free bit can be transmitted per channel use


Page 9: Computing and Communications 2. Information Theory

Noisy Channel with Nonoverlapping Outputs

• Two possible outputs corresponding to each of the two inputs
  – appears to be noisy, but really is not

• C = max I(X; Y) = 1 bit, achieved using p(x) = (1/2, 1/2)
  – the input can be determined from the output

– every transmitted bit can be recovered without error


Page 10: Computing and Communications 2. Information Theory

Noisy Typewriter

• Channel input is either unchanged with probability 1/2 or is transformed into the next letter with probability 1/2

• If the input has 26 symbols and we use every alternate input symbol, we can transmit one of 13 symbols without error with each transmission

• C = max I(X; Y) = max (H(Y) − H(Y|X)) = max H(Y) − 1 = log 26 − 1 = log 13
  – achieved using p(x) = (1/26, …, 1/26)
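As a quick numerical check of these two capacities, the sketch below evaluates I(X; Y) for a uniform input; the helper function and the explicit transition matrices are illustrative constructions, not taken from the slides.

import numpy as np

def mutual_information(p_x, W):
    """I(X;Y) in bits, for input distribution p_x and channel matrix W[x, y] = p(y|x)."""
    p_xy = p_x[:, None] * W                      # joint distribution p(x, y)
    p_y = p_xy.sum(axis=0)                       # output marginal
    mask = p_xy > 0                              # skip zero-probability pairs
    return float((p_xy[mask] * np.log2(p_xy[mask] / (p_x[:, None] * p_y[None, :])[mask])).sum())

# Noiseless binary channel: the output equals the input.
W_noiseless = np.eye(2)
print(mutual_information(np.array([0.5, 0.5]), W_noiseless))    # -> 1.0 bit

# Noisy typewriter: each of 26 inputs stays put or moves to the next letter, w.p. 1/2 each.
W_typewriter = np.zeros((26, 26))
for x in range(26):
    W_typewriter[x, x] = 0.5
    W_typewriter[x, (x + 1) % 26] = 0.5
print(mutual_information(np.full(26, 1 / 26), W_typewriter))    # -> log2(13) ≈ 3.70 bits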


Page 11: Computing and Communications 2. Information Theory

Binary Symmetric Channel

• Input symbols are complemented with probability p

• I(X; Y) = H(Y) − H(Y|X) = H(Y) − Σx p(x) H(Y|X = x) = H(Y) − H(p) ≤ 1 − H(p), so C = 1 − H(p) bits
  – equality is achieved when the input distribution is uniform
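A minimal numeric check of C = 1 − H(p); the crossover values are chosen only for illustration.

import numpy as np

def binary_entropy(p):
    """H(p) in bits."""
    return 0.0 if p in (0.0, 1.0) else float(-p * np.log2(p) - (1 - p) * np.log2(1 - p))

for p in (0.0, 0.1, 0.25, 0.5):
    print(f"p = {p:4.2f}   C = 1 - H(p) = {1 - binary_entropy(p):.3f} bits")
# p = 0.5 gives C = 0: the output is then independent of the input.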


Page 12: Computing and Communications 2. Information Theory

Binary Erasure Channel

• Two inputs and three outputs; a fraction α of the bits are erased

• I(X; Y) = H(X) − H(X|Y) = H(π) − α H(π) = (1 − α) H(π), where π = Pr(X = 1)

• C = max I(X; Y) = 1 − α bits, achieved when π = 1/2

• Recover at most a fraction 1 − α of the bits, as a fraction α of the bits are lost
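A quick grid check (with an arbitrary illustrative erasure probability α) that I(X; Y) = (1 − α)H(π) is largest at the uniform input π = 1/2:

import numpy as np

H = lambda q: -q * np.log2(q) - (1 - q) * np.log2(1 - q)   # binary entropy in bits

alpha = 0.3                                    # erasure probability (illustrative)
grid = np.linspace(0.01, 0.99, 99)             # candidate values of pi = Pr(X = 1)
I = [(1 - alpha) * H(pi) for pi in grid]
best = int(np.argmax(I))
print(f"maximizing pi = {grid[best]:.2f}, I = {I[best]:.3f} bits, 1 - alpha = {1 - alpha:.3f}")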

Page 13: Computing and Communications 2. Information Theory

SYMMETRIC CHANNELS


Page 14: Computing and Communications 2. Information Theory

Symmetric

– example of symmetric channel
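For a symmetric channel, the capacity is C = log|Y| − H(r), where r is any row of the transition matrix, and it is achieved by the uniform input distribution (the standard result from Cover and Thomas). A minimal sketch with an illustrative transition matrix, not necessarily the one pictured on the slide:

import numpy as np

# Illustrative symmetric channel: every row (and column) is a permutation of every other.
W = np.array([[0.3, 0.2, 0.5],
              [0.5, 0.3, 0.2],
              [0.2, 0.5, 0.3]])
row = W[0]
H_row = float(-(row * np.log2(row)).sum())     # entropy of a row, in bits
C = np.log2(W.shape[1]) - H_row                # C = log|Y| - H(row)
print(f"C = {C:.3f} bits")                     # ≈ 0.1 bits for this matrix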


Page 15: Computing and Communications 2. Information Theory

Proof


Page 16: Computing and Communications 2. Information Theory

PROPERTIES OF CHANNEL CAPACITY


Page 17: Computing and Communications 2. Information Theory

Properties of Channel Capacity

• C ≥ 0, since I(X; Y) ≥ 0

• C ≤ log |X|, since C = max I(X; Y) ≤ max H(X) = log |X|

• C ≤ log |Y|, since C = max I(X; Y) ≤ max H(Y) = log |Y|

• I(X; Y) is a continuous function of p(x)

• I(X; Y) is a concave function of p(x)

• The problem of computing channel capacity is a convex problem
  – maximization of a bounded concave function over a closed convex set
  – the maximum can then be found by standard nonlinear optimization techniques such as gradient search (a numerical sketch follows below)
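The slides point to gradient search; another standard numerical method is the Blahut–Arimoto algorithm (not mentioned on the slides), sketched here and checked against the closed-form capacities derived earlier. The channel matrices are illustrative.

import numpy as np

def blahut_arimoto(W, iters=500):
    """Capacity in bits of a DMC with transition matrix W[x, y] = p(y|x),
    via the standard Blahut-Arimoto alternating update of the input distribution."""
    n_in = W.shape[0]
    p = np.full(n_in, 1.0 / n_in)              # start from the uniform input
    for _ in range(iters):
        py = p @ W                             # output distribution (pW)(y)
        ratio = np.divide(W, py, out=np.ones_like(W), where=W > 0)
        d = (W * np.log2(ratio)).sum(axis=1)   # D(W(.|x) || pW) in bits, per input x
        c = 2.0 ** d
        cap = np.log2((p * c).sum())           # lower bound on C; converges to C
        p = p * c / (p * c).sum()              # multiplicative update of p(x)
    return float(cap)

# Sanity checks against the closed forms (illustrative parameter values):
bsc = lambda q: np.array([[1 - q, q], [q, 1 - q]])
print(blahut_arimoto(bsc(0.1)))                # -> about 1 - H(0.1) ≈ 0.531
bec = np.array([[0.7, 0.3, 0.0],               # binary erasure channel, alpha = 0.3
                [0.0, 0.3, 0.7]])
print(blahut_arimoto(bec))                     # -> about 1 - alpha = 0.7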

Page 18: Computing and Communications 2. Information Theory

DEFINITIONS


Page 19: Computing and Communications 2. Information Theory

Discrete Memoryless Channel (DMC)


Page 20: Computing and Communications 2. Information Theory

Code


Page 21: Computing and Communications 2. Information Theory

Probability of Error


Page 22: Computing and Communications 2. Information Theory

Rate and Capacity

– write (2𝑛𝑅, 𝑛) codes to mean (⌈2𝑛𝑅⌉, 𝑛) codes to simplify the notation
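For reference, the surrounding definitions, stated following Cover and Thomas:

The rate of an $(M, n)$ code is $R = \frac{\log_2 M}{n}$ bits per transmission.
A rate $R$ is \emph{achievable} if there exists a sequence of $(\lceil 2^{nR} \rceil, n)$ codes
such that the maximal probability of error $\lambda^{(n)} \to 0$ as $n \to \infty$.
The capacity of a channel is the supremum of all achievable rates.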


Page 23: Computing and Communications 2. Information Theory

CHANNEL CODING THEOREM (SHANNON’S SECOND THEOREM)


Page 24: Computing and Communications 2. Information Theory

Basic Idea

• For large block lengths, every channel has a subset of inputs producing disjoint sequences at the output

• Ensure that no two input X sequences produce the same output Y sequence, to determine which X sequence was sent


Page 25: Computing and Communications 2. Information Theory

Basic Idea

• Total number of possible output Y sequences is ≈ 2𝑛𝐻(𝑌)

• Divide into sets of size 2𝑛𝐻(𝑌|𝑋) corresponding to the different input X sequences

• Total number of disjoint sets is less than or equal to 2𝑛(𝐻(𝑌)−𝐻(𝑌|𝑋)) = 2𝑛𝐼(𝑋;𝑌)

• Send at most ≈ 2𝑛𝐼(𝑋;𝑌) distinguishable sequences of length n
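To put rough numbers on this counting argument, take a purely illustrative case: a BSC with crossover p = 0.11 and uniform input, so that H(Y) = 1 and H(Y|X) = H(p).

import numpy as np

p, n = 0.11, 100                               # illustrative crossover and block length
H = lambda q: float(-q * np.log2(q) - (1 - q) * np.log2(1 - q))
print(f"output sequences       ~ 2^{n * 1.0:.0f}")         # 2^{nH(Y)}
print(f"outputs per input      ~ 2^{n * H(p):.0f}")         # 2^{nH(Y|X)}
print(f"distinguishable inputs ~ 2^{n * (1 - H(p)):.0f}")   # 2^{nI(X;Y)}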


Page 26: Computing and Communications 2. Information Theory

Channel Coding Theorem
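For reference, the statement of the theorem, following Cover and Thomas:

\textbf{Theorem (channel coding theorem).} For a discrete memoryless channel, all rates
below capacity are achievable: for every rate $R < C$ there exists a sequence of
$(2^{nR}, n)$ codes with maximal probability of error $\lambda^{(n)} \to 0$.
Conversely, any sequence of $(2^{nR}, n)$ codes with $\lambda^{(n)} \to 0$ must have $R \le C$.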


Page 27: Computing and Communications 2. Information Theory

New Ideas in Shannon’s Proof

• Allowing an arbitrarily small but nonzero probability of error

• Using the channel many times in succession, so that the law of large numbers comes into effect

• Calculating the average of the probability of error over a random choice of codebooks

– this symmetrizes the probability, and can then be used to show the existence of at least one good code

• Shannon’s proof outline was based on the idea of typical sequences, but was not made rigorous until much later


Page 28: Computing and Communications 2. Information Theory

Current Proof

• Use the same essential ideas
  – random code selection, calculation of the average probability of error for a random choice of codewords, and so on

• Main difference is in the decoding rule: decode by joint typicality
  – look for a codeword that is jointly typical with the received sequence
  – if a unique codeword satisfying this property is found, declare that word to be the transmitted codeword
  – properties of joint typicality
    • with high probability the transmitted codeword and the received sequence are jointly typical, since they are probabilistically related
    • the probability that any other codeword looks jointly typical with the received sequence is ≈ 2−𝑛𝐼
    • thus, if we have fewer than 2𝑛𝐼 codewords, then with high probability there will be no other codewords that can be confused with the transmitted codeword, and the probability of error is small (a small simulation sketch follows below)
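A minimal simulation sketch of random coding with joint-typicality decoding over a BSC; all parameter values are illustrative, and joint typicality is measured simply by the empirical crossover fraction between a codeword and the received sequence.

import numpy as np

rng = np.random.default_rng(0)

def h2(q):
    """Binary entropy in bits."""
    return float(-q * np.log2(q) - (1 - q) * np.log2(1 - q))

# Illustrative parameters: a BSC with crossover p, block length n, and a rate R
# chosen well below capacity C = 1 - H(p).
p, n, R, eps = 0.11, 100, 0.1, 0.1
M = 2 ** int(n * R)                            # number of codewords, 2^{nR}

# Random codebook: each codeword drawn i.i.d. Bernoulli(1/2).
codebook = rng.integers(0, 2, size=(M, n))

errors, trials = 0, 500
for _ in range(trials):
    w = rng.integers(M)                        # message index
    x = codebook[w]
    y = x ^ (rng.random(n) < p).astype(int)    # pass the codeword through the BSC

    # Joint typicality for the BSC: a codeword is declared jointly typical with y
    # if its empirical crossover fraction is within eps of p.
    frac = (codebook != y).mean(axis=1)
    typical = np.flatnonzero(np.abs(frac - p) <= eps)

    # Error unless exactly one codeword is typical and it is the transmitted one.
    if len(typical) != 1 or typical[0] != w:
        errors += 1

print(f"C = {1 - h2(p):.3f} bits, R = {R}, empirical error rate = {errors / trials:.3f}")

With R well below C the measured error rate is essentially zero; as R approaches C, other codewords increasingly fall inside the typicality window and decoding errors appear.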


Page 29: Computing and Communications 2. Information Theory

SOURCE–CHANNEL SEPARATION THEOREM (SHANNON’S THIRD THEOREM)


Page 30: Computing and Communications 2. Information Theory

Two Main Basic Theorems

• Data compression: R > H

• Data transmission: R < C

• Is condition H < C necessary and sufficient for sending a source over a channel?


Page 31: Computing and Communications 2. Information Theory

Example

• Consider two methods for sending digitized speech over a discrete memoryless channel

– one-stage method: design a code to map the sequence of speech samples directly into the input of the channel

– two-stage method: compress the speech into its most efficient representation, then use the appropriate channel code to send it over the channel

• Do we lose something by using the two-stage method?

– data compression does not depend on the channel

– channel coding does not depend on the source distribution


Page 32: Computing and Communications 2. Information Theory

Joint vs. Separate Channel Coding

• Joint source and channel coding

• Separate source and channel coding


Page 33: Computing and Communications 2. Information Theory

Source–Channel Coding Theorem

– consider the design of a communication system as a combination of two parts
  • source coding: design source codes for the most efficient representation of the data
  • channel coding: design channel codes appropriate for the channel (combat the noise and errors introduced by the channel)

– the separate encoders can achieve the same rates as the joint encoder
  • this holds for the situation where one transmitter communicates to one receiver
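For reference, the statement of the source–channel coding theorem, following Cover and Thomas:

\textbf{Theorem (source--channel coding theorem).} If $V_1, V_2, \ldots$ is a
finite-alphabet stochastic process satisfying the AEP and $H(\mathcal{V}) < C$, then there
exists a source--channel code with probability of error $\Pr\{\hat{V}^n \neq V^n\} \to 0$.
Conversely, if $H(\mathcal{V}) > C$, the probability of error is bounded away from zero:
the process cannot be sent over the channel with arbitrarily low probability of error.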


Page 34: Computing and Communications 2. Information Theory

Summary


Page 35: Computing and Communications 2. Information Theory

Summary
