Page 1:

ECE 566 - INFORMATION THEORY
Prof. Thinh Nguyen

Page 2:

LOGISTICS

Textbook: Elements of Information Theory, Thomas Cover and Joy Thomas, Wiley, second edition

Class email list: [email protected]

Page 3:

LOGISTICS

Grading Policy

Quiz 1: 15%
Quiz 2: 15%
Quiz 3: 15%
Quiz 4: 15%
(The lowest quiz score will be dropped.)
Homework: 15%
Final: 40%

Page 4:

WHAT IS INFORMATION?

Information theory deals with the concept of information, its measurement, and its applications.

Page 5:

ORIGIN OF INFORMATION THEORY

Two schools of thought:

British
- Semantic information: related to the meaning of messages
- Pragmatic information: related to the usage and effect of messages

American
- Syntactic information: related to the symbols from which messages are composed, and their interrelations

Page 6:

ORIGIN OF INFORMATION THEORY

Consider the following sentences:
1. Jacqueline was awarded the gold medal by the judges at the national skating competition.
2. The judges awarded Jacqueline the gold medal at the national skating competition.
3. There is a traffic jam on I-5 between Corvallis and Eugene in Oregon.
4. There is a traffic jam on I-5 in Oregon.

(1) and (2): syntactically different but semantically and pragmatically identical.

(3) and (4): syntactically and semantically different; (3) gives more precise information than (4).

(3) and (4) are irrelevant to Californians.

Page 7:

ORIGIN OF INFORMATION THEORY

The British tradition is closely related to philosophy, psychology, and biology. Influential scientists include MacKay, Carnap, Bar-Hillel, ... Very hard, if not impossible, to quantify.

The American tradition deals with the syntactic aspects of information. Influential scientists include Shannon, Renyi, Gallager, and Csiszar, ... Can be rigorously quantified by mathematics.

Page 8:

ORIGIN OF INFORMATION THEORY

Basic questions of information theory in the American tradition involve the measurement of syntactic information

The fundamental limits on the amount of information which can be transmitted

The fundamental limits on the compression of information which can be achieved

How to build information processing systems approaching these limits?

Page 9:

ORIGIN OF INFORMATION THEORY

H. Nyquist (1924) published an article wherein he discussed how messages (characters) could be sent over a telegraph channel with maximum possible speed, but without distortion.

R. Hartley (1928) first defined a measure of information as the logarithm of the number of distinguishable messages one can represent using n characters, each of which can take s possible letters.

Page 10:

ORIGIN OF INFORMATION THEORY

For a message of length n, Hartley's measure of information is:

H_H(s^n) = log(s^n) = n log(s)

For a message of length 1, Hartley's measure of information is:

H_H(s^1) = log(s^1) = 1 · log(s)

This definition corresponds with the intuitive idea that a message consisting of n symbols contains n times as much information as a message consisting of only one symbol.
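A quick worked instance (base-2 logarithm; standard arithmetic, not on the slide): for an alphabet of s = 2 letters and messages of length n = 8,

H_H(2^8) = log2(2^8) = 8 · log2(2) = 8 bits,

i.e., one bit per binary symbol, matching intuition.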

Page 11:

ORIGIN OF INFORMATION THEORY

Note that any function satisfying

f(s^n) = n · f(s)

would match our intuition. One can show that the only functions satisfying this equation are of the form:

f(s) = log_a(s)

The choice of a is arbitrary, and is more a matter of normalization.

Page 12:

AND NOW..., SHANNON'S INFORMATION THEORY
(a.k.a. communication theory, mathematical information theory, or, in short, information theory)

Page 13:

CLAUDE SHANNON

“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point.” (Claude Shannon 1948)

Channel Coding Theorem: It is possible to achieve near-perfect communication of information over a noisy channel.

In this course we will:
- Define what we mean by information
- Show how we can compress the information in a source to its theoretical minimum value, and show the tradeoff between data compression and distortion
- Prove the Channel Coding Theorem and derive the information capacity of different channels

(Claude Shannon, 1916-2001)

Page 14:

H(X) = E[-log2 p(X)] = -Σ_{x∈A} p(x) log2 p(x)

The little formula that starts it all
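As a minimal sketch (not from the slides), the formula translates directly into a few lines of Python; the pmf below is a hypothetical distribution for illustration:

    import math

    def entropy(pmf):
        """Shannon entropy in bits: H(X) = -sum p(x) log2 p(x)."""
        return -sum(p * math.log2(p) for p in pmf if p > 0)

    print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits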

Page 15:

SOME FUNDAMENTAL QUESTIONS
(that can be addressed by information theory)

What is the minimum number of bits that can be used to represent the following text?

"There is a wisdom that is woe; but there is a woe that is madness. And there is a Catskill eagle in some souls that can alike dive down into the blackest gorges, and soar out of them again and become invisible in the sunny spaces. And even if he for ever flies within the gorge, that gorge is in the mountains; so that even in his lowest swoop the mountain eagle is still higher than other birds upon the plain, even though they soar."

- Moby Dick, Herman Melville

Page 16:

SOME FUNDAMENTAL QUESTIONS
(that can be addressed by information theory)

How fast can information be sent reliably over an error-prone channel? (Note the corrupted text below.)

"Thera is a wisdem that is woe; but thera is a woe that is madness. And there is a Catskill eagle in sme souls that can alike divi down into the blackst goages, and soar out of thea agein and become invisble in the sunny speces. And even if he for ever flies within the gorge, that gorge is in the mountains; so that even in his lotest swoap the mountin eagle is still highhr than other berds upon the plain, even though they soar."

- Moby Dick, Herman Melville

Page 17:

FROM QUESTIONS TO APPLICATIONS

Page 18:

SOME NOT-SO-FUNDAMENTAL QUESTIONS
(that can be addressed by information theory)

How to make money? (Kelly’s Portfolio Theory)

What is the optimal dating strategy?

Page 19:

FUNDAMENTAL ASSUMPTION OF SHANNON'S INFORMATION THEORY

[Figure: UNIVERSE → Physical Model (Laws)]

E.g., the Newtonian universe is deterministic: all events can be predicted with 100% accuracy by Newton's laws, given the initial conditions.

If there is no irreducible randomness, the description/history of our universe can therefore be compressed into a few equations.

Page 20:

FUNDAMENTAL ASSUMPTION OF SHANNON'S INFORMATION THEORY

[Figure: UNIVERSE → Probabilistic model]

Some well-defined stochastic process generates all the observations.

Thus, in a certain sense, Shannon information theory is somewhat unsatisfactory, since it is based on the "ignorant model." Nevertheless, it is a good approximation, depending on how accurately the probabilistic model describes the phenomenon of interest.

Page 21:

FUNDAMENTAL ASSUMPTION OF SHANNON'S INFORMATION THEORY

"There is a wisdom that is woe; but there is a woe that is madness. And there is a Catskill eagle in some souls that can alike dive down into the blackest gorges, and soar out of them again and become invisible in the sunny spaces. And even if he for ever flies within the gorge, that gorge is in the mountains; so that even in his lowest swoop the mountain eagle is still higher than other birds upon the plain, even though they soar."

- Moby Dick, Herman Melville

We know the text above is not random, but we can build a probabilistic model for it.

For example, a first-order approximation could be to build a histogram of the different letters in the text, then approximate the probability that a particular letter appears based on the histogram (assuming i.i.d.).

Page 22:

WHY IS SHANNON INFORMATION SOMEWHAT UNSATISFACTORY?

How much Shannon information does this picture contain?

Page 23:

ALGORITHMIC INFORMATION THEORY
(KOLMOGOROV COMPLEXITY)

The Kolmogorov complexity K(x) of a sequence x is the minimum size of the program and its inputs needed to generate x.

Example: If x were a sequence of all ones, a highly compressible sequence, the program would simply be a print statement in a loop. At the other extreme, if x were a random sequence with no structure, then the only program that could generate it would contain the sequence itself.

Page 24:

REVIEW OF BASIC PROBABILITY

A discrete random variable X takes a value x from the alphabet A with probability p(x).

Page 25:

EXPECTED VALUE

If g(x) is real-valued and defined on A, then

E_X[g(X)] = Σ_{x∈A} p(x) g(x)

Page 26:

SHANNON INFORMATION CONTENT

The Shannon information content of an outcome with probability p is -log2(p).

Page 27:

MINESWEEPER

Where is the bomb? 16 equally likely possibilities, so 4 bits are needed to specify it: -log2(1/16) = 4 bits.

Page 28:

ENTROPY

H(X) = E[-log2 p(X)] = -Σ_{x∈A} p(x) log2 p(x)

Page 29:

ENTROPY EXAMPLES

Bernoulli Random Variable
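A worked instance (standard result, not spelled out on the slide): for X ~ Bernoulli(p),

H(X) = -p log2(p) - (1-p) log2(1-p) ≡ H(p),

which is maximized at p = 1/2, where H(1/2) = 1 bit, and equals 0 at p = 0 or p = 1.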

Page 30:

ENTROPY EXAMPLES

Four Colored Shapes: A = [four distinct colored shapes]

p(X) = [1/2; 1/4; 1/8; 1/8]
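Working this out with the entropy formula (standard arithmetic):

H(X) = 1/2·log2(2) + 1/4·log2(4) + 1/8·log2(8) + 1/8·log2(8) = 0.5 + 0.5 + 0.375 + 0.375 = 1.75 bits.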

Page 31:

DERIVATION OF SHANNON ENTROPY

H(X) = Σ_{i=1}^{n} p_i log2(1/p_i)

Intuitive Requirements:

1. We want H to be a continuous function of probabilities pi . That is, a small change in pi should only cause a small change in H

2. If all events are equally likely, that is, pi = 1/n for all i, then H should be a monotonically increasing function of n. The more possible outcomes there are, the more information should be contained in the occurrence of any particular outcome.

3. It does not matter how we divide the group of outcomes; the total information should be the same.

Page 32:

DERIVATION OF SHANNON ENTROPY

Page 33:

DERIVATION OF SHANNON ENTROPY

Page 34:

DERIVATION OF SHANNON ENTROPY

Page 35:

DERIVATION OF SHANNON ENTROPY

Page 36:

LITTLE PUZZLE

Given twelve coins, eleven of which are identical, and one that is either lighter or heavier than the rest. You have a balance scale that can determine whether what you put on the left pan is heavier than, lighter than, or the same weight as what you put on the right pan. What is the minimum number of weighings needed to determine the odd coin, and whether it is heavier or lighter than the rest?

Page 37:

JOINT ENTROPY H(X,Y)

H(X,Y) = -E[log p(X,Y)]

p(X,Y)   Y = 0   Y = 1
X = 0    1/2     1/4
X = 1    0       1/4

p(X,Y)   Y = 0   Y = 1
X = 0    1/4     1/4
X = 1    1/4     1/4

Page 38:

JOINT ENTROPY H(X,Y)

Example: Y = X^2, X ∈ {-1, 1}, p_X = [0.1, 0.9]

p(X,Y)   Y = 1
X = -1   0.1
X = 1    0.9

Page 39:

CONDITIONAL ENTROPY H(Y|X)

H(Y|X) = -E[log p(Y|X)]

p(Y|X)   Y = 0   Y = 1
X = 0    2/3     1/3
X = 1    0       1

Page 40:

CONDITIONAL ENTROPY H(Y|X)

Example: Y = X^2, X ∈ {-1, 1}, p_X = [0.1, 0.9]

p(Y|X)   Y = 1
X = -1   1
X = 1    1

p(X|Y)   Y = 1
X = -1   0.1
X = 1    0.9

Page 41:

INTERPRETATIONS OF CONDITIONAL ENTROPY H(Y|X)

H(Y|X): the average uncertainty/information in Y when you know X

H(Y|X): the weighted average of row entropies:

p(X,Y)   Y = 0   Y = 1   row entropy   p(X)
X = 0    1/2     1/4     H(1/3)        3/4
X = 1    0       1/4     H(1)          1/4

Page 42:

CHAIN RULES

For two RVs:

H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)

Proof:

In general:

H(X1, X2, ..., Xn) = Σ_{i=1}^{n} H(Xi | Xi-1, ..., X2, X1)

Proof:

Page 43:

GRAPHICAL VIEW OF ENTROPY, JOINT ENTROPY, AND CONDITIONAL ENTROPY

[Figure: Venn diagram of H(X) and H(Y) inside H(X,Y), with H(X|Y) and H(Y|X) as the non-overlapping parts]

H(X,Y) = H(X) + H(Y|X) = H(Y) + H(X|Y)

Page 44:

MUTUAL INFORMATION

The mutual information is the average amount of information that you get about X from observing the value of Y:

I(X;Y) = H(X) - H(X|Y) = H(X) + H(Y) - H(X,Y)

[Figure: Venn diagram with I(X;Y) as the overlap of H(X) and H(Y) inside H(X,Y)]

Page 45:

MUTUAL INFORMATION

The mutual information is symmetrical:

I(X;Y) = I(Y;X)

Proof:

Page 46:

MUTUAL INFORMATION EXAMPLE

If you try to guess Y, you have a 50% chance of being correct.

However, what if you know X? Best guess: choose Y = X. What is the overall probability of guessing correctly?

p(X,Y)   Y = 0   Y = 1
X = 0    1/2     1/4
X = 1    0       1/4

H(X) = 0.811, H(Y) = 1, H(X,Y) = 1.5
H(X|Y) = 0.5, H(Y|X) = 0.689, I(X;Y) = 0.311
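A small Python check of these numbers (a sketch, not part of the slides):

    import math

    # Joint distribution p(x, y) from the table above.
    p = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.0, (1, 1): 0.25}

    def H(dist):
        """Entropy in bits of a dict of probabilities."""
        return -sum(q * math.log2(q) for q in dist.values() if q > 0)

    px = {x: p[(x, 0)] + p[(x, 1)] for x in (0, 1)}  # marginal of X
    py = {y: p[(0, y)] + p[(1, y)] for y in (0, 1)}  # marginal of Y

    print(H(px), H(py), H(p))    # 0.811..., 1.0, 1.5
    print(H(px) + H(py) - H(p))  # I(X;Y) = 0.311...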

Page 47:

CONDITIONAL MUTUAL INFORMATION

I(X;Y|Z) = H(X|Z) - H(X|Y,Z) = H(X|Z) + H(Y|Z) - H(X,Y|Z)

Note: the conditioning on Z applies to both X and Y.

Page 48:

CHAIN RULE FOR MUTUAL INFORMATION

I(X1, X2, ..., Xn; Y) = Σ_{i=1}^{n} I(Xi; Y | Xi-1, ..., X2, X1)

Proof:

Page 49:

EXAMPLE OF USING CHAIN RULE FOR MUTUAL INFORMATION

[Figure: channel with Y = X + Z]

Find I(X,Z;Y).

p(X,Z)   Z = 0   Z = 1
X = 0    1/4     1/4
X = 1    1/4     1/4

Page 50:

CONCAVE AND CONVEX FUNCTIONS

f(x) is strictly convex over (a,b) if

f(λu + (1-λ)v) < λ f(u) + (1-λ) f(v),  ∀u ≠ v ∈ (a,b), 0 < λ < 1

Examples:

Technique to determine the convexity of a function:

Page 51:

JENSEN’S INEQUALITY

f(X) convex → E[f(X)] ≥ f(E[X])

f(X) strictly convex → E[f(X)] > f(E[X]) (unless X is constant)

Proof:

Page 52:

JENSEN’S INEQUALITY EXAMPLE

Mnemonic example: f(x) = x², so E[X²] ≥ (E[X])².

Page 53:

RELATIVE ENTROPY

Relative entropy, or Kullback-Leibler divergence, between two probability mass vectors (functions) p and q is defined as:

D(p||q) = Σ_{x∈A} p(x) log(p(x)/q(x)) = E_p[log(p(X)/q(X))] = -E_p[log q(X)] - H(X)

Properties of D(p||q):

Page 54:

RELATIVE ENTROPY EXAMPLE

                              Rain   Cloudy   Sunny
Weather at Seattle:    p(x)   1/4    1/2      1/4
Weather at Corvallis:  q(x)   1/3    1/3      1/3

Compute D(p||q) and D(q||p).
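Carrying out the computation (standard arithmetic, base-2 logarithms):

D(p||q) = 1/4·log2(3/4) + 1/2·log2(3/2) + 1/4·log2(3/4) = log2(3) - 3/2 ≈ 0.085 bits
D(q||p) = 2/3·log2(4/3) + 1/3·log2(2/3) = 5/3 - log2(3) ≈ 0.082 bits

Note that D(p||q) ≠ D(q||p): relative entropy is not symmetric.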

Page 55:

INFORMATION INEQUALITY

D(p||q) ≥ 0

Proof:
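A one-line sketch of the standard argument (via Jensen's inequality from page 51, using the concavity of log):

-D(p||q) = Σ_x p(x) log(q(x)/p(x)) = E_p[log(q(X)/p(X))] ≤ log E_p[q(X)/p(X)] = log Σ_x q(x) ≤ log(1) = 0,

hence D(p||q) ≥ 0, with equality iff p = q.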

Page 56:

INFORMATION INEQUALITY

The uniform distribution has the highest entropy:

H(X) ≤ log|A|

Proof:

Mutual information is non-negative:

I(X;Y) ≥ 0

Proof:

Page 57:

INFORMATION INEQUALITY

Conditioning reduces entropy:

H(X|Y) ≤ H(X)

Proof:

Independence bound:

H(X1, X2, ..., Xn) ≤ Σ_{i=1}^{n} H(Xi)

Proof:

Page 58:

INFORMATION INEQUALITY

Conditional independence bound:

H(X1, ..., Xn | Y1, ..., Yn) ≤ Σ_{i=1}^{n} H(Xi|Yi)

Proof:

Mutual information independence bound:

I(X1, ..., Xn; Y1, ..., Yn) ≥ Σ_{i=1}^{n} I(Xi; Yi)

Proof:

Page 59:

SUMMARY

Page 60:

LECTURE 2: SYMBOL CODES

Symbol codes
Non-singular codes
Uniquely decodable codes
Prefix codes (instantaneous codes)
Kraft inequality
Minimum code length

Page 61:

SYMBOL CODES

Mapping of letters from one alphabet to another

ASCII codes: computers do not process English language directly. Instead, English letters are first mapped into bits, then processed by the computer.

Morse codes

Decimal to binary conversion

The process of mapping from letters in one alphabet to letters in another is called coding.

Page 62:

SYMBOL CODES

A symbol code is a simple mapping. Formally, a symbol code C is a mapping X → D+

X = the source alphabet
D = the destination alphabet
D+ = the set of all finite strings from D

Example:

{X,Y,Z} → {0,1}+, C(X) = 0, C(Y) = 10, C(Z) = 1

Codeword: the result of a mapping, e.g., 0, 10, ...
Code: the set of valid codewords.

Page 63:

SYMBOL CODES

Extension: C+ is a mapping X+ → D+ formed by concatenating the C(xi) without punctuation.

Example: C+(XXYXZY) = 00100110

Non-singular: x1 ≠ x2 → C(x1) ≠ C(x2)

Uniquely decodable: C+ is non-singular, i.e., C+(x+) is unambiguous.

Page 64:

PREFIX CODES

Instantaneous or prefix code: no codeword is a prefix of another.

Prefix → uniquely decodable → non-singular

Examples:

     C1   C2   C3    C4
W    0    00   0     0
X    11   01   10    01
Y    00   10   110   011
Z    01   11   1110  0111

Page 65:

CODE TREE

Form a D-ary tree where D = |D|:
- D branches at each node
- Each branch denotes a symbol di in D
- Each leaf denotes a source symbol xi in X
- C(xi) = concatenation of the symbols di along the path from the root to the leaf that represents xi
- Some leaves may be unused

[Figure: binary code tree with leaves W, X, Y, Z]

C(WWYXZ) = ?

Page 66:

KRAFT INEQUALITY FOR PREFIX CODES

For any prefix code, we have:

Σ_{i=1}^{|X|} D^{-li} ≤ 1

where li is the length of codeword i and D = |D|.

Proof:

Page 67:

KRAFT INEQUALITY FOR UNIQUELY DECODABLE CODES

For any uniquely decodable code with codewords of lengths l1, l2, ..., l|X|:

Σ_{i=1}^{|X|} D^{-li} ≤ 1

Proof:

Page 68:

CONVERSE OF KRAFT INEQUALITY

If Σ_{i=1}^{|X|} D^{-li} ≤ 1, then there exists a prefix code with code lengths l1, l2, ..., l|X|.

Proof:

Page 69:

EXAMPLE OF CONVERSE OF KRAFT INEQUALITY

Page 70:

OPTIMAL CODE

If l(C(x)) = the length of C(x), then C is optimal if LC = E[l(C(X))] is as small as possible for a given p(X).

C is a uniquely decodable code → LC ≥ H(X)/log2(D)

Proof:

Page 71:

SUMMARY

Symbol code

Kraft inequality for uniquely decodable code

Lower bound for any uniquely decodable code

Page 72:

LECTURE 3: PRACTICAL SYMBOL CODES

Fano code
Shannon code
Huffman code

Page 73:

FANO CODE

Put the probabilities in decreasing order. Split as close to 50-50 as possible. Repeat with each half.

H(p) = 2.81 bits, L_F = 2.89 bits, L_opt = 2.85 bits

Not optimal!

Page 74:

CONDITIONS FOR OPTIMAL PREFIX CODE (ASSUMING BINARY PREFIX CODE)

An optimal prefix code must satisfy:

p(xi) > p(xj) → l(xi) ≤ l(xj) (else swap them)

The two longest codewords must have the same length (else chop a bit off the longer codeword).

In the tree corresponding to the optimal code, there must be two branches stemming from each intermediate node.

Page 75:

HUFFMAN CODE CONSTRUCTION

1. Take the two smallest p(xi) and assign each a different last bit. Then merge into a single symbol.

2. Repeat step 1 until only one symbol remains
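A compact Python sketch of this procedure using a heap (an illustration, not the course's reference implementation):

    import heapq

    def huffman_lengths(pmf):
        """Return Huffman codeword lengths for a list of probabilities.

        Repeatedly merge the two smallest nodes; each merge adds one
        bit to every symbol inside the merged nodes.
        """
        heap = [(p, [i]) for i, p in enumerate(pmf)]
        heapq.heapify(heap)
        lengths = [0] * len(pmf)
        while len(heap) > 1:
            p1, s1 = heapq.heappop(heap)
            p2, s2 = heapq.heappop(heap)
            for i in s1 + s2:          # one more bit for all merged symbols
                lengths[i] += 1
            heapq.heappush(heap, (p1 + p2, s1 + s2))
        return lengths

    print(huffman_lengths([0.4, 0.2, 0.2, 0.1, 0.1]))
    # [2, 2, 2, 3, 3]: expected length 2.2 bits (ties can also give [1, 2, 3, 4, 4], same 2.2 bits)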

Page 76:

HUFFMAN CODE EXAMPLE

Page 77:

HUFFMAN CODE EXAMPLE (CONT)

Letter   Probability   Codeword
x1       0.4
x2       0.2
x3       0.2
x4       0.1
x5       0.1

Page 78:

HUFFMAN CODE IS AN OPTIMAL PREFIX CODE

We want to show that all of these codes (including C5) are optimal.

Page 79:

HUFFMAN OPTIMALITY PROOF

Page 80:

HOW SHORT ARE THE OPTIMAL CODES?

If l(x) = length(C(x)), then C is optimal if LC = E[l(X)] is as small as possible.

We want to minimize

Σ_i p(xi) l(xi)

subject to:

Σ_i D^{-l(xi)} ≤ 1

and the l(xi) being integers.

Page 81:

OPTIMAL CODES (NON-INTEGER LENGTHS)

Minimize

Σ_i p(xi) l(xi)

subject to:

Σ_i D^{-l(xi)} ≤ 1

Use the Lagrange multiplier method:

Page 82:

SHANNON CODE

Round up the optimal code lengths:

li = ⌈-log_D p(xi)⌉

These li satisfy the Kraft inequality → a prefix code exists. The construction of such a code was done earlier.

Average length:

H(X)/log(D) ≤ L_S < H(X)/log(D) + 1

Proof:
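A small sketch computing Shannon code lengths for a binary code alphabet (D = 2); the pmf is an illustrative example:

    import math

    def shannon_lengths(pmf):
        """Shannon code lengths: l_i = ceil(-log2 p_i)."""
        return [math.ceil(-math.log2(p)) for p in pmf]

    pmf = [0.5, 0.25, 0.125, 0.125]
    ls = shannon_lengths(pmf)
    print(ls)                                   # [1, 2, 3, 3]
    print(sum(2 ** -l for l in ls) <= 1)        # Kraft inequality holds: True
    print(sum(p * l for p, l in zip(pmf, ls)))  # average length 1.75 = H(X) here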

Page 83:

SHANNON CODE EXAMPLES

         x1    x2    x3    x4
p(xi)    1/2   1/4   1/8   1/8

         x1     x2
p(xi)    0.99   0.01

Page 84:

HUFFMAN CODE VS. SHANNON CODE

                         x1     x2     x3     x4
p(xi)                    0.36   0.34   0.25   0.05
Shannon ⌈-log p(xi)⌉     2      2      2      5
Huffman                  1      2      3      3

LS = 2.15 bits, LH = 1.94 bits, H(X) = 1.78 bits

Page 85:

SHANNON COMPETITIVE OPTIMALITY

If l(X) is the length of any uniquely decodable code and lS(X) is the length of the Shannon code, then

P( l(X) < lS(X) - c ) ≤ 2^(1-c)

Proof:

Page 86:

DYADIC COMPETITIVE OPTIMALITY

If p(x) is dyadic (↔ log p(xi) is an integer for all i), then

P( l(X) < lS(X) ) ≤ P( l(X) > lS(X) )

with equality iff l(X) ≡ lS(X).

Proof:

Page 87:

SHANNON CODE WITH THE WRONG DISTRIBUTION

If the real distribution of X is p(x) but you assign Shannon lengths using the distribution q(x), what is the penalty?

Answer: D(p||q)

Proof:

Page 88:

SUMMARY

Page 89:

LECTURE 4

Stochastic Processes
Entropy Rate
Markov Processes
Hidden Markov Processes

Page 90:

STOCHASTIC PROCESS

Stochastic process {Xi} = X1, X2, X3, ...

Define the entropy of {Xi} as
H({Xi}) = H(X1, X2, X3, ...) = H(X1) + H(X2|X1) + H(X3|X2,X1) + ...

Define the entropy rate of {Xi} as

H̄(X) = lim_{n→∞} H(X1, X2, ..., Xn) / n

The entropy rate estimates the additional entropy per new sample, and gives a lower bound on the number of code bits per sample. If the Xi are not i.i.d., the entropy-rate limit may not exist.

Page 91:

EXAMPLES

Page 92:

LIMIT OF CESARO MEAN

If a_k → b as k → ∞, then

(1/n) Σ_{k=1}^{n} a_k → b as n → ∞

Proof:

Page 93:

STATIONARY PROCESS

A stochastic process {Xi} is stationary iff

P(X1 = a1, X2 = a2, ..., Xn = an) = P(X1+k = a1, X2+k = a2, ..., Xn+k = an), ∀n, k

If {Xi} is stationary, then H̄(X) exists.

Proof:

Page 94:

BLOCK CODING IS MORE EFFICIENT THAN SINGLE-SYMBOL CODING

Page 95:

EXAMPLE OF BLOCK CODE

Page 96:

MARKOV PROCESS

A discrete-valued stochastic process {Xi} is Markov iff

P(Xn | Xn-1, ..., X1, X0) = P(Xn | Xn-1)

It is time-independent (homogeneous) if

P(Xn = a | Xn-1 = b) = P_ab

Page 97:

STATIONARY MARKOV PROCESS

A finite-state Markov process can be described using a Markov chain diagram.

[Figure: two-state Markov chain on states a and b with transition probabilities 0.1 and 0.9]

Any finite, aperiodic, irreducible Markov chain will converge to a stationary distribution regardless of the starting distribution.

Page 98:

STATIONARY MARKOV PROCESS

As n → ∞, P^n approaches a matrix in which each row is the stationary distribution.

Page 99:

STATIONARY MARKOV PROCESS EXAMPLE

The stationary distribution p_S = l satisfies l^T = l^T P.
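A sketch of this computation for a two-state chain like the one on page 97, in plain Python. The transition probabilities (0.9 on the diagonal) are one plausible reading of the diagram, and the entropy-rate formula H̄ = Σ_i π_i H(row i of P) for a stationary Markov chain is a standard result:

    import math

    def H2(p):
        """Binary entropy in bits."""
        return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    # Transition matrix: P[i][j] = P(next = j | current = i); assumed values.
    P = [[0.9, 0.1],
         [0.1, 0.9]]

    # Solve pi = pi P by power iteration.
    pi = [0.5, 0.5]
    for _ in range(1000):
        pi = [sum(pi[i] * P[i][j] for i in range(2)) for j in range(2)]

    rate = sum(pi[i] * H2(P[i][1]) for i in range(2))
    print(pi)    # [0.5, 0.5] by symmetry
    print(rate)  # H(0.1) ≈ 0.469 bits per symbol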

Page 100:

CHESS BOARD

Page 101:

EXAMPLE OF CODING FOR A MARKOV SOURCE

[Figure: two-state Markov chain on states a and b with transition probabilities 0.1 and 0.9]

Find H̄(X):
a) Assuming the Markov model above, with a block code of n = 2.
b) Assuming i.i.d. and a symbol code.
c) Assuming i.i.d. and a block code with n = 2.
d) Assuming i.i.d. and a block code with n = 3.

Which of the above is minimum? Do you think that for n = 1000, a block code assuming i.i.d. would be lower or higher than the answer in a)? Provide a reason for your answer.

Page 102:

ALOHA WIRELESS EXAMPLE

M users share a wireless transmission channel. For each user, independently in each time slot:
- If the queue is non-empty, transmit with probability q.
- A new packet arrives for transmission with probability p.

If two packets collide, they stay in the queue. At time t, the queue sizes are Xt = (n1, ..., nM).

{Xt} is Markov, since P(Xt|Xt-1) depends only on Xt-1.

Transmit vector Yt: {Yt} is not Markov, since P(Yt) depends on Xt-1, not on Yt-1.

{Yt} is called a Hidden Markov Process.

Page 103:

ALOHA WIRELESS EXAMPLE

Page 104:

HIDDEN MARKOV PROCESS

If Xi is a Markov process and Y = f(X), then Yi is a stationary Hidden Markov Process.

What is the entropy rate H̄(Y)?

Page 105:

HIDDEN MARKOV PROCESS

Page 106:

HIDDEN MARKOV PROCESS

Page 107:

SUMMARY

Entropy rate
Stationary processes
Stationary Markov processes
Hidden Markov processes

Page 108:

LECTURE 5

Stream codes
Arithmetic code
Lempel-Ziv code

Page 109:

WHY NOT USE HUFFMAN CODE? (SINCE IT IS OPTIMAL)

Good:
- It is optimal, i.e., the shortest possible code.

Bad:
- Bad for skewed probabilities. Employing block coding helps, i.e., coding all N symbols in a block; the redundancy is then 1/N. However:
- The entire table of block symbols must be re-computed if the probability of a single symbol changes.
- For N symbols, a table of |X|^N entries must be pre-calculated to build the tree.
- Symbol decoding must wait until an entire block is received.

Page 110:

ARITHMETIC CODE

Developed at IBM (too bad they didn't patent it). The majority of current compression schemes use it.

Good:
- No need for pre-calculated tables of large size.
- Easy to adapt to changes in probability.
- A single symbol can be decoded immediately, without waiting for the entire block of symbols to be received.

Page 111:

BASIC IDEA IN ARITHMETIC CODING

Sort the symbols in lexical order.

Represent each string x of length n by a unique interval [F(xi-1), F(xi)) in [0,1), where F(xi) is the cumulative distribution.

F(xi) - F(xi-1) represents the probability of xi occurring.

The interval [F(xi-1), F(xi)) can itself be represented by any number, called a tag, within the half-open interval.

The significant bits of the tag 0.t1t2t3... form the code of x. That is, 0.t1t2t3...tk000... is in the interval [F(xi-1), F(xi)), with code length

l(x) = ⌈log2(1/P(x))⌉ + 1

Page 112:

TAG IN ARITHMETIC CODE

Choose the tag

T̄_X(xi) = Σ_{k=1}^{i-1} P(X = xk) + P(X = xi)/2 = F_X(xi-1) + (F_X(xi) - F_X(xi-1))/2

Page 113:

ARITHMETIC CODING EXAMPLE

Page 114:

ARITHMETIC CODING EXAMPLE

Page 115:

EXAMPLE OF ARITHMETIC CODING

Symbol x             1      2      3       4
P(x)                 0.5    0.25   0.125   0.125
F(x)                 .5     .75    .875    1.0
T_X(x)               .25    .625   .8125   .9375
T_X(x) in binary     .010   .101   .1101   .1111
⌈log2(1/P(x))⌉ + 1   2      3      4       4
Code                 01     101    1101    1111
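A short sketch reproducing this table (tag formula from page 112; the code is the tag's binary expansion truncated to ⌈log2(1/P(x))⌉ + 1 bits):

    import math

    def arithmetic_codes(pmf):
        """Yield (tag, length, codeword) for each symbol of a pmf."""
        F = 0.0                              # cumulative probability so far
        for p in pmf:
            tag = F + p / 2                  # midpoint tag of [F, F + p)
            length = math.ceil(math.log2(1 / p)) + 1
            code, t = "", tag
            for _ in range(length):          # truncated binary expansion
                t *= 2
                bit = int(t)
                t -= bit
                code += str(bit)
            yield tag, length, code
            F += p

    for row in arithmetic_codes([0.5, 0.25, 0.125, 0.125]):
        print(row)
    # (0.25, 2, '01'), (0.625, 3, '101'), (0.8125, 4, '1101'), (0.9375, 4, '1111')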

Page 116:

ARITHMETIC CODE FOR TWO-SYMBOL SEQUENCES

Message   P(x)      T_X(x)     T_X(x) in binary   ⌈log2(1/P(x))⌉+1   Code
11        .25       .125       .001               3                  001
12        .125      .3125      .0101              4                  0101
13        .0625     .40625     .01101             5                  01101
14        .0625     .46875     .01111             5                  01111
21        .125      .5625      .1001              4                  1001
22        .0625     .65625     .10101             5                  10101
23        .03125    .703125    .101101            6                  101101
24        .03125    .734375    .101111            6                  101111
31        .0625     .78125     .11001             5                  11001
32        .03125    .828125    .110101            6                  110101
33        .015625   .8515625   .1101101           7                  1101101
34        .015625   .8671875   .1101111           7                  1101111
41        .0625     .90625     .11101             5                  11101
42        .03125    .953125    .111101            6                  111101
43        .015625   .9765625   .1111101           7                  1111101
44        .015625   .9921875   .1111111           7                  1111111

Page 117:

PROGRESSIVE DECODING

[Figure: decoding proceeds progressively along the cumulative distribution F(x)]

Page 118:

UNIQUENESS OF THE ARITHMETIC CODE

Idea for proving that the arithmetic code is uniquely decodable:
- Show that the truncation of the tag lies entirely within [F(xi-1), F(xi)), hence the code is non-singular.
- Show that the truncation of the tag results in a prefix code, hence uniquely decodable.

Proof:

Page 119:

UNIQUENESS OF THE ARITHMETIC CODE

Page 120:

UNIQUENESS OF THE ARITHMETIC CODE

Page 121:

EFFICIENCY OF THE ARITHMETIC CODE

Off by at most 2/N:

H(X1, X2, ..., XN)/N ≤ l_A < H(X1, X2, ..., XN)/N + 2/N

Proof:

Page 122:

EFFICIENCY OF THE ARITHMETIC CODE

Page 123:

ADAPTIVE ARITHMETIC CODE

Page 124:

DICTIONARY CODING

index   pattern
1       a
2       b
3       ab
...
n       abc

The encoder codes the index.

[Figure: Encoder → indices → Decoder]

Both encoder and decoder are assumed to have the same dictionary (table).

Page 125:

ZIV-LEMPEL CODING (ZL OR LZ)

Named after J. Ziv and A. Lempel (1977).

An adaptive dictionary technique:
- Store previously coded symbols in a buffer.
- Search the buffer for the current sequence of symbols to code.
- If found, transmit the buffer offset and length.

Page 126:

LZ77

[Figure: sliding window over the sequence a b c a b d a c a b d e e e f, split into a search buffer and a look-ahead buffer]

Output triplet <offset, length, next>, transmitted to the decoder.

If the size of the search buffer is N and the size of the alphabet is M, we need about

⌈log2(N)⌉ + ⌈log2(N)⌉ + ⌈log2(M)⌉

bits to code a triplet.

Variation: use a VLC to code the triplets! (PKZip, Zip, Lharc, PNG, gzip, ARJ)
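A toy encoder sketch of this scheme (illustrative only; window handling and tie-breaking are simplified, and the final symbol is always emitted as a literal):

    def lz77_encode(data, window=8):
        """Encode a string as <offset, length, next> triplets."""
        i, out = 0, []
        while i < len(data):
            best_off, best_len = 0, 0
            for off in range(1, min(i, window) + 1):   # candidate offsets
                length = 0
                while (i + length < len(data) - 1 and
                       data[i - off + (length % off)] == data[i + length]):
                    length += 1
                if length > best_len:
                    best_off, best_len = off, length
            out.append((best_off, best_len, data[i + best_len]))
            i += best_len + 1
        return out

    print(lz77_encode("abcabdacabdeeef"))
    # [(0, 0, 'a'), (0, 0, 'b'), (0, 0, 'c'), (3, 2, 'd'),
    #  (3, 1, 'c'), (5, 3, 'e'), (1, 2, 'f')]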

Page 127:

DRAWBACKS OF LZ77

Repetitive patterns with a period longer than the search buffer size are not found. If the search buffer size is 4, the sequence

a b c d e a b c d e a b c d e a b c d e ...

will be expanded, not compressed.

Page 128:

SUMMARY

Stream codes: Arithmetic, Lempel-Ziv
- Encoder and decoder operate sequentially; no blocking of input symbols is required.
- Not forced to send ≥ 1 bit per input symbol; can achieve the entropy rate even when H(X) < 1.
- Require a perfect channel: a single transmission error causes multiple wrong output symbols. Use finite-length blocks to limit the damage.

Page 129:

LECTURE 6

Markov chains
Data Processing Theorem: you can't create information from nothing (no free lunch)
Fano's inequality: a lower bound on the error in estimating X from Y

Page 130:

MARKOV CHAIN

Given 3 random variables X, Y, Z, they form a Markov chain, denoted X → Y → Z, if

p(x,y,z) = p(x) p(y|x) p(z|y) ⇔ p(z|x,y) = p(z|y)

A Markov chain X → Y → Z means that:
- The only way that X affects Z is through the value of Y.
- I(X;Z|Y) = 0 ⇔ H(Z|Y) = H(Z|X,Y)
- If you already know Y, then observing X gives you no additional information about Z.
- If you know Y, then observing Z gives you no additional information about X.

A common special case of a Markov chain is Z = f(Y).

Page 131:

MARKOV CHAIN SYMMETRY

X → Y → Z ↔ X and Z are conditionally independent given Y:

p(x,z|y) = p(x|y) p(z|y)

X → Y → Z ↔ Z → Y → X

Page 132:

DATA PROCESSING THEOREM

If X → Y → Z, then I(X;Y) ≥ I(X;Z): processing Y cannot add new information about X.

If X → Y → Z, then I(X;Y) ≥ I(X;Y|Z): knowing Z can only decrease the amount X tells you about Y.

Proof:

Page 133:

NON-MARKOV: CONDITIONING INCREASES MUTUAL INFORMATION

Noisy channel: Z = X + Y

Page 134:

LONG MARKOV CHAIN

If X1 → X2 → X3 → ... → Xn, then the mutual information increases as the variables get closer:

I(X1;X2) ≥ I(X1;X3) ≥ I(X1;X4) ≥ ... ≥ I(X1;Xn)

Page 135:

SUFFICIENT STATISTICS

If the pdf of X depends on a parameter θ and you extract a statistic T(X) from your observation, then

θ → X → T(X) ⇒ I(θ; T(X)) ≤ I(θ; X)

T(X) is sufficient for θ if the stronger condition holds:

θ → T(X) → X ⇔ I(θ; T(X)) = I(θ; X) ⇔ p(x | T(x), θ) = p(x | T(x))

Page 136:

EXAMPLES OF SUFFICIENT STATISTICS

Page 137:

FANO’S INEQUALITY

If we estimate X from Y, what is P(X̂ ≠ X)?

P_e ≡ P(X̂ ≠ X) ≥ (H(X|Y) - H(P_e)) / log(N-1) ≥ (H(X|Y) - 1) / log(N-1)

where N is the size of the outcome set of X.

Proof:

Page 138:

FANO’S INEQUALITY EXAMPLE

Page 139:

SUMMARY

Markov

Data Processing Theorem

Fano’s Inequality

Page 140:

LECTURE 7

Weak Law of Large Numbers
The Typical Set
Asymptotic Equipartition Principle

Page 141:

STRONG AND WEAK TYPICALITY

       A     B     C
P(X)   0.5   0.25  0.25

Strongly typical sequence: ABAACABCAABC (letters appear in the correct proportions)
H(X) = -(0.5 log 0.5 + 0.5 log 0.25) = 1.5 bits

Weakly typical sequence: BBBBBBBBBBAA (incorrect proportions)

Page 142:

CONVERGENCE OF RANDOM NUMBERS

Almost sure convergence:

Pr( lim_{n→∞} Xn = X ) = 1

Example:

Page 143:

CONVERGENCE OF RANDOM NUMBERS

Convergence in probability:

lim_{n→∞} Pr( |Xn - X| ≥ ε ) = 0,  ∀ε > 0

Example:

Page 144:

WEAK LAW OF LARGE NUMBERS

Given i.i.d. {Xi}, let

s_n = (1/n) Σ_{i=1}^{n} Xi

Then

E[s_n] = E[X] = μ,   Var(s_n) = (1/n) Var(X) = σ²/n

As n increases, the variance of s_n decreases, i.e., the values of s_n become clustered around the mean.

WLLN: s_n → μ in probability, i.e., P(|s_n - μ| ≥ ε) → 0 as n → ∞

Page 145:

PROOF OF WEAK LAW OF LARGE NUMBERS

Chebyshev’s Inequality
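The statement being invoked (standard form, filled in here for completeness): for any random variable Y with mean μ and variance σ², and any ε > 0,

P(|Y - μ| ≥ ε) ≤ σ²/ε²

Applying it to Y = s_n, whose variance is σ²/n, gives P(|s_n - μ| ≥ ε) ≤ σ²/(nε²) → 0 as n → ∞.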

WLLN

Page 146:

TYPICAL SET

x^n is an i.i.d. sequence of {Xi}, i = 1 to n.

T_n^ε = { x^n ∈ X^n : | -n⁻¹ log p(x^n) - H(X) | < ε }
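A sketch checking membership in T_n^ε for i.i.d. Bernoulli(p) sequences (illustrative parameter values):

    import math

    def is_typical(seq, p, eps):
        """Is a binary sequence eps-typical for i.i.d. Bernoulli(p)?"""
        n, k = len(seq), sum(seq)                   # k = number of ones
        logp = k * math.log2(p) + (n - k) * math.log2(1 - p)
        H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)
        return abs(-logp / n - H) < eps

    print(is_typical([1, 0, 0, 1, 0, 0, 0, 0, 1, 0], p=0.3, eps=0.1))  # True
    print(is_typical([1] * 10, p=0.3, eps=0.1))                        # False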

Page 147:

EXAMPLE OF TYPICAL SET

Bernoulli with p

Page 148:

TYPICAL SETS

Page 149:

TYPICAL SET PROPERTIES

Page 150:

ASYMPTOTIC EQUIPARTITION PRINCIPLE

For any ε, and any n > N_ε, almost any event is almost equally surprising.

Page 151:

SOURCE CODING AND DATA COMPRESSION

Page 152:

What N_ε ensures that P(x^n ∈ T_n^ε) > 1 - ε?

Page 153:

SMALLEST HIGH PROBABILITY SET
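Conversely, any set $B^n$ with $\Pr(B^n) \ge 1 - \epsilon$ must contain on the order of $2^{nH(X)}$ sequences; to first order in the exponent, the typical set is the smallest high-probability set.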


SUMMARY


LECTURE 8
- Source and Channel Coding
- Discrete Memoryless Channels
  - Binary Symmetric Channel
  - Binary Erasure Channel
  - Asymmetric Channel
- Channel Capacity

SOURCE AND CHANNEL CODING

Source Coding

Channel Coding

DISCRETE MEMORYLESS CHANNEL

Discrete input alphabet $\mathcal{X}$, discrete output alphabet $\mathcal{Y}$

Channel matrix $Q$, with entries $Q_{xy} = p(y \mid x)$

Memoryless: $p(y^n \mid x^n) = \prod_{i=1}^{n} p(y_i \mid x_i)$

BINARY CHANNELS

Binary symmetric channel: each input bit is flipped with probability $p$

Binary erasure channel: each input bit is erased (output $e$) with probability $\alpha$

Z channel: one input is received noiselessly; the other is flipped with probability $p$

WEAKLY SYMMETRIC CHANNELS

1. All columns of Q have the same sum

2. Rows of Q are permutations of each other
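A standard example (Cover & Thomas give essentially this matrix) of a channel that is weakly symmetric but not symmetric:

$Q = \begin{pmatrix} 1/3 & 1/6 & 1/2 \\ 1/3 & 1/2 & 1/6 \end{pmatrix}$

The rows are permutations of each other and every column sums to $2/3$, but the columns are not permutations of each other.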

SYMMETRIC CHANNEL

1. Rows of Q are permutations of each other

2. Columns of Q are permutations of each other


CHANNEL CAPACITY

Capacity of a discrete memoryless channel:

$C = \max_{p(x)} I(X;Y)$

Capacity of $n$ uses of the channel:

$C_n = \frac{1}{n} \max_{p(x^n)} I(X_1, X_2, X_3, \ldots, X_n;\, Y_1, Y_2, Y_3, \ldots, Y_n)$
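As a sanity check, the first definition can be evaluated numerically. The sketch below (a BSC with assumed crossover probability p = 0.1) compares a brute-force grid search over input distributions against the closed form $C = 1 - H(p)$:

```python
import math

def H2(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mutual_info(px, Q):
    """I(X;Y) in bits for input pmf px and channel matrix Q[x][y] = p(y|x)."""
    ny = len(Q[0])
    py = [sum(px[x] * Q[x][y] for x in range(len(px))) for y in range(ny)]
    return sum(
        px[x] * Q[x][y] * math.log2(Q[x][y] / py[y])
        for x in range(len(px))
        for y in range(ny)
        if px[x] > 0 and Q[x][y] > 0
    )

p = 0.1
Q = [[1 - p, p], [p, 1 - p]]  # binary symmetric channel
C_search = max(mutual_info([a, 1 - a], Q) for a in (i / 1000 for i in range(1001)))
print(C_search, 1 - H2(p))  # both ≈ 0.531 bits per channel use
```

The maximum is attained at the uniform input, as the channel's symmetry suggests.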

MUTUAL INFORMATION


MUTUAL INFORMATION IS CONCAVE IN $p(x)$

Mutual information is concave in $p(x)$ for a fixed $p(y|x)$

Proof:

MUTUAL INFORMATION IS CONVEX IN $p(y|x)$

Mutual information is convex in $p(y|x)$ for a fixed $p(x)$

Proof:

N USE CHANNEL CAPACITY


CAPACITY OF WEAKLY SYMMETRIC CHANNEL
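For a weakly symmetric channel the maximization can be carried out in closed form: $C = \log|\mathcal{Y}| - H(\mathbf{r})$, where $H(\mathbf{r})$ is the entropy of any row of $Q$, and the uniform input distribution achieves it.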


BINARY ERASURE CHANNEL
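For the BEC with erasure probability $\alpha$, the capacity is $C = 1 - \alpha$ bits per use: a fraction $\alpha$ of the inputs is lost, and coding can recover everything carried by the remaining fraction $1 - \alpha$.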


ASYMMETRIC CHANNEL CAPACITY
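For channels with no special symmetry there is generally no closed form; the capacity must be found by maximizing $I(X;Y)$ over $p(x)$ numerically, e.g., with the Blahut-Arimoto algorithm or a direct search like the sketch above.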


SUMMARY


LECTURE 9
- Jointly Typical Sets
- Joint AEP
- Channel Coding Theorem

SIGNIFICANCE OF MUTUAL INFORMATION


JOINTLY TYPICAL SETS

$J_\epsilon^n = \Big\{ (x^n, y^n) \in \mathcal{X}^n \times \mathcal{Y}^n : \left| -\tfrac{1}{n}\log p(x^n) - H(X) \right| < \epsilon,\ \left| -\tfrac{1}{n}\log p(y^n) - H(Y) \right| < \epsilon,\ \left| -\tfrac{1}{n}\log p(x^n, y^n) - H(X,Y) \right| < \epsilon \Big\}$

JOINTLY TYPICAL EXAMPLE

p(x,y)   y=0     y=1
x=0      0.5     0.125
x=1      0.125   0.25
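Working through the numbers: the marginals are $p(x{=}0) = p(y{=}0) = 0.625$, so $H(X) = H(Y) \approx 0.954$ bits, while $H(X,Y) = -(0.5\log 0.5 + 2 \cdot 0.125\log 0.125 + 0.25\log 0.25) = 1.75$ bits, giving $I(X;Y) = H(X) + H(Y) - H(X,Y) \approx 0.16$ bits.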


JOINTLY TYPICAL DIAGRAM


JOINTLY TYPICAL SET PROPERTIES


JOINT AEP

If $p(x'^n) = p(x^n)$, $p(y'^n) = p(y^n)$, and $X'^n$, $Y'^n$ are independent, then

$(1-\epsilon)\, 2^{-n(I(X;Y)+3\epsilon)} \;\le\; \Pr\!\left( (x'^n, y'^n) \in J_\epsilon^n \right) \;\le\; 2^{-n(I(X;Y)-3\epsilon)}$
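The upper bound is what makes random coding work: a wrong codeword, drawn independently of $y^n$, lands in the jointly typical set with probability at most $2^{-n(I(X;Y)-3\epsilon)}$, so roughly $2^{nI(X;Y)}$ codewords can coexist before spurious jointly typical pairs become likely.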

CHANNEL CODE

$w \in \{1, 2, \ldots, M\} \;\to\; \text{Encoder} \;\to\; x^n \;\to\; \text{Noisy channel} \;\to\; y^n \;\to\; \text{Decoder} \;\to\; \hat{w} = g(y^n),\ \hat{w} \in \{1, 2, \ldots, M\}$

ACHIEVABLE CODE RATES
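Following the textbook's conventions: the rate of an $(M, n)$ code is $R = \frac{\log_2 M}{n}$ bits per channel use, and $R$ is achievable if there exists a sequence of $(2^{nR}, n)$ codes whose maximal probability of error tends to $0$ as $n \to \infty$.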


CHANNEL CODING THEOREM
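Channel coding theorem: for a discrete memoryless channel, every rate $R < C$ is achievable; conversely, any sequence of codes with error probability tending to $0$ must have $R \le C$.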


CHANNEL CODING THEOREM


SUMMARY


LECTURE 10
- Channel Coding Theorem

CHANNEL CODING: MAIN IDEA

$x^n \;\to\; \text{Noisy channel} \;\to\; y^n$

RANDOM CODE
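Random code construction: generate $M = 2^{nR}$ codewords, each drawn i.i.d. from the input distribution $p(x)$ that achieves capacity; the error probability is then analyzed averaged over both the channel noise and the random choice of codebook.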


RANDOM CODING IDEA

$x^n(1) \;\to\; \text{Noisy channel} \;\to\; y^n$

ERROR PROBABILITY OF RANDOM CODE
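Sketch of the analysis: the decoder declares $\hat{w}$ if $x^n(\hat{w})$ is the unique codeword jointly typical with $y^n$. By the joint AEP, each wrong codeword is jointly typical with $y^n$ with probability at most $2^{-n(I(X;Y)-3\epsilon)}$, so a union bound over the $2^{nR} - 1$ wrong codewords contributes about $2^{-n(I(X;Y)-R-3\epsilon)}$, which vanishes whenever $R < I(X;Y) - 3\epsilon$.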


CODE SELECTION AND EXPURGATION
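Since the error probability averaged over random codebooks is small, at least one specific codebook must do at least as well as the average. Expurgation then discards the worst half of its codewords, converting a small average error into a small maximal error while reducing the rate only from $R$ to $R - \tfrac{1}{n}$.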


PROCEDURE SUMMARY


CONVERSE OF CHANNEL CODING THEOREM
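Sketch: for $W$ uniform on $\{1, \ldots, 2^{nR}\}$, Fano's inequality and the data processing inequality give $nR = H(W) \le I(W; Y^n) + n\epsilon_n \le I(X^n; Y^n) + n\epsilon_n \le nC + n\epsilon_n$, so $R \le C + \epsilon_n$ with $\epsilon_n \to 0$.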


LECTURE 11
- Minimum Bit Error Rate
- Channel with Feedback
- Joint Source Channel Coding

MINIMUM BIT ERROR RATE


CHANNEL WITH FEEDBACK


CAPACITY WITH FEEDBACK
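Key result: for a discrete memoryless channel, feedback does not increase capacity, $C_{FB} = C = \max_{p(x)} I(X;Y)$, although it can greatly simplify the encoding and decoding.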


EXAMPLE: BINARY ERASURE CHANNEL WITH FEEDBACK
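With feedback, a simple scheme already achieves capacity: retransmit each bit until it arrives unerased. Each bit takes $\frac{1}{1-\alpha}$ channel uses on average, for a rate of $1 - \alpha$ bits per use, exactly the BEC capacity.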


JOINT SOURCE CHANNEL CODING

$v^n \;\to\; \text{Encoder} \;\to\; x^n \;\to\; \text{Channel} \;\to\; y^n \;\to\; \text{Decoder} \;\to\; \hat{v}^n$

SOURCE CHANNEL PROOF (→)


SOURCE CHANNEL PROOF (←)


SEPARATION THEOREM
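Statement (in the textbook's form): a source satisfying the AEP can be communicated reliably over a discrete memoryless channel if $H(V) < C$, and conversely reliable communication requires $H(V) \le C$; hence designing the source code and the channel code separately loses nothing asymptotically.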
