20150101 Information Theory Chapter 4



    Ch4. Zero-Error Data Compression

    Yuan Luo


    Content

    Ch4. Zero-Error Data Compression

    4.1 The Entropy Bound

    4.2 Prefix Codes

    4.2.1 Definition and Existence

    4.2.2 Huffman Codes

    4.3 Redundancy of Prefix Codes


    4.1 The Entropy Bound

Definition 4.1. A D-ary source code C for a source random variable X is a mapping $C : \mathcal{X} \to \mathcal{D}^*$, where $\mathcal{D}^*$ is the set of all finite-length sequences of symbols taken from a D-ary code alphabet $\mathcal{D}$.


Definition 4.2. A code C is uniquely decodable if for any finite source sequence, the sequence of code symbols corresponding to this source sequence is different from the sequence of code symbols corresponding to any other (finite) source sequence.

    4.1 The Entropy Bound


Example 1. Let X = {A, B, C, D}. Consider the code C defined by C(A) = 0, C(B) = 1, C(C) = 01, C(D) = 10. Then all three source sequences AAD, ACA, and AABA produce the code sequence 0010. Thus from the code sequence 0010, we cannot tell which of the three source sequences it comes from. Therefore, C is not uniquely decodable.
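As a quick check, here is a minimal Python sketch using the codeword assignment reconstructed above (A→0, B→1, C→01, D→10); it encodes the three source sequences and shows that they all produce 0010.

```python
# Code from Example 1 (as reconstructed above): A -> 0, B -> 1, C -> 01, D -> 10.
codebook = {"A": "0", "B": "1", "C": "01", "D": "10"}

def encode(source, codebook):
    """Concatenate the codewords of the source symbols."""
    return "".join(codebook[s] for s in source)

for seq in ("AAD", "ACA", "AABA"):
    print(seq, "->", encode(seq, codebook))   # all three print 0010
```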

    4.1 The Entropy Bound


Theorem (Kraft Inequality). Let C be a D-ary source code, and let $l_1, l_2, \ldots, l_m$ be the lengths of the codewords. If C is uniquely decodable, then

$$\sum_{k=1}^{m} D^{-l_k} \le 1.$$

    4.1 The Entropy Bound


Example 2. Let X = {A, B, C, D}, and consider a binary code C for X with codeword lengths $l_1, l_2, l_3, l_4$. Since $|\mathcal{D}| = 2$, the Kraft inequality requires

$$\sum_{k=1}^{4} 2^{-l_k} = 2^{-l_1} + 2^{-l_2} + 2^{-l_3} + 2^{-l_4} \le 1.$$
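A minimal sketch of this check in Python, using an assumed set of binary codeword lengths (1, 2, 3, 3) rather than the codewords from the slide:

```python
# Kraft inequality check for a D-ary code with given codeword lengths.
def kraft_sum(lengths, D=2):
    return sum(D ** -l for l in lengths)

lengths = [1, 2, 3, 3]        # assumed illustration, not the code from the slide
print(kraft_sum(lengths))     # 1.0, so the Kraft inequality is satisfied
```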

    4.1 The Entropy Bound


Let X be a source random variable with probability distribution $\{p_1, p_2, \ldots, p_m\}$, where $m \ge 2$. When we use a uniquely decodable code C to encode the outcomes of X, the expected length of a codeword is given by

$$L = \sum_{k=1}^{m} p_k l_k,$$

where $l_k$ is the length of the codeword assigned to the source symbol with probability $p_k$.
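For instance, with an assumed distribution and assumed codeword lengths (neither taken from the slides):

```python
# Assumed illustration: probabilities p_k and codeword lengths l_k.
p = [0.5, 0.25, 0.15, 0.1]
l = [1, 2, 3, 3]
L = sum(pk * lk for pk, lk in zip(p, l))
print(L)   # 1.75, the expected codeword length in code symbols
```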

    4.1 The Entropy Bound


    Theorem (Entropy Bound)

Let C be a D-ary uniquely decodable code for a source random variable X with entropy $H_D(X)$. Then the expected length of C is lower bounded by $H_D(X)$, i.e.,

$$L \ge H_D(X).$$

This lower bound is tight if and only if $l_k = -\log_D p_k$ for all k.

    4.1 The Entropy Bound


Definition 4.8. The redundancy R of a D-ary uniquely decodable code is the difference between the expected length of the code and the entropy of the source, i.e., $R = L - H_D(X)$.
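A small sketch computing the redundancy for an assumed binary example (distribution and lengths are illustrative, not from the slides):

```python
import math

# Assumed binary example: probabilities p_k and codeword lengths l_k.
p = [0.5, 0.25, 0.15, 0.1]
l = [1, 2, 3, 3]

H = -sum(x * math.log2(x) for x in p)            # entropy H(X) in bits
L = sum(x * lk for x, lk in zip(p, l))           # expected codeword length
R = L - H                                        # redundancy of the code

print(R)   # about 0.007 bits; the entropy bound guarantees R >= 0
```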

    4.1 The Entropy Bound


    4.2.1 Definition and Existence

Definition 4.9. A code is called a prefix-free code if no codeword is a prefix of any other codeword. For brevity, a prefix-free code will be referred to as a prefix code.

    4.2 Prefix Codes


    4.2.1 Definition and Existence

    Theorem

There exists a D-ary prefix code with codeword lengths $l_1, l_2, \ldots, l_m$ if and only if the Kraft inequality

$$\sum_{k=1}^{m} D^{-l_k} \le 1$$

is satisfied.
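A minimal sketch of the "if" direction for the binary case: given lengths that satisfy the Kraft inequality, assign codewords in order of increasing length (the canonical construction; the function name and lengths below are illustrative assumptions):

```python
def binary_prefix_code(lengths):
    """Build a binary prefix code from codeword lengths that satisfy
    the Kraft inequality (canonical construction; a sketch, not a proof)."""
    assert sum(2 ** -l for l in lengths) <= 1, "Kraft inequality violated"
    codewords, next_code, prev_len = [], 0, 0
    for l in sorted(lengths):
        next_code <<= (l - prev_len)              # extend to the new length
        codewords.append(format(next_code, f"0{l}b"))
        next_code += 1
        prev_len = l
    return codewords

print(binary_prefix_code([1, 2, 3, 3]))           # ['0', '10', '110', '111']
```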


A probability distribution $\{p_i\}$ such that $p_i = D^{-t_i}$ for all i, where $t_i$ is a positive integer, is called a D-adic distribution. When $D = 2$, $\{p_i\}$ is called a dyadic distribution.

    4.2.1 Definition and Existence


Corollary 4.12. There exists a D-ary prefix code which achieves the entropy bound for a distribution $\{p_i\}$ if and only if $\{p_i\}$ is D-adic.
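For example (an assumed dyadic distribution, not one from the slides), taking $l_k = -\log_2 p_k$ gives codeword lengths that satisfy the Kraft inequality with equality, and the expected length equals the entropy:

```python
import math

# Assumed dyadic distribution; with l_k = -log2(p_k) the entropy bound is met with equality.
p = [0.5, 0.25, 0.125, 0.125]
l = [int(-math.log2(x)) for x in p]               # codeword lengths 1, 2, 3, 3

H = -sum(x * math.log2(x) for x in p)             # entropy H(X) in bits
L = sum(x * lk for x, lk in zip(p, l))            # expected codeword length

print(H, L)   # both 1.75: the bound L >= H(X) holds with equality
```

These lengths also satisfy the Kraft inequality, so the construction sketched earlier yields a corresponding prefix code.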

    4.2.1 Definition and Existence


    4.2.2 Huffman Codes

As we have mentioned, the efficiency of a uniquely decodable code is measured by its expected length. Thus for a given source X, we are naturally interested in prefix codes which have the minimum expected length. Such codes, called optimal codes, can be constructed by the Huffman procedure, and these codes are referred to as Huffman codes.


    4.2.2 Huffman Codes

The Huffman procedure is to form a code tree such that the expected length is minimum. The procedure is described by a very simple rule:

Keep merging the two smallest probability masses until one probability mass (i.e., 1) is left.
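A compact Python sketch of this binary Huffman procedure (symbol names and probabilities are illustrative assumptions):

```python
import heapq
from itertools import count

def huffman_code(probs):
    """Binary Huffman procedure: keep merging the two smallest probability
    masses until a single mass (i.e., 1) is left, then read off the codewords."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), symbol) for symbol, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)     # smallest probability mass
        p2, _, right = heapq.heappop(heap)    # second smallest probability mass
        heapq.heappush(heap, (p1 + p2, next(tiebreak), (left, right)))
    _, _, tree = heap[0]

    code = {}
    def assign(node, prefix):
        if isinstance(node, tuple):           # internal node: descend both branches
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:                                 # leaf: record the codeword
            code[node] = prefix or "0"        # a one-symbol source gets "0"
    assign(tree, "")
    return code

# Assumed example distribution (not from the slides):
print(huffman_code({"A": 0.5, "B": 0.25, "C": 0.15, "D": 0.1}))
# {'A': '0', 'B': '10', 'D': '110', 'C': '111'} (optimal up to relabeling of branches)
```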


Lemma 4.15. In an optimal code, shorter codewords are assigned to larger probabilities.

Lemma 4.16. There exists an optimal code in which the codewords assigned to the two smallest probabilities are siblings, i.e., the two codewords have the same length and they differ only in the last symbol.

    4.2.2 Huffman Codes


    Theorem

The Huffman procedure produces an optimal prefix code.

    4.2.2 Huffman Codes


    Theorem

The expected length of a Huffman code, denoted by $L_{\mathrm{Huff}}$, satisfies

$$L_{\mathrm{Huff}} < H_D(X) + 1.$$

This bound is the tightest among all upper bounds on $L_{\mathrm{Huff}}$ which depend only on the source entropy. From the entropy bound and the above theorem, we have

$$H_D(X) \le L_{\mathrm{Huff}} < H_D(X) + 1.$$
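A numeric sanity check for an assumed distribution, with the binary Huffman codeword lengths for it worked out by hand (1, 2, 3, 3 for the probabilities below):

```python
import math

# Assumed distribution; Huffman lengths for it are 1, 2, 3, 3 (worked by hand).
p = [0.5, 0.25, 0.15, 0.1]
lengths = [1, 2, 3, 3]

H = -sum(x * math.log2(x) for x in p)                  # source entropy, bits
L_huff = sum(x * l for x, l in zip(p, lengths))        # expected Huffman length

print(H, L_huff)            # about 1.743 and 1.75
assert H <= L_huff < H + 1  # H(X) <= L_Huff < H(X) + 1
```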

    4.2.2 Huffman Codes


    4.3 Redundancy of Prefix Codes

Let X be a source random variable with probability distribution $\{p_1, p_2, \ldots, p_m\}$, where $m \ge 2$. A D-ary prefix code for X can be represented by a D-ary code tree with m leaves, where each leaf corresponds to a codeword.


$\mathcal{I}$: the index set of all the internal nodes (including the root) in the code tree.

$q_k$: the probability of reaching internal node k during the decoding process.

The probability $q_k$ is called the reaching probability of internal node k. Evidently, $q_k$ is equal to the sum of the probabilities of all the leaves descending from node k.

    4.3 Redundancy of Prefix Codes


$p_{k,j}$: the probability that the j-th branch of node k is taken during the decoding process.

The probabilities $p_{k,j}$, $0 \le j \le D-1$, are called the branching probabilities of node k, and

$$q_k = \sum_{j=0}^{D-1} p_{k,j}.$$

    4.3 Redundancy of Prefix Codes


Once node k is reached, the conditional branching distribution is

$$\left\{ \frac{p_{k,0}}{q_k}, \frac{p_{k,1}}{q_k}, \ldots, \frac{p_{k,D-1}}{q_k} \right\}.$$

Then define the conditional entropy of node k by

$$h_k = H_D\!\left( \left\{ \frac{p_{k,0}}{q_k}, \frac{p_{k,1}}{q_k}, \ldots, \frac{p_{k,D-1}}{q_k} \right\} \right).$$

    4.3 Redundancy of Prefix Codes


Lemma 4.19. $H_D(X) = \sum_{k \in \mathcal{I}} q_k h_k$.

Lemma 4.20. $L = \sum_{k \in \mathcal{I}} q_k$.
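The two lemmas can be checked numerically. The sketch below uses an assumed binary prefix code (A→0, B→10, C→110, D→111) and an assumed distribution, computes the reaching probability $q_k$ and conditional entropy $h_k$ of each internal node, and confirms $H_D(X) = \sum_k q_k h_k$ and $L = \sum_k q_k$.

```python
import math

# Assumed binary prefix code and source distribution (not taken from the slides):
code = {"A": "0", "B": "10", "C": "110", "D": "111"}
p = {"A": 0.5, "B": 0.25, "C": 0.15, "D": 0.1}

# Internal nodes are the proper prefixes of the codewords (the root is "").
internal = {cw[:i] for cw in code.values() for i in range(len(cw))}

def subtree_prob(prefix):
    """Total probability of the leaves whose codewords start with `prefix`."""
    return sum(p[s] for s, cw in code.items() if cw.startswith(prefix))

def cond_entropy(node):
    """h_k: entropy of the conditional branching distribution at `node`."""
    q = subtree_prob(node)                            # reaching probability q_k
    branches = [subtree_prob(node + d) for d in "01"]
    return -sum(b / q * math.log2(b / q) for b in branches if b > 0)

H = -sum(x * math.log2(x) for x in p.values())        # source entropy
L = sum(p[s] * len(cw) for s, cw in code.items())     # expected length

print(sum(subtree_prob(k) * cond_entropy(k) for k in internal), H)  # Lemma 4.19: equal
print(sum(subtree_prob(k) for k in internal), L)                    # Lemma 4.20: equal
```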

    4.3 Redundancy of Prefix Codes


Define the local redundancy of an internal node k by

$$r_k = q_k (1 - h_k).$$

Theorem (Local Redundancy Theorem). Let L be the expected length of a D-ary prefix code for a source random variable X, and let R be the redundancy of the code. Then

$$R = \sum_{k \in \mathcal{I}} r_k.$$
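The theorem follows in one line from Lemmas 4.19 and 4.20 and the definition of $r_k$:

```latex
R \;=\; L - H_D(X)
  \;=\; \sum_{k \in \mathcal{I}} q_k \;-\; \sum_{k \in \mathcal{I}} q_k h_k
  \;=\; \sum_{k \in \mathcal{I}} q_k (1 - h_k)
  \;=\; \sum_{k \in \mathcal{I}} r_k .
```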

    4.3 Redundancy of Prefix Codes


Corollary 4.22 (Entropy Bound). Let R be the redundancy of a prefix code. Then $R \ge 0$, with equality if and only if all the internal nodes in the code tree are balanced, i.e., the conditional branching distribution at every internal node is uniform.

    4.3 Redundancy of Prefix Codes


    Thank you!