![Page 1: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/1.jpg)
4. Quantization and Data Compression
ECE 302 Spring 2012 Purdue University, School of ECE
Prof. Ilya Pollak
![Page 2: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/2.jpg)
What is data compression?
• Reducing the file size without compromising the quality of the data stored in the file too much (lossy compression) or at all (lossless compression).
• With compression, you can fit higher-quality data (e.g., higher-resolution pictures or video) into a file of the same size as required for lower-quality uncompressed data.
Ilya Pollak
![Page 3: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/3.jpg)
Why data compression?
• Our appetite for data (high-resolution pictures, HD video, audio, documents, etc.) seems to always significantly outpace hardware capabilities for storage and transmission.
![Page 6: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/6.jpg)
Data compression: Step 0
• If the data is continuous-time (e.g., audio) or continuous-space (e.g., picture), it first needs to be discretized.
• Sampling is typically done nowadays during signal acquisition (e.g., digital camera for pictures or audio recording equipment for music and speech).
• We will not study sampling. It is studied in ECE 301, ECE 438, and ECE 440.
• We will consider compressing discrete-time or discrete-space data.
![Page 10: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/10.jpg)
Example: compression of grayscale images
• An eight-bit grayscale image is a rectangular array of integers between 0 (black) and 255 (white).
• Each site in the array is called a pixel.
• It takes one byte (eight bits) to store one pixel value, since it can be any number between 0 and 255.
• It would take 25 bytes to store a 5x5 image.
• Can we do better?
![Page 11: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/11.jpg)
Example: compression of grayscale images
255 255 255 255 255
255 255 255 255 255
200 200 200 200 200
200 200 200 200 200
200 200 200 200 100
Can we do better than 25 bytes?
![Page 16: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/16.jpg)
Two key ideas
• Idea #1:
– Transform the data to create lots of zeros. For example, we could rasterize the image, compute the differences, and store the top left value along with the 24 differences [in reality, other transforms are used, but they work in a similar fashion]:
– 255,0,0,0,0,0,0,0,0,0,−55,0,0,0,0,0,0,0,0,0,0,0,0,0,−100
– This seems to make things worse: now the numbers can range from −255 to 255, and therefore we need two bytes per pixel!
• Idea #2:
– When encoding the data, spend fewer bits on frequently occurring numbers and more bits on rare numbers.
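The differencing transform of Idea #1 can be sketched in a few lines of Python. This is only an illustration: the function names `raster_differences` and `undo_differences` are hypothetical, and the input is the 5x5 example image with rows of 255 and 200 from the earlier slide.

```python
# Illustrative sketch; the function names are hypothetical, not from the course.
image = [
    [255, 255, 255, 255, 255],
    [255, 255, 255, 255, 255],
    [200, 200, 200, 200, 200],
    [200, 200, 200, 200, 200],
    [200, 200, 200, 200, 100],
]


def raster_differences(img):
    """Rasterize row by row; keep the first value, then successive differences."""
    flat = [v for row in img for v in row]
    return [flat[0]] + [b - a for a, b in zip(flat, flat[1:])]


def undo_differences(diffs):
    """Invert the transform with a running sum, so no information is lost."""
    values, total = [], 0
    for d in diffs:
        total += d
        values.append(total)
    return values


diffs = raster_differences(image)
print(diffs)
print("zeros:", diffs.count(0), "of", len(diffs))
```

The running sum in `undo_differences` recovers the rasterized image exactly, which is why this step by itself is lossless.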
![Page 21: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/21.jpg)
Entropy coding
Suppose we are encoding realizations of a discrete random variable X such that
value of X     0       255     −55     −100
probability    22/25   1/25    1/25    1/25
Consider the following fixed-length encoder:
value of X     0       255     −55     −100
codeword       00      01      10      11
For a file with 25 numbers, E[file size] = 25*2*(22/25+1/25+1/25+1/25) = 50 bits
Now consider the following encoder:
value of X     0       255     −55     −100
codeword       1       01      000     001
For a file with 25 numbers, E[file size] = 25(22/25 + 2/25 + 3/25 + 3/25) = 30 bits!
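The two expected file sizes can be checked with a short Python sketch. The helper name `expected_file_size` is hypothetical; the probabilities and codewords are the ones in the tables above, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Probabilities and codewords copied from the tables above.
probs = {0: Fraction(22, 25), 255: Fraction(1, 25),
         -55: Fraction(1, 25), -100: Fraction(1, 25)}
fixed_code = {0: "00", 255: "01", -55: "10", -100: "11"}
variable_code = {0: "1", 255: "01", -55: "000", -100: "001"}


def expected_file_size(code, probs, n_symbols):
    """Expected number of bits for n_symbols identically distributed symbols."""
    return n_symbols * sum(probs[x] * len(code[x]) for x in probs)


print(expected_file_size(fixed_code, probs, 25))     # 50
print(expected_file_size(variable_code, probs, 25))  # 30
```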
![Page 23: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/23.jpg)
Entropy coding
• A similar encoding scheme can be devised for a random variable of pixel differences which takes values between −255 and 255, to result in a smaller average file size than two bytes per pixel.
• Another commonly used idea: run-length coding. I.e., instead of encoding each 0 individually, encode the length of each string of zeros.
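A minimal sketch of run-length coding of zeros, in Python. The output format (pairs tagged `"Z"` for a run of zeros and `"V"` for any other value) is an assumption made for this illustration, not a format defined in the course; practical coders pack runs and values into bits differently.

```python
def run_length_encode_zeros(values):
    """Replace each run of zeros by ("Z", run_length); keep other values as ("V", value)."""
    out, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            if run:
                out.append(("Z", run))
                run = 0
            out.append(("V", v))
    if run:  # flush a trailing run of zeros
        out.append(("Z", run))
    return out


def run_length_decode(pairs):
    """Expand the (tag, number) pairs back into the original sequence."""
    out = []
    for tag, n in pairs:
        out.extend([0] * n if tag == "Z" else [n])
    return out


# The 25 pixel differences from the 5x5 example image.
diffs = [255] + [0] * 9 + [-55] + [0] * 13 + [-100]
encoded = run_length_encode_zeros(diffs)
print(encoded)
```

For the example sequence, the 25 numbers collapse to five pairs, and `run_length_decode` restores the original sequence.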
![Page 27: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/27.jpg)
Back to the four-symbol example
value of X     0       255     −55     −100
probability    22/25   1/25    1/25    1/25
codeword       1       01      000     001
Can we do even better than 30 bits? What about this alternative encoder?
value of X     0       255     −55     −100
probability    22/25   1/25    1/25    1/25
codeword       0       01      1       10
E[file size] = 25(22/25 + 2/25 + 1/25 + 2/25) = 27 bits
Is there anything wrong with this encoder?
![Page 29: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/29.jpg)
The second encoding is not uniquely decodable!
value of X     0       255     −55     −100
probability    22/25   1/25    1/25    1/25
codeword       0       01      1       10
Encoded string ‘01’ could either be 255 or 0 followed by −55.
Therefore, this code is unusable! It turns out that the first code is uniquely decodable: none of its codewords (1, 01, 000, 001) is a prefix of another, so a decoder always knows where each codeword ends.
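The ambiguity can be made concrete with a small Python sketch. The helpers `is_prefix_free` and `all_parsings` are hypothetical names introduced here. Being prefix-free is a sufficient (not necessary) condition for unique decodability, so the check confirms the first code and the enumeration exhibits the failure of the second.

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword."""
    words = list(codewords)
    return not any(a != b and b.startswith(a) for a in words for b in words)


def all_parsings(bits, code):
    """Enumerate every way to split `bits` into codewords of `code` (symbol -> codeword)."""
    if not bits:
        return [[]]
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            rest = all_parsings(bits[len(word):], code)
            results += [[symbol] + tail for tail in rest]
    return results


first_code = {0: "1", 255: "01", -55: "000", -100: "001"}
second_code = {0: "0", 255: "01", -55: "1", -100: "10"}

print(is_prefix_free(first_code.values()))   # True
print(is_prefix_free(second_code.values()))  # False
print(all_parsings("01", second_code))       # two different decodings
```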
![Page 32: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/32.jpg)
What kinds of distributions are amenable to entropy coding?
[Figure: two bar charts of probabilities over the symbols a, b, c, d, one with a vertical axis from 0 to 0.7 and one with a vertical axis from 0 to 0.3. Panel captions: "Cannot do better than two bits per symbol" and "Can do a lot better than two bits per symbol".]
Conclusion: the transform procedure should be such that the numbers fed into the entropy coder have a highly concentrated histogram (a few very likely values, most values unlikely). Also, if we are encoding each number individually, they should be independent or approximately independent.
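One way to quantify "concentrated" is the Shannon entropy of the distribution, which lower-bounds the expected codeword length of any uniquely decodable symbol-by-symbol code. The Python sketch below is an illustration under assumptions: entropy is not defined in this part of the slides, the bar heights in the figure are not recoverable, and so a uniform distribution and the four-symbol example from the earlier slides are used as stand-ins.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits per symbol: -sum(p * log2(p)), skipping zero probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


uniform = [0.25, 0.25, 0.25, 0.25]                # stand-in for a flat histogram
concentrated = [22 / 25, 1 / 25, 1 / 25, 1 / 25]  # four-symbol example from earlier slides

print(entropy_bits(uniform))        # 2 bits per symbol: no gain over a fixed-length code
print(entropy_bits(concentrated))   # well below 1 bit per symbol
```

Note that a code assigning one codeword per symbol cannot use fewer than one bit per symbol, so approaching the entropy of a very concentrated distribution requires coding several symbols together (for example, with run-length coding).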
![Page 34: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/34.jpg)
What if we are willing to lose some information?
253 253 255 254 255
254 254 254 255 254
252 255 255 254 252
253 253 254 254 254
252 255 253 252 253
Quantization:
253.5 253.5 253.5 253.5 253.5
253.5 253.5 253.5 253.5 253.5
253.5 253.5 253.5 253.5 253.5
253.5 253.5 253.5 253.5 253.5
253.5 253.5 253.5 253.5 253.5
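A minimal Python sketch of this one-level quantization. It assumes the single level is the midpoint of the range of pixel values, which reproduces the 253.5 shown above; the slide does not state how the level was chosen.

```python
image = [
    [253, 253, 255, 254, 255],
    [254, 254, 254, 255, 254],
    [252, 255, 255, 254, 252],
    [253, 253, 254, 254, 254],
    [252, 255, 253, 252, 253],
]

pixels = [v for row in image for v in row]
# Assumption: the single quantization level is the midpoint of the value range.
level = (min(pixels) + max(pixels)) / 2
quantized = [[level for _ in row] for row in image]
max_abs_error = max(abs(v - level) for v in pixels)

print(level)          # 253.5
print(max_abs_error)  # 1.5
```

Every pixel is off by at most 1.5 gray levels, and the quantized image contains a single repeated value, which is cheap to encode.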
![Page 35: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/35.jpg)
Some eight-bit images
The five stripes contain random values from (left to right): {252,253,254,255}, {188,189,190,191}, {125,126,127,128}, {61,62,63,64}, {0,1,2,3}.
The five stripes contain random integers from (left to right): {240,…,255}, {176,…,191}, {113,…,128}, {49,…,64 }, {0,…,15}.
![Page 38: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/38.jpg)
Converting continuous-valued to discrete-valued signals
• Many real-world signals are continuous-valued.
– audio signal a(t): both the time argument t and the intensity value a(t) are continuous;
– image u(x,y): both the spatial location (x,y) and the image intensity value u(x,y) are continuous;
– video v(x,y,t): x, y, t, and v(x,y,t) are all continuous.
• Discretizing the argument values t, x, and y (i.e., sampling) is studied in ECE 301, 438, and 440.
• However, in addition to discretizing the argument values, the signal values must be discretized as well in order to be digitally stored.
![Page 39: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/39.jpg)
Quantization
• Digitizing a continuous-valued signal into a discrete and finite set of values.
• Converting a discrete-valued signal into another discrete-valued signal, with fewer possible discrete values.
![Page 40: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/40.jpg)
How to compare two quantizers?
• Suppose data X(1),…,X(N) is quantized using two quantizers, to result in $Y_1(1),\dots,Y_1(N)$ and $Y_2(1),\dots,Y_2(N)$.
• Suppose both $Y_1(1),\dots,Y_1(N)$ and $Y_2(1),\dots,Y_2(N)$ can be encoded with the same number of bits.
• Which quantization is better?
• The one that results in less distortion. But how to measure distortion?
– In general, measuring and modeling perceptual image similarity and similarity of audio are open research problems.
– Some useful things are known about the human auditory and visual systems that inform the design of quantizers.
![Page 43: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/43.jpg)
Sensitivity of the Human Visual System to Contrast Changes, as a Function of Frequency
High and low frequencies may be quantized more coarsely
[From Mannos-Sakrison IEEE-IT 1974]
![Page 44: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/44.jpg)
But there are many other intricacies in the way the human visual system computes similarity…
![Page 45: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/45.jpg)
Are these two images similar?
![Page 47: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/47.jpg)
What about these two?
• Performance assessment of compression algorithms and quantizers is complicated, because measuring image fidelity is complicated.
• Often, very simple distortion measures are used such as mean-square error.
![Page 48: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/48.jpg)
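Scalar vs Vector Quantization
Scalar quantization:
• quantize each value separately
• simple thresholding
Vector quantization:
• quantize several values jointly
• more complex
[Figure: two plots in the (r, s) plane, with both axes running from 0 to 255; tick marks at 127 in one plot and at 95 and 127 in the other.]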
![Page 51: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/51.jpg)
What kinds of joint distributions are amenable to scalar quantization?
[Figure: two plots in the (r, s) plane, with both axes running from 0 to 255; one shows a green square and the other a yellow region, with tick marks at 95 and 127.]
• If (r,s) are jointly uniform over green square (or, more generally, independent), knowing r does not tell us anything about s. Best thing to do: make quantization decisions independently.
• If (r,s) are jointly uniform over yellow region, knowing r tells us a lot about s. Best thing to do: make quantization decisions jointly.
Conclusion: if the data is transformed before quantization, the transform procedure should be such that the coefficients fed into the quantizer are independent (or at least uncorrelated, or almost uncorrelated), in order to enable the simpler scalar quantization.
![Page 54: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/54.jpg)
More on Scalar Quantization
• Does it make sense to do scalar quantization with different quantization bins for different variables?
– No reason to do this if we are quantizing grayscale pixel values.
– However, if we can decompose the image into components that are less perceptually important and more perceptually important, we should use larger quantization bins for the less important components.
[Figure: plot in the (r, s) plane, with both axes running from 0 to 255 and tick marks at 127.]
![Page 56: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/56.jpg)
Structure of a Typical Lossy Compression Algorithm for Audio, Images, or Video
data → transform → quantization → entropy coding → compressed bitstream
Let’s consider quantization and entropy coding more closely. (Various transforms are considered in ECE 301 and ECE 438.)
![Page 59: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/59.jpg)
Quantization: problem statement
Source (e.g., image, video, speech signal) → Quantizer
• Quantizer input: sequence of discrete or continuous random variables X(1),…,X(N) (e.g., transformed image pixel values).
• Quantizer output: sequence of discrete random variables Y(1),…,Y(N), each distributed over a finite set of values (quantization levels).
• Errors: D(1),…,D(N) where D(n) = X(n) − Y(n)
![Page 60: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/60.jpg)
MSE is a widely used measure of distortion of quantizers
• Suppose data X(1),…,X(N) are quantized, to result in Y(1),…,Y(N).
$$
E\left[\sum_{n=1}^{N} \bigl(X(n) - Y(n)\bigr)^2\right] = E\left[\sum_{n=1}^{N} \bigl(D(n)\bigr)^2\right]
$$
If D(1),…,D(N) are identically distributed, this is the same as $N\,E\bigl[(D(n))^2\bigr]$, for any n.
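For one particular realization of the data, the quantity inside the expectation is just a sum of squared errors. The Python sketch below evaluates it for the 5x5 image from the earlier slide quantized to the single level 253.5. It is an empirical illustration, not the expectation itself, and `sum_squared_error` is a hypothetical helper name.

```python
def sum_squared_error(x, y):
    """Sum over n of (X(n) - Y(n))^2 for one realization of the data."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


# Rasterized 5x5 example image and its one-level quantization.
x = [253, 253, 255, 254, 255,
     254, 254, 254, 255, 254,
     252, 255, 255, 254, 252,
     253, 253, 254, 254, 254,
     252, 255, 253, 252, 253]
y = [253.5] * len(x)

sse = sum_squared_error(x, y)
print(sse)           # total squared error over the 25 pixels
print(sse / len(x))  # average squared error per pixel
```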
![Page 62: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/62.jpg)
Scalar uniform quantization
• Use quantization intervals (bins) of equal size $[x_1,x_2), [x_2,x_3), \dots, [x_L,x_{L+1}]$.
• Quantization levels $q_1, q_2, \dots, q_L$.
• Each quantization level is in the middle of the corresponding quantization bin: $q_k = (x_k + x_{k+1})/2$.
• If quantizer input X is in $[x_k, x_{k+1})$, the corresponding quantized value is $Y = q_k$.
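A minimal Python sketch of a scalar uniform quantizer following the definition above. The function name and the parameters `lo`, `hi` (standing for $x_1$ and $x_{L+1}$) and `num_levels` (standing for L) are assumptions for illustration; bins are indexed from 0 in the code rather than from 1 as on the slide.

```python
def uniform_quantize(x, lo, hi, num_levels):
    """Map x in [lo, hi] to the midpoint of its equal-width bin."""
    if not lo <= x <= hi:
        raise ValueError("x is outside [lo, hi]")
    width = (hi - lo) / num_levels
    # Bin index; the last bin also includes its right endpoint hi.
    k = min(int((x - lo) / width), num_levels - 1)
    return lo + (k + 0.5) * width


# Four bins of width 64 on [0, 256]: levels 32, 96, 160, 224.
print([uniform_quantize(x, 0, 256, 4) for x in (0, 63, 64, 100, 255, 256)])
```

Values on a bin boundary, such as 64 in the example, fall into the bin to their right, matching the half-open intervals on the slide.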
![Page 64: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/64.jpg)
Uniform vs non-uniform quantization
• Uniform quantization is not a good strategy for distributions which significantly differ from uniform.
• If the distribution is non-uniform, it is better to spend more quantization levels on more probable parts of the distribution and fewer quantization levels on less probable parts.
![Page 68: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/68.jpg)
Scalar Lloyd-Max quantizer
• X = source random variable with a known distribution. We assume it to be a continuous r.v. with PDF $f_X(x) > 0$.
– The results can be extended to discrete or mixed random variables, and to continuous random variables whose density can be zero for some x.
• Quantization intervals $(x_1,x_2), [x_2,x_3), \dots, [x_L,x_{L+1})$ and levels $q_1, \dots, q_L$ such that
– $x_1 = -\infty$
– $x_{L+1} = \infty$
– $-\infty < q_1 < x_2 \le q_2 < x_3 \le q_3 < \dots \le q_{L-1} < x_L \le q_L < +\infty$, i.e., $q_k \in$ k-th quantization interval
• Y = the result of quantizing X, a discrete random variable with L possible outcomes, $q_1, q_2, \dots, q_L$, defined by
$$
Y = Y(X) =
\begin{cases}
q_1 & \text{if } X < x_2 \\
q_2 & \text{if } x_2 \le X < x_3 \\
\vdots \\
q_{L-1} & \text{if } x_{L-1} \le X < x_L \\
q_L & \text{if } X \ge x_L
\end{cases}
$$
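The mapping from X to Y is a lookup against the finite endpoints $x_2,\dots,x_L$. A minimal Python sketch using the standard-library `bisect` module follows; the function name and the numeric endpoints and levels are arbitrary illustrative values, not an optimized Lloyd-Max design.

```python
from bisect import bisect_right


def quantize(x, thresholds, levels):
    """thresholds = [x_2, ..., x_L] in increasing order; levels = [q_1, ..., q_L]."""
    if len(levels) != len(thresholds) + 1:
        raise ValueError("need exactly one more level than thresholds")
    # bisect_right counts the thresholds <= x, so x_k <= X < x_{k+1} selects q_k.
    return levels[bisect_right(thresholds, x)]


thresholds = [-1.0, 0.0, 1.0]     # x_2, x_3, x_4 (illustrative values)
levels = [-1.5, -0.5, 0.5, 1.5]   # q_1, ..., q_4, one inside each interval

print([quantize(x, thresholds, levels) for x in (-3.0, -1.0, -0.2, 0.0, 0.99, 1.0, 7.0)])
```

Using `bisect_right` rather than `bisect_left` makes each interval closed on the left, which matches the definition of Y above.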
![Page 70: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/70.jpg)
Scalar Lloyd-Max quantizer: goal
• Given the pdf $f_X(x)$ of the source r.v. X and the desired number L of quantization levels, find the quantization interval endpoints $x_2,\dots,x_L$ and quantization levels $q_1,\dots,q_L$ to minimize the mean-square error, $E[(Y-X)^2]$.
• To do this, express the mean-square error in terms of the quantization interval endpoints and quantization levels, and find the minimum (or minima) through differentiation.
![Page 78: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/78.jpg)
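Scalar Lloyd-Max quantizer: derivation

$$
E\left[(Y - X)^2\right]
= \int_{-\infty}^{\infty} \big(y(x) - x\big)^2 f_X(x)\,dx
= \sum_{k=1}^{L} \int_{x_k}^{x_{k+1}} \big(y(x) - x\big)^2 f_X(x)\,dx
= \sum_{k=1}^{L} \int_{x_k}^{x_{k+1}} (q_k - x)^2 f_X(x)\,dx
$$

Minimize w.r.t. $q_k$:

$$
\frac{\partial}{\partial q_k} E\left[(Y - X)^2\right]
= \int_{x_k}^{x_{k+1}} 2\,(q_k - x)\, f_X(x)\,dx = 0
$$

$$
q_k \int_{x_k}^{x_{k+1}} f_X(x)\,dx = \int_{x_k}^{x_{k+1}} x f_X(x)\,dx,
\quad\text{therefore}\quad
q_k = \frac{\int_{x_k}^{x_{k+1}} x f_X(x)\,dx}{\int_{x_k}^{x_{k+1}} f_X(x)\,dx}
= E\left[X \mid X \in k\text{-th quantization interval}\right]
$$

This is a minimum, since

$$
\frac{\partial^2}{\partial q_k^2} E\left[(Y - X)^2\right] = 2 \int_{x_k}^{x_{k+1}} f_X(x)\,dx > 0.
$$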
![Page 84: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/84.jpg)
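Scalar Lloyd-Max quantizer: derivation

$$
E\left[(Y - X)^2\right]
= \int_{-\infty}^{\infty} \big(y(x) - x\big)^2 f_X(x)\,dx
= \sum_{k=1}^{L} \int_{x_k}^{x_{k+1}} \big(y(x) - x\big)^2 f_X(x)\,dx
= \sum_{k=1}^{L} \int_{x_k}^{x_{k+1}} (q_k - x)^2 f_X(x)\,dx
$$

Minimize w.r.t. $x_k$, for $k = 2, \ldots, L$:

$$
\frac{\partial}{\partial x_k} E\left[(Y - X)^2\right]
= \frac{\partial}{\partial x_k} \left\{ \int_{x_{k-1}}^{x_k} (q_{k-1} - x)^2 f_X(x)\,dx + \int_{x_k}^{x_{k+1}} (q_k - x)^2 f_X(x)\,dx \right\}
$$

$$
= (q_{k-1} - x_k)^2 f_X(x_k) - (q_k - x_k)^2 f_X(x_k)
= (q_{k-1} - q_k)(q_{k-1} + q_k - 2x_k)\, f_X(x_k) = 0.
$$

By assumption, $f_X(x) \ne 0$ and $q_{k-1} \ne q_k$. Therefore,

$$
x_k = \frac{q_{k-1} + q_k}{2}, \quad\text{for } k = 2, \ldots, L.
$$

This is a minimum, since

$$
\frac{\partial^2}{\partial x_k^2} E\left[(Y - X)^2\right] = 2\,(q_k - q_{k-1})\, f_X(x_k) > 0.
$$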
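As a quick sanity check of the two conditions derived so far, consider a case chosen here purely for illustration (it is not worked in the slides): X uniform on [0, 1] with L = 2 levels. The density is zero outside [0, 1], so this relies on the extension to densities that can vanish, and the outer endpoints are effectively 0 and 1. The centroid of a uniform density over an interval is the midpoint of that interval, so the conditions reduce to a small linear system:

```latex
\begin{aligned}
q_1 &= \frac{0 + x_2}{2}, \qquad q_2 = \frac{x_2 + 1}{2}, \qquad x_2 = \frac{q_1 + q_2}{2} \\
x_2 &= \frac{1}{2}\left(\frac{x_2}{2} + \frac{x_2 + 1}{2}\right) = \frac{2x_2 + 1}{4}
\;\Longrightarrow\; x_2 = \frac{1}{2}, \quad q_1 = \frac{1}{4}, \quad q_2 = \frac{3}{4} \\
E\left[(Y - X)^2\right] &= 2\int_0^{1/2} \left(\tfrac{1}{4} - x\right)^2 dx = \frac{1}{48}
\end{aligned}
```

The resulting quantizer has two equal-width intervals with the levels at their midpoints, and its error equals the familiar $\Delta^2/12$ with step $\Delta = 1/2$.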
![Page 88: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/88.jpg)
Nonlinear system to be solved

$$
\begin{cases}
q_k = \dfrac{\int_{x_k}^{x_{k+1}} x f_X(x)\,dx}{\int_{x_k}^{x_{k+1}} f_X(x)\,dx}
= E\left[X \mid X \in k\text{-th quantization interval}\right], & \text{for } k = 1, \ldots, L \\[2ex]
x_k = \dfrac{q_{k-1} + q_k}{2}, & \text{for } k = 2, \ldots, L
\end{cases}
$$

• Closed-form solution can be found only for very simple PDFs.
– E.g., if X is uniform, then Lloyd-Max quantizer = uniform quantizer.
• In general, an approximate solution can be found numerically, via an iterative algorithm (e.g., lloyds command in Matlab).
• For real data, typically the PDF is not given and therefore needs to be estimated using, for example, histograms constructed from the observed data.
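The iterative approach mentioned above can be sketched as follows. This is a minimal sample-based sketch, not the Matlab `lloyds` implementation: the function name `lloyd_max`, the evenly spaced initialization, and the fixed iteration count are assumptions made for illustration. Conditional expectations are replaced by sample means, in the spirit of estimating the distribution from observed data, and the two conditions of the nonlinear system are applied alternately.

```python
import numpy as np

def lloyd_max(samples, L, iters=200):
    """Alternate the midpoint and centroid conditions on a set of samples.

    Returns (thresholds, levels): thresholds are x_2..x_L, levels are q_1..q_L.
    """
    x = np.sort(np.asarray(samples, dtype=float))
    # Arbitrary starting levels, evenly spaced over the data range (assumption).
    q = np.linspace(x[0], x[-1], L)
    for _ in range(iters):
        # Endpoint condition: x_k = (q_{k-1} + q_k) / 2.
        t = (q[:-1] + q[1:]) / 2
        # Interval index of each sample; intervals are closed on the left.
        idx = np.searchsorted(t, x, side="right")
        # Level condition: q_k = mean of the samples falling in interval k.
        for k in range(L):
            cell = x[idx == k]
            if cell.size:          # leave the level unchanged if the cell is empty
                q[k] = cell.mean()
    t = (q[:-1] + q[1:]) / 2
    return t, q

# Evenly spaced points on [0, 1] stand in for samples of a uniform r.v.
thresholds, levels = lloyd_max(np.linspace(0.0, 1.0, 10001), 4)
```

For this uniform stand-in the result should be close to the uniform quantizer, consistent with the remark above. For other distributions the iteration may settle at a local rather than global minimum, depending on the starting levels.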
![Page 92: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/92.jpg)
Vector Lloyd-Max quantizer?
$X = (X(1), \ldots, X(N))$ = source random vector with a given joint distribution.
L = a desired number of quantization points.
We would like to find:
(1) L events $A_1, \ldots, A_L$ that partition the joint sample space of $X(1), \ldots, X(N)$, and
(2) L quantization points $q_1 \in A_1, \ldots, q_L \in A_L$,
such that the quantized random vector, defined by
$Y = q_k$ if $X \in A_k$, for $k = 1, \ldots, L$,
minimizes the mean-square error,

$$
E\left[\|Y - X\|^2\right] = E\left[\sum_{n=1}^{N} \big(Y(n) - X(n)\big)^2\right]
$$

Difficulty: cannot differentiate with respect to a set $A_k$, and so unless the set of all allowed partitions is somehow restricted, this cannot be solved.
![Page 93: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/93.jpg)
Hopefully, prior discussion gives you some idea about various issues involved in quantization. And now, on to entropy coding…
[Diagram: data → transform → quantization → entropy coding → compressed bitstream]
![Page 95: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/95.jpg)
Problem statement
[Diagram: Source (e.g., image, video, speech signal, or quantizer output) → sequence X(1),…,X(N) → Encoder: mapping between source symbols and binary strings (codewords) → Binary string]
• The source produces a sequence of discrete random variables X(1),…,X(N) (e.g., transformed image pixel values), assumed to be independent and identically distributed over a finite alphabet $\{a_1, \ldots, a_M\}$.
Requirements:
• minimize the expected length of the binary string;
• the binary string needs to be uniquely decodable, i.e., we need to be able to infer X(1),…,X(N) from it!
![Page 96: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/96.jpg)
Problem statement
• Since X(1),…,X(N) are assumed independent in this model, we will encode each of them separately.
• Each can assume any value among $\{a_1, \ldots, a_M\}$.
• Therefore, our code will consist of M codewords, one for each symbol $a_1, \ldots, a_M$.

| symbol | codeword |
|--------|----------|
| $a_1$ | $w_1$ |
| … | … |
| $a_M$ | $w_M$ |
![Page 97: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/97.jpg)
Unique Decodability

| symbol | codeword |
|--------|----------|
| a | 0 |
| b | 1 |
| c | 00 |
| d | 01 |

• How to decode the following string: 0001?
• It could be aaab or aad or acb or cab or cd.
• Not uniquely decodable!
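The ambiguity in this example can be checked by brute force. The sketch below enumerates every way of splitting a bit string into codewords; the function name `all_parsings` is hypothetical, and the code assumes that no codeword is empty.

```python
def all_parsings(bits, code):
    """Return every symbol string whose encoding under `code` equals `bits`.

    code: dict mapping symbol -> codeword (non-empty strings of '0'/'1').
    """
    if not bits:
        return [""]            # exactly one way to parse the empty string
    results = []
    for symbol, word in code.items():
        if bits.startswith(word):
            # Consume this codeword and parse the remainder recursively.
            for rest in all_parsings(bits[len(word):], code):
                results.append(symbol + rest)
    return results

code = {"a": "0", "b": "1", "c": "00", "d": "01"}
parsings = all_parsings("0001", code)
```

A code is uniquely decodable exactly when no bit string has more than one parsing; here `parsings` contains five entries.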
![Page 99: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/99.jpg)
A condition that ensures unique decodability
• Prefix condition: no codeword in the code is a prefix for any other codeword.
• If the prefix condition is satisfied, then the code is uniquely decodable.
– Proof. Take a bit string W that corresponds to two different strings of symbols, A and B. If the first symbols in A and B are the same, discard them and the corresponding portion of W. Repeat until either there are no bits left in W (in this case A=B) or the first symbols in A and B are different. Then one of the codewords corresponding to these two symbols is a prefix for the other.
![Page 102: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/102.jpg)
A condition that ensures unique decodability
• Prefix condition: no codeword in the code is a prefix for any other codeword.
• Visualizing binary strings. Form a binary tree where each branch is labeled 0 or 1. Each codeword w can be associated with the unique node of the tree such that the string of 0’s and 1’s on the path from the root to the node forms w.
• Prefix condition holds if and only if all the codewords are leaves of the binary tree, i.e., if no codeword is a descendant of another codeword.
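The prefix condition is easy to test mechanically. The following is a small sketch with a hypothetical function name, `is_prefix_free`. It relies on the fact that, in lexicographic order, all strings that start with a given word follow that word as one contiguous block, so only neighboring pairs need to be compared.

```python
def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of another codeword."""
    words = sorted(codewords)
    # If some word is a prefix of another, it is also a prefix of its
    # immediate successor in sorted order, so adjacent pairs suffice.
    return all(not nxt.startswith(cur) for cur, nxt in zip(words, words[1:]))

print(is_prefix_free(["0", "1", "00", "01"]))    # 0 is a prefix of 00 and 01
print(is_prefix_free(["1", "01", "000", "001"]))
```

In tree terms, a `True` result means every codeword sits at a leaf.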
![Page 106: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/106.jpg)
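Example: no prefix condition, no unique decodability, one word is not a leaf

| symbol | codeword |
|--------|----------|
| a | 0 |
| b | 1 |
| c | 00 |
| d | 01 |

• Codeword 0 is a prefix for both codeword 00 and codeword 01
[Tree diagram: the root's 0-branch leads to the node $w_a = 0$, whose 0- and 1-branches lead to $w_c = 00$ and $w_d = 01$; the root's 1-branch leads to the leaf $w_b = 1$. The node for $w_a$ is not a leaf.]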
![Page 110: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/110.jpg)
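Example: prefix condition, all words are leaves

| symbol | codeword |
|--------|----------|
| a | 1 |
| b | 01 |
| c | 000 |
| d | 001 |

[Tree diagram: the root's 1-branch leads to the leaf $w_a = 1$; the root's 0-branch leads to an internal node, whose 1-branch leads to the leaf $w_b = 01$ and whose 0-branch leads to a second internal node with leaves $w_c = 000$ and $w_d = 001$.]
• No path from the root to a codeword contains another codeword. This is equivalent to saying that the prefix condition holds.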
![Page 111: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/111.jpg)
Example: prefix condition, all words are leaves => unique decodability
Decoding: traverse the string left to right, tracing the corresponding path from the root of the binary tree. Each time a leaf is reached, output the symbol whose codeword ends at that leaf and go back to the root.
How to decode the following string with the code a → 1, b → 01, c → 000, d → 001?
000001101
• Bits 000 lead from the root to the leaf $w_c = 000$. Output: c
• Bits 001 lead from the root to the leaf $w_d = 001$. Output: cd
• Bit 1 leads from the root to the leaf $w_a = 1$. Output: cda
• Bits 01 lead from the root to the leaf $w_b = 01$. Output: cdab
Final output: cdab
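The decoding procedure described above can be sketched without building the tree explicitly: accumulating bits until they match a codeword is equivalent to walking from the root until a leaf is reached, provided the code satisfies the prefix condition. The function name `decode_prefix_code` is hypothetical.

```python
def decode_prefix_code(bits, code):
    """Decode a bit string using a code that satisfies the prefix condition.

    code: dict mapping symbol -> codeword.
    """
    inverse = {word: symbol for symbol, word in code.items()}
    output, current = [], ""
    for bit in bits:
        current += bit                 # follow one more branch of the tree
        if current in inverse:         # a leaf has been reached
            output.append(inverse[current])
            current = ""               # go back to the root
    if current:
        raise ValueError("bit string ends in the middle of a codeword")
    return "".join(output)

code = {"a": "1", "b": "01", "c": "000", "d": "001"}
decoded = decode_prefix_code("000001101", code)
```

Because no codeword is a prefix of another, the first match found is the only possible one, so the decoder never needs to backtrack.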
![Page 129: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/129.jpg)
Prefix condition and unique decodability
• There are uniquely decodable codes which do not satisfy the prefix condition (e.g., {0, 01}). For any such code, a prefix condition code can be constructed with an identical set of codeword lengths. (E.g., {0, 10} for {0, 01}.)
• For this reason, we can consider just prefix condition codes.
![Page 130: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/130.jpg)
Entropy coding
• Given a discrete random variable X with M possible outcomes (“symbols” or “letters”) $a_1, \ldots, a_M$ and with PMF $p_X$, what is the lowest achievable expected codeword length among all the uniquely decodable codes?
– Answer depends on $p_X$; Shannon’s source coding theorem provides bounds.
• How to construct a prefix condition code which achieves this expected codeword length?
– Answer: Huffman code.
![Page 135: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/135.jpg)
Huffman code
• Consider a discrete r.v. X with M possible outcomes $a_1, \ldots, a_M$ and with PMF $p_X$. Assume that $p_X(a_1) \le \ldots \le p_X(a_M)$. (If this condition is not satisfied, reorder the outcomes so that it is satisfied.)
• Consider “aggregate outcome” $a_{12} = \{a_1, a_2\}$ and a discrete r.v. X′ such that

$$
X' =
\begin{cases}
a_{12} & \text{if } X = a_1 \text{ or } X = a_2 \\
X & \text{otherwise}
\end{cases}
$$

$$
p_{X'}(a) =
\begin{cases}
p_X(a_1) + p_X(a_2) & \text{if } a = a_{12} \\
p_X(a) & \text{if } a = a_3, \ldots, a_M
\end{cases}
$$

• Suppose we have a tree, T′, for an optimal prefix condition code for X′. A tree T for an optimal prefix condition code for X can be obtained from T′ by splitting the leaf $a_{12}$ into two leaves corresponding to $a_1$ and $a_2$.
• We won’t prove this.
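Applying the merging step repeatedly, until a single aggregate outcome remains, gives the usual bottom-up construction. The sketch below is one common way to write it, using a priority queue. The function name `huffman_code`, the tie-breaking counter, and the choice of which branch gets 0 or 1 are assumptions made here; different choices give different codewords with the same expected length.

```python
import heapq
from itertools import count

def huffman_code(pmf):
    """Return {symbol: codeword} for a PMF given as {symbol: probability}."""
    tiebreak = count()   # keeps heap entries comparable when probabilities tie
    # Each heap entry: (probability, tiebreak, {symbol: codeword built so far}).
    heap = [(p, next(tiebreak), {s: ""}) for s, p in pmf.items()]
    heapq.heapify(heap)
    if len(heap) == 1:                       # degenerate one-symbol alphabet
        return {s: "0" for s in pmf}
    while len(heap) > 1:
        # Combine the two least likely (possibly aggregate) outcomes.
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

# Illustrative five-letter PMF.
pmf = {"a1": 0.10, "a2": 0.10, "a3": 0.25, "a4": 0.25, "a5": 0.30}
code = huffman_code(pmf)
expected_length = sum(pmf[s] * len(w) for s, w in code.items())
```

For this PMF the two least likely letters receive 3-bit codewords and the other three receive 2-bit codewords, for an expected length of about 2.2 bits per symbol.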
![Page 139: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/139.jpg)
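Example

| letter | $p_X$(letter) |
|--------|---------------|
| $a_1$ | 0.10 |
| $a_2$ | 0.10 |
| $a_3$ | 0.25 |
| $a_4$ | 0.25 |
| $a_5$ | 0.30 |

Step 1: combine the two least likely letters.

| letter | $p_{X'}$(letter) |
|--------|------------------|
| $a_{12}$ | 0.20 |
| $a_3$ | 0.25 |
| $a_4$ | 0.25 |
| $a_5$ | 0.30 |

[Tree diagram: in the tree for X, the node $a_{12}$ splits into the leaves $a_1$ and $a_2$, with the two branches labeled 0 and 1. The node $a_{12}$ is a leaf of the tree for X′, which is still to be constructed.]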
![Page 143: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/143.jpg)
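Example

Step 2: combine the two least likely letters from the new alphabet, $a_{12}$ and $a_3$, into $a_{123}$.

| letter | $p_{X'}(\text{letter})$ |
|---|---|
| $a_{12}$ | 0.20 |
| $a_3$ | 0.25 |
| $a_4$ | 0.25 |
| $a_5$ | 0.30 |

| letter | $p_{X''}(\text{letter})$ |
|---|---|
| $a_{123}$ | 0.45 |
| $a_4$ | 0.25 |
| $a_5$ | 0.30 |

Tree for $X$ (so far): node $a_{123}$ has children $a_{12}$ (branch 1) and $a_3$ (branch 0); node $a_{12}$ has children $a_1$ (branch 1) and $a_2$ (branch 0). Treating $a_{12}$ as a leaf gives the corresponding part of the tree for $X'$; the tree for $X''$, in which $a_{123}$ is a leaf, is still to be constructed.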
![Page 147: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/147.jpg)
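Example

Step 3: again combine the two least likely letters, $a_4$ and $a_5$, into $a_{45}$.

| letter | $p_{X''}(\text{letter})$ |
|---|---|
| $a_{123}$ | 0.45 |
| $a_4$ | 0.25 |
| $a_5$ | 0.30 |

| letter | $p_{X'''}(\text{letter})$ |
|---|---|
| $a_{123}$ | 0.45 |
| $a_{45}$ | 0.55 |

Tree for $X$ (so far):
• $a_{123}$ has children $a_{12}$ (branch 1) and $a_3$ (branch 0).
• $a_{12}$ has children $a_1$ (branch 1) and $a_2$ (branch 0).
• $a_{45}$ has children $a_4$ (branch 1) and $a_5$ (branch 0).

Treating $a_{12}$ as a leaf gives the tree for $X'$, treating $a_{123}$ as a leaf gives the tree for $X''$, and treating both $a_{123}$ and $a_{45}$ as leaves gives the tree for $X'''$.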
![Page 148: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/148.jpg)
Example

Step 4: combine the last two remaining letters, $a_{123}$ and $a_{45}$, into the root $a_{12345}$. Done!

| letter | $p_{X'''}(\text{letter})$ |
|---|---|
| $a_{123}$ | 0.45 |
| $a_{45}$ | 0.55 |

Tree for $X$:
• $a_{12345}$ (root) has children $a_{123}$ (branch 1) and $a_{45}$ (branch 0).
• $a_{123}$ has children $a_{12}$ (branch 1) and $a_3$ (branch 0).
• $a_{12}$ has children $a_1$ (branch 1) and $a_2$ (branch 0).
• $a_{45}$ has children $a_4$ (branch 1) and $a_5$ (branch 0).

The codeword for each leaf is the sequence of 0's and 1's along the path from the root to that leaf.
![Page 155: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/155.jpg)
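Example

Tree for $X$: the root branches to the subtree containing $a_1$, $a_2$, $a_3$ (bit 1) and to the subtree containing $a_4$, $a_5$ (bit 0). Reading the bits along the path from the root to each leaf gives the codewords:

| letter | $p_X(\text{letter})$ | codeword |
|---|---|---|
| $a_1$ | 0.10 | 111 |
| $a_2$ | 0.10 | 110 |
| $a_3$ | 0.25 | 10 |
| $a_4$ | 0.25 | 01 |
| $a_5$ | 0.30 | 00 |

Expected codeword length: 3(0.1) + 3(0.1) + 2(0.25) + 2(0.25) + 2(0.3) = 2.2 bits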
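The merging procedure of this example can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `huffman_code`, the use of a heap, and the tie-breaking counter are assumptions made here. At each step the two least likely groups are popped, a 1 is prepended to the codewords of the first group and a 0 to those of the second, and the merged group is pushed back.

```python
import heapq
from itertools import count

def huffman_code(pmf):
    """Return {letter: codeword} for a PMF given as {letter: probability}.

    Hypothetical helper for illustration. With a single letter it returns
    an empty codeword, which a real coder would have to special-case.
    """
    tiebreak = count()  # unique counter so tied probabilities never compare dicts
    # Each heap entry: (probability, tiebreak, {letter: partial codeword})
    heap = [(p, next(tiebreak), {letter: ""}) for letter, p in pmf.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, group0 = heapq.heappop(heap)  # least likely group
        p1, _, group1 = heapq.heappop(heap)  # second least likely group
        merged = {s: "1" + w for s, w in group0.items()}
        merged.update({s: "0" + w for s, w in group1.items()})
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

pmf = {"a1": 0.10, "a2": 0.10, "a3": 0.25, "a4": 0.25, "a5": 0.30}
code = huffman_code(pmf)
expected_length = sum(pmf[s] * len(code[s]) for s in pmf)
print(code, expected_length)
```

For the PMF of this example the codeword lengths come out as 3, 3, 2, 2, 2 and the expected length as 2.2 bits. The exact bit patterns depend on how ties are broken and on which branch is labeled 0 or 1, so other implementations may produce a different but equally good code.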
![Page 160: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/160.jpg)
Self-information
• Consider again a discrete random variable $X$ with $M$ possible outcomes $a_1,\ldots,a_M$ and with PMF $p_X$.
• Self-information of outcome $a_m$ is $I(a_m) = -\log_2 p_X(a_m)$ bits.
• E.g., if $p_X(a_m) = 1$, then $I(a_m) = 0$: the occurrence of $a_m$ is not at all informative, since it had to occur. The smaller the probability of an outcome, the larger its self-information.
• Self-information of $X$ is $I(X) = -\log_2 p_X(X)$ and is a random variable.
• Entropy of $X$ is the expected value of its self-information:

$$H(X) = E[I(X)] = -\sum_{m=1}^{M} p_X(a_m)\log_2 p_X(a_m)$$
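As a quick numerical illustration of these definitions, the sketch below computes self-information and entropy for the five-letter PMF used in the Huffman example. The function names `self_information` and `entropy` are chosen here for illustration, and skipping zero-probability outcomes is an assumption (the usual convention that they contribute nothing to the sum).

```python
import math

def self_information(p):
    """Self-information -log2(p), in bits, of an outcome with probability p > 0."""
    return -math.log2(p)

def entropy(pmf):
    """Entropy in bits: the expected self-information.

    Outcomes with zero probability are skipped (assumed convention).
    """
    return sum(p * self_information(p) for p in pmf if p > 0)

pmf = [0.10, 0.10, 0.25, 0.25, 0.30]
H = entropy(pmf)
print(self_information(0.25), H)
```

For this PMF the entropy is roughly 2.185 bits, slightly below the 2.2-bit expected length of the Huffman code found in the example.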
![Page 161: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/161.jpg)
Source coding theorem (Shannon)
For any uniquely decodable code, the expected codeword length is $\geq H(X)$. Moreover, there exists a prefix condition code for which the expected codeword length is $< H(X) + 1$.
![Page 167: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/167.jpg)
Example
• Suppose that $X$ has $M = 2^K$ possible outcomes $a_1,\ldots,a_M$.
• Suppose that $X$ is uniform, i.e., $p_X(a_1) = \ldots = p_X(a_M) = 2^{-K}$. Then

$$H(X) = E[I(X)] = -\sum_{k=1}^{2^K} 2^{-K}\log_2\left(2^{-K}\right) = 2^K\left(-2^{-K}\right)(-K) = K$$

• On the other hand, observe that there exist $2^K$ different $K$-bit sequences. Thus, a fixed-length code for $X$ that uses all these $2^K$ $K$-bit sequences as codewords for the $2^K$ outcomes of $X$ will have expected codeword length $K$.
• I.e., for this particular random variable, this fixed-length code achieves the entropy of $X$, which is the lower bound given by the source coding theorem.
• Therefore, the $K$-bit fixed-length code is optimal for this $X$.
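The claim is easy to check numerically for a small $K$. In the sketch below, `uniform_entropy` and `fixed_length_code` are hypothetical helper names, and $K \geq 1$ is assumed.

```python
import math

def uniform_entropy(K):
    """Entropy in bits of a uniform PMF on 2**K outcomes, summed term by term."""
    M = 2 ** K
    p = 1.0 / M
    return -sum(p * math.log2(p) for _ in range(M))

def fixed_length_code(K):
    """All 2**K binary strings of length K (K >= 1), one per outcome."""
    return [format(m, f"0{K}b") for m in range(2 ** K)]

K = 3
codewords = fixed_length_code(K)
print(uniform_entropy(K), codewords)
```

With $K = 3$, the entropy evaluates to 3 bits and the code consists of the eight distinct 3-bit strings, so every codeword has length equal to the entropy.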
![Page 168: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/168.jpg)
Lemma 1: An auxiliary result helpful for proving the source coding theorem
• $\log_2\alpha \leq (\alpha-1)\log_2 e$ for $\alpha > 0$.
• Proof: differentiate $g(\alpha) = (\alpha-1)\log_2 e - \log_2\alpha$ and show that $g(1) = 0$ is its minimum.
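One way to fill in the differentiation step, using the identity $\log_2\alpha = \ln\alpha \cdot \log_2 e$ (a sketch of the argument, not taken from the slides):

$$
\begin{aligned}
g'(\alpha) &= \log_2 e - \frac{\log_2 e}{\alpha} = \left(1 - \frac{1}{\alpha}\right)\log_2 e, \\
g'(\alpha) &< 0 \ \text{for } 0 < \alpha < 1, \qquad g'(1) = 0, \qquad g'(\alpha) > 0 \ \text{for } \alpha > 1.
\end{aligned}
$$

So $g$ decreases on $(0,1)$ and increases on $(1,\infty)$, which gives $g(\alpha) \geq g(1) = 0$ for all $\alpha > 0$, i.e., $\log_2\alpha \leq (\alpha-1)\log_2 e$.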
![Page 169: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/169.jpg)
Another auxiliary result: Kraft inequality
If integers $d_1,\ldots,d_M$ satisfy the inequality

$$\sum_{m=1}^{M} 2^{-d_m} \leq 1, \qquad (1)$$

then there exists a prefix condition code whose codeword lengths are these integers. Conversely, the codeword lengths of any prefix condition code satisfy this inequality.
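The inequality is simple to evaluate for a given set of lengths. The sketch below uses exact rational arithmetic to avoid rounding questions; `kraft_sum` and `is_prefix_free` are names introduced here for illustration.

```python
from fractions import Fraction

def kraft_sum(lengths):
    """Left-hand side of the Kraft inequality, computed exactly."""
    return sum(Fraction(1, 2 ** d) for d in lengths)

def is_prefix_free(codewords):
    """True if the codewords are distinct and none is a prefix of another."""
    if len(set(codewords)) != len(codewords):
        return False
    return not any(u != v and v.startswith(u) for u in codewords for v in codewords)

huffman_words = ["111", "110", "10", "01", "00"]  # code from the earlier example
print(kraft_sum(len(w) for w in huffman_words), is_prefix_free(huffman_words))
print(kraft_sum([1, 1, 2]))  # exceeds 1, so no prefix condition code has these lengths
```

The Huffman code from the earlier example has lengths 3, 3, 2, 2, 2, for which the sum is exactly 1. The lengths 1, 1, 2 give 5/4, so by the converse part of the statement no prefix condition code can have them.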
![Page 172: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/172.jpg)
Some useful facts about full binary trees
A full binary tree of depth $D$ has $2^D$ leaves. (Here, depth is $D = 4$ and the number of leaves is $2^4 = 16$.)
In a full binary tree of depth $D$, each node at depth $d$ has $2^{D-d}$ leaf descendants. (Here, $D = 4$, the red node is at depth $d = 2$, and so it has $2^{4-2} = 4$ leaf descendants.)
Figure: full binary tree of depth $D = 4$, with one node at depth 2 marked in red.
![Page 180: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/180.jpg)
Kraft inequality: proof of ⇒
Suppose $d_1 \leq \ldots \leq d_M$ satisfy (1). Consider the full binary tree of depth $d_M$, and consider all its nodes at depth $d_1$. Assign one of these nodes to symbol $a_1$. Consider all the nodes at depth $d_2$ which are not $a_1$ and not descendants of $a_1$. Assign one of them to symbol $a_2$. Iterate like this $M$ times.

If we have run out of tree nodes to assign after $r < M$ iterations, it means that every leaf in the full binary tree of depth $d_M$ is a descendant of one of the first $r$ symbols, $a_1,\ldots,a_r$. But note that every node at depth $d_m$ has $2^{d_M-d_m}$ leaf descendants. Note also that the full tree has $2^{d_M}$ leaves. Therefore, if every leaf in the tree is a descendant of $a_1,\ldots,a_r$, then

$$\sum_{m=1}^{r} 2^{d_M-d_m} = 2^{d_M} \iff \sum_{m=1}^{r} 2^{-d_m} = 1.$$

Therefore,

$$\sum_{m=1}^{M} 2^{-d_m} = \sum_{m=1}^{r} 2^{-d_m} + \sum_{m=r+1}^{M} 2^{-d_m} > 1.$$

This violates (1). Thus, our procedure can in fact go on for $M$ iterations. After the $M$-th iteration, we will have constructed a prefix condition code with codeword lengths $d_1,\ldots,d_M$.
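The constructive argument above translates into a short procedure. The sketch below is one possible realization, with the assumption that at each step the leftmost available node is chosen (the proof allows any available node); the name `prefix_code_from_lengths` is hypothetical. A node at depth $d$ is represented by an integer written with $d$ bits, and adding 1 moves to the next node at the same depth, which skips all descendants of the node just assigned.

```python
from fractions import Fraction

def prefix_code_from_lengths(lengths):
    """Build a prefix condition code with the given positive integer lengths.

    Raises ValueError if the lengths violate the Kraft inequality.
    """
    if any(d < 1 for d in lengths):
        raise ValueError("codeword lengths must be positive integers")
    if sum(Fraction(1, 2 ** d) for d in lengths) > 1:
        raise ValueError("lengths violate the Kraft inequality")
    # Process symbols in order of nondecreasing length, as in the proof.
    order = sorted(range(len(lengths)), key=lambda m: lengths[m])
    codewords = [None] * len(lengths)
    node, depth = 0, 0  # leftmost node not yet covered, at the current depth
    for m in order:
        d = lengths[m]
        node <<= d - depth  # descend to depth d along 0-branches
        depth = d
        codewords[m] = format(node, f"0{d}b")
        node += 1  # next node at this depth, outside the assigned subtree
    return codewords

print(prefix_code_from_lengths([3, 3, 2, 2, 2]))
```

For the lengths 3, 3, 2, 2, 2 this produces a prefix condition code with exactly those lengths, though not necessarily the same bit patterns as the Huffman code found earlier.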
![Page 185: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/185.jpg)
Kraft inequality: proof of ⇐
Suppose $d_1 \leq \ldots \leq d_M$, and suppose we have a prefix condition code with these codeword lengths. Consider the binary tree corresponding to this code. Complete this tree to obtain a full tree of depth $d_M$. Again use the following facts:
• the full tree has $2^{d_M}$ leaves;
• the number of leaf descendants of the codeword of length $d_m$ is $2^{d_M-d_m}$.

The combined number of all leaf descendants of all codewords must be less than or equal to the total number of leaves in the full tree:

$$\sum_{m=1}^{M} 2^{d_M-d_m} \leq 2^{d_M} \iff \sum_{m=1}^{M} 2^{-d_m} \leq 1.$$
![Page 193: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/193.jpg)
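Source coding theorem: proof of $H(X) \leq E[C]$
Let $d_m$ be the codeword length for $a_m$, and let random variable $C$ be the codeword length for $X$.

$$
\begin{aligned}
H(X) - E[C] &= -\sum_{m=1}^{M} p_X(a_m)\log_2 p_X(a_m) - \sum_{m=1}^{M} p_X(a_m)\,d_m \\
&= \sum_{m=1}^{M} p_X(a_m)\left[\log_2\left(\frac{1}{p_X(a_m)}\right) - \log_2 2^{d_m}\right] \\
&= \sum_{m=1}^{M} p_X(a_m)\left[\log_2\left(\frac{1}{p_X(a_m)\,2^{d_m}}\right)\right] \\
&\leq \sum_{m=1}^{M} p_X(a_m)\left(\frac{1}{p_X(a_m)\,2^{d_m}} - 1\right)\log_2 e \quad \text{(by Lemma 1)} \\
&= \left(\sum_{m=1}^{M} \frac{1}{2^{d_m}} - \sum_{m=1}^{M} p_X(a_m)\right)\log_2 e \\
&= \left(\sum_{m=1}^{M} 2^{-d_m} - 1\right)\log_2 e \leq 0.
\end{aligned}
$$

By the Kraft inequality, the last step holds for any prefix condition code. But it is also true for any uniquely decodable code, because the codeword lengths of a uniquely decodable code satisfy the same inequality (a result known as the Kraft-McMillan inequality).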
![Page 202: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/202.jpg)
Source coding theorem: how to satisfy $E[C] < H(X) + 1$?
Choose $d_m = \lceil -\log_2 p_X(a_m) \rceil$ (where $\lceil x \rceil$ stands for the smallest integer which is $\geq x$). Then

$$d_m \geq -\log_2 p_X(a_m) \;\Rightarrow\; -d_m \leq \log_2 p_X(a_m) \;\Rightarrow\; 2^{-d_m} \leq p_X(a_m) \;\Rightarrow\; \sum_{m=1}^{M} 2^{-d_m} \leq \sum_{m=1}^{M} p_X(a_m) = 1.$$

Therefore, the Kraft inequality is satisfied, and we can construct a prefix condition code with codeword lengths $d_1,\ldots,d_M$. Also, by construction,

$$
\begin{aligned}
d_m - 1 < -\log_2 p_X(a_m) \;&\Rightarrow\; d_m < -\log_2 p_X(a_m) + 1 \\
&\Rightarrow\; p_X(a_m)\,d_m < -p_X(a_m)\log_2 p_X(a_m) + p_X(a_m) \\
&\Rightarrow\; \sum_{m=1}^{M} p_X(a_m)\,d_m < \sum_{m=1}^{M} \left(-p_X(a_m)\log_2 p_X(a_m) + p_X(a_m)\right) \\
&\Rightarrow\; E[C] < \sum_{m=1}^{M} \left(-p_X(a_m)\log_2 p_X(a_m)\right) + \sum_{m=1}^{M} p_X(a_m) = H(X) + 1.
\end{aligned}
$$
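The choice of lengths in this proof can be tried out on the five-letter PMF from the Huffman example. The sketch below is illustrative only: `shannon_lengths` is a name introduced here, all probabilities are assumed strictly positive, and floating-point rounding in `math.log2` could in principle shift a length by one when a probability is extremely close to a power of 1/2.

```python
import math

def shannon_lengths(pmf):
    """Codeword lengths d_m = ceil(-log2 p_m) for strictly positive probabilities."""
    return [math.ceil(-math.log2(p)) for p in pmf]

pmf = [0.10, 0.10, 0.25, 0.25, 0.30]
lengths = shannon_lengths(pmf)
kraft = sum(2.0 ** -d for d in lengths)            # left-hand side of (1)
H = -sum(p * math.log2(p) for p in pmf)            # entropy in bits
EC = sum(p * d for p, d in zip(pmf, lengths))      # expected codeword length
print(lengths, kraft, H, EC)
```

Here the lengths are 4, 4, 2, 2, 2, the Kraft sum is 0.875, and the expected length is about 2.4 bits, which lies between $H(X) \approx 2.185$ and $H(X) + 1$. The bound is met, but these lengths are not optimal: the Huffman code for the same PMF achieves 2.2 bits.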
![Page 207: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/207.jpg)
Note: Huffman code may often be very far from the entropy
• Let $X$ have two outcomes, $a_1$ and $a_2$, with probabilities $1-2^{-d}$ and $2^{-d}$, respectively.
• Huffman code: 0 for $a_1$; 1 for $a_2$.
• Expected codeword length: 1.
• Entropy: $-(1-2^{-d})\log_2(1-2^{-d}) + d\,2^{-d} \approx 0$ for large $d$. For example, if $d=20$, this is 0.0000204493.
• Problem: no codeword can have a fractional number of bits!
• If we have a source which produces independent random variables $X_1, X_2, \dots$, all identically distributed to $X$, a single Huffman code can be constructed for several of them, effectively resulting in a fractional number of bits per random variable.
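The gap between the one-bit Huffman code and the entropy can be reproduced with a few lines of Python. This is only an illustrative sketch: the helper name `binary_entropy` is an assumption, and `math.log1p` is used for $\log(1-p)$ purely for numerical accuracy when $p$ is tiny.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a two-outcome source with probabilities 1-p and p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    log2_one_minus_p = math.log1p(-p) / math.log(2.0)  # accurate for small p
    return -(1.0 - p) * log2_one_minus_p - p * math.log2(p)

d = 20
H = binary_entropy(2.0 ** -d)
huffman_length = 1.0  # one bit per outcome: 0 for a1, 1 for a2

print(f"entropy for d={d}: {H:.10f} bits")
print(f"Huffman expected length: {huffman_length} bit")
```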
![Page 212: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/212.jpg)
Example
• $(X_1,X_2)$ will have four outcomes, $(a_1,a_1)$, $(a_1,a_2)$, $(a_2,a_1)$, $(a_2,a_2)$, with probabilities $1-2^{-d+1}+2^{-2d}$, $2^{-d}-2^{-2d}$, $2^{-d}-2^{-2d}$, and $2^{-2d}$, respectively.
• Huffman code: 0 for $(a_1,a_1)$; 10 for $(a_1,a_2)$; 110 for $(a_2,a_1)$; 111 for $(a_2,a_2)$.
• Expected codeword length per random variable:
– $\bigl[(1-2^{-d+1}+2^{-2d}) + 2(2^{-d}-2^{-2d}) + 3(2^{-d}-2^{-2d}) + 3(2^{-2d})\bigr]/2 = \tfrac{1}{2} + \tfrac{3}{2}\,2^{-d} - \tfrac{1}{2}\,2^{-2d}$
– This is 0.500001 for $d=20$.
• Can get arbitrarily close to the entropy by encoding longer sequences of $X_k$'s.
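The pair example can be checked by building a Huffman code programmatically. The following is a minimal sketch using Python's `heapq`; the function name `huffman_lengths` and the tie-breaking counter are illustrative choices, not part of the slides. Because $(a_1,a_2)$ and $(a_2,a_1)$ are equally likely, the code may swap which of them gets length 2 and which gets length 3, but the expected length is unaffected.

```python
import heapq
import itertools

def huffman_lengths(probs):
    """Return Huffman codeword lengths (in bits) for a list of probabilities."""
    if len(probs) == 1:
        return [1]
    tie = itertools.count()  # tie-breaker so the heap never compares lists
    heap = [(p, next(tie), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        # Merge the two least likely nodes; every symbol below them gains one bit.
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tie), s1 + s2))
    return lengths

d = 20
q = 2.0 ** -d   # P(X = a2)
p = 1.0 - q     # P(X = a1)

# Outcomes in the order (a1,a1), (a1,a2), (a2,a1), (a2,a2).
pair_probs = [p * p, p * q, q * p, q * q]
lengths = huffman_lengths(pair_probs)
per_symbol = sum(pr * n for pr, n in zip(pair_probs, lengths)) / 2

print("codeword lengths:", lengths)
print(f"expected bits per random variable: {per_symbol:.6f}")
```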
![Page 213: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/213.jpg)
Source coding theorem for sequences of independent, identically distributed random variables
Suppose we are jointly encoding independent, identically distributed discrete random variables $X_1,\dots,X_N$, each taking values in $\{a_1,\dots,a_M\}$. For any uniquely decodable code, the expected codeword length per symbol is $\geq H(X_n)$. Moreover, there exists a prefix condition code for which the expected codeword length per symbol is $< H(X_n) + \frac{1}{N}$.
![Page 219: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/219.jpg)
Proof of the source coding theorem for iid sequences
Consider the random vector $\mathbf{X} = (X_1,\dots,X_N)$. Because $X_1,\dots,X_N$ are independent, the self-information of its outcome $\mathbf{x} = (x_1,\dots,x_N)$ is

$$I(\mathbf{x}) = -\log_2 p_{X_1,\dots,X_N}(x_1,\dots,x_N) = -\sum_{n=1}^{N} \log_2 p_{X_n}(x_n) = \sum_{n=1}^{N} I(x_n).$$

Therefore, the entropy of $\mathbf{X}$ is

$$H(\mathbf{X}) = E\bigl[I(\mathbf{X})\bigr] = E\left[\sum_{n=1}^{N} I(X_n)\right] = \sum_{n=1}^{N} H(X_n) = N\,H(X_n).$$

Therefore, applying the single-symbol source coding theorem to $\mathbf{X}$, we have:

$$H(\mathbf{X}) \le E[C_N] < H(\mathbf{X}) + 1,$$

$$N\,H(X_n) \le E[C_N] < N\,H(X_n) + 1,$$

$$H(X_n) \le E[C] < H(X_n) + \frac{1}{N},$$

where $E[C_N]$ is the expected codeword length for the optimal uniquely decodable code for $\mathbf{X}$, and $E[C] = \frac{E[C_N]}{N}$ is the corresponding expected codeword length per symbol.
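The per-symbol bounds can be illustrated numerically by Huffman-coding blocks of increasing length $N$. The sketch below makes several assumptions: the pmf `[0.9, 0.1]` is a hypothetical example, the helper names are illustrative, and the expected Huffman length is computed with the standard identity that it equals the sum of the probabilities of all merged (internal) nodes. `math.prod` requires Python 3.8 or later.

```python
import heapq
import itertools
import math

def huffman_expected_length(probs):
    """Expected Huffman codeword length: sum of all merged-node probabilities."""
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged  # each merge adds one bit to every symbol beneath it
        heapq.heappush(heap, merged)
    return total

def block_probs(pmf, n):
    """Joint pmf of n iid symbols, as a flat list over all length-n outcomes."""
    return [math.prod(combo) for combo in itertools.product(pmf, repeat=n)]

pmf = [0.9, 0.1]  # hypothetical single-symbol pmf
H = -sum(p * math.log2(p) for p in pmf)

results = []
for n in range(1, 9):
    per_symbol = huffman_expected_length(block_probs(pmf, n)) / n
    results.append((n, per_symbol))
    print(f"N={n}: H={H:.4f} <= E[C]={per_symbol:.4f} < H+1/N={H + 1.0 / n:.4f}")
```

Each printed line should satisfy $H(X_n) \le E[C] < H(X_n) + 1/N$, with the upper bound tightening as $N$ grows.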
![Page 220: 4. Quantization and Data Compression](https://reader034.vdocuments.mx/reader034/viewer/2022052406/589d791f1a28ab6d4a8b9363/html5/thumbnails/220.jpg)
Arithmetic coding
• Another form of entropy coding.
• More amenable to coding long sequences of symbols than Huffman coding.
• Can be used in conjunction with on-line learning of conditional probabilities to encode dependent sequences of symbols:
– Q-coder in JPEG (JPEG also has a Huffman coding option)
– QM-coder in JBIG
– MQ-coder in JPEG-2000
– CABAC coder in H.264/MPEG-4 AVC