DATA COMMUNICATIONS
Compression Techniques
DATA COMPRESSION
Whether data, fax, video, audio, etc., compression can work wonders
Compression can be lossless or lossy
HUFFMAN CODES
A frequency-dependent code
Usually a smaller alphabet
Must know the frequency of occurrence of each character in the alphabet
Order characters from highest to lowest frequency, or vice versa
Select the two smallest percentages; must be adjacent
HUFFMAN CODES
Example: a frequency dependent code
A 8%
B 12%
C 10%
D 6%
E 18%
F 7%
G 20%
H 16%
I 3%
HUFFMAN CODES
Now send string A B C D E F G
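As a rough sketch of how the codes for the table above could be built, the Python below repeatedly merges the two smallest percentages. The names `FREQS` and `huffman_codes` are made up for this illustration, and which branch gets 0 or 1 is an arbitrary choice, so only the code lengths are fixed by the frequencies.

```python
import heapq
from itertools import count

# Frequencies (percent) from the example slide.
FREQS = {"A": 8, "B": 12, "C": 10, "D": 6, "E": 18,
         "F": 7, "G": 20, "H": 16, "I": 3}

def huffman_codes(freqs):
    """Return a dict mapping each symbol to its Huffman bit string."""
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), {sym: ""}) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Select the two smallest weights and merge them into one node.
        w1, _, low = heapq.heappop(heap)
        w2, _, high = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (w1 + w2, next(tiebreak), merged))
    return heap[0][2]

codes = huffman_codes(FREQS)
encoded = "".join(codes[s] for s in "ABCDEFG")
print(codes)
print(encoded, len(encoded), "bits")
```

With these frequencies the string A B C D E F G takes 23 bits, against 28 bits for a fixed 4-bit code per character.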
RUN-LENGTH ENCODING
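Replace runs of 0s with a count of how many 0s.
00000000000000100000000011000000000000000000001000…001100000000000
(the “…” marks a run of 30 0s)
Run lengths: 14 9 0 20 30 0 11
As 4-bit counts: 1110 1001 0000 1111 0101 1111 1111 0000 0000 1011
(a run of 15 or more is sent as 1111 followed by the remainder: 20 = 15 + 5, 30 = 15 + 15 + 0)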
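The counting scheme on this slide can be sketched in Python as follows. The function names are invented for this example, and the handling of trailing 0s (sent as one last count) is an assumption based on the final 11 in the slide.

```python
def zero_run_lengths(bits):
    """Count the 0s before each 1 in a string of '0'/'1' characters."""
    runs, run = [], 0
    for bit in bits:
        if bit == "0":
            run += 1
        else:                 # a 1 ends the current run (possibly length 0)
            runs.append(run)
            run = 0
    if run:                   # assumption: trailing 0s become a final count
        runs.append(run)
    return runs

def encode_runs(runs, width=4):
    """Write each run as fixed-width binary; all 1s means 'keep counting'."""
    limit = (1 << width) - 1  # 15 for 4-bit counts
    groups = []
    for run in runs:
        while run >= limit:
            groups.append("1" * width)
            run -= limit
        groups.append(format(run, f"0{width}b"))
    return groups

# The bit string from the slide, with the 30-zero run written out.
bits = "0" * 14 + "1" + "0" * 9 + "11" + "0" * 20 + "1" + "0" * 30 + "11" + "0" * 11
runs = zero_run_lengths(bits)
print(runs)
print(" ".join(encode_runs(runs)))
```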
LEMPEL-ZIV ENCODING
Replace character strings with codes
Problems:
How do we make it dynamic? (Whatever the most frequently occurring strings are, compress those.)
How do we find those strings?
How does the receiver know which code stands for THE?
Very popular algorithm, used in PKZIP, V.42bis modems, and others
LEMPEL-ZIV ENCODING
Works best on large files
Typical performance of Lempel-Ziv:
Program file: reduces to 44% of original size
Text file: reduces to 64% of original size
Image file: reduces to 88% of original size
LEMPEL-ZIV ENCODING
To begin, store each character with its ASCII value (codes 0 to 127, so 128 values)
Then set the variable Buff = the first character from the text file and set Next = the next character from the file. Then perform the following steps:
LEMPEL-ZIV ENCODING
Temp = concat(Buff, Next)
Is Temp in the code table?
  Yes: Buff = Temp
  No: send the code associated with Buff,
      assign a code to Temp and store both in the code table,
      Buff = Next
Get the next character from the input string and assign it to Next
Repeat all steps until end-of-file, then send the code associated with Buff
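The steps above can be sketched in Python roughly as follows. The function name `lzw_encode` is invented for this example, and the table is assumed to start with the 128 ASCII codes, so the first multi-character string gets code 128.

```python
def lzw_encode(text):
    """Return the list of codes sent for an ASCII string."""
    if not text:
        return []
    table = {chr(i): i for i in range(128)}  # single characters, ASCII codes
    next_code = 128                          # first multi-character code
    codes = []
    buff = text[0]
    for nxt in text[1:]:
        temp = buff + nxt
        if temp in table:
            buff = temp                      # keep growing the match
        else:
            codes.append(table[buff])        # send the code for Buff
            table[temp] = next_code          # store Temp under a new code
            next_code += 1
            buff = nxt
    codes.append(table[buff])                # flush the last match at end-of-file
    return codes

print(lzw_encode("the thing in this is this"))
```

For "the thing in this is this" this sends 18 codes for 25 characters.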
LEMPEL-ZIV ENCODING
Try the string: “the thing in this is this”
t = 116, h = 104, e = 101, space = 32
AN LZ ENCODING EXAMPLE
Initialize: store each character with its ASCII value (codes 0 to 127)
So, the first multiple-character code will be 128
String = “the thing in this is this”
Send code 116 (‘t’), Store “th” as code 128
Send code 104 (‘h’), Store “he” as code 129
Send code 101 (‘e’), Store “e_” as code 130
Send code 32 (‘_’), Store “_t” as code 131
Send code 128 (‘th’), Store “thi” as code 132
LZ ENCODING EXAMPLE (CONT’D)
String = “the thing in this is this”
Send code 105 (‘i’), Store “in” as code 133
Send code 110 (‘n’), Store “ng” as code 134
Send code 103 (‘g’), Store “g_” as code 135
Send code 32 (‘_’), Store “_i” as code 136
Send code 133 (‘in’), Store “in_” as code 137
Send code 131 (‘_t’), Store “_th” as code 138
Send code 104 (‘h’), Store “hi” as code 139
Send code 105 (‘i’), Store “is” as code 140
Send code 115 (‘s’), Store “s_” as code 141
LZ ENCODING EXAMPLE (CONT’D)
String = “the thing in this is this”
Send code 136 (‘_i’), Store “_is” as code 142
Send code 141 (‘s_’), Store “s_t” as code 143
Send code 132 (‘thi’), Store “this” as code 144
Send code 115 (‘s’)
LEMPEL-ZIV DECODING
After you transmit the string, how is the compressed code decoded?
Note - the ONLY things transmitted are the code values
LZ DECODING
LZ Decoding Algorithm:
Initialize dictionary to contain all single characters and their codes (ASCII)
Repeat
  Receive code
  Look up associated character block, B, in the dictionary
  Take the last received block plus the first character of block B and add this block with a new code to the dictionary (nothing is added for the first code)
  Output the character block B
Until no more codes are received
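A minimal Python sketch of this decoding loop is shown below; `lzw_decode` is a name made up for the example. It also handles one case the slide does not mention: a received code that is not in the dictionary yet, which can only be the entry about to be created (last block plus its own first character).

```python
def lzw_decode(codes):
    """Rebuild the text from a list of LZ codes."""
    table = {i: chr(i) for i in range(128)}  # single characters, ASCII codes
    next_code = 128
    prev = None                              # last received block
    out = []
    for code in codes:
        if code in table:
            block = table[code]
        elif prev is not None and code == next_code:
            # Not on the slide: the code refers to the entry being built.
            block = prev + prev[0]
        else:
            raise ValueError(f"unexpected code {code}")
        if prev is not None:
            table[next_code] = prev + block[0]  # last block + first char of B
            next_code += 1
        out.append(block)
        prev = block
    return "".join(out)

codes = [116, 104, 101, 32, 128, 105, 110, 103, 32,
         133, 131, 104, 105, 115, 136, 141, 132, 115]
print(lzw_decode(codes))
```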
LZ DECODING EXAMPLE
Receive code 116, Output t
Receive code 104, Store “th” as code 128, Output h
Receive code 101, Store “he” as code 129, Output e
Receive code 32, Store “e_” as code 130, Output _
Receive code 128, Store “_t” as code 131, Output th
Receive code 105, Store “thi” as code 132, Output i
Receive code 110, Store “in” as code 133, Output n
Receive code 103, Store “ng” as code 134, Output g
Receive code 32, Store “g_” as code 135, Output _
Receive code 133, Store “_i” as code 136, Output in
LZ DECODING EXAMPLE (CONT’D)
Receive code 131, Store “in_” as code 137, Output _t
Receive code 104, Store “_th” as code 138, Output h
Receive code 105, Store “hi” as code 139, Output i
Receive code 115, Store “is” as code 140, Output s
Receive code 136, Store “s_” as code 141, Output _i
Receive code 141, Store “_is” as code 142, Output s_
Receive code 132, Store “s_t” as code 143, Output thi
Receive code 115, Store “this” as code 144, Output s
Resulting output: the thing in this is this
RELATIVE OR DIFFERENTIAL ENCODING
Video does not compress well using Huffman or run-length encoding
In one color video frame, not much is alike
But what about from frame to frame?
Send a frame, store it in a buffer
Next frame is just the difference from the previous frame
Then store that frame in the buffer, etc.
5 7 6 2 8 6 6 3 5 6
6 5 7 5 5 6 3 2 4 7
8 4 6 8 5 6 4 8 8 5
5 1 2 9 8 6 5 5 6 6
First Frame
5 7 6 2 8 6 6 3 5 6
6 5 7 6 5 6 3 2 3 7
8 4 6 8 5 6 4 8 8 5
5 1 3 9 8 6 5 5 7 6
Second Frame
0 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 -1 0
0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 1 0
Difference
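The two frames above can be run through a small Python sketch of differential encoding. The function names are illustrative only. The sender transmits the element-by-element difference, and the receiver adds it to its buffered copy of the previous frame.

```python
FIRST = [
    [5, 7, 6, 2, 8, 6, 6, 3, 5, 6],
    [6, 5, 7, 5, 5, 6, 3, 2, 4, 7],
    [8, 4, 6, 8, 5, 6, 4, 8, 8, 5],
    [5, 1, 2, 9, 8, 6, 5, 5, 6, 6],
]
SECOND = [
    [5, 7, 6, 2, 8, 6, 6, 3, 5, 6],
    [6, 5, 7, 6, 5, 6, 3, 2, 3, 7],
    [8, 4, 6, 8, 5, 6, 4, 8, 8, 5],
    [5, 1, 3, 9, 8, 6, 5, 5, 7, 6],
]

def frame_difference(previous, current):
    """Sender side: current frame minus the buffered previous frame."""
    return [[c - p for p, c in zip(prow, crow)]
            for prow, crow in zip(previous, current)]

def apply_difference(previous, diff):
    """Receiver side: rebuild the current frame from buffer plus difference."""
    return [[p + d for p, d in zip(prow, drow)]
            for prow, drow in zip(previous, diff)]

diff = frame_difference(FIRST, SECOND)
for row in diff:
    print(row)
assert apply_difference(FIRST, diff) == SECOND
```

Only 4 of the 40 difference values are non-zero, which is what makes the difference frame a good candidate for run-length encoding.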
IMAGE COMPRESSION
One image: JPEG; continuous images such as video: MPEG
A color picture can be defined by red/green/blue, or luminance/chrominance/chrominance, which are based on RGB values
Either way, you have 3 values, each 8 bits, or 24 bits total (2^24 colors!)
IMAGE COMPRESSION
A VGA screen is 640 x 480 pixels
24 bits x 640 x 480 = 7,372,800 bits. Ouch!
And video comes at you 30 images per second. Double Ouch!
We need compression!
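The arithmetic on this slide, extended to one second of video, can be checked with a few lines of Python. The variable names are made up for this example; the numbers come from the slide.

```python
WIDTH, HEIGHT = 640, 480      # VGA screen
BITS_PER_PIXEL = 24           # three 8-bit values per pixel
FRAMES_PER_SECOND = 30

bits_per_frame = BITS_PER_PIXEL * WIDTH * HEIGHT
bits_per_second = bits_per_frame * FRAMES_PER_SECOND

print(f"{bits_per_frame:,} bits per frame")
print(f"{bits_per_second:,} bits per second uncompressed")
```

That is about 221 million bits for every second of uncompressed video.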
JPEG
Joint Photographic Experts Group
Compresses still images
Lossy
JPEG compression consists of 3 phases:
  Discrete cosine transformation (DCT)
  Quantization
  Encoding
JPEG STEP 1 - DCT
Divide image into a series of 8x8 pixel blocks
If the original image was 640x480 pixels, the new picture would be 80 blocks x 60 blocks (next slide)
If B&W, each pixel in 8x8 block is an 8-bit value (0-255)
[Figure: 640 x 480 VGA screen image divided into 8 x 8 pixel blocks, 80 blocks across by 60 blocks down]
JPEG STEP 1 - DCT
If color, each pixel is 24 bits, or three 8-bit groups
Thus, each 8x8 block of pixels is represented by three 8x8 arrays, one per component
B&W or color, the DCT is applied to these 8x8 arrays
JPEG STEP 1 - DCT
So what does DCT do? Takes an 8x8 array (P) and produces a new 8x8 array (T) using cosines
T matrix contains a collection of values called spatial frequencies. These spatial frequencies relate directly to how much the pixel values change as a function of their positions in the block
JPEG STEP 1 - DCT
An image with uniform color changes (little fine detail) has a P array with closely similar values and a corresponding T array with many zero values (next slide)
An image with large color changes over a small area (lots of fine detail) has a P array with widely changing values, and thus a T array with many non-zero values
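To make the P-to-T step concrete, here is a sketch of an 8x8 DCT in plain Python. It assumes the usual DCT-II scaling used in JPEG texts, which the slides do not spell out, and the name `dct2` is invented for this example. A uniform block gives a single non-zero value in the upper left; a checkerboard (lots of fine detail) spreads into more non-zero values.

```python
import math

N = 8

def dct2(P):
    """Forward 8x8 DCT: pixel array P -> spatial-frequency array T."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    T = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (P[x][y]
                              * math.cos((2 * x + 1) * i * math.pi / 16)
                              * math.cos((2 * y + 1) * j * math.pi / 16))
            T[i][j] = 0.25 * c(i) * c(j) * total
    return T

def count_nonzero(T, eps=1e-6):
    """Number of coefficients whose magnitude exceeds eps."""
    return sum(1 for row in T for v in row if abs(v) > eps)

uniform = [[100] * N for _ in range(N)]
checker = [[255 if (x + y) % 2 else 0 for y in range(N)] for x in range(N)]

print(count_nonzero(dct2(uniform)), "non-zero values for the uniform block")
print(count_nonzero(dct2(checker)), "non-zero values for the checkerboard")
```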
JPEG STEP 2 - QUANTIZATION
The human eye can’t see small differences in color
So take T matrix and divide all values by 10. This will give us more zero entries. More 0s means more compression!
But this is too lossy. And dividing all values by 10 doesn’t take into account that upper left of matrix has more action (the less subtle features of the image, or low spatial frequencies)
JPEG STEP 2 - QUANTIZATION
So divide T matrix by another matrix (U) with smaller values in upper left corner and larger values in lower right corner (next slide)
Result is matrix Q
1 3 5 7 9 11 13 15
3 5 7 9 11 13 15 17
5 7 9 11 13 15 17 19
7 9 11 13 15 17 19 21
9 11 13 15 17 19 21 23
11 13 15 17 19 21 23 25
13 15 17 19 21 23 25 27
15 17 19 21 23 25 27 29
U matrix
Q[i][j] = Round(T[i][j] / U[i][j]), for i = 0, 1, 2, …, 7 and j = 0, 1, 2, …, 7
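The quantization formula can be sketched in Python as below. The U matrix on the slide follows the pattern U[i][j] = 1 + 2(i + j), which is used here to build it. The sample T values are made up for illustration, and Python's `round` (which rounds exact halves to the nearest even integer) is an assumption about what Round means.

```python
N = 8

# The slide's U matrix: small divisors upper left, large ones lower right.
U = [[1 + 2 * (i + j) for j in range(N)] for i in range(N)]

def quantize(T):
    """Q[i][j] = Round(T[i][j] / U[i][j])."""
    return [[round(T[i][j] / U[i][j]) for j in range(N)] for i in range(N)]

# Hypothetical T matrix: a large value upper left, small values elsewhere.
T = [[0.0] * N for _ in range(N)]
T[0][0] = 800.0
T[0][1] = -7.0
T[7][7] = 10.0

Q = quantize(T)
print(Q[0][0], Q[0][1], Q[7][7])
```

The value 10 in the lower right is divided by 29 and rounds to 0, while the upper-left value survives unchanged because it is divided by 1.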
JPEG STEP 3 - ENCODING
Now take the quantized matrix Q and perform run-length encoding on it
But don’t just go across the rows. You get longer runs of zeros if you perform the run-length encoding in a diagonal fashion (next slide, from White text)
JPEG STEP 3 - ENCODING
[Figure: diagonal scan order over the 8x8 matrix Q]
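The diagonal scan can be sketched in Python as follows. The exact path in the original figure is not recoverable from this transcript, so the standard JPEG zigzag order is assumed, and the sample Q matrix is invented for the comparison.

```python
N = 8

def zigzag_order(n=N):
    """Return (row, col) pairs in zigzag order, starting upper left."""
    order = []
    for s in range(2 * n - 1):          # s = row + col on each diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:                  # even diagonals run upward
            diag.reverse()
        order.extend(diag)
    return order

def trailing_zeros(seq):
    """Length of the final run of zeros in a sequence."""
    run = 0
    for v in reversed(seq):
        if v != 0:
            break
        run += 1
    return run

# Hypothetical quantized matrix: non-zero values only in the upper left.
Q = [[0] * N for _ in range(N)]
Q[0][0], Q[0][1], Q[1][0], Q[1][1] = 15, -2, 3, 1

by_rows = [v for row in Q for v in row]
by_zigzag = [Q[i][j] for i, j in zigzag_order()]
print(trailing_zeros(by_rows), "trailing zeros scanning by rows")
print(trailing_zeros(by_zigzag), "trailing zeros scanning diagonally")
```

For this matrix the row scan ends in a run of 54 zeros and the diagonal scan in a run of 59.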
JPEG
How do you get the image back?
Undo run-length encoding
Multiply matrix Q by matrix U, element by element, yielding an approximation of matrix T
Apply similar (inverse) cosine calculations to get an approximation of the original P matrix back
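A sketch of the last two decoding steps in Python is given below. It assumes the U matrix from the quantization slide (U[i][j] = 1 + 2(i + j)) and the usual inverse DCT scaling. The function names and the sample Q matrix are made up for the example.

```python
import math

N = 8
U = [[1 + 2 * (i + j) for j in range(N)] for i in range(N)]

def dequantize(Q):
    """Multiply Q by U element by element to approximate T."""
    return [[Q[i][j] * U[i][j] for j in range(N)] for i in range(N)]

def idct2(T):
    """Inverse 8x8 DCT: spatial-frequency array T -> pixel array P."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    P = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            total = 0.0
            for i in range(N):
                for j in range(N):
                    total += (c(i) * c(j) * T[i][j]
                              * math.cos((2 * x + 1) * i * math.pi / 16)
                              * math.cos((2 * y + 1) * j * math.pi / 16))
            P[x][y] = 0.25 * total
    return P

# Hypothetical Q with only the upper-left entry set: a uniform block.
Q = [[0] * N for _ in range(N)]
Q[0][0] = 800

P = idct2(dequantize(Q))
print([round(v) for v in P[0]])
```

Values that quantization rounded to 0 stay 0 after the multiplication, which is where the loss comes from.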
MPEG
Moving Picture Experts Group
MPEG-1: CD-ROM video, early broadcast satellite systems
MPEG-2: multimedia entertainment and HDTV
MPEG-3: originally intended for HDTV
MPEG-4: videoconferencing
MPEG is like JPEG but also uses temporal redundancy
MPEG
Don’t transmit complete frames, just what has changed from the last frame
But what happens when a scene changes?
Or someone walks through a door?
Or you turn on the TV partway through the broadcast?
MPEG
I-frame - intrapicture frame - self-contained frame
P-frame - predicted frame - just the differences from the last frame
B-frame - bidirectional frame - similar to P-frame but difference between previous frame and future frame
MPEG
Display order: I B B P B B I
The P frame is the difference from the prior I frame
B frames are interpolated from the previous I frame and the next P frame
Order of transmission: I P B B I B B
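The reordering on this slide can be sketched in Python as follows. The rule assumed here is that each B frame is held back until the next I or P frame it depends on has been sent; the function name is invented for the example.

```python
def transmission_order(display_order):
    """Reorder frame types so each B frame follows the frames it needs."""
    sent, pending_b = [], []
    for frame in display_order:
        if frame == "B":
            pending_b.append(frame)   # wait for the next I or P frame
        else:
            sent.append(frame)        # send the I or P frame first
            sent.extend(pending_b)    # then the B frames that needed it
            pending_b = []
    sent.extend(pending_b)            # any B frames left at the end
    return "".join(sent)

print(transmission_order("IBBPBBI"))
```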
MPEG
P-frames are coded using motion-compensated prediction
Screen reduced to macroblocks
Macroblocks contain values of luminance, chrominance, chrominance
MPEG
MPEG algorithm compares a macroblock from the previous screen with one from the current screen (the 2 macroblocks that look the most alike) and computes a motion vector
Motion vector (what’s changed between macroblocks and how) is stored in a matrix similar to JPEG
Intel MMX technology really helps
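One common way to find the macroblock that looks the most alike is an exhaustive search that minimizes the sum of absolute differences (SAD). The slides do not say which measure MPEG encoders use, so treat the Python below as an illustrative sketch: the 2x2 block size, the search range, the frames, and the function names are all assumptions chosen to keep the example small.

```python
def sad(cur, prev, cr, cc, pr, pc, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(cur[cr + r][cc + c] - prev[pr + r][pc + c])
               for r in range(size) for c in range(size))

def motion_vector(prev, cur, row, col, size=2, search=2):
    """Best (cost, d_row, d_col) for the current block at (row, col)."""
    rows, cols = len(prev), len(prev[0])
    best = None
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            pr, pc = row + dr, col + dc
            if 0 <= pr <= rows - size and 0 <= pc <= cols - size:
                cost = sad(cur, prev, row, col, pr, pc, size)
                if best is None or cost < best[0]:
                    best = (cost, dr, dc)
    return best

def frame_with_block(top, left, n=6):
    """Blank n x n frame with a small 2x2 pattern placed at (top, left)."""
    frame = [[0] * n for _ in range(n)]
    pattern = [[9, 8], [7, 6]]
    for r in range(2):
        for c in range(2):
            frame[top + r][left + c] = pattern[r][c]
    return frame

previous = frame_with_block(1, 1)
current = frame_with_block(2, 3)   # pattern moved down 1, right 2
print(motion_vector(previous, current, 2, 3))
```

The result points from the block in the current frame back to its best match in the previous frame, one row up and two columns left, with a cost of 0.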
REVIEW QUESTIONS
Given the following: A (18%), B (36%), C (24%), D (13%), F (9%), calculate a Huffman code
Perform a run-length encoding on the following: 10000000001000000111000000000000000100
What are the advantages of Lempel-Ziv?
What are the three steps in JPEG?
What are the different frames in MPEG and how are they used to create a video?