
DATA COMMUNICATIONS

Compression

Techniques

DATA COMPRESSION

Whether data, fax, video, audio, etc., compression can work wonders

Compression can be lossless or lossy


HUFFMAN CODES

A frequency-dependent code

Usually a smaller alphabet

Must know the frequency of occurrence of each character in the alphabet

Order characters from highest to lowest frequency, or vice versa

Select the two smallest percentages; they must be adjacent


Example: a frequency dependent code

A 8%

B 12%

C 10%

D 6%

E 18%

F 7%

G 20%

H 16%

I 3%


Now send string A B C D E F G
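The construction can be sketched in Python as below. This is a minimal illustration, not the slides' own procedure: it uses a heap instead of a sorted list to find the two smallest percentages, and the names `FREQS` and `huffman_codes` are made up for the example. The 0/1 labels on each branch are an arbitrary choice, so the exact bit patterns may differ from a hand-built tree, but the code lengths should agree.

```python
import heapq
from itertools import count

# Frequencies (percent) from the example table above.
FREQS = {"A": 8, "B": 12, "C": 10, "D": 6, "E": 18,
         "F": 7, "G": 20, "H": 16, "I": 3}

def huffman_codes(freqs):
    """Build a prefix code by repeatedly merging the two smallest weights."""
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(w, next(tiebreak), sym) for sym, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left, right)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: a single character
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes

codes = huffman_codes(FREQS)
encoded = "".join(codes[ch] for ch in "ABCDEFG")
print(codes)
print(encoded, len(encoded), "bits")
```

With nine characters, a fixed-length code needs 4 bits per character, so A B C D E F G would take 28 bits; with these frequencies the Huffman code sends the same string in 23 bits.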

RUN-LENGTH ENCODING

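Replace runs of 0s with a count of how many 0s.

00000000000000100000000011000000000000000000001000…001100000000000

(the run containing “…” is 30 0s)

Run lengths: 14 9 0 20 30 0 11

As 4-bit groups: 1110 1001 0000 1111 0101 1111 1111 0000 0000 1011

A group of 1111 (15) means the run continues in the next group, so 20 is sent as 1111 0101 and 30 as 1111 1111 0000.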

LEMPEL-ZIV ENCODING

Replace character strings with codes

Problems:

How do we make it dynamic? (Whatever the most frequently occurring strings are, compress those.)

How do we find those strings?

How does the receiver know which code stands for THE?

Very popular algorithm, used in PKZIP, V.42bis modems and others

LEMPEL-ZIV ENCODING

Works best on large files

Typical performance of Lempel-Ziv:

Program file: reduces to 44% of original size

Text file: reduces to 64% of original size

Image file: reduces to 88% of original size

LEMPEL-ZIV ENCODING

To begin, store each character with its ASCII value (128 values, codes 0 to 127)

Then set the variable Buff = the first character from the text file and set Next = the next character from the file. Then perform the following steps:

LEMPEL-ZIV ENCODING

Temp = concat(Buff, Next)

Is Temp in the code table?

Yes? Buff = Temp, and get the next Next

No? Send the code associated with Buff

assign a code to Temp and store both in the code table

Buff = Next

get the next character from the input string and assign it to Next

Repeat all steps until end-of-file, then send the code associated with Buff
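The steps above can be sketched in Python as follows. The function name `lz_encode` is made up for this illustration; the variable names follow the slides' Buff, Next and Temp. It assumes 7-bit ASCII input and an unbounded code table, which a real implementation would have to limit.

```python
def lz_encode(text):
    """Encode text as a list of integer codes using the Buff/Next/Temp steps."""
    table = {chr(i): i for i in range(128)}  # single characters: codes 0..127
    next_code = 128                          # first multiple-character code
    codes = []
    if not text:
        return codes
    buff = text[0]
    for nxt in text[1:]:
        temp = buff + nxt
        if temp in table:
            buff = temp                      # keep growing the match
        else:
            codes.append(table[buff])        # send the code for Buff
            table[temp] = next_code          # store Temp with a new code
            next_code += 1
            buff = nxt
    codes.append(table[buff])                # flush the last Buff at end-of-file
    return codes

codes = lz_encode("the thing in this is this")
print(codes)
print(len(codes), "codes for 25 characters")
```

On a string this short the saving is small, because the table is still being built; repeated strings only start to pay off once they have been stored.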

LEMPEL-ZIV ENCODING

Try the string: “the thing in this is this”

t = 116, h = 104, e = 101, space = 32

AN LZ ENCODING EXAMPLE

Initialize: store each character with its ASCII value (128 values, codes 0 to 127)

So, the first multiple-character code will be 128

String = “the thing in this is this”

Send code 116 (‘t’), Store “th” as code 128

Send code 104 (‘h’), Store “he” as code 129

Send code 101 (‘e’), Store “e_” as code 130

Send code 32 (‘_’), Store “_t” as code 131

Send code 128 (‘th’), Store “thi” as code 132

LZ ENCODING EXAMPLE (CONT’D)

String = “the thing in this is this”

Send code 105 (‘i’), Store “in” as code 133

Send code 110 (‘n’), Store “ng” as code 134

Send code 103 (‘g’), Store “g_” as code 135

Send code 32 (‘_’), Store “_i” as code 136

Send code 133 (‘in’), Store “in_” as code 137

Send code 131 (‘_t’), Store “_th” as code 138

Send code 104 (‘h’), Store “hi” as code 139

Send code 105 (‘i’), Store “is” as code 140

Send code 115 (‘s’), Store “s_” as code 141

LZ ENCODING EXAMPLE (CONT’D)

String = “the thing in this is this”

Send code 136 (‘_i’), Store “_is” as code 142

Send code 141 (‘s_’), Store “s_t” as code 143

Send code 132 (‘thi’), Store “this” as code 144

Send code 115 (‘s’)

LEMPEL-ZIV DECODING

After you transmit the string, how is the compressed code decoded?

Note: the ONLY things transmitted are the code values

LZ DECODING

LZ Decoding Algorithm:

Initialize the dictionary to contain all single characters and their codes (ASCII)

Repeat

Receive a code

Look up the associated character block, B, in the dictionary

Take the last received block plus the first character of block B, and add this new block with a new code to the dictionary (nothing is added for the first code, since there is no previous block)

Output the character block B

Until no more codes are received
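A Python sketch of the decoding loop is shown below, with `lz_decode` as a hypothetical name. One branch goes beyond the algorithm as stated on the slide: if a received code is the very next code the decoder is about to create, the block must be the previous block plus its own first character. That case does not occur in the worked example, but it does for inputs such as "aaa".

```python
def lz_decode(codes):
    """Rebuild the text from the code values alone."""
    table = {i: chr(i) for i in range(128)}  # single characters: codes 0..127
    next_code = 128
    out = []
    prev = None                              # last received block
    for code in codes:
        if code in table:
            block = table[code]
        elif prev is not None and code == next_code:
            # Assumption beyond the slides: code refers to the entry
            # that is being created in this very step.
            block = prev + prev[0]
        else:
            raise ValueError(f"unknown code {code}")
        if prev is not None:
            table[next_code] = prev + block[0]  # last block + first char of B
            next_code += 1
        out.append(block)
        prev = block
    return "".join(out)

received = [116, 104, 101, 32, 128, 105, 110, 103, 32,
            133, 131, 104, 105, 115, 136, 141, 132, 115]
print(lz_decode(received))
```

The decoder never sees the encoder's table; it rebuilds the same entries one step behind the encoder, which is why only the code values need to be transmitted.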

LZ DECODING EXAMPLE

Receive code 116, Output t

Receive code 104, Store “th” as code 128, Output h

Receive code 101, Store “he” as code 129, Output e

Receive code 32, Store “e_” as code 130, Output _

Receive code 128, Store “_t” as code 131, Output th

Receive code 105, Store “thi” as code 132, Output i

Receive code 110, Store “in” as code 133, Output n

Receive code 103, Store “ng” as code 134, Output g

Receive code 32, Store “g_” as code 135, Output _

Receive code 133, Store “_i” as code 136, Output in

LZ DECODING EXAMPLE (CONT’D)

Receive code 131, Store “in_” as code 137, Output _t

Receive code 104, Store “_th” as code 138, Output h

Receive code 105, Store “hi” as code 139, Output i

Receive code 115, Store “is” as code 140, Output s

Receive code 136, Store “s_” as code 141, Output _i

Receive code 141, Store “_is” as code 142, Output s_

Receive code 132, Store “s_t” as code 143, Output thi

Receive code 115, Store “this” as code 144, Output s

Resulting output: the thing in this is this

RELATIVE OR DIFFERENTIAL ENCODING

Video does not compress well using Huffman or run-length encoding

In one color video frame, not much is alike

But what about from frame to frame?

Send a frame, store it in a buffer

Next frame is just the difference from the previous frame

Then store that frame in the buffer, etc.

5 7 6 2 8 6 6 3 5 6

6 5 7 5 5 6 3 2 4 7

8 4 6 8 5 6 4 8 8 5

5 1 2 9 8 6 5 5 6 6

First Frame

5 7 6 2 8 6 6 3 5 6

6 5 7 6 5 6 3 2 3 7

8 4 6 8 5 6 4 8 8 5

5 1 3 9 8 6 5 5 7 6

Second Frame

0 0 0 0 0 0 0 0 0 0

0 0 0 1 0 0 0 0 -1 0

0 0 0 0 0 0 0 0 0 0

0 0 1 0 0 0 0 0 1 0

Difference
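The frame-to-frame step can be sketched in Python using the two frames above. The function names are hypothetical, and frames are plain lists of rows purely for illustration.

```python
FIRST = [
    [5, 7, 6, 2, 8, 6, 6, 3, 5, 6],
    [6, 5, 7, 5, 5, 6, 3, 2, 4, 7],
    [8, 4, 6, 8, 5, 6, 4, 8, 8, 5],
    [5, 1, 2, 9, 8, 6, 5, 5, 6, 6],
]
SECOND = [
    [5, 7, 6, 2, 8, 6, 6, 3, 5, 6],
    [6, 5, 7, 6, 5, 6, 3, 2, 3, 7],
    [8, 4, 6, 8, 5, 6, 4, 8, 8, 5],
    [5, 1, 3, 9, 8, 6, 5, 5, 7, 6],
]

def frame_difference(prev, cur):
    """Sender side: element-by-element difference, current minus previous."""
    return [[c - p for p, c in zip(prow, crow)] for prow, crow in zip(prev, cur)]

def apply_difference(prev, diff):
    """Receiver side: rebuild the current frame from the buffered one."""
    return [[p + d for p, d in zip(prow, drow)] for prow, drow in zip(prev, diff)]

diff = frame_difference(FIRST, SECOND)
for row in diff:
    print(row)
rebuilt = apply_difference(FIRST, diff)
```

Only 4 of the 40 difference values are non-zero, so the difference frame is a good candidate for run-length encoding even though neither original frame is.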

IMAGE COMPRESSION

One image: JPEG; continuous images such as video: MPEG

A color picture can be defined by red/green/blue, or by luminance / chrominance / chrominance, which are based on the RGB values

Either way, you have 3 values, each 8 bits, or 24 bits total (2^24 colors!)

IMAGE COMPRESSION

A VGA screen is 640 x 480 pixels

24 bits x 640 x 480 = 7,372,800 bits. Ouch!

And video comes at you 30 images per second. Double Ouch!

We need compression!

JPEG

Joint Photographic Experts Group

Compresses still images

Lossy JPEG compression consists of 3 phases:

Discrete cosine transformation (DCT)

Quantization

Encoding

JPEG STEP 1 - DCT

Divide the image into a series of 8x8 pixel blocks

If the original image was 640x480 pixels, the new picture would be 80 blocks x 60 blocks (next slide)

If B&W, each pixel in an 8x8 block is an 8-bit value (0-255)

[Figure: 640 x 480 VGA screen image divided into 8 x 8 pixel blocks, 80 blocks across and 60 blocks down]

JPEG STEP 1 - DCT

If color, each pixel is 24 bits, or 3 8-bit groups

Thus, each 8x8 block of pixels is represented by three 8x8 arrays

B&W or color, the DCT is applied to these 8x8 arrays

JPEG STEP 1 - DCT

So what does the DCT do?

It takes an 8x8 array (P) and produces a new 8x8 array (T) using cosines

The T matrix contains a collection of values called spatial frequencies. These spatial frequencies relate directly to how much the pixel values change as a function of their positions in the block

JPEG STEP 1 - DCT

An image with uniform color changes (little fine detail) has a P array with closely similar values and a corresponding T array with many zero values (next slide)

An image with large color changes over a small area (lots of fine detail) has a P array with widely changing values, and thus a T array with many non-zero values
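To make the P-to-T step concrete, here is a direct (slow) Python sketch of the standard 8x8 two-dimensional DCT and its inverse. The slides do not give the formula, so the scaling used here (a factor of 1/4, with 1/sqrt(2) on the zero-frequency terms) is the common textbook form and is an assumption about what the slides intend; the function names are made up for the example.

```python
import math

N = 8  # block size; the 0.25 scale factor below is specific to N = 8

# COS[x][i] = cos((2x + 1) * i * pi / 16), shared by both transforms.
COS = [[math.cos((2 * x + 1) * i * math.pi / (2 * N)) for i in range(N)]
       for x in range(N)]

def _c(k):
    """Normalizing factor: 1/sqrt(2) for the zero-frequency term, else 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def dct_2d(P):
    """Forward DCT: pixel block P -> spatial-frequency block T."""
    return [[0.25 * _c(i) * _c(j) *
             sum(P[x][y] * COS[x][i] * COS[y][j]
                 for x in range(N) for y in range(N))
             for j in range(N)] for i in range(N)]

def idct_2d(T):
    """Inverse DCT: spatial-frequency block T -> pixel block P."""
    return [[0.25 * sum(_c(i) * _c(j) * T[i][j] * COS[x][i] * COS[y][j]
                        for i in range(N) for j in range(N))
             for y in range(N)] for x in range(N)]

flat = [[100] * N for _ in range(N)]   # a block with no detail at all
T = dct_2d(flat)
print(round(T[0][0], 6))               # all the energy lands in T[0][0]
```

For the completely uniform block, every entry of T other than the upper-left one comes out as zero (up to floating-point error), which is the "many zero values" case described above.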


JPEG STEP 2 - QUANTIZATION

The human eye can’t see small differences in color

So take T matrix and divide all values by 10. This will give us more zero entries. More 0s means more compression!

But this is too lossy. And dividing all values by 10 doesn’t take into account that upper left of matrix has more action (the less subtle features of the image, or low spatial frequencies)


JPEG STEP 2 - QUANTIZATION

So divide T matrix by another matrix (U) with smaller values in upper left corner and larger values in lower right corner (next slide)

Result is matrix Q


1 3 5 7 9 11 13 15

3 5 7 9 11 13 15 17

5 7 9 11 13 15 17 19

7 9 11 13 15 17 19 21

9 11 13 15 17 19 21 23

11 13 15 17 19 21 23 25

13 15 17 19 21 23 25 27

15 17 19 21 23 25 27 29

U matrix

Q[i][j] = Round(T[i][j] / U[i][j]), for i = 0, 1, 2, …, 7 and j = 0, 1, 2, …, 7
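The quantization formula can be sketched in Python as below. The U matrix shown above follows the pattern U[i][j] = 2(i + j) + 1, which the sketch uses to build it; the function names and the sample T values are invented for illustration. Note that Python's built-in `round` sends exact halves to the nearest even integer, which may differ from the Round the slides have in mind.

```python
# U matrix from the slide: small divisors upper left, large ones lower right.
U = [[2 * (i + j) + 1 for j in range(8)] for i in range(8)]

def quantize(T, U):
    """Q[i][j] = Round(T[i][j] / U[i][j])"""
    return [[round(T[i][j] / U[i][j]) for j in range(8)] for i in range(8)]

def dequantize(Q, U):
    """Receiver side: multiply back. The rounding loss is not recovered."""
    return [[Q[i][j] * U[i][j] for j in range(8)] for i in range(8)]

# Hypothetical T block: one large low-frequency value, a few small others.
T = [[0.0] * 8 for _ in range(8)]
T[0][0], T[0][1], T[7][7] = 800.0, -7.0, 14.0

Q = quantize(T, U)
print(Q[0][0], Q[0][1], Q[7][7])
print(dequantize(Q, U)[0][1])
```

The value 14 in the lower-right corner is divided by 29 and rounds to zero, while -7 near the upper left survives as -2 and comes back as -6: close to the original, but not equal to it. This is where JPEG becomes lossy.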

JPEG STEP 3 - ENCODING

Now take the quantized matrix Q and perform run-length encoding on it

But don’t just go across the rows. You get longer runs of zeros if you perform the run-length encoding in a diagonal fashion (next slide, from White text)
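The diagonal ordering can be sketched in Python as follows. The traversal below is the zigzag pattern commonly used with JPEG, which is assumed here to match the diagonal fashion the slide refers to; `zigzag` is a hypothetical name.

```python
def zigzag(Q):
    """Read a square matrix along its anti-diagonals, alternating direction."""
    n = len(Q)
    out = []
    for s in range(2 * n - 1):                 # s = i + j is constant on a diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        if s % 2 == 0:
            rows = reversed(rows)              # even diagonals run upward
        out.extend(Q[i][s - i] for i in rows)
    return out

# Hypothetical quantized block: non-zero values clustered in the upper left.
Q = [[0] * 8 for _ in range(8)]
Q[0][0], Q[0][1], Q[1][0], Q[1][1] = 80, -2, 3, 1

by_rows = [v for row in Q for v in row]
by_diagonals = zigzag(Q)
print(by_rows[:12])
print(by_diagonals[:12])
```

Read row by row, the non-zero values are separated by a run of six zeros; read diagonally, they all arrive within the first five values and the rest of the block is one long run of zeros.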

JPEG STEP 3 - ENCODING

[Figure: diagonal ordering of the Q matrix for run-length encoding]

JPEG

How do you get the image back?

Undo the run-length encoding

Multiply matrix Q by matrix U, element by element, yielding an approximation of matrix T

Apply similar cosine calculations to get an approximation of the original P matrix back

MPEG

Moving Picture Experts Group

MPEG-1: CD-ROM video, early broadcast satellite systems

MPEG-2: multimedia entertainment and HDTV

MPEG-3: originally intended for HDTV

MPEG-4: videoconferencing

MPEG is like JPEG but also uses temporal redundancy

MPEG

Don’t transmit complete frames, just what has changed from last frame

But what happens when a scene changes? Or someone walks thru a door? Or you turn on the TV part way thru the broadcast?


MPEG

I-frame - intrapicture frame - self-contained frame

P-frame - predicted frame - just the differences from the last frame

B-frame - bidirectional frame - similar to P-frame but difference between previous frame and future frame


MPEG

[Figure: frame sequence I B B P B B I. The P frame is the difference from the prior I frame; B frames are interpolated from the previous I frame and the next P frame.]

Order of transmission: I P B B I B B
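The reordering can be sketched in a few lines of Python. The idea, inferred from the sequence above rather than stated on the slide, is that a B frame cannot be decoded until the later frame it depends on has arrived, so each I or P frame is sent ahead of the B frames that precede it on screen. The function name is hypothetical.

```python
def transmission_order(display):
    """Reorder frames so each I/P frame precedes the B frames that need it."""
    out, waiting = [], []
    for index, kind in enumerate(display):
        if kind == "B":
            waiting.append((index, kind))   # hold until the next I or P arrives
        else:
            out.append((index, kind))       # send the reference frame first
            out.extend(waiting)             # then the B frames that depend on it
            waiting = []
    out.extend(waiting)  # assumption: trailing B frames are sent as-is
    return out

order = transmission_order("IBBPBBI")
print(" ".join(kind for _, kind in order))
print([index for index, _ in order])
```

The receiver buffers the frames and uses the display indices to put them back in viewing order.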

MPEG

P-frames are coded using motion-compensated prediction

Screen reduced to macroblocks

Macroblocks contain values of luminance, chrominance, chrominance

MPEG

The MPEG algorithm compares a macroblock from the previous screen with one from the current screen (the 2 macroblocks that look the most alike) and computes a motion vector

The motion vector (what’s changed between macroblocks and how) is stored in a matrix similar to JPEG

Intel MMX technology really helps
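One common way to find "the 2 macroblocks that look the most alike" is an exhaustive search that scores each candidate by the sum of absolute differences (SAD). The Python sketch below illustrates that idea only; it is not taken from the slides or from the MPEG standard, and the function names, the tiny 2x2 block size and the search range are all assumptions chosen to keep the example small.

```python
def sad(prev, cur, px, py, cx, cy, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(abs(prev[py + r][px + c] - cur[cy + r][cx + c])
               for r in range(size) for c in range(size))

def best_motion_vector(prev, cur, cx, cy, size, search):
    """Find the offset (dx, dy) into prev that best matches cur's block."""
    height, width = len(prev), len(prev[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            px, py = cx + dx, cy + dy
            if 0 <= px <= width - size and 0 <= py <= height - size:
                cost = sad(prev, cur, px, py, cx, cy, size)
                if best is None or cost < best[2]:
                    best = (dx, dy, cost)
    return best

# Two 8x8 frames: a bright 2x2 patch moves 2 columns right and 1 row down.
prev = [[0] * 8 for _ in range(8)]
cur = [[0] * 8 for _ in range(8)]
for r in range(2):
    for c in range(2):
        prev[2 + r][1 + c] = 9
        cur[3 + r][3 + c] = 9

print(best_motion_vector(prev, cur, cx=3, cy=3, size=2, search=3))
```

The vector here points from the block's position in the current frame back to where its best match sits in the previous frame. Searches like this repeat the same simple arithmetic over many pixels, which is the kind of work the slide's remark about MMX refers to.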

REVIEW QUESTIONS

Given the following: A (18%), B (36%), C (24%), D (13%), F (9%), calculate a Huffman code

Perform a run-length encoding on the following: 10000000001000000111000000000000000100

What are the advantages of Lempel-Ziv?

What are the three steps in JPEG?

What are the different frames in MPEG and how are they used to create a video?
