SWE 423: Multimedia Systems

Chapter 7: Data Compression (1)

References

• Chapter 7 from our Textbook: “Multimedia Fundamentals: Media Coding and Content Processing”

• Slides from the reference book: Fundamentals of Multimedia

Outline

• Introduction
• Motivation for compression
• Coding requirements
• Compression types
• General Data Compression Scheme
• Compression Techniques
• Entropy Encoding
  – Run Length Encoding
  – Huffman Coding

Introduction

• Video and audio have much higher storage requirements than text

• Data transmission rates (in terms of bandwidth requirements) for sending continuous media are considerably higher than text

• Efficient compression of audio and video data, including some compression standards, will be considered in this chapter

Motivation for Compression

• Terminology
  – 1 kbit = 1000 bit
  – 1 Kbit = 1024 bit (= 2^10)
  – 1 Mbit = 1024 x 1024 bit (= 2^10 * 2^10 = 2^20)

• Discrete Data: Considering a small window of 640 x 480 pixels on a display
  – Text
  – Vector Image
  – Bitmap Image

• Continuous Data: Required storage space per second
  – Uncompressed speech of telephone quality
  – Uncompressed stereo audio signal of CD quality
  – Video sequence

Motivation for Compression: Discrete Data

• Text
  – Assuming 2 bytes are used for every 8 x 8 pixel character,
    • Characters per screen page = ...
    • Storage required per screen page = ...

• Vector Image
  – Assuming that a typical image consists of 500 lines, each of which is defined by its coordinates in the x direction and the y direction, and an 8-bit attribute field
    • Coordinates in the x direction require ...
    • Coordinates in the y direction require ...
    • Bits per line = ...
    • Storage required per screen page = ...

• Bitmap Image
  – Assuming 256 colors, requiring a single byte per pixel
    • Storage required per screen page = ...
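
The quantities asked for above follow from the stated assumptions. As a rough sketch only, the arithmetic could be worked through as below. The 10-bit and 9-bit coordinate widths are an assumption (the smallest bit counts that can address 640 columns and 480 rows), as is the reading that each line is described by two end points; the variable names are illustrative.

```python
import math

WIDTH, HEIGHT = 640, 480  # window size in pixels, from the slide

# Text: one 8 x 8 pixel cell per character, 2 bytes per character.
chars_per_page = (WIDTH // 8) * (HEIGHT // 8)
text_bytes = chars_per_page * 2

# Vector image: 500 lines, each with two end points plus an 8-bit attribute.
# Assumption: coordinates use the fewest bits that can address the window.
x_bits = math.ceil(math.log2(WIDTH))    # 10 bits for 640 columns
y_bits = math.ceil(math.log2(HEIGHT))   # 9 bits for 480 rows
bits_per_line = 2 * (x_bits + y_bits) + 8
vector_bytes = 500 * bits_per_line / 8

# Bitmap image: 256 colors, one byte per pixel.
bitmap_bytes = WIDTH * HEIGHT * 1

print(chars_per_page, text_bytes)    # 4800 9600
print(bits_per_line, vector_bytes)   # 46 2875.0
print(bitmap_bytes)                  # 307200
```

Under these assumptions the bitmap needs roughly a hundred times more storage than the vector description of the same window.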

Motivation for Compression: Continuous Data

• Uncompressed speech of telephone quality
  – Assuming it is sampled at 8 kHz and quantized using 8 bits per sample, yielding a data stream of 64 kbit/second
    • Storage space required per second = ...

• Uncompressed stereo audio signal of CD quality
  – Assuming it is sampled at 44.1 kHz and quantized using 16 bits
    • Data rate = ...
    • Storage space required per second = ...

• Video sequence
  – Assuming 25 full frames per second, luminance and chrominance of each pixel are coded using 3 bytes, luminance is sampled at 13.5 MHz while chrominance (R-Y and B-Y) is sampled at 6.75 MHz each, and samples are uniformly coded using 8 bits.
    • Bandwidth = ...
    • Data Rate = ...
    • Storage space required per second = ...
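
The data rates above can be derived from sampling rate, sample size and channel count. The sketch below is one way to do that arithmetic, assuming one channel for telephone speech, two for stereo, and a video rate obtained by adding the luminance and the two chrominance sampling rates. Rates are in bits per second with 1 kbit = 1000 bit; the variable names are illustrative.

```python
# Telephone-quality speech: 8 kHz sampling, 8 bits per sample, one channel.
speech_bps = 8_000 * 8
speech_bytes_per_s = speech_bps // 8

# CD-quality stereo audio: 44.1 kHz sampling, 16 bits per sample, 2 channels.
cd_bps = 2 * 44_100 * 16
cd_bytes_per_s = cd_bps // 8

# Video: luminance at 13.5 MHz plus two chrominance signals at 6.75 MHz each,
# every sample uniformly coded with 8 bits.
video_samples_per_s = 13_500_000 + 2 * 6_750_000
video_bps = video_samples_per_s * 8
video_bytes_per_s = video_bps // 8

print(speech_bps, speech_bytes_per_s)   # 64000 8000
print(cd_bps, cd_bytes_per_s)           # 1411200 176400
print(video_bps, video_bytes_per_s)     # 216000000 27000000
```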


• Processing uncompressed video data streams requires
  – Storage space in the gigabyte range
  – Buffer space in the megabyte range
  – Data transfer rates of 140 Mbit/s [per unidirectional connection]
• These requirements can be considerably lowered by employing compression

Can Multimedia Data be Significantly Compressed?

• Redundancy can be exploited to do compression
• Spatial redundancy
  – Correlation between neighboring pixels in image/video
• Spectral redundancy
  – Correlation among colors
• Psycho-visual redundancy
  – Perceptual properties of the human visual system

What Makes “Good” Compression

• Quality of compressed and decompressed data should be as good as possible

• Compression/decompression process should be as simple as possible

• Decompression time must not exceed certain thresholds

• Compression/decompression requirements can be divided into
  – Dialogue mode (video conferencing)
  – Retrieval mode (digital libraries)
  – Both

Coding Requirements: Dialogue Mode

• End-to-end delay should not exceed 150 ms for compression and decompression alone.
  – Ideally, compression and decompression should not exceed 50 ms in order to ensure natural dialogue.
• In addition, the overall delay includes
  – delay in the network,
  – communications protocol processing in the end system,
  – data transfer to and from the respective input and output devices.

Coding Requirements: Retrieval Mode

• Fast forward and fast rewind with simultaneous display (or playback) of the data should be possible

• Random access to single images or audio passages in a data stream should be possible in less than 0.5 s.
  – Maintains interaction aspects in retrieval systems

• Decompression of images, video or audio passages should be possible without interpreting all preceding data.
  – Allows random access and editing

Coding Requirements: Both Modes

• Support display of the same data in different systems
  – Formats have to be independent of frame size and video frame rate
• Audio and video compression should support different data rates at different qualities
• Precisely synchronize audio and video
• Support for economical solutions
  – Software
  – Few VLSI chips
• Enable cooperation of different systems
  – Data generated on a multimedia system can be reproduced on another system (e.g. course materials).

Compression Types

• Physical versus logical compression
  – Physical
    • Performed on data regardless of what information it contains
    • Translates a series of bits to another series of bits
  – Logical
    • Knowledge-based
      – e.g. United Kingdom to UK
• Spatial Compression – 2D or single image
• Temporal Compression – 3D or video
• Codec – Compression / Decompression
• Color / intensity … same thing

Compression Types

• Symmetric
  – Compression and decompression use roughly the same techniques and take just as long
  – Data transmission which requires compression and decompression on-the-fly will require these types of algorithms
• Asymmetric
  – Most common is where compression takes a lot more time than decompression
    • In an image database, each image will be compressed once and decompressed many times
  – Less common is where decompression takes a lot more time than compression
    • Creating many backup files which will hardly ever be read

Compression Types

• Non-adaptive
  – Contains a static dictionary of predefined substrings to encode which are known to occur with high frequency
• Adaptive
  – Dictionary is built from scratch

Compression Types

• Lossless
  – decompress(compress(data)) = data
  – Used for computer data, medical images, etc.
• Lossy
  – decompress(compress(data)) ≠ data
  – Some distortion
  – A small change in pixel values may be invisible
  – Suited for audio and video

General Data Compression Scheme

[Figure: Input Data → Encoder (compression) → Codes / Codewords → Storage or Networks → Codes / Codewords → Decoder (decompression) → Output Data]

B0 = # bits required before compression

B1 = # bits required after compression

Compression Ratio = B0 / B1.
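
As a small illustration of the scheme and of the ratio B0 / B1, the sketch below runs an encoder and a decoder back to back. It uses Python's standard `zlib` module (a lossless DEFLATE implementation) purely as a stand-in encoder; the slides do not prescribe any particular codec, and the input data is made up for the example.

```python
import zlib

data = b"ABCCCCCCCCDEFGGG" * 64      # illustrative, highly repetitive input

codes = zlib.compress(data)          # encoder (compression)
restored = zlib.decompress(codes)    # decoder (decompression)

b0 = len(data) * 8                   # bits required before compression
b1 = len(codes) * 8                  # bits required after compression
ratio = b0 / b1

print(restored == data)              # lossless: the output equals the input
print(f"compression ratio = {ratio:.1f}")
```

A ratio above 1 means the codewords take fewer bits than the input; how far above 1 depends entirely on how much redundancy the input contains.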

Compression Techniques

Coding Type        Basis                  Technique
Entropy Encoding                          Run-length Coding
                                          Huffman Coding
                                          Arithmetic Coding
Source Coding      Prediction             DPCM
                                          DM
                   Transformation         FFT
                                          DCT
                   Layered Coding         Bit Position
                                          Subsampling
                                          Sub-band Coding
                   Vector Quantization
Hybrid Coding                             JPEG
                                          MPEG
                                          H.263
                                          Many Proprietary Systems

Compression Techniques

• Entropy Coding
  – Semantics of the information to be encoded are ignored
  – Lossless compression technique
  – Can be used for different media regardless of their characteristics
• Source Coding
  – Takes into account the semantics of the information to be encoded.
  – Often lossy compression technique
  – Characteristics of medium are exploited
• Hybrid Coding
  – Most multimedia compression algorithms are hybrid techniques

Entropy Encoding

• Information theory is a discipline in applied mathematics involving the quantification of data with the goal of enabling as much data as possible to be reliably stored on a medium and/or communicated over a channel.

• According to Claude E. Shannon, the entropy η of an information source with alphabet S = {s1, s2, ..., sn} is defined as

$$\eta = H(S) = \sum_{i=1}^{n} p_i \log_2 \frac{1}{p_i} = -\sum_{i=1}^{n} p_i \log_2 p_i$$

where p_i is the probability that symbol s_i in S will occur.

Entropy Encoding

• In science, entropy is a measure of the disorder of a system.
  – More entropy means more disorder
  – Negative entropy is added to a system when more order is given to the system.
• The measure of data, known as information entropy, is usually expressed by the average number of bits needed for storage or communication.
  – The Shannon Coding Theorem states that the entropy is the best we can do (under certain conditions), i.e., for the average length of the codewords produced by the encoder, l’,

$$\eta \le l'$$

Entropy Encoding

• Example 1: What is the entropy of an image with uniform distributions of gray-level intensities (i.e. pi = 1/256 for all i)?

• Example 2: What is the entropy of an image whose histogram shows that one third of the pixels are dark and two thirds are bright?
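
Both examples can be checked numerically against Shannon's definition. The function below is a minimal sketch of that definition (the name `entropy` and the list-of-probabilities interface are illustrative choices, not part of the course material):

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits per symbol.

    Symbols with probability 0 are skipped, since they contribute nothing.
    """
    return sum(p * math.log2(1 / p) for p in probabilities if p > 0)

uniform = entropy([1 / 256] * 256)     # Example 1: 256 equally likely gray levels
two_level = entropy([1 / 3, 2 / 3])    # Example 2: one third dark, two thirds bright

print(uniform)                # 8.0
print(round(two_level, 3))    # 0.918
```

The uniform image needs the full 8 bits per pixel, so entropy coding alone cannot shrink it, whereas the skewed two-level image needs less than 1 bit per pixel on average.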

Entropy Encoding: Run-Length

• Data often contains sequences of identical bytes. Replacing these repeated byte sequences with the number of occurrences considerably reduces the overall data size.
• Many variations of RLE
  – One form of RLE is to use a special marker M-byte that will indicate the number of occurrences of a character
    • “c”!#
  – How many bytes are used above? When do you think the M-byte should be used?
    • ABCCCCCCCCDEFGGG is encoded as ABC!8DEFGGG
  – What if the string contains the “!” character?
  – What is the compression ratio for this example?

Note: This encoding is DIFFERENT from what is mentioned in your book
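
The marker-based variant above can be sketched as follows. Several details are assumptions made for the example rather than part of the slide: runs shorter than 4 are left unencoded (since "c!n" already costs 3 characters), the count is a single decimal digit so longer runs are split, and input containing the marker is simply rejected instead of being escaped.

```python
MARKER = "!"
MIN_RUN = 4   # "c!n" costs 3 characters, so shorter runs are left as they are
MAX_RUN = 9   # assumption: the count is stored as one decimal digit

def rle_encode(text):
    if MARKER in text:
        raise ValueError("input must not contain the marker character")
    out = []
    i = 0
    while i < len(text):
        # Measure the run starting at position i, capped at MAX_RUN.
        run = 1
        while i + run < len(text) and text[i + run] == text[i] and run < MAX_RUN:
            run += 1
        if run >= MIN_RUN:
            out.append(f"{text[i]}{MARKER}{run}")
        else:
            out.append(text[i] * run)
        i += run
    return "".join(out)

def rle_decode(code):
    out = []
    i = 0
    while i < len(code):
        if i + 2 < len(code) and code[i + 1] == MARKER:
            out.append(code[i] * int(code[i + 2]))   # expand "c!n"
            i += 3
        else:
            out.append(code[i])                      # literal character
            i += 1
    return "".join(out)

encoded = rle_encode("ABCCCCCCCCDEFGGG")
print(encoded)                        # ABC!8DEFGGG
print(rle_decode(encoded))            # ABCCCCCCCCDEFGGG
```

For this string the 16 input characters shrink to 11, a compression ratio of 16 / 11, or about 1.45.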

Entropy Encoding: Run-Length

• Many variations of RLE:
  – Zero-suppression: In this case, one character that is repeated very often is the only character used in the RLE. In this case, the M-byte and the number of additional occurrences are stored.
    • When do you think the M-byte should be used, as opposed to using the regular representation without any encoding?

Entropy Encoding: Run-Length

• Many variations of RLE:
  – If we are encoding black and white images (e.g. faxes), one such version is as follows:

(row#, col# run1 begin, col# run1 end, col# run2 begin, col# run2 end, ... , col# runk begin, col# runk end)
(row#, col# run1 begin, col# run1 end, col# run2 begin, col# run2 end, ... , col# runr begin, col# runr end)
...
(row#, col# run1 begin, col# run1 end, col# run2 begin, col# run2 end, ... , col# runs begin, col# runs end)
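
A possible reading of this row-based scheme is sketched below. It is only an illustration under stated assumptions: runs are taken to be runs of black pixels (value 1), row and column numbers are 0-based, run ends are inclusive, and rows without any black pixel produce no record. The function name `encode_rows` is hypothetical.

```python
def encode_rows(image):
    """image: list of rows, each a list of 0 (white) / 1 (black) pixels."""
    encoded = []
    for row_no, row in enumerate(image):
        record = [row_no]
        col = 0
        while col < len(row):
            if row[col] == 1:
                begin = col
                # Advance to the last pixel of this black run.
                while col + 1 < len(row) and row[col + 1] == 1:
                    col += 1
                record += [begin, col]   # inclusive begin and end columns
            col += 1
        if len(record) > 1:              # skip rows without any black run
            encoded.append(tuple(record))
    return encoded

image = [
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
print(encode_rows(image))   # [(0, 1, 2, 4, 4), (2, 0, 4)]
```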

Entropy Encoding: Huffman Coding

• One form of variable length coding
• Greedy algorithm
• Has been used in fax machines, JPEG and MPEG

Entropy Encoding: Huffman Coding

Algorithm huffman
Input: A set C = {c1, c2, ..., cn} of n characters and their frequencies {f(c1), f(c2), ..., f(cn)}
Output: A Huffman tree (V, T) for C.
1. Insert all characters into a min-heap H according to their frequencies.
2. V = C; T = {}
3. for j = 1 to n - 1
4.     c = deletemin(H)
5.     c’ = deletemin(H)
6.     f(v) = f(c) + f(c’) // v is a new node
7.     Insert v into the min-heap H
8.     Add (v,c) and (v,c’) to tree T making c and c’ children of v in T
9. end for
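
The algorithm above maps naturally onto a binary heap. The following is a minimal sketch using Python's `heapq`, assuming a non-empty frequency table; the tuple-based tree representation, the tie-breaking counter and the function name `huffman_codes` are illustrative choices, and the frequencies are made up for the example.

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """Build a Huffman tree from {symbol: frequency}; return {symbol: bit string}."""
    tie = count()  # tie-breaker so the heap never has to compare tree nodes
    # A leaf is a 1-tuple (symbol,); an internal node is a 2-tuple (left, right).
    heap = [(f, next(tie), (sym,)) for sym, f in freqs.items()]
    heapq.heapify(heap)
    for _ in range(len(freqs) - 1):
        f1, _, c1 = heapq.heappop(heap)   # deletemin(H)
        f2, _, c2 = heapq.heappop(heap)   # deletemin(H)
        heapq.heappush(heap, (f1 + f2, next(tie), (c1, c2)))  # new node v
    root = heap[0][2]

    codes = {}
    def walk(node, prefix):
        if len(node) == 1:                    # leaf
            codes[node[0]] = prefix or "0"    # single-symbol alphabet edge case
        else:
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
    walk(root, "")
    return codes

freqs = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}
codes = huffman_codes(freqs)
total_bits = sum(f * len(codes[s]) for s, f in freqs.items())
print(codes)
print(total_bits / sum(freqs.values()))   # average code length in bits per symbol
```

The exact bit patterns depend on how ties are broken, but the code lengths, and therefore the average length, are the same for any valid Huffman tree over these frequencies.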


• Example


• Most important properties of Huffman Coding
  – Unique Prefix Property: No Huffman code is a prefix of any other Huffman code
    • For example, 101 and 1010 cannot both be Huffman codes. Why?
  – Optimality: The Huffman code is a minimum-redundancy code (given an accurate data model)
    • The two least frequent symbols will have the same length for their Huffman code, whereas symbols occurring more frequently will have shorter Huffman codes
    • It has been shown that the average code length of an information source S is strictly less than η + 1, i.e.

$$l' < \eta + 1$$

Entropy Encoding: Adaptive Huffman Coding

• The Huffman method assumes that the frequencies of occurrence of all the symbols of the alphabet are known a priori.
  – This is rarely the case in practice
  – Semi-adaptive Huffman coding has been employed where data is read twice, the first pass being to determine the frequencies
    • Disadvantage: Too slow for real-time applications
  – Another solution is Adaptive Huffman Coding
    • Employed by Unix’s “compact” program.


• Decoder “mirrors” the operations of the encoder, as they both may occur at different times
• Main idea of the algorithm is as follows
  – Encoder and Decoder both start with an empty Huffman Coding Tree
    • No symbol is assigned codes yet.
  – First symbol read is written on the output stream in its uncompressed form
    • In fact, each character being read for the first time is written this way
  – That is why we need an escape character to signal that an uncompressed, first-time character follows.
  – This escape character is denoted by NEW and given frequency 0 all the time
  – The new symbol is then added to the tree and a code is assigned to it.


  – Next time this symbol is encountered, its code will be written to the output stream and its frequency is increased by 1.
  – Since this modifies the tree, it is checked whether it is still a Huffman tree
    • If not, it will be rearranged, through swaps, and new codes will be assigned
• Sibling Property
  – Must be preserved during swaps
  – All nodes are arranged in the order of increasing counts, left to right and bottom to top.
  – During a swap, the farthest node with count N is swapped with the node whose count has just been increased to N + 1.


• Example
