Compression and Decompression techniques
-
7/27/2019 Compression and Decompression techniques
1/68
By: Shubhra Goyal
Definition
Why Compression
Types of compression
Binary image compression scheme
Video image compression
Audio image compression
Fractal image compression
Compression: the process of coding that will
effectively reduce the total number of bits
needed to represent certain information.
Input data -> Encoder (compression) -> Storage -> Decoder (decompression) -> Output data
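The encoder/storage/decoder pipeline can be sketched with Python's standard `zlib` module. It is used here only as a convenient lossless codec (the slides do not name it), and the sample input is made up for illustration.

```python
import zlib

# Hypothetical input: highly repetitive bytes, so there is redundancy to remove.
input_data = b"ABABABAB" * 100

# Encoder (compression): fewer bits are written to storage.
stored = zlib.compress(input_data)

# Decoder (decompression): the output data is rebuilt from storage.
output_data = zlib.decompress(stored)

print(len(input_data), "bytes in,", len(stored), "bytes stored")
print("round trip exact:", output_data == input_data)
```

Because this codec is lossless, the output equals the input exactly; a lossy codec would only approximate it.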
Compression is possible due to:
Redundancy in digital audio, image, and video data (silence removal, spatial redundancy, temporal redundancy).
Properties of human perception:
- A compressed version of digital audio, image, or video need not represent the original information exactly.
- Perception sensitivities are different for different signal patterns.
- The human eye is less sensitive to the higher spatial frequency components than to the lower frequencies (the basis of transform coding).
Video and audio have much higher storage
requirements than text.
Data transmission rates (in terms of bandwidth requirements) for sending continuous media
are considerably higher than text.
Efficient compression of audio and video data,
including some compression standards, will be
considered in this lesson.
Data compression is about storing and sending a smaller number
of bits. There are two major categories of methods for compressing data:
lossless and lossy methods.
Data compression methods
Lossless methods: run-length, LZW, CCITT Group 3 1D, CCITT Group 3 2D, CCITT Group 4
Lossy methods: JPEG, CCITT H.261, fractals, Intel DVI, MPEG
In lossless methods, the original data and the data after compression and decompression are exactly the same.
Redundant data is removed in compression and added back during decompression.
These methods reduce the data to between 1/10 and 1/50 of the original uncompressed size.
Lossless methods are used when we can't afford to lose any data, e.g. text compression for legal and medical documents, and computer programs.
There are five common lossless methods:
Run length encoding
CCITT Group 3 1D
CCITT Group 3 2D
CCITT Group 4
Lempel-Ziv-Welch algorithm (LZW)
Run-length encoding is the simplest method of
compression.
It can be used to compress data made of any
combination of symbols.
The general idea behind this method is to replace
consecutive repeating occurrences of a symbol by one
occurrence of the symbol followed by the number of
occurrences.
The method can be even more efficient if the data uses
only two symbols (for example 0 and 1) in its bit
pattern and one symbol is more frequent than the other.
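The idea can be sketched in a few lines of Python. The function names and the sample string are illustrative choices, not taken from the slides.

```python
from itertools import groupby

def rle_encode(data):
    """Replace each run of a repeated symbol by (symbol, number of occurrences)."""
    return [(symbol, len(list(run))) for symbol, run in groupby(data)]

def rle_decode(pairs):
    """Expand (symbol, count) pairs back into the original string."""
    return "".join(symbol * count for symbol, count in pairs)

encoded = rle_encode("AAAABBBCCD")
print(encoded)              # [('A', 4), ('B', 3), ('C', 2), ('D', 1)]
print(rle_decode(encoded))  # AAAABBBCCD
```

When the data uses only two symbols that strictly alternate in runs, the symbol itself can be dropped and only the run lengths stored, which is why the two-symbol case can be more efficient.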
Run-length coding is disadvantageous for a busy image. In a busy image, adjacent pixels or groups of adjacent pixels change rapidly, which leads to shorter run lengths of black pixels or white pixels.
In this case, the codes that represent the run lengths can take more bits, generating more bytes than the original number of bytes in the image.
This effect is called reverse compression or negative compression.
It was designed for black-and-white images only, not for grayscale or color images.
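A small sketch of why busy images hurt: it counts the runs in a scan line with long uniform stretches and in one where the pixel value changes at every position. The two sample rows and the helper name `run_lengths` are made up for illustration.

```python
from itertools import groupby

def run_lengths(row):
    """Lengths of the consecutive runs of identical pixels in one scan line."""
    return [len(list(run)) for _, run in groupby(row)]

flat_row = [0] * 12 + [1] * 4   # long runs of white, then black
busy_row = [0, 1] * 8           # pixel value changes at every position

print(run_lengths(flat_row))       # [12, 4]
print(len(run_lengths(busy_row)))  # 16
```

The flat row of 16 pixels is described by 2 run lengths, while the busy row needs 16 of them. Since every run-length code occupies at least one bit and usually several, the encoded busy row ends up larger than the 16 bits it started from.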
Many facsimile and document imaging file formats support a form of lossless data compression often described as CCITT encoding.
The CCITT (International Telegraph and Telephone Consultative Committee) is a standards organization that has developed a series of communications protocols for the facsimile transmission of black-and-white images over telephone lines and data networks.
The CCITT actually defines three algorithms for the
encoding of image data:
Group 3 One-Dimensional (G31D)
Group 3 Two-Dimensional (G32D)
Group 4 (G4)
Huffman encoding is used for encoding the pixel run lengths in CCITT Group 3 and Group 4.
It is a variable-length encoding scheme that generates the shortest codes for frequently occurring run lengths and longer codes for less frequently occurring run lengths.
Algorithm:
1. Make a leaf node for each code symbol. Add the generation probability of each symbol to its leaf node.
2. Take the two nodes with the smallest probability and connect them into a new node. Add 1 or 0 to each of the two branches. The probability of the new node is the sum of the probabilities of the two connecting nodes.
3. If there is only one node left, the code construction is completed. If not, go back to (2).
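The construction can be sketched with a priority queue. This is a minimal illustration assuming at least two symbols; the function name and the sample probabilities are invented, and the real CCITT code tables are fixed by the standard rather than built at run time.

```python
import heapq
from itertools import count

def huffman_codes(probabilities):
    """Build {symbol: bit string} by repeatedly merging the two least probable nodes."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    # One leaf node per symbol: (probability, tiebreak, {symbol: code so far}).
    heap = [(p, next(tiebreak), {symbol: ""}) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, left = heapq.heappop(heap)    # smallest probability
        p1, _, right = heapq.heappop(heap)   # second smallest
        # Label the two branches with 0 and 1 by prefixing every code below them.
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        # The new node carries the sum of the two probabilities.
        heapq.heappush(heap, (p0 + p1, next(tiebreak), merged))
    return heap[0][2]

codes = huffman_codes({"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125})
for symbol in sorted(codes):
    print(symbol, codes[symbol])
```

The most probable symbol receives a 1-bit code and the two least probable ones receive 3-bit codes, so frequent run lengths cost the fewest bits.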
Advantages
- it is simple to implement in both hardware
and software.
Disadvantages
- it is one-dimensional as it encodes each row or
line separately.
- it assumes a reliable communication link and does
not provide any protection mechanism.
The CCITT Group 3 2D compression scheme is also known as modified run-length encoding. This scheme is more commonly used for software-based document imaging systems.
While the CCITT Group 3 2D scheme provides fairly good compression, it is easier to implement in software than the CCITT Group 4 standard. The compression ratio averages somewhere between 10 and 25, falling between Group 3 1D and Group 4.
The compression scheme is based on the statistical nature of images. For example, the image data across adjacent scan lines is normally redundant when black and white transitions occur within plus or minus 3 pixels in the next line as well. Depending upon the scan resolution, one line of text may consist of 20-30 scan lines.
Two-dimensional coding
Images are divided into several groups of K lines.
The first line of each group is encoded using the CCITT Group 3 1D method.
The rest of the lines are encoded using "differential schemes".
Typical compression ratio: 10 to 20.
The "K-factor" allows more error-free transmission.
Worldwide facsimile standard.
The 2D scheme uses a combination of additional codes called vertical code,
pass code, and horizontal code.
There is only one pass code (0001) and one horizontal code (001).
If the vertical code and pass code do not apply, then the horizontal code is
applied.
Horizontal code + Group 3 1D code = 001 + makeup code + terminating code
Parse the coding line and look for the change in the
pixel value. The pixel value change is found at the
a1 location (a1 is the indicator that the pixel
changed from binary 0 to binary 1.)
Parse the reference line and look for the change in
the pixel value. The change is found at the b1
location.
Find the difference in the location between b1 and a1: Delta = b1 - a1.
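These three steps can be sketched as follows. This is a deliberately simplified illustration: it only looks at the first transition on each line, the sample scan lines are invented, and the full standard also tracks a starting position and pixel colour, which is omitted here. The plus-or-minus 3 pixel threshold is the one quoted earlier for 2D coding.

```python
def first_change(line):
    """Index of the first pixel that differs from its left neighbour, or len(line) if none."""
    for i in range(1, len(line)):
        if line[i] != line[i - 1]:
            return i
    return len(line)

# Hypothetical scan lines: 0 = white, 1 = black.
reference_line = [0, 0, 0, 1, 1, 1, 0, 0]
coding_line    = [0, 0, 0, 0, 1, 1, 1, 0]

a1 = first_change(coding_line)      # change on the coding line
b1 = first_change(reference_line)   # change on the reference line
delta = b1 - a1

# Simplification: pass mode is not modelled, only vertical versus horizontal.
mode = "vertical" if abs(delta) <= 3 else "horizontal"
print(a1, b1, delta, mode)
```

For these two lines the transition moves one pixel to the right, so Delta is -1 and a short vertical code is enough to describe it.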
Advantages
- the implementation of the K factor allows
error-free transmission.
- it is a worldwide facsimile standard.
- due to its 2-dimensional nature, the compression ratios achieved with this scheme are better than CCITT Group 3 1D.
Disadvantages
- it does not provide dense compression.
- it is complex and relatively difficult to implement in software.
The compression ratio was not sufficient for serious, high-resolution document imaging.
CCITT Group 4 is a 2D coding scheme without the K-factor; in effect, the K-factor in this scheme is the entire page of lines.
Here, the first reference line is an imaginary all-white line above the top of the image.
The first group of pixels is encoded using the imaginary white line as the reference line.
The new coded line becomes the reference line for the next scan line. Each successive line is coded relative to the previous line.
This provides a very high level of compression.
There are no EOL markers before the start of
the compressed data.
Fillers are not used for the scan line.
There is an EOP (End-Of-Page) mark consisting of
- two concatenated EOLs
- padding bits, added immediately after the end of the compressed data
Advantages:
- Better resolution
Disadvantages:
- Slow
- Complex
- As there is no reference line, a single error can result in the rest of the page being skewed.
LZW is a dictionary-based encoding algorithm.
It creates a dictionary (a table) of strings used during
the communication session.
If both the sender and the receiver have a copy of the
dictionary, then previously-encountered strings can be
substituted by their index in the dictionary to reduce
the amount of information transmitted.
In the compression phase there are two concurrent events:
- building an indexed dictionary and
- compressing a string of symbols.
The algorithm extracts the smallest substring that cannot be found in the dictionary from the remaining uncompressed string.
It then stores a copy of this substring in the dictionary as a new entry and assigns it an index value.
Compression occurs when the substring, except for the last character, is replaced with the index found in the dictionary.
The process then inserts the index and the last character of the substring into the compressed string.
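A minimal sketch of the compression phase as described here. Note that emitting an index together with the last character of each new substring corresponds to the variant usually called LZ78; LZW proper pre-loads the dictionary with single characters and emits indexes only. The function name and the use of index 0 for "empty prefix" are illustrative assumptions.

```python
def lz78_compress(text):
    """Return a list of (dictionary index, last character) pairs for text."""
    dictionary = {}   # substring -> index, starting at 1; 0 means "empty prefix"
    output = []
    current = ""
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate            # keep extending until the substring is new
        else:
            # Smallest substring not yet in the dictionary: emit prefix index + last char.
            output.append((dictionary.get(current, 0), ch))
            dictionary[candidate] = len(dictionary) + 1
            current = ""
    if current:
        # Input ended inside a known substring: emit its index with no character.
        output.append((dictionary[current], ""))
    return output

print(lz78_compress("ABABABA"))  # [(0, 'A'), (0, 'B'), (1, 'B'), (3, 'A')]
```

The seven input characters are covered by four pairs, and the later pairs stand for progressively longer substrings ("AB", then "ABA").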
Decompression is the inverse of the compression
process.
The process extracts the substrings from the
compressed string and tries to replace the indexes with
the corresponding entry in the dictionary, which is
empty at first and built up gradually.
The idea is that when an index is received, there is
already an entry in the dictionary corresponding to that
index.
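Decompression can be sketched in the same spirit, assuming the compressed string is a list of (index, character) pairs in which index 0 stands for the empty prefix. Both the pair format and the function name are assumptions made for this example.

```python
def lz78_decompress(pairs):
    """Rebuild the text from (dictionary index, character) pairs."""
    dictionary = {0: ""}   # holds only the empty prefix at first, built up gradually
    pieces = []
    for index, ch in pairs:
        # The index always refers to an entry created by an earlier pair.
        entry = dictionary[index] + ch
        pieces.append(entry)
        dictionary[len(dictionary)] = entry   # same numbering the compressor used
    return "".join(pieces)

print(lz78_decompress([(0, "A"), (0, "B"), (1, "B"), (3, "A")]))  # ABABABA
```

No dictionary needs to be transmitted: the receiver rebuilds exactly the same entries, in the same order, from the pairs themselves.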
Lossy methods are used for compressing images and video
files (our eyes cannot distinguish subtle
changes, so lossy data is acceptable).
These methods are cheaper, requiring less time and space.
Several methods:
- JPEG: compresses pictures and graphics
- MPEG: compresses video
- MP3: compresses audio
Color characteristics:
Luminance: the measure of the light emitted or reflected by an object.
Hue: the color sensation produced in an observer due to the presence of certain wavelengths of color.
Saturation: the depth of a color, for example the difference between red and pink.
A color model is an orderly system for creating a
whole range of colors from a small set of primary
colors.
There are two types of color models,
- Subtractive
- Additive
Additive color models use light to display color while
subtractive models use printing inks.
Colors perceived in additive models are the result of transmitted light, the typical technique
on color displays.
Colors perceived in subtractive models are the result of
reflected light, the typical technique in printers/plotters.
CMYK model:
The Cyan, Magenta, Yellow and Black (CMYK)
model is used in color printing devices.
It is a subtractive color model.
HSI MODEL (HSB MODEL):
The Hue, Saturation and Intensity model
represents tint, shade and tone.
This model is used in image processing for filtering and
smoothing images.
It requires a high level of computation.
YUV MODEL:
It is a 3-D and subtractive model.
Y is the luminance component (black and white information).
U and V are the chrominance components (coloured information).
U = red-cyan
V = magenta-green
Used in full motion video.
RGB Model:
This model is additive in nature: intensities of Red, Green and Blue
are added to generate various colors.
Used in the design of image capture devices,
television, and color monitors.
No color model is better than the others; the choice depends on the application.
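Color component conversion (RGB to YUV):
$$
\begin{aligned}
Y &= 0.299R + 0.587G + 0.114B \\
U &= 0.596R - 0.274G - 0.322B \\
V &= 0.211R - 0.523G + 0.312B
\end{aligned}
$$
Color component conversion (YUV to RGB):
$$
\begin{aligned}
R &= 1.0Y + 0.956U + 0.621V \\
G &= 1.0Y - 0.272U - 0.647V \\
B &= 1.0Y - 1.106U + 1.703V
\end{aligned}
$$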
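The two conversions can be sketched as plain functions. The coefficients used are the commonly quoted NTSC YIQ values, with U and V in the roles of I and Q and with the minus signs restored; treating them as what the slide intends is an assumption, as are the function names.

```python
def rgb_to_yuv(r, g, b):
    """Luminance Y plus two chrominance components from R, G, B."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.596 * r - 0.274 * g - 0.322 * b
    v = 0.211 * r - 0.523 * g + 0.312 * b
    return y, u, v

def yuv_to_rgb(y, u, v):
    """Approximate inverse of rgb_to_yuv (coefficients rounded to 3 decimals)."""
    r = y + 0.956 * u + 0.621 * v
    g = y - 0.272 * u - 0.647 * v
    b = y - 1.106 * u + 1.703 * v
    return r, g, b

print(rgb_to_yuv(255, 255, 255))                # white: full luminance, no chrominance
print(yuv_to_rgb(*rgb_to_yuv(200, 100, 50)))    # close to (200, 100, 50)
```

Each chrominance row sums to zero, so any grey pixel (R = G = B) maps to U = V = 0. That separation is what allows the colour components to be subsampled independently of luminance.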
The JPEG standard is a collaboration among:
International Telecommunication Union (ITU)
International Organization for Standardization (ISO)
International Electrotechnical Commission (IEC)
The official names of JPEG:
Joint Photographic Experts Group
ISO/IEC 10918-1 Digital compression and coding of continuous-tone still images
ITU-T Recommendation T.81
It should address image quality where visual
fidelity is very high and an encoder can be
parameterized to allow the user to set the
compression or the quality level.
Should compress any kind of continuous-tone
digital source image and is not restricted by
dimensions, color, aspect ratios etc.
Should be scalable from completely lossless to lossy.
Four operation modes:
Sequential encoding - each component is encoded in a left-to-right, top-to-bottom scan.
Progressive encoding - the image is decompressed so that a coarser image is displayed first and is filled in with more components as it is decompressed to a finer version of the image.
Hierarchical encoding - the image is compressed to multiple resolution levels.
Lossless encoding - the image can be guaranteed to provide full detail at the selected resolution when decompressed.
A codec is a device or computer program
capable of encoding and decoding a digital
stream of data or signal.
Codecs differ within an operation mode according to the precision of source image
they can handle or the entropy coding
method they use.
JPEG has three levels of definition:
Baseline system - it must decompress color images, maintain a high compression ratio, and handle from 4 bits/pixel to 16 bits/pixel. It ensures that software implementations are cost-effective.
Extended system - it covers various encoding aspects such as variable-length, progressive and hierarchical modes of encoding.
Special lossless function - it ensures there is no loss of detail in the compression and decompression process, although there is some loss in the scanning process.
Baseline sequential codec - it consists of three steps: formation of DCT coefficients, quantization, and entropy encoding. It is a rich compression scheme.
DCT progressive mode - the key steps of DCT coefficient formation and quantization are the same as above, with the difference that each component is coded in multiple scans instead of a single scan.
Predictive lossless encoding - defines a means of approaching lossless continuous-tone compression. A predictor combines sample areas and predicts neighboring areas.
Hierarchical mode - provides a means of carrying multiple resolutions. Each successive encoding is reduced by a factor of two, in either the horizontal or vertical dimension.
The main steps in JPEG encoding are the following
Transform RGB to YUV or YIQ and subsample color
DCT on 8x8 image blocks
Quantization
Zig-zag ordering and run-length encoding
Entropy coding
The image is divided up into 8x8 blocks
2D DCT is performed on each block
The DCT is performed independently for each
block
This is why, when a high degree of compression is
requested, JPEG gives a blocky image result
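Forward DCT:
$$
F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}
\quad \text{for } u = 0,\dots,7 \text{ and } v = 0,\dots,7
$$
$$
\text{where } C(k) = \begin{cases} 1/\sqrt{2} & \text{for } k = 0 \\ 1 & \text{otherwise} \end{cases}
$$
Inverse DCT:
$$
f(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\, C(v)\, F(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}
\quad \text{for } x = 0,\dots,7 \text{ and } y = 0,\dots,7
$$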
[Figure: 8x8 grid with axes u and v, each running from 0 to 7]
Y: the luminance of an image (width W, height H)
8x8 values of luminance:
48 39 40 68 60 38 50 121
149 82 79 101 113 106 27 62
58 63 77 69 124 107 74 125
80 97 74 54 59 71 91 66
18 34 33 46 64 61 32 37
149 108 80 106 116 61 73 92
211 233 159 88 107 158 161 109
212 104 40 44 71 136 113 66
DCT of the block:
699.25 43.18 55.25 72.11 24.00 -25.51 11.21 -4.14
-129.78 -71.50 -70.26 -73.35 59.43 -24.02 22.61 -2.05
85.71 30.32 61.78 44.87 14.84 17.35 15.51 -13.19
-40.81 10.17 -17.53 -55.81 30.50 -2.28 -21.00 -1.26
-157.50 -49.39 13.27 -1.78 -8.75 22.47 -8.47 -9.23
92.49 -9.03 45.72 -48.13 -58.51 -9.01 -28.54 10.38
-53.09 -62.97 -3.49 -19.62 56.09 -2.25 -3.28 11.91
-20.54 -55.90 -20.59 -18.19 -26.58 -27.07 8.47 0.31
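The forward and inverse 8x8 DCT can be written directly from their definitions and applied to the luminance block above. This is a slow but transparent sketch (real codecs use fast factorizations). The function names are illustrative, and x and u are assumed to index the rows of the printed matrices.

```python
import math

def c(k):
    """Normalisation factor C(k): 1/sqrt(2) for k = 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def cos_term(n, k):
    return math.cos((2 * n + 1) * k * math.pi / 16)

def forward_dct(f):
    """Coefficients F[u][v] of an 8x8 block f[x][y]."""
    return [[0.25 * c(u) * c(v) * sum(f[x][y] * cos_term(x, u) * cos_term(y, v)
                                      for x in range(8) for y in range(8))
             for v in range(8)]
            for u in range(8)]

def inverse_dct(F):
    """Samples f[x][y] recovered from coefficients F[u][v]."""
    return [[0.25 * sum(c(u) * c(v) * F[u][v] * cos_term(x, u) * cos_term(y, v)
                        for u in range(8) for v in range(8))
             for y in range(8)]
            for x in range(8)]

block = [
    [48, 39, 40, 68, 60, 38, 50, 121],
    [149, 82, 79, 101, 113, 106, 27, 62],
    [58, 63, 77, 69, 124, 107, 74, 125],
    [80, 97, 74, 54, 59, 71, 91, 66],
    [18, 34, 33, 46, 64, 61, 32, 37],
    [149, 108, 80, 106, 116, 61, 73, 92],
    [211, 233, 159, 88, 107, 158, 161, 109],
    [212, 104, 40, 44, 71, 136, 113, 66],
]

F = forward_dct(block)
print(round(F[0][0], 2))  # DC coefficient: 699.25
restored = inverse_dct(F)
print(max(abs(restored[x][y] - block[x][y]) for x in range(8) for y in range(8)))
```

The DC coefficient is simply the sum of all 64 samples divided by 8. The inverse transform reproduces the block up to floating-point error, which shows that the DCT itself loses nothing; the loss in JPEG comes from the quantization step.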
Quantization in JPEG aims at reducing the
total number of bits in the compressed image
Divide each entry in the frequency space block
by an integer, then round
Use a quantization matrix Q(u, v)
Use larger entries in Q for the higher spatial frequencies.
- These are the entries toward the lower right part of the matrix.
The following slide shows the default Q(u, v) values for luminance and chrominance.
- Based on psychophysical studies intended to maximize compression ratios while minimizing perceptual distortion.
- Since after division the entries are smaller, we can use fewer bits to encode them.
Q(u,v): quantization matrix
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99
F(u,v): 8x8 DCT coefficients
699.25 43.18 55.25 72.11 24.00 -25.51 11.21 -4.14
-129.78 -71.50 -70.26 -73.35 59.43 -24.02 22.61 -2.05
85.71 30.32 61.78 44.87 14.84 17.35 15.51 -13.19
-40.81 10.17 -17.53 -55.81 30.50 -2.28 -21.00 -1.26
-157.50 -49.39 13.27 -1.78 -8.75 22.47 -8.47 -9.23
92.49 -9.03 45.72 -48.13 -58.51 -9.01 -28.54 10.38
-53.09 -62.97 -3.49 -19.62 56.09 -2.25 -3.28 11.91
-20.54 -55.90 -20.59 -18.19 -26.58 -27.07 8.47 0.31
$F(u,v) / Q(u,v)$:
43.70 3.93 5.52 4.51 1.00 -0.64 0.22 -0.07
-10.82 -5.96 -5.02 -3.86 2.29 -0.41 0.38 -0.04
6.12 2.33 3.86 1.87 0.37 0.30 0.22 -0.24
-2.91 0.60 -0.80 -1.92 0.60 -0.03 -0.26 -0.02
-8.75 -2.25 0.36 -0.03 -0.13 0.21 -0.08 -0.12
3.85 -0.26 0.83 -0.75 -0.72 -0.09 -0.25 0.11
-1.08 -0.98 -0.04 -0.23 0.54 -0.02 -0.03 0.12
-0.29 -0.61 -0.22 -0.19 -0.24 -0.27 0.08 0.00
$F_q(u,v) = \mathrm{Round}\left(\dfrac{F(u,v)}{Q(u,v)}\right)$:
44 4 6 5 1 -1 0 0
-11 -6 -5 -4 2 0 0 0
6 2 4 2 0 0 0 0
-3 1 -1 -2 1 0 0 0
-9 -2 0 0 0 0 0 0
4 0 1 -1 -1 0 0 0
-1 -1 0 0 1 0 0 0
0 -1 0 0 0 0 0 0
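The quantization step can be reproduced with the Q(u,v) and F(u,v) tables shown earlier. A minimal sketch follows; the function name is illustrative, and Python's built-in `round` (which rounds exact halves to the even neighbour) stands in for the Round operator.

```python
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]
F = [
    [699.25, 43.18, 55.25, 72.11, 24.00, -25.51, 11.21, -4.14],
    [-129.78, -71.50, -70.26, -73.35, 59.43, -24.02, 22.61, -2.05],
    [85.71, 30.32, 61.78, 44.87, 14.84, 17.35, 15.51, -13.19],
    [-40.81, 10.17, -17.53, -55.81, 30.50, -2.28, -21.00, -1.26],
    [-157.50, -49.39, 13.27, -1.78, -8.75, 22.47, -8.47, -9.23],
    [92.49, -9.03, 45.72, -48.13, -58.51, -9.01, -28.54, 10.38],
    [-53.09, -62.97, -3.49, -19.62, 56.09, -2.25, -3.28, 11.91],
    [-20.54, -55.90, -20.59, -18.19, -26.58, -27.07, 8.47, 0.31],
]

def quantize(coefficients, table):
    """Divide each DCT coefficient by its quantization entry and round to an integer."""
    return [[round(f / q) for f, q in zip(f_row, q_row)]
            for f_row, q_row in zip(coefficients, table)]

Fq = quantize(F, Q)
for row in Fq:
    print(row)
```

Most of the high-frequency entries collapse to 0. That is the source of both the compression and the loss: multiplying back by Q(u,v) at the decoder recovers only an approximation of F(u,v).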
0 1 5 6 14 15 27 28
2 4 7 13 16 26 29 42
3 8 12 17 25 30 41 43
9 11 18 24 31 40 44 53
10 19 23 32 39 45 52 54
20 22 33 38 46 51 55 60
21 34 37 47 50 56 59 61
35 36 48 49 57 58 62 63
The zigzag sequence starts at the DC coefficient value.
It is designed to facilitate entropy coding by placing low-frequency
coefficients (which are more likely to be non-zero) before high-frequency coefficients.
$F_q(u,v)$:
44 4 6 5 1 -1 0 0
-11 -6 -5 -4 2 0 0 0
6 2 4 2 0 0 0 0
-3 1 -1 -2 1 0 0 0
-9 -2 0 0 0 0 0 0
4 0 1 -1 -1 0 0 0
-1 -1 0 0 1 0 0 0
0 -1 0 0 0 0 0 0
Zig-zag reordering (one diagonal per line):
44,
4, -11,
6, -6, 6,
5, -5, 2, -3,
-9, 1, 4, -4, 1,
-1, 2, 2, -1, -2, 4,
-1, 0, 0, -2, 0, 0, 0,
0, 0, 0, 1, 0, 1, -1, 0,
-1, 0, -1, 0, 0, 0, 0,
0, 0, 0, -1, 0, 0,
0, 1, 0, 0, 0,
0, 0, 0, 0,
0, 0, 0,
0, 0,
0
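The reordering can be generated rather than tabulated. The sketch below sorts the 64 positions by anti-diagonal and alternates the direction of travel on each diagonal; the function names are illustrative.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag scan order."""
    def key(position):
        row, col = position
        diagonal = row + col
        # Odd diagonals are walked with the row increasing,
        # even diagonals with the row decreasing.
        return (diagonal, row if diagonal % 2 else -row)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def zigzag(block):
    """Flatten a square block into its zig-zag sequence."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

Fq = [
    [44, 4, 6, 5, 1, -1, 0, 0],
    [-11, -6, -5, -4, 2, 0, 0, 0],
    [6, 2, 4, 2, 0, 0, 0, 0],
    [-3, 1, -1, -2, 1, 0, 0, 0],
    [-9, -2, 0, 0, 0, 0, 0, 0],
    [4, 0, 1, -1, -1, 0, 0, 0],
    [-1, -1, 0, 0, 1, 0, 0, 0],
    [0, -1, 0, 0, 0, 0, 0, 0],
]

sequence = zigzag(Fq)
print(sequence[:10])   # [44, 4, -11, 6, -6, 6, 5, -5, 2, -3]
print(sequence[51:])   # the tail is all zeros
```

Everything from position 51 onward is zero, so the tail can be run-length encoded very cheaply before entropy coding.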
Entropy is used in thermodynamics for the study of heat and work.
In data compression, it is a measure of the information content of a message in number of bits.
Entropy in no. of bits $= -\log_2(\text{probability of object})$
The object can be a character.
E.g. if the probability of the character T appearing in a string is 1/8, its
entropy is 3 bits; i.e. if there are 7 T's in a text string, then they
can be represented by 21 bits.
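The arithmetic of this example can be checked in a couple of lines. The function name is illustrative.

```python
import math

def information_bits(probability):
    """Bits needed for one occurrence of an object with the given probability."""
    return -math.log2(probability)

bits_per_t = information_bits(1 / 8)
print(bits_per_t)        # 3.0
print(7 * bits_per_t)    # 21.0
```

The minus sign matters: a probability is at most 1, so its logarithm is zero or negative, and rarer objects cost more bits.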
JPEG uses two entropy coding schemes: Huffman and arithmetic coding.
Huffman coding requires one or more sets of Huffman code tables for coding as well as decoding.
Arithmetic coding uses the DC and AC coefficients. The coefficient at position (0,0) in the matrix is called the DC coefficient and the other 63 are called AC coefficients.