transform coding 2005

7/27/2019 Transform Coding 2005

1/80

Lecture notes of Image Compression and Video

Compression series 2005

2. Transform Coding

Prof. Amar Mukherjee

Weifeng Sun


2/80

#2

Topics

z Introduction to Image Compression

z Transform Coding

z Subband Coding, Filter Banks

z Haar Wavelet Transform

z SPIHT, EZW, JPEG 2000

z Motion Compensation

z Wireless Video Compression


3/80

#3

Transform Coding

z Why transform coding?

z The purpose of the transformation is to convert the data into a form where compression is easier. The transformation maps correlated pixels into a representation in which they are decorrelated. The new values are usually smaller on average than the original values. The net effect is to reduce the redundancy of the representation.

z For lossy compression, the transform coefficients can then be quantized according to their statistical properties, producing a much more compressed representation of the original image data.


4/80

#4

Transform Coding Block Diagram

z Transmitter: Original image f(j,k) -> Segment into n*n blocks -> Forward transform -> F(u,v) -> Quantization and coder -> F*(u,v) -> Channel

z Receiver: Channel -> F*(u,v) -> Decoder -> F(u,v) -> Inverse transform -> Combine n*n blocks -> Reconstructed image f(j,k)


5/80

#5

How Transform Coders Work

z Divide the image into 1x2 blocks

z Typical transforms are 8x8 or 16x16

[Figure: an image divided into 1x2 blocks; the two pixels of each block are labeled x1 and x2]


6/80

#6

Joint Probability Distribution

z Observe the joint probability distribution, or the joint histogram.

[Figure: joint histogram of the pixel pairs; axes x1, x2 and probability]


7/80

#7

Pixel Correlation in Image[Amar]

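z Rotate 45° clockwise

Source Image: Amar

[Figure: distribution of pixel pairs, before rotation and after rotation]

$$
Y = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}
  = \begin{pmatrix} \cos 45^\circ & \sin 45^\circ \\ -\sin 45^\circ & \cos 45^\circ \end{pmatrix}
    \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}
$$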


8/80

#8

Pixel Correlation Map in [Amar]

-- coordinate distribution

z Upper: before rotation

z Lower: after rotation

z Notice that the variance of Y2 is smaller than the variance of X2.

z Compression: apply an entropy coder to Y2.


9/80

#9

Pixel Correlation in Image[Lenna]

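z Let's look at another example

Source Image: Lenna

[Figure: distribution of pixel pairs, before rotation and after rotation]

$$
Y = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}
  = \begin{pmatrix} \cos 45^\circ & \sin 45^\circ \\ -\sin 45^\circ & \cos 45^\circ \end{pmatrix}
    \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}
$$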


10/80

#10

Pixel Correlation Map in [Lenna]

-- coordinate distribution

z Upper: before rotation

z Lower: after rotation

z Notice that the variance of Y2 is smaller than the variance of X2.

z Compression: apply an entropy coder to Y2.


11/80

#11

Rotation Matrix

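z Rotated 45 degrees clockwise

z Rotation matrix A

$$
Y = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}
  = \begin{pmatrix} \cos 45^\circ & \sin 45^\circ \\ -\sin 45^\circ & \cos 45^\circ \end{pmatrix}
    \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}
  = \begin{pmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{pmatrix}
    \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}
  = AX
$$

$$
A = \begin{pmatrix} \cos 45^\circ & \sin 45^\circ \\ -\sin 45^\circ & \cos 45^\circ \end{pmatrix}
  = \begin{pmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{pmatrix}
  = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}
$$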


12/80

#12

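Orthogonal/orthonormal Matrix

z The rotation matrix is orthogonal.

z The dot product of a row with itself is nonzero.

z The dot product of different rows is 0.

$$
A_i \cdot A_j \begin{cases} \neq 0 & \text{if } i = j \\ = 0 & \text{else} \end{cases}
$$

z Furthermore, the rotation matrix is orthonormal.

z The dot product of a row with itself is 1.

$$
A_i \cdot A_j = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{else} \end{cases}
$$

z Example:

$$
A = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}
$$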


13/80

#13

Reconstruct the Image

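z Goal: recover X from Y.

z Since Y = AX, we have $X = A^{-1}Y$.

z Because the inverse of an orthonormal matrix is its transpose, we have $A^{-1} = A^T$.

z So, $X = A^{-1}Y = A^T Y$.

z The inverse matrix is

$$
A^{-1} = A^T
  = \begin{pmatrix} \cos 45^\circ & -\sin 45^\circ \\ \sin 45^\circ & \cos 45^\circ \end{pmatrix}
  = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}
$$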


14/80

#14

Energy Compaction

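z The rotation matrix A compacts the energy into Y1.

z Energy is the variance (http://davidmlane.com/hyperstat/A16252.html):

$$
\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2, \quad \text{where } \mu \text{ is the mean}
$$

z Given:

$$
X = \begin{pmatrix} 4 \\ 5 \end{pmatrix}, \quad
A = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}, \quad \text{then} \quad
Y = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} = AX
  = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 4 \\ 5 \end{pmatrix}
  = \frac{\sqrt{2}}{2} \begin{pmatrix} 9 \\ 1 \end{pmatrix}
  = \begin{pmatrix} 6.364 \\ 0.707 \end{pmatrix}
$$

z The total variance (here computed as the sum of squares) of X equals that of Y: $4^2 + 5^2 = 6.364^2 + 0.707^2 = 41$.

z The transformation makes Y2 (0.707) very small.

z If we discard min{X}, the error is $4^2/41 = 0.39$.

z If we discard min{Y}, the error is $0.707^2/41 = 0.012$.

z Conclusion: we can discard min{Y} with much more confidence.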
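The two-pixel example can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`matvec`, `transpose`, `energy`) are illustrative and not part of the lecture, and the sign pattern of A is the one that reproduces Y = (6.364, 0.707).

```python
import math

# 45-degree rotation matrix: A = (sqrt(2)/2) * [[1, 1], [-1, 1]]
S = math.sqrt(2) / 2
A = [[S, S], [-S, S]]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def energy(v):
    """Sum of squares, the quantity used as 'energy' in the example."""
    return sum(x * x for x in v)

X = [4, 5]
Y = matvec(A, X)                     # forward transform Y = AX
X_back = matvec(transpose(A), Y)     # inverse transform X = A^T Y

# Relative error caused by discarding the smaller component
err_x = min(X) ** 2 / energy(X)
err_y = min(Y, key=abs) ** 2 / energy(Y)

print([round(y, 3) for y in Y])
print(round(err_x, 2), round(err_y, 3))
```

Because A is orthonormal, the transpose is enough to undo the rotation, and the energy of Y matches that of X up to floating-point rounding.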


15/80

#15

Idea of Transform Coding

z Transform the input pixels X0,X1,X2,...,Xn-1 into coefficients Y0,Y1,...,Yn-1 (real values)

z The coefficients have the property that most of them are near zero.

z Most of the energy is compacted into a few coefficients.

z Scalar quantize the coefficients

z This is bit allocation.

z Important coefficients should have more quantization levels.

z Entropy encode the quantization symbols.


16/80

#16

Forward transform (1D)

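z Get the sequence Y from the sequence X.

z Each element of Y is a linear combination of elements in X.

$$
\begin{pmatrix} Y_0 \\ Y_1 \\ \vdots \\ Y_{n-1} \end{pmatrix}
=
\begin{pmatrix}
a_{0,0} & \cdots & a_{0,n-1} \\
a_{1,0} & \cdots & a_{1,n-1} \\
\vdots & \ddots & \vdots \\
a_{n-1,0} & \cdots & a_{n-1,n-1}
\end{pmatrix}
\begin{pmatrix} X_0 \\ X_1 \\ \vdots \\ X_{n-1} \end{pmatrix}
$$

$$
Y_j = \sum_{i=0}^{n-1} a_{j,i} X_i, \quad j = 0, 1, \ldots, n-1, \qquad Y = AX
$$

The rows of A are the basis vectors. The elements of the matrix are also called the weights of the linear transform, and they should be independent of the data (except for the KLT).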


17/80

#17

Choosing the Weights of the Basis Vector

z The general guideline for choosing the values of A is to make Y0 large while keeping the remaining Y1,...,Yn-1 small.

z The value of a coefficient Yi will be large if the weights aij reinforce the corresponding data items Xj. This requires the weights and the data values to have similar signs. The converse is also true: Yi will be small if the weights and the data values have dissimilar signs.


18/80

#18

Extracting Features of Data

z Thus, the basis vectors should extract distinct features of the data vectors and must be independent (orthogonal). Note the pattern of distribution of +1 and -1 in the matrix. They are intended to pick up the low and high frequency components of the data.

z Normally, the coefficients decrease in the order Y0,Y1,...,Yn-1.

z So, Y is more amenable to compression than X.

$$ Y = AX $$


19/80

#19

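Energy Preserving (1D)

z Another consideration in choosing the rotation matrix is to conserve energy.

z For example, we have the orthogonal matrix

$$
A = \begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1 \\
1 & -1 & 1 & -1
\end{pmatrix}, \quad
X = \begin{pmatrix} 4 \\ 6 \\ 5 \\ 2 \end{pmatrix}, \quad
Y = AX = \begin{pmatrix} 17 \\ 3 \\ -5 \\ 1 \end{pmatrix}
$$

z Energy before rotation: $4^2 + 6^2 + 5^2 + 2^2 = 81$

z Energy after rotation: $17^2 + 3^2 + (-5)^2 + 1^2 = 324$

z Energy changed!

z Solution: scale A by a scale factor (here 1/2, since 324 = 4 x 81). The scaling does not change the fact that most of the energy is concentrated in the low frequency components.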
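The energy change and its repair by scaling can be verified directly. This is a small sketch in plain Python; the sign pattern of A below is the one that yields Y = (17, 3, -5, 1) for X = (4, 6, 5, 2), and the helper names are illustrative.

```python
# Unscaled 4x4 matrix with +1/-1 entries (rows are orthogonal, not unit length)
A = [[1, 1, 1, 1],
     [1, 1, -1, -1],
     [1, -1, -1, 1],
     [1, -1, 1, -1]]
X = [4, 6, 5, 2]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

def energy(v):
    """Sum of squares of the elements."""
    return sum(x * x for x in v)

Y = matvec(A, X)                  # unscaled transform
Y_scaled = [y / 2 for y in Y]     # same transform with A scaled by 1/2

print(Y, energy(X), energy(Y), energy(Y_scaled))
```

Each row of the unscaled matrix has squared length 4, which is why the energy grows by a factor of 4 and why the factor 1/2 restores it.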


20/80

#20

Energy Preserving, Formal Proof

z The sum of the squares of the transformed sequence is the same as the sum of the squares of the original sequence.

z Most of the energy is concentrated in the low frequency coefficients.

$$
\text{Energy} = \sum_{i=1}^{n} Y_i^2 = Y^T Y = (AX)^T (AX) = X^T A^T A X = X^T X = \sum_{i=1}^{n} X_i^2
$$

The step $A^T A = I$ holds because A is an orthonormal matrix; see page #12.


21/80

#21

Why are we interested in the orthonormal matrix?

z Normally, it is computationally difficult to get the inverse matrix.

z The inverse of an orthonormal transformation matrix is simply its transpose.

$$ A^{-1} = A^T $$

[Example matrix: A = (1/sqrt(8)) times an 8x8 matrix whose entries are all +1 or -1, with B = A^{-1} = A^T]


22/80

#22

Two Dimensional Transform

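z From input Image I, we get D.

z Given Transform matrix A

$$
A = \frac{1}{2} \begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1 \\
1 & -1 & 1 & -1
\end{pmatrix}, \quad
D = \begin{pmatrix}
4 & 7 & 6 & 9 \\
6 & 8 & 3 & 6 \\
5 & 4 & 7 & 6 \\
2 & 4 & 5 & 9
\end{pmatrix}
$$

z 2D transformation goes as (with X = D):

$$
Y = A X A^T = \begin{pmatrix}
22.75 & -2.75 & 0.75 & -3.75 \\
1.75 & 3.25 & -0.25 & -1.75 \\
0.25 & -3.25 & 0.25 & -2.25 \\
1.25 & -1.25 & -0.75 & 1.75
\end{pmatrix}
$$

z Notice the energy compaction.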
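The 2D example can be reproduced with a few lines of plain Python. This is a sketch under one assumption: the signs of A are chosen so that the magnitudes of the resulting coefficients match those on the slide. `matmul` and `transpose` are illustrative helper names.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

# Assumed sign pattern, scaled by 1/2 so that the rows are orthonormal
A = [[0.5 * s for s in row] for row in (
    (1, 1, 1, 1),
    (1, 1, -1, -1),
    (1, -1, -1, 1),
    (1, -1, 1, -1),
)]
D = [[4, 7, 6, 9],
     [6, 8, 3, 6],
     [5, 4, 7, 6],
     [2, 4, 5, 9]]

Y = matmul(matmul(A, D), transpose(A))        # forward: Y = A D A^T
D_back = matmul(matmul(transpose(A), Y), A)   # inverse: D = A^T Y A

for row in Y:
    print(row)
```

The coefficient in the upper-left corner carries most of the energy, and applying the transposes in the opposite order recovers D exactly.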


23/80

#23

Two Dimensional Transform

z Because the transformation matrix is orthonormal, we have

$$ A^T = A^{-1} $$

z Forward transform

$$ Y = A X A^{-1} = A X A^T $$

z Backward transform

$$ X = A^{-1} Y A = A^T Y A $$


24/80

#24

Linear Separable transform

z The two-dimensional transform simplifies to two passes of the one-dimensional transform.

z Column-wise transform and row-wise transform.

$$
Y_{k,l} = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} a_{k,i} \, X_{i,j} \, a_{l,j}
$$


25/80

#25

Transform and Filtering

z Consider the orthonormal transform

$$
A = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
$$

z If A is used to transform a vector of 2 identical elements $x = [x, x]^T$, the transformed sequence will be $(\sqrt{2}x, 0)^T$, indicating that the low frequency or average value is $\sqrt{2}x$ and the high frequency component is 0 because the signal value does not vary.

z If $x = [3, 1]^T$ or $[3, -1]^T$, the output sequence will be $(2\sqrt{2}, \sqrt{2})^T$ and $(\sqrt{2}, 2\sqrt{2})^T$ respectively. Now the high frequency component has a positive value, and it is bigger for $[3, -1]^T$, indicating a much larger variation. Thus, the two coefficients behave like the outputs of a low-pass and a high-pass filter, respectively.


26/80

#26

Transform and Functional

Approximation

z Transform is a kind of function approximation.

z Image is a data set. Any data set is a function.

z Transform is to approximate the image function by a combination of simpler, well defined waveforms (basis functions).

z Not all basis sets are equal in terms of compression.

z DCT and Wavelets are computationally easier than Fourier.


27/80

#27

Comparison of various transforms

z The KLT is optimal in the sense of decorrelation and energy packing.

z The Walsh-Hadamard Transform is especially easy to implement.

z Its basis functions are either -1 or +1, so only additions and subtractions are necessary.


28/80

#28

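Two-Dimensional Basis Matrix

z The outer product of two row vectors V1 and V2 is defined as $V_1^T V_2$. For example,

$$
V_1^T V_2 = \begin{pmatrix} a \\ b \\ c \end{pmatrix} \begin{pmatrix} d & e & f \end{pmatrix}
= \begin{pmatrix} ad & ae & af \\ bd & be & bf \\ cd & ce & cf \end{pmatrix}
$$

z For a matrix A of size n*n, the outer product of the ith row (written as a column) and the jth row is defined as

$$
\alpha_{ij} =
\begin{pmatrix} a_{i,0} \\ a_{i,1} \\ \vdots \\ a_{i,n-1} \end{pmatrix}
\begin{pmatrix} a_{j,0} & a_{j,1} & \cdots & a_{j,n-1} \end{pmatrix}
=
\begin{pmatrix}
a_{i,0}a_{j,0} & a_{i,0}a_{j,1} & \cdots & a_{i,0}a_{j,n-1} \\
a_{i,1}a_{j,0} & a_{i,1}a_{j,1} & \cdots & a_{i,1}a_{j,n-1} \\
\vdots & \vdots & \ddots & \vdots \\
a_{i,n-1}a_{j,0} & a_{i,n-1}a_{j,1} & \cdots & a_{i,n-1}a_{j,n-1}
\end{pmatrix}
$$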


29/80

#29

Outer Product

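z For example, if

$$
A = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
$$

z We have:

$$
\alpha_{00} = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & 1 \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}
$$

$$
\alpha_{01} = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & -1 \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix}
$$

$$
\alpha_{10} = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 \\ -1 \end{pmatrix} \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & 1 \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}
$$

$$
\alpha_{11} = \frac{\sqrt{2}}{2} \begin{pmatrix} 1 \\ -1 \end{pmatrix} \frac{\sqrt{2}}{2} \begin{pmatrix} 1 & -1 \end{pmatrix}
= \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}
$$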


30/80

#30

Outer Product (2)

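z We have shown earlier that

$$ X = A^T Y A $$

z Consider X to be a 2*2 matrix:

$$
\begin{aligned}
X = \begin{pmatrix} x_{00} & x_{01} \\ x_{10} & x_{11} \end{pmatrix}
&= \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
   \begin{pmatrix} y_{00} & y_{01} \\ y_{10} & y_{11} \end{pmatrix}
   \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \\
&= \frac{1}{2} \begin{pmatrix}
   y_{00} + y_{01} + y_{10} + y_{11} & y_{00} - y_{01} + y_{10} - y_{11} \\
   y_{00} + y_{01} - y_{10} - y_{11} & y_{00} - y_{01} - y_{10} + y_{11}
   \end{pmatrix} \\
&= y_{00} \frac{1}{2} \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}
 + y_{01} \frac{1}{2} \begin{pmatrix} 1 & -1 \\ 1 & -1 \end{pmatrix}
 + y_{10} \frac{1}{2} \begin{pmatrix} 1 & 1 \\ -1 & -1 \end{pmatrix}
 + y_{11} \frac{1}{2} \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} \\
&= y_{00}\alpha_{00} + y_{01}\alpha_{01} + y_{10}\alpha_{10} + y_{11}\alpha_{11}
\end{aligned}
$$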


31/80

#31

Basis Matrix

z The quantities $\alpha_{00}, \alpha_{01}, \alpha_{10}, \alpha_{11}$ are called the basis matrices in 2-D space.

z In general, the outer products of the rows of an n*n orthonormal matrix form a set of basis matrices in 2 dimensions. The quantity $\alpha_{00}$ is called the DC coefficient (note that all the elements for the DC coefficient are 1, up to the scale factor, indicating an averaging operation), and the other coefficients have alternating values and are called AC coefficients.
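The decomposition of a block into basis matrices can be checked numerically. The sketch below is plain Python; the 2x2 matrix A and the coefficient values in Y are assumptions chosen for illustration, and `outer`, `basis`, `matmul` and `transpose` are hypothetical helper names.

```python
import math

S = math.sqrt(2) / 2
A = [[S, S], [S, -S]]           # assumed 2x2 orthonormal matrix

def outer(u, v):
    """Outer product of two vectors: column u times row v."""
    return [[a * b for b in v] for a in u]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def basis(i, j):
    """Basis matrix built from rows i and j of A."""
    return outer(A[i], A[j])

Y = [[5.0, 1.0], [-2.0, 0.5]]   # hypothetical transform coefficients

# Reconstruction by matrix products: X = A^T Y A
X_direct = matmul(matmul(transpose(A), Y), A)

# Reconstruction as a weighted sum of the four basis matrices
X_sum = [[sum(Y[i][j] * basis(i, j)[r][c] for i in range(2) for j in range(2))
          for c in range(2)] for r in range(2)]

print(X_direct)
print(X_sum)
```

Both routes give the same block, which is the sense in which the coefficients weight the basis matrices.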


32/80

#32

Fast Cosine Transform

z 2D 8X8 basis functions of the DCT:

z The horizontal frequency of the basis functions increases from left to right, and the vertical frequency of the basis functions increases from top to bottom.

[Figure: the 64 two-dimensional 8X8 DCT basis functions]


33/80

#33

Amplitude distribution of the DCT

coefficients

z Histograms of the 8x8 DCT coefficient amplitudes, measured for natural images

z The DC coefficient is typically uniformly distributed.

z The AC coefficients have a Laplacian distribution with zero mean.


34/80

#34

Discrete Cosine Transform (DCT)

z Conventional image data have reasonably high inter-element correlation.

z The DCT avoids the generation of the spurious spectral components that are a problem with the DFT, and it has a fast implementation that avoids complex-number arithmetic.


35/80

#35

One-dimensional DCT

z The basic idea is to decompose the image into a set of waveforms, each with a particular spatial frequency.

z To human eyes, high spatial frequencies are largely imperceptible, and a good approximation of the image can be created by keeping only the lower frequencies.

z Consider the one-dimensional case first. The 8 arbitrary grayscale values (with range 0 to 255, shown in the next slide) are level shifted by 128 (as is done by JPEG).


36/80

#36

One-dimensional DCT

[Figure: An example of 1-D DCT decomposition. (b) Before DCT (image data): f(x) plotted against x = 1..8, vertical axis -150 to 150. (c) After DCT (coefficients): S(u) plotted against u = 1..8, vertical axis -150 to 150.]

[Figure: The 8 basis functions for 1-D DCT, one panel for each of U = 0, 1, ..., 7, amplitude (-1 to 1) plotted against sample positions 1..8.]


37/80

#37

One-dimensional DCT

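z The waveforms can be denoted as $w(f) = \cos(f\theta)$, with $0 \le \theta \le \pi$, with frequencies f = 0, 1, ..., 7. Each wave is sampled at the 8 points

$$
\theta = \frac{\pi}{16}, \frac{3\pi}{16}, \frac{5\pi}{16}, \frac{7\pi}{16}, \frac{9\pi}{16}, \frac{11\pi}{16}, \frac{13\pi}{16}, \frac{15\pi}{16}
$$

to form a basis vector.

z The eight basis vectors constitute a matrix A:

$$
A = \begin{pmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
0.981 & 0.831 & 0.556 & 0.195 & -0.195 & -0.556 & -0.831 & -0.981 \\
0.924 & 0.383 & -0.383 & -0.924 & -0.924 & -0.383 & 0.383 & 0.924 \\
0.831 & -0.195 & -0.981 & -0.556 & 0.556 & 0.981 & 0.195 & -0.831 \\
0.707 & -0.707 & -0.707 & 0.707 & 0.707 & -0.707 & -0.707 & 0.707 \\
0.556 & -0.981 & 0.195 & 0.831 & -0.831 & -0.195 & 0.981 & -0.556 \\
0.383 & -0.924 & 0.924 & -0.383 & -0.383 & 0.924 & -0.924 & 0.383 \\
0.195 & -0.556 & 0.831 & -0.981 & 0.981 & -0.831 & 0.556 & -0.195
\end{pmatrix}
$$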
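The matrix of sampled cosines can be generated rather than typed in. The following sketch uses only the `math` module; `cosine_basis` and `dot` are illustrative names, not functions from the lecture.

```python
import math

def cosine_basis(n=8):
    """Rows f = 0..n-1 hold cos(f * theta) sampled at theta = (2k+1)*pi/(2n)."""
    return [[math.cos(f * (2 * k + 1) * math.pi / (2 * n)) for k in range(n)]
            for f in range(n)]

def dot(u, v):
    """Dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

A = cosine_basis()
for row in A:
    print(" ".join("{:6.3f}".format(v) for v in row))
```

The rows are mutually orthogonal but not of unit length: the squared length is 8 for f = 0 and 4 for the other rows, which is what the normalization factors in the DCT definition compensate for.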


38/80

#38


39/80

#39

One-dimensional DCT

z The output of the DCT transform is:

$$ S(u) = A \, [I(x)]^T $$

where A is the 8*8 transformation matrix defined in the previous slide, and I(x) is the input signal.

z S(u) are called the coefficients of the DCT transform for the input signal I(x).


40/80

#40

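One-dimensional FDCT and IDCT with N=8 Sample points

z The 1-D DCT in JPEG is defined as:

z FDCT

$$
S(u) = \frac{C(u)}{2} \sum_{x=0}^{7} I(x) \cos\left[\frac{(2x+1)u\pi}{16}\right], \quad \text{for } u = 0, 1, \ldots, 7
$$

z IDCT

$$
I(x) = \sum_{u=0}^{7} \frac{C(u)}{2} S(u) \cos\left[\frac{(2x+1)u\pi}{16}\right], \quad \text{for } x = 0, 1, \ldots, 7
$$

z Where (u is frequency)

z I(x) is the 1-D sample

z S(u) is the 1-D DCT coefficient

z And

$$
C(u) = \begin{cases} \frac{\sqrt{2}}{2} & \text{for } u = 0 \\ 1 & \text{for } u > 0 \end{cases}
$$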


41/80

#41

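One-dimensional FDCT and IDCT with N Sample points

z FDCT

$$
S(u) = \sqrt{\frac{2}{N}} \, C(u) \sum_{x=0}^{N-1} I(x) \cos\left[\frac{(2x+1)u\pi}{2N}\right], \quad \text{for } u = 0, 1, \ldots, N-1
$$

z IDCT

$$
I(x) = \sqrt{\frac{2}{N}} \sum_{u=0}^{N-1} C(u) \, S(u) \cos\left[\frac{(2x+1)u\pi}{2N}\right], \quad \text{for } x = 0, 1, \ldots, N-1
$$

z Where (u is frequency)

z I(x) is the 1-D sample

z S(u) is the 1-D DCT coefficient and

$$
C(u) = \begin{cases} \frac{\sqrt{2}}{2} & \text{for } u = 0 \\ 1 & \text{for } u > 0 \end{cases}
$$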


42/80

#42

One-dimensional FDCT and IDCT

z As an example, let I(x) = [12, 10, 8, 10, 12, 10, 8, 11]. Then

$$
S(u) = A \, [I(x)]^T = [28.6375, 0.5712, 0.4619, 1.757, 3.182, -1.729, 0.191, -0.309]
$$

z If we now apply the IDCT, we get back I(x). We can quantize the coefficients S(u) and still obtain a very good approximation of I(x).

z For example,

IDCT(28.6, 0.6, 0.5, 1.8, 3.2, -1.8, 0.2, -0.3) = (12.0254, 10.0233, 7.96054, 9.93097, 12.0164, 9.9932, 7.99354, 10.9989)

z While

IDCT(28, 0, 0, 2, 3, -2, 0, 0) = (11.236, 9.6244, 7.6628, 9.573, 12.347, 10.014, 8.053, 10.684)
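The 1-D FDCT and IDCT definitions translate almost literally into code. The sketch below is a direct O(N^2) evaluation in plain Python, not a fast algorithm; the function names `fdct`, `idct` and `c` are illustrative.

```python
import math

def c(u):
    """Normalization factor C(u): sqrt(2)/2 for u = 0, otherwise 1."""
    return math.sqrt(2) / 2 if u == 0 else 1.0

def fdct(signal):
    """Forward 1-D DCT of a sequence of N samples."""
    n = len(signal)
    scale = math.sqrt(2.0 / n)
    return [scale * c(u) *
            sum(signal[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for x in range(n))
            for u in range(n)]

def idct(coeffs):
    """Inverse 1-D DCT of a sequence of N coefficients."""
    n = len(coeffs)
    scale = math.sqrt(2.0 / n)
    return [scale *
            sum(c(u) * coeffs[u] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for u in range(n))
            for x in range(n)]

I = [12, 10, 8, 10, 12, 10, 8, 11]
S = fdct(I)
restored = idct(S)                          # exact inverse, up to rounding
coarse = idct([28, 0, 0, 2, 3, -2, 0, 0])   # coarsely quantized coefficients

print([round(s, 4) for s in S])
print([round(v, 3) for v in coarse])
```

Even with four of the eight coefficients set to zero, every reconstructed sample stays within one gray level of the original.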


43/80

#43

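Two-dimensional FDCT and IDCT for 8X8 block

z FDCT

$$
S(v,u) = \frac{C(v)}{2} \frac{C(u)}{2} \sum_{y=0}^{7} \sum_{x=0}^{7} I(y,x) \cos\left[\frac{(2x+1)u\pi}{16}\right] \cos\left[\frac{(2y+1)v\pi}{16}\right]
$$

z IDCT

$$
I(y,x) = \sum_{v=0}^{7} \sum_{u=0}^{7} \frac{C(v)}{2} \frac{C(u)}{2} S(v,u) \cos\left[\frac{(2x+1)u\pi}{16}\right] \cos\left[\frac{(2y+1)v\pi}{16}\right]
$$

z Where

z I(y,x) is the 2-D sample

z S(v,u) is the 2-D DCT coefficient

z And

$$
C(u) = \begin{cases} \frac{\sqrt{2}}{2} & \text{for } u = 0 \\ 1 & \text{for } u > 0 \end{cases}
\qquad
C(v) = \begin{cases} \frac{\sqrt{2}}{2} & \text{for } v = 0 \\ 1 & \text{for } v > 0 \end{cases}
$$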


44/80

#44

Fast Cosine Transform

z 2D 8X8 basis functions of the DCT:

z The horizontal frequency of the basis functions increases from left to right, and the vertical frequency of the basis functions increases from top to bottom.

[Figure: the 64 two-dimensional 8X8 DCT basis functions]


45/80

#45

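Two-dimensional FDCT for NXM block

z FDCT

$$
S(v,u) = \frac{2}{\sqrt{NM}} \, C(v) \, C(u) \sum_{y=0}^{M-1} \sum_{x=0}^{N-1} I(y,x) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2M}\right]
$$

for u = 0, 1, ..., N-1 and v = 0, 1, ..., M-1.

z Where I(y,x) is the 2-D sample, S(v,u) is the 2-D DCT coefficient and

$$
C(u) = \begin{cases} \frac{\sqrt{2}}{2} & \text{for } u = 0 \\ 1 & \text{for } u > 0 \end{cases}
\qquad
C(v) = \begin{cases} \frac{\sqrt{2}}{2} & \text{for } v = 0 \\ 1 & \text{for } v > 0 \end{cases}
$$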


46/80

#46

Separable Function

z The 2-D FDCT is a separable function because we can express the formula as

$$
S(v,u) = \sqrt{\frac{2}{M}} \, C(v) \sum_{y=0}^{M-1} \left\{ \sqrt{\frac{2}{N}} \, C(u) \sum_{x=0}^{N-1} I(y,x) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \right\} \cos\left[\frac{(2y+1)v\pi}{2M}\right]
$$

for u = 0, 1, ..., N-1 and v = 0, 1, ..., M-1.

This means that we can compute a 2-D FDCT by first computing a row-wise 1-D FDCT and then performing a column-wise 1-D FDCT on its output. This is possible, as we explained earlier, because each 2-D basis matrix is the outer product of two orthonormal 1-D basis vectors.
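The row-then-column evaluation can be compared against the double sum. The following is a sketch in plain Python, assuming the normalization 2/sqrt(NM) with C(0) = sqrt(2)/2; the function names and the test block are illustrative.

```python
import math

def c(u):
    """Normalization factor C(u): sqrt(2)/2 for u = 0, otherwise 1."""
    return math.sqrt(2) / 2 if u == 0 else 1.0

def dct_1d(signal):
    """Forward 1-D DCT of a sequence of N samples."""
    n = len(signal)
    return [math.sqrt(2.0 / n) * c(u) *
            sum(signal[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for x in range(n))
            for u in range(n)]

def dct_2d_separable(block):
    """2-D DCT as a row-wise pass followed by a column-wise pass."""
    rows = [dct_1d(row) for row in block]               # transform along x
    cols = [dct_1d(list(col)) for col in zip(*rows)]    # transform along y
    return [list(row) for row in zip(*cols)]            # result indexed S[v][u]

def dct_2d_direct(block):
    """2-D DCT evaluated straight from the double-sum definition."""
    m, n = len(block), len(block[0])                    # M rows, N columns
    scale = 2.0 / math.sqrt(n * m)
    return [[scale * c(v) * c(u) *
             sum(block[y][x]
                 * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                 * math.cos((2 * y + 1) * v * math.pi / (2 * m))
                 for y in range(m) for x in range(n))
             for u in range(n)] for v in range(m)]

block = [[10, 12, 9, 7],
         [11, 13, 8, 6]]                                # hypothetical 2x4 block
S_sep = dct_2d_separable(block)
S_dir = dct_2d_direct(block)
flat = dct_2d_separable([[10] * 4 for _ in range(4)])   # constant 4x4 block
print(S_sep)
```

For an NxN block the separable form needs 2N one-dimensional transforms of length N, which is where the O(N^3) operation count for the straightforward method comes from.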


47/80

#47

Two-dimensional IDCT for NXM block

$$
I(y,x) = \frac{2}{\sqrt{NM}} \sum_{v=0}^{M-1} \sum_{u=0}^{N-1} C(u) \, C(v) \, S(v,u) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2M}\right]
$$

for x = 0, 1, ..., N-1 and y = 0, 1, ..., M-1.

This function is again separable, and the computation can be done in two steps: a row-wise 1-D IDCT followed by a column-wise 1-D IDCT. A straightforward algorithm to compute either the FDCT or the IDCT needs (for a block of NXN pixels) on the order of O(N^3) multiplication/addition operations. After the ground-breaking discovery of the O(N log N) algorithm for the Fast Fourier Transform, several researchers proposed efficient algorithms for DCT computation.


48/80

#48

JPEG Introduction - The background

z JPEG stands for Joint Photographic Experts Group

z A standard image compression method is needed to enable interoperability of equipment from different manufacturers

z It is the first international digital image compression standard for continuous-tone images (grayscale or color)

z The history of JPEG: the selection process


49/80

#49

JPEG Introduction: what's the objective?

z very good or excellent compression rate, reconstructed image quality and transmission rate

z be applicable to practically any kind of continuous-tone digital source image

z tractable computational complexity

z have the following modes of operation:

z sequential encoding

z progressive encoding

z lossless encoding

z hierarchical encoding


50/80

#50

JPEG Architecture Standard

[Figure: Image compression system. Source image data -> encoder -> compressed image data -> decoder -> reconstructed image data]

[Figure: The basic parts of a JPEG encoder. Source image data -> encoder model -> descriptors -> statistical model -> symbols -> entropy encoder -> compressed image data; model tables and entropy coding tables are side inputs]


51/80

#51

JPEG Overview (cont.)

JPEG has the following Operation Modes:

Sequential DCT-based mode

Progressive DCT-based mode

Sequential lossless mode

Hierarchical mode

JPEG entropy coding supports:

Huffman encoding

Arithmetic encoding


52/80

#52

JPEG Baseline System

JPEG Baseline system is composed of:

Sequential DCT-based mode

Huffman coding

[Figure: The basic architecture of JPEG Baseline system. Source image data -> 8x8 blocks -> DCT-based encoder (FDCT -> quantizer -> statistical model -> entropy encoder) -> compressed image data; table specifications feed the quantizer and the entropy encoder]


53/80

#53

The Baseline System

The DCT coefficient values can be regarded as the relative amounts of the 2-D spatial frequencies contained in the 8x8 block.

The upper-left corner coefficient is called the DC coefficient, which is a measure of the average of the energy of the block.

The other coefficients are called AC coefficients. Coefficients that correspond to high frequencies tend to be zero or near zero for most natural images.

[Figure: 8x8 coefficient block with the DC coefficient marked in the upper-left corner]


54/80

#54

Quantization Tables in DCT

z Human eyes are less sensitive to high frequencies.

z We use different quantization values for different frequencies.

z Higher frequency, bigger quantization value.

z Lower frequency, smaller quantization value.

z Each DCT coefficient corresponds to a certain frequency.

z At a high level, the human visual system (HVS) is much more sensitive to variations in the achromatic channel than in the chromatic channels.


55/80

#55

The Baseline System: Quantization

z Why quantization?

z to achieve further compression by representing DCT coefficients with no greater precision than is necessary to achieve the desired image quality

z Generally, the high frequency coefficients have larger quantization values

z Quantization makes most coefficients zero, which makes the compression system efficient, but it is the main source of loss in the system and introduces distortion in the reconstructed image.

z The quantization step-size parameters for JPEG are given by two Quantization Matrices, each element of the matrix being an integer between 1 and 255. The DCT coefficients are divided by the corresponding quantization parameters and rounded to the nearest integer.


56/80

#56

$$
F'(u,v) = \mathrm{Round}\left(\frac{F(u,v)}{Q(u,v)}\right)
$$

F(u,v): original DCT coefficient

F'(u,v): DCT coefficient after quantization

Q(u,v): quantization value

There are two quantization tables: one for the luminance component and the other for the chrominance component. The JPEG standard does not prescribe the values of the tables; they are left up to the user, but the standard recommends the following two tables under normal circumstances. These tables were created by analysing data on human perception and by a lot of trial and error.



57/80

#57


z So, we have two quantization tables.

z Measured for an average person.

z Higher frequency, bigger quantization value.

z Lower frequency, smaller quantization value.

Luminance quantization table

16  11  10  16  24  40  51  61
12  12  14  19  26  58  60  55
14  13  16  24  40  57  69  56
14  17  22  29  51  87  80  62
18  22  37  56  68 109 103  77
24  35  55  64  81 104 113  92
49  64  78  87 103 121 120 101
72  92  95  98 112 100 103  99

Chrominance quantization table

17  18  24  47  99  99  99  99
18  21  26  66  99  99  99  99
24  26  56  99  99  99  99  99
47  66  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
99  99  99  99  99  99  99  99
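Quantization with the luminance table can be sketched as follows. The coefficient block `F` is hypothetical, the function names are illustrative, and rounding halves away from zero is an assumption, since the slides only say "rounded to the nearest integer".

```python
import math

LUMA_Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def round_half_away(x):
    """Round to the nearest integer, with halves rounded away from zero (assumed)."""
    return int(math.copysign(math.floor(abs(x) + 0.5), x))

def quantize(coeffs, table):
    """F'(u,v) = Round(F(u,v) / Q(u,v)) for every position of the block."""
    return [[round_half_away(coeffs[v][u] / table[v][u]) for u in range(8)]
            for v in range(8)]

def dequantize(levels, table):
    """Decoder side: multiply each level by its step size."""
    return [[levels[v][u] * table[v][u] for u in range(8)] for v in range(8)]

# Hypothetical DCT coefficients: a few low-frequency values and one high-frequency value
F = [[0.0] * 8 for _ in range(8)]
F[0][0], F[0][1], F[1][0], F[7][7] = -100.0, -60.0, 65.0, 30.0

levels = quantize(F, LUMA_Q)
approx = dequantize(levels, LUMA_Q)
print(levels[0][:2], levels[1][0], levels[7][7], approx[0][0])
```

The high-frequency value 30 is quantized to zero because its step size is 99, while the low-frequency values survive with a small reconstruction error.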


58/80

#58

Two-dimensional DCT

z The image samples are shifted from unsigned integers with range [0, 2^N - 1] to signed integers with range [-2^(N-1), 2^(N-1) - 1]. Thus samples in the range 0 to 255 are converted to the range -128 to 127, and those in the range 0 to 4095 are converted to the range -2048 to 2047. This zero-shift is done in JPEG to reduce the internal precision requirements in the DCT calculations.

z How to interpret the DCT coefficients?

z The DCT coefficient values can be regarded as the relative amounts of the 2-D spatial frequencies contained in the 8x8 block.

z F(0,0) is called the DC coefficient, which is a measure of the average of the energy of the block.

z The other coefficients are called AC coefficients. Coefficients that correspond to high frequencies tend to be zero or near zero for most images.

z The energy is concentrated in the upper-left corner.


59/80

#59

Baseline System - DC coefficient coding

z After the quantization step, the DC coefficients are encoded separately using differential encoding. The DC difference sequence is formed by the DC coefficient of the first block followed by the differences between the DC coefficients of succeeding blocks. The average values of succeeding blocks are usually correlated.

[Figure: quantized DC coefficients -> DPCM -> DC difference (DPCM: differential pulse code modulation)]
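The differential coding of DC coefficients amounts to a running difference. Below is a minimal sketch; the DC values are hypothetical and the predictor for the first block is taken to be 0, so that the first output equals the first DC coefficient.

```python
def dpcm_encode(dc_values):
    """Replace each DC value by its difference from the previous block's DC value."""
    previous = 0                 # the first block is predicted from 0
    differences = []
    for dc in dc_values:
        differences.append(dc - previous)
        previous = dc
    return differences

def dpcm_decode(differences):
    """Rebuild the DC values by accumulating the differences."""
    previous = 0
    dc_values = []
    for diff in differences:
        previous += diff
        dc_values.append(previous)
    return dc_values

dc = [-6, -4, -5, -5, -3]        # hypothetical quantized DC coefficients
diffs = dpcm_encode(dc)
print(diffs)
```

Because neighbouring blocks have similar averages, the differences cluster around zero and need fewer bits than the DC values themselves.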


60/80

#60

Baseline System - AC coefficient coding

z AC coefficients are arranged into a zig-zag sequence. This is because most of the significant coefficients are located in the upper left corner of the matrix. The high frequency components are mostly 0s and can be efficiently coded by run-length encoding.

0 1 5 6 14 15 27 28

2 4 7 13 16 26 29 42

3 8 12 17 25 30 41 43

9 11 18 24 31 40 44 53

10 19 23 32 39 45 52 54

20 22 33 38 46 51 55 60

21 34 37 47 50 56 59 61

35 36 48 49 57 58 62 63

(Horizontal frequency increases from left to right; vertical frequency increases from top to bottom.)
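The zig-zag index table can be generated by walking the anti-diagonals of the block. This is one possible sketch; `zigzag_order` and `zigzag_table` are illustrative names.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):                      # s = row + col of a diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked from bottom-left to top-right, odd ones the other way
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order

def zigzag_table(n=8):
    """Return a table whose entry [row][col] is the scan index of that position."""
    table = [[0] * n for _ in range(n)]
    for index, (r, c) in enumerate(zigzag_order(n)):
        table[r][c] = index
    return table

table = zigzag_table()
for row in table:
    print(" ".join("{:2d}".format(v) for v in row))
```

Reading a quantized block in this order places the low-frequency coefficients first and gathers the zeros into long runs at the end.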

61/80

#61

An Example (Ref: T. Acharya, P. Tsai, JPEG 2000 Standard for Image Compression, Wiley-Interscience, 2005)

The 8X8 Image Data Block

110 110 118 118 121 126 131 131

108 111 125 122 120 125 134 135

106 119 129 127 125 127 138 144

110 126 130 133 133 131 141 148

115 116 119 120 122 125 137 139

115 106 99 110 107 116 130 127

110 91 82 101 99 104 120 118

103 76 70 95 92 91 107 106


62/80

#62


63/80

#63

Baseline System - Statistical modeling

z Statistical modeling translates the inputs into a sequence of symbols for Huffman coding to use

z Statistical modeling on DC coefficients:

z Category: size of the difference (SSSS)

z Magnitude: amplitude of the difference (additional bits)


64/80

#64

Coding the DC coeeficients

The prediction errors are classified into 16 categories, modulo 2^16. Category c contains the range of integers [-(2^c - 1), +(2^c - 1)], excluding the numbers in the range [-(2^(c-1) - 1), +(2^(c-1) - 1)], which fall in the previous categories. The category number is encoded using either a fixed code (SSSS), a unary code (0, 10, 110, ...) or a variable length Huffman code.


65/80

#65

Coding the DC/AC coefficients

If the error is positive, its magnitude is encoded using the category number of bits, with a leading 1. If it is negative, the 1's complement of the positive number is used. Thus, if the error is -6 and the Huffman code for category 3 is 1110, it will get the code 1110001. The JPEG Standard recommends two Huffman codes for the category numbers: one for luminance DC and the other for chrominance DC (see Salomon, p. 289; they are also described in Annex K, Tables K.3 and K.4, of the Standard).

z Statistical modeling on AC coefficients:

z Run-length: RUN-SIZE = 16*RRRR + SSSS

z Amplitude: amplitude of the coefficient (additional bits)
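The category and the additional bits of a value can be computed with a few integer operations. The sketch below follows the rule described above; the function names are illustrative, and the category prefix 1110 is the one assumed in the text, not necessarily the entry of a particular Huffman table.

```python
def category(value):
    """SSSS: number of bits needed to represent |value| (0 for a zero value)."""
    return abs(value).bit_length()

def amplitude_bits(value):
    """Additional bits: binary magnitude if positive, its 1's complement if negative."""
    size = category(value)
    if size == 0:
        return ""                        # category 0 carries no additional bits
    if value < 0:
        value += (1 << size) - 1         # 1's complement of |value| in `size` bits
    return format(value, "0{}b".format(size))

CATEGORY_3_CODE = "1110"                 # code for category 3 as assumed in the text

for v in (1, -1, 2, -5, 6, -6):
    print(v, category(v), amplitude_bits(v))
print(CATEGORY_3_CODE + amplitude_bits(-6))
```

Positive values always start with a 1 and negative values with a 0, so the decoder can recover the sign from the first additional bit.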


66/80

#66

Encoding the AC Coefficients

The AC coefficients contain just a few nonzero numbers, with runs of zeros between them, followed by a long run of trailing zeros. For each nonzero number, two numbers are generated: 1) RRRR, which gives the number of consecutive zeros preceding the nonzero coefficient being encoded, and 2) SSSS, which gives the category, that is, the number of bits required to encode the amplitude of the AC coefficient. The pair is combined as 16*RRRR + SSSS and encoded with a variable length Huffman code, which is specified in the form of a table (see next slide). The second attribute of the AC coefficient is its amplitude, which is encoded exactly the same way as the DC values. Note that the code (0,0) is reserved for EOB (signifying that the remaining AC coefficients are just a single trailing run of 0s). The code (15,0) is reserved for a run of 16 zeros. If a run is longer than 16, it is broken down into runs of 16 plus possibly a single run of fewer than 16. JPEG recommends two tables of Huffman codes for AC coefficients, Tables K.5 and K.6, for luminance and chrominance values.
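The run-length modeling of the AC coefficients can be sketched as a single pass over the zig-zag sequence. `ac_symbols` is a hypothetical helper, and the input below is an illustrative coefficient sequence with a few nonzero values followed by trailing zeros.

```python
def ac_symbols(ac):
    """Turn zig-zag-ordered AC coefficients into ((run, size), amplitude) symbols."""
    symbols = []
    run = 0
    # Position of the last nonzero coefficient; everything after it is covered by EOB
    last_nonzero = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    for value in ac[:last_nonzero + 1]:
        if value == 0:
            run += 1
            if run == 16:
                symbols.append(((15, 0), None))     # (15,0): a run of 16 zeros
                run = 0
        else:
            symbols.append(((run, abs(value).bit_length()), value))
            run = 0
    if last_nonzero < len(ac) - 1:
        symbols.append(((0, 0), None))              # (0,0): end of block
    return symbols

# Illustrative sequence of 63 AC coefficients
ac = [-6, 6, -5, 0, 2, 0, -1, 0, 0, 0, 0, 0, -1, 0, 0, -1, 1] + [0] * 46
for symbol in ac_symbols(ac):
    print(symbol)
```

Each (run, size) pair is what the Huffman table encodes; the amplitude is appended afterwards as additional bits.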


67/80

#67

Encoding the AC Coefficients

[Figure: Huffman AC statistical model, run-length/amplitude combinations]

[Figure: Huffman coding of AC coefficients]


68/80

#68

Example (Contd.)

z Let us finish our example. The DC coefficient is -6 and has the code 1110001. The AC coefficients will be encoded as:

z (0,3)(-6), (0,3)(6), (0,3)(-5), (1,2)(2), (1,1)(-1), (5,1)(-1), (2,1)(-1), (0,1)(1), (0,0). Table K.5 gives the codes

z 1010, 00, 100, 1100, 11011, 11100 and 1111010 for the pairs of symbols (0,0), (0,1), (0,3), (1,1), (1,2), (2,1) and (5,1), respectively. The variable length codes for the AC coefficients 1, -1, 2, -5, 6 and -6 are 1, 0, 10, 010, 110 and 001, respectively. Thus JPEG will give a compressed representation of the example image using only 58 bits:

z 1110001 100001 100110 100010 1101110 11000 11110100 111000 001 1010

z The uncompressed representation would have required 512 bits!
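The bit string of the example can be assembled mechanically from the symbols and code words. In this sketch the code table contains only the entries needed here, the (5,1) code is taken as 1111010, which is the prefix that appears in the listed bit string, and the DC category code 1110 is the one assumed in the example.

```python
AC_CODES = {(0, 0): "1010", (0, 1): "00", (0, 3): "100", (1, 1): "1100",
            (1, 2): "11011", (2, 1): "11100", (5, 1): "1111010"}
DC_CATEGORY_3_CODE = "1110"      # DC category code as assumed in the example

def amplitude_bits(value):
    """Binary magnitude for positive values, its 1's complement for negative ones."""
    size = abs(value).bit_length()
    if value < 0:
        value += (1 << size) - 1
    return format(value, "0{}b".format(size)) if size else ""

dc = -6
ac = [((0, 3), -6), ((0, 3), 6), ((0, 3), -5), ((1, 2), 2),
      ((1, 1), -1), ((5, 1), -1), ((2, 1), -1), ((0, 1), 1)]

pieces = [DC_CATEGORY_3_CODE + amplitude_bits(dc)]
pieces += [AC_CODES[symbol] + amplitude_bits(value) for symbol, value in ac]
pieces.append(AC_CODES[(0, 0)])  # end of block

bitstream = " ".join(pieces)
total_bits = sum(len(piece) for piece in pieces)
print(bitstream)
print(total_bits, "bits, against", 8 * 64, "bits uncompressed")
```

Counting the bits this way gives the size of the compressed block directly, which can be compared with the 8 x 64 bits of the raw samples.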

69/80

#69

Example to illustrate encoding/decoding


70/80

#70

JPEG Progressive Model

z Why progressive model?

z Quick transmission

z The image is built up in coarse-to-fine passes

z First stage: encode a rough but recognizable version of the image

z Later stage(s): the image is refined by successive scans until the final image is obtained

z Two ways to do this:

z Spectral selection: send the DC and AC coefficients separately

z Successive approximation: send the most significant bits first and then the least significant bits


71/80

#71

JPEG Lossless Model

[Figure: Sample values -> DPCM -> Descriptors (DPCM: differential pulse code modulation)]

Neighborhood of the sample X being predicted:

C B
A X

Predictors for lossless coding

selection value    prediction strategy
0                  no prediction
1                  A
2                  B
3                  C
4                  A+B-C
5                  A+(B-C)/2
6                  B+(A-C)/2
7                  (A+B)/2
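The predictor table maps directly onto a small function. In this sketch the division by 2 is implemented as an arithmetic right shift, which is an assumption about the integer rounding; `predict` and the sample values are illustrative.

```python
def predict(selection, a, b, c):
    """Predict sample X from its left (A), upper (B) and upper-left (C) neighbours."""
    if selection == 0:
        return 0                         # no prediction
    if selection == 1:
        return a
    if selection == 2:
        return b
    if selection == 3:
        return c
    if selection == 4:
        return a + b - c
    if selection == 5:
        return a + ((b - c) >> 1)        # division by 2 assumed to be a right shift
    if selection == 6:
        return b + ((a - c) >> 1)
    if selection == 7:
        return (a + b) >> 1
    raise ValueError("selection value must be between 0 and 7")

a, b, c, x = 100, 104, 98, 105           # hypothetical neighbouring samples and X
for selection in range(8):
    prediction = predict(selection, a, b, c)
    print(selection, prediction, x - prediction)   # the difference is what gets coded
```

The encoder sends only the difference between X and its prediction, and the decoder repeats the same prediction from already decoded neighbours to recover X exactly.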


72/80

#72

JPEG Hierarchical Model

z The hierarchical model is an alternative to the progressive model (pyramid)

z Steps:

z Filter and down-sample the original image by the desired number of multiples of 2 in each dimension

z Encode the reduced-size image using one of the above coding models

z Use the up-sampled image as a prediction of the original at this resolution, and encode the difference

z Repeat until the full resolution image has been encoded

73/80

#73

The Effect of Segmentation

z The image samples are grouped into 8x8 blocks. The 2-D DCT is applied to each 8x8 block.

z Because of blocking, the spatial frequencies in the image and the spatial frequencies of the cosine basis functions are not precisely equivalent. According to Fourier's theorem, all the harmonics of the fundamental frequencies must be present in the basis functions to be precise. Nonetheless, the relationship between the DCT frequency and the spatial frequency is a close approximation if we take into account the sensitivity of the human eye for detecting contrast in frequency.

z The segmentation also introduces what are called blocking artifacts. These become very pronounced if the DC coefficients vary considerably from block to block. These artifacts appear as edges in the image, and abrupt edges imply high frequency. The effect can be minimized if the non-zero AC coefficients are kept.

74/80

#74

Some other transforms

z Discrete Fourier Transform (DFT)

z Haar Transform

z Karhunen-Loève Transform (KLT)

z Walsh-Hadamard Transform (WHT)

75/80

#75

Discrete Fourier Transform (DFT)

z Well-known for its connection to spectral analysis and filtering.

z Extensive study has been done on its fast implementation (O(N log2 N) for an N-point DFT).

z Has the disadvantage of storage and manipulation of complex quantities, and of the creation of spurious spectral components due to the assumed periodicity of image blocks.

76/80

#76

Haar Transform

z Very fast transform.

z The easiest wavelet transform.

z Useful in edge detection, image coding, and image analysis problems.

z Energy compaction is fair; it is not the best compression algorithm.

[Figure: 2D basis functions of the Haar transform]

77/80

#77

Karhunen-Loève Transform (KLT)

z The Karhunen-Loève Transform (KLT) yields decorrelated transform coefficients.

z Basis functions are eigenvectors of the covariance matrix of the input signal.

z KLT achieves optimum energy concentration.

z Disadvantages:

z KLT is dependent on signal statistics

z KLT is not separable for image blocks

z The transform matrix cannot be factored into sparse matrices

78/80

#78

Walsh-Hadamard Transform (WHT)

z Although far from optimum in an energy packing sense for typical imagery, its simple implementation (basis functions are either -1 or +1) has made it widely popular.

Transformation matrix:

[Matrix: A = (1/sqrt(8)) times an 8x8 matrix whose entries are all +1 or -1]

79/80

#79

Walsh-Hadamard Transform (WHT)

z The Walsh-Hadamard transform requires only additions and subtractions

z The availability of high speed signal processing has reduced its use, because of the improvements offered by the DCT.

80/80

#80

Transform Coding: Summary

z Purpose of transform

z de-correlation

z energy concentration

z KLT is optimum, but signal dependent and, hence, without a fast algorithm

z DCT reduces the blocking artifacts that appear with the DFT

z Threshold coding + zig-zag scan + 8x8 block size is widely used today

z JPEG, MPEG, ITU-T H.263.

z Fast algorithm for scaled 8-DCT

z 5 multiplications, 29 additions

z Audio Coding

z MP3 = MPEG 1 - Layer 3 uses DCT