transform coding - kursused - arvutiteaduse instituut · 2015. 12. 15. · transform coding...

Transform coding

Transform coding is the second approach to exploiting redundancy: scalar quantization combined with linear preprocessing.

The source samples are collected into a vector that is linearly transformed (multiplied by a transform matrix), and the resulting coefficients are scalar quantized. Notice that each coefficient can be quantized by a different quantizer.

The method was introduced in 1956 by Kramer and Mathews and then popularized in 1962-1963 by Huang and Schultheiss. It was developed for coding images and video. More recently, transform coding has also been widely used in high-fidelity audio coding.

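[Figure: block diagram of a transform coder. The input samples $x_1, x_2, \dots, x_N$ are transformed by $T$ into coefficients $y_1, y_2, \dots, y_N$, which are quantized by separate scalar quantizers $q_1, q_2, \dots, q_N$; the decoder applies $T^{-1}$ to the quantized coefficients $\hat{y}_1, \hat{y}_2, \dots, \hat{y}_N$ to obtain the reconstruction $\hat{x}_1, \hat{x}_2, \dots, \hat{x}_N$.]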

A typical discrete linear transform is a decomposition of the input discrete-time signal over a system of basis functions. Harmonic functions are often used as basis functions; in this case the coefficients represent the intensities of the corresponding harmonics. Transformations of this type are called conversions to the frequency domain.

Let $\mathbf{x}$ be an input column vector of dimension $N$. Then a linear transform of $\mathbf{x}$ can be expressed as

$$\mathbf{y} = T\mathbf{x},$$

where $T$ is an $N \times N$ transform matrix.
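The chain "transform, quantize each coefficient separately, inverse transform" can be illustrated with a minimal sketch. The 2-point sum/difference matrix, the sample values and the step sizes in `steps` are assumptions chosen only for illustration; they do not come from the lecture.

```python
import math

# 2-point orthonormal transform (sum/difference); an illustrative choice.
S = 1 / math.sqrt(2)
T = [[S, S],
     [S, -S]]

def matvec(A, v):
    """Multiply matrix A (list of rows) by column vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def quantize(y, steps):
    """Uniform scalar quantizer with its own step size for each coefficient."""
    return [round(c / d) for c, d in zip(y, steps)]

def dequantize(q, steps):
    return [k * d for k, d in zip(q, steps)]

x = [10.0, 9.0]        # two strongly correlated samples
steps = [0.5, 2.0]     # hypothetical step sizes: fine for y1, coarse for y2

y = matvec(T, x)                                     # y = T x
q = quantize(y, steps)                               # indices passed to the entropy coder
x_hat = matvec(transpose(T), dequantize(q, steps))   # x_hat = T^T y_hat
```

Most of the energy ends up in the first coefficient, so the second one is quantized to zero here and only one nonzero index has to be coded.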

Properties of transforms

Usually a transform is required to have the following properties:

•Localization of the essential part of the signal energy in a small number of coefficients. After quantization, the least informative coefficients can be excluded from consideration.

•Uncorrelated coefficients. In this case scalar quantization followed by symbol-by-symbol variable-length coding provides a rate $R(D)$ close to the rate-distortion function $H(D)$.

•Orthonormality. If this property holds, the MSE introduced by quantizing the transform coefficients coincides with the MSE in the reconstructed input vector, and the transform preserves $H(D)$.

•Low computational complexity. It is desirable to use separable transforms.


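A transform is called orthonormal if

$$T^{-1} = T^{*}, \qquad (6.1)$$

where $T$ is a matrix with complex elements and $T^{*}$ denotes its conjugate transpose. If $T$ is real, then (6.1) reduces to the condition

$$T^{-1} = T^{T}. \qquad (6.2)$$

Condition (6.2) is equivalent to the following relation for the rows $\mathbf{t}_i$, $i = 1, 2, \dots, N$, of $T$:

$$\mathbf{t}_i \mathbf{t}_j^{T} = \delta_{ij},$$

where

$$\delta_{ij} = \begin{cases} 1, & i = j, \\ 0, & i \neq j \end{cases}$$

is Kronecker's delta function.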


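The vectors $\mathbf{t}_i$ are the orthonormal basis vectors of the transform $T$. For an orthonormal transform

$$\mathbf{y} = T\mathbf{x}, \qquad \mathbf{x} = T^{-1}\mathbf{y} = T^{T}\mathbf{y} = \sum_{i=1}^{N} y_i \mathbf{t}_i^{T}.$$

The input vector $\mathbf{x}$ can thus be represented as a weighted sum of basis vectors, where the weights $y_i$ are the transform coefficients.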

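The orthonormal transform preserves the signal energy:

$$\sum_{i=1}^{N} x_i^2 = \sum_{j=1}^{N} y_j^2.$$

This property is known as the discrete Parseval's theorem. It follows from

$$\sum_{i=1}^{N} x_i^2 = \mathbf{x}^{T}\mathbf{x} = (T^{T}\mathbf{y})^{T}(T^{T}\mathbf{y}) = \mathbf{y}^{T} T T^{T} \mathbf{y} = \mathbf{y}^{T} T T^{-1} \mathbf{y} = \mathbf{y}^{T}\mathbf{y} = \sum_{j=1}^{N} y_j^2.$$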
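Orthonormality of the rows, the expansion of the input over the basis vectors, and energy preservation can be checked numerically. The sketch below uses a normalized 4x4 Hadamard matrix as an example of a real orthonormal transform; this matrix and the test vector are assumptions made for illustration only.

```python
# Normalized 4x4 Hadamard matrix: a real orthonormal transform chosen for illustration.
H = [[1, 1, 1, 1],
     [1, -1, 1, -1],
     [1, 1, -1, -1],
     [1, -1, -1, 1]]
T = [[h / 2 for h in row] for row in H]    # scale rows by 1/sqrt(4)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Gram matrix of the rows: entry (i, j) is t_i t_j^T, expected to equal delta_ij.
gram = [[dot(T[i], T[j]) for j in range(4)] for i in range(4)]

x = [3.0, 1.0, -2.0, 5.0]
y = [dot(row, x) for row in T]             # y = T x

# x as a weighted sum of basis vectors: x = sum_i y_i t_i^T.
x_rec = [sum(y[i] * T[i][n] for i in range(4)) for n in range(4)]

energy_x = dot(x, x)                       # sum of x_i^2
energy_y = dot(y, y)                       # sum of y_j^2
```

For this input both energies equal 39, as Parseval's theorem requires.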


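Noncorrelatedness of transform coefficients.

This property implies that the transform coefficients $y_i$, $i = 1, 2, \dots, N$, satisfy the condition

$$E\{(y_i - E\{y_i\})(y_j - E\{y_j\})\} = \sigma_i^2 \delta_{ij} \quad \text{for all } i, j,$$

where $\sigma_i^2$ is the variance of $y_i$ and $E\{\cdot\}$ denotes mathematical expectation. For simplicity we assume $E\{\mathbf{x}\} = E\{\mathbf{y}\} = \mathbf{0}$.

Localization of most of the signal energy in a small number of transform coefficients.

Let $y_1, \dots, y_N$ be sorted in such a manner that

$$\sigma_1^2 \ge \sigma_2^2 \ge \dots \ge \sigma_N^2.$$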

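Assume that only the first $pN$, $0 < p \le 1$, coefficients are transmitted. The receiver uses the truncated vector $\hat{\mathbf{y}} = (y_1, \dots, y_{pN}, 0, \dots, 0)^{T}$ to reconstruct

$$\hat{\mathbf{x}} = T^{-1}\hat{\mathbf{y}} = T^{T}\hat{\mathbf{y}}.$$

The MSE that occurs when we replace $\mathbf{x}$ by $\hat{\mathbf{x}}$ is

$$\frac{1}{N} E\{(\mathbf{x} - \hat{\mathbf{x}})^{T}(\mathbf{x} - \hat{\mathbf{x}})\} = \frac{1}{N}\sum_{i=1}^{N} E\{(x_i - \hat{x}_i)^2\} = \frac{1}{N} E\{(\mathbf{y} - \hat{\mathbf{y}})^{T} T T^{T} (\mathbf{y} - \hat{\mathbf{y}})\}$$

$$= \frac{1}{N}\sum_{j=1}^{N} E\{(y_j - \hat{y}_j)^2\} = \frac{1}{N}\sum_{j=pN+1}^{N} E\{y_j^2\} = \frac{1}{N}\sum_{j=pN+1}^{N} \sigma_j^2, \qquad (6.3)$$

where $\sigma_j^2$ is the variance of $y_j$.

We would like to find the orthonormal transform that minimizes the error (6.3).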
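The key step in (6.3), that the error in the signal domain equals the energy of the discarded coefficients, also holds for a single vector without the expectation. A minimal sketch, assuming a normalized 4x4 Hadamard matrix as the orthonormal transform and an arbitrary input vector (both are illustrative assumptions):

```python
# Normalized 4x4 Hadamard matrix (illustrative orthonormal transform).
T = [[0.5 * h for h in row] for row in
     [[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, -1, 1], [1, -1, 1, -1]]]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def forward(x):
    """y = T x"""
    return [dot(row, x) for row in T]

def inverse(y):
    """x = T^T y"""
    return [sum(y[i] * T[i][n] for i in range(len(y))) for n in range(len(T[0]))]

N, kept = 4, 2                           # keep pN = 2 coefficients (p = 1/2)
x = [4.0, 5.0, 5.5, 6.0]
y = forward(x)
y_hat = y[:kept] + [0.0] * (N - kept)    # truncated coefficient vector
x_hat = inverse(y_hat)

# Squared error per sample in the signal domain ...
mse_signal = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / N
# ... and the energy of the discarded coefficients divided by N.
mse_coeffs = sum(c ** 2 for c in y[kept:]) / N
```

The two quantities agree because the transform is orthonormal; averaging over many input vectors would give the expectation form of (6.3).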


Low computational complexity.

Any orthonormal transform preserves the achievable rate-distortion function $H(D)$.

To apply a nonseparable 2-D transform, we rearrange the $N \times N$ input matrix $X$ into a vector $\mathbf{x}$ with $N^2$ components and multiply it by the transform matrix $T_2$ of size $N^2 \times N^2$:

$$\mathbf{y} = T_2 \mathbf{x}.$$

For a separable transform, the $N \times N$ matrix of transform coefficients can be obtained as

$$Y = T X T^{T}.$$

In this case the $N^2 \times N^2$ matrix $T_2$ is the Kronecker product of two $N \times N$ matrices $T$ of a 1-D transform. This reduces the computational complexity to $2N^3$ instead of $N^4$.
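The equivalence between the separable form and the Kronecker-product form can be verified on a tiny example. The sketch assumes $N = 2$, a sum/difference matrix as the 1-D transform, and row-by-row stacking of the input matrix into a vector; the helper names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)
T = [[S, S],
     [S, -S]]                   # 1-D orthonormal transform, N = 2 (illustrative)

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def kron(A, B):
    """Kronecker product of matrices A and B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

X = [[1.0, 2.0],
     [3.0, 4.0]]

# Separable form: Y = T X T^T, two small matrix products.
Y_sep = matmul(matmul(T, X), transpose(T))

# Nonseparable form: stack the rows of X into x, then y = T2 x with T2 = kron(T, T).
T2 = kron(T, T)
x = [v for row in X for v in row]
y = [sum(t * v for t, v in zip(row, x)) for row in T2]
Y_full = [y[0:2], y[2:4]]       # reshape y back into a 2x2 matrix
```

Both routes give the same coefficient matrix; the separable route never forms the $N^2 \times N^2$ matrix.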

The Karhunen-Loeve transform

The most efficient in terms of listed properties is the

Karhunen-Loeve transform.

•This transform is orthonormal.

•Its coefficients are uncorrelated.

•The KL transform minimizes (among all orthonormal transforms) the MSE (6.3) that occurs when transform coefficients with small variances are discarded.

The KL transform is optimal in terms of localization of the signal energy: it maximizes the number of transform coefficients that are insignificant and can be quantized to 0.


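Let $T_{KL}$ be the matrix of the KL transform. The covariance matrix of the input vector $\mathbf{x}$ is expressed via the covariance matrix of the transform coefficients as

$$R = E\{\mathbf{x}\mathbf{x}^{T}\} = T_{KL}^{T} E\{\mathbf{y}\mathbf{y}^{T}\} T_{KL}.$$

Multiplying on the right by $T_{KL}^{T}$ we obtain

$$R T_{KL}^{T} = T_{KL}^{T} E\{\mathbf{y}\mathbf{y}^{T}\}.$$

Since the coefficients are uncorrelated,

$$E\{\mathbf{y}\mathbf{y}^{T}\} = \begin{pmatrix} \sigma_1^2 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & \sigma_N^2 \end{pmatrix},$$

where $\sigma_i^2$ is the variance of $y_i$.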


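For each basis vector this gives

$$R \mathbf{t}_{KL,i}^{T} = \sigma_i^2 \mathbf{t}_{KL,i}^{T}, \qquad i = 1, \dots, N.$$

Thus the basis vectors of the Karhunen-Loeve transform are eigenvectors of the covariance matrix $R$, normalized to satisfy

$$\mathbf{t}_{KL,i} \mathbf{t}_{KL,j}^{T} = \delta_{ij}.$$

The variances $\sigma_i^2$ of the transform coefficients are the eigenvalues of $R$.

Since $R$ is a symmetric and positive definite matrix (a matrix $A$ is positive definite if $\mathbf{x}^{T} A \mathbf{x} > 0$ for any nonzero $\mathbf{x}$), its eigenvalues are real and positive. As a result, the basis vectors and the transform coefficients of the KL transform are real.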
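The eigenvector characterization can be checked by hand for a 2x2 covariance matrix, where the eigenvectors are known in closed form. The sketch assumes two unit-variance samples with correlation coefficient `rho` = 0.9; this value and the helper names are illustrative assumptions.

```python
import math

rho = 0.9                          # assumed correlation between the two samples
R = [[1.0, rho],
     [rho, 1.0]]                   # covariance matrix of x (unit variances)

# For this R the normalized eigenvectors are (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
S = 1 / math.sqrt(2)
T_KL = [[S, S],                    # eigenvalue 1 + rho
        [S, -S]]                   # eigenvalue 1 - rho

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Covariance matrix of y = T_KL x is T_KL R T_KL^T; it should be diagonal.
R_y = matmul(matmul(T_KL, R), transpose(T_KL))

# Eigenvector check for the first basis vector: R t^T = (1 + rho) t^T.
t0 = T_KL[0]
R_t0 = [sum(R[i][k] * t0[k] for k in range(2)) for i in range(2)]
```

The coefficient variances come out as 1.9 and 0.1, so almost all of the energy is concentrated in the first coefficient.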


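It can be shown that among all possible orthonormal transforms applied to stationary vectors of dimension $N$, the KL transform minimizes the MSE (6.3) that occurs due to truncation; that is, the KL transform is optimal in terms of localization of the signal energy.

Equation (6.3) for the KL transform has the form

$$\frac{1}{N}\sum_{j=pN+1}^{N} E\{y_j^2\} = \frac{1}{N}\sum_{j=pN+1}^{N} \sigma_j^2,$$

where $\sigma_j^2$, the variances of the coefficients $y_j$, are the eigenvalues of the covariance matrix $R$.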

The main shortcoming of the KL transform is that its basis

functions depend on the transformed signal. We have to store

not only quantized transform coefficients but also the basis

functions which can require many more bits for storing than

the quantized coefficients.

The discrete Fourier transform

The DFT is the counterpart of the continuous Fourier transform.

It is defined for the discrete-time signals. The transformed

signal represents samples of the signal spectrum.

Let $x(nT_s)$ be an input sequence. Then the DFT of $x(nT_s)$ is

$$X(k) = \sum_{n=0}^{N-1} x(nT_s) e^{-jkn\Omega T_s}, \qquad 0 \le k \le N-1. \qquad (6.4)$$

The inverse transform is defined as follows:

$$x(nT_s) = \frac{1}{N}\sum_{k=0}^{N-1} X(k) e^{jkn\Omega T_s}, \qquad 0 \le n \le N-1, \qquad (6.5)$$

where $\Omega = 2\pi/(NT_s)$ is the base frequency of the transform, or the distance between samples of the signal spectrum.


Notice that (6.4) determines a periodic sequence of numbers with period $N$.

Expressions (6.4), (6.5) can be rewritten in the form

$$X(k) = \sum_{n=0}^{N-1} x(n) W_N^{kn}, \qquad 0 \le k \le N-1,$$

$$x(n) = \frac{1}{N}\sum_{k=0}^{N-1} X(k) W_N^{-kn}, \qquad 0 \le n \le N-1,$$

where $W_N = e^{-j2\pi/N}$.
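The forward and inverse sums translate directly into code. The following is a minimal sketch of the direct (quadratic-time) evaluation using only the Python standard library; the function names `dft` and `idft` and the test sequence are chosen here for illustration.

```python
import cmath

def dft(x):
    """Direct DFT: X(k) = sum_n x(n) W_N^{kn}, with W_N = exp(-j 2 pi / N)."""
    N = len(x)
    W = cmath.exp(-2j * cmath.pi / N)
    return [sum(x[n] * W ** (k * n) for n in range(N)) for k in range(N)]

def idft(X):
    """Inverse DFT: x(n) = (1/N) sum_k X(k) W_N^{-kn}."""
    N = len(X)
    W = cmath.exp(-2j * cmath.pi / N)
    return [sum(X[k] * W ** (-k * n) for k in range(N)) / N for n in range(N)]

x = [1.0, 2.0, 3.0, 4.0]
X = dft(x)            # approximately [10, -2+2j, -2, -2-2j]
x_back = idft(X)      # recovers x up to rounding error
```

Each output sample needs $N$ complex multiplications, which is the source of the $N^2$ operation count of the direct method.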

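In matrix form, (6.4), (6.5) can be rewritten as

$$\mathbf{X} = T_F \mathbf{x}, \qquad \mathbf{x} = T_F^{-1}\mathbf{X} = \frac{1}{N} T_F^{*} \mathbf{X},$$

where

$$\mathbf{X} = (X(0), X(1), \dots, X(N-1))^{T}, \qquad \mathbf{x} = (x(0), x(1), \dots, x(N-1))^{T},$$

and

$$T_F = \begin{pmatrix}
1 & 1 & \cdots & 1 & 1 \\
1 & W_N & \cdots & W_N^{N-2} & W_N^{N-1} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
1 & W_N^{N-2} & \cdots & W_N^{(N-2)(N-2)} & W_N^{(N-2)(N-1)} \\
1 & W_N^{N-1} & \cdots & W_N^{(N-1)(N-2)} & W_N^{(N-1)(N-1)}
\end{pmatrix}$$

is the transform matrix. The basis functions are powers of $W_N = e^{-j2\pi/N}$.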


Since $T_F^{-1} = \frac{1}{N} T_F^{*}$, the DFT is an orthogonal transform. It is easy to normalize the DFT in order to obtain an orthonormal transform: for this purpose we should use the factor $1/\sqrt{N}$ in both the forward and the inverse transform instead of using the factor $1/N$ in the inverse transform only.
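The effect of the $1/\sqrt{N}$ normalization can be checked on a small case. The sketch below builds the normalized DFT matrix for $N = 4$ (the size, the helper name `unitary_dft_matrix` and the test vector are assumptions for illustration) and verifies that its rows are orthonormal and that the energy of the input is preserved.

```python
import cmath
import math

def unitary_dft_matrix(N):
    """DFT matrix scaled by 1/sqrt(N): entry (k, n) is W_N^{kn} / sqrt(N)."""
    W = cmath.exp(-2j * cmath.pi / N)
    s = 1 / math.sqrt(N)
    return [[s * W ** (k * n) for n in range(N)] for k in range(N)]

N = 4
U = unitary_dft_matrix(N)

# Product of U with its conjugate transpose; expected to be the identity matrix.
gram = [[sum(U[i][k] * U[j][k].conjugate() for k in range(N))
         for j in range(N)] for i in range(N)]

x = [1.0, 2.0, 3.0, 4.0]
X = [sum(U[k][n] * x[n] for n in range(N)) for k in range(N)]

energy_x = sum(abs(v) ** 2 for v in x)    # 30
energy_X = sum(abs(v) ** 2 for v in X)    # also 30 for the normalized transform
```

With the unnormalized matrix the coefficient energy would be $N$ times larger than the signal energy.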

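The 2-dimensional DFT is defined as follows:

$$X(k, l) = \sum_{n=0}^{N-1}\sum_{m=0}^{M-1} x(n, m) e^{-j2\pi\left(\frac{kn}{N} + \frac{lm}{M}\right)}, \qquad 0 \le k \le N-1, \quad 0 \le l \le M-1,$$

where $x(n, m)$ is an element of the input matrix $\mathbf{x}$ and $X(k, l)$ is an element of $\mathbf{X}$.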


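Since

$$X(k, l) = \sum_{m=0}^{M-1}\left(\sum_{n=0}^{N-1} x(n, m) e^{-j\frac{2\pi kn}{N}}\right) e^{-j\frac{2\pi lm}{M}} = \sum_{n=0}^{N-1}\left(\sum_{m=0}^{M-1} x(n, m) e^{-j\frac{2\pi lm}{M}}\right) e^{-j\frac{2\pi kn}{N}},$$

the 2-dimensional DFT can be split into two 1-dimensional transforms; that is, the DFT is a separable transform.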
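Separability means the 2-D DFT can be computed by applying a 1-D DFT to every row and then to every column. The sketch below compares this row-column procedure with the direct double sum on a small 2x3 matrix; the function names and the input values are assumptions for illustration.

```python
import cmath

def dft(x):
    """Direct 1-D DFT of a sequence."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dft2_direct(x):
    """2-D DFT evaluated as the double sum over n and m."""
    N, M = len(x), len(x[0])
    return [[sum(x[n][m] * cmath.exp(-2j * cmath.pi * (k * n / N + l * m / M))
                 for n in range(N) for m in range(M))
             for l in range(M)] for k in range(N)]

def dft2_separable(x):
    """2-D DFT computed as 1-D DFTs over rows, then over columns."""
    rows = [dft(row) for row in x]                   # transform over m in each row
    cols = [dft(list(col)) for col in zip(*rows)]    # transform over n in each column
    return [list(r) for r in zip(*cols)]             # transpose back to X[k][l]

x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
X_direct = dft2_direct(x)
X_sep = dft2_separable(x)
```

The two results coincide up to rounding error, and the element `X_sep[0][0]` equals the sum of all input elements, 21.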

Linearity: $DFT\{a x(n) + b y(n)\} = a\,DFT\{x(n)\} + b\,DFT\{y(n)\}$.

Circular convolution: Let $X(k)$ and $Y(k)$ be the DFTs of $x(n)$ and $y(n)$, respectively. Then the inverse transform of the product $X(k)Y(k)$ is

$$v(l) = \frac{1}{N}\sum_{k=0}^{N-1} X(k) Y(k) e^{j\frac{2\pi}{N}kl}.$$


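By inserting the definitions of $X(k)$ and $Y(k)$ we obtain

$$v(l) = \frac{1}{N}\sum_{k=0}^{N-1}\left(\sum_{n=0}^{N-1} x(n) e^{-j\frac{2\pi}{N}kn}\right)\left(\sum_{m=0}^{N-1} y(m) e^{-j\frac{2\pi}{N}km}\right) e^{j\frac{2\pi}{N}kl}.$$

Changing the order of summation we get

$$v(l) = \frac{1}{N}\sum_{n=0}^{N-1} x(n) \sum_{m=0}^{N-1} y(m) \left[\sum_{k=0}^{N-1} e^{j\frac{2\pi}{N}k(l-n-m)}\right].$$

The sum in brackets is equal to 0 for all $m$ and $n$ except $m = (l - n) \bmod N$, for which it is equal to $N$. Therefore

$$v(l) = \sum_{n=0}^{N-1} x(n)\, y((l-n) \bmod N) = \sum_{n=0}^{N-1} y(n)\, x((l-n) \bmod N)$$

is the circular, or periodic, convolution.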

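Example

Let $x(n) = (2, 3, 1)$ and $y(n) = (1, 2, 4)$. We extend $x(n)$ periodically: $(\dots, 2, 3, 1, 2, 3, 1, 2, 3, 1, \dots)$. Then $v(l)$ is determined as follows:

$$v(0) = y(0)x(0) + y(1)x(2) + y(2)x(1) = 1 \cdot 2 + 2 \cdot 1 + 4 \cdot 3 = 16,$$

$$v(1) = y(0)x(1) + y(1)x(0) + y(2)x(2) = 1 \cdot 3 + 2 \cdot 2 + 4 \cdot 1 = 11,$$

$$v(2) = y(0)x(2) + y(1)x(1) + y(2)x(0) = 1 \cdot 1 + 2 \cdot 3 + 4 \cdot 2 = 15.$$

The periodic sequence $x(n)$ is reversed in time and multiplied by the corresponding terms of $y(n)$. Then the reversed $x(n)$ is shifted by 1 sample to the right and again multiplied by $y(n)$, generating the new sample of $v(l)$.

Example

The periodic reversed $x(n)$: 1 3 2 1 3 2 1 3 2 1 3 2

The reversed $x(n)$ shifted to the right by 1 sample: 2 1 3 2 1 3 2 1 3 2 1 3 2

The reversed $x(n)$ shifted to the right by 2 samples: 3 2 1 3 2 1 3 2 1 3 2 1 3

In these three rows the samples aligned with $y(n)$ = 1 2 4 are (2, 1, 3), (3, 2, 1) and (1, 3, 2), which give $v(0) = 16$, $v(1) = 11$ and $v(2) = 15$, respectively.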
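The worked example can be reproduced both from the circular convolution sum and through the DFT. The sketch below uses the sequences from the example; the function names are chosen here for illustration, and the DFT is evaluated directly.

```python
import cmath

def circular_convolution(x, y):
    """v(l) = sum_n x(n) y((l - n) mod N)."""
    N = len(x)
    return [sum(x[n] * y[(l - n) % N] for n in range(N)) for l in range(N)]

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

x = [2, 3, 1]
y = [1, 2, 4]

v_direct = circular_convolution(x, y)               # [16, 11, 15]

# Same result through the DFT: inverse transform of the product X(k) Y(k).
X, Y = dft(x), dft(y)
v_complex = idft([a * b for a, b in zip(X, Y)])
v_dft = [round(v.real) for v in v_complex]          # imaginary parts are rounding noise
```

Both routes give (16, 11, 15), matching the values computed by hand.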


Circular shift:

$$DFT\{x((n - l) \bmod N)\} = X(k) W_N^{lk}.$$

That is, a displacement of $l$ samples from the end to the beginning of $x(n)$ is equivalent to multiplying the DFT of this sequence by $\exp\{-j2\pi lk/N\}$.

Example.

Let $x(n)$ be equal to 1 2 3 4 5 3 7 4 and let its DFT be $X(k)$. Then $x((n - 3) \bmod 8)$ is 3 7 4 1 2 3 4 5, and its DFT is $X(k)\exp\{-j2\pi 3k/8\}$.
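The shift property can be verified numerically on the sequence from the example. In the sketch below the DFT is evaluated directly, and the names `shifted`, `predicted` and `actual` are chosen here for illustration.

```python
import cmath

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

x = [1, 2, 3, 4, 5, 3, 7, 4]
N, l = len(x), 3

# Move the last l samples to the beginning: x((n - l) mod N).
shifted = [x[(n - l) % N] for n in range(N)]         # [3, 7, 4, 1, 2, 3, 4, 5]

X = dft(x)
W = cmath.exp(-2j * cmath.pi / N)                    # W_N = exp(-j 2 pi / N)
predicted = [X[k] * W ** (l * k) for k in range(N)]  # X(k) W_N^{lk}
actual = dft(shifted)
```

The lists `predicted` and `actual` agree up to rounding error, so the shift changes only the phases of the coefficients and not their magnitudes.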


In the general case the DFT and IDFT require approximately $N^2$ additions and $N^2$ multiplications of complex numbers. There exist so-called fast algorithms which reduce the computational complexity of the DFT to the order of $N \log_2 N$ operations.
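One well-known fast algorithm is the recursive radix-2 decimation-in-time FFT; the text does not single out a particular algorithm, so this choice is an assumption made for illustration. The sketch requires the length to be a power of two and compares the result with the direct DFT.

```python
import cmath

def dft(x):
    """Direct DFT with about N^2 complex multiplications."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    N = len(x)
    if N == 1:
        return list(x)
    even = fft(x[0::2])          # DFT of the even-indexed samples
    odd = fft(x[1::2])           # DFT of the odd-indexed samples
    out = [0j] * N
    for k in range(N // 2):
        t = cmath.exp(-2j * cmath.pi * k / N) * odd[k]   # twiddle factor W_N^k
        out[k] = even[k] + t
        out[k + N // 2] = even[k] - t
    return out

x = [1, 2, 3, 4, 5, 3, 7, 4]
X_fast = fft(x)
X_slow = dft(x)
```

Each of the $\log_2 N$ recursion levels performs $N/2$ twiddle multiplications, which gives the $N \log_2 N$ order of growth instead of $N^2$.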
