Video Signal Compression
Digital Video
• Efficient use of spectrum
• Less sensitive to noise and distortions
• Integration of digital services
• Data encryption
Raw image (R, G and B)
576 lines; 5.5 MHz = 720 pixels per line
576 lines/frame, 720 pixels/line, 50 fields/second (25 frames/second), 8 bits per pixel
Total: 576 x 720 x 25 x 3 x 8 = 249 Mbit/s
R: 83 Mbit/s, G: 83 Mbit/s, B: 83 Mbit/s, Total: 249 Mbit/s
Figure 2a
Channel spectrum (frequency axis f): vestigial sideband (VSB) down to -1.75 MHz, chroma at 4.43 MHz, sound at 6 MHz
Chrominance can be represented with a considerably narrower bandwidth (resolution) than luminance
PAL system: Luminance Y
576 lines; 5.5 MHz = 720 pixels per line
576 lines/frame, 720 pixels/line, 50 fields/second, 8 bits per pixel
Total: 576 x 720 x 25 x 8 = 83 Mbit/s
Y: 83 Mbit/s
Figure 2b
PAL system: Chrominance U
288 lines; 2.75 MHz = 360 pixels per line
288 lines/frame, 360 pixels/line, 50 fields/second, 8 bits per pixel
Total: 288 x 360 x 25 x 8 = 21 Mbit/s
Y: 83 Mbit/s, U: 21 Mbit/s
Figure 2c
PAL system: Chrominance V
288 lines; 2.75 MHz = 360 pixels per line
288 lines/frame, 360 pixels/line, 50 fields/second, 8 bits per pixel
Total: 288 x 360 x 25 x 8 = 21 Mbit/s
Y: 83 Mbit/s, U: 21 Mbit/s, V: 21 Mbit/s, Total: 125 Mbit/s
Figure 2d
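The bit-rate arithmetic of Figures 2a to 2d can be checked with a short sketch. The function name `bit_rate` and its default arguments are illustrative assumptions, not part of the slides; 25 frames/second corresponds to the 50 fields/second quoted above.

```python
def bit_rate(lines, pixels_per_line, frames_per_second=25, bits_per_pixel=8):
    """Uncompressed bit rate of one picture component, in bit/s."""
    return lines * pixels_per_line * frames_per_second * bits_per_pixel

rgb = 3 * bit_rate(576, 720)   # R, G and B at full resolution
y = bit_rate(576, 720)         # luminance at full resolution
u = bit_rate(288, 360)         # chrominance, downsampled by 2 in each direction
v = bit_rate(288, 360)

for name, rate in [("RGB", rgb), ("Y", y), ("U", u), ("V", v), ("Y+U+V", y + u + v)]:
    print(f"{name}: {rate / 1e6:.1f} Mbit/s")
```

The exact Y+U+V sum is about 124.4 Mbit/s; the 125 Mbit/s on the slide comes from adding the rounded values 83 + 21 + 21.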
Actual size: 249 Mbit/s
U, V downsampled: 125 Mbit/s
Superior quality: 6 Mbit/s
Medium quality: 1.2 Mbit/s
Result: compression is necessary
Redundancy in image contents
Adjacent pixels are similar
Intensity variations can be predicted
Sequential frames are similar
Lossy compression: removal of redundant information, resulting in distortion to which human perception is insensitive
Figure 3a
Lenna
Pixels within this region have similar but not totally identical intensity.
Figure 3b
Intensity versus position, and the autocorrelation function (lags -5 to 5, values from 0.2 to 1.0)
Figure 4
Interpolation
1. Pixel intensities usually vary smoothly except at edge (dominant/salient) points.
2. Record pixels at dominant points only.
3. Reconstruct the pixels between dominant points by interpolation.
4. A straightforward method: join dominant points with straight lines.
5. High compression ratio for a smoothly varying intensity profile.
6. Difficulty: how to identify dominant points?
Intensity versus position
Figure 5a
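The interpolation steps can be sketched as follows. This is a minimal illustration that assumes the dominant points are already known (the slides leave their detection open) and that the first and last samples are always dominant points; the function name `reconstruct` and the sample values are made up for the example.

```python
def reconstruct(dominant_points, length):
    """Rebuild `length` samples by joining (position, intensity) points with straight lines."""
    points = sorted(dominant_points)
    signal = []
    for n in range(length):
        # Find the pair of dominant points that brackets position n.
        for (n0, y0), (n1, y1) in zip(points, points[1:]):
            if n0 <= n <= n1:
                signal.append(y0 + (y1 - y0) * (n - n0) / (n1 - n0))
                break
    return signal

original = [10, 12, 14, 16, 18, 20, 60, 100, 100, 100]
dominant = [(0, 10), (5, 20), (7, 100), (9, 100)]   # 4 recorded points instead of 10 samples
rebuilt = reconstruct(dominant, len(original))
print(rebuilt)
```

Because this profile is piecewise linear, the four recorded points reproduce all ten samples; a real intensity profile would only be approximated.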
Transmit only selected pixels; predict the rest
Prediction of the current sample is based on previous ones
[Block diagram: Quantizer (Q) and Predictor (P), with signals $x(n)$, $\hat{x}(n)$, $e_p(n)$, $x_p(n)$ and $e_{pq}(n)$]
Input signal: $x(n)$
Predicted signal: $x_p(n)$
Error signal: $e_p(n) = x(n) - x_p(n)$
Quantized error signal: $e_{pq}(n)$
Reconstructed signal: $\hat{x}(n) = x_p(n) + e_{pq}(n)$
Quantizer: representation of a continuous dynamic range with a finite number of discrete levels (will be discussed later)
Error = Quantization error
Function of Predictive Coding: Data Compression
[Block diagram: Quantizer (Q) and Predictor (P)]
$x(n) = 0, 41, 82, 119, 163, 198, 241$ (8 bits)
$x_p(n) = 0, 40, 81, 120, 162, 199, 243$
Prediction error: $e_p(n) = x(n) - x_p(n) = 0, 1, 1, -1, 1, -1, -2$ (3 bits)
The better the predictor, the higher the compression ratio
A simple example:
$x(n) = 0, 7, 14, 20, 28, 32, 39, 45$ (6 bits)
Predictor: $x_p(n) = \hat{x}(n-1)$
2-bit quantizer:
Level   x (input)   y (output)
0       0 to 2      0
1       3 to 5      3
2       6 to 8      6
3       9 to 11     9
[Block diagram: Quantizer (Q) and Predictor (P)]
n   x(n)   x_p(n)   e_p(n)   e_pq(n)   x̂(n)
0   0      0        0        0         0
1   7      0        7        6         6
2   14     6        8        6         12
3   20     12       8        6         18
4   28     18       10       9         27
$e_{pq}(n) = 0, 6, 6, 6, 9, \ldots$ and $\hat{e}_{pq}(n) = 0, 2, 2, 2, 3, \ldots$
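The worked example can be reproduced with a short sketch of the predictive encoder and decoder. It assumes the previous-sample predictor $x_p(n) = \hat{x}(n-1)$ with $x_p(0) = 0$ and the 2-bit quantizer table above; clamping errors outside 0 to 11 to the nearest level is an assumption, since the table does not cover them. All function names are illustrative.

```python
def quantize(error, step=3, levels=4):
    """2-bit quantizer: level = error // 3, clamped to 0..3; returns (level, quantized error)."""
    level = max(0, min(levels - 1, error // step))
    return level, level * step

def dpcm_encode(samples):
    """Return the transmitted levels and the encoder's own reconstruction."""
    levels, reconstructed = [], []
    previous = 0                       # x_p(0) = 0
    for x in samples:
        level, e_pq = quantize(x - previous)
        previous += e_pq               # reconstructed sample = prediction + quantized error
        levels.append(level)
        reconstructed.append(previous)
    return levels, reconstructed

def dpcm_decode(levels, step=3):
    """Decoder: inverse quantizer (level * step) followed by the same predictor."""
    out, previous = [], 0
    for level in levels:
        previous += level * step
        out.append(previous)
    return out

x = [0, 7, 14, 20, 28, 32, 39, 45]
levels, x_hat = dpcm_encode(x)
print(levels)
print(x_hat)
```

Each 6-bit sample is replaced by a 2-bit level, and the decoder reproduces the encoder's reconstruction because both run the same predictor on the same quantized errors.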
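Predictive Decoder
[Block diagram: encoder with Quantizer (Q) and Predictor (P); decoder with Predictor (P); $Q^{-1}$ blocks convert transmitted levels back to error values]
Reconstructed signal: $\hat{x}(n) = x_p(n) + e_{pq}(n)$
Reconstruction error: $x(n) - \hat{x}(n) = [x_p(n) + e_p(n)] - [x_p(n) + e_{pq}(n)] = e_p(n) - e_{pq}(n)$
Quantization error: $e_p(n) - e_{pq}(n)$
Option: the quantized levels $\hat{e}_{pq}(n)$ are transmitted instead of the actual errors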
Predictive Decoder
2-bit quantizer:
Level   x (input)   y (output)
0       0 to 2      0
1       3 to 5      3
2       6 to 8      6
3       9 to 11     9
$x(n) = 0, 7, 14, 20, 28, 32, 39, 45$ (6 bits)
n   ê_pq(n)   e_pq(n)   x_p(n)   x̂(n)
0   0         0         0        0
1   2         6         0        6
2   2         6         6        12
3   2         6         12       18
4   3         9         18       27
Error = Quantization error
[Block diagram: encoder with Quantizer (Q), $Q^{-1}$ and Predictor (P), transmitting $\hat{e}_{pq}(n)$; decoder with $Q^{-1}$ and Predictor (P), producing $\hat{x}(n)$]
Predictive Decoder with a 1-bit quantizer
[Block diagram: encoder with Quantizer (Q), $Q^{-1}$ and Predictor (P); decoder with $Q^{-1}$ and Predictor (P)]
1-bit quantizer:
Level   x (input)   y (output)
0       positive    +S
1       negative    -S
where S = fixed step size
$x(n) = 0, 7, 14, 20, 28, 32, 39, 45$ (6 bits)
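A 1-bit predictive coder with a fixed step size is commonly called delta modulation (a name the slides do not use). The sketch below applies it to the same input; the step size S = 7, the choice to treat a zero error as positive, and the function name `delta_encode` are assumptions made for illustration.

```python
def delta_encode(samples, step):
    """1-bit predictive coding: transmit only the sign of the prediction error."""
    bits, reconstructed = [], []
    previous = 0                                  # x_p(0) = 0
    for x in samples:
        bit = 0 if x - previous >= 0 else 1       # level 0 = positive (+S), level 1 = negative (-S)
        previous += step if bit == 0 else -step
        bits.append(bit)
        reconstructed.append(previous)
    return bits, reconstructed

x = [0, 7, 14, 20, 28, 32, 39, 45]
bits, x_hat = delta_encode(x, step=7)
print(bits)
print(x_hat)
```

With only two output values the reconstruction tracks the input coarsely: it overshoots at n = 3 and lags afterwards, which is the price of 1 bit per sample.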
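Prediction based on the linear combination of previously reconstructed samples
Current sample: $x_p(n) = \sum_{l=1}^{k} a_l \hat{x}(n-l)$
where $\sum_{l=1}^{k} a_l \le 1$, so that $\sum_{l=1}^{k} a_l x_{\max} \le x_{\max}$
Optimal predictor design by minimizing the Mean Square Prediction Error (MSPE)
$MSPE = E\{e_p^2(n)\} = E\{[x(n) - x_p(n)]^2\} = E\left\{\left[x(n) - \sum_{l=1}^{k} a_l x(n-l)\right]^2\right\}$
$\min_{a_i} MSPE: \quad \frac{\partial E\{e_p^2(n)\}}{\partial a_i} = 0$
$\tilde{a} = \tilde{R}^{-1} \tilde{r}$
$\tilde{a} = [a_1, a_2, \ldots, a_k]^T$
$\tilde{r} = [E\{x(n)x(n-1)\}, E\{x(n)x(n-2)\}, \ldots, E\{x(n)x(n-k)\}]^T$
$\tilde{R} = E\begin{bmatrix} x(n-1)x(n-1) & x(n-1)x(n-2) & \cdots & x(n-1)x(n-k) \\ x(n-2)x(n-1) & x(n-2)x(n-2) & \cdots & x(n-2)x(n-k) \\ \vdots & \vdots & & \vdots \\ x(n-k)x(n-1) & x(n-k)x(n-2) & \cdots & x(n-k)x(n-k) \end{bmatrix}$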
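The normal equations $\tilde{R}\tilde{a} = \tilde{r}$ can be solved numerically. The sketch below assumes a wide-sense stationary signal, so that $E\{x(n-i)x(n-j)\}$ depends only on $|i-j|$, and takes the autocorrelation values as given; the helper names `solve` and `optimal_predictor` and the example autocorrelation $0.9^m$ are illustrative assumptions.

```python
def solve(matrix, rhs):
    """Solve matrix * a = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    a = [0.0] * n
    for i in reversed(range(n)):          # back substitution
        s = sum(aug[i][j] * a[j] for j in range(i + 1, n))
        a[i] = (aug[i][n] - s) / aug[i][i]
    return a

def optimal_predictor(autocorr, k):
    """autocorr[m] = E{x(n) x(n-m)}; returns the coefficients a_1..a_k."""
    R = [[autocorr[abs(i - j)] for j in range(k)] for i in range(k)]
    r = [autocorr[m] for m in range(1, k + 1)]
    return solve(R, r)

rho = 0.9
coeffs = optimal_predictor([rho ** m for m in range(3)], k=2)
print(coeffs)
```

For this autocorrelation the solution is approximately [0.9, 0]: the previous sample alone carries all the predictable information, so a second coefficient adds nothing.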
Intensity versus position, with levels Y and A marked
Figure 5b
e.g. A sin(n/T) + Y
Intensity versus position n, with levels A and Y marked
Figure 5d
1. Select a basis: a set of fixed functions {f0(n), f1(n), f2(n), f3(n), ..., fN(n)}
2. Assume that all types of signals can be approximated by a linear combination of these functions, i.e. A(n) = a0f0(n) + a1f1(n) + a2f2(n) + ... + aNfN(n)
3. Calculate the coefficients a0, a1, ..., aN
4. Represent the input signal with the coefficients instead of the actual data
5. Compression: use fewer coefficients, e.g. a0, a1, ..., aK (K < N)
6. For example: the set of sine and cosine waves
Major Steps
1. Adopt the sine and cosine waves as a basis
2. Calculate the Fourier coefficients (note: a sequence of N points gives N complex coefficients)
3. Encoding (compression): represent the signal with the first K coefficients, where K < N
4. Decoding (decompression): reconstruct the signal from the K coefficients with the inverse Fourier Transform
5. Other transforms (e.g. the Walsh Transform) can be adopted
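Sinusoidal Waves
Set of basis functions:
$W_0 = [W(0,0), W(1,0), \ldots, W(N-1,0)]$
$\vdots$
$W_{N-1} = [W(0,N-1), W(1,N-1), \ldots, W(N-1,N-1)]$
$\langle A, B \rangle$ denotes the dot product between A and B
$S = [X(0), \ldots, X(N-1)]^T = [\langle s, W_0 \rangle, \ldots, \langle s, W_{N-1} \rangle]^T$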
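Transform from the “s” domain to the “S” domain
$X(k) = \sum_{n=0}^{N-1} x(n) W(n,k)$
$X(0) = \sum_{n=0}^{N-1} x(n) W(n,0)$
$X(1) = \sum_{n=0}^{N-1} x(n) W(n,1)$
$X(2) = \sum_{n=0}^{N-1} x(n) W(n,2)$
$s = [x(0), x(1), x(2), \ldots, x(N-1)]$
$S = [X(0), X(1), X(2), \ldots, X(N-1)]$
[Diagram: X(k) is obtained by multiplying x(0), x(1), x(2), ..., x(N-2), x(N-1) by W(0,k), W(1,k), W(2,k), ..., W(N-2,k), W(N-1,k) and summing]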
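A. Orthogonal Property
$\langle W_j, W_k \rangle = \frac{1}{N} \sum_{n=0}^{N-1} W(n,j) W(n,k) = \delta(j-k)$, where $\delta$ is the delta function
B. Orthonormal Property
$\langle W_j, W_k \rangle = 1$ if $j = k$, and $0$ otherwise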
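Inverse Transform from the “S” domain to the “s” domain
$s = [x(0), \ldots, x(N-1)]^T = [\langle S, W'_0 \rangle, \ldots, \langle S, W'_{N-1} \rangle]^T$
$\langle A, B \rangle$ denotes the dot product between A and B
$W'_k$ and $W_k$ are complex conjugates
$S = [X(0), X(1), X(2), \ldots, X(N-1)]$
$x(n) = \sum_{k=0}^{N-1} X(k) W^*(n,k)$
$x(0) = \sum_{k=0}^{N-1} X(k) W^*(0,k)$
$x(1) = \sum_{k=0}^{N-1} X(k) W^*(1,k)$
$x(2) = \sum_{k=0}^{N-1} X(k) W^*(2,k)$
$s = [x(0), x(1), x(2), \ldots, x(N-1)]$
[Diagram: x(n) is obtained by multiplying X(0), X(1), X(2), ..., X(N-2), X(N-1) by W*(n,0), W*(n,1), W*(n,2), ..., W*(n,N-2), W*(n,N-1) and summing]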
Note: X(k) is complex
$X(k) = \sum_{n=0}^{N-1} x(n) W(n,k)$
$W(n,k) = e^{-j 2\pi nk/N} = \cos\left(\frac{2\pi nk}{N}\right) - j \sin\left(\frac{2\pi nk}{N}\right)$
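$X(k) = \frac{C(k)}{2} \sum_{n=0}^{N-1} x(n) W(n,k)$
$W(n,k) = \cos\left[\frac{(2n+1)k\pi}{2N}\right]$
$C(k) = 2^{-0.5}$ for $k = 0$, $= 1$ otherwise
Note: X(k) is real
$x(n) = \sum_{k=0}^{N-1} \frac{C(k)}{2} X(k) W'(n,k)$
$W'(n,k) = \cos\left[\frac{(2n+1)k\pi}{2N}\right]$
Note: $W_k$ is real, therefore $W'_k = W_k$
$C(k) = 2^{-0.5}$ for $k = 0$, $= 1$ otherwise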
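The cosine transform pair and its use for compression can be sketched as below. The scale factor is written as $\sqrt{2/N}$, which is an assumption that reduces to the factor 1/2 of the formulas above when N = 8; the sample values and the choice of keeping 3 coefficients are illustrative.

```python
import math

def c(k):
    return 2 ** -0.5 if k == 0 else 1.0

def dct(x):
    """Forward cosine transform of a list of N samples."""
    n_pts = len(x)
    scale = math.sqrt(2.0 / n_pts)          # equals 1/2 when N = 8
    return [scale * c(k) * sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * n_pts))
                               for n in range(n_pts))
            for k in range(n_pts)]

def idct(X):
    """Inverse cosine transform; the basis is real, so the same cosines are reused."""
    n_pts = len(X)
    scale = math.sqrt(2.0 / n_pts)
    return [scale * sum(c(k) * X[k] * math.cos((2 * n + 1) * k * math.pi / (2 * n_pts))
                        for k in range(n_pts))
            for n in range(n_pts)]

x = [100, 102, 104, 107, 110, 112, 115, 118]   # a smooth intensity profile
X = dct(x)
kept = X[:3] + [0.0] * (len(X) - 3)            # compression: keep the first 3 coefficients
approx = idct(kept)
energy_fraction = sum(v * v for v in X[:3]) / sum(v * v for v in X)
max_error = max(abs(a - b) for a, b in zip(approx, x))
print([round(v, 2) for v in X])
print([round(v, 1) for v in approx], round(energy_fraction, 5))
```

For a smooth profile like this one, nearly all the energy sits in the first few coefficients, so discarding the rest changes each sample only slightly.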
Transforms that are suitable for compression should exhibit the following properties:
a. There exists an inverse transform
b. Decorrelation
c. Good energy compactness
Optimal transform: Karhunen-Loeve Transform (KLT)
Decorrelation
x(0), x(1), x(2), ..., x(N-2), x(N-1): a sample can be predicted from its neighbor(s)
X(0), X(1), ..., X(7) (magnitude of frequency components): after the DFT, a coefficient is less predictable from its neighbor(s)
Energy compactness
x(0), x(1), x(2), ..., x(N-2), x(N-1): all samples are important; any missing sample causes large distortion
DFT samples X(0), X(1), ..., X(7): the signal can be reconstructed from the first 3 samples with good approximation
All information is concentrated in a small number of elements in the transformed domain
The DCT has very good energy compactness and decorrelation properties
2-D DCT
$X(j,k) = \frac{C(j)}{2} \frac{C(k)}{2} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} x(m,n) W(m,j) W(n,k)$
$W(m,j) = \cos\left[\frac{(2m+1)j\pi}{2M}\right]$, $W(n,k) = \cos\left[\frac{(2n+1)k\pi}{2N}\right]$
$C(k)$, $C(j)$ $= 2^{-0.5}$ for $k = 0$ and $j = 0$, respectively; $= 1$ otherwise
Input block:
x(0,0)    x(0,1)    x(0,2)    ...  x(0,N-1)
x(1,0)    x(1,1)    x(1,2)    ...  x(1,N-1)
...
x(M-1,0)  x(M-1,1)  x(M-1,2)  ...  x(M-1,N-1)
Output block:
X(0,0)    X(0,1)    X(0,2)    ...  X(0,N-1)
X(1,0)    X(1,1)    X(1,2)    ...  X(1,N-1)
...
X(M-1,0)  X(M-1,1)  X(M-1,2)  ...  X(M-1,N-1)
2-D IDCT
$x(m,n) = \sum_{j=0}^{M-1} \sum_{k=0}^{N-1} \frac{C(j)}{2} \frac{C(k)}{2} X(j,k) W(m,j) W(n,k)$
with $W(m,j)$, $W(n,k)$, $C(j)$ and $C(k)$ as defined for the 2-D DCT; the block X(j,k) is mapped back to the block x(m,n)
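Given a signal $f(n) = [f(0), f(1), \ldots, f(N-1)]$, define the mean and autocorrelation as
$E\{f(n)\} = \mu(n)$ and $E\{f(n) f(n+k)\} = R(n, n+k)$ (O1)
Assume f(n) is wide-sense stationary, i.e. its statistical properties are constant with changes in time
$\mu(n) = \mu = \text{constant}$ and $R(n, n+k) = R(k)$ (O2)
Define $\mathbf{f} = [f(0), f(1), \ldots, f(N-1)]^T$ and $\boldsymbol{\mu} = \mu [1, 1, \ldots, 1]^T$
$R_f = \begin{bmatrix} R(0) & R(1) & \cdots & R(N-1) \\ R(1) & R(0) & \cdots & R(N-2) \\ \vdots & \vdots & & \vdots \\ R(N-1) & R(N-2) & \cdots & R(0) \end{bmatrix} = \sigma^2 \begin{bmatrix} 1 & \rho(1) & \cdots & \rho(N-1) \\ \rho(1) & 1 & \cdots & \rho(N-2) \\ \vdots & \vdots & & \vdots \\ \rho(N-1) & \rho(N-2) & \cdots & 1 \end{bmatrix}$ (O3)
where $R(k) = \sigma^2 \rho(k)$ and $\rho(0) = 1$
Equation O1 can be rewritten as
$R_f = E\{\mathbf{f}\mathbf{f}^T\}$ (O4)
The covariance of f is given by
$C = \mathrm{cov}(\mathbf{f}) = E\{(\mathbf{f} - \boldsymbol{\mu})(\mathbf{f} - \boldsymbol{\mu})^T\} = R_f - \boldsymbol{\mu}\boldsymbol{\mu}^T$ (O5)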
The signal is transformed to its spectral coefficients
$\theta_k = \sum_{s=0}^{N-1} f(s) \phi_k^*(s), \quad 0 \le k \le N-1$
Comparing the two sequences $f(n) = [f(0), f(1), \ldots, f(N-1)]$ and $\theta = [\theta_0, \theta_1, \ldots, \theta_{N-1}]$:
f(n): a. Adjacent terms are related; b. Every term is important
θ: a. Adjacent terms are unrelated; b. Only the first few terms are important
Similar to f, we can define the mean, autocorrelation and covariance matrix for θ:
$R_\theta = E\{\theta \theta^T\}$
Adjacent terms are uncorrelated if every term is only correlated to itself, i.e., all off-diagonal terms in the autocorrelation matrix are zero.
Define a measurement on correlation between samples:
$\eta_f = \sum_{i} \sum_{j \ne i} R_f(i,j)$ and $\eta_\theta = \sum_{i} \sum_{j \ne i} R_\theta(i,j)$ (O6)
We assume that the mean of the signal is zero. This can be achieved simply by subtracting the mean from f if it is non-zero.
The covariance and autocorrelation matrices are the same after the mean is removed.
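Note: $\phi_r = [\phi_r(0), \phi_r(1), \ldots, \phi_r(N-1)]^T, \quad 0 \le r \le N-1$
If only the first L-1 terms are used to reconstruct the signal, we have
$\hat{\mathbf{f}}_L = \sum_{r=0}^{L-1} \theta_r \phi_r$ (O7)
If only the first L-1 terms are used to reconstruct the signal, the error is
$\mathbf{e}_L = \mathbf{f} - \hat{\mathbf{f}}_L = \sum_{r=L}^{N-1} \theta_r \phi_r$ (O8)
The energy lost is given by
$\mathbf{e}_L^T \mathbf{e}_L = \sum_{r=L}^{N-1} \theta_r^2$ (O9)
but $\theta_r = \sum_{k=0}^{N-1} f(k) \phi_r^*(k) = \phi_r^T \mathbf{f}$,
hence $\theta_r^2 = \phi_r^T \mathbf{f} \mathbf{f}^T \phi_r$ (O10)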
Eqn. O10 is valid for describing the approximation error of a single sequence of signal data f. A more generic description covering a collection of signal sequences is given by:
$J'_L = E\{\mathbf{e}_L^T \mathbf{e}_L\} = E\left\{\sum_{r=L}^{N-1} \theta_r^2\right\} = \sum_{r=L}^{N-1} \phi_r^T E\{\mathbf{f}\mathbf{f}^T\} \phi_r = \sum_{r=L}^{N-1} \phi_r^T R_f \phi_r$ (O11)
An optimal transform minimizes the error term in Eqn. O11. However, the solution space is enormous and a constraint is required. Noting that the basis functions are orthonormal, the following objective function is adopted.
$J = \sum_{r=L}^{N-1} \left[\phi_r^T R_f \phi_r - \lambda_r (\phi_r^T \phi_r - 1)\right]$ (O12)
The term $\lambda_r$ is known as the Lagrangian multiplier
The optimal solution can be found by setting the gradient of J to 0 for each value of r, i.e.,
$\nabla_{\phi_r} J = \frac{\partial J}{\partial \phi_r} = 0$ (O13)
Eqn O13 is based on the orthonormal property of the basis functions.
The solution for each basis function is given by
$R_f \phi_r = \lambda_r \phi_r$ (O14)
$\phi_r$ is an eigenvector of $R_f$ and $\lambda_r$ is an eigenvalue
Grouping the N basis functions gives an overall equation
$R_f \Phi^T = \Phi^T \Lambda$, where $\Phi = [\phi_0, \phi_1, \ldots, \phi_{N-1}]^T$ and $\Lambda = \mathrm{diag}(\lambda_0, \ldots, \lambda_{N-1})$ (O15)
$R_\theta = \Phi R_f \Phi^T = \Lambda$ (O16)
which is a diagonal matrix. The decorrelation criterion is satisfied
Summary
1. Given a signal $f(n) = [f(0), f(1), \ldots, f(N-1)]$
2. Determine the autocorrelation function $R_f$
3. The solution for each basis function is given by $R_f \phi_r = \lambda_r \phi_r$
4. The signal is transformed to its spectral coefficients $\theta_k = \sum_{s=0}^{N-1} f(s) \phi_k^*(s), \quad 0 \le k \le N-1$
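As a small worked illustration of the eigenvector solution, consider the case N = 2 for a zero-mean signal whose two samples have variance $\sigma^2$ and correlation coefficient $\rho$. The symbols $\phi_r$ (basis vectors), $\lambda_r$ (eigenvalues), $\theta_r$ (transform coefficients) and the value $\rho = 0.9$ are notation and numbers assumed for this example only.

```latex
R_f = \sigma^2 \begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix}, \qquad
\phi_0 = \tfrac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix},\ \lambda_0 = \sigma^2 (1 + \rho), \qquad
\phi_1 = \tfrac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix},\ \lambda_1 = \sigma^2 (1 - \rho)

\text{Check: } R_f \phi_0 = \tfrac{\sigma^2}{\sqrt{2}} \begin{bmatrix} 1 + \rho \\ \rho + 1 \end{bmatrix} = \lambda_0 \phi_0, \qquad
R_f \phi_1 = \tfrac{\sigma^2}{\sqrt{2}} \begin{bmatrix} 1 - \rho \\ \rho - 1 \end{bmatrix} = \lambda_1 \phi_1

R_\theta = \Phi R_f \Phi^T = \begin{bmatrix} \sigma^2 (1 + \rho) & 0 \\ 0 & \sigma^2 (1 - \rho) \end{bmatrix}, \qquad
\frac{\lambda_0}{\lambda_0 + \lambda_1} = \frac{1 + \rho}{2} = 0.95 \ \text{for } \rho = 0.9
```

The off-diagonal terms vanish (decorrelation), and for strongly correlated samples the first coefficient alone carries 95% of the energy (energy compactness).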
Redundancy in images
The probability distribution of pixel values is uneven
Assume the pixel intensity (gray scale) ranges from 0 to 255 units
Figure 6a
Probability of occurrence versus pixel intensity (0 to 255)
Figure 6b
Use fewer bits to represent pixel intensities that occur more often
A simple example: an image of 720 x 576 pixels at 8 bits per pixel
Image size = 720 x 576 x 8 = 3.3 Mbits
Intensity   Pr        # of bits   Bit string
0 - 254     0.00196   9           1XXXXXXXX
255         0.500     1           0
Total = (720 x 576) x (0.500 x 1 + 0.002 x 255 x 9) = 2.1 Mbits
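The saving from the variable-length code can be checked numerically. The sketch assumes the probabilities of the example (0.5 for intensity 255, the remainder shared equally by intensities 0 to 254) and the two bit strings given above; the function name `encode_pixel` is illustrative.

```python
def encode_pixel(value):
    """'0' for intensity 255, otherwise '1' followed by the 8-bit intensity."""
    return "0" if value == 255 else "1" + format(value, "08b")

pixels = 720 * 576
p_255 = 0.5
p_other = (1 - p_255) / 255              # about 0.00196 for each of 0..254

fixed_bits = pixels * 8
average_bits = p_255 * len(encode_pixel(255)) + 255 * p_other * len(encode_pixel(0))
variable_bits = pixels * average_bits

print(f"fixed length:    {fixed_bits / 1e6:.1f} Mbits")
print(f"variable length: {variable_bits / 1e6:.1f} Mbits ({average_bits:.1f} bits/pixel)")
```

The leading bit makes the code uniquely decodable: a '0' is a complete codeword, and a '1' announces exactly eight more bits.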
Sequential frames are similar
Sequential frames P1, P2, P3
Figure 7
Only about 5-10% of the content changes between frames
Still picture - JPEG
Joint Photographic Experts Group
• International Organization for Standardization (ISO) standard.
• Based on the Discrete Cosine Transform (DCT).
Motion picture - MPEG
Moving Picture Experts Group
JPEG processing steps: Image, Digitization, Image Vectors, DCT, Quantization, Zig-Zag Coding, Runlength Coding, Entropy Coding, JPEG Compressed Format
Digitization
Figure 8
Figure 9
Image Digitization
Figure 10a
Figure 10b
Image vectors
Figure 10c
Image Vector - a magnified view
Figure 10d
Image Vector - a magnified view
x(0,0) x(0,1) x(0,2) x(0,3) x(0,4) x(0,5) x(0,6) x(0,7)
x(1,0) x(1,1) x(1,2) x(1,3) x(1,4) x(1,5) x(1,6) x(1,7)
x(2,0) x(2,1) x(2,2) x(2,3) x(2,4) x(2,5) x(2,6) x(2,7)
x(3,0) x(3,1) x(3,2) x(3,3) x(3,4) x(3,5) x(3,6) x(3,7)
x(4,0) x(4,1) x(4,2) x(4,3) x(4,4) x(4,5) x(4,6) x(4,7)
x(5,0) x(5,1) x(5,2) x(5,3) x(5,4) x(5,5) x(5,6) x(5,7)
x(6,0) x(6,1) x(6,2) x(6,3) x(6,4) x(6,5) x(6,6) x(6,7)
x(7,0) x(7,1) x(7,2) x(7,3) x(7,4) x(7,5) x(7,6) x(7,7)
Figure 10e
Axes: increasing horizontal frequency, increasing vertical frequency
Figure 11a
Axes: increasing horizontal frequency, increasing vertical frequency
Figure 11b
Because of the energy compactness of the DCT, most of the information is concentrated in the low-frequency corner
200 185 170 25 1 3 1 3
210 190 195 120 7 15 5 8
198 180 160 171 10 7 3 10
165 150 125 5 12 11 10 9
30 25 8 13 5 3 9 0
2 9 1 0 3 6 2 1
5 5 7 2 7 1 1 5
4 9 2 11 9 2 3 0
Figure 11c
The DCT coefficients are normalised to 11-bit integer values
Before the transform, the pixel intensity range is converted from [0, 255] to [-128, 127]
The process, known as ‘zero shift’, is performed by subtracting 128 from each pixel intensity
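Quantizer: $f \rightarrow \hat{f} = Q(f)$
Uniform Symmetric Quantizers
[Figure: staircase characteristic of output versus input, with decision levels $\pm d_1, \pm d_2, \pm d_3, \pm d_4$ on the input axis and representation levels $\pm r_1, \pm r_2, \pm r_3$ on the output axis]
$d_i$: Decision levels
$r_i$: Representation levels
Mean Square Quantization Error (MSQE)
$E\{(f - \hat{f})^2\} = \int_{a_L}^{a_U} (f - \hat{f})^2 p(f)\, df$ (Q1)
Mean Absolute Quantization Error (MAQE)
$E\{|f - \hat{f}|\} = \int_{a_L}^{a_U} |f - \hat{f}|\, p(f)\, df$ (Q2)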
Max-Lloyd Quantizer
A method to determine the decision and representation levels
Suppose $\hat{f} = Q(f) = r_j$
Then $MSQE = \int_{a_L}^{a_U} (f - \hat{f})^2 p(f)\, df = \sum_{l=0}^{J-1} \int_{d_l}^{d_{l+1}} (f - r_l)^2 p(f)\, df$ (Q3)
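Max-Lloyd Quantizer
Consider two arbitrary adjacent reconstruction levels $r_{k-1}$ and $r_k$, with $r_{k-1}$ between $d_{k-1}$ and $d_k$, and $r_k$ between $d_k$ and $d_{k+1}$
What is the optimal value of $d_k$ so that the error is minimized?
$\frac{\partial}{\partial d_k} \left[ \int_{d_{k-1}}^{d_k} (f - r_{k-1})^2 p(f)\, df + \int_{d_k}^{d_{k+1}} (f - r_k)^2 p(f)\, df \right] = 0$
$\Rightarrow d_k = \frac{r_{k-1} + r_k}{2}$ (Q4)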
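Max-Lloyd Quantizer
Similarly
$\frac{\partial}{\partial r_k} \int_{d_k}^{d_{k+1}} (f - r_k)^2 p(f)\, df = 0$
$\Rightarrow r_k = \frac{\int_{d_k}^{d_{k+1}} f\, p(f)\, df}{\int_{d_k}^{d_{k+1}} p(f)\, df}$ (Q5)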
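Max-Lloyd Quantizer for uniform pdf
Consider a uniform probability density function
$p(f) = \frac{1}{a_U - a_L} = \frac{1}{A}$ for $-A/2 \le f \le A/2$
From Q4, $d_k = \frac{r_{k-1} + r_k}{2}$
From Q5, $r_k = \frac{\int_{d_k}^{d_{k+1}} f\, p(f)\, df}{\int_{d_k}^{d_{k+1}} p(f)\, df} = \frac{d_{k+1}^2 - d_k^2}{2 (d_{k+1} - d_k)} = \frac{d_{k+1} + d_k}{2}$
Hence, $2 d_k = d_{k+1} + d_{k-1}$, i.e. $d_k - d_{k-1} = d_{k+1} - d_k$: Constant Step Size
Step size (SS): $SS = d_{k+1} - d_k = \frac{a_U - a_L}{J}$
$MSQE = \frac{1}{d_{k+1} - d_k} \int_{d_k}^{d_{k+1}} (f - r_k)^2\, df = \frac{1}{SS} \int_{-SS/2}^{SS/2} g^2\, dg = \frac{SS^2}{12}$ (Q6)
Variance $= \sigma_f^2 = \int_{-A/2}^{A/2} f^2 \frac{1}{A}\, df = \frac{A^2}{12}$ (Q7)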
Max-Lloyd Quantizer for uniform pdf
For a b-bit quantizer, $J = 2^b$ and $SS = \frac{A}{2^b}$ (Q8)
$SNR = \frac{\sigma_f^2}{MSQE} = \frac{A^2 / 12}{A^2 / (12 \cdot 2^{2b})} = 2^{2b} \approx 6b\ \text{dB}$ (Q9)
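The "6b dB" rule can be verified numerically from the step size, quantization error and variance of a uniform pdf. The function name and the default range A = 1 are assumptions for this sketch; the result does not depend on A.

```python
import math

def uniform_quantizer_snr_db(bits, amplitude=1.0):
    """SNR in dB of a b-bit uniform quantizer on a signal uniform over a range of width A."""
    step = amplitude / 2 ** bits          # SS = A / 2^b
    msqe = step ** 2 / 12                 # MSQE = SS^2 / 12
    variance = amplitude ** 2 / 12        # variance = A^2 / 12
    return 10 * math.log10(variance / msqe)

for b in (1, 4, 8):
    print(b, round(uniform_quantizer_snr_db(b), 2))
```

Each extra bit halves the step size and adds about 6.02 dB to the SNR.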
200 185 170 25 1 3 1 3
210 190 195 120 7 15 5 8
198 180 160 171 10 7 3 10
165 150 125 5 12 11 10 9
30 25 8 13 5 3 9 0
2 9 1 0 3 6 2 1
5 5 7 2 7 1 1 5
4 9 2 11 9 2 3 0
Assign a different quantization step size to each coefficient
Figure 12
Consider a range of values from, let's say, 0 to 255
If a step size of 8 is used, the range is divided into 256/8 = 32 levels
5 bits are required to represent each level in this range
Value       Level   Bit string   Quantized value
0 - 7       0       00000        0
8 - 15      1       00001        8
16 - 23     2       00010        16
24 - 31     3       00011        24
32 - 39     4       00100        32
...
248 - 255   31      11111        248
If a step size of 16 is used, the range is divided into 256/16 = 16 levels
4 bits are required to represent each level in this range
Value       Level   Bit string   Quantized value
0 - 15      0       0000         0
16 - 31     1       0001         16
32 - 47     2       0010         32
48 - 63     3       0011         48
64 - 79     4       0100         64
...
240 - 255   15      1111         240
The larger the step size, the smaller the number of quantized levels, the smaller the number of bits, and the larger the distortion in value, and the other way round
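The two tables follow from one rule, sketched below. The function name `quantize` and the assumption that values lie in the range 0 to 255 are illustrative.

```python
def quantize(value, step, value_range=256):
    """Return (level, bit string, quantized value) for a uniform quantizer."""
    levels = value_range // step
    bits = (levels - 1).bit_length()      # bits needed to number the levels
    level = value // step
    return level, format(level, f"0{bits}b"), level * step

print(quantize(37, 8))     # step size 8: 32 levels, 5 bits
print(quantize(250, 8))
print(quantize(250, 16))   # step size 16: 16 levels, 4 bits
```

Doubling the step size saves one bit per value but doubles the largest possible difference between a value and its quantized version.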
Human Visual System (HVS) is more sensitive to low frequency intensity (spatial) variation in an image
Axes: increasing horizontal frequency, increasing vertical frequency
Figure 13
Sensitivity of the HVS decreases along both the horizontal and the vertical frequency axes
Figure 14
DCT coefficients:
200 185 170 25 1 3 1 3
210 190 195 120 7 15 5 8
198 180 160 171 10 7 3 10
165 150 125 5 12 11 10 9
30 25 8 13 5 3 9 0
2 9 1 0 3 6 2 1
5 5 7 2 7 1 1 5
4 9 2 11 9 2 3 0
Q Step Size:
1 1 1 4 8 12 16 20
1 1 4 8 12 20 22 24
1 4 8 12 16 22 22 25
4 8 12 16 20 24 25 30
8 12 16 20 22 28 30 32
12 14 18 24 25 30 35 40
10 16 20 28 30 35 40 43
12 20 25 30 32 40 45 48
Assign a different quantization step size to each coefficient
Figure 15
DCT coefficients:
200 185 170 25 1 3 1 3
210 190 195 120 7 15 5 8
198 180 160 171 10 7 3 10
165 150 125 5 12 11 10 9
30 25 8 13 5 3 9 0
2 9 1 0 3 6 2 1
5 5 7 2 7 1 1 5
4 9 2 11 9 2 3 0
Quantized DCT coefficients:
200 185 170 24 0 0 0 0
210 190 192 120 0 0 0 0
198 180 160 168 0 0 0 0
164 144 120 0 0 0 0 0
32 24 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Assign a different quantization step size to each coefficient
Figure 16
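Quantization with a per-coefficient step size can be sketched as follows. The sketch assumes truncation toward zero, which reproduces most entries of the example (for instance 25 with step 4 becomes 24, and 171 with step 12 becomes 168); baseline JPEG encoders normally round to the nearest multiple instead. The function name `quantize_row` is illustrative.

```python
def quantize_row(coefficients, steps):
    """Quantize each coefficient with its own step size, truncating toward zero."""
    result = []
    for c, q in zip(coefficients, steps):
        sign = -1 if c < 0 else 1
        result.append(sign * (abs(c) // q) * q)
    return result

row_a = quantize_row([200, 185, 170, 25, 1, 3, 1, 3], [1, 1, 1, 4, 8, 12, 16, 20])
row_b = quantize_row([198, 180, 160, 171, 10, 7, 3, 10], [1, 4, 8, 12, 16, 22, 22, 25])
print(row_a)
print(row_b)
```

Small high-frequency coefficients divided by large step sizes collapse to zero, which is what creates the long zero runs exploited by the next stage.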
After quantization, a lot of high-frequency DCT coefficients are truncated to ‘0’
Non-zero coefficients carry most of the image content, and are the ones to which the HVS is sensitive
The large number of ‘0’ coefficients suggests runlength coding
For a continuous stream of numbers with identical values, it is only necessary to record
1. The value of the number
2. The number of duplications
A sequence of 8 bytes of raw data: s = [15, 15, 15, 15, 15, 15, 15, 15]
Runlength representation (value, runlength): [15, 8]
Only 2 bytes are needed to represent ‘s’ (Compression Ratio = 4)
The longer the string of duplicated numbers, the larger the Compression Ratio (CR)
s = [15, 15, 15, 15]: runlength representation [15, 4], Compression Ratio = 2
s = [15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 15]: runlength representation [15, 16], Compression Ratio = 8
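A minimal runlength coder for these examples might look like the sketch below. It assumes each run is stored as one byte for the value and one byte for the runlength; the function names are illustrative.

```python
def run_length_encode(data):
    """Collapse runs of identical values into [value, runlength] pairs."""
    runs = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs

def compression_ratio(data):
    """Raw size in bytes divided by encoded size (2 bytes per run)."""
    return len(data) / (2 * len(run_length_encode(data)))

for length in (4, 8, 16):
    s = [15] * length
    print(run_length_encode(s), compression_ratio(s))
```

Data without repeated values would grow under this scheme, since every single value would cost two bytes.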
Quantized DCT coefficients (row)    Runlength of ‘0’   CR
200 185 170 24 0 0 0 0              4                  2
210 190 192 120 0 0 0 0             4                  2
198 180 160 168 0 0 0 0             4                  2
164 144 120 0 0 0 0 0               5                  2.5
32 24 0 0 0 0 0 0                   6                  3
0 0 0 0 0 0 0 0                     8                  4
0 0 0 0 0 0 0 0                     8                  4
0 0 0 0 0 0 0 0                     8                  4
Figure 17
The compression ratio of horizontal scanning is always less than or equal to 4
A better approach is to adopt zig-zag scanning
Quantized DCT coefficients:
200 185 170 24 0 0 0 0
210 190 192 120 0 0 0 0
198 180 160 168 0 0 0 0
164 144 120 0 0 0 0 0
32 24 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Runlength of ‘0’ = 47
CR = 23.5
Figure 18
Remember this?
The probability distribution of pixel values is uneven
Use fewer bits to represent pixel intensities that occur more often
This can be generalised:
If the probability distribution of data values is uneven, fewer bits can be used to represent values that occur more often, and vice versa
In JPEG, DC and other coefficients are encoded separately
DCT coefficients:
200 185 170 25 1 3 1 3
210 190 195 120 7 15 5 8
198 180 160 171 10 7 3 10
165 150 125 5 12 11 10 9
30 25 8 13 5 3 9 0
2 9 1 0 3 6 2 1
5 5 7 2 7 1 1 5
4 9 2 11 9 2 3 0
Figure 19
The top-left coefficient (200) is the DC term
All other coefficients are ‘AC’ terms
The DC coefficient represents the average intensity in an image block (8 pixels x 8 pixels)
DC coefficients of adjacent image blocks are similar.
Differential Pulse Code Modulation (DPCM) is applied to encode the ‘quantized’ DC terms.
Consider a row of image blocks
Quantized DC coefficients: 200 190 198 202 205 200 195 220 225
DPCM: -10 +8 +4 +3 -5 -5 +25 +5
As adjacent DC terms are similar, the DPCM values are small in general, i.e., small values occur more often
The DPCM values are divided into 16 classes according to their magnitude
Each class has a different probability of occurrence
Class   DPCM difference values
0       [0]
1       [-1] [+1]
2       [-3, -2] [+2, +3]
3       [-7, ..., -4] [+4, ..., +7]
4       [-15, ..., -8] [+8, ..., +15]
5       [-31, ..., -16] [+16, ..., +31]
6       [-63, ..., -32] [+32, ..., +63]
7       [-127, ..., -64] [+64, ..., +127]
8       [-255, ..., -128] [+128, ..., +255]
9       [-511, ..., -256] [+256, ..., +511]
10      [-1023, ..., -512] [+512, ..., +1023]
11      [-2047, ..., -1024] [+1024, ..., +2047]
12      [-4095, ..., -2048] [+2048, ..., +4095]
13      [-8191, ..., -4096] [+4096, ..., +8191]
14      [-16383, ..., -8192] [+8192, ..., +16383]
15      [-32767, ..., -16384] [+16384, ..., +32767]
Small values, which occur more often, are grouped into classes that contain fewer members
A class with fewer elements requires fewer bits to identify its members
As a result, small values require fewer bits to represent
Any DPCM value is addressed by its class and a string of additional bits that identifies its position in the class
For example, class 6, [-63, ..., -32] [+32, ..., +63], has 64 members, so 6 additional bits are required:
-63: 000000, -62: 000001, ..., -32: 011111, +32: 100000, ..., +63: 111111
Representation of DPCM data: Class (4 bits) followed by Additional bits (adaptive length)
For most DC coefficients, the DPCM values belong to lower classes that require fewer additional bits
Nonzero AC terms are represented in the same way as DPCM coefficients: Class (4 bits) followed by Additional bits (adaptive length)
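The DPCM step and the class / additional-bits representation can be sketched together. The mapping of negative values (value + 2^class - 1) is an assumption chosen to reproduce the class 6 example, where -63 maps to 000000 and +32 to 100000; all function names are illustrative.

```python
def dpcm(values):
    """Differences between each quantized DC coefficient and the previous one."""
    return [b - a for a, b in zip(values, values[1:])]

def size_class(value):
    """Class = number of bits needed for the magnitude (0 for a zero value)."""
    return abs(value).bit_length()

def additional_bits(value):
    """Bit string that locates the value inside its class."""
    size = size_class(value)
    if size == 0:
        return ""
    code = value if value > 0 else value + (1 << size) - 1
    return format(code, f"0{size}b")

dc = [200, 190, 198, 202, 205, 200, 195, 220, 225]
diffs = dpcm(dc)
classes = [size_class(d) for d in diffs]
print(diffs)
print(classes)
print([additional_bits(d) for d in diffs])
```

In this row of blocks most differences fall into classes 2 to 4, so each needs only 2 to 4 additional bits after its class.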
Zero terms are encoded with zig-zag scanning followed by RLC
How are these two items combined?
Quantized DCT coefficients (V):
200 0 -14 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
Zig-zag scanning index (I):
0 1 5 6 14 15 27 28
2 4 7 13 16 26 29 42
3 8 12 17 25 30 41 43
9 11 18 24 31 40 44 53
10 19 23 32 39 45 52 54
20 22 33 38 46 51 55 60
21 34 37 47 50 56 59 61
35 36 48 49 57 58 62 63
I:   1  2  3  4  5    6  7  8  9  10  11  12  ...  62  63
V:   0  0  0  0  -14  0  0  0  1  0   0   0   ...  0   0
RL:  4 zeros before -14, 3 zeros before 1, 54 zeros after 1
Class   AC coefficient values
0       [0]
s       [-(2^s - 1), ..., -2^(s-1)] [+2^(s-1), ..., +(2^s - 1)], for s = 1 to 15
For example, class 1: [-1] [+1]; class 4: [-15, ..., -8] [+8, ..., +15]; class 15: [-32767, ..., -16384] [+16384, ..., +32767]
I:      1  2  3  4  5    6  7  8  9  10  11  12  ...  62  63
V:      0  0  0  0  -14  0  0  0  1  0   0   0   ...  0   0
RL:     4 (before -14), 3 (before 1), 54 (after 1)
Class:  4 (for -14), 1 (for 1)
Class 4, [-15, ..., -8] [+8, ..., +15]: additional bits -15: 0000, -14: 0001, ..., -8: 0111, +8: 1000, ..., +15: 1111
The value -14 has class 4 and additional bits 0001
Class 1, [-1] [+1]: additional bits -1: 0, +1: 1
The value 1 has class 1 and additional bit 1
RL and Class are grouped into the RUN-SIZE Table (rows: runlength RRRR; columns: class SSSS; entries in hexadecimal)
RRRR \ SSSS   0     1    2    3    4    5    ...  F
0             00    01   02   03   04   05   ...  0F
1             N/A   11   12   13   14   15   ...  1F
2             N/A   21   22   23   24   25   ...  2F
3             N/A   31   32   33   34   35   ...  3F
4             N/A   41   42   43   44   45   ...  4F
5             N/A   51   52   53   54   55   ...  5F
...
F             F0    F1   F2   F3   F4   F5   ...  FF
00 - End of Block
A Few Points to Note
• Each non-zero AC coefficient is represented by an 8-bit value ‘RRRRSSSS’
• RRRR is the runlength of zeros between the current and the previous non-zero AC coefficients
• If the runlength exceeds 15, a term ‘F0’ is inserted to represent a runlength of 16
• If all remaining coefficients are zero, a term ‘00’ (EOB) is inserted.
Value   RL   Class   RS (hexadecimal)   RS (decimal)   Additional bits
-14     4    4       44                 68             0001
1       3    1       31                 49             1
EOB     -    -       00                 00             -
EOB: End of Block
Encoded AC format: 68, 0001, 49, 1, 00
Number of bits: 8 + 4 + 8 + 1 + 8 = 29 bits
Number of bits for the 63 AC coefficients without coding = 63 x 11 = 693 bits
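The whole AC encoding path (zig-zag scan, runlength of zeros, RUN-SIZE byte and additional bits) can be sketched for the example block. This is an illustration of the scheme described above, not a complete JPEG entropy coder: the final Huffman coding of the RUN-SIZE bytes is left out, and all function names are assumptions.

```python
def zigzag_order(n=8):
    """(row, col) positions in zig-zag order, starting at the DC position."""
    order = []
    for s in range(2 * n - 1):
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the other way.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order

def encode_ac(block):
    """Return (RRRRSSSS byte, additional bits) pairs for the 63 AC coefficients."""
    ac = [block[r][c] for r, c in zigzag_order(len(block))][1:]
    last_nonzero = max((i for i, v in enumerate(ac) if v != 0), default=-1)
    symbols, run = [], 0
    for v in ac[:last_nonzero + 1]:
        if v == 0:
            run += 1
            if run == 16:                       # 'F0' stands for a run of 16 zeros
                symbols.append((0xF0, ""))
                run = 0
            continue
        size = abs(v).bit_length()              # class SSSS
        code = v if v > 0 else v + (1 << size) - 1
        symbols.append(((run << 4) | size, format(code, f"0{size}b")))
        run = 0
    if last_nonzero < len(ac) - 1:
        symbols.append((0x00, ""))              # EOB: all remaining coefficients are zero
    return symbols

block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][2], block[3][0] = 200, -14, 1
symbols = encode_ac(block)
total_bits = sum(8 + len(bits) for _, bits in symbols)
print([(format(rs, "02X"), bits) for rs, bits in symbols], total_bits)
```

For the example block this yields the RUN-SIZE bytes 44, 31 and 00 with additional bits 0001 and 1, 29 bits in total against 693 bits for the uncoded AC coefficients.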
The “Baboon” is one of the popular standard images that has been adopted for comparison purposes in image compression research. The difficult part is that the large amount of texture is hard to compress with good fidelity. The easy part is that the distortions are difficult to spot.
Hi! I am the famous Baboon, very nice to meet all of you.