ele 488 f06 ele 488 fall 2006 image processing and transmission (11-16-06, 11-21-06) jpeg block...
TRANSCRIPT
ELE 488 F06
ELE 488 Fall 2006: Image Processing and Transmission
(11-16-06, 11-21-06)
• JPEG block-based transform coding
• Beyond basic JPEG
• Spatial correlation
• Subband decomposition & coding
• Wavelet transform
• Zero tree
• Successive Approximation Quantization
• Digital Video
11/16
Review of Up and Down Sampling (without filter)
• Down sampling
x[n] --> [ ↓M ] --> y[n] = x[nM]
Y(ω) = (1/M) Σ_{r=0}^{M-1} X((ω – 2πr)/M)
Example: M = 2, Y(ω) = (1/2) X(ω/2) + (1/2) X(ω/2 – π)
• Up sampling
u[n] --> [ ↑L ] --> v[n], with v[nL] = u[n] and v[Ln+k] = 0 for 0 < k < L
V(ω) = U(ωL)
Example: L = 2, V(ω) = U(2ω)
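The two spectral relations for M = 2 and L = 2 can be checked numerically on a finite signal. The following is a minimal sketch, assuming the DFT of a length-N signal stands in for the spectrum (so X(ω/2 – π) becomes the DFT bin shifted by N/2); the helper names and the test signal are made up for illustration.

```python
import cmath

def dft(x):
    """Plain O(N^2) DFT, enough for a small check."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def downsample(x, M):
    """y[n] = x[nM]"""
    return x[::M]

def upsample(u, L):
    """v[nL] = u[n], zeros in between"""
    v = [0.0] * (L * len(u))
    v[::L] = u
    return v

x = [1.0, 3.0, -2.0, 0.5, 4.0, -1.0, 2.0, 0.0]  # arbitrary test signal
N = len(x)
X = dft(x)

# Down sampling by 2: Y[k] = (1/2) X[k] + (1/2) X[k + N/2]
Y = dft(downsample(x, 2))
alias_sum = [0.5 * (X[k] + X[k + N // 2]) for k in range(N // 2)]
down_err = max(abs(a - b) for a, b in zip(Y, alias_sum))

# Up sampling by 2: V[k] = X[k mod N], i.e. the spectrum is compressed and repeated
V = dft(upsample(x, 2))
up_err = max(abs(V[k] - X[k % N]) for k in range(2 * N))

print(down_err, up_err)  # both should be at round-off level
```

The second term of the down sampling sum is the aliasing component; it is what the lowpass or highpass filter has to keep small before decimation.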
[Figure: panels (A)-(D) sketching the spectra X(ω), X(ω/2), X(ω/2 – π), Y(ω) and V(ω) for down sampling and up sampling by 2]
Lowpass and Highpass Decimation
x --> [ ↓2 ] --> y --> [ ↑2 ] --> v --> [ LPF or HPF ] --> x
Subband Decomposition + Coding
Encoding / Decoding
H0 – lowpass, H1 – highpass
Y(z) = (1/2)[ H0(z)G0(z) + H1(z)G1(z) ]X(z)+ (1/2)[ H0(–z)G0(z) + H1(–z)G1(z) ]X(–z)
If G0(z) = H0(z) = H1(-z) = -G1(-z),
Y(z) = (1/2)[ H0(z)G0(z) + H1(z)G1(z) ] X(z), alias free
Specified by a single filter H0(z)
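As an illustration of the alias-free condition, here is a minimal sketch of a two-channel analysis/synthesis bank built from a single filter H0. The choice of the 2-tap Haar filter for H0 is an assumption made for the example (the slides do not name a specific filter), and the helper names are hypothetical.

```python
import math

def conv(x, h):
    """Full linear convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def down2(x):
    return x[::2]

def up2(x):
    v = [0.0] * (2 * len(x))
    v[::2] = x
    return v

s = 1 / math.sqrt(2)
h0 = [s, s]    # H0(z): lowpass (assumed Haar)
h1 = [s, -s]   # H1(z) = H0(-z): highpass
g0 = [s, s]    # G0(z) = H0(z)
g1 = [-s, s]   # G1(z) = -H0(-z)

def two_channel(x):
    """Analysis filter, down by 2, up by 2, synthesis filter, then sum both branches."""
    low = conv(up2(down2(conv(x, h0))), g0)
    high = conv(up2(down2(conv(x, h1))), g1)
    return [a + b for a, b in zip(low, high)]

x = [1.0, 2.0, 3.0, 4.0]
y = two_channel(x)
print([round(v, 6) for v in y])  # the input delayed by one sample, then zeros
```

For this filter (1/2)[H0(z)G0(z) + H1(z)G1(z)] reduces to z^-1, so the output is the input delayed by one sample and the X(–z) alias term cancels.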
From x[n] to {uk[n]}
Discrete Wavelet Transform (forward DWT)
Mother wavelet
Scaling function (level k)
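A forward DWT can be sketched as repeated lowpass/highpass filtering and down sampling by 2 of the lowpass branch. The example below is a minimal sketch under two assumptions not stated on the slide: the Haar wavelet is used, and the per-level detail signals play the role of the {uk[n]}. Function names are hypothetical.

```python
import math

def haar_step(x):
    """One level: scaled pairwise sums (approximation) and differences (detail)."""
    s = 1 / math.sqrt(2)
    approx = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    detail = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return approx, detail

def haar_dwt(x, levels):
    """Iterate haar_step on the approximation; len(x) must be divisible by 2**levels."""
    approx = list(x)
    details = []  # details[0] is the finest level
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details

x = [4.0, 2.0, 5.0, 5.0]
approx, details = haar_dwt(x, levels=2)
print(approx, details)

# The orthonormal Haar transform preserves energy
energy_in = sum(v * v for v in x)
energy_out = sum(v * v for v in approx) + sum(v * v for d in details for v in d)
```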
Examples of 1-D Wavelet Transform
• Note: low frequency components similar
From Matlab Wavelet Toolbox Documentation
UMCP ENEE631 Slides (created by M. Wu © 2001)
2-D Wavelet Transform via Separable Filters
UMCP ENEE631 Slides (created by M. Wu © 2004)
From Usevitch (IEEE Sig.Proc. Mag. 9/01)
Note numbering of freq bands
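Separable filtering means the 1-D transform is applied along the rows and then along the columns. Below is a minimal one-level sketch, assuming the Haar filter pair and an image stored as a list of rows with even dimensions. The subband names (first letter for horizontal filtering, second for vertical) are an assumption and may not match the band numbering in the figure.

```python
import math

def haar_step(x):
    """1-D one-level Haar: lowpass and highpass outputs, each down sampled by 2."""
    s = 1 / math.sqrt(2)
    lo = [s * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    hi = [s * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return lo, hi

def transform_rows(img):
    """Apply haar_step to every row; returns the low and high half-width images."""
    lo, hi = [], []
    for row in img:
        a, d = haar_step(row)
        lo.append(a)
        hi.append(d)
    return lo, hi

def transpose(m):
    return [list(col) for col in zip(*m)]

def haar_2d(img):
    """One decomposition level: rows first, then columns of each half."""
    lo, hi = transform_rows(img)                # horizontal filtering
    ll_t, lh_t = transform_rows(transpose(lo))  # vertical filtering of the low half
    hl_t, hh_t = transform_rows(transpose(hi))  # vertical filtering of the high half
    return transpose(ll_t), transpose(lh_t), transpose(hl_t), transpose(hh_t)

flat = [[1.0] * 4 for _ in range(4)]  # constant image: all energy should land in LL
ll, lh, hl, hh = haar_2d(flat)
print(ll, lh, hl, hh)
```

Repeating `haar_2d` on the LL output gives the next, coarser level of the pyramid.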
Embedded Zero-Tree Wavelet Coding (EZW)
• Exploits multi-resolution and self-similar nature of wavelet decomposition
– Energy is compacted into a small number of coeff.
– Significant coeff.
  • w.r.t. a threshold
  • tend to cluster at the same position in each frequency subband
• Two sets of info. to code:
– Where are the significant coefficients? (significance map)
– What values are the significant coefficients?
Usevitch (IEEE Sig. Proc. Mag. 9/01)
Key Concepts in EZW
• Parent-children relation among coeff.
– A coeff. at level k spatially correlates with 4 child coeff. at level (k-1) of same orientation
– A lowest band coeff. correlates with 3 coeff.
• Coding significance map via zero-tree
– Encode only high energy coefficients
  • Need to send location info.: large overhead
– Encode “insignificance map” with zero-trees
• Successive approximation quantization
– Send most-significant-bits first and gradually refine coeff. value
– “Embedded” nature of coded bit-stream
  • Improve image quality by adding extra refining bits
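The parent-children relation can be written as simple index arithmetic. The sketch below assumes the common pyramid layout, in which an n x n coefficient array holds the lowest band in its top-left ll_size x ll_size corner and each finer subband sits at twice the coordinates of its parent band. The function name and this layout are assumptions for illustration, not taken from the slides.

```python
def children(r, c, ll_size, n):
    """Return the child positions of coefficient (r, c) in an n x n pyramid layout."""
    if r < ll_size and c < ll_size:
        # Lowest band: one child in each of the 3 coarsest oriented subbands
        return [(r, c + ll_size), (r + ll_size, c), (r + ll_size, c + ll_size)]
    if 2 * r >= n or 2 * c >= n:
        # Finest level: no children
        return []
    # Oriented subband at level k: 4 children at level k-1, same orientation
    return [(2 * r, 2 * c), (2 * r, 2 * c + 1),
            (2 * r + 1, 2 * c), (2 * r + 1, 2 * c + 1)]

# 8 x 8 array, 3 decomposition levels, so the lowest band is a single coefficient
print(children(0, 0, ll_size=1, n=8))
print(children(0, 1, ll_size=1, n=8))
print(children(4, 4, ll_size=1, n=8))
```

A zero-tree is then a coefficient that is insignificant together with everything reachable by following `children` repeatedly.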
Usevitch (IEEE Sig.Proc. Mag. 9/01)
EZW Algorithm and Example
• Initial threshold ~ 2 ^ floor(log2 xmax)
– Put all coeff. in dominant list
• Dominant Pass (“zig-zag” across bands)
– Assign symbol to each coeff. and entropy encode symbols
  • ps – positive significance
  • ns – negative significance
  • iz – isolated zero
  • ztr – zero-tree root
– Significant coeff.
  • move to subordinate list
  • put zero in dominant list
• Subordinate Pass
– Output one bit for subordinate list
  • According to position in up/low half of quantization interval
• Repeat with half threshold
– Until bit budget reached
From Usevitch (IEEE Sig. Proc. Mag. 9/01)
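The four dominant-pass symbols can be illustrated with a small classifier. This is a simplified sketch, not the full EZW coder: a coefficient is represented as a hypothetical (value, children) tuple, the example values are made up, and the scan order, the skipping of descendants of a zero-tree root, and the entropy coder are all left out.

```python
def max_descendant_magnitude(node):
    """Largest |value| among all descendants of node (0 if it has none)."""
    _, kids = node
    best = 0
    for kid in kids:
        best = max(best, abs(kid[0]), max_descendant_magnitude(kid))
    return best

def dominant_symbol(node, threshold):
    """Classify one coefficient against the current threshold."""
    value, _ = node
    if abs(value) >= threshold:
        return "ps" if value > 0 else "ns"   # significant, sign decides the symbol
    if max_descendant_magnitude(node) < threshold:
        return "ztr"                         # insignificant, and so is its whole tree
    return "iz"                              # insignificant, but some descendant is not

def leaf(v):
    return (v, [])

T = 32
examples = [
    (53, []),
    (-34, [leaf(10), leaf(6), leaf(-3), leaf(1)]),
    (8, [leaf(-12), leaf(40), leaf(3), leaf(2)]),
    (8, [leaf(-12), leaf(7), leaf(3), leaf(2)]),
]
symbols = [dominant_symbol(node, T) for node in examples]
print(symbols)
```

A single ztr symbol stands for an entire tree of insignificant coefficients, which is where the saving in location overhead comes from.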
1st Pass
xmax = 53, 32 < 53 < 64, so To = 32
Only 2 coeff. in [32, 64): 53 and 34
Midpoint of the interval: (32+64)/2 = 48
48 < 53 < 64: output 1; 32 < 34 < 48: output 0
2nd Pass
From Usevitch (IEEE Sig. Proc. Mag. 9/01)
To = 32, T1 = 16
16 < 22 < 32
Midpoint of the interval: (16+32)/2 = 24
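The threshold and midpoint arithmetic of the two passes can be reproduced in a few lines. This is a minimal sketch with hypothetical function names; it covers only the first subordinate bit of a newly significant coefficient, not the later refinement of coefficients found in earlier passes.

```python
import math

def initial_threshold(x_max):
    """Largest power of two not exceeding x_max: 2 ^ floor(log2 x_max)."""
    return 2 ** math.floor(math.log2(x_max))

def subordinate_bit(magnitude, low, high):
    """1 if the magnitude lies in the upper half of [low, high), else 0."""
    midpoint = (low + high) / 2
    return 1 if magnitude >= midpoint else 0

T0 = initial_threshold(53)                                        # 32
bits_pass1 = [subordinate_bit(m, T0, 2 * T0) for m in (53, 34)]   # interval [32, 64)

T1 = T0 // 2                                                      # 16
bit_22 = subordinate_bit(22, T1, 2 * T1)                          # interval [16, 32)

print(T0, bits_pass1, T1, bit_22)  # 32 [1, 0] 16 0
```

Since 22 lies below the midpoint 24, its subordinate bit is 0.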
Beyond EZW
• Cons of EZW
– Poor error resilience
– Difficult for selective spatial decoding
• SPIHT (Set Partitioning in Hierarchical Trees)
– Further improvement over EZW to remove redundancy
• EBCOT (Embedded Block Coding with Optimal Truncation)
– Used in JPEG 2000
– Address the shortcomings of EZW (random access, error resilience, …)
– Embedded wavelet coding in each block + bit-allocations among blocks
UMCP ENEE631 Slides (created by M. Wu © 2001/2004)
JPEG 2000: A Wavelet-Based New Standard
• Targets and features
– Excellent low bit rate performance without sacrificing performance at higher bit rates
– Progressive decoding, from lossy to lossless
– Region-of-interest (ROI) coding
– Error resilience
• For details
– David Taubman: “High Performance Scalable Image Compression with EBCOT”, IEEE Trans. on Image Proc., July 2000.
– JPEG2000 Tutorial by Skodras et al., IEEE Sig. Proc. Magazine, Sept 01
– Links and tutorials: http://www.jpeg.org/JPEG2000.htm
Examples: JPEG2K vs. JPEG
From Christopoulos (IEEE Trans. on CE 11/00)
DCT vs. Wavelet
• 3dB improvement?
– Wavelet compression claimed to have 3dB improvement over DCT-based compression
– Comparison done on JPEG Baseline
• Improvement not all due to transforms
– Improvement comes mainly from better rate allocation, advanced entropy coding, & smarter redundancy reduction via zero-tree
– DCT coder can be improved to decrease the gap
"A comparative study of DCT- and wavelet-based image coding", Z. Xiong, K. Ramchandran, M. Orchard, Y-Q. Zhang, IEEE Trans. on Circuits and Systems for Video Tech., Aug 1999, pp. 692-695.
Motion Picture, Television, Digital Video
• Motion pictures started with an argument:
– “Do They Or Don’t They?”
– Muybridge (1877) - 12 cameras to take 12 pictures of running horse
• Limitation of human vision system
– persistence and fusion
Motion Picture
• Perception of motion - persistence and fusion
• 1877: Muybridge - 12 cameras to take 12 pictures of running horse
• 1882: Marey - one camera to take 12 pictures per second
• 1888: Dickson - motion picture camera
– Kinetograph (patent 1893, 50ft perforated film, 40 fr/sec, battery, 1000lb)
– Kinetoscope, single viewer projector (47 ft film, 25c for 5 shows)
• 1895: Lumiere - projector/camera (hand-crank, 16 fr/sec, 20lb)
• 1919: De Forest - optical sound on film
• 1926: Warner Bros. - “Don Juan” (John Barrymore, NY Philharmonic)
• 1900 - : Film industry
• 1940’s: Television
• 1980’s: Digital Video
Recording event; Re-live past moments; Editing; Creation of fictitious events / objects; Manipulate perceived REALITY
From Television to Digital Video
• Broadcast Television (analog)
– movie at home - why invent new technology?
– mass market
– influence of movie on development
• Key Steps
– convert pictures to electric signal
– send electric signal
– convert electric signal to picture
• Comparison with motion picture
• High Definition Television - analog to digital, compression
• Video telephone - analog predecessor
• Video conference - travel cost, people cost
• Cable (narrowcast), satellite, interactive, ...
Key Steps in Broadcast Television
• Convert picture to electric signal
– video camera
– initially only at TV studios, cost not as important
– recording media (tape), editing, …
– special equipment to convert movie
• Send electric signal
– follow radio broadcast, needs spectrum allocation from FCC
– VHF (Ch 2 – 13), UHF (Ch 14 & up)
• Convert electrical signal to picture
– cathode ray tube offers economic solution
– flat panel: LCD, LED, plasma panel
– future
NTSC (National Television System Committee)
• 525 lines
– Two dots closer together than 1/2000 of their distance from the eye are not separated (they merge into one)
– Assume viewing at a distance of 4 times the screen height H. Dots on screen less than (1/2000) x 4 x H = H/500 apart are not separated by the eye. No need to have more than 500 lines
– NTSC set 525 lines (475 active)
• Movies in 1940 had 4:3 aspect ratio (width to height)
• 25 or more pictures per second to see continuous motion
• 50 or more pictures per second to avoid flicker
– movies use 24 frames/sec, each shown twice
• 30 frames/sec with 2:1 interlace (60 even-odd fields/sec)
Bandwidth of Broadcast Television
• Without interlace (progressive scan), 60 frames/sec
– 500 lines alternating black and white gives 250 full cycles
– each horizontal line has 250 x 4/3 ~ 350 full cycles
– 60 (frames/sec) x 500 (lines) x 350 ≈ 10,000,000 cycles/sec, or about 10 MHz (video ONLY)
• With 2:1 interlace, 5 MHz for video
• FCC assigns 6 MHz per broadcast channel
– real usable bandwidth is less, MUCH less
– actual resolvable lines per vertical height ~250
• Color insertion - must be compatible with B/W receiver
– Change R-G-B to Y-Cb-Cr
– Y is luminance (brightness), Cb and Cr are chrominances
– B/W sets convert Y to picture
– Color sets convert Y-Cb-Cr to R-G-B, then display
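The line-count and bandwidth estimates above are plain arithmetic, reproduced here as a minimal sketch. The variable names are made up; the numbers are the ones quoted on the slides, including the rounding of 250 x 4/3 up to 350.

```python
# Resolvable lines: dots closer than (1/2000) x viewing distance merge,
# and the viewing distance is taken as 4 screen heights.
resolution_denominator = 2000
viewing_distance_in_heights = 4
max_useful_lines = resolution_denominator // viewing_distance_in_heights  # H/500 spacing

# Finest vertical pattern: alternating black and white lines
vertical_cycles = max_useful_lines // 2            # 250 full cycles
horizontal_cycles_exact = vertical_cycles * 4 / 3  # about 333 with a 4:3 aspect ratio
horizontal_cycles = 350                            # rounded figure used on the slide

frames_per_sec = 60
progressive_hz = frames_per_sec * max_useful_lines * horizontal_cycles
interlaced_hz = progressive_hz / 2                 # 2:1 interlace halves the rate

print(max_useful_lines, round(horizontal_cycles_exact), progressive_hz, interlaced_hz)
```

The exact product is 10.5 million cycles/sec, which the slide rounds to 10 MHz, and to 5 MHz with interlace.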
Digital Video
• What drives digital video?
– Information technology:
  • electronics, communication infrastructure, storage, functionality, …
– HDTV
• R-G-B component video
– 640 x 480 (pixel) x 3 (color) x 8 (bits/color) x 30 (frames/sec) = 221 Mb/sec
• Y-Cb-Cr with subsampled Cb and Cr
– 640 x 480 (pixel) x 1.5 (color) x 8 (bits/color) x 30 (frames/sec) = 110 Mb/sec
• Compression - MPEG (Moving Picture Experts Group)
– MPEG-1: CD-ROM, 1.5Mb/sec, 1.2Mb/sec for video, 352x240 (SIF), progressive scan, motion compensation
– MPEG-2: extension of MPEG-1, interlace, HD
– MPEG-4: object/region based
– H.2xx
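The raw bit rates can be checked with a one-line formula. In this minimal sketch the function name is hypothetical, and applying the 1.5 samples-per-pixel figure to the 352x240 MPEG-1 format at 30 frames/sec is an assumption used only to estimate the compression ratio.

```python
def raw_bitrate(width, height, samples_per_pixel, bits_per_sample, frames_per_sec):
    """Uncompressed video bit rate in bits per second."""
    return width * height * samples_per_pixel * bits_per_sample * frames_per_sec

rgb = raw_bitrate(640, 480, 3, 8, 30)        # R-G-B component video
ycbcr = raw_bitrate(640, 480, 1.5, 8, 30)    # Y plus subsampled Cb and Cr

# Assumed: MPEG-1 source at 352x240, 1.5 samples/pixel, 30 frames/sec
mpeg1_source = raw_bitrate(352, 240, 1.5, 8, 30)
mpeg1_video_rate = 1.2e6                     # 1.2 Mb/sec for video
compression_ratio = mpeg1_source / mpeg1_video_rate

print(rgb / 1e6, ycbcr / 1e6, mpeg1_source / 1e6, round(compression_ratio, 1))
```

Under these assumptions MPEG-1 has to squeeze roughly 30 Mb/sec of source video into 1.2 Mb/sec, a ratio of about 25 to 1.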