05a compression
TRANSCRIPT
-
8/6/2019 05A Compression
1/102
05A-compression.fm 1 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ech
nik
.tu-d
arms
tadt.de
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser Multimedia-Systems:
Compression
Prof. Dr.-Ing. Ralf SteinmetzProf. Dr. Max Mhlhuser
MM: TU Darmstadt - Darmstadt University of Technology,
Dept. of of Computer Science
TK - Telecooperation, Tel.+49 6151 16-3709,
Alexanderstr. 6, D-64283 Darmstadt, Germany, [email protected] Fax. +49 6151 16-3052
RS:TU Darmstadt - Darmstadt University of Technology,
Dept. of Electrical Engineering and Information Technology, Dept. of Computer Science
KOM - Industrial Process and System Communications, Tel.+49 6151 166151,
Merckstr. 25, D-64283 Darmstadt, Germany, [email protected] Fax. +49 6151 166152
GMD -German National Research Center for Information Technologyhttc - Hessian Telemedia Technology Competence-Center e.V
http://goback/ -
8/6/2019 05A Compression
2/102
05A-compression.fm 2 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ech
nik
.tu-d
arms
tadt.de
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
Scope
Usage Applications
Learning & Teaching Design User Interfaces
Services Content
Process-
ing
Docu-
mentsSecurity ...
Synchro-
nization
GroupCommuni-
cations
System
s Databases Programming
Media-Server Operating Systems Communications
Opt. Memories Quality of Service Networks
Basics Computer
Archi-tectures
Compression
Image &
GraphicsAnimation Video Audio
http://goback/ -
8/6/2019 05A Compression
3/102
05A-compression.fm 3 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ech
nik
.tu-d
arms
tadt.de
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
Contents
1. Motivation2. Requirements - General
3. Fundamentals - Categories
4. Source Coding
5. Entropy Coding:
6. Hybrid Coding: Basic Encoding Steps
7. JPEG
8. H.261 and related ITU Standards
9. MPEG-1
10. MPEG-2
11. MPEG-4
12. Wavelets13. Fractal Image Compression
14. Basic Audio and Speech Coding Schemes
15. Conclusion
http://goback/ -
8/6/2019 05A Compression
4/102
05A-compression.fm 4 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ech
nik
.tu-d
arms
tadt.de
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
1. Motivation
Digital video in computing means for
Text:
1 page with 80 char/line and 64 lines/page and 2 Byte/Char
80 x 64 x 2 x 8 = 80 kBit/page
Image:
24 Bit/Pixel, 512 x 512 Pixel/image 512 x 512 x 24 = 6 MBit/Image
Audio:
CD-quality, samplerate44,1 kHz, 16 Bit/sample
Mono: 44,1 x 16 = 706 kBit/s
Stereo: 1.412 MBit/s Video:
full frames with 1024 x 1024 Pixel/frame, 24 Bit/Pixel, 30 frames/s1024 x 1024 x 24 x 30 = 720 MBit/s
more realistic360 x 240 Pixel/frame = 60 MBit/s
Hence compression is NECESSARY
http://goback/ -
8/6/2019 05A Compression
5/102
05A-compression.fm 5 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ech
nik
.tu-d
arms
tadt.de
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
2. Requirements - General
high quality
compression
low delay
low complexity (e.g., ease of decoding)efficient implementation (e.g., memory req.)
intrinsic scalability
http://goback/ -
8/6/2019 05A Compression
6/102
05A-compression.fm 6 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ech
nik
.tu-d
arms
tadt.de
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
Requirements
DIALOGUE AND RETRIEVAL mode requirements:
Independence of frame size and video frame rate
Synchronization of audio, video, and other media
DIALOGUE mode requirements:
Compression and decompression in real-time(e.g. 25 frames/s)
End-to-end delay < 150ms
RETRIEVAL mode requirements:
Fast forward and backward data retrieval
Random access within 1/2 s
Software and/or hardware-assisted implementation requirements
http://goback/ -
8/6/2019 05A Compression
7/102
05A-compression.fm 7 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ech
nik
.tu-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
3. Fundamentals - Categories
entropy encodingsource coding
- based on semantic of the data
- often lossy
channel coding
- adaptation to communication channel
- introduction of redundancy
hybrid
coding
- entropy
and
source
coding
entropy coding
- ignoring semantics of the data
- lossless
http://goback/ -
8/6/2019 05A Compression
8/102
05A-compression.fm 8 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ech
nik
.tu-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
Categories and Techniques
EntropyCoding
Run-Length CodingHuffman Coding
Arithmetic Coding
Source
Coding
Prediction DPCM
DM
TransformationFFT
DCT
Layered Coding
Bit Position
Subsampling
Sub-Band Coding
Vector Quantization
Hybrid
Coding
JPEG
MPEG
H.261, H.263
proprietary: Quicktime, ...
http://goback/ -
8/6/2019 05A Compression
9/102
05A-compression.fm 9 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
ech
nik
.tu-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
Categories & Techniques, Cont. (1)
Two principal possibilities
1. Entropy Coding: Eliminate Redundancy (thus, lossless)
2. Reduction Coding: Eliminate Irrelevance / Low-Relevance (lossy)
Preparatory Step: Decorrelation - Eliminate Interdependencies this is the essence of source coding
changes "representation" of media
goal usually: reduce dependencies between data
as such, is a preparatory step!!
usually, does not compress
Steps in hybrid coding (often): decorrelation - reduction - entropy coding
often: reduction by quantization
last step: additional compresion without harm
note: literature usually uses terms as in last slide!!
note: reduction coding is "smart deletion", not really "compression"
http://goback/ -
8/6/2019 05A Compression
10/102
05A-compression.fm 10 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
echn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
Categories and Techniques, Cont. (2)
Major distinction: Symmetric / Asymmetric
Asym. (usually): more effort for compression
o.k. if compression non real-time, "only once" (movie!)
may involve number-crunchers (...owned by content provider)
Symmetric: "required" for real-time, e.g., videoconferencing
in reality, often not 100% symmetricFurther considerations include, e.g.,
Adjustable compression rate? ...quality?
"smooth" bit stream ("isochronous")?
terms: CBR (const. bit rate) vs. VBR (variable bit rate)
may be "over time": e.g., packet size BigSmallSmall BigSmallSmall...
may be simulated w/ loop-back filter plus buffer
"progressive" (mainly: non-continuous media): display-while-download
"streaming": ~ same for video (here, rather an issue of software)
more subtle issues "open" standard?
good "performance" (ratio, speed) for all kinds of media?
bullet-proof, well-understood?
...
http://goback/ -
8/6/2019 05A Compression
11/102
05A-compression.fm 11 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
echn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
4. Source Coding
DPCM
DPCM = Differential Pulse-Code Modulation
Assumptions:
Consecutive samples or frames have similar values
Prediction is possible due to existing correlationFundamental Steps:
Incoming sample or frame (pixel or block) is predictedby means of previously processed data
Difference between incoming data and prediction is determined
Difference is quantized
Challenge: optimal predictor
Further predictive coding technique:
Delta modulation (DM): 1 bit as difference signal
http://goback/ -
8/6/2019 05A Compression
12/102
05A-compression.fm 12 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
echn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
Source Coding: Transformation
Assumptions:
Data in the transformed domain is easier to compress
Related processing is feasible
Example:
FFT: Fast Fourier Transformation
DCT: Discrete Cosine Transformation
Inverse
Fourier Transformation
time domain frequencydomain
Fourier Transformation
http://goback/ -
8/6/2019 05A Compression
13/102
05A-compression.fm 13 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
echn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
Source Coding: Sub-Band
Assumption:
Some frequency ranges are more important than others
Example:
Application:
vocoder for speech communication
MPEG audio
frequency spectrum of the signal
transformation / codingfrequency
http://goback/ -
8/6/2019 05A Compression
14/102
05A-compression.fm 14 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
echn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
Entropy Coding: Principle
Entropy (in information theory): information content/ "density"
symbols/words equally likely: high entropy (full of information)
otherwise: lower entropy (suboptimal representation of info, less dense)
Entropy formula:
example: given 4 possible symbols (words) in source code
i) IF all equal p=1/4: H(P)=2; ii) IF p= 1/2, 1/4, 1/8, 1/8 --> H(P)= 1
6
/8 "Entropy coding" means:
mean length of file equals (~almost) entropy
in ii) above, with B=2 (binary):
p=! code length -log2 () = -(-1)=1; p=! 2bits, etc.
GOAL: find code w/ symbol length as close as possible to logB p()
grey levelsp
roba
bility
grey levels
pro
ba
bility
high
Entropy
low
Entropy
grey levelsp
roba
bility
grey levels
pro
ba
bility
high
Entropy
low
Entropy
P( p p B
H ) ( ) ( )log=P( p p B
H ) ( ) ( )log=
note: seems "little information" tous since it is very regular; this is not
covered by entropy formula, yet maybe used for compression (e.g. run length)
here: "little info" because"most of picture is in same gray"
http://goback/ -
8/6/2019 05A Compression
15/102
05A-compression.fm 15 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
echn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
5. Entropy Coding:
Run-Length (only marginal relation to entropy)
Assumption:
Long sequences of identical symbols
Example:
Special variant: zero-length encoding
only repetition of zeroes count
in red part above, "symbol" not needed (i.e. "pays" for >2 repetitions)
... A B C E E E E E E D A C B...
compression
... A B C E ! 6 D A C B...
symbol
special flag
number ofoccurrences
http://goback/ -
8/6/2019 05A Compression
16/102
05A-compression.fm 16 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
echn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
Entropy Coding: Huffman
Basics:
Assumption: some symbols occur more often than others
E.g., character frequencies of the English language
Idea: frequent symbols --> shorter bit strings (cf. Entropy!)
Example: Characters to be encoded: A, B, C, D, E
probability to occur: p(A)=0.3, p(B)=0.3, p(C)=0.1, p(D)=0.15, p(E)=0.15
probability symbol code
1
0
1
0
1
0
1
0
coding tree
30%
30%
10%
15%
15%
A
B
C
D
E
11
10
011
010
00
100%
40%
60%
25%
step 1: scan all leaves, assign(1,0) to the two with lowest
probability -> intermediate root
steps 2-n: scan current "tops"
(intermediate roots or leaves),
assign (1,0) to the two withlowest probability, -> ...
end: assign codes by descending
tree until leaves, bits encountered
represent code step 2
step 3
step 4
http://goback/ -
8/6/2019 05A Compression
17/102
05A-compression.fm 17 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
echn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
Entropy Coding: Huffman
Table and example of application to data stream
note: decoder may auto-detect end-of-symbol in bit-stream
Other types of Entropy encoding
Arithmetic Encoding (1)
most direct application of entropy principles!
symbols occupy sub-interval of [0,1) according to their probabilities successive coding of symbols cuts out corresponding sub-interval
in sub-(sub-sub-...) interval chosen so far
last symbol --> last sub-interval
chose "arbitrary" ("short") no. in this subinterval --> transmit/store
requires consideration of "additional" symbol "end-of-word" (below: "!")
10 11 011 010 11 10 00 10 11 00
B A DC BA B E A
symbol code
AB
CD
E
11
10
011010
00
E
http://goback/ -
8/6/2019 05A Compression
18/102
05A-compression.fm 18 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinm
etz
,M
.Mhlhuser
Arithmetic Coding Example
symbols a,b,c,d,!; how to encode "bbd!"??
transmit/store one (arbitrary) value X in "red" sub-interval
decoder: X lies in [.1,.6) !1st symbol is "b"; X in [.11, 36) ! 2nd symbol "b"
process continues. until "!" is decoded
0 .1 .6 .7 .9 1
".1#"$$$$$$$$$$ 0.5 $$$$$$$$$$#".1#"$$0.2$$#".1#
0 .1 .6 .7 .9 1
0 .1 .11 .36 .6 .7 .9 1
0 .1 .6 .7 .9 1
0 .1 .6 .7 .9 1
divide [0,1) accordingto probabilities
restrict to b: [0.1,0.6)
b: sub-interval [.1,.6)
of [.1,.6), i.e. [.11,.36)
d: sub-interval [.7,.9)of [.11,.36)
!: sub-interval [.9,1)...
a b c d !
http://goback/ -
8/6/2019 05A Compression
19/102
05A-compression.fm 19 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhuser
LZW, Code Books, VLWs, VLIs
LZW (Lempl-Ziv-Welch):
e.g.: how to code "whoiswho"? might be thought of as follows: start: create table entry for w
then: create table entry for h, but also for "wh" (multi-symbol),
then: create table entry for o, but also for "who", (etc. until max-length)
multi-symbol entries which repeat "often" survive, others: over-written
Code-Books: used in many compression schemes 3 basic possibilities: (imagine in above example)
"fixed": all implementations of Codec have (1..n) pre-defined code-books
"pre-computed":
encoder-pass1: compute codebook, store/transmit upfront encoder-pass2: encode (compress) data using codebook
"dynamic": code-book grows / changes during compression (LZW)
needs "same procedure" for encoder, decoder
either: "pieces" of code-book are intertwined with code as they
are generated / changed or: rules are such that (dynamic) codebook contents can be derived
from encoded (compressed) data by decoder
VLWs / VLIs: variable length words / integers
similar to Huffman, but decoder can not detect end-of-symbol
e.g., 1="0", 2="01", 3="11", ... (useful?? see JPEG etc.)
http://goback/ -
8/6/2019 05A Compression
20/102
05A-compression.fm 20 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhuser
6. Hybrid Coding: Basic Encoding Steps
audio:
video:lossy
lossless (sometimes lossless)
lossy
lossless
e.g.
- resolution- frame rate
e.g.
- DCT- sub-band
coding
e.g.
- linear- DC, AC
values
e.g.
- runlength- Huffman
data
pre-paration
data
pro-cessing
quanti-
zation
entropy
encoding
source
data
compresse
data
http://goback/ -
8/6/2019 05A Compression
21/102
05A-compression.fm 21 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhuser
7. JPEG
JPEG: Joint Photographic Expert Group
International Standard:
For digital compression and coding of continuous-tone still images:
Gray-scale
Color
Since 1992Joint effort of:
ISO/IEC JTC1/SC2/WG10
Commission Q.16 of CCITT SGVIII
Compression rate of 1:10 yields reasonable results
http://goback/ -
8/6/2019 05A Compression
22/102
05A-compression.fm 22 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhuser
JPEG
Very general compression scheme
Independence of:
Image resolution
Image and pixel aspect ratio
Color representation
Image complexity and statistical characteristicsWell-defined interchange format of encoded data
Implementation in:
Software only
Software and hardwareMOTION JPEG for video compression
Sequence of JPEG-encoded images
http://goback/ -
8/6/2019 05A Compression
23/102
05A-compression.fm 23 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhuser
JPEG - Compression Steps
imagepre-
paration
imagepro-
cessing
quanti-
zation
entropy
encoding
source
image
com-
pressed
image
blockMCU
pixel
predictor
FDCT
runlength
Huffman
Arithm.
MCU: Minimum Coded UnitFDCT: Forward Discrete Cosine Transformation
http://goback/ -
8/6/2019 05A Compression
24/102
05A-compression.fm 24 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhuser
JPEG - Image Preparation
Planes:
1 N 255 components Ci (e.g., one plane per color) Different resolution of individual components possible
Pixel resolution:
8 or 12 bit per pixel in lossy modes
2 to 16 bit per pixel in lossless mode
C1
C2
CNYi
Xi
right
bottom
left
* * *
*
*
* *
*
*
*
*
line
topdata units
data units: samples in lossless mode, blocks with 8x8 pixels in other modes
http://goback/ -
8/6/2019 05A Compression
25/102
05A-compression.fm 25 15.March.01
Scope
Contents
h
ttp:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhuser
JPEG - Image Preparation
Example 4:2:2 YUV, 4:1:1 YUV, and YUV9 Coding
Luminance (Y): brightness
sampling frequency 13.5 MHz
Chrominance (U, V):
color differences sampling frequency 6.75 MHz
http://goback/ -
8/6/2019 05A Compression
26/102
05A-compression.fm 26 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhuse
r
JPEG - Image Preparation
Non-interleaved encoding:
Interleaved encoding:
Minimum Coded Unit (MCU):
Combination of interleaved data units of different components
top
rightleft
bottom
* * * * * * *
* * * * * * *
* * * * * * *
* * ** * ** * ** * ** * ** * ** * ** * *
* * ** * ** * ** * *
* * ** * *
* *** **
* *** **
C1 C2 C3
http://goback/ -
8/6/2019 05A Compression
27/102
05A-compression.fm 27 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Stein
me
tz,
M.
Mhlhuse
r
JPEG - Baseline Mode
Baseline mode is mandatory for all JPEG implementations:
Often restricted to certain resolution Often only three planes with predefined color set-up
Image preparation:
Step 1a: Pixel resolution --> multiples of p=8 bit
yields 8 x 8 pixel blocks (data units)
Step 1b: unsigned --> signed integer (prepare for "oscillation" --> sin/cos)
... other steps see below
Step 4a: zigzag linearization (see below)
Steps 4b, c, ...: several entropy coding algorithms applied
tables
1: imagepre-
paration
2: imagepro-
cessing
3. quanti-
zation
4. entropy
encodingsource
image
com-
pressed
image
FDCT tables tables8x8
blocks
I i i U d di f DCT
http://goback/ -
8/6/2019 05A Compression
28/102
05A-compression.fm 28 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Stein
me
tz,
M.
Mhlhuse
r
Intuitive Understanding of DCT
Fourier-Transform (& FFT "fast" algorithm) known from 1-dimensional:
cut waveform into pieces (blocks of samples) for each blocks:
interpret as periodic (infinite) oscillating waveform
represent as sum of sin/cos waves ai sin t; i=0...(N-1); same for cos
ai coefficients; a0 = DC (direct current= shift wrt. 0-axis),others: how much of the respective sin or cos wave is part of waveform
i increasing frequencies (usually N = no. of samples in block)
DCT in JPEG etc.:
same idea, but 2-dimensional cos-waves
cut out square blocks from picture (NxN) cos waves all have independent frequencies in horizontal/vertical direction
comparable to smooth hills, # of valleys may differ horiz/vert.
again: interpret sample as periodic (2D) waveform--> represent as sum of (2D) cos wave "hill areas"
why only cos?? trick: picture swapped around axes
--> 4fold size --> picture symmetric to axes --> sin parts become zero
4fold size no problem: 3 parts redundant
axes have double "weight" (pix. row/col. "0") --> factor Cu/Cv in formula
JPEG B li M d I P i
http://goback/ -
8/6/2019 05A Compression
29/102
05A-compression.fm 29 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Stein
me
tz,
M.
Mhlhuse
r
JPEG - Baseline Mode: Image Processing
Forward Discrete Cosine Transformation (FDCT):
with:
cu, cv = , for u, v= 0; else cu, cv = 1
Formula applied to each block for all 0 u, v 7:
Blocks with 8x8 pixel result in 64 DCT coefficients:
1 DC-coefficient S00: basic color of the block
63 AC-coefficients: (likely) zero or near-by zero values
Different significance of the coefficients:
DC: most important AC: less important
Svu
14---C
uC
vs
yx2x 1+( )u
16-------------------------------
2y 1+( )v16
-------------------------------coscos
y 0=
7
x 0=
7
=
1
2-------
JPEG B li M d I P i
http://goback/ -
8/6/2019 05A Compression
30/102
05A-compression.fm 30 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Stein
me
tz,
M.
Mhlhuse
r
JPEG - Baseline Mode: Image Processing
FDCT transforms:
blocks into blocks not pixels into pixels
Example:
Calculation of S00
# # # # # # # #
# # # # # # # ## # # # # # # #
# # # # # # # #
# # # # # # # #
# # # # # # # #
# # # # # # # #
# # # # # # # #
* * * * * * * *
* * * * * * * ** * * * * * * *
* * * * * * * *
* * * * * * * *
* * * * * * * *
* * * * * * * *
* * * * * * * *
JPEG Baseline Mode: Quantization
http://goback/ -
8/6/2019 05A Compression
31/102
05A-compression.fm 31 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
form
atik
.tu-d
arms
tadt.de
R.
Stein
me
tz,
M.
Mhlhuse
r
JPEG - Baseline Mode: Quantization
Use of quantization tables for the DCT-coefficients:
Map interval of real numbers to one integer number Allows to use different granularity for each coefficient
05
Sc
Co
http://www kom e-technik tu-darmstadt de
http://goback/http://goback/ -
8/6/2019 05A Compression
32/102
A-compressio
n.fm
3215.March.01
cope
ontents
http://www.kom.e technik.tu darmstadt.dehttp://www.tk.informatik.tu-darmstadt.de
R. Steinmetz, M. Mhlhuser
JPE
G:QuantizationEffect
(a)
http://goback/ -
8/6/2019 05A Compression
33/102
JPEG - Baseline Mode: Entropy Coding
-
8/6/2019 05A Compression
34/102
05A-compression.fm 34 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.de
R.
Stein
me
tz,
M.
Mhlhuse
r
JPEG - Baseline Mode: Entropy Coding
63 AC coefficients:
Ordering in zig-zag form
reason: coefficients in lower right corner are likely to be zero Huffman coding of all coefficients:
Transformation into a codewhere amount of bits depends on frequency of respective value
Subsequent runlength coding of zeros
* *** ***** *** ****
* *** ****
* *** ****
* *** ***** *** ***** *** ***** *** ****
AC77AC70
DC
AC07AC01
JPEG: Details of (one possible) Entropy conding
http://goback/ -
8/6/2019 05A Compression
35/102
05A-compression.fm 35 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.d
e
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.d
e
R.
Stein
me
tz,
M.
Mhlhuse
r
JPEG: Details of (one possible) Entropy conding
Treatment of "zig-zag sequence":
differential coding of DC: DCi stored as "change wrt. DCi-1" assumption: there will rarely be two non-zero AC values in sequence
--> regard seq. as iteration of non-zero AC-values and zero-runlengths--> sometimes, the zero-runlength will have "length zero"
code non-zero AC-values as VLIs --> need to transmit VLI-lengths
(remember: this is not Huffman --> end of code not found by decoder) create pairs (zero-runlength, VLI-length-of-following-non-zero-AC-value)
these pairs are Huffman encoded
the very first "pair" is not a pair, but the VLI-length of the (diff.) DC-value
the block is finally represented as iterationHuffman-encoded pair / VLI-encoded non-zero-AC / Huffman-.... / VLI... / ...preceded by "Huffman-encoded VLI-length / VLI-encoded diff.-DC"
The next two slides give an example of the DCT coding of a 8x8 block
JPEG: Sample Compression of 1 Block: 8x8 Matrices
http://goback/ -
8/6/2019 05A Compression
36/102
05A-compression.fm 36 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.d
e
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.d
e
R.
Stein
me
tz,
M.
Mhlhuse
r
JPEG: Sample Compression of 1 Block: 8x8 Matrices
1. Typical Pixel Block:
3. Quantization Matrix:
139 144 149 153 155 155 155 155
144 151 153 156 159 156 156 156
150 155 160 163 158 156 156 156
159 161 162 160 160 159 159 159
159 160 161 162 162 155 155 155
161 161 161 161 160 157 157 157
162 162 161 163 162 157 157 157
162 162 161 161 163 158 158 158
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
16 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99
2. DCT Coefficients:
4. Quantized Result:
235.6 1.0 -12.1 -5.2 2.1 -1.7 -2.7 1.3
-22.6 -17.5 -6.2 -3.2 -2.9 -0.1 0.4 -1.2
-10.9 -9.3 -1.6 1.5 0.2 -0.9 -0.6 -0.1
-7.1 -1.9 0.2 1.5 0.9 -0.1 0.0 0.3
-0.6 -0.8 1.5 1.6 -0.1 -0.7 0.6 1.3
1.8 -0.2 1.6 -0.3 -0.8 1.5 1.0 -1.0
-1.3 -0.4 -0.3 -1.5 -0.5 1.7 1.1 -0.8
-2.6 1.6 -3.8 -1.8 1.9 1.2 -0.6 -0.4
15 0 -1 0 0 0 0 0
-2 -1 0 0 0 0 0 0
-1 -1 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
JPEG: Sample Compression (contd.)
http://goback/ -
8/6/2019 05A Compression
37/102
05A-compression.fm 37 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.d
e
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.d
e
R.
Stein
me
tz,
M.
Mhlhuse
r
JPEG: Sample Compression (contd.)
assume: last DC value was 18 --> encoded difference is 3
--> only 3, -2, -1 occur as non-zero values.Their VLI-encoding is as follows:
3 11
-2 01
-1 0
This makes the iteration look as follows (VLIs still represented as integers):
(2)(3), (1,2)(-2), (0,1)(-1), (0,1)(-1), (0,1)(-1), (2,1)(-1), (0,0) (
-
8/6/2019 05A Compression
38/102
05A-compression.fm 38 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.d
e
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.d
e
R.
Stein
me
tz,
M.
Mhlhuse
r
JPEG 4 Modes of Compression
lossy sequential DCT-based mode
(baseline mode)
expanded lossy DCT-based mode
lossless mode
hierarchical mode
JPEG - Extended Lossy DCT-Based Mode
http://goback/ -
8/6/2019 05A Compression
39/102
05A-compression.fm 39 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.d
e
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.d
e
R.
Stein
me
tz,
M.
Mhlhuse
r
J G te ded ossy C ased ode
Pixel resolution 8 to 12 bit
Sequential image display: Top to bottom
Good for small images and fast processing
Progressive image display:
Coarse to fine
Good for large and complicated images
http://goback/ -
8/6/2019 05A Compression
40/102
JPEG - Lossless Mode
-
8/6/2019 05A Compression
41/102
05A-compression.fm 41 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.d
e
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.d
e
R.
Stein
me
tz,
M.
Mhlhuse
r
Image preparation:
On pixel basis (2-16 bit/pixel)Image processing:
Selection of a predictor for each pixel
Entropy coding:
Same as lossy mode
Code of chosen predictor and its difference to the actual value
c b
a x
predictioncode012345
67
no predictionx=Ax=Bx=Cx=A+B+Cx=A+((B-C)/2)
x=B+((A-C)/2)x=(A+B)/2
JPEG - Hierarchical Mode
http://goback/ -
8/6/2019 05A Compression
42/102
05A-compression.fm 42 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.d
e
http:/
/www
.tk
.in
for
ma
tik
.tu-d
arms
tadt.d
e
R.
Stein
me
tz,
M.
Mhlhuser
Coding of each image with several resolutions:
Image scaling Differential encoding
First, coded with lowest resolution - image A
Coded with increasing horizontal & vertical resolution - image A
Difference between both images is computed - B = A - A (*)
Iteration for higher resolutions
Features:
Requires more storage and higher data rate
Fast decoding process
Used for scalable video Similar to Photo-CD (Kodak, proprietary)
(*) note for all scalable approaches:
relate higher-res version B (or B) to receivers de-codedlower-res version A (to avoid accumulation of quantization errors)
8. H.261 and related ITU Standards
http://goback/ -
8/6/2019 05A Compression
43/102
05A-compression.fm 43 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.d
e
http:/
/www
.tk
.in
for
ma
tik
.tu-d
arms
tadt.d
e
R.
Stein
me
tz,
M.
Mhlhuser
Video codec for audiovisual services at p x 64kbit/s
("p-times-sixtyfour", where p means "multiples-of"): CCITT standard from 1990
For ISDN
With p=1,..., 30
Technical issues:
Real-time encoding/decoding
Max. signal delay of 150ms
Constant data rate
Implementation in hardware (main goal) and software
H.261 - Image Preparation
http://goback/ -
8/6/2019 05A Compression
44/102
05A-compression.fm 44 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.d
e
http:/
/www
.tk
.in
for
ma
tik
.tu-d
arms
tadt.d
e
R.
Steinme
tz,
M.
Mhlhuser
Fixed source image format
Image components: Luminance signal (Y)
Two color difference signals (Cb,Cr)
Subsampling according to CCIR 601 (4:1:1)
Quarter Common Intermediate Format (QCIF) resolution: Mandatory Y: 176 x 144 pixel ("pruning" 180-->176)
At 29.97 frames/s appr. 9.115 Mbps (uncompressed)
but: encoder may leave out up to 3 frames (--> ~8 fps)
Common Intermediate format (CIF) resolution: Optional
Y: 352 x 288 pixel
At 29.97 frames/s appr. 36.46 Mbps (uncompressed) i.e. ~ 570 * 64kbps
Layered structure:
Block of 8 x 8 pixels
Macroblock of: 4 Y blocks, 1 Cr block, 1 Cb block Group of blocks (GOBs) of 3 x 11 macroblocks
Picture:
QCIF picture: 3 GOBs
CIF picture: 12 GOBs
CIF: 360*288
QCIF
H.261 - Image Compression
http://goback/ -
8/6/2019 05A Compression
45/102
05A-compression.fm 45 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.d
e
http:/
/www
.tk
.in
for
ma
tik
.tu-d
arms
tadt.d
e
R.
Steinme
tz,
M.
Mhlhuser
Intraframe coding: yields "reference frame" f0
DCT w/ same quantization factor for all AC values this factor may be adjusted by loopback filter (see below)
Interframe coding, motion estimation:
interframes: f1,f2,f3,... relative to f0 (differential encoding)
in H.261: intraframes rare (bandwidth!, main application videophone)
Search of similar macroblock (16x16) in previous image
Position of this macroblock defines motion vector Search range is up to the implementation:
max. 15 pixel
but: motion vector may also always be 0 ("bad" software encoder)
Frame 1 Frame 2
note about motion vector mv:- mv points "backwards" in time
(pos. of object in f- mv related to block,
not moving object
H.261 - Image Compression
http://goback/ -
8/6/2019 05A Compression
46/102
05A-compression.fm 46 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.d
e
http:/
/www
.tk
.in
for
ma
tik
.tu-d
arms
tadt.d
e
R.
Steinme
tz,
M.
Mhlhuser
Interframe coding, further steps:
Results: Difference between similar macroblocks
Motion vector
Difference of macroblocks:
DCT if value higher than a specific threshold (hybrid DPCM/DCT!)
No further processing if value less than this threshold
Motion vector:
Components are coded yielding code words of variable length
Quantization:
Linear Adaptation of step size ("loopback filter") => ~ constant data rate
("leaky bucket": constant 64kbps "drop out";loopback filter: adjust quantization factor if bucket filledabove threshold1 or below threshold 2, respectively)
Further ITU Video Schemes (H.263, H.3xx)
http://goback/ -
8/6/2019 05A Compression
47/102
05A-compression.fm 47 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.d
e
http:/
/www
.tk
.in
for
ma
tik
.tu-d
arms
tadt.d
e
R.
Steinme
tz,
M.
Mhlhuser
H.263
extension to H.261 max. bitrate: H.263 approx. 2.5 x H.261; lowest bitrates suitable f. modem
Source Image Formats
Format PixelsH.261 H.263
Encoder Decoder Encoder Decoder
SQCIF 128 x 96 optional required
QCIF 176 x 144 required required
CIF 352 x 144 optional optional
4CIF 704 x 576not defined optional
16CIF 1408 x 1152
http://goback/ -
8/6/2019 05A Compression
48/102
H.320, H.32x Family
-
8/6/2019 05A Compression
49/102
05A-compression.fm 49 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.d
e
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.d
e
R.
Steinme
tz,
M.
Mhlhuser
H.320 specifies (as overview) videophone for ISDN
H.310 adapt MPEG 2 for communication over B-ISDN (ATM)
H.321
define videoconferencing terminal for B-ISDN (instead of N-ISDN)
H.322 adapt H.320 for guaranteed QoS LANs (like ISO-Ethernet)
H.323
videoconferencing over non-guaranteed LANs
H.324 Terminal for low bit rate communication (over V.34 Modems)
http://goback/ -
8/6/2019 05A Compression
50/102
-
8/6/2019 05A Compression
51/102
MPEG - Video: Preparation Step
-
8/6/2019 05A Compression
52/102
05A-compression.fm 52 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.de
R.
Stei
nme
tz,
M.
Mhlhus
er
Fixed image format
Color subsampling: Y, Cr, Cb 4:2:0
Resolution:
Should be at most 768 x 576 pixel 8 bit/pixel in each layer (i.e., for Y, Cr, Cb)
14 pixel aspect ratios
8 frame rates
No user defined MCU like JPEG
No progressive mode like JPEG
MPEG - Video: Processing Step
http://goback/ -
8/6/2019 05A Compression
53/102
05A-compression.fm 53 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.de
R.
Stei
nme
tz,
M.
Mhlhus
er
4 types of frames:
I-frames (intra-coded frames): Like JPEG
Real-time decoding demands
P-frames (predictive coded frames):
Reference to previous I- or P-frames Motion vector
MPEG does not define how to determine the motion vector
difference of similar macroblocks is DCT coded
DC and AC coefficients are runlength coded
B-frames (bi-directional predictive coded frames):
Reference to previous and subsequent (I or P) frames
Interpolation between macro blocks
D-frames (DC-coded frames):
Only DC-coefficients are DCT coded For fast forward and rewind
MPEG - Video Coding
http://goback/ -
8/6/2019 05A Compression
54/102
05A-compression.fm 54 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhus
er
Sequence of I-, P-, and B-frames:
Sequence:
Defined by application
E.g., I B B P B B P B B I B B P B B P B B
Order of transmission is different: I P B B ...
I
P
B
BP
B
B
IReferences
t
I-Frames (Intracoded)
P-Frames (Predictive Coded)
B-Frames (Bidirectionally Coded)
(D-Frames (DC Coded))
MPEG - Video: Implications
http://goback/ -
8/6/2019 05A Compression
55/102
05A-compression.fm 55 15.March.01
Scope
Contents
http:/
/www
.kom
.e-te
chn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhus
er
Random access
at I-frames at P-frames: i.e. decode previous I-frame first
at B-frame: i.e. decode I and P-frames first
Editing
decoded data
loss of quality (encode -> decode -> encode -> ...)
application of all video editing functions
encoded data (previous to entropy encoding)
preservation of quality
transition effects as function in the DCT domain morphing, non-block conform overlay very difficult
encoded data
preservation of quality
today: too complex, if possible, i.e. need for entropy decoding
MPEG - Audio Coding: Fundamentals
http://goback/ -
8/6/2019 05A Compression
56/102
05A-compression.fm 56 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhus
er
Masking threshold in the frequence domain
narrowband random noise
depends on frequency
0.02 0.05 0.1 0.2 0.5 1 2 5 10 20frequency (kHz)
SoundPressureLevel(dB)
0
20
40
60
80
fm
= 0.25 1 4 kHz
av
absolute thresholdof hearing
masking
patterns
MPEG - Audio Coding: Fundamentals
http://goback/ -
8/6/2019 05A Compression
57/102
05A-compression.fm 57 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
fo
rma
tik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhus
er
Masking in Time Domain
after and before the event
depends on (to some extent) amplitude
-50 50 100 150 ms 0 50 100 150 200
SLT
0
20
40
60
Dt tv
masker
simultaneous-pre- post-masking-
MPEG - Audio Coding
http://goback/ -
8/6/2019 05A Compression
58/102
05A-compression.fm 58 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.de
http:/
/www
.tk
.in
fo
rma
tik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhus
er
Yields: heavily asymmetric codecs!!
Audio channel:
Between 32 and 448 kbit/s
In steps of 16 kbit/s
Definition of 3 "layers" of quality:
"higher layer" means "more complex" & "can handle lower layers"
Layer 1: max. 448 Kbit/s (ca. 1:4 compression, e.g. used as PASC in DCC) Layer 2: max. 384 Kbit/s (ca. 1:6-8, common, e.g. as MUSICAM in DAB)
Layer 3: max. 320 Kbit/s (ca. 1:10-12, the famous MP3)
sub-bandcoding
quanti-zation
entropy
psychoacousticalmodel
32coder &
framepacking
controls: how many bits reservedfor which sub-band
MPEG - Audio Coding
S li tibl t di f CD DA d DAT
http://goback/ -
8/6/2019 05A Compression
59/102
05A-compression.fm 59 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.
de
http:/
/www
.tk
.in
fo
rma
tik
.tu-d
arms
tadt.de
R.
Steinme
tz,
M.
Mhlhus
er
Sampling compatible to encoding of CD-DA and DAT:
Sampling rates: 32 kHz, 44,1 kHz, 48 kHz
Sampling precision:
16 bit/sample
Audio channels:
Mono (single, 1 channel)
Stereo (2 channels)
dual channel mode (independent, e.g., bilingual)
optional: joint stereo (exploits redundancy and irrelevancy)
Application Example: DAB Digital Audio Broadcasting uses MPEG layer 2 (compression also known as MUSICAM =
(Masking pattern adapted Universal Subband Integrated Coding And Multiplexing)
delays, for VLSI implementation:
max. 30 ms encoding
max. 10 ms decoding
SW codec delays vary for different layers, implementations, computers(rule-of-thumb may be 50/100/150 ms for layer 1/2/3, which makesMP3 rather inappropriate for real-time conversation)
MPEG - Audio and Video Data Streams
A di D t St L
http://goback/ -
8/6/2019 05A Compression
60/102
05A-compression.fm 60 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.
de
http:/
/www
.tk
.in
fo
rma
tik
.tu-d
arms
tadt.
de
R.
Steinme
tz,
M.
Mhlhus
er
Audio Data Stream Layers:
1. Frames
2. Audio access units
3. Slots
Video Data Stream Layers:
1. Video sequence layer
2. Group of pictures layer
3. Single picture layer
4. Slice layer
5. Macroblock layer
6. Block layer
10. MPEG-2
Follow-Up MPEG Standards
http://goback/ -
8/6/2019 05A Compression
61/102
05A-compression.fm 61 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.
de
http:/
/www
.tk
.in
fo
rma
tik
.tu-d
arms
tadt.
de
R.
Steinme
tz,
M.
Mhlhus
er
Follow Up MPEG Standards
MPEG-2:
Higher data rates for high-quality audio/video
Multiple layers and profiles
MPEG-3
Initially HDTV
MPEG-2 scaled up to subsume MPEG-3
MPEG-4:
Initially, lower data rates for e.g. mobile communication
then: focus coding & additional functionalities based on image contents
MPEG-7 (EC = "experimental core" status):
Content description
Basis for search and retrieval
See section on databasesMPEG-21 (upcoming):
Framework for multimedia business, delivery... whats missing?
maybe eCommerce focus --> e.g., security, watermarking?
MPEG-2
From MPEG 1 to MPEG 2
http://goback/ -
8/6/2019 05A Compression
62/102
05A-compression.fm 62 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.
de
http:/
/www
.tk
.in
fo
rma
tik
.tu-d
arms
tadt.
de
R.
Ste
inme
tz,
M.
Mhlhus
er
From MPEG-1 to MPEG-2
Improvement in quality from VCR to TV to HDTV
No CD-ROM based constraints
higher data rates
MPEG-1: about 1.5 Mbit/s
MPEG-2: 2-100 Mbit/s
Evolution
1994: International Standard
Also later known as H.262
Prominent role for digital TV in DVB (digital video broadcasting) commercial MPEG-2 realizations available
MPEG-2 Video
Inclusion of interlaced video format
http://goback/ -
8/6/2019 05A Compression
63/102
05A-compression.fm 63 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.
de
http:/
/www
.tk
.in
fo
rma
tik
.tu-d
arms
tadt.
de
R.
Ste
inme
tz,
M.
Mhlhus
er
Inclusion of interlaced video format
Increase resolution, more than CCIR 601Defined as:
5 profiles (simple, main,..)
4 levels (with increasing resolution,...)
Other additional features DCT coefficients may be coded with a non-linear quantization function
MPEG-2 Video: Scaling
Motivation
http://goback/ -
8/6/2019 05A Compression
64/102
05A-compression.fm 64 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt.
de
http:/
/www
.tk
.info
rma
tik
.tu-d
arms
tadt.
de
R.
Ste
inme
tz,
M.
Mhlhuser
Motivation
analog: continuous decrease in quality if errors occur digital: need for tolerance whenever error occur, i.e scaling
Option: Spatial scaling
reduction of resolution
approach
image sampled with half resolution, then MPEG algorithms applied,output processed with better FEC (base layer)
Image decoded, substracted from original, to difference MPEG algorithmsapplied, output processed with worse FEC (enhanced layer)
Option: Signal to Noise (SNR) scaling noise introduced by
quantization errors and visible block structures
approach
Base layer: DCT output, more significant bits encoded with better FEC
Enhanced layer: DCT output, less significant bits encoded with worse FEC
MPEG-2 Video Profiles und Levels
http://goback/ -
8/6/2019 05A Compression
65/102
05A-compression.fm 65 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt
.de
http:/
/www
.tk
.info
rma
tik
.tu-d
arms
tadt
.de
R.
Ste
inme
tz,
M.
Mhlhuser High Level
1920 pixels/line
1152 lines
80 Mbit/s
100 Mbit/s
High-1440Level
1440 pixels/line1152 lines
60 Mbit/s 60 Mbit/s 80 Mbit/s
Main Level720 pixels/
line576 lines
15 Mbit/
s
15 Mbit/
s
15 Mbit/
s
20 Mbit/s
Low Level352 pixels/
line288 lines
4 Mbit/s 4 Mbit/s
SimpleProfile
MainProfile
SNRScalable
Profile
SpatialScalable
Profile
HighProfile
http://goback/ -
8/6/2019 05A Compression
66/102
05A-compression.fm 66 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt
.de
http:/
/www
.tk
.info
rma
tik
.tu-d
arms
tadt
.de
R.
Ste
inme
tz,
M.
Mhlhuser
LEVELSand
PROFILES
No B-frames B-frames B-frames B-frames B-frames
4:2:0 4:2:0 4:2:0 4:2:04:2:0 or
4:2:2
NotScalable
NotScalable
SNRScalable
SNR
Scalable orSpatial
Scalable
SNR
Scalable orSpatial
Scalable
MPEG-2 Audio
(two modest) extension to MPEG-1 audio:
http://goback/ -
8/6/2019 05A Compression
67/102
05A-compression.fm 67 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt
.de
http:/
/www
.tk
.info
rma
tik
.tu-d
arms
tadt
.de
R.
Ste
inme
tz,
M.
Mhlhuser
( )
1) "low sample rate extension" LSE: 1/2 of all MPEG-1 rates: 16, 22.05, 24kHz
quantization down to 8 bits/sample
2) "multichannel extension": more channels, i.e. up to
5 full bandwidth channels (surround system) left and right front
center (in front)
left and right back
"matrixing": rule for backward compatible conversion --> stereo (x, y = 0.71)
option: +1 "low freq. extension" (LFE) channel for subwoofer
"multilingual extension": 7 more, i.e. up to 12 channels
(multiple languages, commentary)
compatibility with MPEG-1:
all MPEG-1 audio format can be processed by MPEG-2
only 3 MPEG-2 audio codecs do not provide backward compatibility
Left for Stereo Left_f xCenter yLeft_b+ +=
Right for Stereo Right_f xCenter yRigtht_b+ +=
MPEG-2 System
Steps
http://goback/ -
8/6/2019 05A Compression
68/102
05A-compression.fm 68 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tadt
.de
http:/
/www
.tk
.info
rma
tik
.tu-d
arms
tadt
.de
R.
Ste
inme
tz,
M.
Mhlhuser
p
1. Audio and video combined to Packetized Elementary Stream (PES)2. PES(es) combined to Program Stream or Transport Stream
Program stream:
Error-free environment
Packets of variable length One single stream with one timing reference
Transport stream:
Designed for noisy (lossy) media channels
Multiplex of various programs with one or more time bases Packets of 188 byte length
Conversion between Program and Transport Streams possible
11. MPEG-4
Goals
http://goback/ -
8/6/2019 05A Compression
69/102
05A-compression.fm 69 15.March.01
Scope
Contents
http:/
/www
.kom
.e-
tec
hn
ik.t
u-d
arms
tadt
.de
http:/
/www
.tk
.info
rma
tik
.tu-d
arms
tadt
.de
R.
Ste
inme
tz,
M.
Mhlhuser
MPEG-4 (ISO 14496) originally:
Targeted at systems with very scarce resources
To support applications like
Mobile communication
Videophone and E-mail Max. data rates and dimensions (roughly):
Between 4800 and 64000 bits/s
176 columns x 144 lines x 10 frames/s
Largely covered by H.263, therefore re-orientation: Goal to provide enhanced functionality
to allow for analysis and manipulation of image contents
MPEG-4: Schedule for Standardization
1993 Work started
1997: Committee Draft
1998: Final Committee Draft
1998: Draft International Standard
1999-2000: International Standard
MPEG-4: Goals (cont.)
1: support composite multimedia i.e. find standardized ways to
http://goback/ -
8/6/2019 05A Compression
70/102
05A-compression.fm 70 15.March.01
Scope
Contents
http:/
/www
.kom
.e-
tec
hn
ik.t
u-d
arms
tadt
.de
http:/
/www
.tk
.info
rma
tik
.tu-d
arms
tadt
.de
R.
Ste
inme
tz,
M.
Mhlhuser
Represent units of aural, visual or audiovisual content "audio/visual objects" or AVOs
object coding independent of other objects, surroundings and background Compose these objects together
i.e. creation of compound objects that form audiovisual scenes
Multiplex and synchronize the data associated with AVOs
for transportation over network channels providing QoS (Quality-of-Service)
2: support synthetic objects
computer-gen. (VR), synthesized (txt2speech), model-based ("face")
3: support truly interactive applications (more than play/pause/rewind..)
Interact with the audiovisual scene generated at the decoders site
Rhubarb
Audioobject 1 video objects
12
3
Audioobject 2
Rhubarb
MPEG-4: Scope
Definition of
http://goback/ -
8/6/2019 05A Compression
71/102
05A-compression.fm 71 15.March.01
Scope
Contents
http:/
/www
.kom
.e-
tec
hn
ik.t
u-d
arms
tadt
.de
http:/
/www
.tk
.info
rma
tik
.tu-d
arms
tadt
.de
R.
Ste
inme
tz,
M.
Mhlhuser
System Decoder Model specification for decoder implementations
Description language
binary syntax of an AVOs bitstream representation
scene description information
Corresponding concepts, tools and algorithms,especially for
content-based compression of simple and compound audiovisual objects
manipulation of objects
transmission of objects
random access to objects
animation
scaling
error robustness
e e r
MPEG-4: Scope (cont.)
Targeted bit rates for video and audio:
http://goback/ -
8/6/2019 05A Compression
72/102
05A-compression.fm 72 15.March.01
Scope
Contents
http:/
/www
.kom
.e-
tec
hn
ik.t
u-d
arms
tadt
.de
http:/
/www
.tk
.info
rma
tik
.tu-d
arms
tadt
.de
R.
Ste
inme
tz,
M.
Mhlhu
ser
VLBV core Very Low Bit-rate Video
5 - 64 Kbit/s
image sequences with up to CIF resolution and up to 15 frames/s
Higher-quality video
64 Kbit/s - 4 Mbit/s quality like digital TV
Natural audio coding
2 - 64 Kbit/s
e e r
MPEG-4: Video and Image Encoding
Encoding / decoding of
http://goback/ -
8/6/2019 05A Compression
73/102
05A-compression.fm 73 15.March.01
Scope
Contents
http:/
/www
.kom
.e-
tec
hn
ik.t
u-d
arms
tadt
.de
http:/
/www
.tk
.info
rma
tik
.tu-d
arms
tadt
.de
R.
Ste
inme
tz,
M.
Mhlhu
ser
Rectangular imagesand video
coding similar toMPEG-1/2
motion prediction
texture coding Images and video of
arbitrary shape
as done inconventional
approach 8x8 DCT or shape-adaptive DCT
plus coding of shape and transparency information
Encoder
Must generate timing information
speed of the encoder clock = time base
desired decoding times and/or expiration times
by using time stamps attached to the stream
Can specify the minimum buffer resources needed for decoding
e er
MPEG-4: Composition of Scenes
Scene description includes:
http://goback/ -
8/6/2019 05A Compression
74/102
05A-compression.fm 74 15.March.01
Scope
Contents
http:/
/www
.kom
.e-tec
hn
ik.t
u-d
arms
tadt
.de
http:/
/www
.tk
.informa
tik
.tu-d
arms
tadt
.de
R.
Ste
inme
tz,
M.
Mhlhu
se
Tree to define hierarchical relationships between objects
Objects positions in space and time
by converting the objects local coordinate system into a global coordinatesystem
Attribute value selection
e.g. pitch of sound, color, texture, animation parameters
Description based on some VRML concepts
VRML = Virtual Reality Modelling Language
Interaction with scenes e.g. change viewing point, drag object, start/stop streams, select
language
RhubarbRhubarb
primitive AVO
compound object
compound object
05A-compression
Scope
Contents
http://www.kom.e-technik.tu-darmstadt.dehttp://www.tk.informatik.tu-darmstadt.de
R. Steinmetz, M. Mhlhuser
http://goback/http://goback/ -
8/6/2019 05A Compression
75/102
n.fm7515.March.01
MPE
G-4:Exampleofa
Composi
tion
ee er
MPEG-4: Scaling
Three approaches:
-
8/6/2019 05A Compression
76/102
05A-compression.fm 76 15.March.01
Scope
Contents
http:/
/www
.kom
.e-tec
hn
ik.t
u-d
arms
tadt
.d
http:/
/www
.tk
.informa
tik
.tu-d
arms
tadt
.d
R.
Ste
inme
tz,
M.
Mhlhu
se
Spatial scalability decoder displays textures and visual objects at a reduced spatial resolution
by decoding only a subset of the total bit stream
32 levels max. for textures and still images
3 levels max. for video sequences
Temporal scalability decoder displays video at a reduced temporal resolution
by decoding only a subset of the total bit stream
3 levels max.
Quality scalability
bitstream is parsed into a number of bit stream layers of different bit-rates
either during transmission or in the decoder
subset of the layers still yields a meaningful signal
Spatial and temporal scaling both for
Conventional rectangular display and Objects with arbitrary shape
de
de e
r
MPEG-4: Synthetic Objects
Visual objects:
http://goback/ -
8/6/2019 05A Compression
77/102
05A-compression.fm 77 15.March.01
Scope
Contents
http:/
/www
.kom
.e-
tec
hn
ik.t
u-d
arms
tadt.d
http:/
/www
.tk
.informa
tik
.tu-d
arms
tadt.d
R.
Ste
inme
tz,
M.
Mhlhu
se
Human face start object: neutral-expression face
animated via FDPs and/or FAPs
FAP (facial anim param): animate current display
FDP (facial def. param): alternative shape/texture
Mesh + texture mapping: for 2D & 3D meshes 2D mesh may also be used for human face anim., see above
only triangular 2D meshes, vertices may be moved (mv!), texture is warped
e.g. virtual background
Texture coding for view-dependent applications
texture, e.g. virt. background; decoder/encoder loop for "minimal" Xmission
Audio objects:
Text-to-speech
speech generation from given text and prosodic parameters
face animation control Score driven synthesis
music generation from a score
more general than MIDI
Special effects
de
de e
r
MPEG-4: Layered Networking Architecture
Display / Recording
http://goback/ -
8/6/2019 05A Compression
78/102
05A-compression.fm 78 15.March.01
Scope
Contents
http:/
/www
.kom
.e-
tec
hn
ik.t
u-d
arms
tadt.d
http:/
/www
.tk
.informa
tik
.tu-d
arms
tadt.d
R.
Ste
inme
tz,
M.
Mhlhu
s
Access Units
Adaptation Layer
FlexMux Layer
Elementary Streams
Multiplexed Streams
Network or Local Storage
e.g. video or audio framesor scene description commands
A/V object data+ stream type info, sync. info, QoS req.,...
e.g. multiple elementary streams
with similar QoS requirements
Flexible Multiplexing
Transport Multiplexing
- only interface specifiedTransMux Layer
- layer itself can be any network,e.g. RTP/UDP/IP, AAL5/ATM
CoDecCoDecCoDecCoDec
Media
Coding / Decoding
de
de
ser
MPEG-4: Layered Networking Architecture (cont.)
DMIF Delivery Multimedia Integration Framework
Allows to establish multiple party sessions
http://goback/ -
8/6/2019 05A Compression
79/102
05A-compression.fm 79 15.March.01
Scope
Contents
http:/
/www
.kom
.e-
tec
hn
ik.t
u-d
arms
tadt.d
http:/
/www
.tk
.informa
tik
.tu-d
arms
tadt.d
R.
Ste
inme
tz,
M.
Mhlhu
s Allows to establish multiple party sessions
interaction with
remote interactive peers
broadcast systems
storage systems
establishment of channels with specific QoSs and bandwidths Controls
FlexMux layer
TransMux layer
de
de
ser
MPEG-4: Error Handling
Mobile communication:
Low bit-rate (< 64 Kbps)
http://goback/ -
8/6/2019 05A Compression
80/102
05A-compression.fm 80 15.March.01
Scope
Contents
http:/
/www
.kom
.e-
tec
hn
ik.t
u-d
arms
tadt.
http:/
/www
.tk
.informa
tik
.tu-d
arms
tadt.
R.
Ste
inme
tz,
M.
Mhlhu
s Low bit rate (< 64 Kbps)
Error-prone
MPEG-4 concepts for error handling:
Resynchronization
enables receiver to tune in again
based on markers within bitstream Data recovery
enables receiver to reconstruct lost data
encode data in an error-resilient manner
Error concealment enables receiver to bridge gaps in data
e.g. by repeating parts of old frames
.de
.de
ser
12. Wavelets
Motivation
http://goback/ -
8/6/2019 05A Compression
81/102
05A-compression.fm 81 15.March.01
Scope
Contents
http:/
/www
.kom
.e-
tec
hn
ik.t
u-d
arms
tadt.
http:/
/www
.tk
.informa
tik
.tu-d
arms
tadt.
R.
Ste
inme
tz,
M.
Mhlhu
s
JPEG / DCT problems:
DCT not applicable to whole image, but only to small blocks block structure becomes visible at high compression ratios
Scaling as add-on additional effort
DCT function is fixed can not be adapted to source data
Improvements by using Wavelets:
Transformation of the whole image
overcomes visible block structures and introduces inherent scaling
Better identification of which data is relevant to human perception
higher compression ratio
.de
.de
ser
Wavelets: Compression / Decompression
Compressor
http://goback/ -
8/6/2019 05A Compression
82/102
05A-compression.fm 82 15.March.01
Scope
Contents
http:/
/www
.kom
.e-
tec
hn
ik.t
u-d
arms
tadt
http:/
/www
.tk
.inf
orma
tik
.tu-d
arms
tadt
R.
Ste
inme
tz,
M.
Mhlhu
The same overall structure as for DCT-based algorithms
But: important differences in the transformation step
Quantizer Encoder
Inverse WaveletTransformation
DeQuantizer Decoder
Forward WaveletTransformation
Decompressor
t.de
t.de
ser
Wavelets: Fundamental Idea
Image is transformed into the frequency domain (as in JPEG)
http://goback/ -
8/6/2019 05A Compression
83/102
05A-compression.fm 83 15.March.01
Scope
Contents
http:/
/www
.kom
.e-
tec
hn
ik.t
u-d
arms
tadt
http:/
/www
.tk
.inf
orma
tik
.tu-d
arms
tadt
R.
Ste
inme
tz,
M.
Mhlhu But: based on Wavelet functions instead of cosine functions
Advantage: Wavelets are 0 outside a limited interval
Wavelet automatically relates only to a part of the image Image needs not be splitted into blocks
"Frequencies"??? :
Use Wavelet family: {2-j/2*(2-j*x-k)}, j,k Z, being a Wavelet
cosine:
......
Wavelet e.g.:
t.de
t.de
ser
Wavelets: Transformation Steps
"Discrete Wavelet Transformation" (Mallat, 1989)
http://goback/ -
8/6/2019 05A Compression
84/102
05A-compression.fm 84 15.March.01
Scope
Contents
http:/
/www
.kom
.e-
tec
hn
ik.t
u-d
arms
tadt
http:/
/www
.tk
.inf
orma
tik
.tu-d
arms
tadt
R.
Ste
inme
tz,
M.
Mhlhu Split image recursively by using high and low pass filters
c1
d11
d12
d13
L
H
L
H
L
H
. . .
read bycolumn (vert. op.)
L Low Pass + downsampling
H High Pass + downsampling
line (horiz.
read by
lowerfrequencies
transformedimage withreduced size
higherfrequencies
operations
t.de
t.de
ser
Wavelets: Transformation Steps (cont.)
In each step i:
Three images d i (x=1,2,3):
http://goback/ -
8/6/2019 05A Compression
85/102
05A-compression.fm 85 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tad
http:/
/www
.tk
.inf
orma
tik
.tu-d
arms
tad
R.
Ste
inme
tz,
M.
Mhlhu x containing the high frequency parts of the image
representing "details" of the image
submitted to Wavelet transformation
or thrown away in case of scaling
One image ci: containing the lower frequency parts of the image
representing the original image with less details / at a lower resolution
submitted to step i+1
Up to here: 4 images with 1/4 resolution each --> no compression!but again: decorrelation: many coefficients in d-images (close to) zero
Afterwards:
Quantization
Entropy encodingas with DCT
t.de
t.de
ser
Wavelets: DWT compared with DCT
Advantages of DWT over DCT:
No block artefacts
http://goback/ -
8/6/2019 05A Compression
86/102
05A-compression.fm 86 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tad
http:/
/www
.tk
.inf
orma
tik
.tu-d
arms
tad
R.
St
einme
tz,
M.
Mhlhu
Inherent scaling
based on the dxi for i=1,2,3,...
Lower time complexity for the transformation
DCT: O(n*logn),
DWT: O(n) (n=number of values to be transformed) Higher flexibility: Wavelet function can be freely chosen
(but: howto choose?)
t.de
t.de
ser
Wavelets: Further Issues
Edge detection reduces high frequencies:
First extract detected edges
http://goback/ -
8/6/2019 05A Compression
87/102
05A-compression.fm 87 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tad
http:/
/www
.tk
.informa
tik
.tu-d
arms
tad
R.
St
einme
tz,
M.
Mhlhu
Then apply wavelets to such a filtered image
Application to video:
In-2In-1
Image n
Imt
Compute
differences
...In-1 - In-2
In - In-1
...t
Wavelet
compressor
t.de
t.de
ser
13. Fractal Image Compression
Fractal Geometry was first applied to image generation
http://goback/ -
8/6/2019 05A Compression
88/102
05A-compression.fm 88 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tad
http:/
/www
.tk
.informa
tik
.tu-d
arms
tad
R.
St
einme
tz,
M.
Mhlhu
remember "Mandelbrot" images recursive construction of images
infinite granularity (i.e. zoom-in), but compact "image data" (formula)(such forms are called fractals)
Zi
= RealConst. * Zi-1
+ ComplexConst
d
t.de
d
t.de
user
Use of Fractals for Compression??? Overview (1)
observation: self-similarities in natural images(clouds, dunes, beaches: zoom-in reveals similar forms as large image)
idea can nat ral images be described / fractal geometr ??
http://goback/ -
8/6/2019 05A Compression
89/102
05A-compression.fm 89 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tad
http:/
/www
.tk
.informa
tik
.tu-d
arms
tad
R.
St
einme
tz,
M.
Mhlh idea: can natural images be described w/ fractal geometry??
first published by Barnsley & Sloan (88), first impl. 89 by Arnaud Joquin
Key #1: Iterated Function Systems IFS:
input (sub-)picture subject to math. transform. of type picture moved, rotated / mirrored, and contracted
--> all transformations are "contractions"
Key #2: Banachs Fixed Point Theorem: apply a set Wimg={Wi} of contractions to an image
after infinitely many applications, a specific image appears
... called "attractor" or "fractal"
this process is independent of initial "start" image!!
human perception: iteration can stop "pretty soon" (finite no. of iterations)
Q: how to find Wimg such that attractor is image-to-be-compressed?
+ f
e
y
x
dc
ba
ad
t.de
ad
t.de
user
Use of Fractals for Compression??? Overview (2)
Key #3: Collages Theorem:
in order to find Wimg
as above: search Wimg
such that image is(almost) transformed into itself!
http://goback/ -
8/6/2019 05A Compression
90/102
05A-compression.fm 90 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tad
http:/
/www
.tk
.informa
tik
.tu-d
arms
tad
R.
St
einme
tz,
M.
Mhlh img img(almost) transformed into itself!
First algorithm published (Joaquin):
partition image into (small, non-overlapping) "range blocks"
search (larger, overlapping) "domain blocks" which can be"contracted" into range blocks
for each range block, find domain block and contraction(lots of possibilities!!)
details / simplifications of Joaquin approach see below
ad
t.de
ad
t.de
user
To apply self-similarity: Image Generation
Examples
(from TUD + Univ. Bochum)
for recursive
http://goback/ -
8/6/2019 05A Compression
91/102
05A-compression.fm 91 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tad
http:/
/www
.tk
.informa
tik
.tu-d
arms
tad
R.
Steinme
tz,
M.
Mhlh for recursive
contruction of images
Sirpinky triangle
to produce self-
similar structures infinite steps appliedto different sourceimages lead to sameresult
known asSirpinski-triangle
"Grenzwert" alsoknown as attractor
ad
t.de
ad
t.de
huser
To Find Self-Similarities
affine function allows for
translation
rotation
http://goback/ -
8/6/2019 05A Compression
92/102
05A-compression.fm 92 15.March.01
Scope
Contents
http:/
/www
.kom
.e-t
ec
hn
ik.t
u-d
arms
tad
http:/
/www
.tk
.informa
tik
.tu-d
arms
tad
R.
Steinme
tz,
M.
Mhlh rotation
scaling
brightness (/color) adaptation
IFS:Iterative Function System
ideally completely self-similar
example see right
PIFS:Partitioned Iterative Function System
real images arenot completly self-similar
Wimg?
ad
t.de
ad
t.de
huser
Theoretical Basis
Banachs Fixed Point Theorem:
Let F be a metrical space
Let W: FF be a contractive mapping
http://goback/ -
8/6/2019 05A Compression
93/102
05A-compression.fm 93 15.March.01
Scope
Contents
http:/
/www
.kom
.e
-tec
hn
ik.t
u-d
arms
t
http:/
/www
.tk
.informa
tik
.tu-d
arms
t
R.
Steinme
tz,
M.
Mhlh Let W: FF be a contractive mapping
i.e. there exists an s, 0
-
8/6/2019 05A Compression
94/102
05A-compression.fm 94 15.March.01
Scope
Contents
http:/
/www
.kom
.e
-tec
hn
ik.t
u-d
arms
t
http:/
/www
.tk
.informa
tik
.tu-d
arms
t
R.
Steinme
tz,
M.
Mhlh Decompression: Apply Wimg iteratively to any image easy
Stop when error falls below some bound
Error can be calculated by "Collage Theorem"
tad
t.de
tad
t.de
huser
How to Find Wimg? Joaquins Approach
Systematic search based on"Partitioned Iterative Function System (PIFS)"
Partition image into "range blocks" Ri
http://goback/ -
8/6/2019 05A Compression
95/102
05A-compression.fm 95 15.March.01
Scope
Contents
http:/
/www
.kom
.e
-tec
hn
ik.t
u-d
arms
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
R.
Steinme
tz,
M.
Mhl Partition image into range blocks Ri
8*8 pixel blocks
non-overlapping
Consider all "domain blocks" Dj of double size
16*16 pixel blocks overlapping
Find for each Ri the most similar Dj consider rotations (0o/90o/180o/270o) and mirroring
adapt brightness and contrast of Dj to that of Ri
translation, rotation, mirroring, brightness/contrast adaptationdefine a (partial) affine function
Combine partial functions to Wimg Compression rate? Example: for each (8*8) range block:
contraction factor fixed
3 bit for transformation
16 bit for domain block coordinates
12 bit for brightness/contrast adaptation
--> factor is 8x8x8 : 31= 512:31 (cf. JPEG example)
http://goback/ -
8/6/2019 05A Compression
96/102
-
8/6/2019 05A Compression
97/102
stad
t.de
stad
t.de
hlh
user
14. Basic Audio and Speech Coding Schemes
Voice encoder/decoder: "vocoder"
Background ITU driven activities
-
8/6/2019 05A Compression
98/102
05A-compression.fm 98 15.March.01
Scope
Contents
http:/
/www
.kom
.e
-tec
hn
ik.t
u-d
arms
http:/
/www
.tk
.in
forma
tik
.tu-d
arms
R.S
teinme
tz,
M.
Mh
g ITU driven activities
G.711: PCM
with 64 kbps
G.722 differential PCM (DPCM)
48, 56, 64 kbps
G.723
Multipulse-maximum Likelihood Quatizer (MP-MLQ): 6,3 kbps
Algebraic Codebook Excitation Linear Prediction (ACELP) 5,3 kbps
application: speech
stad
t.de
stad
t.de
hlh
user
Schemes for Speech Coding