image and video compression - school of industrial engineering and
TRANSCRIPT
Image and Video CompressionEE398
Bernd GirodInformation Systems Laboratory
Department of Electrical EngineeringStanford University
Winter 2004/05
Image and Video Compression Enables . . .
Bernd Girod: EE398 Image and Video Compression Introduction no. 2
Fax machinesDigital still camerasDigital camcordersDigital television broadcastingDigital video/versatile disk (DVD)Personal video recorder (PVR, aka TiVo)World Wide WebInternet video streamingVideo conferencing. . .
Motivating Image Compression
Bernd Girod: EE398 Image and Video Compression Introduction no. 3
Binary image (facsimile Group 3)8.5 x 11 in document scanned at 7.7 lines/mm (“fine mode”), 1664pixels/line, with 1 bit/pixel3.255 Mbits for 1 page = 5.65 minutes over 9600 baud connection
Photos on 35 mm filmScanned at 12µ resolution (3656x2664 pixels) with 8 bits per color and 3 colors233 Mbits for 1 photo, 2/3 of 48 Mbyte compact flash card
Motivating Video Compression
Bernd Girod: EE398 Image and Video Compression Introduction no. 4
Y Cr CbSampling rate 13.5 MHz 6.75 MHz 6.75 MHzQuantization 8 bit 8 bit 8 bitRaw bit rate 216 MbpsW/o blanking intervals 166 Mbps
Digital video studio standard ITU-R Rec. 601
Some interesting bit-ratesTerrestial TV broadcasting channel ~20 MbpsComputer hard disk 20...40 MbpsDVD (max. 17 GB/length of movie) 10...20 MbpsEthernet/Fast Ethernet <10/100 MbpsDSL downlink 384...2048 kbpsDial-up modem 14.4…56 kbpsWireless cellular data 9.6...112 kbps
Introductory lecture
Bernd Girod: EE398 Image and Video Compression Introduction no. 5
Goal: provide a first introduction of some key concepts of image compression, without rigorous treatmentLossless vs lossy compressionMeasuring distortion and compressionStatistical and visual redundancyNeed for structured compression schemes
EE398 Organisation
Digital Image: Quantized Array of Samples
Bernd Girod: EE398 Image and Video Compression Introduction no. 6
Rounding to nearest integer
[ ] [ ]1 2 1 2, 2 ,Bx n n x n n′=
Real-valued intensities, range
0…1 or –1/2…+1/2
1 1 2 20 , 0 n N n N≤ < ≤ <
2n2 1N −
1 1N −
00 1
1
2
2
1n[ ]1 2,x n n
Multiple image components
Bernd Girod: EE398 Image and Video Compression Introduction no. 7
Color images typically represented by three values per sample location, for red, green and blue primary components
General multi-component image
Examples:Color printing: cyan, magenta, yellow, black dyes, sometimes moreHyperspectral satellite imaging: 100s of channels
[ ] [ ] [ ]1 2 1 2 1 2, , , , , R G Bx n n x n n x n n
[ ]1 2, , 1, 2, , Cx n n c C= K
Lossless compression
Bernd Girod: EE398 Image and Video Compression Introduction no. 8
Minimize number of bits required to represent original digital image samples w/o any loss of information.All B bits of each sample must be reconstructed perfectly.Achievable compression usually rather limited.Applications
Binary images (facsimile)Medical imagesMaster copy before editingPalettized color images
Lossy compression
Bernd Girod: EE398 Image and Video Compression Introduction no. 9
Some deviation of decompressed image from original(“distortion”) is often acceptable:
Human visual system might not perceive loss, or tolerate it.Digital input to compression algorithm is imperfect representation of real-world scene
Much higher compression than with lossless.Lossy compression used widely for natural images (e.g. JPEG) and motion video (e.g. MPEG).
Lossy compression: measuring distortion
Bernd Girod: EE398 Image and Video Compression Introduction no. 10
Most commonly employed: Mean Squared Error
. . . or, equivalently, Peak Signal to Noise Ratio
AdvantagesEasy calculationMathematical tractability in optimization problems
DisadvantageNeglects properties of human vision
[ ] [ ]( )1 2
1 2
1 12
1 2 1 20 01 2
1 ˆMSE= , ,N N
n nx n n x n n
N N
− −
= =
−∑ ∑
( )2
10
2 1PSNR=10log dB
MSE
B −
Measures of compression
Bernd Girod: EE398 Image and Video Compression Introduction no. 11
Image represented by “bit-stream” c of length ||c||.Compare no. of bits w/ and w/o compression
Alternatively
For lossy compression, bit-rate more meaningful than compression ratio, as B is somewhat arbitrary.
1 2compression ratio = N N Bc
1 2
bit-rate = bits/pixelN N
c
Typical bit-rates after compression
Bernd Girod: EE398 Image and Video Compression Introduction no. 12
Substantially dependent on image content: consider typical natural imagesLossless compression: (B-3) bpp (bits per pixel)Assume viewing on computer monitor, 90 pixels/inch.Lossy compression,
high quality: 1 bppmoderate quality: 0.5 bppusable quality: 0.25 bpp
Perceived distortion depends on sampling density and contrast
How does compression work?
Bernd Girod: EE398 Image and Video Compression Introduction no. 13
Exploit statistical redundancy.Take advantage of patterns in the signal.Describe frequently occuring events efficiently.Lossless coding: only statistical redundancy
Introduce acceptable deviations.Omit “irrelevant” detail that humans cannot perceive.Match the signal resolution (in space, time, amplitude) to the applicationLossy coding: exploit statistical and visual redundancy
Statistical redundancy
Bernd Girod: EE398 Image and Video Compression Introduction no. 14
Trivial example: Given two B-bit integers (e.g., representing two adjacent pixels)
Assume that only takes on values {0,1}Compression to 1 bpp
Further assume, that Compression to 0.5 bpp
Hope: bit-rate increases only slightly, as long as the above assumptions hold with high probability
{ }1 2, 0,1, , 2 1Bx x ∈ −K
1 2,x x
1 2x x=
Bernd Girod: EE398 Image and Video Compression Introduction no. 15
Joint histogram of two horizontally adjacent pixels
( ‘Lena’, 512 x 512 pixels, 8 bpp)
100 200 300 400 500
100
200
300
400
500
Amplitude of current pixel
Amplitude of adjacent pixel
PMF
Visual Redundancy
Bernd Girod: EE398 Image and Video Compression Introduction no. 16
For images to be viewed by humans, no need to represent more than the visible resolution in
spacetimebrightnesscolor
Required resolution might depend on image content (“masking”)For some applications, only a specific region of the image might be relevant, e.g., in
medical imagingmilitary imaging
Exploitating limitations of color vision
Bernd Girod: EE398 Image and Video Compression Introduction no. 17
Human visual system has much lower acuity for color hue and saturation than for brightnessUse color transform to facilitate exploiting that property
Note
Cb and Cr often sub-sampled 2:1 relative to Y.
0.299 0.587 0.1140.169 0.331 0.50.5 0.419 0.0813
Y R
Cb G
Cr B
x xx xx x
⎛ ⎞ ⎛ ⎞ ⎛ ⎞⎜ ⎟ ⎜ ⎟ ⎜ ⎟= − − ⋅⎜ ⎟ ⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟ ⎜ ⎟− −⎝ ⎠ ⎝ ⎠ ⎝ ⎠
RGB components (γ-predistorted)
Chrominancecomponents
( ) ( )0.564 0.713Cb B Y Cr R Yx x x x x x= − = −
Luminancecomponent
Compression as a global mapping
Bernd Girod: EE398 Image and Video Compression Introduction no. 18
( )compressor
M=c x ( )1
decompressor
ˆ M −=x c
image x bit-stream c
reconstructed ˆimage x
Lossless compression
Lossy compression
1 2
lookup table2 entriesN N B
lookup table
2 entriescfixed length cLookup table
interpretation
1 1M M− −=
( ) ( )( )1arg min ,M D M −
′′= =
cc x x c
Typical structured compression system
Bernd Girod: EE398 Image and Video Compression Introduction no. 19
( )transform
T=y x ( )quantizer
Q=q y ( )encoder
C=c qsamples yimage x indices q
( )1
inversetransform
ˆ ˆT −=x y ( )1
dequantizer
ˆ Q−=y q ( )1
decoderC−=q c
indices qˆsamples yreconstructed
ˆimage x
bit-stream c
Transform T(x) usually invertibleQuantization not invertible, introduces distortionCombination of encoder and decoder lossless
( )Q y( )C q ( )1C− c
Outline EE398
Bernd Girod: EE398 Image and Video Compression Introduction no. 20
Entropy and lossless coding techniquesRun-length coding, fax standardsArithmetic codingRate-distortion limits and quantizationLossless and lossy predictive codingTransform coding, JPEG standard Subband coding, wavelets, JPEG-2000Motion compensated coding, MPEG standards
EE398 Organisation
Bernd Girod: EE398 Image and Video Compression Introduction no. 21
Regularly check class home page: http://www.stanford.edu/class/ee398
AssistantsGeneral TA: David Rebollo-MonederoISE lab TA: Shantanu RaneCourse assistant: Kelly Yilmaz
EE398 Organisation (cont.)
Bernd Girod: EE398 Image and Video Compression Introduction no. 22
Homeworks7 problem sets, require computer + MatlabHanded out Tuesdays, due one week later, 2:45 p.m.
Grading Homeworks: 50%Term project: 50%No mid-term, no final
EE398B Term Projects - General
Bernd Girod: EE398 Image and Video Compression Introduction no. 23
Work in groups of 2-3 students, 40-50 hours per personProject proposal required, deadline: February 3Class-room presentations of projects: March 8/10Project report due: March 10Project grade based on
Technical quality, significance, and originality of results 50%Project report 25%Class-room presentation 25%
ISEP laboratory
Bernd Girod: EE398 Image and Video Compression Introduction no. 24
Created by equipment grants from Hewlett-Packard, Xerox, and IntelExclusively a teaching laboratoryLocation: Packard room 06620 Linux PCs, 2 Windows PCs, scanners, printers etc.Access:
door combination for lab entry will be provided by ISE Lab TAStanford ID chip card for after-hour entry of Packard buildingAccount on ise machine will be provided to all enrolled in class
Further reading
Bernd Girod: EE398 Image and Video Compression Introduction no. 25
Slides available as hand-outs and as pdf files on the webRecommended text books
D. S. Taubman, M. W. Marcellin, „JPEG2000 – Image Compression Fundamentals, Standards, and Practice,“ Kluwer Academic Publishers, 2002.Y. Wang, J. Ostermann, Y.-Q. Zhang, „Video Processing and Communications,“ Prentice-Hall, 2002.
Selected papers will be posted on Web site
Further reading (cont.)
Bernd Girod: EE398 Image and Video Compression Introduction no. 26
Reference books on image and video compressionA. N. Netravali, B.G. Haskell, "Digital Pictures - Representation and Compression", 2nd edit., New York, London: Plenum Press, 1995. Comprehensive standard book covering television standards and digital image compression. Has been augmented compared to the 1stedition from 1988, particularly to discuss the more recent standards, the greater part of the book, however, reflects the state-of-the-art of the mid-80s. Nevertheless, a must-have for image system engineers.
W. Pennebaker, J. Mitchell, "JPEG Still Image Data Compression Standard", Van Nostrand Reinhold, New York, 1993. THE source to read about JPEG, but also a nice presentation of basic material you need to understand the rationale behind it.
J. Mitchell, W. Pennebaker, C. Fogg, D. LeGall, "MPEG Video Compression Standard," Chapman & Hall, New York, 1996. Discusses MPEG-2 in detail, and some of the source coding principles, but those rather briefly. A book for practicioners.
B. Haskell, A. Puri, A. Netravali, "Digital Video: An Introduction to MPEG-2," Chapman & Hall, New York, 1996.Comprehensive coverage of MPEG-2, and also includes a chapter about MPEG-4. Some sections from Netravali & Haskell's "Digital Pictures" are included to provide background.
V. Bhaskaran, K. Konstantinides, "Image and Video Compression Standards," Kluwer Academic Publishers, 1995. Introduction to standards JPEG, MPEG-1 and MPEG-2, H.261. Emphasizes the foundations of the standards, rather than details. Predictive coding, transform coding, motion estimation and compensation, entropy coding.
A. K. Jain, "Fundamentals of Digital Image Processing", Prentice-Hall, 1989. Very readable and sound book that is popular as a text book for image processing classes. A lot of image processing material beyond compression.
Fundamental books that are not image/video specific:A. Gersho and R.M. Gray, "Vector Quantization and Signal Compression," Kluwer Academic Press, 1992. Principles and algorithms for
digital source coding, with applications to images, speech, and audio. The most comprehensive and substantial source coding reference, but very readable nevertheless. State-of-the-art.
N. S. Jayant, P. Noll, "Digital Coding of Waveforms," Prentice-Hall, 1984. In-depth coverage of algorithms for digital source coding, with emphasis on principles. Specific applications mostly to speech, but also to image. Scalar quantization, predictive coding, subband coding, transform coding. 15 years old, but nevertheless a valuable addition to the source coder‘s library.