image and video compression - school of industrial engineering and

27
Image and Video Compression EE398 Bernd Girod Information Systems Laboratory Department of Electrical Engineering Stanford University Winter 2004/05

Upload: others

Post on 03-Feb-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Image and Video CompressionEE398

Bernd GirodInformation Systems Laboratory

Department of Electrical EngineeringStanford University

Winter 2004/05

Image and Video Compression Enables . . .

Bernd Girod: EE398 Image and Video Compression Introduction no. 2

Fax machinesDigital still camerasDigital camcordersDigital television broadcastingDigital video/versatile disk (DVD)Personal video recorder (PVR, aka TiVo)World Wide WebInternet video streamingVideo conferencing. . .

Motivating Image Compression

Bernd Girod: EE398 Image and Video Compression Introduction no. 3

Binary image (facsimile Group 3)8.5 x 11 in document scanned at 7.7 lines/mm (“fine mode”), 1664pixels/line, with 1 bit/pixel3.255 Mbits for 1 page = 5.65 minutes over 9600 baud connection

Photos on 35 mm filmScanned at 12µ resolution (3656x2664 pixels) with 8 bits per color and 3 colors233 Mbits for 1 photo, 2/3 of 48 Mbyte compact flash card

Motivating Video Compression

Bernd Girod: EE398 Image and Video Compression Introduction no. 4

Y Cr CbSampling rate 13.5 MHz 6.75 MHz 6.75 MHzQuantization 8 bit 8 bit 8 bitRaw bit rate 216 MbpsW/o blanking intervals 166 Mbps

Digital video studio standard ITU-R Rec. 601

Some interesting bit-ratesTerrestial TV broadcasting channel ~20 MbpsComputer hard disk 20...40 MbpsDVD (max. 17 GB/length of movie) 10...20 MbpsEthernet/Fast Ethernet <10/100 MbpsDSL downlink 384...2048 kbpsDial-up modem 14.4…56 kbpsWireless cellular data 9.6...112 kbps

Introductory lecture

Bernd Girod: EE398 Image and Video Compression Introduction no. 5

Goal: provide a first introduction of some key concepts of image compression, without rigorous treatmentLossless vs lossy compressionMeasuring distortion and compressionStatistical and visual redundancyNeed for structured compression schemes

EE398 Organisation

Digital Image: Quantized Array of Samples

Bernd Girod: EE398 Image and Video Compression Introduction no. 6

Rounding to nearest integer

[ ] [ ]1 2 1 2, 2 ,Bx n n x n n′=

Real-valued intensities, range

0…1 or –1/2…+1/2

1 1 2 20 , 0 n N n N≤ < ≤ <

2n2 1N −

1 1N −

00 1

1

2

2

1n[ ]1 2,x n n

Multiple image components

Bernd Girod: EE398 Image and Video Compression Introduction no. 7

Color images typically represented by three values per sample location, for red, green and blue primary components

General multi-component image

Examples:Color printing: cyan, magenta, yellow, black dyes, sometimes moreHyperspectral satellite imaging: 100s of channels

[ ] [ ] [ ]1 2 1 2 1 2, , , , , R G Bx n n x n n x n n

[ ]1 2, , 1, 2, , Cx n n c C= K

Lossless compression

Bernd Girod: EE398 Image and Video Compression Introduction no. 8

Minimize number of bits required to represent original digital image samples w/o any loss of information.All B bits of each sample must be reconstructed perfectly.Achievable compression usually rather limited.Applications

Binary images (facsimile)Medical imagesMaster copy before editingPalettized color images

Lossy compression

Bernd Girod: EE398 Image and Video Compression Introduction no. 9

Some deviation of decompressed image from original(“distortion”) is often acceptable:

Human visual system might not perceive loss, or tolerate it.Digital input to compression algorithm is imperfect representation of real-world scene

Much higher compression than with lossless.Lossy compression used widely for natural images (e.g. JPEG) and motion video (e.g. MPEG).

Lossy compression: measuring distortion

Bernd Girod: EE398 Image and Video Compression Introduction no. 10

Most commonly employed: Mean Squared Error

. . . or, equivalently, Peak Signal to Noise Ratio

AdvantagesEasy calculationMathematical tractability in optimization problems

DisadvantageNeglects properties of human vision

[ ] [ ]( )1 2

1 2

1 12

1 2 1 20 01 2

1 ˆMSE= , ,N N

n nx n n x n n

N N

− −

= =

−∑ ∑

( )2

10

2 1PSNR=10log dB

MSE

B −

Measures of compression

Bernd Girod: EE398 Image and Video Compression Introduction no. 11

Image represented by “bit-stream” c of length ||c||.Compare no. of bits w/ and w/o compression

Alternatively

For lossy compression, bit-rate more meaningful than compression ratio, as B is somewhat arbitrary.

1 2compression ratio = N N Bc

1 2

bit-rate = bits/pixelN N

c

Typical bit-rates after compression

Bernd Girod: EE398 Image and Video Compression Introduction no. 12

Substantially dependent on image content: consider typical natural imagesLossless compression: (B-3) bpp (bits per pixel)Assume viewing on computer monitor, 90 pixels/inch.Lossy compression,

high quality: 1 bppmoderate quality: 0.5 bppusable quality: 0.25 bpp

Perceived distortion depends on sampling density and contrast

How does compression work?

Bernd Girod: EE398 Image and Video Compression Introduction no. 13

Exploit statistical redundancy.Take advantage of patterns in the signal.Describe frequently occuring events efficiently.Lossless coding: only statistical redundancy

Introduce acceptable deviations.Omit “irrelevant” detail that humans cannot perceive.Match the signal resolution (in space, time, amplitude) to the applicationLossy coding: exploit statistical and visual redundancy

Statistical redundancy

Bernd Girod: EE398 Image and Video Compression Introduction no. 14

Trivial example: Given two B-bit integers (e.g., representing two adjacent pixels)

Assume that only takes on values {0,1}Compression to 1 bpp

Further assume, that Compression to 0.5 bpp

Hope: bit-rate increases only slightly, as long as the above assumptions hold with high probability

{ }1 2, 0,1, , 2 1Bx x ∈ −K

1 2,x x

1 2x x=

Bernd Girod: EE398 Image and Video Compression Introduction no. 15

Joint histogram of two horizontally adjacent pixels

( ‘Lena’, 512 x 512 pixels, 8 bpp)

100 200 300 400 500

100

200

300

400

500

Amplitude of current pixel

Amplitude of adjacent pixel

PMF

Visual Redundancy

Bernd Girod: EE398 Image and Video Compression Introduction no. 16

For images to be viewed by humans, no need to represent more than the visible resolution in

spacetimebrightnesscolor

Required resolution might depend on image content (“masking”)For some applications, only a specific region of the image might be relevant, e.g., in

medical imagingmilitary imaging

Exploitating limitations of color vision

Bernd Girod: EE398 Image and Video Compression Introduction no. 17

Human visual system has much lower acuity for color hue and saturation than for brightnessUse color transform to facilitate exploiting that property

Note

Cb and Cr often sub-sampled 2:1 relative to Y.

0.299 0.587 0.1140.169 0.331 0.50.5 0.419 0.0813

Y R

Cb G

Cr B

x xx xx x

⎛ ⎞ ⎛ ⎞ ⎛ ⎞⎜ ⎟ ⎜ ⎟ ⎜ ⎟= − − ⋅⎜ ⎟ ⎜ ⎟ ⎜ ⎟⎜ ⎟ ⎜ ⎟ ⎜ ⎟− −⎝ ⎠ ⎝ ⎠ ⎝ ⎠

RGB components (γ-predistorted)

Chrominancecomponents

( ) ( )0.564 0.713Cb B Y Cr R Yx x x x x x= − = −

Luminancecomponent

Compression as a global mapping

Bernd Girod: EE398 Image and Video Compression Introduction no. 18

( )compressor

M=c x ( )1

decompressor

ˆ M −=x c

image x bit-stream c

reconstructed ˆimage x

Lossless compression

Lossy compression

1 2

lookup table2 entriesN N B

lookup table

2 entriescfixed length cLookup table

interpretation

1 1M M− −=

( ) ( )( )1arg min ,M D M −

′′= =

cc x x c

Typical structured compression system

Bernd Girod: EE398 Image and Video Compression Introduction no. 19

( )transform

T=y x ( )quantizer

Q=q y ( )encoder

C=c qsamples yimage x indices q

( )1

inversetransform

ˆ ˆT −=x y ( )1

dequantizer

ˆ Q−=y q ( )1

decoderC−=q c

indices qˆsamples yreconstructed

ˆimage x

bit-stream c

Transform T(x) usually invertibleQuantization not invertible, introduces distortionCombination of encoder and decoder lossless

( )Q y( )C q ( )1C− c

Outline EE398

Bernd Girod: EE398 Image and Video Compression Introduction no. 20

Entropy and lossless coding techniquesRun-length coding, fax standardsArithmetic codingRate-distortion limits and quantizationLossless and lossy predictive codingTransform coding, JPEG standard Subband coding, wavelets, JPEG-2000Motion compensated coding, MPEG standards

EE398 Organisation

Bernd Girod: EE398 Image and Video Compression Introduction no. 21

Regularly check class home page: http://www.stanford.edu/class/ee398

AssistantsGeneral TA: David Rebollo-MonederoISE lab TA: Shantanu RaneCourse assistant: Kelly Yilmaz

EE398 Organisation (cont.)

Bernd Girod: EE398 Image and Video Compression Introduction no. 22

Homeworks7 problem sets, require computer + MatlabHanded out Tuesdays, due one week later, 2:45 p.m.

Grading Homeworks: 50%Term project: 50%No mid-term, no final

EE398B Term Projects - General

Bernd Girod: EE398 Image and Video Compression Introduction no. 23

Work in groups of 2-3 students, 40-50 hours per personProject proposal required, deadline: February 3Class-room presentations of projects: March 8/10Project report due: March 10Project grade based on

Technical quality, significance, and originality of results 50%Project report 25%Class-room presentation 25%

ISEP laboratory

Bernd Girod: EE398 Image and Video Compression Introduction no. 24

Created by equipment grants from Hewlett-Packard, Xerox, and IntelExclusively a teaching laboratoryLocation: Packard room 06620 Linux PCs, 2 Windows PCs, scanners, printers etc.Access:

door combination for lab entry will be provided by ISE Lab TAStanford ID chip card for after-hour entry of Packard buildingAccount on ise machine will be provided to all enrolled in class

Further reading

Bernd Girod: EE398 Image and Video Compression Introduction no. 25

Slides available as hand-outs and as pdf files on the webRecommended text books

D. S. Taubman, M. W. Marcellin, „JPEG2000 – Image Compression Fundamentals, Standards, and Practice,“ Kluwer Academic Publishers, 2002.Y. Wang, J. Ostermann, Y.-Q. Zhang, „Video Processing and Communications,“ Prentice-Hall, 2002.

Selected papers will be posted on Web site

Further reading (cont.)

Bernd Girod: EE398 Image and Video Compression Introduction no. 26

Reference books on image and video compressionA. N. Netravali, B.G. Haskell, "Digital Pictures - Representation and Compression", 2nd edit., New York, London: Plenum Press, 1995. Comprehensive standard book covering television standards and digital image compression. Has been augmented compared to the 1stedition from 1988, particularly to discuss the more recent standards, the greater part of the book, however, reflects the state-of-the-art of the mid-80s. Nevertheless, a must-have for image system engineers.

W. Pennebaker, J. Mitchell, "JPEG Still Image Data Compression Standard", Van Nostrand Reinhold, New York, 1993. THE source to read about JPEG, but also a nice presentation of basic material you need to understand the rationale behind it.

J. Mitchell, W. Pennebaker, C. Fogg, D. LeGall, "MPEG Video Compression Standard," Chapman & Hall, New York, 1996. Discusses MPEG-2 in detail, and some of the source coding principles, but those rather briefly. A book for practicioners.

B. Haskell, A. Puri, A. Netravali, "Digital Video: An Introduction to MPEG-2," Chapman & Hall, New York, 1996.Comprehensive coverage of MPEG-2, and also includes a chapter about MPEG-4. Some sections from Netravali & Haskell's "Digital Pictures" are included to provide background.

V. Bhaskaran, K. Konstantinides, "Image and Video Compression Standards," Kluwer Academic Publishers, 1995. Introduction to standards JPEG, MPEG-1 and MPEG-2, H.261. Emphasizes the foundations of the standards, rather than details. Predictive coding, transform coding, motion estimation and compensation, entropy coding.

A. K. Jain, "Fundamentals of Digital Image Processing", Prentice-Hall, 1989. Very readable and sound book that is popular as a text book for image processing classes. A lot of image processing material beyond compression.

Fundamental books that are not image/video specific:A. Gersho and R.M. Gray, "Vector Quantization and Signal Compression," Kluwer Academic Press, 1992. Principles and algorithms for

digital source coding, with applications to images, speech, and audio. The most comprehensive and substantial source coding reference, but very readable nevertheless. State-of-the-art.

N. S. Jayant, P. Noll, "Digital Coding of Waveforms," Prentice-Hall, 1984. In-depth coverage of algorithms for digital source coding, with emphasis on principles. Specific applications mostly to speech, but also to image. Scalar quantization, predictive coding, subband coding, transform coding. 15 years old, but nevertheless a valuable addition to the source coder‘s library.

Reading for this chapter

Bernd Girod: EE398 Image and Video Compression Introduction no. 27

Taubman+Marcellin, Chapters 2.1 and 2.2Girod, Gray, Kovacevic, Vetterli, "Image and Video Coding," SP Magazine. March 1998.