Image Processing Architecture, © 2001-2004 Oleh Tretiak Page 1 Lecture 11 ECEC 453 Image Processing Architecture Lecture 11, 2/19/2004 MPEG and Friends Oleh Tretiak Drexel University


ECEC 453 Image Processing Architecture

Lecture 11, 2/19/2004

MPEG and Friends

Oleh Tretiak

Drexel University


Lecture Outline
  Basic Video Coding
  Features of MPEG-1
  Features of H.261
  MPEG-2
  Introduction to MPEG-4


Picture of Layers

[Figure: hierarchy of layers. Sequence layer (GOP-1, GOP-2, ..., GOP-N); GOP layer (I B B P B B ... P); Picture layer (Slice-1, Slice-2, ..., Slice-N); Slice layer (mb-1, mb-2, ..., mb-n); Macroblock layer (Y, Cr, Cb blocks); Block layer]


Video Compression: Picture Types

Group of Pictures: Three types
  I: intraframe coding only
  P: predictive coding
  B: bi-directional coding

[Figure: sequence of pictures numbered 1 to 8, labeled by type I, P, B]


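Typical MPEG coding parameters

Typical sequence: IPBBPBBPBBPBBPBB (16 frames)

Picture   Average size   Compression
I         156000         6.5
P         62000          16.4
B         15000          67.6

\[
\text{Compression (GOP)} = \frac{\text{BitsPerFrameU} \times \text{NFramesPerGOP}}{\text{BitsPerCodedGOP}}
\]

\[
\text{BitsPerCodedGOP} = N_{\text{Iframes}} \times (\text{Bits/Iframe}) + N_{\text{Pframes}} \times (\text{Bits/Pframe}) + N_{\text{Bframes}} \times (\text{Bits/Bframe})
\]

\[
\text{Bits/Iframe} = \frac{\text{BitsPerFrameU}}{C_I}, \quad
\text{Bits/Pframe} = \frac{\text{BitsPerFrameU}}{C_P}, \quad
\text{Bits/Bframe} = \frac{\text{BitsPerFrameU}}{C_B}
\]

\[
\text{Compression (GOP)} = \frac{\text{NFramesPerGOP}}{N_{\text{Iframes}}/C_I + N_{\text{Pframes}}/C_P + N_{\text{Bframes}}/C_B}
= \frac{16}{1/6.5 + 5/16.4 + 10/67.6} = 26.4
\]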
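The GOP arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `gop_compression` and the dictionary layout are illustrative choices, while the frame pattern and the per-type compression ratios are the ones given on the slide.

```python
from collections import Counter


def gop_compression(pattern, ratios):
    """Overall compression of one GOP from per-picture-type ratios.

    A frame of type t costs BitsPerFrameU / ratios[t] bits, so the
    uncompressed frame size cancels out of the final ratio.
    """
    counts = Counter(pattern.upper())  # e.g. {'I': 1, 'P': 5, 'B': 10}
    coded_fraction = sum(n / ratios[t] for t, n in counts.items())
    return len(pattern) / coded_fraction


ratios = {"I": 6.5, "P": 16.4, "B": 67.6}
gop = gop_compression("IPBBPBBPBBPBBPBB", ratios)
print(f"GOP compression: {gop:.1f}")
```

The B frames dominate the frame count but contribute the smallest share of the coded bits, which is why the overall ratio sits well above the I-frame ratio.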


MPEG2 features

Schemes for ‘frame’ and field coding
  There are two fields in a frame, T (top) and B (bottom)
  Either can be first
Frame prediction for frame pictures
  What’s there to say?
Field prediction for field pictures
  Target macroblock is in one field
  Prediction pixels come from one field
  Can be the same or different parity as target field
Field prediction for frame pictures
Dual prime for P-pictures
16x8 macroblock for field pictures
Motion vectors coded at half-pel resolution
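A minimal sketch of what half-pel motion vectors imply for prediction, assuming the usual approach of averaging the neighbouring integer-position samples with rounding. The function name and the doubled-coordinate arguments (`x2`, `y2` are positions in half-pel units) are illustrative and not taken from the lecture.

```python
def predict_half_pel(ref, x2, y2):
    """Sample a reference picture at half-pel position (x2 / 2, y2 / 2).

    ref is a list of rows of integer samples. The caller must keep the
    position inside the picture; no edge padding is done in this sketch.
    """
    x, y = x2 // 2, y2 // 2      # integer part of the position
    hx, hy = x2 % 2, y2 % 2      # 1 if the position falls between samples
    a = ref[y][x]
    b = ref[y][x + hx]
    c = ref[y + hy][x]
    d = ref[y + hy][x + hx]
    # With hx = hy = 0 this returns a; with one half offset it is the
    # rounded mean of two samples; with both, the rounded mean of four.
    return (a + b + c + d + 2) // 4


ref = [[10, 20],
       [30, 40]]
print(predict_half_pel(ref, 0, 0))  # integer position
print(predict_half_pel(ref, 1, 0))  # halfway between 10 and 20
print(predict_half_pel(ref, 1, 1))  # centre of the four samples
```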


MPEG2 - Alternate Scan

[Figure: two coefficient scan patterns, labeled "Zig-zag scan" and "Alternate scan"]
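Since the scan figure did not survive extraction, the conventional zig-zag order for an 8x8 block can be regenerated as a sketch. The helper name `zigzag_order` is hypothetical; the alternate scan is defined by a table in the standard and is not reproduced here.

```python
def zigzag_order(n=8):
    """Return (row, col) pairs of an n x n block in zig-zag scan order.

    Coefficients are visited one anti-diagonal (row + col = s) at a time;
    the direction of travel flips on every diagonal.
    """
    def key(rc):
        r, c = rc
        s = r + c
        # Odd diagonals run top-right to bottom-left (row increasing),
        # even diagonals run bottom-left to top-right (row decreasing).
        return (s, r if s % 2 else -r)

    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


order = zigzag_order()
print(order[:10])
```

The scan starts at the DC coefficient and moves toward higher frequencies, so the trailing run of zero coefficients after quantization tends to be long.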


MPEG-4 Multimedia Standard

Thumbnail Description


What Is Left for MPEG-4?

Initial goals
  Coding standards for lower-than-MPEG-1 rates
  Hidden agenda: Incorporate new coding methods
    Wavelet, fractal
  Revised agenda: Object-based coding
MPEG-4 Architecture
  Input to coder consists of audio, video, and stored objects
  Decoder combines encoded objects with local objects
  Example: send text by sending character codes; the receiver uses a character generator.


Schematic Overview of MPEG-4

[Figure: block diagram with the labels Audio-Video Objects, Encoder, Stored Objects, Mux and Demux (twice), Decoder, Stored Objects, Compositor]


MPEG-4 Ideas

Video Object Plane (VOP)
  A VOP can be a natural image from a video camera or from a graphics database
  A VOP can consist of several visual objects
  Visual objects do not have to have a rectangular outline (arbitrary shape)
A scene consists of several VO’s and VOP’s with appropriate compositing
Different VOP’s can have their own motion
In principle, a visual scene can be decomposed into video objects by segmentation.
Color and texture can be attributes of visual objects
A viewer can manipulate VO’s.
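To illustrate compositing of arbitrarily shaped objects, here is a minimal sketch that pastes one object onto a background using a binary shape mask. The function name `composite`, the list-of-rows image representation, and the binary mask are assumptions made for this example; they are not the MPEG-4 bitstream format.

```python
def composite(background, vop, mask, top, left):
    """Paste a video object onto a background through a binary shape mask.

    background, vop and mask are lists of rows; mask[r][c] is truthy where
    the object is opaque. (top, left) positions the object in the scene.
    Returns a new image and leaves the inputs unchanged.
    """
    out = [row[:] for row in background]
    for r, (vop_row, mask_row) in enumerate(zip(vop, mask)):
        for c, (value, opaque) in enumerate(zip(vop_row, mask_row)):
            y, x = top + r, left + c
            inside = 0 <= y < len(out) and 0 <= x < len(out[0])
            if opaque and inside:
                out[y][x] = value
    return out


background = [[0, 0, 0],
              [0, 0, 0],
              [0, 0, 0]]
vop = [[5, 6],
       [7, 8]]
mask = [[1, 0],      # the object is not rectangular:
        [1, 1]]      # its top-right sample is transparent
scene = composite(background, vop, mask, top=1, left=1)
print(scene)
```

Moving an object in the scene then amounts to changing `top` and `left`, which is the sense in which each object can carry its own motion.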


Animation Objects
  Facial animation
  Body animation
  2-D animation meshes


2-D Animation Mesh


Sprite coding

[Figure: panels labeled Background Plane, Sprite, Sprite, and Composite]


Teleconferencing Standards

Digital video areas
  Broadcast television
  Recorded programs
  Two-way communications


Review: Action in the Video Arena

The sponsors: ITU/T SG 15 and ISO/IEC MPEG
The players: H.x standards and MPEG-x standards
Standards, ITU-T (Telecom Guys)
  H.261 (1990)
  H.263 (draft March 1995)
  New standards in the works
Standards, ISO/IEC (Entertainment Video)
  MPEG family


Review: Video Telephone System

[Figure: H.320 video telephone system, with blocks labeled H.200/AV.250-Series, H.221, and H.261]


Review: H.261 Features

Common Intermediate Format (CIF)
  Interoperability between 25 fps and 30 fps countries
  352 pix/line, 288 lines, 30 fps noninterlaced
  Terminal equipment converts frame and line numbers
  Y Cb Cr components, color sub-sampled by a factor of 2 in both directions
Coding
  DCT, 8x8, 4 Y and 2 chrominance blocks per macroblock
  I and P frames only, P blocks can be skipped
  Motion compensation optional, only integer compensation
  (Optional) forward error correction coding
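The block structure can be made concrete with a short calculation. This sketch assumes the standard CIF luma size of 352 x 288 and a macroblock covering 16 x 16 luma samples (four 8x8 Y blocks plus one Cb and one Cr block after 2:1 chroma subsampling in both directions); the function name is illustrative.

```python
def macroblock_layout(width, height, mb_size=16, block_size=8):
    """Count macroblocks and 8x8 DCT blocks in one picture.

    Assumes each macroblock holds (mb_size / block_size)^2 luma blocks
    plus one Cb block and one Cr block.
    """
    macroblocks = (width // mb_size) * (height // mb_size)
    luma_blocks = (mb_size // block_size) ** 2   # 4 Y blocks
    chroma_blocks = 2                            # 1 Cb + 1 Cr
    per_macroblock = luma_blocks + chroma_blocks
    return {
        "macroblocks": macroblocks,
        "blocks_per_macroblock": per_macroblock,
        "blocks": macroblocks * per_macroblock,
    }


cif = macroblock_layout(352, 288)
print(cif)
```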


H.324/H.263

H.324: Like H.320

[Figure: H.324 system, with blocks labeled H.261/H.263, G.723.1, H.245 signaling, H.233/H.234 encryption, and H.223]


Parts of H.324

H.263: Video coding for low rate communications
G.723.1: Audio and speech for multimedia, 5.3 and 6.3 kbps
H.223: Multiplexing protocol
H.245: Control protocol. Can be used to specify standard, LAN, and ATM networks


Features of H.263

Intended for lower rates than H.261, including 28.8 kbit/sec modem
Includes QCIF (176 x 144) and sub-QCIF format (128 x 96 in Y channel)
Optional error correction for mobile channels
Half-pixel accuracy motion compensation
Differential encoding of motion vectors
Improved coding of DCT coefficients
Optional advanced coding options
better SNR at the same rate, lower rate at the same SNR
50% more complex than basic H.261


Picture Formats for H.263

Format      Image size (Y)   Image size (Cb, Cr)
sub-QCIF    128 x 96         64 x 48
QCIF        176 x 144        88 x 72
CIF         352 x 288        176 x 144
4CIF        704 x 576        352 x 288
16CIF       1408 x 1152      704 x 576
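The chroma sizes in the table follow from halving the luma size in each direction, and the raw data rate shows how much compression a low-rate channel demands. In this sketch the 8 bits per sample and the 10 frames/sec rate are assumptions chosen for illustration; the 28.8 kbit/sec figure is the modem rate quoted for H.263, and the 704 x 576 row is labeled 4CIF.

```python
FORMATS = {
    "sub-QCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}


def chroma_size(width, height):
    """Cb/Cr size when chroma is subsampled by 2 in both directions."""
    return width // 2, height // 2


def raw_bits_per_frame(width, height, bits_per_sample=8):
    """Uncompressed bits for one Y plane plus the Cb and Cr planes."""
    cw, ch = chroma_size(width, height)
    return (width * height + 2 * cw * ch) * bits_per_sample


def required_compression(width, height, fps, channel_bps):
    """Ratio between the raw video rate and the channel rate."""
    return raw_bits_per_frame(width, height) * fps / channel_bps


for name, (w, h) in FORMATS.items():
    print(name, chroma_size(w, h), raw_bits_per_frame(w, h))

# QCIF at an assumed 10 frames/sec over a 28.8 kbit/sec modem
print(required_compression(176, 144, fps=10, channel_bps=28_800))
```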


All JPEG, ~ 12 Kbytes

[Figure: four versions of the same image, labeled 551x369, 389x261, 231x155, and 327x219]


Experimental Procedure

Original image subsampled (using Photoshop®) to various resolutions (pixel number from max to max/8)
Each subsampled image JPEG coded to various quality levels with Matlab®
A group of images with ~ 12 Kbytes per image is compared
Result: Subsampling + JPEG coding is better, at given total bits, than just JPEG coding
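The trade-off behind this experiment can be seen by computing how many coded bits each pixel receives at a fixed file size. The sketch below uses the four image sizes shown on the preceding slide and assumes 1 Kbyte = 1024 bytes; it only does the arithmetic and does not reproduce the experiment itself.

```python
SIZES = [(551, 369), (389, 261), (327, 219), (231, 155)]
BUDGET_BYTES = 12 * 1024   # assumption: "12 Kbytes" means 12 * 1024 bytes


def bits_per_pixel(width, height, nbytes=BUDGET_BYTES):
    """Coded bits available per pixel when the file size is fixed."""
    return nbytes * 8 / (width * height)


for w, h in SIZES:
    print(f"{w}x{h}: {bits_per_pixel(w, h):.2f} bits/pixel")
```

Smaller images get several times more bits per pixel, so they can be coded at a higher JPEG quality setting for the same total file size.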


Future of Low-Rate Video

Solution looking for a user
‘Picturephone’ - not popular
  Liked by inventors, surveys of the public less than enthusiastic
Videoconferencing: some success, but limited acceptance
What is needed to make it successful?


Video Coding Trials

MPEG-1 encoder
  http://bmrc.berkeley.edu/frame/research/mpeg/mpeg_encode.html
Set encoder parameters
  Picture sequence
  Motion compensation search range
  Motion compensation algorithm
  Quantizer parameters for I, P, B
Three trials

Pattern       Quantizer (I, P, B)   Output (bits/sec @ 30 fps)
ibbpbbpbbp    8, 10, 25             795096
ibbpbbpbbp    31, 31, 31            311856
ippppppppp    31, 31, 31            209952


‘High quality’

PATTERN: ibbpbbpbbp
RANGE: +/-10, HALF
PSEARCH: LOGARITHMIC, BSEARCH: CROSS2
QSCALE: I=8, P=10, B=25

I FRAME SUMMARY
  Blocks: 330 (94083 bits) (285 bpb)
  Compression: 21:1 (1.1150 bpp)

P FRAME SUMMARY
  I Blocks: 89 (19554 bits) (219 bpb)
  P Blocks: 890 (111443 bits) (125 bpb)
  Skipped: 11
  Compression: 46:1 (0.5182 bpp)

B FRAME SUMMARY
  I Blocks: 1 (148 bits) (148 bpb)
  B Blocks: 1883 (38486 bits) (20 bpb)
  B types: 173 (14 bpb) forw, 291 (15 bpb) back, 1419 (22 bpb) bi
  Skipped: 96
  Compression: 309:1 (0.0775 bpp)

Total Compression: 76:1 (0.3137 bpp)
795096 bits/sec @ 30 fps
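The figures in this encoder report are mutually consistent, which a short check makes visible. Two values below are assumptions inferred from the report rather than stated in it: a 24 bits/pixel source (the value that reproduces the quoted ratios) and a 352 x 240 frame (the size that gives 330 macroblocks of 16 x 16 pixels).

```python
RAW_BITS_PER_PIXEL = 24    # assumption: ratios are relative to 24-bit pixels
WIDTH, HEIGHT = 352, 240   # assumption: inferred from 330 macroblocks/frame
FPS = 30


def compression_ratio(bpp):
    """Compression ratio implied by a coded bits-per-pixel figure."""
    return RAW_BITS_PER_PIXEL / bpp


def bit_rate(bpp):
    """Output bit rate implied by a coded bits-per-pixel figure."""
    return bpp * WIDTH * HEIGHT * FPS


def macroblocks_per_frame(width=WIDTH, height=HEIGHT, mb_size=16):
    return (width // mb_size) * (height // mb_size)


for label, bpp in [("I", 1.1150), ("P", 0.5182), ("B", 0.0775), ("total", 0.3137)]:
    print(f"{label}: {compression_ratio(bpp):.1f}:1")
print("macroblocks per frame:", macroblocks_per_frame())
print("bit rate from total bpp:", round(bit_rate(0.3137)))
```

Under these assumptions the computed ratios truncate to the 21:1, 46:1, 309:1 and 76:1 printed by the encoder, and the bit rate derived from the total bpp agrees with the reported 795096 bits/sec to well within one percent. The block counts also add up: 89 + 890 + 11 = 990 macroblocks for three P frames and 1 + 1883 + 96 = 1980 for six B frames.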


Show MPEG


‘Low Quality’

QSCALE: 31 31 31
Compression 195:1 (0.1230 bpp)
Total Frames Per Second: 0.714286 (235 mi per frame)
CPU Time: 1.388889 fps (458 mips)
Total Output Bit Rate (30 fps): 311856 bits/sec

Show movie


All P frames

Sequence: ippppppppp
QSCALE: 31 31 31

P FRAME SUMMARY
  I Blocks: 75 (6641 bits) (88 bpb)
  P Blocks: 1559 (39500 bits) (25 bpb)
  Skipped: 1336

Total Compression: 289:1 (0.0828 bpp)
Total Frames Per Second: 1.428571 (471 mi/frame)
CPU Time: 2.702703 fps (891 mips)
Total Output Bit Rate (30 fps): 209952 bits/sec

Show video


Digital Versatile Disk

Digital Video (Versatile?) Disc (DVD) is a medium for the distribution of 4.7 to 17 billion bytes of digital data on a 120-mm (4.75 inch) disc. This huge volume of data (today's CD can store 680 million bytes of data) can be used to store up to nine hours of studio quality video and multi-channel surround-sound audio, highly interactive multimedia computer programs, 30 hours of CD-quality audio, or anything else that can be represented as digital data.

Same size as CD (compact disc)
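As a rough check on the capacity claim, the average bit rate implied by "nine hours on 17 billion bytes" can be computed directly. This sketch assumes the nine-hour figure refers to the largest (17 billion byte) disc and that the whole capacity is spent on the audio-video stream; the function name is illustrative.

```python
def average_bit_rate(capacity_bytes, hours):
    """Average bits per second if the disc is filled by a program of this length."""
    return capacity_bytes * 8 / (hours * 3600)


rate = average_bit_rate(17e9, 9)
print(f"{rate / 1e6:.1f} Mbit/sec")
```

The result is a few megabits per second, which is the range in which MPEG-2 is typically operated for standard-definition video.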


Physical parameters

CD: 1.6 µm track spacing, 0.83 µm bit spacing

DVD: 0.74 µm track spacing, 0.5 µm bit spacing


DVD: Thickness profile


Comparison: DVD vs. CD

Parameter                   DVD        CD
Diameter                    120 mm     120 mm
Thickness                   0.6 mm     1.2 mm
Track Pitch                 0.74 µm    1.6 µm
Minimum Pit Length          0.40 µm    0.834 µm
Laser Wavelength            640 nm     780 nm
Data Capacity (per layer)   4.7 GB     0.68 GB
Layers                      1, 2, 4    1
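The table values allow a quick estimate of how much of the capacity increase comes from geometry alone. The sketch below multiplies the track-pitch and pit-length ratios; treating areal density as the product of these two ratios is a simplifying assumption, and the function name is illustrative.

```python
def areal_density_gain(track_old, pit_old, track_new, pit_new):
    """Density gain from tighter tracks and shorter pits (same units for all)."""
    return (track_old / track_new) * (pit_old / pit_new)


# CD -> DVD, values in micrometres from the comparison table
geometric_gain = areal_density_gain(1.6, 0.834, 0.74, 0.40)
capacity_gain = 4.7 / 0.68   # per-layer capacities in GB

print(f"geometric gain: {geometric_gain:.2f}x")
print(f"capacity gain:  {capacity_gain:.2f}x")
print(f"remaining factor: {capacity_gain / geometric_gain:.2f}x")
```

The geometric factor is smaller than the capacity ratio, so the two dimensions in the table do not account for the whole increase on their own.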


DVD production


DVD Player to Replace VHS

Estimated production cost: $3.50 VHS, $1.00 DVD


Next Generation ‘DVD’

Consortium sets new DVD standard (Blu-Ray)

20 February 2002

By using a 405 nm semiconductor laser, the new video-recording format enables 27 (23?) Gbyte - equivalent to thirteen hours of TV broadcasting - to be contained on a single-sided, single-layer 12 cm DVD.

Increased recording density is achieved using a 0.85 numerical aperture lens in combination with the 405 nm laser. A 0.1 mm optical transmittance protection layer is also used to minimize aberration caused by disc-tilt and give a better readout.

The companies involved are: Hitachi, LG Electronics, Matsushita, Pioneer, Philips, Samsung, Sharp, Sony and Thomson Multimedia. Notably absent from the consortium are Toshiba, one of the first companies to commercialize DVDs, and JVC which has a vested interest in the conventional video format.


News

Nine Blu-ray Disc Founder Companies Begin Licensing of Disc

February 14, 2003 (9:32 a.m. EST)

/PRNewswire-FirstCall/ -- Hitachi, Ltd., LG Electronics Inc., Matsushita Electric Industrial Co., Ltd., Pioneer Corporation, Royal Philips Electronics, Samsung Electronics Co. Ltd., Sharp Corporation, Sony Corporation, and Thomson today announced the start of licensing of the rewritable format of "Blu-ray Disc", the large capacity optical disc utilizing blue-violet laser. Licensing will commence as of February 17, 2003. The introduction of products based on "Blu-ray Disc", the first optical disc format capable of recording High Definition broadcasts, will enable the enjoyment of even greater picture quality within the home.

http://www.blu-ray.com/


DVD(?) Format War

HD DVD

HD-DVD, also known as AOD (Advanced Optical Disc), is the name of a competing next-generation optical disc format developed by Toshiba and NEC. The format is similar to Blu-ray and also utilizes blue-laser technology to achieve a higher storage capacity. The rewritable versions of the discs will be able to hold 20GB on a single-layer disc and 32GB on a dual-layer disc, while the read-only discs will be able to hold only 15GB on a single-layer disc and 30GB on a dual-layer disc. The read-only version of the format has been approved by the DVD Forum as the successor to the current DVD technology.


Comparison

Parameters                BD             DVD            HD-DVD
Recording capacity        27GB           4.7GB          20GB
Number of layers          single-layer   single-layer   single-layer
Laser wavelength          405nm          650nm          405nm
Numerical aperture (NA)   0.85           0.60           0.65
Protection layer          0.1mm          0.6mm          0.6mm
Data transfer rate        36Mbps         11Mbps         36Mbps
Video compression         MPEG-2         MPEG-2         MPEG-4 AVC
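The wavelength and numerical-aperture rows largely explain the capacity row. For a diffraction-limited optical system the focused spot diameter scales roughly as wavelength / NA, so areal density scales roughly as (NA / wavelength)^2. The sketch below applies that rule of thumb to the table values; it is an approximation that ignores coding and format differences, and the function name is illustrative.

```python
def density_gain(na_new, wavelength_new, na_ref, wavelength_ref):
    """Approximate areal-density gain from a smaller focused spot.

    Assumes spot diameter ~ wavelength / NA, so density ~ (NA / wavelength)^2.
    """
    return ((na_new / wavelength_new) / (na_ref / wavelength_ref)) ** 2


# Values from the comparison table; DVD is the reference.
bd_gain = density_gain(0.85, 405, 0.60, 650)
hd_dvd_gain = density_gain(0.65, 405, 0.60, 650)

print(f"BD vs DVD:     optics {bd_gain:.2f}x, capacity {27 / 4.7:.2f}x")
print(f"HD-DVD vs DVD: optics {hd_dvd_gain:.2f}x, capacity {20 / 4.7:.2f}x")
```

Both formats use the same 405 nm laser, so under this estimate the difference between them comes from the numerical aperture alone.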


Red Laser

Pixonics Inc.
Backward-compatible technology (new disc plays on standard DVD player).
Pixonics boasts that 3.5 hours of high-definition programming can be stored on a DVD-9 disc with a 9 gigabyte capacity.