TRANSCRIPT
Concepts of Multimedia Processing and Transmission
IT 481, Lecture #11
Dennis McCaughey, Ph.D.
20 November, 2006
08/28/2006, IT 481, Fall 2006
Broadcast Environment
The MPEG-4 Layered Model
Compression Layer (MPEG-4 Visual, MPEG-4 Audio): media aware, delivery unaware
    Elementary Stream Interface (ESI)
Sync Layer (MPEG-4 Systems): media unaware, delivery unaware
    DMIF Application Interface (DAI)
Delivery Layer (MPEG-4 DMIF): media unaware, delivery aware
MPEG-4: Delivery Integration of Three Major Technologies
– The Interactive Network Technology: Internet, ATM, etc.
– The Broadcast Technology: cable, satellite, etc.
– The Disk Technology: CD, DVD, etc.
MPEG-4: DMIF Communication Architecture
MPEG-4: DMIF Communication Architecture
DMIF (Delivery Multimedia Integration Framework)
– A session protocol for the management of multimedia streaming over generic delivery technologies.
– In principle it is similar to FTP. The only (essential!) difference is that FTP returns data, while DMIF returns pointers to where to get the (streamed) data.
When FTP is run:
– The very first action it performs is the setup of a session with the remote side.
– Later, files are selected and FTP sends a request to download them; the FTP peer returns the files on a separate connection.
When DMIF is run:
– The very first action it performs is the setup of a session with the remote side.
– Later, streams are selected and DMIF sends a request to stream them; the DMIF peer returns pointers to the connections where the streams will be delivered, and then also establishes the connections themselves.
DMIF Computational Model
DMIF Service Activation
The Originating Application requests the activation of a service from its local DMIF layer:
– a communication path between the Originating Application and its local DMIF peer is established in the control plane (1).
The Originating DMIF peer establishes a network session with the Target DMIF peer:
– a communication path between the Originating DMIF peer and the Target DMIF peer is established in the control plane (2).
The Target DMIF peer identifies the Target Application and forwards the service activation request:
– a communication path between the Target DMIF peer and the Target Application is established in the control plane (3).
The peer Applications create channels (requests flowing through communication paths 1, 2 and 3):
– the resulting channels in the user plane (4) carry the actual data exchanged by the Applications.
DMIF is involved in all four steps above.
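The four steps can be illustrated with a small simulation (class and function names are mine, not the normative DAI/DNI primitives): steps 1-3 add control-plane paths, step 4 adds the user-plane channel that carries the media.

```python
# A minimal simulation of the four DMIF service-activation steps.
# Names are illustrative, not the normative DAI/DNI primitives.

class DMIFPeer:
    def __init__(self, name):
        self.name = name
        self.control_paths = []   # steps 1-3 live in the control plane
        self.user_channels = []   # step 4 lives in the user plane

def activate_service(orig_app, orig_dmif, target_dmif, target_app):
    log = []
    # (1) Originating Application <-> its local DMIF layer
    orig_dmif.control_paths.append((orig_app, orig_dmif.name))
    log.append("1: app<->local DMIF control path")
    # (2) Originating DMIF peer <-> Target DMIF peer (network session)
    orig_dmif.control_paths.append((orig_dmif.name, target_dmif.name))
    log.append("2: DMIF<->DMIF network session")
    # (3) Target DMIF peer <-> Target Application
    target_dmif.control_paths.append((target_dmif.name, target_app))
    log.append("3: target DMIF<->target app control path")
    # (4) Channel requests flow over paths 1-3; the resulting
    #     user-plane channel carries the actual media data.
    channel = (orig_app, target_app, "user-plane")
    orig_dmif.user_channels.append(channel)
    target_dmif.user_channels.append(channel)
    log.append("4: user-plane channel")
    return log
```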
DAI
Compared to FTP, DMIF is both a framework and a protocol.
– The functionality provided by DMIF is expressed by an interface called the DMIF-Application Interface (DAI) and translated into protocol messages. These protocol messages may differ based on the network on which they operate.
– The DAI is also used for accessing broadcast material and local files; this means that a single, uniform interface is defined to access multimedia content on a multitude of delivery technologies.
DNI
The DMIF Network Interface (DNI) is introduced to emphasize what kind of information DMIF peers need to exchange.
An additional module ("signaling mapping" in the figure) takes care of mapping the DNI primitives into the signaling messages used on the specific network.
Note that DNI primitives are specified for information purposes only; a DNI interface need not be present in an actual implementation.
MPEG-4 Video Bitstream Logical Structure
The MPEG-4 video bitstream is organized as a hierarchy:
Video Session (VS1, VS2)
  Video Object (VO1, VO2)
    Video Object Layer (VOL1, VOL2) – Layer 1, Layer 2
      Group of VOPs (GOV1, GOV2)
        Video Object Plane (VOP1, VOP2)
Motion Compensation
Three steps:
– Motion estimation
– Motion-compensation-based prediction
– Coding of the prediction error
MPEG-4 defines a bounding box for each VOP.
Macroblocks entirely within the VOP are referred to as interior macroblocks.
Macroblocks straddling the VOP boundary are called boundary macroblocks.
Motion compensation for interior macroblocks is the same as in MPEG-1 and MPEG-2.
Motion compensation for boundary macroblocks requires padding:
– Helps match every pixel in the target VOP
– Enforces rectangularity for block DCT encoding
MPEG-4: Motion Estimation
Block-based techniques from MPEG-1 and MPEG-2 have been adapted to the MPEG-4 VOP structure:
– I-VOP: intra VOP
– P-VOP: predicted VOP, based on a previous VOP
– B-VOP: bidirectionally interpolated VOP, predicted from past and future VOPs
Motion estimation (ME) is only necessary for P-VOPs and B-VOPs:
– Motion vectors are differentially coded from up to three candidate motion vectors
– Variable-length coding is used for encoding MVs
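The motion estimation step can be sketched as an exhaustive block-matching search minimising the sum of absolute differences (a minimal sketch; the frame sizes and search range below are illustrative, and real encoders use fast search strategies):

```python
# Exhaustive block-matching motion estimation (sum of absolute differences).

def sad(ref, cur, rx, ry, cx, cy, n):
    """SAD between the n x n block of `cur` at (cy, cx) and the candidate
    block of `ref` at (ry, rx)."""
    return sum(abs(ref[ry + i][rx + j] - cur[cy + i][cx + j])
               for i in range(n) for j in range(n))

def motion_vector(ref, cur, cx, cy, n, search):
    """Full search over displacements in [-search, search]; returns the
    (dx, dy) that minimises the SAD."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = cy + dy, cx + dx
            if 0 <= ry and ry + n <= h and 0 <= rx and rx + n <= w:
                cost = sad(ref, cur, rx, ry, cx, cy, n)
                if cost < best_cost:
                    best_cost, best = cost, (dx, dy)
    return best
```

Usage: place a small bright patch in the reference frame, shift it in the current frame, and the search recovers the displacement back to the reference position.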
MPEG-4 Texture Coding
For an I-VOP, the VOP texture information is the luminance and chrominance data itself.
For P-VOPs and B-VOPs, texture information represents the residual remaining after motion compensation.
The standard 8x8 block-based DCT is used:
– Coefficients are quantized, predicted, scanned and variable-length encoded
– DC and AC coefficient prediction based on neighboring blocks reduces the energy of the quantized coefficients
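The 8x8 DCT step can be written out directly from its definition (a naive orthonormal DCT-II for illustration; production codecs use fast factorisations):

```python
import math

N = 8  # standard texture block size

def dct_2d(block):
    """Naive orthonormal 2-D DCT-II of an 8x8 block."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    for x in range(N) for y in range(N))
            out[u][v] = c(u) * c(v) * s
    return out
```

For a flat block all the energy lands in the DC coefficient (a constant block of 128 gives DC = 8 x 128 = 1024 and zero AC), which is why DC/AC prediction from neighboring blocks removes so much redundancy in smooth areas.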
Bounding Box & Boundary Macroblocks
[Figure: a VOP with its bounding box; macroblocks straddling the VOP boundary are labeled boundary macroblocks, and macroblocks entirely inside the VOP are labeled interior macroblocks.]
Padding
For all boundary macroblocks in the reference VOP:
– Horizontal repetitive padding
– Vertical repetitive padding
For all exterior macroblocks outside the VOP but adjacent to one or more boundary macroblocks:
– Extended padding
Horizontal Repetitive Padding Algorithm
begin
  for all rows in boundary macroblocks in the reference VOP
    if there exists a boundary pixel in the row
      for all intervals outside the VOP
        if the interval is bounded by only one boundary pixel b
          assign the value b to all pixels in the interval
        else if the interval is bounded by two boundary pixels b1 and b2
          assign the value (b1 + b2)/2 to all pixels in the interval
end
Vertical Repetitive Padding Algorithm
The horizontal algorithm applied to the columns.
Original Pixels Within the VOP
45 52 60 55
40 50 80 90
42 48 50
70 60
Horizontal Repetitive Padding
45 52 60 60 55 60
40 50 80 65 65 90
42 48 50 50 50 50
60 60 70 60 60 70
Vertical Repetitive Padding
45 52 60 60 55 60
40 50 80 65 65 90
42 48 50 50 50 50
51 54 60 55 55 60
60 60 70 60 60 70
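The two padding passes can be implemented directly from the pseudocode (a sketch; the small 3x3 grid in the test is my own illustration, not the slide's example grid):

```python
def pad_line(values, inside):
    """One repetitive-padding pass over a row or column. `inside[i]` is True
    where pixel i belongs to the VOP. Each exterior interval is filled from
    its bounding boundary pixel(s): one neighbour -> copy it, two neighbours
    b1, b2 -> (b1 + b2) / 2."""
    out = list(values)
    n = len(out)
    if not any(inside):
        return out  # no boundary pixel in this line: left for the other pass
    i = 0
    while i < n:
        if inside[i]:
            i += 1
            continue
        j = i
        while j < n and not inside[j]:
            j += 1                      # the exterior interval is [i, j)
        left = out[i - 1] if i > 0 else None
        right = out[j] if j < n else None
        fill = ((left + right) / 2 if left is not None and right is not None
                else left if left is not None else right)
        for k in range(i, j):
            out[k] = fill
        i = j
    return out

def repetitive_pad(grid, mask):
    """Horizontal pass over rows, then vertical pass over columns; rows that
    contained no VOP pixel are filled entirely by the vertical pass."""
    g = [pad_line(row, m) for row, m in zip(grid, mask)]
    row_defined = [any(m) for m in mask]
    for c in range(len(g[0])):
        col = pad_line([g[r][c] for r in range(len(g))], row_defined)
        for r in range(len(g)):
            g[r][c] = col[r]
    return g
```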
Shape-Adaptive Texture Coding for Boundary Macroblocks
[Figure: a boundary macroblock in which the VOP pixels of each column are shifted to the top and transformed with a DCT whose length matches the number of pixels in that column (DCT-1 through DCT-6); the same is then applied along the rows.]
Considerations
The total number of DCT coefficients equals the number of grayed (VOP) pixels, which is less than 8x8:
– Fewer computations than an 8x8 DCT
During decoding the translations must be reversed, so a binary mask of the original shape must be provided.
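The shape-adaptive idea can be sketched for the column pass: each column keeps only its VOP pixels, shifts them to the top, and is transformed with a DCT of matching length (DCT-N). A minimal sketch, with the row pass omitted:

```python
import math

def dct_n(samples):
    """Orthonormal 1-D DCT-II whose length N equals len(samples)."""
    n = len(samples)
    def c(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
    return [c(u) * sum(s * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                       for x, s in enumerate(samples))
            for u in range(n)]

def sa_dct_columns(block, mask):
    """Apply DCT-N to the VOP pixels of each column; columns containing no
    VOP pixels produce no coefficients at all."""
    cols = []
    for c in range(len(block[0])):
        pixels = [block[r][c] for r in range(len(block)) if mask[r][c]]
        cols.append(dct_n(pixels) if pixels else [])
    return cols
```

Note that the coefficient count exactly equals the number of VOP pixels, which is the saving mentioned above.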
MPEG-4 Shape Coding
– Binary shape coding
– Grayscale shape coding
Static Texture Coding
Sprite Coding
Global Motion Compensation
MPEG-4 Scalability
Spatial and temporal scalability are implemented using VOLs (video object layers):
– Base and enhancement layers
Scalability
There are several scalable coding schemes in MPEG-4 Visual:
Spatial scalability
– Supports changing the texture quality (SNR and spatial resolution).
Temporal scalability
Object-based spatial scalability
– Extends the 'conventional' types of scalability to arbitrarily shaped objects, so that it can be used in conjunction with other object-based capabilities.
– This makes it possible to enhance SNR, spatial resolution, shape accuracy, etc., only for objects of interest or for a particular region, which can even be done dynamically at play-time.
Base and Enhancement Layer Behavior (Spatial Scalability)
Two Enhancement Types in MPEG-4 Temporal Scalability
1. Type I: the enhancement layer improves the resolution of only a portion of the base layer. Only a selected region of the VOP (e.g. just the car) is enhanced, while the rest (e.g. the landscape) is not.
2. Type II: the enhancement layer improves the resolution of the entire base layer; enhancement is applicable only at the whole-VOP level.
Subset of MPEG-4 Video Profiles and Levels
Profile  Level  Typical scene size  Bitrate (bit/s)  Max. objects  Total mblk memory (mblk units)
Simple   L1     QCIF                64 k             4             198
Simple   L2     CIF                 128 k            4             792
Simple   L3     CIF                 384 k            4             792
Core     L1     QCIF                384 k            4             594
Core     L2     CIF                 2 M              16            2376
Main     L2     CIF                 2 M              16            2376
Main     L3     ITU-R 601           15 M             32            9720
Main     L4     1920x1088           38.4 M           32            48960
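Transcribed as a lookup table, the limits can be checked programmatically (the `fits` helper and its choice of parameters are my own illustration; the numbers are those from the table):

```python
# Profile/level limits from the table, as a lookup.
LIMITS = {
    ("Simple", "L1"): {"scene": "QCIF", "bitrate": 64_000, "objects": 4, "mblk": 198},
    ("Simple", "L2"): {"scene": "CIF", "bitrate": 128_000, "objects": 4, "mblk": 792},
    ("Simple", "L3"): {"scene": "CIF", "bitrate": 384_000, "objects": 4, "mblk": 792},
    ("Core", "L1"): {"scene": "QCIF", "bitrate": 384_000, "objects": 4, "mblk": 594},
    ("Core", "L2"): {"scene": "CIF", "bitrate": 2_000_000, "objects": 16, "mblk": 2376},
    ("Main", "L2"): {"scene": "CIF", "bitrate": 2_000_000, "objects": 16, "mblk": 2376},
    ("Main", "L3"): {"scene": "ITU-R 601", "bitrate": 15_000_000, "objects": 32, "mblk": 9720},
    ("Main", "L4"): {"scene": "1920x1088", "bitrate": 38_400_000, "objects": 32, "mblk": 48960},
}

def fits(profile, level, bitrate, objects):
    """True if a stream's bitrate and object count fit the profile/level."""
    lim = LIMITS[(profile, level)]
    return bitrate <= lim["bitrate"] and objects <= lim["objects"]
```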
MPEG-4 Natural & Synthetic Video Coding
Synthetic 2D and 3D objects are represented by meshes and surface patches:
– Synthetic VOs are animated by transforms and special-purpose animation techniques
– The representation of synthetic VOs is based on the Virtual Reality Modeling Language (VRML) standard
For natural objects, a large portion of the material used in movie and TV production is blue-screened, making it easier to capture objects against a blue background.
Integration of Face Animation with Natural Video
Three types of facial data: Facial Animation Parameters (FAP), Face Definition Parameters (FDP) and the FAP Interpolation Table (FIT).
– FAP allows the animation of a 3D facial model available at the receiver
– FDP allows one to configure the 3D facial model to be used at the receiver
– FIT allows one to define the interpolation rules for the FAPs at the decoder
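The division of labour between FDP and FAP can be sketched (the vertex names, the single jaw parameter, and the scaling below are invented for illustration; real FDP/FAP semantics are far richer): FDP-like data configures the receiver's model once, FAP-like data displaces its features per frame.

```python
# Illustrative sketch only: "jaw", "open_jaw" and the 0.01 scale are
# invented names/values, not real FDP/FAP definitions.

def configure_model(fdp):
    """FDP role: set up the receiver's 3-D face model (here: vertex table)."""
    return {name: list(pos) for name, pos in fdp.items()}

def animate(model, fap, scale=0.01):
    """FAP role: produce one animated frame by displacing model features."""
    frame = {name: list(pos) for name, pos in model.items()}
    if "open_jaw" in fap:                       # illustrative parameter only
        frame["jaw"][1] -= fap["open_jaw"] * scale
    return frame

model = configure_model({"jaw": (0.0, -1.0, 0.0), "nose": (0.0, 0.0, 0.1)})
frame = animate(model, {"open_jaw": 50})        # lowers the jaw by 0.5
```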
Integration of Face Animation and Text-to-Speech (TTS) Synthesis
Synchronization of a FAP stream with a TTS synthesizer is possible only if the encoder sends timing information.
[Figure: decoder pipeline. The TTS stream feeds a proprietary speech synthesizer; its phoneme/bookmark output drives a phoneme/bookmark-to-FAP converter and a face renderer, and a compositor produces the audio and video outputs.]
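Why the timing information matters can be illustrated: only if the synthesizer reports each phoneme's start and duration can mouth-shape FAPs be scheduled on the right video frames (the 25 fps rate and the phoneme tuples below are illustrative):

```python
# Map timed phonemes to the video frames (25 fps) on which the corresponding
# mouth-shape FAP should be active; without start/duration timing from the
# encoder, this alignment is impossible.

FPS = 25

def schedule_faps(phonemes):
    """phonemes: list of (name, start_ms, duration_ms) tuples.
    Returns {frame_index: phoneme_name}."""
    frames = {}
    for name, start, dur in phonemes:
        first = start * FPS // 1000
        last = (start + dur) * FPS // 1000
        for f in range(first, max(last, first + 1)):
            frames[f] = name
    return frames
```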
DVB-H
DVB-H in a DVB-T Network
(Source: Nokia)
DVB-H Receiver
DVB-H System (Sharing a Mux with MPEG-2 Services)
Detail: MPE-FEC
DVB-T/H Transmitter
(Source: Nokia)
DVB-H Standards Family
(Source: Nokia)
References
T. Ebrahimi and C. Horne, "MPEG-4 Natural Video Coding - An Overview".
J. Henriksson, "DVB-H - Standards, Principles and Services", Nokia, HUT Seminar T-111.590, Helsinki, Finland, 24 February 2005.