1 motivation video communication over heterogeneous networks –diverse client devices –various...


Upload: virgil-quinn

Post on 24-Dec-2015


1

Motivation

• Video Communication over Heterogeneous Networks
  – Diverse client devices
  – Various network connection bandwidths

• Limitations of Scalable Video Coding Schemes
  – Limited layers supported
  – No video format changes

• Video Transcoding Provides Dynamic Solutions
  – Channel bandwidth adaptation
  – Video coding format adaptation

2

Challenges in Video Transcoding

• Improve Efficiency of Video Transcoding
  – Large data volume
  – High computational complexity

• Optimize Visual Quality for a Given Bit Rate
  – Human vision system (HVS) based video transcoding is desirable

[Figure: Video transcoder. Input compressed video stream → decoding (partial) → video manipulation → entropy encoding → output compressed video stream.]

3

Proposed Solutions

• Exploit Foveation Property of the HVS in Video Transcoding

• Develop Fast Algorithms for Video Transcoding
  – DCT-domain foveation filtering technique
  – Fast algorithms for DCT-domain inverse motion compensation
    • Local bandwidth constrained DCT-domain inverse motion compensation
    • Look-up-table based DCT-domain inverse motion compensation

[Figure: Foveation embedded video transcoder. Uniform resolution compressed video → decoding (partial) → video manipulation and foveation → video re-encoding → foveated video stream.]

4

Foveation

• The Human Eye Samples the Visual Field Non-uniformly
  – The sampling resolution is highest at the fovea
  – The sampling resolution decreases rapidly with distance from the fovea

• Retinal Images are Inherently Non-uniform in Spatial Resolution

[Figure: Cells per degree versus eccentricity (deg), left eye.]

5

Foveation Modelling

• Foveated Contrast Threshold [Geisler & Perry 98]

$$CT(f, e) = CT_0 \exp\!\left(\alpha f \,\frac{e + e_2}{e_2}\right)$$

  – f: Spatial frequency (cyc/degree)
  – e: Retinal eccentricity (degree)
  – α: Spatial frequency decay constant
  – e_2: Half-resolution eccentricity
  – CT_0: Minimum contrast threshold
  – CT: Contrast threshold

  Parameter values: $\alpha = 0.106$, $e_2 = 2.3$, $CT_0 = 1/64$ (the slide also shows the value 1/76).

• Foveated Cut-off Frequency f_c

$$f_c(e) = \frac{e_2 \ln(1/CT_0)}{\alpha\,(e + e_2)}$$

• Spatial Frequencies Beyond the Cut-off Frequency are Invisible (Foveated Image)

[Figure: Local cut-off frequency (cyc/deg) versus pixel position relative to the foveation point (unit: pixel). Image size: 512 x 512. Unit of v: image height.]
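The contrast threshold model and its cut-off frequency can be checked numerically. The sketch below is a minimal illustration, not the presenters' code. It assumes the parameter values usually quoted for this model (α = 0.106, e_2 = 2.3, CT_0 = 1/64), and the helper `eccentricity_deg`, which maps a pixel distance to a retinal eccentricity for a viewing distance given in image heights, is a hypothetical convenience that assumes a flat, square image viewed head-on.

```python
import math

# Assumed model parameters (values commonly quoted for the Geisler & Perry fit).
ALPHA = 0.106    # spatial frequency decay constant
E2 = 2.3         # half-resolution eccentricity (degrees)
CT0 = 1.0 / 64   # minimum contrast threshold


def contrast_threshold(f, e):
    """CT(f, e) = CT0 * exp(alpha * f * (e + e2) / e2)."""
    return CT0 * math.exp(ALPHA * f * (e + E2) / E2)


def cutoff_frequency(e):
    """Frequency (cyc/deg) at which CT reaches 1, i.e. maximum contrast."""
    return E2 * math.log(1.0 / CT0) / (ALPHA * (e + E2))


def eccentricity_deg(distance_px, image_size_px, viewing_distance):
    """Hypothetical helper: eccentricity of a pixel that lies distance_px
    away from the foveation point, with viewing_distance expressed in
    units of the image height."""
    return math.degrees(math.atan(distance_px / (image_size_px * viewing_distance)))


# Cut-off frequency at a few distances from the foveation point
# (512 x 512 image, viewing distance of 3 image heights).
for d in (0, 64, 128, 256):
    e = eccentricity_deg(d, 512, 3.0)
    print(d, round(e, 2), round(cutoff_frequency(e), 2))
```

By construction, evaluating `contrast_threshold` at `cutoff_frequency(e)` gives 1 for any eccentricity, which is the condition that defines the cut-off.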

6

Foveated Images

[Figure: Two panels. JPEG-coded uniform image (168 KB) and JPEG-coded foveated image (136 KB). Foveation point is marked by ‘X’.]

7

Foveated Contrast Sensitivity Function (FCSF)

• Foveated Contrast Sensitivity Function (FCSF)

$$FCSF(f, e) = \frac{1}{CT(f, e)} = \frac{1}{CT_0} \exp\!\left(-\alpha f \,\frac{e + e_2}{e_2}\right)$$

• Shape the Compression Distortion According to FCSF

[Figure: Normalized contrast sensitivity of the human eye versus distance from the foveation point (unit: pixel). Image size: 512 x 512. Viewing distance: 3 times the image height.]
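To make the shape of the FCSF concrete, the following sketch evaluates it along one image row for the setting named on the slide (512 x 512 image, viewing distance of 3 image heights). It is an illustration under assumptions: the parameter values are the ones commonly quoted for the model, normalizing by the value at the foveation point is one possible choice, and the function names are hypothetical.

```python
import math

# Assumed model parameters (commonly quoted values, not taken from code by the authors).
ALPHA, E2, CT0 = 0.106, 2.3, 1.0 / 64


def fcsf(f, e):
    """FCSF(f, e) = 1 / CT(f, e) = (1 / CT0) * exp(-alpha * f * (e + e2) / e2)."""
    return math.exp(-ALPHA * f * (e + E2) / E2) / CT0


def normalized_profile(f, width_px=512, viewing_distance=3.0, fovea_x=256):
    """Sensitivity at frequency f for every pixel of one row, divided by the
    sensitivity at the foveation point (an assumed normalization)."""
    peak = fcsf(f, 0.0)
    profile = []
    for x in range(width_px):
        # Eccentricity in degrees of pixel x relative to the foveation point.
        e = math.degrees(math.atan(abs(x - fovea_x) / (width_px * viewing_distance)))
        profile.append(fcsf(f, e) / peak)
    return profile


profile = normalized_profile(8.0)  # 8 cyc/deg chosen arbitrarily for illustration
```

The profile peaks at the foveation point and falls off on both sides, which is the weighting a transcoder could use to let distortion grow with distance from the point of gaze.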

8

Video Transcoding Architecture

• Open-Loop Video Transcoding
  – Simple and fast
  – Error drift

[Figure: Transcoding error propagation along an I, P, P, P frame sequence.]

[Figure: Open-loop transcoder between input rate R_in and output rate R_out, with blocks VLD, delay, bit allocation analysis, requantization, and VLC.]

VLC: Variable Length Coding

VLD: Variable Length Decoding
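Error drift in an open-loop transcoder can be reproduced with a toy predictive codec. The sketch below is a deliberately simplified model and not a real video codec: each "frame" is a single number, prediction is the previous reconstruction, and the transcoder only rescales the quantized residual levels from step size `q1` to step size `q2`. All names are hypothetical.

```python
def encode(frames, q):
    """Toy predictive encoder: quantize the difference between each frame
    and the previous reconstruction with step size q."""
    levels, recon = [], 0.0
    for x in frames:
        level = round((x - recon) / q)
        recon += level * q
        levels.append(level)
    return levels


def decode(levels, q):
    """Toy decoder: accumulate dequantized residuals."""
    recon, out = 0.0, []
    for level in levels:
        recon += level * q
        out.append(recon)
    return out


def open_loop_transcode(levels, q1, q2):
    """Requantize each residual level on its own, with no feedback loop."""
    return [round(level * q1 / q2) for level in levels]


q1, q2 = 1.0, 4.0
frames = [float(t) for t in range(10)]            # slowly rising signal
levels = encode(frames, q1)
reference = decode(levels, q1)                    # what the input stream decodes to
transcoded = decode(open_loop_transcode(levels, q1, q2), q2)
drift = [abs(r - t) for r, t in zip(reference, transcoded)]
print(drift)
```

In this example every residual is smaller than half the coarse step, so the open-loop transcoder rounds all of them to zero and the decoded signal falls further behind with each predicted frame. Nothing in the loop ever corrects the accumulated error, which is the drift problem named above.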

9

Drift Free Video Transcoders

• Cascaded Pixel Domain Video Transcoding
  – Low efficiency
  – Long delay

• Fast Pixel Domain Video Transcoding
  – Saves motion estimation, one frame memory, and one IDCT operation

• Fast DCT-Domain Video Transcoding
  – No IDCT-DCT operations; lower data volume
  – DCT-domain inverse motion compensation is complex (research topic)

[Figure: Cascaded transcoder. R_in → complete decoding (Q1) → encoding (Q2) → R_out.]

[Figure: Fast pixel domain video transcoder between R_in and R_out, with blocks IQ1, Q2, IQ2, IDCT, MC, FM, and DCT.]

[Figure: Fast DCT domain video transcoder between R_in and R_out, with blocks IQ1, Q2, IQ2, and DCT-domain inverse motion compensation.]
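The reason a closed-loop structure avoids drift can be seen in a toy model. The sketch below is a simplification under stated assumptions and not any of the three transcoders in the figures: each "frame" is a single number, prediction is the previous reconstruction, and the transcoder tracks both the input reconstruction and the reconstruction the end decoder will produce, so each new residual absorbs the earlier requantization error. All names are hypothetical.

```python
def closed_loop_transcode(levels_in, q1, q2):
    """Re-encode against the transcoder's own output reconstruction, so the
    requantization error of one frame is corrected in the next."""
    ref = 0.0   # reconstruction of the input stream (step size q1)
    out = 0.0   # reconstruction the end decoder will produce (step size q2)
    levels_out = []
    for level in levels_in:
        ref += level * q1
        new_level = round((ref - out) / q2)
        out += new_level * q2
        levels_out.append(new_level)
    return levels_out


def decode(levels, q):
    """Toy decoder: accumulate dequantized residuals."""
    recon, frames = 0.0, []
    for level in levels:
        recon += level * q
        frames.append(recon)
    return frames


q1, q2 = 1.0, 4.0
levels_in = [0] + [1] * 9                  # input stream: signal rises by q1 per frame
reference = decode(levels_in, q1)
decoded = decode(closed_loop_transcode(levels_in, q1, q2), q2)
errors = [abs(r - d) for r, d in zip(reference, decoded)]
print(errors)
```

Here the error never exceeds half the output step size, however long the sequence of predicted frames. The cost is that the transcoder has to maintain reconstructions, which in a real codec means frame memory and motion compensation.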

10

Foveation Embedded DCT Domain Video Transcoding

VLD: Variable Length Decoding
FM: Frame Memory
VLC: Variable Length Coding
Q: Quantization
IQ: Inverse Quantization
MVR: Motion Vector Refinement
IMC: Inverse Motion Compensation
MVs: Motion Vectors

[Figure: Foveation embedded DCT domain transcoder with input R_1(n) and output R_2(n). Decoder side: VLD, IQ1, FM, DCT-domain IMC. Encoder side: Q2, VLC, IQ2, FM, DCT-domain IMC. Additional blocks: DCT-domain foveation, DCT-domain MVR, and foveation point selection. Motion vectors (MVs) are passed between the stages, and intra-coded frames follow a separate path.]

11

Foveation Filtering

• Pixel Domain Foveation Filtering Technique [Lee, 99]
  – High computational complexity

12

DCT-Domain Foveation Filtering

• DCT-Domain Block Mirror Filtering [Rao, 90]

• Pros
  – Significantly simplified
  – Can be combined with inverse quantization
  – Easy to parallelize

• Cons
  – Blocking artifacts

[Equation and figure: Block mirror filtering of a 1-D signal f. The block is mirrored to form f̃, and the filtering is expressed on the DCT of f using a filter kernel derived from h. Labels on the slide: h, h̃, f, f̃, "Filter Kernel", "DCT of f", "Block mirroring".]

H. R. Sheikh, S. Liu, B. L. Evans and A. C. Bovik, “Real-Time Foveation Techniques for H.263 Video Encoding in Software”, ICASSP 2001.
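The idea behind filtering in the DCT domain can be illustrated with a standard property of the DCT-II: filtering a mirrored block with a symmetric kernel is the same as scaling each DCT coefficient by a fixed gain. The sketch below verifies this for a 3-tap kernel (a, b, a). It is a minimal sketch of that general property, not necessarily the exact formulation on the slide or in [Rao, 90]. The function names and the sample block are hypothetical.

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a 1-D block."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]


def mirror_filter(x, a, b):
    """Filter with the symmetric kernel (a, b, a) after mirroring the block
    at both ends (edge samples are repeated)."""
    ext = [x[0]] + list(x) + [x[-1]]
    return [a * ext[i] + b * ext[i + 1] + a * ext[i + 2] for i in range(len(x))]


def dct_gains(n, a, b):
    """Per-coefficient gain equivalent to the (a, b, a) kernel."""
    return [b + 2 * a * math.cos(math.pi * k / n) for k in range(n)]


block = [52.0, 55.0, 61.0, 66.0, 70.0, 61.0, 64.0, 73.0]   # arbitrary sample data
a, b = 0.25, 0.5                                            # simple low-pass kernel

lhs = dct2(mirror_filter(block, a, b))                      # filter, then transform
rhs = [g * c for g, c in zip(dct_gains(len(block), a, b), dct2(block))]  # transform, then scale
```

Because the filter reduces to one multiplication per coefficient, the gains could in principle be folded into the inverse quantization step, which matches the second advantage listed above. Each block is handled independently, which also suggests why blocking artifacts can appear.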

13

Multipoint Video Conferencing

[Figure: Multipoint Control Unit (MCU) connected to the conferees over the Internet. Streams from conferee #1, #2, ..., #N each pass through BUF and VLD into a video combiner, followed by a foveation embedded DCT domain video transcoder. The outputs pass through BUF to conferee #1, #2, ..., #N.]

H. R. Sheikh, S. Liu, Z. Wang and A. C. Bovik, “Foveated Multipoint Videoconferencing at Low Bit Rates”, ICASSP 2002, accepted.

14

Simulation Results

[Figure: Two panels. Uniform resolution video at 256 kb/s and foveated video at 256 kb/s. Foveation point is at the center of the upper-left quadrant.]

15

Foveation Point Selection

• Interactive Methods
  – Mouse, eye tracker
  – Reverse channel is assumed
  – End-to-end delay is assumed short enough

• Automatic Methods
  – Fixation points analysis (very challenging)
  – Application oriented methods

• DCT-Domain Human Face Detection [Wang & Chang, 97]
  – Skin color region segmentation
  – Face template constraint
  – Spatial verification