A METRIC FOR NO-REFERENCE VIDEO QUALITY ASSESSMENT FOR HD TV DELIVERY BASED ON SALIENCY MAPS

H. BOUJUT*, J. BENOIS-PINEAU*, T. AHMED*, O. HADAR** & P. BONNET***
*LaBRI UMR CNRS 5800, University of Bordeaux, France
**Communication Systems Engineering Dept., Ben Gurion University of the Negev, Israel
***Audemat WorldCast Systems Group, France

ICME 2011 – Workshop on Hot Topics in Multimedia Delivery (HotMD’11), 2011-07-11

TRANSCRIPT

Page 1: Title slide

Page 2

OVERVIEW

Introduction

Focus Of Attention and Saliency Maps

Our approach: the Weighted Macro Block Error Rate (WMBER), a no-reference video quality metric based on saliency maps

Prediction of subjective quality metrics from objective quality metrics

Evaluation and results

Conclusion and future work

Page 3

INTRODUCTION

Motivation:
- VQA for HD broadcast applications
- Measure the influence of transmission loss on perceived quality

Video quality assessment protocols:

Full Reference (FR)
- SSIM (Z. Wang, A. Bovik)
- A novel perceptual metric for video compression (A. Bhat, I. Richardson), PCS’09
- Evaluation of temporal variation of video quality in packet loss networks (C. Yim, A. C. Bovik), Image Communication 26 (2011)

Reduced Reference (RR)
- A Convolutional Neural Network Approach for Objective Video Quality Assessment (P. Le Callet, C. Viard-Gaudin, D. Barba), IEEE Transactions on Neural Networks 17

No Reference (NR)
- No-reference image and video quality estimation: Applications and human-motivated design (S. Hemami, A. Reibman), Image Communication 25 (2010)

In this work: NR VQA with visual saliency in the H.264/AVC framework.

Contributions:
- Visual saliency map computed during the compression process
- WMBER, an NR quality metric
- Prediction of subjective quality metrics from objective quality metrics

Page 4

FOCUS OF ATTENTION AND SALIENCY MAPS

- FOA is mostly attracted by salient areas which stand out from the visual scene.
- FOA is sequentially grabbed over the salient areas.
- Salient stimuli are mainly due to: high color, contrast, motion, and edge orientation.

[Figure: original frame and its saliency map, Tractor sequence (TUM/VQEG)]

Page 5

SALIENCY MAPS (1/2)

Pipeline:
- Static pathway: video frames → spatial filtering → normalization → spatial saliency map
- Dynamic pathway: video frames → global motion estimation (GME) → temporal filtering → temporal saliency map
- Fusion: spatial + temporal saliency maps → spatio-temporal saliency map

Several methods for saliency map extraction already exist in the literature. They all work in the same way [O. Brouard, V. Ricordel and D. Barba, 2009], [S. Marat et al., 2009]:
- Extraction of the spatial saliency map (static pathway)
- Extraction of the temporal saliency map (dynamic pathway)
- Fusion of the spatial and the temporal saliency maps

[Figure: temporal, spatial, and spatio-temporal saliency maps]
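As a concrete illustration, here is a minimal Python/NumPy sketch of the two-pathway pipeline. The operators are simplified stand-ins, not the paper's filters: gradient energy for the static pathway, a motion-compensated frame difference for the dynamic pathway, and GME assumed to be the identity (static camera).

```python
import numpy as np

def spatial_saliency(frame):
    # Static pathway stand-in: gradient (edge) energy, then
    # normalization to [0, 1]. The actual model uses richer
    # contrast/orientation filtering; this is only illustrative.
    gy, gx = np.gradient(frame.astype(float))
    s = np.hypot(gx, gy)
    return s / (s.max() + 1e-12)

def temporal_saliency(prev, cur):
    # Dynamic pathway stand-in: residual motion energy after global
    # motion compensation (GME); GME is assumed to be the identity
    # here for simplicity.
    r = np.abs(cur.astype(float) - prev.astype(float))
    return r / (r.max() + 1e-12)

def fuse(s_sp, s_t):
    # Multiplicative fusion of the two pathways (one of the
    # baseline fusion schemes discussed in the talk).
    return s_sp * s_t

rng = np.random.default_rng(0)
prev = rng.integers(0, 256, (64, 64))
cur = prev.copy()
cur[20:30, 20:30] = 255  # simulate a moving bright patch
s_st = fuse(spatial_saliency(cur), temporal_saliency(prev, cur))
```

The resulting map peaks where the patch is both edge-rich and moving, which is the behavior the two-pathway design aims for.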

Page 6

SALIENCY MAPS (2/2)

In this work we re-use the saliency map extraction method we published at IS&T Electronic Imaging 2011:
- Based on the saliency model of O. Brouard, V. Ricordel and D. Barba.
- Uses partial decoding of the H.264 stream to reach real-time performance.
- A fusion method combining spatial and temporal saliency maps was proposed there.

Here, we propose a new fusion method.

Page 7

SALIENCY MAP FUSION (1/2)

We compare our proposed fusion method against the multiplication fusion method and the logarithm fusion method, both weighted with a 5-visual-degree 2D Gaussian, 2DGauss(s).

[Figure: resulting spatio-temporal saliency maps]

Page 8

SALIENCY MAP FUSION (2/2)

To produce the spatio-temporal saliency map, we also propose a new fusion method: the square sum fusion. Its fusion properties are similar to those of the multiplication fusion in that it gives more weight to regions which have both:
- High spatial saliency
- High temporal saliency

Unlike the multiplication fusion, however, it does not produce a null spatio-temporal saliency value when the temporal saliency is very low.
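The exact expression for the square sum fusion is given in the paper; the sketch below uses one plausible form, (s_sp² + s_t²)/2, purely to illustrate the stated property: unlike multiplicative fusion, it does not vanish where the temporal saliency is near zero.

```python
import numpy as np

def fuse_mul(s_sp, s_t):
    # Multiplicative fusion: zero wherever temporal saliency is zero.
    return s_sp * s_t

def fuse_square_sum(s_sp, s_t):
    # Hypothetical "square sum" fusion (illustrative form only):
    # large when BOTH maps are salient, but still non-zero when the
    # temporal saliency alone is very low.
    return (s_sp ** 2 + s_t ** 2) / 2.0

s_sp = np.array([0.9, 0.9])   # high spatial saliency in both cells
s_t  = np.array([0.0, 0.9])   # no motion in the first cell
v_mul = fuse_mul(s_sp, s_t)          # first cell collapses to zero
v_sq  = fuse_square_sum(s_sp, s_t)   # first cell keeps spatial evidence
```

Whatever the exact formula, the design intent is the same: static but visually prominent regions should not be discarded just because nothing moves there.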

Page 9

WMBER VQ METRIC BASED ON SALIENCY MAPS (1/3)

- Weighted Macro Block Error Rate (WMBER) is a No-Reference metric.
- Visual attention is focused on the areas given by the saliency map.
- Video transmission artifacts may change the saliency map, so we propose to extract the saliency maps from the already-broadcast, disturbed video stream.
- WMBER also relies on MB (macroblock) error detection in the bit stream:
  - DC/AC coefficient and motion vector (MV) error detection
  - Error propagation according to the H.264 decoding process
- WMBER is thus based on MB error detection weighted by saliency maps.

[Figure: original transmission error vs. propagation of transmission errors]

Page 10

WMBER VQ METRIC BASED ON SALIENCY MAPS (2/3)

Processing chain: the decoder outputs the MB error map and the decoded frame; the decoded frame yields the gradient energy, and GME feeds the saliency map computation; the MB error map is then multiplied by the saliency map and normalized (Σ/Σ) to produce the WMBER.

Page 11

WMBER VQ METRIC BASED ON SALIENCY MAPS (3/3)

When MB errors cover the whole frame and the energy of the gradient is high, WMBER is high (near 1.0).

When there are no MB errors or the energy of the gradient is low, WMBER is low (near 0.0).

The WMBER of a video sequence is the average WMBER of its frames.
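This computation can be sketched as follows. The paper's exact weighting combines the error map, gradient energy, and saliency, so the normalization below (a saliency-weighted fraction of erroneous macroblocks, bounded in [0, 1]) is an assumption made for illustration.

```python
import numpy as np

def wmber_frame(error_map, saliency, eps=1e-12):
    # error_map: 1.0 for macroblocks hit by (or corrupted through
    # propagation of) a transmission error, else 0.0.
    # saliency: per-macroblock spatio-temporal saliency weights.
    # Assumed form: saliency-weighted error rate in [0, 1].
    return float((saliency * error_map).sum() / (saliency.sum() + eps))

def wmber_sequence(error_maps, saliency_maps):
    # The sequence score is the average of the per-frame scores.
    return float(np.mean([wmber_frame(e, s)
                          for e, s in zip(error_maps, saliency_maps)]))
```

Under this form, a frame whose every macroblock is erroneous scores near 1.0, while an error confined to a low-saliency region contributes little to the score.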

Page 12

SUBJECTIVE EXPERIMENT

Subjective experiment conducted according to:
- VQEG Report on Validation of the Video Quality Models for High Definition Video Content (June 2010)
- ITU-R Rec. BT.500-11

20 HDTV (1920x1080 pixels) video sources (SRC) from:
- The Open Video Project: www.open-video.org
- NTIA/ITS
- TUM/Taurus Media Technik
- French HDTV

Goal: measure the influence of transmission loss on perceived quality.
- 2 loss models: IP model (ITU-T Rec. G.1050) and RF (Radio Frequency) model
- 8 loss profiles were compared
- 160 Processed Video Streams (PVS)

35 participants were gathered; MOS values were computed for each SRC and PVS.

[Figure: experiment room]

Loss profiles:

Model | Profile | Loss  | Burst
IP    | 0       | 0.05% | No
IP    | 1       | 1%    | No
IP    | 2       | 1%    | Yes
IP    | 3       | 5%    | No
IP    | 4       | 5%    | Yes
RF    | 5       | 0.01% | No
RF    | 6       | 0.1%  | No
RF    | 7       | 1%    | No

Page 13

SUBJECTIVE EXPERIMENT RESULTS

[Chart: MOS per loss profile (0-7) for each of the 20 source sequences (Sequence 0 through Sequence 19); MOS scale 0-6]

Page 14

PREDICTION OF SUBJECTIVE QUALITY METRICS FROM OBJECTIVE QUALITY METRICS

We propose to use a supervised learning method to predict MOS values from WMBER or MSE.

This prediction method is called the similarity-weighted average. It requires a training data set of n known pairs (xi, yi) to predict y from x:
- Here, the (xi, yi) pairs are WMBER or MSE values associated with MOS values.
- y is the predicted MOS for a given WMBER/MSE value x.

The prediction is performed with a weighted mean (also known as a weighted mean classifier): y = Σi w(x, xi)·yi / Σi w(x, xi), where w(x, xi) measures the similarity between x and xi.
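A minimal sketch of such a similarity-weighted average follows: each training pair (xi, yi) votes for its yi with a weight that decays as x moves away from xi. The Gaussian kernel and its bandwidth (sigma) are illustrative choices, not values from the paper.

```python
import numpy as np

def predict_mos(x, xs, ys, sigma=0.05):
    # Similarity-weighted average: weighted mean of the training
    # MOS values, weights given by a Gaussian similarity kernel
    # on the objective-metric axis (kernel/sigma are assumptions).
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    w = np.exp(-((xs - x) ** 2) / (2.0 * sigma ** 2))
    return float(np.dot(w, ys) / (w.sum() + 1e-12))

# Toy training set: low WMBER -> high MOS, high WMBER -> low MOS.
train_x = [0.0, 0.25, 0.5, 0.75, 1.0]
train_y = [5.0, 4.0, 3.0, 2.0, 1.0]
```

Unlike polynomial fitting, this estimator makes no assumption about the global shape of the metric-to-MOS mapping; it only assumes that similar objective scores receive similar subjective scores.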

Page 15

EVALUATION AND RESULTS

We compare 6 objective video quality metrics:
- MSE
- WMBER using the 5 v/deg 2D Gaussian (WMBER2DGauss)
- WMBER using the multiplication fusion (WMBERmul)
- WMBER using the log sum fusion (WMBERlog)
- WMBER using the square sum fusion (WMBERsquare)
- WMBER using the spatial saliency map only (WMBERsp)

All metrics are computed for each of the 160 PVS + 20 SRC. 6 data sets are built, each with 180 Objective Metric/MOS pairs.

Each data set is split into 2 equal parts: a training set and an evaluation set. The Pearson Correlation Coefficient (PCC) is used for the evaluation, with cross-validation.

PCC results:

Model | MSE   | WMBER2DGauss | WMBERlog | WMBERsp | WMBERmul | WMBERsquare
IP    | 0.999 | 0.84         | 0.748    | 0.883   | 0.714    | 0.86
RF    | 0.987 | 0.877        | 0.763    | 0.9     | 0.786    | 0.895
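The PCC used above is the standard Pearson correlation between the predicted and subjective scores; a minimal implementation:

```python
import numpy as np

def pcc(a, b):
    # Pearson Correlation Coefficient: covariance of the two score
    # lists normalized by the product of their standard deviations.
    a = np.asarray(a, float) - np.mean(a)
    b = np.asarray(b, float) - np.mean(b)
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))
```

A PCC near 1 means the objective metric ranks the sequences the same way the viewers did; near -1 means it ranks them in reverse.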

Page 16

CONCLUSION AND FUTURE WORK

We were interested in the problem of objective video quality assessment over lossy channels.

We followed the recent trends in the definition of spatio-temporal saliency maps for FOA:
- New no-reference metric: the WMBER, based on saliency maps.
- We brought a new solution for saliency map fusion: the square sum fusion.

We proposed a supervised learning method to predict the subjective quality metric MOS from objective quality metrics:
- Similarity-weighted average.
- Gives better results than the conventional approach, polynomial fitting.

We intend to improve the saliency model to better account for:
- Transmission artifacts
- The masking effect in the neighborhood of highly salient areas

We plan to evaluate the WMBER on the IRCCyN/IVC Eyetracker SD 2009_12 Database.

Page 17

Thank you for your attention. Any questions?